# Obtaining URL components

Verifying parsed URLs requires a component-wise representation of each generated URL. [tribble](https://github.com/havrikov/tribble/tree/7797acd8801e48cbedb86485032f577cee8ea94c) can produce one with a few additions.
These additions are:

* the class **DictExtractor**, which takes a list of component names, searches the parse tree for these components, and returns a string containing every component found together with its content

* the abstract class **ComponentsBuilder**, which serves as a blueprint for concrete implementations. It also holds the list of components to search for and is therefore closely tied to the grammar being used

* the class **FirefoxURLComponentsBuilder**, which maps the component names used in the grammar to the component names used by a parser (here the Firefox parser). This class also formats the final string representation of the components and their content

* an addition to **tasks** in the execution package that calls `dictExtractor.extract(tree)` and writes the resulting string to a file
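
The way these pieces interact can be illustrated with a small, language-agnostic sketch. It is written in Python for brevity and is not the actual implementation in tribble: the `Node` type, the `FirefoxLikeBuilder` class, the component names, and the name mapping are all assumptions made for illustration, and values are not escaped.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical parse-tree node: a grammar symbol name plus children or leaf text."""
    name: str
    children: list = field(default_factory=list)
    text: str = ""

    def flatten(self) -> str:
        # Concatenate the text of all leaves below this node.
        if not self.children:
            return self.text
        return "".join(child.flatten() for child in self.children)


class ComponentsBuilder(ABC):
    """Blueprint: names the grammar components to look for and formats the result."""
    components: list = []

    @abstractmethod
    def build(self, found: dict) -> str:
        ...


class FirefoxLikeBuilder(ComponentsBuilder):
    # Assumed grammar component names and their parser-side counterparts.
    components = ["scheme", "host", "path"]
    mapping = {"scheme": "scheme", "host": "host", "path": "pathQueryRef"}

    def build(self, found: dict) -> str:
        pairs = [f'{self.mapping[name]}:"{value}"' for name, value in found.items()]
        return "{" + ",".join(pairs) + "}"


class DictExtractor:
    def __init__(self, builder: ComponentsBuilder):
        self.builder = builder

    def extract(self, tree: Node) -> str:
        found = {}
        stack = [tree]
        while stack:  # depth-first walk over the parse tree
            node = stack.pop()
            if node.name in self.builder.components and node.name not in found:
                found[node.name] = node.flatten()
            stack.extend(reversed(node.children))
        # Emit components in the order the builder lists them.
        ordered = {n: found[n] for n in self.builder.components if n in found}
        return self.builder.build(ordered)
```

With a hand-built tree for `ftp://1.2.3.4/6`, `DictExtractor(FirefoxLikeBuilder()).extract(tree)` returns a string in the same shape as the example outputs in this section. Keeping the component list and the name mapping in the builder means that switching to another parser or grammar only requires a new builder subclass.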

These are some example outputs aimed at Firefox, generated with the URL grammar that ships with tribble:

{scheme:"ftp://",pathQueryRef:"/",host:"8.2.28.48926",hostPort:"8.2.28.48926",spec:"ftp://8.2.28.48926/;ET"}

{scheme:"gopher://",host:"9.80.4.54",hostPort:"9.80.4.54",spec:"gopher://9.80.4.54/!"}

{scheme:"ftp://",pathQueryRef:"/6",host:"2.03.19.7",hostPort:"2.03.19.7",spec:"ftp://2.03.19.7/6"}

_**Note**_: due to the complexity of the included grammar, not all special features of each scheme are represented appropriately (see the ";ET" part in the first example, or the "!" in the second example). Once the URL grammar to be used is fixed, all special features will be represented appropriately.
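
For the verification step, output lines like the examples above have to be read back and compared with what a parser reports. The following is a minimal sketch of one way to do that. The function names `parse_components` and `mismatches` are hypothetical, and the regular expression assumes that component values never contain a double quote.

```python
import re

# Assumption: values never contain a double quote, so a simple pattern suffices.
PAIR = re.compile(r'(\w+):"([^"]*)"')


def parse_components(line: str) -> dict:
    """Turn one output line into a dict of component name -> content."""
    return dict(PAIR.findall(line.strip()))


def mismatches(expected: dict, actual: dict) -> dict:
    """Return {component: (expected, actual)} for every component that differs."""
    names = expected.keys() | actual.keys()
    return {
        name: (expected.get(name), actual.get(name))
        for name in sorted(names)
        if expected.get(name) != actual.get(name)
    }
```

Here `expected` would come from a line written by tribble, and `actual` from the parser under test, converted to the same dictionary shape. An empty result from `mismatches` means the parser agrees with the grammar-derived components.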