# Fleur parser functions in the `masci-tools` repository

Versions `0.4.0` and higher of the `masci-tools` repository (https://github.com/JuDFTteam/masci-tools/tree/develop) provide new parsers for the fleur input and output files.
This short tutorial demonstrates the main parsers and the small helper functions that were implemented along the way, which are also very useful
on their own.

## The input file parser

Basic usage of the input parser is very simple. We import the parser from the `masci-tools` repository and provide the path to an input file to parse:

In [None]:
from masci_tools.io.parsers.fleur import inpxml_parser
inpxml_parser?
input_dict = inpxml_parser('./files/Fe_Example_input.xml')

The input parser navigates the whole input file recursively and converts every attribute according to the XML Schema of the file version. This results in a Python dictionary mirroring the input file structure.

Notice, however, that there are some points where the structure of the dictionary does not directly match the input file structure. An example of this is the `atomSpecies` tag. In the input file this tag is only used to wrap a list of `species` tags and has no additional information of its own. Tags of this type are detected automatically and the information is moved up one level. A direct translation would lead to the `atomSpecies` key containing a dictionary with only a `species` key, which in turn holds the list of parsed `species` tags. Instead, this list is placed directly under the `atomSpecies` key.

In [None]:
from pprint import pprint
pprint(input_dict)
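
The relocation of wrapper tags such as `atomSpecies` can be pictured with a small, self-contained sketch. It is not the `masci-tools` implementation: the function names and the example dictionary are hypothetical, and the sketch decides what counts as a wrapper from the data alone, which is a simplification.

```python
def is_wrapper(value):
    """True for a dictionary whose only entry is a list of child tags."""
    return (isinstance(value, dict) and len(value) == 1
            and isinstance(next(iter(value.values())), list))


def collapse_wrapper_tags(node):
    """Recursively replace {'tag': {'child': [...]}} by {'tag': [...]}."""
    if isinstance(node, list):
        return [collapse_wrapper_tags(item) for item in node]
    if not isinstance(node, dict):
        return node  # plain value, nothing to collapse
    result = {}
    for key, value in node.items():
        if is_wrapper(value):
            # Move the list up one level, dropping the intermediate key
            value = next(iter(value.values()))
        result[key] = collapse_wrapper_tags(value)
    return result


# Hypothetical "direct translation" of a small piece of an input file
direct = {
    'atomSpecies': {'species': [{'name': 'Fe-1', 'atomicNumber': 26}]},
    'calculationSetup': {'cutoffs': {'Kmax': 4.0}},
}
collapsed = collapse_wrapper_tags(direct)
```

After this call, `collapsed['atomSpecies']` is the list of species dictionaries itself, while `calculationSetup` keeps its nested structure because it does not wrap a list.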

The parsers (this also applies to the output parser) can be called in three different ways:

1. Like we've seen above, you can provide the path to the input file directly
2. The file can be opened and parsed into an xmltree beforehand

In [None]:
from lxml import etree

parser = etree.XMLParser(attribute_defaults=True, encoding='utf-8')
xmltree = etree.parse('./files/Fe_Example_input.xml', parser)
input_dict = inpxml_parser(xmltree)

3. An open file handle can be passed in

In [None]:
with open('./files/Fe_Example_input.xml', 'r') as input_file:
    input_dict = inpxml_parser(input_file)
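
A function that accepts all three kinds of input could be structured roughly as follows. This is only a sketch using the standard library module `xml.etree.ElementTree`; the helper name `load_xml_tree` and the one-line example document are assumptions for illustration and do not describe how `masci-tools` handles its arguments internally.

```python
import io
import xml.etree.ElementTree as ET


def load_xml_tree(source):
    """Return an ElementTree for a path, an open file handle or a parsed tree."""
    if isinstance(source, ET.ElementTree):
        return source  # already parsed, hand it back unchanged
    # ET.parse accepts both a file name and a file object
    return ET.parse(source)


# Stand-in document so that the example runs without any files on disk
xml_text = '<fleurInput fleurInputVersion="0.34"/>'

tree_from_handle = load_xml_tree(io.StringIO(xml_text))
tree_from_tree = load_xml_tree(tree_from_handle)
print(tree_from_handle.getroot().tag)
```

Normalizing the argument once at the top keeps the rest of a parser independent of how the file was supplied.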

The parsers provide more information about the parsing process itself if the argument `parser_info_out` is provided:

In [None]:
from pprint import pprint
parser_info = {'parser_warnings': []}
input_dict = inpxml_parser('./files/Fe_Example_input.xml', parser_info_out=parser_info)
pprint(parser_info)

In this case everything worked, so the information is limited to the version of the parser and the file version of the parsed file.
If we take a file that has been slightly modified, we will see some example warnings:

In [None]:
from pprint import pprint
parser_info = {'parser_warnings': []}
input_dict = inpxml_parser('./files/Fe_Example_input_invalid_attributes.xml', parser_info_out=parser_info)
pprint(parser_info)
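
In a script it can be handy to react to the collected warnings instead of only printing the dictionary. The helper below is a minimal sketch: it relies only on the `parser_warnings` key used above, the function name is hypothetical, and the warning text in the example is made up rather than taken from real parser output.

```python
def summarize_parser_info(parser_info):
    """Build a short, readable report from the 'parser_warnings' list."""
    warnings = parser_info.get('parser_warnings', [])
    if not warnings:
        return 'parsing finished without warnings'
    lines = [f'{len(warnings)} warning(s) during parsing:']
    lines.extend(f'  - {warning}' for warning in warnings)
    return '\n'.join(lines)


# Made-up content; the real messages are written by the parser
example_info = {'parser_warnings': ['could not convert attribute "Kmax"']}
print(summarize_parser_info(example_info))
```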

If the file does not validate against the InputSchema of the given version, an error is raised:

In [None]:
from pprint import pprint
parser_info = {'parser_warnings': []}
input_dict = inpxml_parser('./files/Fe_Example_input_validation_errors.xml', parser_info_out=parser_info)
pprint(parser_info)
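
When many files are processed in a loop, a validation error in one of them should usually not stop the whole run. The following sketch wraps an arbitrary parser callable in a `try`/`except` block. The exact exception class raised on validation failure is not stated here, so the caught types are left configurable; `parse_or_report` and `fake_parser` are hypothetical names, and `fake_parser` merely stands in for `inpxml_parser` so that the example runs without input files.

```python
def parse_or_report(parser, path, expected_errors=(Exception,)):
    """Return (result, None) on success or (None, message) on failure."""
    try:
        return parser(path), None
    except expected_errors as err:
        return None, f'{type(err).__name__}: {err}'


def fake_parser(path):
    """Stand-in for a parser that rejects an invalid file."""
    raise ValueError(f'{path} does not validate against the schema')


result, error = parse_or_report(fake_parser, 'inp.xml')
print(error)
```

With the real parser one would pass `inpxml_parser` as the first argument and, once the raised exception type is known, narrow `expected_errors` accordingly.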

The input parser will also automatically resolve any `xi:include` tags present in the input file:

In [None]:
!ls -l files/automatic_include/

In [None]:
input_dict = inpxml_parser('./files/automatic_include/inp.xml')
pprint(input_dict['cell']['bzIntegration'])
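
To illustrate what resolving an `xi:include` tag means, here is a sketch based on the standard library module `xml.etree.ElementInclude`. It does not show how `masci-tools` performs the step; the file name `kpts.xml`, the tag names and the in-memory loader are assumptions chosen so that the example is self-contained.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical included file, kept in memory instead of on disk
FILES = {
    'kpts.xml': '<kPointLists><kPointList name="default"/></kPointLists>',
}

MAIN = '''<fleurInput xmlns:xi="http://www.w3.org/2001/XInclude">
  <cell>
    <bzIntegration>
      <xi:include href="kpts.xml"/>
    </bzIntegration>
  </cell>
</fleurInput>'''


def memory_loader(href, parse, encoding=None):
    """Loader with the signature expected by ElementInclude.include."""
    if parse == 'xml':
        return ET.fromstring(FILES[href])
    return FILES[href]


root = ET.fromstring(MAIN)
# Replaces every xi:include element in place by the loaded content
ElementInclude.include(root, loader=memory_loader)
included = root.find('cell/bzIntegration/kPointLists/kPointList')
print(included.get('name'))
```

After the call the tree no longer contains any `xi:include` element; the content of the referenced file sits at the position of the former include tag.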

One big advantage of these new parsers is that they support many different fleur versions, and maintaining this version compatibility takes little effort. The following input file still uses version `0.27`. The parser converts every attribute for which it can find a definition with a defined type in the InputSchema.

In [None]:
input_dict = inpxml_parser('./files/old_input_file.xml')
pprint(input_dict)
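
The idea of converting only those attributes that have a type definition can be sketched as a simple lookup. Everything in this block is an assumption made for illustration: the table `ATTRIBUTE_TYPES`, the attribute names in it, the `T`/`F` convention for booleans and the function names are not taken from the `masci-tools` code, where the definitions come from the schema of the respective file version.

```python
def to_bool(value):
    """Interpret 'T'/'F' style switches (assumed convention)."""
    return value.strip().upper() in ('T', 'TRUE')


# Hypothetical excerpt of attribute types, as a schema might define them
ATTRIBUTE_TYPES = {'Kmax': float, 'itmax': int, 'l_soc': to_bool}


def convert_attribute(name, value, attribute_types=ATTRIBUTE_TYPES):
    """Convert a raw string if a type is known, otherwise return it unchanged."""
    converter = attribute_types.get(name)
    if converter is None:
        return value  # no definition found: keep the string
    return converter(value)


raw = {'Kmax': '4.00000000', 'itmax': '15', 'l_soc': 'F', 'comment': 'old file'}
converted = {name: convert_attribute(name, value) for name, value in raw.items()}
```

Because the conversion is driven by a table rather than by hard-coded logic, supporting another file version in such a design amounts to supplying a different table.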