# Processing Queries

### 1) Reading a configuration file

At this point the module reads a '.cfg' file containing the names of the three files it will use. Only the first needs to be provided; the other two are created by the module.
The file formats are:

* XML file for reading (provided)
* CSV file with the queries (created)
* CSV file with the results (created)

In [1]:
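# Base locations; the file names read from the config are appended below.
read_file = ""
queries_file = "../data/"
results_file = "../results/"

with open("../data/pc.cfg", "r") as config_file:
    for line in config_file:
        # Skip blank lines so the split below does not fail on them.
        if not line.strip():
            continue
        # Each line has the form INSTRUCTION=filename.
        instruction, filename = line.split("=", 1)
        instruction = instruction.strip()
        filename = filename.strip()

        if instruction == "LEIA":          # XML file to read
            read_file += filename
        elif instruction == "CONSULTAS":   # queries CSV to create
            queries_file += filename
        elif instruction == "ESPERADOS":   # results CSV to create
            results_file += filename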
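
To make the expected layout of the configuration file concrete, here is a minimal sketch of the same `INSTRUCTION=filename` parsing applied to an in-memory string. The helper `parse_config` and the file names in `example_cfg` are hypothetical and used only for illustration; the actual contents of `pc.cfg` are not shown in this notebook.

```python
def parse_config(text: str) -> dict:
    """Map each instruction (the text before '=') to its file name."""
    settings = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        instruction, filename = line.split("=", 1)
        settings[instruction.strip()] = filename.strip()
    return settings


# Hypothetical config contents; the real file names may differ.
example_cfg = "LEIA=input.xml\nCONSULTAS=queries.csv\nESPERADOS=expected.csv\n"
config = parse_config(example_cfg)
```

Collecting the entries in a dictionary keeps the parsing separate from the decision of which directory each file belongs to.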

### 2) Reading the XML file

In [2]:
from xml.etree import ElementTree as ET
import os


path = os.path.join("../data/CysticFibrosis/", read_file)
xml_file = ET.parse(path)
xml_root = xml_file.getroot()
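
The cells that follow assume that each child of the root is one query containing `QueryNumber`, `QueryText` and `Records` elements, and that each child of `Records` carries a `score` attribute. The sketch below parses a small hand-written sample with that shape. The tag names `FILEQUERY`, `QUERY` and `Item`, as well as the values, are assumptions made for illustration and are not taken from the real file.

```python
from xml.etree import ElementTree as ET

# Hand-written sample; root/query/item tag names and values are assumed.
SAMPLE_XML = """<FILEQUERY>
  <QUERY>
    <QueryNumber>00001</QueryNumber>
    <QueryText>Example query text</QueryText>
    <Records>
      <Item score="1220">139</Item>
      <Item score="0001">151</Item>
    </Records>
  </QUERY>
</FILEQUERY>"""

sample_root = ET.fromstring(SAMPLE_XML)

# Iterating over an element yields its direct children, in document order.
sample_tags = [[child.tag for child in query] for query in sample_root]
first_number = int(sample_root[0].find("QueryNumber").text)
```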

### 3) Generating files

The separator character is the semicolon (`;`).
The first line of each CSV file contains the headers.

The first file to be generated is the queries file.

In [3]:
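with open(queries_file, 'w') as queries:
    # Header row
    queries.write("QueryNumber;QueryText\n")

    for query in xml_root:
        query_number = ""
        query_text = ""
        for element in query:
            if element.tag == "QueryNumber":
                query_number = int(element.text)
            elif element.tag == "QueryText":
                # Upper-case the text, remove the line breaks and indentation
                # left by wrapped lines, and drop semicolons so they cannot
                # clash with the separator.
                query_text = element.text.upper()
                query_text = query_text.replace('\n  ', '')
                query_text = query_text.replace(';', '')
                query_text = query_text.strip()
        # One query per line.
        queries.write(f"{query_number};{query_text}\n")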
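
As an alternative to formatting each row by hand, the standard `csv` module can write semicolon-separated rows and quote any field that contains the separator, rather than requiring semicolons to be deleted. This is only a sketch of that option, not what the cell above does. The helper `rows_to_csv` and the sample rows are hypothetical.

```python
import csv
import io


def rows_to_csv(rows) -> str:
    """Serialise rows as semicolon-separated text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows; the second query text contains the separator character.
sample_csv = rows_to_csv([
    ["QueryNumber", "QueryText"],
    [1, "EXAMPLE QUERY"],
    [2, "FIRST PART; SECOND PART"],
])
```

With the default quoting, only the field containing `;` is wrapped in double quotes, so a reader configured with the same delimiter recovers the original text.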

In [4]:
with open(results_file, 'w') as results:
    # Header row
    results.write("QueryNumber;DocNumber;DocVotes\n")

    for query in xml_root:
        query_number = ""
        for element in query:
            if element.tag == "QueryNumber":
                query_number = int(element.text)
            elif element.tag == "Records":
                # Each child of Records holds a document number as its text
                # and a 'score' attribute.
                for item in element:
                    doc_number = int(item.text)
                    # DocVotes is the number of characters in the score
                    # that are not '0'.
                    score = item.attrib['score'].replace('0', '')
                    doc_votes = len(score)
                    results.write(f"{query_number};{doc_number};{doc_votes}\n")
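
The vote count used above can be isolated in a small function, which makes the rule easier to check by hand: every character of the score other than `'0'` counts as one vote. The function name `count_votes` and the sample scores are hypothetical.

```python
def count_votes(score: str) -> int:
    """Return how many characters of the score string are not '0'."""
    return len(score.replace("0", ""))


# Made-up score strings for illustration.
sample_votes = {score: count_votes(score) for score in ["1220", "0001", "0000"]}
```

For these samples, `"1220"` gives 3 votes, `"0001"` gives 1, and `"0000"` gives 0.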