# Homework 9: Getting Familiar with NASA Polynomials
## Due Date:  Tuesday, November 7th at 11:59 PM

Read the raw NASA polynomial dataset, parse it, and store the data in an .xml file.

### Review of the NASA Polynomials
You can find the NASA Polynomial file in `thermo.txt`.

You can find some details on the NASA Polynomials [at this site](http://combustion.berkeley.edu/gri_mech/data/nasa_plnm.html) in addition to the Lecture 16 notes.


The NASA polynomials for species $i$ have the form:
$$
    \frac{C_{p,i}}{R}= a_{i1} + a_{i2} T + a_{i3} T^2 + a_{i4} T^3 + a_{i5} T^4
$$

$$
    \frac{H_{i}}{RT} = a_{i1} + a_{i2} \frac{T}{2} + a_{i3} \frac{T^2}{3} + a_{i4} \frac{T^3}{4} + a_{i5} \frac{T^4}{5} + \frac{a_{i6}}{T}
$$

$$
    \frac{S_{i}}{R}  = a_{i1} \ln(T) + a_{i2} T + a_{i3} \frac{T^2}{2} + a_{i4} \frac{T^3}{3} + a_{i5} \frac{T^4}{4} + a_{i7}
$$

where $a_{i1}$ through $a_{i7}$ are the numerical coefficients supplied in NASA thermodynamic files, $R$ is the universal gas constant, and $T$ is the temperature in kelvin.
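
As a quick check on the three formulas above, here is a minimal sketch that evaluates them for one species at one temperature. The function name `nasa_properties` and the coefficient values are made up for illustration; they are not taken from `thermo.txt`.

```python
import math

def nasa_properties(coeffs, T):
    """Return (Cp/R, H/(RT), S/R) for seven NASA coefficients at temperature T in K."""
    a1, a2, a3, a4, a5, a6, a7 = coeffs
    cp_over_R = a1 + a2*T + a3*T**2 + a4*T**3 + a5*T**4
    h_over_RT = a1 + a2*T/2 + a3*T**2/3 + a4*T**3/4 + a5*T**4/5 + a6/T
    s_over_R = a1*math.log(T) + a2*T + a3*T**2/2 + a4*T**3/3 + a5*T**4/4 + a7
    return cp_over_R, h_over_RT, s_over_R

# Hypothetical coefficients: only a1 is nonzero, so Cp/R is constant
example_coeffs = [2.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
cp, h, s = nasa_properties(example_coeffs, 1000.0)
print(cp, h, s)
```

With only $a_{i1}$ nonzero, $C_{p,i}/R$ and $H_i/RT$ both reduce to $a_{i1}$ and $S_i/R$ reduces to $a_{i1} \ln(T)$, which makes the output easy to verify by hand.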

### Some Notes on `thermo.txt`
Each species entry spans four lines. The first seven numbers starting on the second line (five on the second line and the first two on the third line) are the coefficients $a_{i1}$ through $a_{i7}$ for the high-temperature range (above 1000 K; the upper boundary is specified on the first line of the species entry).

The next seven numbers (the last three on the third line and the four on the fourth line) are the coefficients $a_{i1}$ through $a_{i7}$ for the low-temperature range (below 1000 K; the lower boundary is specified on the first line of the species entry).
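
The coefficient layout described above can be sketched as follows. This is one possible approach, not the required one: it assumes each coefficient occupies a fixed 15-character column, which is the usual convention for this file format but should be checked against `thermo.txt`. The helper names and the coefficient values are made up; the sample lines are generated inside the example rather than copied from the data file.

```python
def split_fixed_width(line, width=15, max_fields=5):
    """Split one coefficient line into floats, assuming fixed-width columns."""
    values = []
    for k in range(max_fields):
        field = line[k * width:(k + 1) * width].strip()
        if field:
            values.append(float(field))
    return values

def split_coefficients(line2, line3, line4):
    """Return (high_coeffs, low_coeffs) from lines 2 to 4 of a species entry."""
    numbers = (split_fixed_width(line2)
               + split_fixed_width(line3)
               + split_fixed_width(line4, max_fields=4))
    # The first seven numbers are the high-temperature range, the next seven the low one
    return numbers[:7], numbers[7:14]

def format_line(values, line_number):
    """Build a sample line: 15-character fields, line number in column 80."""
    body = "".join(f"{v:15.8E}" for v in values)
    return body.ljust(79) + str(line_number)

# Hypothetical coefficients, used only to build sample lines
high = [1.0, -2.0e-3, 3.0e-6, -4.0e-9, 5.0e-12, -6.0e2, 7.0]
low = [8.0, -9.0e-3, 1.0e-5, -1.1e-8, 1.2e-11, -1.3e3, 1.4]
all_numbers = high + low

line2 = format_line(all_numbers[0:5], 2)
line3 = format_line(all_numbers[5:10], 3)
line4 = format_line(all_numbers[10:14], 4)

parsed_high, parsed_low = split_coefficients(line2, line3, line4)
print(parsed_high)
print(parsed_low)
```

Slicing by column avoids the problem that adjacent numbers are not separated by a space when the second one is negative; a regular expression that matches signed floating-point numbers is another way to handle the same problem.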

### Additional Specifications
Your final .xml file should contain the following specifications:

1. A `speciesArray` field that contains a space-separated list of all of the species present in the file.
2. A `species` field for each species, with the species name as its `name` attribute. Each `species` field contains:

    1. For each temperature range, a sub-field with the minimum and maximum temperature as attributes.
    2. A `floatArray` field that contains the coefficients as a comma-separated string.
    
You can reference the `example_thermo.xml` file for an example .xml output.
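
The specifications above could be met with a structure like the one in the following sketch, which builds the tree from scratch with `xml.etree.ElementTree`. Several names here are assumptions rather than requirements: the root tag `ctml`, the `thermo` and `NASA` tags, the `Tmin` and `Tmax` attribute names, and the species `X` with its coefficients are all hypothetical. The layout in `example_thermo.xml` takes precedence wherever it differs.

```python
import xml.etree.ElementTree as ET

def build_xml(species_info):
    """Build an XML tree from {name: {'l': {...}, 'h': {...}}} (assumed layout)."""
    root = ET.Element("ctml")  # assumed root tag
    phase = ET.SubElement(root, "phase")
    species_array = ET.SubElement(phase, "speciesArray")
    # Space-separated list of all species names
    species_array.text = " ".join(species_info)

    species_data = ET.SubElement(root, "speciesData")
    for name, ranges in species_info.items():
        species = ET.SubElement(species_data, "species", name=name)
        thermo = ET.SubElement(species, "thermo")  # assumed tag
        for key in ("l", "h"):
            rng = ranges[key]
            # One sub-field per temperature range, with the bounds as attributes
            nasa = ET.SubElement(thermo, "NASA",
                                 Tmin=str(rng["Tmin"]), Tmax=str(rng["Tmax"]))
            float_array = ET.SubElement(nasa, "floatArray")
            # Comma-separated string of the coefficients
            float_array.text = ", ".join(str(c) for c in rng["coeffs"])
    return root

# Hypothetical species with made-up coefficients
demo_info = {
    "X": {
        "l": {"Tmin": 200.0, "Tmax": 1000.0, "coeffs": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]},
        "h": {"Tmin": 1000.0, "Tmax": 3500.0, "coeffs": [7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]},
    }
}
demo_root = build_xml(demo_info)
xml_text = ET.tostring(demo_root, encoding="unicode")
print(xml_text)
```

To write the result to disk, wrap the root in `ET.ElementTree(demo_root)` and call its `write` method with a file name.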

**Hint**: First parse the file into a Python dictionary. 

In [49]:
import re
from copy import deepcopy
import xml.etree.ElementTree as ET

#read in each line; the with-block closes the file automatically
with open('thermo.txt','r') as file:
    lines = file.readlines()

#dictionary
species_info = {}

#iterate over lines
for i,line in enumerate(lines):
    #split the line
    strings = line.split()
    
    #skip blank lines, which would otherwise raise an IndexError below
    if not strings:
        continue
    
    #stop if at the end of the file
    if strings[0]=="END":
        break
    
    #find line with species
    if strings[-1] == '1':
        #print(strings)
        #get species name
        specie = strings[0]
        #print('specie: ', strings[0])
        #get the low temp min and max
        low_min = strings[-4]
        low_max = 1000.000
        
        #get the high temp min and max
        high_min = 1000.000
        high_max = strings[-3] 
        
    if strings[-1] == '2': 
        #split the line by number as the numbers are not always separated by spaces in the txt file
        strings = re.findall(r"[+\-]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+)?", line)
        #get first 5 high coefs
        high_coeffs = []
        high_coeffs.extend(strings[0:-1])
        #print(strings[0:-1])
        
    if strings[-1] == '3': 
        #split the line by number as the numbers are not always separated by spaces in the txt file
        strings = re.findall(r"[+\-]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+)?", line)
        #first two are high coeffs
        high_coeffs.extend(strings[0:2])
        
        #remaining are low coeffs
        low_coeffs = []
        low_coeffs.extend(strings[2:-1])
        
    if strings[-1] == '4': 
        #split the line by number as the numbers are not always separated by spaces in the txt file
        strings = re.findall(r"[+\-]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+)?", line)
        # get low coefs
        low_coeffs.extend(strings[0:-1])
        #print("numbers in 4:",strings[0:-1])

        #Add to dictionary
        species_info[specie]={'l':{},'h':{}}
        species_info[specie]['l']['Tmax'] = low_max
        species_info[specie]['l']['Tmin'] = low_min
        species_info[specie]['l']['coeffs'] = low_coeffs
        species_info[specie]['h']['Tmax'] = high_max
        species_info[specie]['h']['Tmin'] = high_min
        species_info[specie]['h']['coeffs'] = high_coeffs
        
        #print everything
        #print('specie: ',specie)
        #print('low min: ',low_min)
        #print('low max: ',low_max)
        #print("low",low_coeffs)
        #print('high min: ',high_min)
        #print('high max: ',high_max)
        #print("high",high_coeffs)
        
#Print out dict to test

'''for i,k in enumerate(species_info):
    print(k)
    print(species_info[k]['h']['Tmax'])
    print(species_info[k]['h']['Tmin'])
    print(species_info[k]['h']['coeffs'])
    print(species_info[k]['l']['Tmax'])
    print(species_info[k]['l']['Tmin'])
    print(species_info[k]['l']['coeffs'])'''

In [80]:
#Create an XML tree from the dictionary
from copy import deepcopy
import xml.etree.ElementTree as ET

#Get the xml structure from file
tree = ET.parse('example_thermo.xml')
root = tree.getroot()

#Create species array element
speciesArray = deepcopy(root.find('phase').find('speciesArray'))

#Create string of all species
species_list = ''
for i,k in enumerate(species_info):
    species_list += ' '+ str(k)

#Add species string to copied species array element
speciesArray.text = species_list

#Delete old species array element
example_speciesArray= root.find('phase').find('speciesArray')
root.find('phase').remove(example_speciesArray)

#Append new species array element
root.find('phase').append(speciesArray)

#Make copy of species element
specie = deepcopy(root.find('speciesData').find('species'))


 O O2 H H2 OH H2O HO2 H2O2
