# **Extensible Markup Language (XML) processing with Python**

Python libraries for working with *XML*:
- **xml.etree.ElementTree** is an API for *parsing and creating* XML data
- **xml.dom.minidom** takes a *Document Object Model* approach, where each node of the tree structure is an object
- **xml.sax** implements SAX (Simple API for XML), an event-driven approach to parsing XML documents
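
As a rough comparison, the sketch below reads the same small document with each of the three modules. The sample string and the `TitleCollector` handler class are made up for illustration; only the standard-library calls are real.

```python
import xml.etree.ElementTree as ET
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, used only for this comparison
SAMPLE = "<data><book title='Hamlet'/><book title='Macbeth'/></data>"

# ElementTree: elements behave like lightweight Python containers
et_titles = [book.get("title") for book in ET.fromstring(SAMPLE)]

# minidom: every node is a DOM object queried through DOM-style methods
dom = minidom.parseString(SAMPLE)
dom_titles = [node.getAttribute("title") for node in dom.getElementsByTagName("book")]


# SAX: the parser pushes events to a handler instead of building a tree
class TitleCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []

    def startElement(self, name, attrs):
        if name == "book":
            self.titles.append(attrs.get("title"))


handler = TitleCollector()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)
sax_titles = handler.titles

print(et_titles, dom_titles, sax_titles)
```

All three approaches collect the same titles; the rest of these notes use ElementTree.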

XML is a *markup language* intended for storing and transporting data.
 
Here are the main building blocks of an XML document:
- **prolog** (optional) specifies the XML version and character encoding, e.g. `<?xml version="1.0" encoding="ISO-8859-2"?>`
- **root element** is the single top-level element that contains all other elements
- **elements** consist of an opening and a closing tag and may contain text, attributes, and child elements
- **attributes** are key-value pairs placed inside an element's opening tag

```xml
<?xml version="1.0"?>
<data>
    <book title="The Little Prince">
        <author>Antoine de Saint-Exupéry</author>
        <year>1943</year>
    </book>
    <book title="Hamlet">
        <author>William Shakespeare</author>
        <year>1603</year>
    </book>
</data>
```
- `<?xml version="1.0"?>` is the **prolog**
- `data` is the **root element**
- `<book title="The Little Prince">` is an **element** with an **attribute** `title`
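
To see how these parts surface in Python, here is a minimal sketch that loads the sample document from a string with `ET.fromstring()`. The variable names `xml_text` and `first_book` are chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from above, held in a string for convenience
xml_text = """<?xml version="1.0"?>
<data>
    <book title="The Little Prince">
        <author>Antoine de Saint-Exupéry</author>
        <year>1943</year>
    </book>
    <book title="Hamlet">
        <author>William Shakespeare</author>
        <year>1603</year>
    </book>
</data>"""

root = ET.fromstring(xml_text)   # the prolog is consumed by the parser
first_book = root[0]             # first child element of the root

print(root.tag)                  # root element name
print(first_book.tag)            # element name
print(first_book.attrib)         # attributes as a dict
print(first_book[0].text)        # text of the first child element
```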

In [4]:
# Importing ElementTree with an Alias 
import xml.etree.ElementTree as ET 

# Creating a tree from an existing XML document using the parse() method
tree = ET.parse('breakfast.xml')
# getroot() returns the root element, from which we can reach any element in the document
root = tree.getroot()

# Another way is to use fromstring(), which parses XML held in a string and returns the root element
# root = ET.fromstring(your_xml_as_string)

print("Root tag is: ", root.tag)
for child in root:
    print("Child tag is: ", child.tag)
    print("Attributes: ", child.attrib)

Root tag is:  breakfast_menu
Child tag is:  food
Attributes:  {}
Child tag is:  food
Attributes:  {}
Child tag is:  food
Attributes:  {}
Child tag is:  food
Attributes:  {}
Child tag is:  food
Attributes:  {}


In [25]:
# Nested loops reach every grandchild; indexes such as root[0][0] would also work
for child in root:  # inside breakfast_menu
    for inner_child in child:   # inside each food element 
        print(inner_child.text)

Berry-Berry Belgian Waffles
$8.95

Light Belgian waffles covered with an assortment of fresh berries and whipped cream

900
French Toast
$4.50

Thick slices made from our homemade sourdough bread

600
Homestyle Breakfast
$6.95

Two eggs, bacon or sausage, toast, and our ever-popular hash browns

950


**If** a child element had attributes...

You could read them with `child.attrib['attrib_name']` (or `child.get('attrib_name')`) and still work with the `.text` and `.tag` properties.
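
Since the elements in `breakfast.xml` carry no attributes, the sketch below uses a made-up menu fragment (the `id` and `vegan` attributes are hypothetical) to show both ways of reading one.

```python
import xml.etree.ElementTree as ET

# Hypothetical menu fragment: the real breakfast.xml has no attributes
root = ET.fromstring(
    "<breakfast_menu>"
    "<food id='1' vegan='no'><name>French Toast</name></food>"
    "<food id='2'><name>Fruit Bowl</name></food>"
    "</breakfast_menu>"
)

for child in root:
    food_id = child.attrib["id"]           # raises KeyError if 'id' is missing
    vegan = child.get("vegan", "unknown")  # falls back to a default instead
    print(child.tag, food_id, vegan, child.find("name").text)
```

`child.get()` is usually the safer choice when an attribute may be absent, because indexing `child.attrib` directly fails on a missing key.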

In [31]:
# iter() searches the whole subtree (children and nested elements) for the requested tag
for breakfast in root.iter('name'): 
    print(breakfast.text)   # each match is an Element object, so .text gives its content
    
# This prints nothing because findall() with a bare tag name ONLY matches direct children of root
for breakfast in root.findall('name'):  
    print(breakfast.text)

Berry-Berry Belgian Waffles
French Toast
Homestyle Breakfast
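
The difference between `iter()` and `findall()` comes down to search depth. The sketch below uses a small hypothetical stand-in for `breakfast.xml` with the same nesting and counts the matches for each lookup style, including the path syntax that `findall()` accepts.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for breakfast.xml with the same nesting
root = ET.fromstring(
    "<breakfast_menu>"
    "<food><name>French Toast</name></food>"
    "<food><name>Homestyle Breakfast</name></food>"
    "</breakfast_menu>"
)

direct = root.findall("name")            # direct children only
via_path = root.findall("food/name")     # explicit path through food
anywhere = root.findall(".//name")       # any depth, like iter('name')
via_iter = list(root.iter("name"))

print(len(direct), len(via_path), len(anywhere), len(via_iter))
```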


In [35]:
# Using the find() method to locate an element
print(root.find('food').tag)    # find() returns the FIRST direct child with the "food" tag

food
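
One thing worth knowing about `find()` is that it returns `None` when nothing matches, so chaining `.tag` or `.text` onto a failed lookup raises an `AttributeError`. A minimal sketch, using a made-up one-item menu, of `find()` next to the related `findtext()` method:

```python
import xml.etree.ElementTree as ET

# Hypothetical one-item menu, used only for this illustration
root = ET.fromstring(
    "<breakfast_menu>"
    "<food><name>French Toast</name><price>$4.50</price></food>"
    "</breakfast_menu>"
)

food = root.find("food")
missing = root.find("drink")                 # no such child, so None
price = food.findtext("price")               # text of first matching child
calories = food.findtext("calories", "n/a")  # default when absent

print(missing, price, calories)
```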


In [41]:
# Exercise: create a class that converts Celsius to Fahrenheit
import xml.etree.ElementTree as ET

class TemperatureConverter:
    
    def __init__(self, temp_c):
        self.temp_c = temp_c
        self.temp_f = None
    
    def convert_celsius_to_fahrenheit(self):
        # F = 9/5 * C + 32, rounded to one decimal place
        self.temp_f = round(9 / 5 * float(self.temp_c) + 32, 1)
        return self.temp_f


class ForecastXmlParser:
    
    def __init__(self):
        pass 
    
    def parse(self, file_name):
        # Create the XML tree 
        tree = ET.parse(file_name)
        # Creating the root based off the tree (grabs root element)
        root = tree.getroot()
        # Loop through all the child elements of root
        for item in root:
            # Read the Celsius value once and convert it to Fahrenheit
            temp_c = item.find('temperature_in_celsius').text
            converter = TemperatureConverter(temp_c)
            converter.convert_celsius_to_fahrenheit()
            # item[0] is the day element; index access works, but .find() also works and survives reordering
            # Requested format: "Day: C Celsius, F.0 Fahrenheit", with Fahrenheit rounded to one decimal
            print(f'{item[0].text}: {temp_c} Celsius, {converter.temp_f} Fahrenheit')
            

forecast = ForecastXmlParser()
forecast.parse('forecast.xml')
        

Monday: 28 Celsius, 82.4 Fahrenheit
Tuesday: 27 Celsius, 80.6 Fahrenheit
Wednesday: 28 Celsius, 82.4 Fahrenheit
Thursday: 29 Celsius, 84.2 Fahrenheit
Friday: 29 Celsius, 84.2 Fahrenheit
Saturday: 31 Celsius, 87.8 Fahrenheit
Sunday: 32 Celsius, 89.6 Fahrenheit


For the exact expected result, see Lab_1.PNG.
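
Because `forecast.xml` itself is not shown here, the following is a self-contained sketch of the same exercise against an inline document. Only `temperature_in_celsius` comes from the code above; the `weather`, `item`, and `day` tag names and the two helper functions are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt: only temperature_in_celsius is named in the code above;
# the weather, item, and day tag names are guesses.
FORECAST_XML = """<weather>
    <item><day>Monday</day><temperature_in_celsius>28</temperature_in_celsius></item>
    <item><day>Tuesday</day><temperature_in_celsius>27</temperature_in_celsius></item>
</weather>"""


def celsius_to_fahrenheit(temp_c):
    """Convert Celsius to Fahrenheit, rounded to one decimal place."""
    return round(9 / 5 * float(temp_c) + 32, 1)


def format_forecast(xml_text):
    """Return one formatted line per child element of the root."""
    lines = []
    for item in ET.fromstring(xml_text):
        day = item.findtext("day")
        temp_c = item.findtext("temperature_in_celsius")
        lines.append(f"{day}: {temp_c} Celsius, {celsius_to_fahrenheit(temp_c)} Fahrenheit")
    return lines


for line in format_forecast(FORECAST_XML):
    print(line)
```

Returning the lines instead of printing inside the parser makes the formatting easy to check without capturing output.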