# XML Python documentation example
XML is an inherently hierarchical data format, and the most natural way to represent it is with a tree. The `xml.etree.ElementTree` module (imported below as ET) has two classes for this purpose: ElementTree represents the whole XML document as a tree, and Element represents a single node in this tree. Interactions with the whole document (reading from and writing to files) are usually done on the ElementTree level. Interactions with a single XML element and its sub-elements are done on the Element level.
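
To make the split between the two classes concrete, here is a minimal sketch. The inline XML string is a made-up stand-in (the notebook itself reads `country_data.xml`), and the in-memory buffer is only there so the example does not touch the file system.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical inline sample, not the notebook's country_data.xml.
xml_text = '<data><country name="Panama"><rank>68</rank></country></data>'

# Element level: fromstring() parses text and returns the root Element.
root = ET.fromstring(xml_text)

# Document level: wrap the root in an ElementTree to get write().
tree = ET.ElementTree(root)
buffer = io.BytesIO()
tree.write(buffer)  # serializes the whole document into the buffer

print(type(root).__name__)   # Element
print(type(tree).__name__)   # ElementTree
print(buffer.getvalue())
```

In practice `tree.write()` would usually be given a file name instead of a buffer.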

# Parsing XML

## Document used

<img src = "country.PNG">

In [2]:
import xml.etree.ElementTree as ET

In [6]:
tree = ET.parse("country_data.xml")  # parse the XML file into an ElementTree object
type(tree)

xml.etree.ElementTree.ElementTree

In [8]:
root = tree.getroot()  # getting the root element of the given xml document
type(root)

xml.etree.ElementTree.Element

### As an __Element__, __root__ has a tag and a dictionary of attributes:

In [9]:
root.tag

'data'

In [10]:
root.attrib

{}

### Root also has child nodes over which we can iterate:

In [11]:
for child in root:
    print(child.tag, child.attrib)

country {'name': 'Liechtenstein'}
country {'name': 'Singapore'}
country {'name': 'Panama'}


### Children are nested, and we can access specific child nodes by index:

In [12]:
root[0][1]  # <country name="Liechtenstein"> and then go to <year>2008</year>

<Element 'year' at 0x00000067827AFB38>

In [13]:
root[0][1].text   # <country name="Liechtenstein"> and then go to <year>2008</year>

'2008'

In [14]:
root[2][0].text  # <country name="Panama"> and then go to <rank>68</rank>

'68'

In [16]:
root[2].text  # text between <country name="Panama"> and its first child: only the newline and indentation

'\n        '
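
The whitespace result is easier to follow with a small self-contained sketch of how `.text` and `.tail` split up the characters between tags. The snippet below is a hypothetical fragment indented in the same style as the document used here, not the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, indented with four spaces.
snippet = """<country name="Panama">
    <rank>68</rank>
    <year>2011</year>
</country>"""

country = ET.fromstring(snippet)

# .text holds what sits between the opening tag and the first child.
print(repr(country.text))     # '\n    '
# .tail holds what follows an element's closing tag, up to the next tag.
print(repr(country[0].tail))  # '\n    '
print(repr(country[1].tail))  # '\n'
# Leaf elements carry the actual data in .text.
print(country[0].text)        # 68
```

So a container element such as `<country>` normally has only formatting whitespace in `.text`; the data lives in its children.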

## Finding interesting elements

Element has some useful methods that help iterate recursively over all the sub-tree below it (its children, their children, and so on). 

### Element.iter() iterates recursively over the element and the whole sub-tree below it

In [20]:
for neighbour in root.iter("neighbor"):
    print(neighbour.attrib)

{'name': 'Austria', 'direction': 'E'}
{'name': 'Switzerland', 'direction': 'W'}
{'name': 'Malaysia', 'direction': 'N'}
{'name': 'Costa Rica', 'direction': 'W'}
{'name': 'Colombia', 'direction': 'E'}
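
As a rough comparison of plain iteration and `iter()`, the sketch below uses a trimmed, made-up sample rather than the full document. It assumes nothing beyond the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical trimmed sample with one country.
xml_text = """<data>
    <country name="Singapore">
        <rank>4</rank>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""
root = ET.fromstring(xml_text)

# Iterating over an element visits its direct children only.
direct_children = [child.tag for child in root]

# iter() with no argument walks the element itself and every descendant,
# in document order.
all_elements = [elem.tag for elem in root.iter()]

# iter(tag) keeps only the elements with that tag, at any depth.
neighbor_names = [n.get("name") for n in root.iter("neighbor")]

print(direct_children)  # ['country']
print(all_elements)     # ['data', 'country', 'rank', 'neighbor']
print(neighbor_names)   # ['Malaysia']
```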


### Element.findall() finds only elements with the given tag that are direct children of the current element.
### Element.find() finds the first child with a particular tag.
### Element.text accesses the element’s text content.
### Element.get() accesses the element’s attributes:

In [27]:
for child in root.findall("country"):
    print(child.tag, child.attrib)

country {'name': 'Liechtenstein'}
country {'name': 'Singapore'}
country {'name': 'Panama'}


In [28]:
for country in root.findall('country'):
    rank = country.find('rank').text
    name = country.get('name')
    print(name, rank)

Liechtenstein 1
Singapore 4
Panama 68
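
One caveat with the loop above: `find()` returns `None` when no child matches, so `.text` would raise an `AttributeError` on incomplete data. A possible defensive variant is sketched below; the `Atlantis` entry is invented purely to show the missing-element case.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; "Atlantis" has no <rank> child on purpose.
xml_text = """<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Atlantis"/>
</data>"""
root = ET.fromstring(xml_text)

results = []
for country in root.findall("country"):
    # get() accepts a fallback for a missing attribute.
    name = country.get("name", "unknown")
    # findtext() returns the child's text, or the default if no child matches.
    rank = country.findtext("rank", default="n/a")
    results.append((name, rank))

print(results)  # [('Liechtenstein', '1'), ('Atlantis', 'n/a')]

# findall() looks at direct children only, so <rank> is not found from the root.
print(root.findall("rank"))  # []
```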


Elements can be selected more precisely with [XPath](https://docs.python.org/2/library/xml.etree.elementtree.html#elementtree-xpath) expressions, which `find()` and `findall()` accept in place of a plain tag name.
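
A few of the XPath forms that ElementTree supports are sketched here. ElementTree implements only a subset of XPath, and the sample data is a trimmed, partly assumed version of the document (only the Liechtenstein year appears in the outputs above).

```python
import xml.etree.ElementTree as ET

# Hypothetical trimmed sample; values other than Liechtenstein's year are assumed.
xml_text = """<data>
    <country name="Liechtenstein">
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <year>2011</year>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""
root = ET.fromstring(xml_text)

# './/tag' searches all descendants, not just direct children.
all_neighbors = [n.get("name") for n in root.findall(".//neighbor")]

# [@attr='value'] filters on an attribute value.
singapore_year = root.find("./country[@name='Singapore']/year").text

# [tag='text'] filters on the text of a child element.
countries_2008 = [c.get("name") for c in root.findall("./country[year='2008']")]

print(all_neighbors)   # ['Austria', 'Malaysia']
print(singapore_year)  # 2011
print(countries_2008)  # ['Liechtenstein']
```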