# Introduction to XML with BeautifulSoup

[Parsing XML with BeautifulSoup in Python](https://stackabuse.com/parsing-xml-with-beautifulsoup-in-python/)

## XML Structure

```python
# The XML declaration must be the very first thing in the document,
# so it starts immediately after the opening quotes (no leading newline).
xml_file = """<?xml version="1.0" encoding="UTF-8"?>
<teachers>
    <teacher>
        <name>Sam Davies</name>
        <age>35</age>
        <subject>Maths</subject>
    </teacher>
    <teacher>
        <name>Cassie Stone</name>
        <age>24</age>
        <subject>Science</subject>
    </teacher>
    <teacher>
        <name>Derek Brandon</name>
        <age>32</age>
        <subject>History</subject>
    </teacher>
</teachers>
"""
```

In the above, `<teachers>` is the root tag.

Each of the `<teacher>` tags is a child of `<teachers>`.

`<name>`, `<age>`, and `<subject>` are children of each individual `<teacher>` tag.

`<?xml version="1.0" encoding="UTF-8"?>` is the XML declaration, which forms part of the XML prolog.
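The parent/child relationships described above can be checked programmatically. The following is a minimal sketch, assuming `bs4` and `lxml` are installed; the variable name `xml_doc` and the shortened two-teacher document are illustrative, not part of the original notebook.

```python
from bs4 import BeautifulSoup

# Shortened copy of the teachers document (illustrative, two entries only).
xml_doc = """<?xml version="1.0" encoding="UTF-8"?>
<teachers>
    <teacher>
        <name>Sam Davies</name>
        <age>35</age>
        <subject>Maths</subject>
    </teacher>
    <teacher>
        <name>Cassie Stone</name>
        <age>24</age>
        <subject>Science</subject>
    </teacher>
</teachers>
"""

soup = BeautifulSoup(xml_doc, "xml")

# The root tag is <teachers>.
root = soup.find("teachers")

# recursive=False restricts the search to direct children of the root.
teachers = root.find_all("teacher", recursive=False)

# Names of the tags directly under the first <teacher>.
child_tags = [child.name for child in teachers[0].find_all(recursive=False)]

print(root.name)
print(len(teachers))
print(child_tags)
```

Passing `recursive=False` keeps the search at a single level, which makes the root, child, and grandchild layers easy to inspect one at a time.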

## Parsing XML

### Parsing the XML into BeautifulSoup

```python
from bs4 import BeautifulSoup

# The 'xml' parser is provided by lxml, which must be installed separately.
soup = BeautifulSoup(xml_file, 'xml')
```
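One reason to pass `'xml'` rather than an HTML parser is that XML tag names are case-sensitive. The sketch below compares the two on a tiny made-up snippet (the `<Teacher>`/`<Name>` document is hypothetical, chosen only to show the difference); it assumes `lxml` is installed for the `'xml'` parser.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet with mixed-case tag names.
snippet = "<Teacher><Name>Sam Davies</Name></Teacher>"

# The XML parser preserves the case of tag names.
soup_xml = BeautifulSoup(snippet, "xml")
xml_name = soup_xml.find("Name")

# Python's built-in HTML parser lowercases tag names.
soup_html = BeautifulSoup(snippet, "html.parser")
html_mixed = soup_html.find("Name")
html_lower = soup_html.find("name")

print(xml_name.text)
print(html_mixed)
print(html_lower.text)
```

With an HTML parser, a lookup for `Name` finds nothing because the tag has been stored as `name`, so documents with mixed-case tags are safer to read with the XML parser.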

### Returning the Contents of Particular Tags

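```python
# find_all returns every <name> tag in the document.
names = soup.find_all('name')

# .text gives the string content of each tag.
for name in names:
    print(name.text)
```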
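The same lookups can be combined to pull each teacher out as one record. This is a rough sketch rather than the tutorial's own method: the `records` list and the conversion of `<age>` to `int` are assumptions added for illustration, and the document is a shortened copy of the one above.

```python
from bs4 import BeautifulSoup

# Shortened copy of the teachers document (illustrative).
xml_doc = """<?xml version="1.0" encoding="UTF-8"?>
<teachers>
    <teacher>
        <name>Sam Davies</name>
        <age>35</age>
        <subject>Maths</subject>
    </teacher>
    <teacher>
        <name>Cassie Stone</name>
        <age>24</age>
        <subject>Science</subject>
    </teacher>
</teachers>
"""

soup = BeautifulSoup(xml_doc, "xml")

records = []
for teacher in soup.find_all("teacher"):
    records.append({
        # teacher.find('name') is used because teacher.name is the
        # tag's own name ('teacher'), not the <name> child.
        "name": teacher.find("name").text,
        "age": int(teacher.find("age").text),
        "subject": teacher.find("subject").text,
    })

for record in records:
    print(record)
```

Searching within each `<teacher>` keeps a name, age, and subject together, which a flat `find_all('name')` over the whole document does not.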

## Parsing XML Tables

[Python Programming Tutorials](https://pythonprogramming.net/tables-xml-scraping-parsing-beautiful-soup-tutorial/)

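```python
import bs4 as bs
import urllib.request

# Download the raw HTML of the example page.
source = urllib.request.urlopen('https://pythonprogramming.net/parsememcparseface/').read()

# 'lxml' selects lxml's HTML parser (the 'xml' feature used earlier is its XML parser).
soup = bs.BeautifulSoup(source, 'lxml')
```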

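```python
# Locate the first <table> element on the page.
table = soup.find('table')

# Collect every <tr> element inside that table.
table_rows = table.find_all('tr')
```

`find_all('tr')` returns all of the rows in the table.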

```python
for tr in table_rows:
    # Gather the data cells of this row and print their text.
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)
```

```text
[]
['Python', '932914021', 'Definitely']
['Pascal', '532', 'Unlikely']
['Lisp', '1522', 'Uncertain']
['D#', '12', 'Possibly']
['Cobol', '3', 'No.']
['Fortran', '52124', 'Yes.']
['Haskell', '24', 'lol.']
```
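The empty list at the top of the output most likely comes from the header row, whose cells would be `<th>` rather than `<td>` and so are not matched by `find_all('td')`. The sketch below shows one way to handle that, using an inline table so it runs without network access. The header labels (`Language`, `Points`, `Verdict`) are placeholders rather than the real page's headers, and `html.parser` is swapped in for `lxml` only to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

# Inline stand-in for the downloaded page; header labels are hypothetical.
html = """
<table>
  <tr><th>Language</th><th>Points</th><th>Verdict</th></tr>
  <tr><td>Python</td><td>932914021</td><td>Definitely</td></tr>
  <tr><td>Pascal</td><td>532</td><td>Unlikely</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

# Header cells are <th>, so they are collected separately.
headers = [th.text for th in table.find_all("th")]

rows = []
for tr in table.find_all("tr"):
    cells = [td.text for td in tr.find_all("td")]
    if cells:  # skip rows with no <td> cells, such as the header row
        rows.append(dict(zip(headers, cells)))

for row in rows:
    print(row)
```

Pairing each row with the headers gives dictionaries that are easier to work with than bare lists, and the `if cells` check drops the empty header row.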
