# **Pandas Working with XML**

In this part, we are going to learn how to:
1. Read an XML file into a DataFrame
2. Convert a DataFrame to XML

What is XML?

1. XML stands for eXtensible Markup Language
2. XML is a markup language much like HTML
3. XML was designed to store and transport data
4. XML was designed to be self-descriptive
5. XML is a W3C Recommendation
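To make "self-descriptive" concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The one-record document is made up for illustration; the point is only that every value travels with the tag or attribute name that describes it.

```python
import xml.etree.ElementTree as ET

# A made-up, single-record document: each value is wrapped in a named tag.
record = ET.fromstring(
    "<book id='bk101'><title>XML Developer's Guide</title><price>44.95</price></book>"
)

# Child elements map naturally to field-name/value pairs.
fields = {child.tag: child.text for child in record}

print(record.tag, record.attrib)
print(fields)
```

Note that every value comes back as text; deciding that `price` is a number is left to the program reading the XML.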

In [34]:
# importing pandas
import pandas as pd

## **Pandas read_xml**

Pandas has a built-in function, read_xml, which reads the records in an XML document and converts them into a DataFrame.

In [35]:
# reading an XML file
xml = pd.read_xml("/content/text.xml")

In [37]:
xml.head()

Unnamed: 0,id,author,title,genre,price,publish_date,description
0,bk101,"Gambardella, Matthew",XML Developer's Guide,Computer,44.95,2000-10-01,An in-depth look at creating applications \n ...
1,bk102,"Ralls, Kim",Midnight Rain,Fantasy,5.95,2000-12-16,"A former architect battles corporate zombies, ..."
2,bk103,"Corets, Eva",Maeve Ascendant,Fantasy,5.95,2000-11-17,After the collapse of a nanotechnology \n ...
3,bk104,"Corets, Eva",Oberon's Legacy,Fantasy,5.95,2001-03-10,"In post-apocalypse England, the mysterious \n ..."
4,bk105,"Corets, Eva",The Sundered Grail,Fantasy,5.95,2001-09-10,"The two daughters of Maeve, half-sisters, \n ..."
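The output above mixes an `id` column with columns such as `author` and `price`. A likely explanation, assuming `id` is stored as an attribute of each record element, is that read_xml collects both attributes and child elements into columns. The sketch below uses a small inline document (hypothetical, modelled on the rows above) and also shows the `parse_dates` parameter, available in recent pandas versions, for turning `publish_date` into real datetimes. `parser="etree"` is used so the example does not depend on lxml being installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-record catalog: `id` is an attribute, the rest are child elements.
catalog_xml = """<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
  </book>
</catalog>"""

books = pd.read_xml(
    StringIO(catalog_xml),
    parser="etree",                 # standard-library parser
    parse_dates=["publish_date"],   # convert this column to datetimes
)

print(books)
print(books.dtypes)
```

Without `parse_dates`, the `publish_date` column would be left as plain strings.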


In [39]:
# reading another xml file
xml = pd.read_xml("/content/sample.xml")
xml

Unnamed: 0,shape,degrees,sides
0,Square,360,4.0
1,Circle,360,
2,Triangle,180,3.0


In [42]:
# another way: define the XML as a string
xml = '''<?xml version="1.0" encoding='utf-8'?>
<data xmlns="http://example.com">
<row>
	<shape>Square</shape>
	<degrees>360</degrees>
	<sides>4.0</sides>
</row>
<row>
	<shape>Circle</shape>
	<degrees>360</degrees>
</row>
<row>
	<shape>Triangle</shape>
	<degrees>180</degrees>
	<sides>3.0</sides>
</row>
</data>
'''

In [44]:
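# newer pandas versions expect literal XML text to be wrapped in StringIO
from io import StringIO
xml = pd.read_xml(StringIO(xml))
xml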

Unnamed: 0,shape,degrees,sides
0,Square,360,4.0
1,Circle,360,
2,Triangle,180,3.0


In [56]:
# an XML string whose elements use a namespace prefix (doc:)
xml = '''<?xml version='1.0' encoding='utf-8'?>
<doc:data xmlns:doc="https://example.com">
  <doc:row>
    <doc:shape>square</doc:shape>
    <doc:degrees>360</doc:degrees>
    <doc:sides>4.0</doc:sides>
  </doc:row>
  <doc:row>
    <doc:shape>circle</doc:shape>
    <doc:degrees>360</doc:degrees>
    <doc:sides/>
  </doc:row>
  <doc:row>
    <doc:shape>triangle</doc:shape>
    <doc:degrees>180</doc:degrees>
    <doc:sides>3.0</doc:sides>
  </doc:row>
</doc:data>'''

In [57]:
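# newer pandas versions expect literal XML text to be wrapped in StringIO
from io import StringIO
pd.read_xml(StringIO(xml))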

Unnamed: 0,shape,degrees,sides
0,square,360,4.0
1,circle,360,
2,triangle,180,3.0


In [58]:
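# select the row elements explicitly with an XPath; the "doc" prefix is mapped to its URI
from io import StringIO
pd.read_xml(StringIO(xml), xpath="//doc:row", namespaces={"doc": "https://example.com"})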

Unnamed: 0,shape,degrees,sides
0,square,360,4.0
1,circle,360,
2,triangle,180,3.0
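Both calls above return the same table because every child of the root is a row. The `xpath` and `namespaces` arguments start to matter when the document also contains elements that are not records. The sketch below adds a hypothetical `doc:meta` element to show this. It also uses a different prefix (`d`) in the `namespaces` mapping, since what has to match is the namespace URI, not the prefix written in the file. `parser="etree"` is an assumption made to avoid the lxml dependency; with that parser the XPath must be relative (`.//`) rather than absolute (`//`).

```python
from io import StringIO

import pandas as pd

# Hypothetical document: two rows plus a non-record <doc:meta> element.
ns_xml = """<doc:data xmlns:doc="https://example.com">
  <doc:row><doc:shape>square</doc:shape><doc:sides>4.0</doc:sides></doc:row>
  <doc:row><doc:shape>triangle</doc:shape><doc:sides>3.0</doc:sides></doc:row>
  <doc:meta><doc:source>demo</doc:source></doc:meta>
</doc:data>"""

rows = pd.read_xml(
    StringIO(ns_xml),
    xpath=".//d:row",                         # only the row elements
    namespaces={"d": "https://example.com"},  # prefix is arbitrary, URI must match
    parser="etree",
)

print(rows)
```

The resulting DataFrame contains only the two rows; the `meta` element is skipped because the XPath does not select it.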


## **Pandas to_xml**

Pandas has a built-in function, to_xml, which writes a DataFrame to an XML file.

In [59]:
# creating an XML string
xml = '''<?xml version='1.0' encoding='utf-8'?>
<doc:data xmlns:doc="https://example.com">
  <doc:row>
    <doc:shape>square</doc:shape>
    <doc:degrees>360</doc:degrees>
    <doc:sides>4.0</doc:sides>
  </doc:row>
  <doc:row>
    <doc:shape>circle</doc:shape>
    <doc:degrees>360</doc:degrees>
    <doc:sides/>
  </doc:row>
  <doc:row>
    <doc:shape>triangle</doc:shape>
    <doc:degrees>180</doc:degrees>
    <doc:sides>3.0</doc:sides>
  </doc:row>
</doc:data>'''

In [60]:
# reading the same XML string (wrapped in StringIO for newer pandas versions)
from io import StringIO
xml = pd.read_xml(StringIO(xml))
xml

Unnamed: 0,shape,degrees,sides
0,square,360,4.0
1,circle,360,
2,triangle,180,3.0


In [61]:
# writing the DataFrame to an XML file
xml.to_xml('test.xml')
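When no path is given, to_xml returns the XML as a string instead of writing a file, which is handy for inspecting the result. The sketch below builds the same shapes table by hand and uses the `index`, `root_name` and `row_name` parameters to control the output; the names `shapes` and `item` are arbitrary choices for illustration, and `parser="etree"` is an assumption made to avoid the lxml dependency.

```python
from io import StringIO

import pandas as pd

shapes = pd.DataFrame(
    {
        "shape": ["square", "circle", "triangle"],
        "degrees": [360, 360, 180],
        "sides": [4.0, None, 3.0],  # None becomes NaN
    }
)

# No path argument: the XML document is returned as a string.
xml_out = shapes.to_xml(
    index=False,         # do not write the DataFrame index as an element
    root_name="shapes",  # name of the root element
    row_name="item",     # name of each record element
    parser="etree",
)
print(xml_out)

# Reading the string back gives the same rows, with the missing value preserved.
restored = pd.read_xml(StringIO(xml_out), parser="etree")
print(restored)
```

By default the index is written as an `index` element in every row, so `index=False` is usually what you want when the index carries no information.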