# 5.1 XML Processing

SQL databases are useful for storing large amounts of data. However, sometimes we want to store data in a simpler structure, for example as a format for **data interchange**. We have already seen how JSON can do this, since it is independent of programming language and platform. XML (Extensible Markup Language) is another such data format. Two of the libraries for working with XML in the PSL are **SAX** and **DOM**. We start with the SAX module,

In [2]:
import xml.sax

SAX is useful when we are dealing with large XML files and/or want to use a limited amount of RAM, because it reads the document as a stream instead of loading all of it into memory. With SAX we can only read XML; we cannot modify it.
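To make the streaming idea concrete, here is a minimal sketch in which the parser is handed the document in pieces rather than all at once. The `TagCounter` class and the three text chunks are made up for illustration; handler classes are explained in the next cell.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler that only counts opening tags."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


counter = TagCounter()
parser = xml.sax.make_parser()
parser.setContentHandler(counter)

# Illustrative chunks, standing in for pieces read from a large file.
# Note that a chunk boundary may fall in the middle of a tag's text.
chunks = ["<catalog><book><title>Mid", "night Rain</title></book>", "</catalog>"]

for chunk in chunks:
    parser.feed(chunk)  # only the current chunk has to be in memory
parser.close()          # signals the end of the document

print(counter.count)
```

Only the handler's own state (here a single counter) is kept between chunks, which is why memory use stays small regardless of how large the file is.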

In [24]:
class Handler(xml.sax.ContentHandler):
    """We must create our own handler which executes our desired code when the SAX
    parser parses an XML tag. In doing so, we must override existing methods."""

    def startElement(self, name, attrs):
        """The startElement method is called when the starting tag of an element is parsed.
        Here, we are overriding it with our own code."""
        
        print(f"Opening Tag: {name}")

    def endElement(self, name):
        """The endElement method is called when the end tag of an element is parsed. Here,
        we are overriding it with our own code."""
        print(f"Closing Tag: {name}")

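    def characters(self, content):
        """The characters method is called with the text found between tags. The parser
        may deliver this text in several pieces, including whitespace-only ones."""
        print(f"Content: {content}")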

# We create our handler object,
handler = Handler()

# Then we create our SAX parser and set its handler,
parser = xml.sax.make_parser()
parser.setContentHandler(handler)

# Finally, we must call the parse() method to begin parsing the XML file,
parser.parse("books.xml")

Opening Tag: catalog
Content: 

Content:    
Opening Tag: book
Content: 

Content:       
Opening Tag: author
Content: Gambardella, Matthew
Closing Tag: author
Content: 

Content:       
Opening Tag: title
Content: XML Developer's Guide
Closing Tag: title
Content: 

Content:       
Opening Tag: genre
Content: Computer
Closing Tag: genre
Content: 

Content:       
Opening Tag: price
Content: 44.95
Closing Tag: price
Content: 

Content:       
Opening Tag: publish_date
Content: 2000-10-01
Closing Tag: publish_date
Content: 

Content:       
Opening Tag: description
Content: An in-depth look at creating applications 
Content: 

Content:       with XML.
Closing Tag: description
Content: 

Content:    
Closing Tag: book
Content: 

Content:    
Opening Tag: book
Content: 

Content:       
Opening Tag: author
Content: Ralls, Kim
Closing Tag: author
Content: 

Content:       
Opening Tag: title
Content: Midnight Rain
Closing Tag: title
Content: 

Content:       
Opening Tag: genre
Content: Fan
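The output above contains many `Content:` lines that hold only whitespace. These come from the line breaks and indentation between tags, and a single piece of text can also arrive split across several `characters()` calls (as with the description). One common way to deal with both, sketched here, is to buffer the text and only look at it when the closing tag arrives. The `TextHandler` class and the `SAMPLE` string are illustrative assumptions: `SAMPLE` is a cut-down document shaped like the output above, not the actual contents of `books.xml`.

```python
import xml.sax

# Assumed sample data, modelled on the structure seen in the output above.
SAMPLE = """<catalog>
   <book>
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
   </book>
</catalog>"""


class TextHandler(xml.sax.ContentHandler):
    """Hypothetical handler that collects (tag, text) pairs and skips blank text."""

    def __init__(self):
        super().__init__()
        self.buffer = []   # pieces of text seen since the last tag
        self.records = []  # (tag name, text) pairs for non-empty elements

    def startElement(self, name, attrs):
        # A new element starts, so discard whitespace collected before it.
        self.buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so we only store it here.
        self.buffer.append(content)

    def endElement(self, name):
        # Join the pieces and ignore elements that hold only whitespace.
        text = "".join(self.buffer).strip()
        if text:
            self.records.append((name, text))
        self.buffer = []


handler = TextHandler()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)
print(handler.records)
```

With this sample, only the `author` and `title` elements produce records, since `catalog` and `book` contain nothing but whitespace once their child elements are set aside.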

In [None]:
class Handler(xml.sax.ContentHandler):
    """We must create our own handler which executes our desired code when the SAX
    parser parses an XML tag. In doing so, we must override existing methods."""

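    def __init__(self):
        # Let the base ContentHandler set itself up before adding our own state.
        super().__init__()
        self.current_tag = None
        self.first_tag = None
        self.READfirst_tag = False

    def startElement(self, name, attrs):
        """The startElement method is called when the starting tag of an element is parsed.
        Here, we are overriding it with our own code."""

        # Remember the first tag we see, which is the root element of the document.
        if self.READfirst_tag is False:
            self.first_tag = name
            self.READfirst_tag = True

        print(f"Opening Tag: {name}")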

    def endElement(self, name):
        """The endElement method is called when the end tag of an element is parsed. Here,
        we are overriding it with our own code."""
        print(f"Closing Tag: {name}")

    def characters(self, content):
        print(f"Content: {content}")

# We create our handler object,
handler = Handler()

# Then we create our SAX parser and set its handler,
parser = xml.sax.make_parser()
parser.setContentHandler(handler)

# Finally, we must call the