# Web Services and Applications

## Topic02 Representing Data (XML and JSON)

###### **XML:** https://realpython.com/python-xml-parser/#xmldomminidom-minimal-dom-implementation <br> **JSON:** https://www.w3schools.com/js/js_json_intro.asp

DOM (Document Object Model):

The DOM (Document Object Model) is a programming interface for HTML and XML documents. It represents the structure and content of a document as a tree of objects, where each node corresponds to one part of the document.

+ Node types: document node, element nodes, text nodes, attribute nodes, etc.
+ Hierarchy: nodes have parent-child relationships, which form the tree structure.

> Python module for parsing and navigating XML DOM trees: xml.dom.minidom.<br> ref: https://docs.python.org/3/library/xml.dom.minidom.html
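The node types and parent-child links described above can be inspected directly with `xml.dom.minidom`. The following is a minimal sketch; the one-line XML string is a made-up sample, not the contents of any file used in these notes.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, used only for illustration.
xml_text = (
    '<Company><Employee category="Technical">'
    '<FirstName>Joe</FirstName></Employee></Company>'
)
doc = parseString(xml_text)

root = doc.documentElement        # the <Company> element node
employee = root.firstChild        # the <Employee> element node
first_name = employee.firstChild  # the <FirstName> element node
text = first_name.firstChild      # the text node holding "Joe"
attr = employee.getAttributeNode("category")  # the attribute node

# Every node carries a nodeType constant that identifies its kind.
print(doc.nodeType == doc.DOCUMENT_NODE)
print(root.nodeType == root.ELEMENT_NODE)
print(text.nodeType == text.TEXT_NODE)
print(attr.nodeType == attr.ATTRIBUTE_NODE)

# Parent-child links can be followed in both directions.
print(text.parentNode.tagName)
print(employee.parentNode.tagName)
```

Because the sample has no whitespace between tags, `firstChild` lands on an element each time; in an indented file the first child is often a whitespace-only text node instead.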


#### Reading in XML data from Local file

In [4]:
# for reading a local XML file
from xml.dom.minidom import parse # parse() builds a DOM from an XML file
filename = "employees.xml" # set file path
# there are two ways to read in the data
# alternative 1 - direct parsing: pass the file path straight to parse()
doc = parse(filename)
# alternative 2 - parsing via a file object: open the file in a *with* statement and pass the file object (fp) to parse()
# The with statement closes the file automatically when the block ends,
# even if an error occurs while reading or parsing
with open(filename) as fp:
    doc=parse(fp)

# check result
print(doc.toprettyxml(), end='')


<?xml version="1.0" ?>
<Company>
	
	
	<Employee category="Technical">
		
		
		<FirstName>
			Joe
		</FirstName>
		
		
		<LastName>
			Murphy
		</LastName>
		
		
		<ContactNo>
			1234567890
		</ContactNo>
		
	
	</Employee>
	
	
	<Employee category="Non-Technical">
		
		
		<FirstName>
			Mary
		</FirstName>
		
		
		<LastName>
			Martin
		</LastName>
		
		
		<ContactNo>
			1234667898
		</ContactNo>
		
	
	</Employee>
	

</Company>
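The blank lines in the output above most likely come from whitespace-only text nodes: the indentation already present in the file is kept as text nodes, and `toprettyxml()` then adds its own indentation on top. A minimal sketch of one way to remove them before printing follows; `strip_whitespace_nodes` is a hypothetical helper written for this note, not part of `xml.dom.minidom`, and the sample string stands in for the file.

```python
from xml.dom.minidom import parseString

# Hypothetical indented sample standing in for employees.xml.
xml_text = """<Company>
    <Employee category="Technical">
        <FirstName>Joe</FirstName>
    </Employee>
</Company>"""


def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Iterate over a copy because children are removed during the loop.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)


doc = parseString(xml_text)
strip_whitespace_nodes(doc)
pretty = doc.toprettyxml(indent="  ")
print(pretty, end="")
```

This only drops text nodes that are entirely whitespace, so values such as `Joe` are left untouched.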


#### Reading in XML from cloud / online source

In [10]:
# for reading XML from an online source
import requests # requests library is used to make HTTP requests to web servers.
from xml.dom.minidom import parseString # parseString function parses XML data held in memory instead of in a file

url="http://api.irishrail.ie/realtime/realtime.asmx/getCurrentTrainsXML" # define URL
page= requests.get(url) # sends a GET request to the specified URL and stores the response in 'page'
doc= parseString(page.content) # page.content is the raw response body (bytes); parseString turns it into a DOM (Document Object Model) structure

# print results
print(doc.toprettyxml(), end='')

<?xml version="1.0" ?>
<ArrayOfObjTrainPositions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://api.irishrail.ie/realtime/">
	
  
	<objTrainPositions>
		
    
		<TrainStatus>N</TrainStatus>
		
    
		<TrainLatitude>51.9018</TrainLatitude>
		
    
		<TrainLongitude>-8.4582</TrainLongitude>
		
    
		<TrainCode>B211</TrainCode>
		
    
		<TrainDate>03 Feb 2024</TrainDate>
		
    
		<PublicMessage>B211\nCork to Mallow\nExpected Departure 10:00</PublicMessage>
		
    
		<Direction>To Mallow</Direction>
		
  
	</objTrainPositions>
	
  
	<objTrainPositions>
		
    
		<TrainStatus>N</TrainStatus>
		
    
		<TrainLatitude>51.9018</TrainLatitude>
		
    
		<TrainLongitude>-8.4582</TrainLongitude>
		
    
		<TrainCode>B507</TrainCode>
		
    
		<TrainDate>03 Feb 2024</TrainDate>
		
    
		<PublicMessage>B507\nCork to Cobh\nExpected Departure 10:00</PublicMessage>
		
    
		<Direction>To Cobh</Direction>
		
  
	</objTrainPositions>
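Once the response is parsed, individual fields can be pulled out by tag name in the same way as for a local file. The sketch below uses a trimmed, hard-coded sample in the shape of the response above, so that it runs without network access; `text_of` is a hypothetical helper introduced here for illustration.

```python
from xml.dom.minidom import parseString

# Trimmed sample in the shape of the response shown above (not live data).
sample = b"""<?xml version="1.0" encoding="utf-8"?>
<ArrayOfObjTrainPositions xmlns="http://api.irishrail.ie/realtime/">
  <objTrainPositions>
    <TrainCode>B211</TrainCode>
    <Direction>To Mallow</Direction>
  </objTrainPositions>
  <objTrainPositions>
    <TrainCode>B507</TrainCode>
    <Direction>To Cobh</Direction>
  </objTrainPositions>
</ArrayOfObjTrainPositions>"""


def text_of(parent, tag):
    """Return the stripped text of the first <tag> under parent, or None."""
    nodes = parent.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return None
    return nodes[0].firstChild.nodeValue.strip()


doc = parseString(sample)
trains = [
    (text_of(train, "TrainCode"), text_of(train, "Direction"))
    for train in doc.getElementsByTagName("objTrainPositions")
]
print(trains)
```

With live data, `page.content` would take the place of `sample`. Passing `timeout=` to `requests.get` and calling `page.raise_for_status()` before parsing are optional safeguards against a hanging request or an HTTP error page.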

#### Access data in XML (from local file)

In [16]:
# Get employee name by tags

# load module
from xml.dom.minidom import parse
# set file path
filename = "employees.xml"
#read in XML
doc = parse(filename)
#print(doc.toprettyxml(), end='')

employeeNodelist=doc.getElementsByTagName("Employee") #get every Employee element in the document

for employeeNode in employeeNodelist: #iterate through each Employee element
    firstNameNode = employeeNode.getElementsByTagName("FirstName").item(0) #first item with a FirstName tag inside current employee
    firstName = firstNameNode.firstChild.nodeValue.strip() #value inside FirstName tag without surrounding whitespace
    lastNameNode = employeeNode.getElementsByTagName("LastName").item(0) #first item with a LastName tag inside current employee
    lastName = lastNameNode.firstChild.nodeValue.strip() #value inside LastName tag without surrounding whitespace
    print(firstName, lastName)
#print source XML for reference
print("\nsource XML file structure\n")
print(doc.toprettyxml(), end='')
    

Joe Murphy
Mary Martin

source XML file structure

<?xml version="1.0" ?>
<Company>
	
	
	<Employee category="Technical">
		
		
		<FirstName>
			Joe
		</FirstName>
		
		
		<LastName>
			Murphy
		</LastName>
		
		
		<ContactNo>
			1234567890
		</ContactNo>
		
	
	</Employee>
	
	
	<Employee category="Non-Technical">
		
		
		<FirstName>
			Mary
		</FirstName>
		
		
		<LastName>
			Martin
		</LastName>
		
		
		<ContactNo>
			1234667898
		</ContactNo>
		
	
	</Employee>
	

</Company>
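The `category` attribute on each `Employee` element can be read with `getAttribute`, alongside the tag-based lookups above. A minimal sketch, using an inline copy of the employee data so that it runs without the file:

```python
from xml.dom.minidom import parseString

# Inline copy of the employee data shown above, so no file is needed.
sample = """<Company>
  <Employee category="Technical">
    <FirstName>Joe</FirstName><LastName>Murphy</LastName>
  </Employee>
  <Employee category="Non-Technical">
    <FirstName>Mary</FirstName><LastName>Martin</LastName>
  </Employee>
</Company>"""

doc = parseString(sample)
by_category = {}
for emp in doc.getElementsByTagName("Employee"):
    # getAttribute returns an empty string if the attribute is missing.
    category = emp.getAttribute("category")
    first = emp.getElementsByTagName("FirstName")[0].firstChild.nodeValue.strip()
    last = emp.getElementsByTagName("LastName")[0].firstChild.nodeValue.strip()
    by_category.setdefault(category, []).append(f"{first} {last}")

print(by_category)
```

Grouping into a dictionary is just one choice for this sketch; the same loop could equally print each name with its category.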


## JSON
JSON (JavaScript Object Notation) is a lightweight data interchange format used to transmit data between a server and a web application, in configuration files, and more. It is human-readable, easy to parse, and usually more concise than XML.

**Key differences between XML and JSON:**
+ Format:
    + XML: Uses tags to define data elements and structure.
    + JSON: Uses key-value pairs to define data elements and structure.
+ Readability:
    + XML: Can be verbose and less readable due to opening and closing tags and attributes.
    + JSON: Typically more concise and easier to read due to its simple syntax.
+ Data Types:
    + XML: Stores all values as text; typed data needs a schema or conversion in code. Supports complex nested structures, attributes, and namespaces.
    + JSON: Has built-in primitive types (strings, numbers, booleans, null) plus arrays and objects.
+ Parsing:
    + XML: Requires an XML parser (for example xml.dom.minidom in Python).
    + JSON: Parsing is straightforward and can be done with built-in functions in most programming languages.
+ Usage:
    + XML: Commonly used in web services, document storage, and configuration files.
    + JSON: Widely used for web APIs, AJAX requests, and exchanging data between web servers and clients due to its simplicity and efficiency.
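The data-type and parsing differences above can be seen by reading the same record in both formats. This is a minimal sketch with a made-up record; the JSON key names are assumptions chosen to mirror the XML tags.

```python
import json
from xml.dom.minidom import parseString

# The same hypothetical record expressed in both formats.
xml_text = (
    '<Employee category="Technical">'
    "<FirstName>Joe</FirstName><ContactNo>1234567890</ContactNo>"
    "</Employee>"
)
json_text = '{"category": "Technical", "firstName": "Joe", "contactNo": 1234567890}'

# XML: walk the DOM to reach the value, which always arrives as text.
elem = parseString(xml_text).documentElement
xml_contact = elem.getElementsByTagName("ContactNo")[0].firstChild.nodeValue

# JSON: one call returns a dictionary, and numbers keep their type.
json_contact = json.loads(json_text)["contactNo"]

print(type(xml_contact).__name__, xml_contact)
print(type(json_contact).__name__, json_contact)
```

The XML value is a `str` that would need an explicit `int(...)` conversion, while the JSON value is already an `int`.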

### Read in JSON from local file

In [27]:
import json # json module provides functions for encoding and decoding JSON data
filename = "wsaa2.4-json.json" # set file path

with open(filename, "r") as fp: # open the JSON file in read mode using a *with* statement, so the file is closed when the block ends, even if there is an error
    jsonobject = json.load(fp)
#print (jsonobject)
#this is the structure:
'''
{'employees': 
    [
    {'firstName': 'John', 'lastName': 'Doe'},
    {'firstName': 'Anna', 'lastName': 'Smith'}, 
    {'firstName': 'Peter', 'lastName': 'Jones'}
    ]
}
'''
for employee in jsonobject["employees"]:
    print(employee["firstName"],employee["lastName"])

John Doe
Anna Smith
Peter Jones
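`json.load` reads from a file object, while `json.loads` does the same for a string already in memory. The sketch below uses a string so it runs without the file; the second employee deliberately lacks a `lastName` (a made-up case) to show `dict.get` supplying a default instead of raising `KeyError`.

```python
import json

# Hypothetical JSON text; the second entry is missing "lastName" on purpose.
text = '{"employees": [{"firstName": "John", "lastName": "Doe"}, {"firstName": "Anna"}]}'

data = json.loads(text)  # loads() parses a string, load() parses a file object

names = [
    f'{emp["firstName"]} {emp.get("lastName", "(unknown)")}'
    for emp in data["employees"]
]
print(names)

# dumps() is the reverse direction; indent makes the output readable.
print(json.dumps(data, indent=2))
```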


### Read in JSON from online source / cloud

In [31]:
import requests
import json

url = "https://api.coindesk.com/v1/bpi/currentprice.json"
response = requests.get(url) # sends an HTTP GET request to the specified URL and stores the response
data = response.json() # parse the JSON response body into a python dictionary
# save the result to a json file:
with open ("bitcoindump.json", "w") as fp:
    json.dump(data, fp)

#extract data:
bpi = data["bpi"] # "bpi" (Bitcoin Price Index) is a top-level key; its value is a nested dictionary with one entry per currency
#print(bpi)
rate = bpi["EUR"]["rate"] # this follows the path "bpi" -> "EUR" -> "rate"
print(rate)

39,830.742


In [None]:

# this is the full JSON structure returned by the API; response.json() parses it into a python dictionary, and the "bpi" key holds the Bitcoin Price Index data
'''
{
    "time": {
        "updated": "Feb 3, 2024 10:27:01 UTC",
        "updatedISO": "2024-02-03T10:27:01+00:00",
        "updateduk": "Feb 3, 2024 at 10:27 GMT"
    },
    "disclaimer": "This data was produced from the CoinDesk Bitcoin Price Index (USD). Non-USD currency data converted using hourly conversion rate from openexchangerates.org",
    "chartName": "Bitcoin",
    "bpi": {
        "USD": {
            "code": "USD",
            "symbol": "&#36;",
            "rate": "43,111.003",
            "description": "United States Dollar",
            "rate_float": 43111.0034
        },
        "GBP": {
            "code": "GBP",
            "symbol": "&pound;",
            "rate": "34,125.722",
            "description": "British Pound Sterling",
            "rate_float": 34125.7218
        },
        "EUR": {
            "code": "EUR",
            "symbol": "&euro;",
            "rate": "39,922.945",
            "description": "Euro",
            "rate_float": 39922.9447
        }
    }
}
'''
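In the structure above, `rate` is a string with a thousands separator, while `rate_float` is already a number. A minimal sketch of getting a numeric value either way follows, using a hard-coded excerpt of the structure rather than a live request.

```python
from decimal import Decimal

# Hard-coded excerpt of the structure shown above (not a live response).
sample = {
    "bpi": {
        "EUR": {"code": "EUR", "rate": "39,922.945", "rate_float": 39922.9447}
    }
}

eur = sample["bpi"]["EUR"]

# Option 1: strip the thousands separator and convert the string.
rate_decimal = Decimal(eur["rate"].replace(",", ""))

# Option 2: use the numeric field that is already provided.
rate_float = eur["rate_float"]

print(rate_decimal, rate_float)
```

`Decimal` is used here as one reasonable choice for currency values; `float(...)` on the cleaned string would also work where exact decimal arithmetic is not needed.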