# Parsing XML and JSON

## Practice Programming Assignment

In this assignment we are provided with an XML file and a JSON file. The files contain stock price data from the Moscow Stock Exchange. We need to inspect the data to answer some questions.

### Part 1. XML

We are provided with the file `securities.xml`. Explore it to answer the questions.

**Question 1.** How many elements are in the XML file?

In [1]:
import xml.etree.ElementTree as ET

xml_from_file = ET.parse('securities.xml')
root = xml_from_file.getroot()

In [2]:
answer_part_1 = len(root)
answer_part_1

2

<br>

**Question 2.** How many XML elements named 'row' are in the XML-file? 

In [3]:
# Count every <row> element anywhere in the tree.
answer_part_2 = sum(1 for _ in root.iter('row'))
answer_part_2

55

<br>

**Question 3.** What is the height of the file's XML tree? 

<br>

*Note:* By the height of the tree we mean the length of the longest sequence of nodes from the root element to a leaf element. For example, consider the following XML:

```
<root>
    <element_1>
        <some_element></some_element>
    </element_1>
    <element_2></element_2>
</root>
```

The height of the tree here is 3, since there are two sequences from root to leaf:

1. `<root>` - `<element_1>` - `<some_element>`
2. `<root>` - `<element_2>`

The first sequence is the longest, and its length is 3.

In [4]:
answer_part_3 = 5
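
The value above is entered by hand. As a cross-check, the height defined in the note could also be computed recursively. The following is a minimal sketch, not part of the original solution: `tree_height` is a hypothetical helper name, and it is demonstrated on the small example XML from the note rather than on `securities.xml`.

```python
import xml.etree.ElementTree as ET


def tree_height(element):
    """Number of nodes on the longest path from `element` down to a leaf."""
    # A leaf has no children, so max(..., default=0) gives it height 1.
    return 1 + max((tree_height(child) for child in element), default=0)


# The example tree from the note above.
example = ET.fromstring(
    "<root>"
    "<element_1><some_element></some_element></element_1>"
    "<element_2></element_2>"
    "</root>"
)
print(tree_height(example))  # 3
```

Calling `tree_height(root)` on the parsed file should give the height under the same definition.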

<br>

Each `row` element stores data about one stock in its attribute values. For example, the attributes CLOSE and OPEN are the stock's closing and opening prices for the day, respectively. The attribute VOLUME is the stock's total trade volume on that day.

**Question 4.** What is the average difference between the CLOSE and OPEN prices among all stocks present?

*Note:* If a stock doesn't have data about its CLOSE and OPEN prices, skip it.

In [5]:
count = 0
total = 0
for row in root.iter('row'):
    try:
        total += float(row.attrib['CLOSE']) - float(row.attrib['OPEN'])
        count += 1
    except (KeyError, ValueError):
        # CLOSE or OPEN is missing or not a number: skip this stock.
        continue

answer_part_4 = total / count
answer_part_4

-81.72016052499997

<br>

**Question 5.** What is the largest VOLUME value among all stocks present?

In [6]:
volumes = []
for row in root.iter('row'):
    try:
        volumes.append(float(row.attrib['VOLUME']))
    except (KeyError, ValueError):
        # VOLUME is missing or not a number: skip this stock.
        continue

answer_part_5 = max(volumes)
answer_part_5

63615300000.0

### Part 2. JSON

You are provided with the file `securities.json`. It also contains information about stocks, but it has a slightly different structure. Explore it to answer the questions.

*Note:* The `data` element in the file contains rows of data values. To see what each value in a row means, check the column names in `securities['history']['columns']`.
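
To illustrate the note, here is a small sketch of how column names can be matched to positions in a data row. The dictionary `securities_example` is a made-up miniature with hypothetical values; the real file has more columns and rows, and only its `history` / `columns` / `data` layout is assumed here.

```python
# Hypothetical miniature of the file's layout (not the real data).
securities_example = {
    "history": {
        "columns": ["BOARDID", "SECID", "HIGH", "LOW"],
        "data": [
            ["SNDX", "AAA", 12.5, 10.0],
            ["RTSI", "BBB", None, None],
        ],
    }
}

history = securities_example["history"]

# Map each column name to its position inside a data row.
column_index = {name: pos for pos, name in enumerate(history["columns"])}

# Alternatively, turn every positional row into a dict keyed by column name.
records = [dict(zip(history["columns"], row)) for row in history["data"]]

print(column_index["HIGH"])  # 2
print(records[0]["LOW"])     # 10.0
```

Looking values up by name this way avoids relying on hard-coded positions in later cells.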

<br>



**Question 6.** What is the height of the file's JSON tree? 

*Note:* By the height of the tree we mean the length of the longest sequence of nodes from the root element to a leaf element (similar to the height of an XML tree defined in Question 3).

In [7]:
import json
with open('securities.json') as f:
    securities = json.load(f)

# securities['history']['columns']
answer_part_6 = 4
answer_part_6

4
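
The answer above is entered by hand. For comparison, here is a sketch of one way to measure the height of a JSON structure programmatically. It assumes a specific counting convention that the assignment does not spell out: every dict, list, and scalar value counts as one node, and dict keys are not nodes of their own. Under a different convention the number can differ, so this is not necessarily the convention behind the answer above. `json_height` and `toy` are hypothetical names.

```python
def json_height(node):
    """Longest root-to-leaf path, counting each dict, list and scalar as one node."""
    if isinstance(node, dict):
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return 1  # scalars (str, number, bool, None) are leaves
    # Empty containers are leaves as well, hence default=0.
    return 1 + max((json_height(child) for child in children), default=0)


toy = {"a": {"b": [1, 2]}, "c": 3}
print(json_height(toy))  # 4: dict -> dict -> list -> number
```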

<br>

**Question 7.** How many branches does the `data` element have?

In [8]:
answer_part_7 = len(securities['history']['data'])
answer_part_7

63

<br>

**Question 8.** What is the average difference between the HIGH and LOW prices?

*Note:* If a stock doesn't have data about its HIGH and LOW prices, skip it.

In [9]:
count = 0
total = 0

# Indices 7 and 8 are used as the HIGH and LOW positions in each data row.
for row in securities['history']['data']:
    if row[7] is not None and row[8] is not None:
        total += float(row[7]) - float(row[8])
        count += 1

answer_part_8 = total / count
answer_part_8

15.93157894736842

<br>

**Question 9.** How many unique values of BOARDID do we see in the data?

In [10]:
# Index 0 is used as the BOARDID position in each data row.
answer_part_9 = len({row[0] for row in securities['history']['data']})
answer_part_9

2

<br>

**Question 10.** Which value of BOARDID occurs most often?

In [11]:

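SNDX = 0
RTSI = 0

# Only two BOARDID values occur (see Question 9), so anything that is not SNDX is counted as RTSI.
for row in securities['history']['data']:
    if row[0] == 'SNDX':
        SNDX += 1
    else:
        RTSI += 1
print(SNDX)
print(RTSI)
answer_part_10 = 'SNDX' if SNDX > RTSI else 'RTSI'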

33
30
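
The cell above relies on knowing the two board names in advance. A more general sketch uses `collections.Counter`, which needs no hard-coded names. The `rows` list below is hypothetical sample data standing in for `securities['history']['data']`, with BOARDID assumed to be at index 0 as in the cells above.

```python
from collections import Counter

# Hypothetical rows; only the first value (BOARDID) matters here.
rows = [["SNDX", 1], ["RTSI", 2], ["SNDX", 3]]

# Count how often each BOARDID appears.
board_counts = Counter(row[0] for row in rows)

# most_common(1) returns a list with the single most frequent (value, count) pair.
most_common_board, occurrences = board_counts.most_common(1)[0]
print(most_common_board, occurrences)  # SNDX 2
```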
