# Lab 7:Reading and writing TXT, CSV, JSON, XML and YAML with Python

### Author: <font color='red'>Aliyah Lewis</font>

In [1]:
# Import necessary libraries for this lab
import os
import csv

## Part A - Reading and parsing a TXT file (10 points)

<div class="alert alert-warning">
    <strong>IMPORTANT: </strong>Make sure file <strong>charlotte_hornets_stats.txt</strong> is located in the current directory.
</div>

### File "charlotte_hornets_stats.txt" layout

This file has __15 columns__ of data. The column headings are: 

    Player,POS,GP,GS,MIN,PTS,OR,DR,REB,AST,STL,BLK,TO,PF,AST/TO
    
    - Each field is separated by the '#' character
    - Use Python's "with open() as <filehandle>:" syntax to open the file
    - Save each line as an entry in a Python LIST


__NOTE:__ The data used to create this file was exported from the following website: 

> Charlotte Hornets Stats 2022-23 (ESPN webpage) 
>> URL: https://www.espn.com/nba/team/stats/_/name/cha/table/game/sort/gamesPlayed/dir/desc


### Read a TXT file 'charlotte_hornets_stats.txt' using Python's 'with open() ...' function

Python files are read using the Python's open() function<br>

REFERENCES: Information about the __open()__ library can be found here:<br>
> REF: https://www.w3schools.com/python/python_file_open.asp <br>
> REF: https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files <br>

<strong>HOWEVER</strong>, a better method for opening files in Python is to use the __with open() as _file___ syntax, <br>where ___file___ references the file name/handle <br><br>
Using __with open() as _file___ syntax eliminates the need to __close()__ the file.<br>

> REF: https://www.geeksforgeeks.org/with-statement-in-python/

You MUST use the __with open() as _file___ syntax when reading the <b>charlotte_hornets_stats.txt</b> file.

<b>REMINDER:</b> Python has multiple <b>read()</b> methods:
- _file_.read() ~ The read() method by default returns the whole file (text)
  
- _file_.readline() ~ The readline() method returns one line of the file 

> REF: https://www.w3schools.com/python/python_file_open.asp

<div class="alert alert-block alert-danger">
    <strong>REMINDER: </strong>Make sure you use the correct <b>with open() as XXXX</b> syntax to open &amp; read the <b>charlotte_hornets_stats.txt</b> file!!!
</div>

<span style="color:blue">
Read the <b>charlotte_hornets_stats.txt</b> file into a list ...<br><br>
    1. Create an empty list with the variable name <b>hornets_stats</b> <br>
    2. Read file <b>charlotte_hornets_stats.txt</b> using Python's <b>with open() as <i>file</i></b> syntax. <br>
    -- Read the file <b>one line at a time</b><br>
    -- Use <b>with open() as <i>file</i></b> to eliminate having to <b>close()</b> the file.<br>
    -- HINT: Remember to <b>strip()</b> all whitespace<br>
    3. Parse each line using the <b>.split()</b> method <br>
    -- HINT: Refer to file format information to determine how to split the lines.<br>
    4. Save the results of the line parsing in a variable named <b>player_stats</b> <br>
    5. Add(append) <b>player_stats</b> to the <b>hornets_stats</b> list.
</span>

In [6]:
# Read a TXT file 'charlotte_hornets.txt' using Python's 'with open() as XXXX' command

# INSERT CODE FOR STEPS 1-5
hornets_stats = []
with open('charlotte_hornets_stats.txt', 'r') as file:
    for line in file:
        player_stats = line.strip().split(',')
        hornets_stats.append(player_stats)

In [7]:
# DO NOT MODIFY !!!
print('PART A')
print('STEPS 1-5 - hornets_stats LIST')
print(f'Lines in hornets_stats = {len(hornets_stats)}\n')
# Print each line in hornets_stats list
for line in hornets_stats:
    print(line)

PART A
STEPS 1-5 - hornets_stats LIST
Lines in hornets_stats = 18

['NAME              #POS#GP#GS#MIN#PTS#OR#DR#REB#AST#STL#BLK#TO#PF#AST/TO']
['P.J. Washington   #PF #59#59#32.9#15.2#1.0#3.7#4.7#2.4#0.9#1.1#1.4#2.6#1.7']
['Mason Plumlee     #C  #56#56#28.5#12.2#3.3#6.3#9.7#3.7#0.6#0.6#1.6#2.9#2.4']
['Jalen McDaniels   #PF #56#21#26.7#10.6#0.8#4.0#4.8#2.0#1.2#0.5#1.4#2.8#1.4']
['Terry Rozier      #SG #49#49#35.6#21.8#1.0#3.3#4.3#4.9#1.1#0.3#2.1#1.9#2.3']
['JT Thor           #F  #47#0#10.8#2.6#0.5#1.2#1.7#0.4#0.2#0.2#0.5#0.8#0.8']
['Nick Richards     #C  #45#0#17.3#8.0#2.5#3.3#5.8#0.5#0.2#1.0#0.9#2.3#0.6']
['Dennis Smith Jr.  #PG #38#12#25.0#8.4#0.4#2.3#2.8#4.4#1.4#0.4#1.6#2.2#2.8']
['Theo Maledon      #PG #36#1#16.1#5.6#0.3#2.1#2.4#2.4#0.8#0.2#1.0#1.0#2.5']
['Kelly Oubre Jr.   #SG #35#30#32.6#20.2#1.5#3.6#5.1#1.2#1.6#0.4#1.3#3.0#1.0']
['Gordon Hayward    #SF #34#34#31.1#13.9#0.7#3.5#4.2#3.6#0.9#0.2#1.9#1.3#2.0']
['LaMelo Ball       #PG #33#33#35.4#23.3#1.2#5.1#6.3#8.4#1.3#0.3#3.5#3.4#2

## Part B - Reading and writing a CSV file (20 points)

<div class="alert alert-warning">
<strong>IMPORTANT:</strong> You will be using the <strong>hornets_stats</strong> LIST created in Part A to write a new CSV file named <strong>charlotte_hornets_stats.csv</strong>. 
<br>Be sure you get Part A working correctly before starting Part B.<br><br>
    <b>VERIFICATION INFORMATION</b>: You should have <b>18</b> lines in the CSV file and the first <b>3</b> lines should look like this:<br>(with trailing spaces in NAME &amp; POS columns):<br><br>
['NAME              ', 'POS', 'GP', 'GS', 'MIN', 'PTS', 'OR', 'DR', 'REB', 'AST', 'STL', 'BLK', 'TO', 'PF', 'AST/TO']<br>
['P.J. Washington   ', 'PF ', '59', '59', '32.9', '15.2', '1.0', '3.7', '4.7', '2.4', '0.9', '1.1', '1.4', '2.6', '1.7']<br>
['Mason Plumlee     ', 'C  ', '56', '56', '28.5', '12.2', '3.3', '6.3', '9.7', '3.7', '0.6', '0.6', '1.6', '2.9', '2.4']<br>
</div>

### Write a CSV file 'charlotte_hornets_stats.csv' using Python's csv.writer() function

CSV files are plain text files used to store data in a tabular format, with each piece of data separated by a comma (,)<br>
Python has a __csv__ library that allows you to read, parse and write CSV files.<br>

SYNTAX:<br>
* csv.writer((_csvfile, dialect='excel', \*\*fmtparams_) ~ returns a writer object responsible for converting the user's data into delimited strings <br>
    * csvwriter.writerow(row) ~ Writes the row parameter to the writer’s file object
    * csvwriter.writerows(rows) ~ Write all elements in rows (an iterable of row objects) to the writer’s file object
    

REFERENCES:  Information about the __csv__ library can be found here: <br> 
>REF: https://docs.python.org/3/library/csv.html <br>
>REF: https://www.geeksforgeeks.org/writing-csv-files-in-python/ <br>
>REF: https://docs.python.org/3/library/csv.html
(shows use of newline=' ' parameter) <br>

<span style="color:blue">
    1. Open a file named <strong>charlotte_hornets_stats.csv</strong> using <b>with open() as <i>file</i></b> <br>
    2. Use <b>csv.writer(...)</b> to get a "writer object" <br>
    3. Iterate thru the <b>hornets_stats</b> list using <b>.writerow(...)</b> to create the new CSV file named <b>charlotte_hornets_stats.csv</b><br>
    -- REMINDER: Make sure you specify the correct delimiter for a CSV file!
</span>

In [8]:
# DO NOT MODIFY !!!
print('PART B')
print('STEPS 1-6 - charlotte_hornets_stats.csv FILE CREATION')

PART B
STEPS 1-6 - charlotte_hornets_stats.csv FILE CREATION


In [9]:
# Write a CSV file named 'charlotte_hornets_stats.csv' using Python's csv.writer() function

# INSERT CODE FOR STEPS 1-3
with open('charlotte_hornets_stats.csv', 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=',')
    for player_stats in hornets_stats:
        csv_writer.writerow(player_stats)

<div class="alert alert-block alert-danger">
    <b>IMPORTANT: </b>Verify the contents of the <b>charlotte_hornets_stats.csv</b> you created.<br>
    * <b>Question:</b> Is it a CSV file? Are there commas(,) separating each field on a line?<br>
    * <b>Question:</b> Do you have extra blank lines in the file? (The file should not be double-spaced)<br>
    -- You can view the file in Jupyter Notebook or a simple editor to check.
</div>

### Read a CSV file 'charlotte_hornets_stats.csv' using Python's csv.reader() function

Now that you have written a new CSV file containing the Charlotte Hornets statistics data, <br>
let's see if you can <b>read</b> the data back in correctly.<br>

SYNTAX:<br>
* csv.reader(_csvfile, dialect='excel', \*\*fmtparams_) ~ returns a reader object which will iterate over the lines in the csv file<br>
    * Each row read from the csv file is returned as a list of strings.


REFERENCES:  Information about the __csv__ library can be found here: <br> 
>REF: https://docs.python.org/3/library/csv.html <br>
>REF: https://www.geeksforgeeks.org/writing-csv-files-in-python/ <br>
>REF: https://docs.python.org/3/library/csv.html
(shows use of newline=' ' parameter) <br>

<span style="color:blue">
    4. Open the newly created file named <b>charlotte_hornets_stats.csv</b> using <b>with open() as <i>file</i></b>, then: <br>
    5. Use <b>csv.reader(...)</b> to get a "reader object" <br>
    6. Use a <b>FOR-loop</b> to iterate thru each row/line of the file and <b>print</b> the row/line.</span>

In [10]:
# Read CSV file 'charlotte_hornets_stats.csv' using Python's csv.writer() function

# INSERT CODE FOR STEPS 4-6
with open('charlotte_hornets_stats.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    for row in csv_reader:
        print(row)

['NAME              #POS#GP#GS#MIN#PTS#OR#DR#REB#AST#STL#BLK#TO#PF#AST/TO']
['P.J. Washington   #PF #59#59#32.9#15.2#1.0#3.7#4.7#2.4#0.9#1.1#1.4#2.6#1.7']
['Mason Plumlee     #C  #56#56#28.5#12.2#3.3#6.3#9.7#3.7#0.6#0.6#1.6#2.9#2.4']
['Jalen McDaniels   #PF #56#21#26.7#10.6#0.8#4.0#4.8#2.0#1.2#0.5#1.4#2.8#1.4']
['Terry Rozier      #SG #49#49#35.6#21.8#1.0#3.3#4.3#4.9#1.1#0.3#2.1#1.9#2.3']
['JT Thor           #F  #47#0#10.8#2.6#0.5#1.2#1.7#0.4#0.2#0.2#0.5#0.8#0.8']
['Nick Richards     #C  #45#0#17.3#8.0#2.5#3.3#5.8#0.5#0.2#1.0#0.9#2.3#0.6']
['Dennis Smith Jr.  #PG #38#12#25.0#8.4#0.4#2.3#2.8#4.4#1.4#0.4#1.6#2.2#2.8']
['Theo Maledon      #PG #36#1#16.1#5.6#0.3#2.1#2.4#2.4#0.8#0.2#1.0#1.0#2.5']
['Kelly Oubre Jr.   #SG #35#30#32.6#20.2#1.5#3.6#5.1#1.2#1.6#0.4#1.3#3.0#1.0']
['Gordon Hayward    #SF #34#34#31.1#13.9#0.7#3.5#4.2#3.6#0.9#0.2#1.9#1.3#2.0']
['LaMelo Ball       #PG #33#33#35.4#23.3#1.2#5.1#6.3#8.4#1.3#0.3#3.5#3.4#2.4']
['Mark Williams     #C  #29#4#15.9#7.6#1.7#3.9#5.6#0.3#0.7#1.2

### Read a CSV file 'charlotte_hornets_stats.csv' using Pandas read_csv() function

Now that you've used Python's CSV Read & Write functions, let's see what Pandas has to offer! 

Pandas provides a quick &amp; simple way to read a CSV file into a <b>DataFrame</b> using <b>read_csv()</b> &amp; <b>write_csv()</b>.

SYNTAX:<br>
<b>import pandas as pd</b> <== REQUIRED

* _df_ = pd.read_csv(_csvfilename_) ~ Reads a comma-separated values (csv) file into DataFrame.
    * EXAMPLE: df = pd.read_csv('myCSVfile.csv')
    
To __print__ the contents of the ___df___ you can use the following command:
* print(_df_.to_string()) 

REFERENCES:  Information about the __csv__ library can be found here: <br> 
>REF: https://www.w3schools.com/python/pandas/pandas_csv.asp <br>
>REF: https://pythonbasics.org/read-csv-with-pandas/ <br>
>REF: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

Let's give it a try ...

<span style="color:blue">
    7. Add the required <b>import</b> statement for pandas <br>
    8. Read file <b>charlotte_hornets_stats.csv</b> using Pandas <b>read_csv(...)</b> function. <br>
    9. Save the file contents in a <b>DataFrame</b> named <b>hornets_df</b> <br>
    10. Print the DataFrame <b>hornets_df</b>
</span>

In [12]:
# INSERT CODE FOR STEPS 7-10
import pandas as pd

with open('charlotte_hornets_stats.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    for row in csv_reader:
        print(row)

hornets_df = pd.read_csv('charlotte_hornets_stats.csv')

print(hornets_df)


['NAME              #POS#GP#GS#MIN#PTS#OR#DR#REB#AST#STL#BLK#TO#PF#AST/TO']
['P.J. Washington   #PF #59#59#32.9#15.2#1.0#3.7#4.7#2.4#0.9#1.1#1.4#2.6#1.7']
['Mason Plumlee     #C  #56#56#28.5#12.2#3.3#6.3#9.7#3.7#0.6#0.6#1.6#2.9#2.4']
['Jalen McDaniels   #PF #56#21#26.7#10.6#0.8#4.0#4.8#2.0#1.2#0.5#1.4#2.8#1.4']
['Terry Rozier      #SG #49#49#35.6#21.8#1.0#3.3#4.3#4.9#1.1#0.3#2.1#1.9#2.3']
['JT Thor           #F  #47#0#10.8#2.6#0.5#1.2#1.7#0.4#0.2#0.2#0.5#0.8#0.8']
['Nick Richards     #C  #45#0#17.3#8.0#2.5#3.3#5.8#0.5#0.2#1.0#0.9#2.3#0.6']
['Dennis Smith Jr.  #PG #38#12#25.0#8.4#0.4#2.3#2.8#4.4#1.4#0.4#1.6#2.2#2.8']
['Theo Maledon      #PG #36#1#16.1#5.6#0.3#2.1#2.4#2.4#0.8#0.2#1.0#1.0#2.5']
['Kelly Oubre Jr.   #SG #35#30#32.6#20.2#1.5#3.6#5.1#1.2#1.6#0.4#1.3#3.0#1.0']
['Gordon Hayward    #SF #34#34#31.1#13.9#0.7#3.5#4.2#3.6#0.9#0.2#1.9#1.3#2.0']
['LaMelo Ball       #PG #33#33#35.4#23.3#1.2#5.1#6.3#8.4#1.3#0.3#3.5#3.4#2.4']
['Mark Williams     #C  #29#4#15.9#7.6#1.7#3.9#5.6#0.3#0.7#1.2

<div class="alert alert-success">
 <strong>NOTE how much less code is required to read a CSV file using Pandas!!!</strong>
</div>

## Part C - Reading and writing JSON files (20 points)

<div class="alert alert-warning">
    <strong>IMPORTANT: </strong>Make sure file <strong>best_selling_books.json</strong> is located in the current directory.<br>
    -- File Format: There are 6 keys: Book, Authors(s), Original Language, First Published, Approximate Sales, Genre<br></div>

### Read a JSON file 'best_selling_books.json' using Python's json.load(...)function

JSON can easily be read or writen in Python by using a __Dictionary__ object (read into/write from)<br> 
To read a JSON file into a Python dictionary, you use __json.load()__ <br>

SYNTAX:<br>
<b>import json</b> <== REQUIRED

* json.load(...) ~ accepts a file object, parses the JSON data 
    - Populates a Python dictionary with the data and returns it back to you
    - EXAMPLE: _dict_ = json.load(_fileobject_)


REFERENCES:  Information about Python & JSON can be found here: <br> 
>REF: https://www.w3schools.com/python/python_json.asp <br>
>REF: https://www.geeksforgeeks.org/read-json-file-using-python/ <br>
>REF: https://docs.python.org/3/library/json.html <br>

Let's give it a try ...

<span style="color:blue">
    1. Add the required import statement for <b>json</b> <br>
    2. Open for 'read' the file <b>'best_selling_books.json'</b> <br> 
    3. Use the <b>json.load()</b> command to parse the JSON into a variable named <b>bookdata_dict</b>.
</span>

In [13]:
# INSERT CODE FOR STEPS 1-3
import json

with open('best_selling_books.json', 'r') as file:
    bookdata_dict = json.load(file)

In [14]:
# DO NOT MODIFY !!!
print('PART C')
print('STEPS 1-3 - best_selling_books.json file')
print(f'bookdata_dict has {len(bookdata_dict)} entries')
print(f'bookdata_dict is TYPE: {type(bookdata_dict)}\n')
# Print contents using FOR-loop
for b in bookdata_dict['Best Selling Books']:
    print(b)

PART C
STEPS 1-3 - best_selling_books.json file
bookdata_dict has 1 entries
bookdata_dict is TYPE: <class 'dict'>

{'Book': 'A Tale of Two Cities', 'Author(s)': 'Charles Dickens', 'Language': 'English', 'Published': 1859, 'Sales': '200 million', 'Genre': 'Historical fiction'}
{'Book': 'The Little Prince', 'Author(s)': 'Antoine de Saint-Exupery', 'Language': 'English', 'Published': 1943, 'Sales': '200 million', 'Genre': 'Novella'}
{'Book': "Harry Potter and the Philosopher's Stone", 'Author(s)': 'J. K. Rowling', 'Language': 'English', 'Published': 1997, 'Sales': '120 million', 'Genre': 'Fantasy'}
{'Book': 'And Then There Were None', 'Author(s)': 'Agatha Christie', 'Language': 'English', 'Published': 1939, 'Sales': '100 million', 'Genre': 'Mystery'}
{'Book': 'Dream of the Red Chamber', 'Author(s)': 'Cao Xueqin', 'Language': 'Chinese', 'Published': 1791, 'Sales': '100 million', 'Genre': 'Family saga'}
{'Book': 'The Hobbit', 'Author(s)': 'J. R. R. Tolkien', 'Language': 'English', 'Publishe

<div class="alert alert-success">
    Take a look at the <b>best_selling_books.json</b> file in an editor (or another Jupyter Notebook tab)<br>
    <b>NOTICE how much the JSON looks like a Python Dictionary!</b>
</div>

### Create JSON formatted string from a Python dictionary using json.dumps() function

If you look at the format of a JSON file, you will notice how similar it is to Python's Dictionary format.<br><br>
So ... it should be no surprise that you can use a Python Dictionary to <b>create</b> a JSON file.<br>
To write a Python dictionary to a JSON formatted string, you use the __json.dumps()__ method<br>

SYNTAX:<br>
- json.dumps(dictionary, indent=#) ~ converts a Python object into a JSON string 
    - where indent=# specifies spaces to indent to make for easy reading
    
<b>Here is a GeeksForGeeks reference that explains json.dumps():</b>
    
> REF: https://www.geeksforgeeks.org/json-dump-in-python/


<b>NOTE:</b> There is a subtle difference between <b>json.dumps()</b> and <b>json.dump()</b> and the two commands often get confused. Make sure you know which you want to use -- and watch out for typos.


REFERENCES:  Information about Python & JSON can be found here: <br> 
>REF: https://www.w3schools.com/python/python_json.asp <br>
>REF: https://www.geeksforgeeks.org/reading-and-writing-json-to-a-file-in-python/ <br>
>REF: https://docs.python.org/3/library/json.html <br>

Let's give it a try ...

<span style="color:blue">
4. Use <strong>json.dumps()</strong> with <strong>indent=4</strong> to create a JSON object named <strong>json_desserts</strong> <br>
 
<strong>NOTE:</strong> Use the <strong>desserts_dict</strong> dictionary definition provided for you below!!!
</span>

In [16]:
# DO NOT MODIFY !!!
# Dictionary of Desserts (PROVIDED)
desserts_dict = {'Cookies': ['Chocolate Chip','Peanut Butter', 'Oatmeal Raisin', 'Macaroons', 'Sugar'],
                'Cakes': ['Chocolate', 'Red Velvet', 'Lemon', 'Black Forest', 'Cheesecake'],
                'Ice Cream': ['Vanilla', 'Chocolate', 'Strawberry', 'Mint Chocolate Chip', 'Butter Pecan'],
                'Pastries': ['Bear Claw', 'Cinnamon Roll', 'Beignets', 'Doughnut', 'Pain Au Chocolat']
                }

In [17]:
# Convert Python dictionary (desserts_dict) to JSON formatted string

# INSERT CODE FOR STEP 4
json_desserts = json.dumps(desserts_dict, indent=4)


In [18]:
# DO NOT MODIFY !!!
print('PART C')
print('STEP 4 - JSON desserts_dict')
# Print the Python Dictionary & JSON object
print(f'Contents of desserts_dict:\n{desserts_dict}\n')
print(f'Contents of json_desserts:\n{json_desserts}')

PART C
STEP 4 - JSON desserts_dict
Contents of desserts_dict:
{'Cookies': ['Chocolate Chip', 'Peanut Butter', 'Oatmeal Raisin', 'Macaroons', 'Sugar'], 'Cakes': ['Chocolate', 'Red Velvet', 'Lemon', 'Black Forest', 'Cheesecake'], 'Ice Cream': ['Vanilla', 'Chocolate', 'Strawberry', 'Mint Chocolate Chip', 'Butter Pecan'], 'Pastries': ['Bear Claw', 'Cinnamon Roll', 'Beignets', 'Doughnut', 'Pain Au Chocolat']}

Contents of json_desserts:
{
    "Cookies": [
        "Chocolate Chip",
        "Peanut Butter",
        "Oatmeal Raisin",
        "Macaroons",
        "Sugar"
    ],
    "Cakes": [
        "Chocolate",
        "Red Velvet",
        "Lemon",
        "Black Forest",
        "Cheesecake"
    ],
    "Ice Cream": [
        "Vanilla",
        "Chocolate",
        "Strawberry",
        "Mint Chocolate Chip",
        "Butter Pecan"
    ],
    "Pastries": [
        "Bear Claw",
        "Cinnamon Roll",
        "Beignets",
        "Doughnut",
        "Pain Au Chocolat"
    ]
}


### Create JSON formatted file from a Python dictionary using the json.dump() function

The process of writing JSON to a file is called Serialization. Serializing JSON refers to the transformation of data into a series of bytes to be stored or often transmitted across a network. The command used to do this is: <b>json.dump()</b> 

SYNTAX:<br>
- json.dump(dictionary, file pointer, indent=#) ~ writes the dictionary to a file without conversion to JSON object
    - where # is the number of spaces to use for indentation
    
<b>Again, here is a GeeksForGeeks reference that explains json.dump():</b>
    
> REF: https://www.geeksforgeeks.org/json-dump-in-python/

<b>NOTE:</b> There is a subtle difference between <b>json.dumps()</b> and <b>json.dump()</b> and the two commands often get confused. Make sure you know which you want to use -- and watch out for typos.


REFERENCES:  Information about Python & JSON can be found here: <br> 
>REF: https://www.w3schools.com/python/python_json.asp <br>
>REF: https://www.geeksforgeeks.org/reading-and-writing-json-to-a-file-in-python/ <br>
>REF: https://docs.python.org/3/library/json.html <br>

Let's give it a try ...

<span style="color:blue">
5. Write a JSON file named <strong>desserts.json</strong> using Pythons <strong>json.dump()</strong> command. <br>
 
<strong>NOTE:</strong> Use the <strong>desserts_dict</strong> dictionary created in the precious steps!!!
</span>

In [19]:
# Write desserts_dict to file in JSON format using Python

# INSERT CODE FOR STEP 5
with open('desserts.json', 'w') as json_file:
    json.dump(desserts_dict, json_file, indent=4)


<div class="alert alert-success">
 <b>NOTE:</b>  You should VISUALLY VERIFY the existance of this new file in your current directory. <br>
    * <b>Question:</b> Does the file <b>'desserts.json'</b> now appear in Jupyter Notebook directory list (left pane)?<br>
    * <b>Question:</b> If you open the file, does the content look like a JSON file?<br>
    -- You can view the file in Jupyter Notebook or a simple editor to check. 
</div>

In [20]:
# DO NOT MODIFY !!!
print('PART C')
print('STEP 5 - desserts.json')
print(f'')
# Read JSON data from file and pretty print it
with open('desserts.json', 'r') as newfile:
    # Convert JSON file to Python Types
    data = json.load(newfile)
    print(json.dumps(data, indent=4))

PART C
STEP 5 - desserts.json

{
    "Cookies": [
        "Chocolate Chip",
        "Peanut Butter",
        "Oatmeal Raisin",
        "Macaroons",
        "Sugar"
    ],
    "Cakes": [
        "Chocolate",
        "Red Velvet",
        "Lemon",
        "Black Forest",
        "Cheesecake"
    ],
    "Ice Cream": [
        "Vanilla",
        "Chocolate",
        "Strawberry",
        "Mint Chocolate Chip",
        "Butter Pecan"
    ],
    "Pastries": [
        "Bear Claw",
        "Cinnamon Roll",
        "Beignets",
        "Doughnut",
        "Pain Au Chocolat"
    ]
}


## Part D - Reading and parsing XML files (20 points)

<div class="alert alert-warning">
<strong>IMPORTANT: </strong>Make sure file <strong>best_selling_books.xml</strong> is located in the current directory.  
</div>

### Read and parse XML file best_selling_books.xml' using ElementTree


XML stands for eXtensible Markup Language. It was originally designed to store and transport data while being both human & machine readable. XML has been around for quite awhile, but is gradually being replaced by JSON & YAML.

XML is a hierarchical data format, so it is often represented by a tree structure. The xml.etree.ElementTree (ET in short) is comprised of two classes: 

- ElementTree.parse() ~ method used to enable parsing of xml file 
    - EXAMPLE: mytree = ET.parse('file.xml')
    
- tree.getroot() ~ used to get the ROOT element of the XML tree
    - EXAMPLE: mytree.getroot()
   
There are various __find__ methods that can be used to search for specific XML elements (ie. by TAG)
    
- tree.find()       ~ .find() method returns an element of the tree
- tree.findall()    ~ .findall() method returns all matching subelements of the tree.
    

REFERENCES:  Information about using XML ElementTree() to read and parse XML files can be found here: <br> 
>REF: https://www.w3schools.com/xml/xml_parser.asp <br>
>REF: https://www.edureka.co/blog/python-xml-parser-tutorial/ <br>
>REF: https://docs.python.org/3/library/xml.etree.elementtree.html <br>
>REF: https://www.geeksforgeeks.org/reading-and-writing-xml-files-in-python/

Let's give it a try ...

<div class="alert alert-success">
    Take a look at the <b>best_selling_books.xml</b> file in an editor (or another Jupyter Notebook tab)<br>
    <b>NOTICE the HTML-like tagging used to for what could be thought of as KEYS.</b><br>
    Also take note of the hierarchical structure of the data -- how would you reference an individual element?
</div>

<span style="color:blue">
Use <strong>ElementTree()</strong> to read and parse file <strong>best_selling_books.xml</strong> <br><br>
    1. Use <strong>ET.parse('best_selling_books.xml')</strong> to read and part the XML file into <strong>mytree</strong> <br>
    2. Use <strong>mytree.getroot()</strong> to get the root of the tree in <strong>myroot</strong>
</span>

In [21]:
# Import necessary libraries
import xml.etree.ElementTree as ET

# INSERT CODE FOR STEPS 1-2
mytree = ET.parse('best_selling_books.xml')
myroot = mytree.getroot()


In [23]:
# DO NOT MODIFY !!!
print('PART D')
print('STEPS 1-2 - myroot.tag')

# Print myroot.tag value
print(myroot.tag)

PART D
STEPS 1-2 - myroot.tag
BESTSELLERS


<span style="color:blue">
Now that you have the root of the tree, use it to find ALL the book titles <br>
    3. Use a <strong>for loop</strong> to <strong>findall('BOOK')</strong> elements, then ... <br>
    4. Use <strong>.find('TITLE').text</strong> to get the <strong>TITLE text</strong> from each book element <br>
    5. Use <strong>.find('AUTHOR').text</strong> to get the <strong>AUTHOR text</strong> from each book element <br>
    6. Use <strong>.find('PUBLISHED').text</strong> to get the <strong>PUBLISHED year</strong> from each book element <br>
    7. Use <strong>.find('SALES').text</strong> to get the <strong>SALES text</strong> from each book element <br>
    8. Use <strong>.find('GENRE').text</strong> to get the <strong>GENRE text</strong> from each book element <br>
    9. Create a STRING variable named <strong>book_info</strong> by concatenating the 5 fields: TITLE, AUTHOR, PUBLISHED, SALES, GENRE.<br>
    -- Separate each field with a <b>SPACE COLON SPACE (' : ')</b> to make the output easier to read!<br>
    -- <b>EXAMPLE:</b> The Hobbit : J. R. R. Tolkien : 1937 : 100 million : Fantasy<br>
    10. Print the <strong>book_info</strong> <br><br>
</span>

In [24]:
# DO NOT MODIFY !!!
print('PART D')
print('STEPS 3-10 - Print the Title, Author, Publish Year, Sales & Genre for each book')

PART D
STEPS 3-10 - Print the Title, Author, Publish Year, Sales & Genre for each book


In [26]:
# Find and print ALL the TITLE values (text) for every BOOK element

# INSERT CODE FOR STEPS 3-5
for book_element in myroot.findall('BOOK'):
    title = book_element.find('TITLE').text
    author = book_element.find('AUTHOR').text
    published = book_element.find('PUBLISHED').text
    sales = book_element.find('SALES').text
    genre = book_element.find('GENRE').text
    book_info = f"{title} : {author} : {published} : {sales} : {genre}"

    print(book_info)

A Tale of Two Cities : Charles Dickens : 1859 : 200 million : Historical fiction
The Little Prince : Antoine de Saint-Exupery : 1943 : 200 million : Novella
Harry Potter and the Philosopher's Stone : J. K. Rowling : 1997 : 120 million : Fantasy
And Then There Were None : Agatha Christie : 1939 : 100 million : Mystery
Dream of the Red Chamber : Cao Xueqin : 1791 : 100 million : Family saga
The Hobbit : J. R. R. Tolkien : 1937 : 100 million : Fantasy
The Lion, the Witch and the Wardrobe : C. S. Lewis : 1950 : 85 million : Fantasy
She: A History of Adventure : H. Rider Haggard : 1887 : 83 million : Adventure
Vardi Wala Gunda : Ved Prakash Sharma : 1992 : 80 million : Detective
The Da Vinci Code : Dan Brown : 2003 : 80 million : Mystery thriller
Harry Potter and the Chamber of Serets : J. K. Rowling : 1998 : 77 million : Fantasy
Harry Potter and the Prisoner of Azkaban : J. K. Rowling : 1999 : 65 million : Fantasy
Harry Potter and the Goblet of Fire : J. K. Rowling : 2000 : 65 million : Fa

## Part E - Reading and parsing YAML files (20 points)

<div class="alert alert-warning">
<strong>IMPORTANT: </strong>Make sure file <strong>charlotte_hornets.yaml</strong> is located in the current directory.
</div>

### Read and parse YAML file charlotte_hornets.xml' using yaml.load()

The YAML file syntax is structured very much like a Python Dictionary, but without the __{ }s__<br>
However, the YAML file syntax does have a close tie-in to Python -- <b>INDENTATION!</b> <br>
- Indentation is a key aspect of YAML and is integral to defining the structure of a YAML file.<br>
- Indentation problems cause ERRORS, so you have to be very careful when creating YAML files <b>(NO TABS!)</b>.<br>
- The good news is there are lots of YAML verification tools available, here are a few:<br>
>http://www.yamllint.com/<br>
>https://codebeautify.org/yaml-validator<br>

It is in simple human-readable format makes which makes it suitable for the Configuration files.<br> 
To read a YAML file into a Python dictionary, you use __yaml.load()__ <br>

SYNTAX:<br>
<b>import yaml</b> <== REQUIRED

* yaml.load(fileobject, Loader) ~ parses and converts a YAML object to a Python dictionary
    - There are 4 types of Loaders that can be used:
        - BaseLoader: Loads all the basic YAML scalars as Strings
        - SafeLoader: Loads subset of the YAML safely, mainly used if the input is from an untrusted source.
        - FullLoader: Loads the full YAML but avoids arbitrary code execution. 
        - UnsafeLoader: Original loader for untrusted inputs and generally used for backward compatibility.
        
    - EXAMPLE: _dict_ = yaml.load(_fileobject_, Loader=yaml.FullLoader)

REFERENCES:  Information about using reading & writing YAML in Python can be found here: <br> 
>REF: https://python.land/data-processing/python-yaml <br>
>REF: https://stackabuse.com/reading-and-writing-yaml-to-a-file-in-python/ <br>

Let's give it a try ...

<span style="color:blue">
    1. Open file <strong>charlotte_hornets.yaml</strong> <br>
    - NOTE: Use <strong>with open(...) as XXXXX:</strong> to eliminate having to <strong>close()</strong> the file.<br>
    2. Use <strong>yaml.load()</strong> with <strong>Loader=yaml.FullLoader</strong> to read file <strong>charlotte_hornets.yaml</strong> into a variable named <strong>charlotte_hornets</strong>.<br>
</span>

In [29]:
# Import necessary libraries
import yaml

# Read and parse YAML file using yaml.load() with Loader=yaml.FullLoader

# INSERT CODE FOR STEPS 1-2
with open('charlotte_hornets.yaml', 'r') as file:
    charlotte_hornets_dict = yaml.load(file, Loader=yaml.FullLoader)


In [30]:
# DO NOT MODIFY !!!
print('PART E')
print('STEPS 1-2 - charlotte_hornets_dict')
print(charlotte_hornets_dict)

PART E
STEPS 1-2 - charlotte_hornets_dict
{'hornets': {'players': [{'rank': 1, 'name': 'LaMelo Ball', 'pos': 'PG', 'ppg': 23.3}, {'rank': 2, 'name': 'Terry Rozier', 'pos': 'SG', 'ppg': 21.8}, {'rank': 3, 'name': 'Kelly Oubre Jr', 'pos': 'SG', 'ppg': 20.2}, {'rank': 4, 'name': 'P.J. Washington', 'pos': 'PF', 'ppg': 15.2}, {'rank': 5, 'name': 'Gordon Hayward', 'pos': 'SF', 'ppg': 13.9}, {'rank': 6, 'name': 'Mason Plumlee', 'pos': 'C', 'ppg': 12.2}, {'rank': 7, 'name': 'Svi Mykhailiuk', 'pos': 'SG', 'ppg': 12.0}, {'rank': 8, 'name': 'Jalen McDaniels', 'pos': 'PF', 'ppg': 10.6}, {'rank': 9, 'name': 'Dennis Smith Jr', 'pos': 'PG', 'ppg': 8.4}, {'rank': 10, 'name': 'Nick Richards', 'pos': 'C', 'ppg': 8.0}, {'rank': 11, 'name': 'Mark Williams', 'pos': 'C', 'ppg': 7.6}, {'rank': 12, 'name': 'Theo Maledon', 'pos': 'PG', 'ppg': 5.6}, {'rank': 13, 'name': 'Cody Martin', 'pos': 'SF', 'ppg': 5.0}, {'rank': 14, 'name': 'James Bouknight', 'pos': 'SG', 'ppg': 4.3}, {'rank': 15, 'name': 'Bryce McGowens

<div class="alert alert-success">
    <strong>NOTE</strong> the structure of the <b>charlotte_hornets_dict</b> dictionary data ... you have a dictionary entry that contains another dictionary entry that contains a list of dictionaries. This needs to be taken into consideration when "drilling" down into the data structure.
</div>

<span style="color:blue">   
    Start "drilling" down into the various levels of <b>charlotte_hornets_dict</b> ...<br>
    3. Create a dictionary named <b>hornets_dict</b> containing the values using KEY = <b>hornets</b><br>
    -- NOTE: You should be referencing the <b>charlotte_hornets_dict</b>
</span>

In [37]:
# INSERT CODE FOR STEP 3

# Get the Charlotte Hornet players using 'hornets' as the KEY
hornets_dict = charlotte_hornets_dict.get('hornets', {})

In [38]:
# DO NOT MODIFY !!!
print('STEP 3  - hornets_dict')
print(f'hornets_dict is of TYPE: {type(hornets_dict)}\n')
print(f'Contents of hornets_dict:\n {hornets_dict}')

STEP 3  - hornets_dict
hornets_dict is of TYPE: <class 'dict'>

Contents of hornets_dict:
 {'players': [{'rank': 1, 'name': 'LaMelo Ball', 'pos': 'PG', 'ppg': 23.3}, {'rank': 2, 'name': 'Terry Rozier', 'pos': 'SG', 'ppg': 21.8}, {'rank': 3, 'name': 'Kelly Oubre Jr', 'pos': 'SG', 'ppg': 20.2}, {'rank': 4, 'name': 'P.J. Washington', 'pos': 'PF', 'ppg': 15.2}, {'rank': 5, 'name': 'Gordon Hayward', 'pos': 'SF', 'ppg': 13.9}, {'rank': 6, 'name': 'Mason Plumlee', 'pos': 'C', 'ppg': 12.2}, {'rank': 7, 'name': 'Svi Mykhailiuk', 'pos': 'SG', 'ppg': 12.0}, {'rank': 8, 'name': 'Jalen McDaniels', 'pos': 'PF', 'ppg': 10.6}, {'rank': 9, 'name': 'Dennis Smith Jr', 'pos': 'PG', 'ppg': 8.4}, {'rank': 10, 'name': 'Nick Richards', 'pos': 'C', 'ppg': 8.0}, {'rank': 11, 'name': 'Mark Williams', 'pos': 'C', 'ppg': 7.6}, {'rank': 12, 'name': 'Theo Maledon', 'pos': 'PG', 'ppg': 5.6}, {'rank': 13, 'name': 'Cody Martin', 'pos': 'SF', 'ppg': 5.0}, {'rank': 14, 'name': 'James Bouknight', 'pos': 'SG', 'ppg': 4.3},

<span style="color:blue">
    Now "drill" down another level to get the <b>players</b> LIST ... using <b>hornets_dict</b> from previous step.<br>    
    4. Create a Python LIST named <b>players_list</b> containing the values using KEY = <b>players</b>
</span>

In [35]:
# INSERT CODE FOR STEP 4

# Get the player list using 'players' as the KEY
players_list = hornets_dict.get('players', [])

In [36]:
# DO NOT MODIFY !!!
print('STEP 4  - players_list')
print(f'players_list is of TYPE: {type(players_list)}\n')
print(f'Contents of players_list:\n {players_list}')

STEP 4  - players_list
players_list is of TYPE: <class 'list'>

Contents of players_list:
 [{'rank': 1, 'name': 'LaMelo Ball', 'pos': 'PG', 'ppg': 23.3}, {'rank': 2, 'name': 'Terry Rozier', 'pos': 'SG', 'ppg': 21.8}, {'rank': 3, 'name': 'Kelly Oubre Jr', 'pos': 'SG', 'ppg': 20.2}, {'rank': 4, 'name': 'P.J. Washington', 'pos': 'PF', 'ppg': 15.2}, {'rank': 5, 'name': 'Gordon Hayward', 'pos': 'SF', 'ppg': 13.9}, {'rank': 6, 'name': 'Mason Plumlee', 'pos': 'C', 'ppg': 12.2}, {'rank': 7, 'name': 'Svi Mykhailiuk', 'pos': 'SG', 'ppg': 12.0}, {'rank': 8, 'name': 'Jalen McDaniels', 'pos': 'PF', 'ppg': 10.6}, {'rank': 9, 'name': 'Dennis Smith Jr', 'pos': 'PG', 'ppg': 8.4}, {'rank': 10, 'name': 'Nick Richards', 'pos': 'C', 'ppg': 8.0}, {'rank': 11, 'name': 'Mark Williams', 'pos': 'C', 'ppg': 7.6}, {'rank': 12, 'name': 'Theo Maledon', 'pos': 'PG', 'ppg': 5.6}, {'rank': 13, 'name': 'Cody Martin', 'pos': 'SF', 'ppg': 5.0}, {'rank': 14, 'name': 'James Bouknight', 'pos': 'SG', 'ppg': 4.3}, {'rank': 15

<span style="color:blue">  
Now create a new LIST to collect the player data (rank,name,pos,ppg) -- one entry per player ...<br>    
    5. Create a Python LIST named <b>player_data</b> using the following formatting:<br>
    - 5a. Each entry in the list should contain the <b>values</b> for <b>keys = 'rank', 'name', 'pos' and 'ppg'</b><br>
    - 5b. Concatenate the values to create a single STRING with values separated by a <b>SPACE COLON SPACE (' : ')</b><br>
    - <b>HINT:</b> Use a FOR-loop to process each entry in players_list<br>
    
    Each list entry should look like: <b>1 : LaMelo Ball : PG : 23.3</b>
</span>

In [39]:
# INSERT CODE FOR STEP 5

# Get the player data using keys: 'rank', 'name', 'pos' and 'ppg'
player_data = []
for player in players_list:
    player_entry = f"{player['rank']} : {player['name']} : {player['pos']} : {player['ppg']}"
    player_data.append(player_entry)


In [40]:
# DO NOT MODIFY !!!
print('STEP 5  - player_data')
# Use for loop to loop thru the player_data printing line of player info
for p in player_data:
    print(p)

STEP 5  - player_data
1 : LaMelo Ball : PG : 23.3
2 : Terry Rozier : SG : 21.8
3 : Kelly Oubre Jr : SG : 20.2
4 : P.J. Washington : PF : 15.2
5 : Gordon Hayward : SF : 13.9
6 : Mason Plumlee : C : 12.2
7 : Svi Mykhailiuk : SG : 12.0
8 : Jalen McDaniels : PF : 10.6
9 : Dennis Smith Jr : PG : 8.4
10 : Nick Richards : C : 8.0
11 : Mark Williams : C : 7.6
12 : Theo Maledon : PG : 5.6
13 : Cody Martin : SF : 5.0
14 : James Bouknight : SG : 4.3
15 : Bryce McGowens : G : 4.1


### Part F - Writing a ZIP file (10 points)

#### Write a ZIP file 'CSC221Lab7.zip' containing 2 files from the lab

<b>What is a zip file?</b>

You've all probably created and received (extracted) ZIP files before, but have you created one programmatically? <br>
A ZIP file is an <b>archive file format</b> used to compression files. <br>
ZIP algorithms use "lossless compression", which means the original file(s) are perfectly reconstructed when the ZIP file is "unzipped" -- nothing is lost.<br><br>
<b>Here is a GeeksForGeeks reference that further explains ZIP:</b>

> REF: https://www.geeksforgeeks.org/working-zip-files-python/

<span style="color:blue">
Use <strong>zipfile</strong> to create a ZIP file containing the CSV & JSON files created during this lab <br>
    1. Create a list named <strong>files_to_zip</strong> with the following filenames:<br>
        <ol>
        <li>charlotte_hornets_stats.csv</li>
        <li>desserts.json</li>
        </ol>
    2. Use <strong>zipfile.ZipFile()</strong> to create a ZIP file named <strong>CSC221Lab7.zip</strong> containing the <strong>TWO (2)</strong> files in <strong>files_to_zip</strong> <br>
    3. Print the message <strong>"<i>zipfilename</i> created successfully"</strong> where <i>zipfilename</i> is the name of the ZIP file.
</span>

In [42]:
# Import zipfile libraries
import zipfile

# Write a ZIP file using Python zipfile.Zipfile()

# INSERT CODE FOR STEPS 1-3
files_to_zip = ["charlotte_hornets_stats.csv", "desserts.json"]
with zipfile.ZipFile('CSC221Lab7.zip', 'w') as zip_file:
    for file in files_to_zip:
        zip_file.write(file)

print('CSC221Lab7.zip created successfully')

CSC221Lab7.zip created successfully
