# **Tutorial 11: Working with Files (Part 02)** 👀

<a id='t11toc'></a>
#### Contents: ####
- **[Parsing](#t11parsing)**
    - [`strip()`](#t11strip)
        - [Exercise 1](#t11ex1)
    - [`split()`](#t11split)
        - [Exercise 2](#t11ex2)
- **[JSON](#t11json)**
    - [Reading JSON from a String](#t11loads)
    - [Reading a JSON File](#t11load)
    - [Converting to a JSON String](#t11dumps)
    - [Writing to a JSON File](#t11dump)
- **[CSV](#t11csv)**
    - [Reading CSV Files in Python](#t11readcsv)
    - [Writing into CSV Files (Row by Row)](#t11writecsvrbr)
    - [Writing into CSV Files (Multiple Rows)](#t11writecsvmultiple)
    - [Writing into CSV Files (Custom Delimiter)](#t11delimiter)
- [Exercise 3](#t11ex3)
- [Exercises Solutions](#t11sol)

💡 <b>TIP</b><br>
> <i>In the exercises, when time permits, try to write the code yourself rather than copying it from other cells.</i>

<a id='t11parsing'></a>
## ▙▂ **🄿ARSING ▂▂**

Parsing means splitting up a text into meaningful components (meaningful for a *given purpose*).  
Python has some built-in string methods that can be used for basic parsing tasks. We will practise with a few of them.

<a id='t11strip'></a>
#### **▇▂  `strip()` ▂▂**

The `strip()` method removes characters from both left and right sides of a string, based on the argument (a string specifying the set of characters to be removed).  

The syntax of the `strip()` method is:  `string.strip([chars])`

**`strip()` Parameters**  
- `chars` (optional) - a string specifying the set of characters to be removed from the left and right sides of the string.
    - If the `chars` argument is not provided, all **leading and trailing whitespace** is removed from the string.

In [None]:
Str = '  Analysis 3: Object Oriented Programming  '

The code below removes all whitespace (blanks) from the left and right sides of the string:

In [None]:
CleanStr1 = Str.strip()

In [None]:
print(f'Original String is = "{Str}" --- (length={len(Str)})')
print(f'Removing Leading and Trailing White spaces = "{CleanStr1}" --- (length={len(CleanStr1)})')

The method can also be applied directly to a string literal:

In [None]:
CleanStr2 = 'OOOOOOOOAnalysis 3: Object Oriented ProgrammingOOOOO'.strip('O')
print(f'Removing O\'s = "{CleanStr2}"')

<br>⚠ <b>NOTE</b><br>
>It removes only leading and trailing `'O'`s, but not those in between.<br>

##### **Multiple Characters**

The `chars` parameter is not a prefix or suffix; rather, all combinations of its values are stripped.  

In the example below, `strip()` strips every leading and trailing character that appears in the argument, i.e. `'+'` and `'*'`.

In [None]:
CleanStr3 = '+*++*++Analysis 3: Object Oriented Programming**++**'.strip('+*')
print(f'Stripping + and * on both sides  = {CleanStr3}')
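
Because `chars` is a set of characters rather than a prefix or suffix, a prefix-like argument can remove more than intended. The sketch below uses made-up sample strings to show the difference; `str.removeprefix()` is mentioned only as a contrast and assumes Python 3.9 or newer.

```python
# chars is treated as a set of characters, not as a prefix or suffix.
url = 'www.example.com'
stripped = url.strip('cmowz.')
print(stripped)        # example

# A prefix-like argument can therefore remove more than intended:
# 't', 'h' and 'r' of "three" are also in the set.
line = 'Arthur: three!'
over_stripped = line.lstrip('Arthur: ')
print(over_stripped)   # ee!

# str.removeprefix() (Python 3.9+) removes one exact prefix instead.
exact = line.removeprefix('Arthur: ')
print(exact)           # three!
```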

##### **Only One Side**

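- `lstrip()` removes leading characters and returns the trimmed string.
- `rstrip()` removes trailing characters and returns the trimmed string.
- As with `strip()`, both take an optional `chars` argument; without it, they remove whitespace (spaces, tabs, and newlines).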


In [None]:
CleanStr4 = '***********Analysis 3: Object Oriented Programming***********'.lstrip('*')
print('Removing Left Side using lstrip() is = "%s"' %CleanStr4)

<i>Now try the same for the right side using `rstrip()`, without checking the solution in the next cell. After typing your code, compare it with the solution. </i>

In [None]:
CleanStr5 = '***********Analysis 3: Object Oriented Programming***********'.rstrip('*')
print('Removing Right Side using rstrip() is = "%s"' %CleanStr5)

<br>[back to top ↥](#t11toc)

<br><br><a id='t11ex1'></a>
◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾

**✎ Exercise 𝟙**<br> <br> ▙ ⏰ ~ 2+2 min. ▟ <br>

❶ The file `studentsgrades.txt` in the current folder contains the students' first names, last names, and grades for Analysis 2.

The information is not properly formatted in the file, and there are some extra characters on each row. (Open the file to see the records.) 

**CMI-Inf St. Num 1002121 Andrew Bren 8.4  
CMI-Inf St. Num 1002121 Peter Cole 7.0  
CMI-Inf St. Num 1002121 Chris Charles 9.1  
CMI-Inf St. Num 1002121 Andy Frankline 6.9  
CMI-Inf St. Num 1002121 Robert Ford 5.6  
CMI-Inf St. Num 1002121 Charley Walton 7.7**

Write a short piece of code that reads the file, removes the extra leading characters and student numbers from each row, and creates a new file `studentsgrades-m.txt` containing the new records. The new file should look like this:  

**Andrew Bren 8.4  
Peter Cole 7.0  
Chris Charles 9.1  
Andy Frankline 6.9  
Robert Ford 5.6  
Charley Walton 7.7**

In [None]:
# Exercise 1.1


❷ Modify your code to work on the original file `studentsgrades.txt` and create another new file `studentsnames.txt` containing only the students' names, without grades.

In [None]:
# Exercise 1.2


<br>[back to top ↥](#t11toc)

<a id='t11split'></a>
#### **▇▂  `split()` ▂▂**

The `split()` method breaks up a string at the specified separator and returns a *list of strings*.  


The syntax of the `split()` method is:  `str.split([separator [, maxsplit]])`



**`split()` Parameters**  
The `split()` method takes at most two parameters:
- `separator` (optional) - the delimiter at which the string is split.
    - If the separator is not specified, any whitespace (space, newline etc.) string is a separator.
- `maxsplit` (optional) - The maxsplit defines the maximum number of splits.
    - The default value of maxsplit is `-1`, meaning, no limit on the number of splits.
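
The difference between omitting the separator and passing `' '` explicitly is easy to miss. The following is a small sketch on a made-up string containing a double space, a tab, and a trailing newline:

```python
sample = 'Never  regret\tanything\n'

# No separator: any run of whitespace counts as one separator, and
# leading/trailing whitespace produces no empty strings.
words_default = sample.split()
print(words_default)   # ['Never', 'regret', 'anything']

# Explicit ' ' separator: every single space splits, so two
# consecutive spaces yield an empty string; tab and newline are kept.
words_space = sample.split(' ')
print(words_space)     # ['Never', '', 'regret\tanything\n']
```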


In [None]:
text = 'Never regret anything that made you smile'

In [None]:
print(text.split())

##### **with `separator`**

In [None]:
grocery = 'Milk, Chicken, Bread, Butter'

In [None]:
print(grocery.split(', '))

Try the following code:

In [None]:
print(grocery.split(':'))

🔴 How many elements are in the Python list above? Why? Discuss it.

##### **with `maxsplit`**

In [None]:
print(grocery.split(', ', 1))

In [None]:
print(grocery.split(', ', 2))

In [None]:
print(grocery.split(', ', 5))

<br>⚠ <b>NOTE</b><br>
>If `maxsplit` is specified, the list will have at most `maxsplit+1` items.<br>
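
One common use of `maxsplit` is to cut a string only at the first occurrence of the separator, so that later occurrences stay inside the last item. A minimal sketch, using a made-up `entry` string:

```python
entry = 'note: meeting at 10: room B'

# Split only once: everything after the first ': ' stays together.
key, value = entry.split(': ', 1)
print(key)     # note
print(value)   # meeting at 10: room B

# Without maxsplit, the second ': ' would split the value as well.
all_parts = entry.split(': ')
print(all_parts)   # ['note', 'meeting at 10', 'room B']
```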

<br>[back to top ↥](#t11toc)

<br><br><a id='t11ex2'></a>
◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾

**✎ Exercise 𝟚**<br> <br> ▙ ⏰ ~ 3+3 min. ▟ <br>

❶ We want to further process the `studentsgrades.txt` file discussed in Exercise 1. Now, we would like to make a **list** of students. Each item in the list must be a student. A student should be created as an object with three attributes: `first name`, `last name`, and `grade`.

In [None]:
# Exercise 2.1


❷ Define a class to represent a group of students. Create an object to contain the students in our list, and add a method to the class for a simple linear search based on the last name of a student.  

In [None]:
# Exercise 2.2


<br>[back to top ↥](#t11toc)

<a id='t11json'></a>
## ▙▂ **🄹SON ▂▂**

JSON (JavaScript Object Notation) is a text format for storing and exchanging data.
- JSON is plain text, written in JavaScript object notation.
- JSON is language independent.
- JSON's syntax is derived from JavaScript, but the format itself is text only.
- Because it is text, it can be read and used as a data format by any programming language.

<a id='t11loads'></a>
#### **▇▂  Reading JSON from a String ▂▂**

`json.loads()` parses JSON from a string and converts it to a Python object; a JSON object such as the one below becomes a Python dictionary.

In [None]:
import json

book = """ 
    {
    "author": "Chinua Achebe",
    "editor": null,
    "country": "Nigeria",
    "imageLink": "images/things-fall-apart.jpg",
    "language": "English",
    "link": "https://en.wikipedia.org/wiki/Things_Fall_Apart",
    "pages": 209,
    "title": "Things Fall Apart",
    "year": 1958,
    "available": true
    }
    """

book_dict = json.loads(book)

print(book_dict)
print("\n")
print("The title of the book is:", book_dict['title'])
print(f'book_dict is: {type(book_dict)}')

<br>[back to top ↥](#t11toc)

<a id='t11load'></a>
#### **▇▂  Reading a JSON File ▂▂**

Now let's load a JSON file into a Python object. For this, we use the file `book.json`, located in the current directory.

First, let's take a look into the file contents.

In [None]:
with open("book.json") as f:
    text = f.read()
print(f"The full text in the file is:\n\n{text}")

You could also open `book.json` in Jupyter or in a text editor.

The `json.load()` method parses JSON from an open file object:

In [None]:
import json

with open('book.json') as f:
  data = json.load(f)

print(data)
print('\n')
print(type(data))

<br>⚠ <b>NOTE</b><br>
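>Note the difference between the JSON text and the resulting Python object: the JSON object becomes a `dict`, and `null`, `true`, and `false` become `None`, `True`, and `False`.<br>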
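
As a rough illustration of this conversion, the sketch below parses a short JSON string and prints the Python type of each value. The `raw` string is made up for this example, and its `tags` field does not appear in `book.json`.

```python
import json

# Made-up JSON text; "tags" is a hypothetical field added to show an array.
raw = '{"editor": null, "available": true, "pages": 209, "tags": ["novel", "classic"]}'
parsed = json.loads(raw)

# Print each value together with the Python type it was converted to.
for key, value in parsed.items():
    print(f'{key}: {value!r} ({type(value).__name__})')
```

The JSON array is converted to a Python `list`, and the number `209` to an `int`.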

<br>[back to top ↥](#t11toc)

<a id='t11dumps'></a>
#### **▇▂  Converting to a JSON String ▂▂**

`json.dumps()` converts a Python object (here, a dictionary) to a JSON string.

In [None]:
import json

book = {
    "author": "Chinua Achebe",
    "editor": None,
    "country": "Nigeria",
    "imageLink": "images/things-fall-apart.jpg",
    "language": "English",
    "link": "https://en.wikipedia.org/wiki/Things_Fall_Apart",
    "pages": 209,
    "title": "Things Fall Apart",
    "year": 1958,
    "available": True
    }
    
book_json = json.dumps(book)
print(book_json)
print('\n')
print(f'book is: {type(book)}')
print(f'book_json is: {type(book_json)}')
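
By default `json.dumps()` produces a single compact line. The sketch below, on a shortened copy of the book dictionary, shows the optional `indent` and `sort_keys` arguments and checks that parsing the result gives back the original dictionary.

```python
import json

# Shortened copy of the book dictionary, for illustration only.
book = {"title": "Things Fall Apart", "year": 1958, "editor": None, "available": True}

compact = json.dumps(book)                           # one line, insertion order
pretty = json.dumps(book, indent=2, sort_keys=True)  # indented, keys sorted
print(compact)
print(pretty)

# Round trip: JSON string back to an equal Python dictionary.
restored = json.loads(pretty)
print(restored == book)   # True
```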

<br>[back to top ↥](#t11toc)

<a id='t11dump'></a>
#### **▇▂  Writing to a JSON File ▂▂**

`json.dump()` converts and writes a dictionary to a JSON file.

In [None]:
import json

book_dict = {
    "author": "Chinua Achebe",
    "editor": None,
    "country": "Nigeria",
    "imageLink": "images/things-fall-apart.jpg",
    "language": "English",
    "link": "https://en.wikipedia.org/wiki/Things_Fall_Apart",
    "pages": 209,
    "title": "Things Fall Apart",
    "year": 1958,
    "available": True
    }

with open('book-new.json', 'w') as json_file:
  json.dump(book_dict, json_file)

Now let's take a look into the contents of the file we have just written:

In [None]:
with open("book-new.json") as f:
    text = f.read()
print(f"The full text in the file is:\n\n{text}")

<br>[back to top ↥](#t11toc)

<a id='t11csv'></a>
## ▙▂ **🄲SV ▂▂**

While we could use the built-in `open()` function to work with `CSV` files in Python, there is a dedicated `csv` module that makes working with `CSV` files much easier.  

Before we can use the functions of the `csv` module, we need to import the module first, using:

In [None]:
import csv

<a id='t11readcsv'></a>
#### **▇▂  Reading CSV Files in Python ▂▂**

To read a `CSV` file in Python, we can use the `csv.reader()` function.  

The `csv.reader()` returns an iterable `reader` object.

The `reader` object is then iterated using a for loop to print the contents of each row.
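
Each row produced by the `reader` is a list of strings, so numeric fields have to be converted explicitly. A minimal sketch, assuming a two-column sample and using `io.StringIO` as a stand-in for an open file so that no file on disk is needed:

```python
import csv
import io

# io.StringIO stands in for an open CSV file (assumption for this sketch).
sample = io.StringIO('Lastname,Test1\nDandy,47.0\nElephant,45.0\n')
reader = csv.reader(sample)

header = next(reader)      # first row holds the column names
scores = {}
for lastname, test1 in reader:
    scores[lastname] = float(test1)   # fields arrive as str; convert explicitly

print(header)   # ['Lastname', 'Test1']
print(scores)   # {'Dandy': 47.0, 'Elephant': 45.0}
```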

##### **Using comma (`,`) as delimiter**

Comma is the default delimiter for `csv.reader()`.

In [None]:
import csv
with open('grades.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

##### **Using tab (`\t`) as delimiter**

In [None]:
import csv
with open('gradesTab.csv', 'r') as file:
    reader = csv.reader(file, delimiter = '\t')
    for row in reader:
        print(row)

<br>[back to top ↥](#t11toc)

<a id='t11writecsvrbr'></a>
#### **▇▂  Writing into CSV Files (Row by Row) ▂▂**

To write to a CSV file in Python, we can use the `csv.writer()` function.  

The `csv.writer()` function returns a `writer` object that converts the user's data into delimited strings and writes them to the given file object.

The `writer` object provides two methods for writing to `CSV` files, `writerow()` and `writerows()`:
- `writerow()`: writes a single row at a time, given as a sequence of fields (e.g. a list).
- `writerows()`: writes multiple rows at a time, given as a sequence of rows (e.g. a list of lists).
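
To see exactly what the `writer` produces, the sketch below writes into an in-memory `io.StringIO` buffer instead of a file (an assumption made to keep the example self-contained). It also shows that a field containing the delimiter is quoted automatically, and that each row ends with `\r\n` by default.

```python
import csv
import io

# In-memory buffer standing in for a file opened with newline=''.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)

writer.writerow(['Name', 'Grade'])                                 # one row
writer.writerows([['Dandy, Jim', 'C+'], ['Elephant, Ima', 'B-']])  # several rows

written = buffer.getvalue()
print(repr(written))
# 'Name,Grade\r\n"Dandy, Jim",C+\r\n"Elephant, Ima",B-\r\n'
```

Since the `writer` adds its own line endings, files are opened with `newline=''` so that no extra translation is applied on top of them.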

In [None]:
import csv
with open('gradesW1.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    
    writer.writerow(["Lastname","Firstname","SSN","Test1","Test2","Test3","Test4","Final","Grade"])
    writer.writerow(["George","Boy","345-67-3901",40.0,1.0,11.0,-1.0,4.0,"B"])
    writer.writerow(["Heffalump","Harvey","632-79-9439",30.0,1.0,20.0,30.0,40.0,"C"])

Now let's take a look into the contents of the file we have just written: 

In [None]:
with open("gradesW1.csv") as f:
    text = f.read()
print(f"The full text in the file is:\n\n{text}")

<br>[back to top ↥](#t11toc)

<a id='t11writecsvmultiple'></a>
#### **▇▂  Writing into CSV Files (Multiple Rows) ▂▂**

To write the contents of a 2-dimensional list (a list of rows) to a `CSV` file, we can use `writerows()`:

In [None]:
import csv
csv_rowlist = [ ["Lastname","Firstname","SSN","Test1","Test2","Test3","Test4","Final","Grade"],
                ["Dandy","Jim","087-75-4321",47.0,1.0,23.0,36.0,45.0,"C+"],
                ["Elephant","Ima","456-71-9012",45.0,1.0,78.0,88.0,77.0,"B-"],
                ["Franklin","Benny","234-56-2890",50.0,1.0,90.0,80.0,90.0,"B-"]]

with open('gradesW2.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(csv_rowlist)

Now let's take a look into the contents of the file we have just written:

In [None]:
with open("gradesW2.csv") as f:
    text = f.read()
print(f"The full text in the file is:\n\n{text}")

<br>[back to top ↥](#t11toc)

<a id='t11delimiter'></a>
#### **▇▂  Writing into CSV Files (Custom Delimiter) ▂▂**

As mentioned before, by default, a comma `,` is used as a delimiter in a `CSV` file.

However, we can pass a different `delimiter` argument to the `csv.writer()` function:

In [None]:
import csv
csv_rowlist = [ ["Lastname","Firstname","SSN","Test1","Test2","Test3","Test4","Final","Grade"],
                ["Dandy","Jim","087-75-4321",47.0,1.0,23.0,36.0,45.0,"C+"],
                ["Elephant","Ima","456-71-9012",45.0,1.0,78.0,88.0,77.0,"B-"],
                ["Franklin","Benny","234-56-2890",50.0,1.0,90.0,80.0,90.0,"B-"]]

with open('gradesW3.csv', 'w', newline='') as file:
    writer = csv.writer(file, delimiter='|')
    writer.writerows(csv_rowlist)

Now let's take a look into the contents of the file we have just written:

In [None]:
with open("gradesW3.csv") as f:
    text = f.read()
print(f"The full text in the file is:\n\n{text}")
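
A file written with a custom delimiter has to be read with the same delimiter. The following sketch writes two rows with `'|'` into an in-memory buffer (used here in place of a file) and reads them back, first with the default comma and then with the matching delimiter:

```python
import csv
import io

rows = [['Lastname', 'Firstname', 'Grade'], ['Dandy', 'Jim', 'C+']]

# Write with '|' into an in-memory buffer instead of a file.
buffer = io.StringIO(newline='')
csv.writer(buffer, delimiter='|').writerows(rows)

buffer.seek(0)
wrong = list(csv.reader(buffer))                   # default delimiter ','
buffer.seek(0)
right = list(csv.reader(buffer, delimiter='|'))    # matching delimiter

print(wrong)   # [['Lastname|Firstname|Grade'], ['Dandy|Jim|C+']]
print(right)   # [['Lastname', 'Firstname', 'Grade'], ['Dandy', 'Jim', 'C+']]
```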

<br>[back to top ↥](#t11toc)

<br><br><a id='t11ex3'></a>
◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾

**✎ Exercise 𝟛**<br> <br> ▙ ⏰ ~ 3+3 min. ▟ <br>

❶ Write code that saves the result of Exercise 2.1 to a JSON file. 

In [None]:
# Exercise 3.1


❷ Write code that saves the result of Exercise 2.1 to a CSV file.

In [None]:
# Exercise 3.2


<br>[back to top ↥](#t11toc)

<br><br><a id='t11sol'></a>
◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼<br>
◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼◼

#### 🔑 **Exercises Solutions** ####

**Exercise 1.1:**

In [None]:
with open("studentsgrades.txt") as f:
    t = f.readlines()

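# lstrip() treats its argument as a set of characters: remove the 'CMI-Inf St. Num ' label
t = [student.lstrip('CMI-InfSt.Num ') for student in t]
# then remove the student number (digits and spaces) that follows it
t = [student.lstrip(' 0123456789') for student in t]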


with open("studentsgrades-m.txt", "w") as f:
    f.writelines(t)

**Exercise 1.2:**

In [None]:
with open("studentsgrades.txt") as f:
    t = f.readlines()

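# remove the 'CMI-Inf St. Num ' label (the argument is a set of characters)
t = [student.lstrip('CMI-InfSt.Num ') for student in t]
# strip the student number on the left and the grade and newline on the right, then restore '\n'
t = [student.strip(' .0123456789\n') + '\n' for student in t]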


with open("studentsnames.txt", "w") as f:
    f.writelines(t)
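
The `lstrip()` calls above depend on the exact characters in the prefix. As a possible alternative (not the approach used in the solutions), the records can be split on whitespace and the last three fields kept. The sketch below works on two sample lines copied from the exercise instead of reading the file, and assumes that every first and last name is a single word.

```python
# Two sample records from the exercise, standing in for f.readlines().
lines = [
    'CMI-Inf St. Num 1002121 Andrew Bren 8.4\n',
    'CMI-Inf St. Num 1002121 Peter Cole 7.0\n',
]

records = []   # lines for studentsgrades-m.txt
names = []     # lines for studentsnames.txt
for line in lines:
    fields = line.split()               # whitespace split also drops the '\n'
    first, last, grade = fields[-3:]    # last three fields of each record
    records.append(f'{first} {last} {grade}\n')
    names.append(f'{first} {last}\n')

print(records)   # ['Andrew Bren 8.4\n', 'Peter Cole 7.0\n']
print(names)     # ['Andrew Bren\n', 'Peter Cole\n']
```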

<br>[back to Exercise 1 ↥](#t11ex1)

<br>[back to top ↥](#t11toc)

**Exercise 2.1:**

In [None]:
class student:
    
    def __init__(self, fn, ln, gr):
        self.firstName = fn
        self.lastName = ln
        self.grade = gr

    def __str__(self):
        return f'{self.firstName} {self.lastName} ({self.grade})'

    def __repr__(self):
        return f'{self.__class__.__name__}({self.firstName}, {self.lastName}, {self.grade})'


with open("studentsgrades.txt") as f:
    t = f.readlines()

t = [st.lstrip('CMI-InfSt.Num ') for st in t]
t = [st.lstrip(' 0123456789') for st in t]
t = [st.rstrip('\n') for st in t]

t = [st.split(' ') for st in t]

studentsList= [student(st[0],st[1],st[2]) for st in t]

for st in studentsList:
    print(st)

**Exercise 2.2:**

In [None]:
class student:
    def __init__(self, fn, ln, gr):
        self.firstName = fn
        self.lastName = ln
        self.grade = gr

    def __str__(self):
        return f'{self.firstName} {self.lastName} ({self.grade})'

    def __repr__(self):
        return f'{self.__class__.__name__}({self.firstName}, {self.lastName}, {self.grade})'

class group:
    def __init__(self, sl):
        self.studentsList = sl

    def search(self, key_lastname):
        for st in self.studentsList:
            if st.lastName == key_lastname:
                return st
        return None

with open("studentsgrades.txt") as f:
    t = f.readlines()

t = [st.lstrip('CMI-InfSt.Num ') for st in t]
t = [st.lstrip(' 0123456789') for st in t]
t = [st.rstrip('\n') for st in t]

t = [st.split(' ') for st in t]

studentsList= [student(st[0],st[1],st[2]) for st in t]

analysis2_cmiinf1M = group(studentsList)

print(analysis2_cmiinf1M.search('Charles'))
print(analysis2_cmiinf1M.search('Andy'))

<br>[back to Exercise 2 ↥](#t11ex2)

<br>[back to top ↥](#t11toc)

**Exercise 3.1:**

In [None]:
import json

class student:
    
    def __init__(self, fn, ln, gr):
        self.firstName = fn
        self.lastName = ln
        self.grade = gr

    def __str__(self):
        return f'{self.firstName} {self.lastName} ({self.grade})'

    def __repr__(self):
        return f'{self.__class__.__name__}({self.firstName}, {self.lastName}, {self.grade})'


with open("studentsgrades.txt") as f:
    t = f.readlines()

t = [st.lstrip('CMI-InfSt.Num ') for st in t]
t = [st.lstrip(' 0123456789') for st in t]
t = [st.rstrip('\n') for st in t]

t = [st.split(' ') for st in t]

studentsList= [student(st[0], st[1], st[2]) for st in t]

studentsDict = []

for st in studentsList:
    dictItem ={"First Name": st.firstName, "Last Name": st.lastName, "Grade": st.grade}
    studentsDict.append(dictItem)

with open('students-grades.json', 'w') as f:
    json.dump(studentsDict , f, indent = 1)
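
The solution builds dictionaries first because the `json` module cannot serialise instances of a user-defined class directly. The sketch below demonstrates this with a hypothetical `Student` class (a stand-in for the `student` class above) and shows one possible shortcut: passing `default=vars`, so that each object is replaced by its attribute dictionary. Note that the keys are then the attribute names, not the `"First Name"` style keys used in the solution.

```python
import json

class Student:   # hypothetical stand-in for the student class in the solution
    def __init__(self, fn, ln, gr):
        self.firstName = fn
        self.lastName = ln
        self.grade = gr

students = [Student('Andrew', 'Bren', '8.4')]

# Serialising the objects directly raises a TypeError.
try:
    json.dumps(students)
    raised = False
except TypeError:
    raised = True
print(raised)    # True

# default= is called for objects json cannot handle; vars() returns
# the object's attribute dictionary.
as_json = json.dumps(students, default=vars)
print(as_json)   # [{"firstName": "Andrew", "lastName": "Bren", "grade": "8.4"}]
```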

**Exercise 3.2:**

In [None]:
import csv

class student:
    
    def __init__(self, fn, ln, gr):
        self.firstName = fn
        self.lastName = ln
        self.grade = gr

    def __str__(self):
        return f'{self.firstName} {self.lastName} ({self.grade})'

    def __repr__(self):
        return f'{self.__class__.__name__}({self.firstName}, {self.lastName}, {self.grade})'


with open("studentsgrades.txt") as f:
    t = f.readlines()

t = [st.lstrip('CMI-InfSt.Num ') for st in t]
t = [st.lstrip(' 0123456789') for st in t]
t = [st.rstrip('\n') for st in t]

t = [st.split(' ') for st in t]

studentsList= [student(st[0], st[1], st[2]) for st in t]

header = ["FName", "LName", "Grade"] 
rows = [header]
for st in studentsList:
    new_row =[st.firstName, st.lastName, st.grade]
    rows.append(new_row)

with open('students-grades.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='\t')
    writer.writerows(rows)  

<br>[back to Exercise 3 ↥](#t11ex3)

<br>[back to top ↥](#t11toc)