# Introduction to Computer Programming

## Week 6.3: Modules for reading & writing files

* * *

<img src="img/full-colour-logo-UoB.png" alt="Bristol" style="width: 300px;"/>

We use/store data in different formats.

Some Python modules for reading/writing, importing/exporting data files:
- `csv`: working with delimited files
- `os`: operating-system level operations e.g. manipulating a file-system

# CSV (and other delimited files)

CSV (comma-separated values) file: a delimited text file that uses a comma to separate values.

A delimited file uses a set character (e.g. comma, tab, space or vertical bar) to separate values.

The CSV file is a widely used format for storing tabular data in plain text and is supported by many software applications, e.g. Microsoft Excel and Google Sheets.





<img src="img/csv_file_example.png" alt="Bristol" style="width: 400px;"/>

Python's `csv` module can be used to handle files of this type.

https://docs.python.org/3/library/csv.html

In [40]:
import csv

### Writing CSV files

Steps for writing/appending a CSV file are almost the same as for a text file (only step 2 is different):
1. open the csv file in `w` (write) or `a` (append) mode using `open` / `with open`
1. create a CSV writer object using `writer`
1. write data to file using `writerow` (one row) or `writerows` (multiple rows)
1. (close file using `close`; not needed when using `with open`)

In [49]:
import csv

f = open('sample_data/scores.csv', 'w')

writer = csv.writer(f) 

writer.writerow([1, 'Elena', 550]) # row should be iterable, e.g. list

f.close()

__Example:__ Edit the file `scores.csv` to create the high score table shown below as a CSV file (we have already written the first line).

| Place | Name | Score | 
| :-: | :-: | :-: |  
| 1 | Elena | 550 | 
| 2 | Sajid | 480 | 
| 3 | Tom | 380 | 
| 4 | Farhad | 305 | 
| 5 | Manesha | 150 |

### Appending CSV files

In [50]:
import csv

with open('sample_data/scores.csv', 'a') as f:

    writer = csv.writer(f) 

    writer.writerow([2, 'Sajid', 480]) 
    writer.writerow([3, 'Tom', 380]) 
    writer.writerow([4, 'Farhad', 305]) 
    writer.writerow([5, 'Manesha', 150]) 

If you open `scores.csv` on Windows, you may see an additional blank row between successive rows:

<img src="img/scores_csv.png" alt="Bristol" style="width: 200px;"/>

To avoid the blank rows, pass the argument `newline=''` to the `open` function.
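The extra row comes from line endings: by default the `csv` writer ends every row with `\r\n`, and a file opened in text mode on Windows then translates the `\n` a second time. The following is a minimal sketch of the first half of that story, using an in-memory `io.StringIO` buffer (an assumption made for illustration, so that no real file is touched) to reveal the characters the writer adds:

```python
import csv
import io

# newline='' stops Python translating line endings, as with open(..., newline='')
buffer = io.StringIO(newline='')

writer = csv.writer(buffer)
writer.writerow([1, 'Elena', 550])
writer.writerow([2, 'Sajid', 480])

written_text = buffer.getvalue()

# repr shows the line endings that the writer itself added
print(repr(written_text))
```

Each row already ends in `\r\n`, which is why the file should be opened with `newline=''` so that nothing further is added.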

High score table data:

In [51]:
header = ['place', 'name', 'score']

data = [[1, 'Elena', 550], 
        [2, 'Sajid', 480],
        [3, 'Tom', 380],
        [4, 'Farhad', 305],
        [5, 'Manesha', 150]    
       ]

__Example:__ Write the high score table data to a csv file

In [52]:
import csv

with open('sample_data/scores.csv', 'w', newline='') as f:

    writer = csv.writer(f) 

    writer.writerow(header) # write single row
    
    writer.writerows(data)  # write multiple rows


<img src="img/scores_csv_newline.png" alt="Bristol" style="width: 200px;"/>
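The same writer can produce the other delimited formats mentioned earlier. As a minimal sketch, assuming a tab is the chosen separator, the `delimiter` argument of `csv.writer` selects the character placed between values (`csv.reader` accepts the same argument). An in-memory buffer stands in for a file here:

```python
import csv
import io

# In-memory text buffer used in place of a file (illustration only)
buffer = io.StringIO(newline='')

# Write two rows using a tab instead of a comma as the delimiter
writer = csv.writer(buffer, delimiter='\t')
writer.writerow(['place', 'name', 'score'])
writer.writerow([1, 'Elena', 550])

tab_text = buffer.getvalue()
print(tab_text)
```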

### Writing data as columns

The data for the high score table may be stored in the Python program as:

    places = [1, 2, 3, 4, 5]
    names = ['Elena', 'Sajid', 'Tom', 'Farhad', 'Manesha']
    scores = [550, 480, 380, 305, 150]
    
We cannot write a column explicitly, as we would in Excel.

The CSV file is essentially a text file with commas to separate values, so it is written one row at a time.

We can use `zip` to regroup the column lists into rows before writing to a CSV file.

As when using `zip` to iterate through two lists, items from multiple lists are regrouped elementwise.

In [53]:
places = [1, 2, 3, 4, 5]
names = ['Elena', 'Sajid', 'Tom', 'Farhad', 'Manesha']
scores = [550, 480, 380, 305, 150]

data = zip(places, names, scores)

print(list(data))

[(1, 'Elena', 550), (2, 'Sajid', 480), (3, 'Tom', 380), (4, 'Farhad', 305), (5, 'Manesha', 150)]


To transpose a list of lists we can use `zip` together with `*`.

The `*` *unpacks* the outer list (removing the outer brackets), so each inner list is passed to `zip` as a separate argument.

In [54]:
data = [[1, 2, 3, 4, 5], 
        ['Elena', 'Sajid', 'Tom', 'Farhad', 'Manesha'],
        [550, 480, 380, 305, 150]    
       ]

data_cols = zip(*data)

print(list(data_cols))

[(1, 'Elena', 550), (2, 'Sajid', 480), (3, 'Tom', 380), (4, 'Farhad', 305), (5, 'Manesha', 150)]
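Putting the pieces together, the regrouped rows can be passed straight to `writerows`. This is a minimal sketch, assuming a hypothetical file name `scores_from_columns.csv` inside a temporary directory so that no existing file is overwritten:

```python
import csv
import os
import tempfile

places = [1, 2, 3, 4, 5]
names = ['Elena', 'Sajid', 'Tom', 'Farhad', 'Manesha']
scores = [550, 480, 380, 305, 150]

# Temporary directory so the sketch does not overwrite any real file
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'scores_from_columns.csv')

    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['place', 'name', 'score'])
        # zip regroups the three column lists into one tuple per row
        writer.writerows(zip(places, names, scores))

    # Read the file back to check what was written
    with open(path, newline='') as f:
        rows_written = list(csv.reader(f))

print(rows_written)
```

Note that a `zip` object can only be iterated once: converting it with `list(...)` to print it uses it up, so call `zip` again (or keep the list) before writing.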


### Reading CSV files
Steps for reading a CSV file are almost the same as for a text file (only step 2 is different):
1. open the csv file in `r` (read) mode (the default mode specifier)
1. create a CSV reader object using `reader`
1. The file contents are imported as an *iterable* object, i.e. it can be looped over like a list (but it cannot be indexed).
<br>The items of the iterable object are the lines of the file as lists.
<br>The items of each list are the comma-separated values as strings.
1. (close file using `close`; not needed when using `with open`)

In [56]:
import csv

f = open('sample_data/scores.csv')

reader = csv.reader(f) 

for line in reader:
    print(line)

f.close()

['place', 'name', 'score']
['1', 'Elena', '550']
['2', 'Sajid', '480']
['3', 'Tom', '380']
['4', 'Farhad', '305']
['5', 'Manesha', '150']
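As the output shows, every value is read as a string, including the numbers. The following minimal sketch skips the header row with `next` and converts the scores to integers. The string `csv_text` is an assumption standing in for the contents of `scores.csv`, so the example runs without the file:

```python
import csv
import io

# Text standing in for the contents of scores.csv (illustration only)
csv_text = "place,name,score\n1,Elena,550\n2,Sajid,480\n3,Tom,380\n"

reader = csv.reader(io.StringIO(csv_text))

header = next(reader)   # first row holds the column names

scores = []
for place, name, score in reader:
    scores.append(int(score))   # convert the string '550' to the integer 550

print(header)
print(scores)
print(sum(scores))
```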


# Iterating multiple files 

We may want to include multiple files in a program, e.g. iterate over multiple files in a directory. 

The Python module `os` has many useful functions for system-level operations:
https://docs.python.org/3/library/os.html

__Example:__ Print the names of all the files in the directory `sample_data/a_folder`

In [87]:
import os

# gets the current directory as a string
current_directory = os.getcwd() 

# joins directory names to create a path to the target directory
directory_to_iterate = os.path.join(current_directory, 'sample_data', 'a_folder')

# loops through the names of the entries in that directory (os.listdir returns a list)
for file in os.listdir(directory_to_iterate):
    print(file)

sample_student_data.csv
signal_data.csv
temperature_data.csv
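Once the file names are available, each file can be opened in turn. The following is a minimal sketch, assuming a temporary directory with hypothetical files `a.csv`, `b.csv` and `notes.txt` in place of `sample_data/a_folder`. It reads only the `.csv` files and counts the rows in each:

```python
import csv
import os
import tempfile

row_counts = {}

# Temporary directory with small files, standing in for sample_data/a_folder
with tempfile.TemporaryDirectory() as folder:
    for file_name, rows in [('a.csv', [[1, 2], [3, 4]]), ('b.csv', [[5, 6]])]:
        with open(os.path.join(folder, file_name), 'w', newline='') as f:
            csv.writer(f).writerows(rows)

    # A non-CSV file that the loop below should skip
    with open(os.path.join(folder, 'notes.txt'), 'w') as f:
        f.write('not a csv file')

    # sorted() gives a predictable order; os.listdir makes no ordering promise
    for file_name in sorted(os.listdir(folder)):
        if not file_name.endswith('.csv'):
            continue
        # os.listdir returns names only, so join with the folder to get a path
        with open(os.path.join(folder, file_name), newline='') as f:
            row_counts[file_name] = len(list(csv.reader(f)))

print(row_counts)
```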


# Summary 
Some Python modules for reading/writing, importing/exporting data files:
- `csv`: working with delimited files
- `os`: operating-system level operations e.g. manipulating a file-system

# Further reading
- Explore the `csv` module further (e.g. exporting Python dictionaries to .csv files) https://docs.python.org/3/library/csv.html
- Explore the `os` module for system-level operations (e.g. creating a new directory in your filesystem)  https://docs.python.org/3/library/os.html
- We will learn more ways to read and write files using packages we study later on the course (e.g. `matplotlib`, `numpy`).
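As a starting point for the first suggestion above, here is a minimal sketch of exporting dictionaries using `csv.DictWriter`. The `records` list is made-up illustration data based on the high score table, and an in-memory buffer is used in place of a file:

```python
import csv
import io

# Each dictionary is one row; the keys match the column names
records = [
    {'place': 1, 'name': 'Elena', 'score': 550},
    {'place': 2, 'name': 'Sajid', 'score': 480},
]

buffer = io.StringIO(newline='')

# fieldnames sets the column order used for the header and every row
writer = csv.DictWriter(buffer, fieldnames=['place', 'name', 'score'])
writer.writeheader()
writer.writerows(records)

dict_text = buffer.getvalue()
print(dict_text)
```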

In [None]:
# Accessing multiple files

TODO:
- Other file locations
- Other file types
- csv:
  - https://www.pluralsight.com/guides/importing-data-from-tab-delimited-files-with-python
  - https://www.pythontutorial.net/python-basics/python-write-csv-file/
- Reading in multiple files, creating a directory:
  - https://automatetheboringstuff.com/chapter8/
  - https://automatetheboringstuff.com/chapter9/
- Further reading: Excel files? dictionaries, `csv` library, `os` library for making directories
- Challenge: https://thispointer.com/python-add-a-column-to-an-existing-csv-file/

In [None]:
example exercise total area
https://www.pythontutorial.net/python-basics/python-read-csv-file/

# Creating a directory
We have seen how to create a file in another directory, but that directory must already exist. 

The Python `os` module is useful for system level operations.
https://docs.python.org/3/library/os.html

In [57]:
import os

To create a new directory:

In [76]:
os.getcwd()

'C:\\Users\\hemma\\iCloudDrive\\Documents\\Code\\Jupyter_NBooks\\Teaching\\UoB\\UoB_ICP_2021'

In [77]:
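# move into the sample_data directory
os.chdir(os.path.join(os.getcwd(), 'sample_data'))

# create a single new directory (raises an error if it already exists)
os.mkdir('a_folder')

# create nested directories, including any missing parent directories
os.makedirs(os.path.join('another_folder', 'my_folder'))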

In [78]:
os.getcwd()

'C:\\Users\\hemma\\iCloudDrive\\Documents\\Code\\Jupyter_NBooks\\Teaching\\UoB\\UoB_ICP_2021\\sample_data'

In [79]:
# move back up to the parent directory
os.chdir(os.path.join(os.getcwd(), '../'))

In [65]:
os.getcwd()

'C:\\Users\\hemma\\iCloudDrive\\Documents\\Code\\Jupyter_NBooks\\Teaching\\UoB\\UoB_ICP_2021'

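In [None]:
To create multiple nested directories in one call, listed from parent to child:

In [None]:
# exist_ok=True avoids an error if the directories were already created above
os.makedirs(os.path.join('sample_data', 'another_folder', 'my_folder'), exist_ok=True)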
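Creating the directory first removes the restriction that the target directory must already exist. The following is a minimal sketch that makes the nested directories and then writes a CSV file into them. A temporary base directory is assumed so the example leaves nothing behind, and the file name `scores.csv` is reused purely for illustration:

```python
import csv
import os
import tempfile

# Temporary base directory so the sketch leaves nothing behind
with tempfile.TemporaryDirectory() as base_dir:
    target_dir = os.path.join(base_dir, 'another_folder', 'my_folder')

    # exist_ok=True avoids an error if the directories already exist
    os.makedirs(target_dir, exist_ok=True)

    file_path = os.path.join(target_dir, 'scores.csv')
    with open(file_path, 'w', newline='') as f:
        csv.writer(f).writerow([1, 'Elena', 550])

    # Check that the file now exists inside the new directories
    file_created = os.path.isfile(file_path)

print(file_created)
```

Building full paths with `os.path.join` in this way means the working directory does not have to be changed with `os.chdir`.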