## Module 4: Python


# What We Can Do With Data
## READING and WRITING FILES
<br>

Asel Kushkeyeva<br>
Data Science Institute, University of Toronto<br>
2022

### Jupyter Notebook as a Slideshow

To see this notebook as a live slideshow, we need to install RISE (Reveal.js - Jupyter/IPython Slideshow Extension):

1. Insert a cell and execute the following code: `conda install -c conda-forge rise`
2. Restart the Jupyter Notebook.
3. At the top of your notebook there is now a new icon that looks like a bar chart; hover over the icon to see 'Enter/Exit RISE Slideshow'.
4. Click on the RISE icon and enjoy the slideshow.
5. You can edit the notebook in slideshow mode by double-clicking a line.

*This is done only once. Now all your notebooks will have the RISE extension (unless you re-install the Jupyter Notebook).*

# Agenda

1. Reading Files
2. Writing Files

# We Can Read and Write Files

A few things to consider when working with files:

- File paths consist of folder path, file name, and extension.
<br>

For example, the current Jupyter Notebook's file path could be users/name/Desktop/DataScienceInstitute/whatcanwedowithdata.ipynb. The folder path is *users/name/Desktop/DataScienceInstitute/*; the file name is *whatcanwedowithdata*; the extension is *.ipynb*.
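
The three parts can also be pulled apart in code. A minimal sketch using the standard `os.path` module on the example path above (the path is only an illustration and does not need to exist on disk):

```python
import os

# Illustrative path from the example above; it does not need to exist.
file_path = 'users/name/Desktop/DataScienceInstitute/whatcanwedowithdata.ipynb'

# Split into the folder path and the last component (file name + extension).
folder_path, file_name_with_ext = os.path.split(file_path)

# Split the last component into the file name and the extension.
file_name, extension = os.path.splitext(file_name_with_ext)

print(folder_path)  # users/name/Desktop/DataScienceInstitute
print(file_name)    # whatcanwedowithdata
print(extension)    # .ipynb
```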

In [18]:
# to display the current working directory
import os
os.getcwd()

'/Users/aselkushkeyeva/Desktop/DSI'

To change the working directory: os.chdir('new folder path')

In a relative path, *..* stands for the directory above the current one. To work with a file in the directory above the current:
<br>

__../whatcanwedowithdata.ipynb__

<br>

And two directories above:
<br>

__../../whatcanwedowithdata.ipynb__
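
To see where a path containing `..` actually points, it can be resolved against a starting folder. A small sketch, assuming a hypothetical starting folder `/Users/name/Desktop/DSI` (nothing is read from disk here):

```python
import os

# Hypothetical starting folder, used only for illustration.
current_dir = '/Users/name/Desktop/DSI'

# '..' means "the parent of this directory"; normpath collapses it.
one_up = os.path.normpath(os.path.join(current_dir, '..'))
two_up = os.path.normpath(os.path.join(current_dir, '..', '..'))

print(one_up)
print(two_up)
```

On macOS or Linux this prints `/Users/name/Desktop` and `/Users/name`; on Windows the same folders are shown with backslashes.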

- Line endings.
<br>

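Windows marks the end of a line with a carriage return followed by a new line (__\r\n__), while Unix systems (including macOS and Linux) use a new line (__\n__) alone. We may need to account for this when a file was created on a different operating system.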
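
By default, Python's text mode translates line endings to `\n` when reading, so the difference is often invisible. The sketch below makes it visible by passing `newline=''`, which switches the translation off. The file name is hypothetical and the file is created in a temporary folder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'windows_lines.txt')  # hypothetical file name

    # newline='' turns off translation, so '\r\n' is written exactly as given.
    with open(path, 'w', newline='') as f:
        f.write('first\r\nsecond\r\n')

    # Default text mode translates '\r\n' to '\n' when reading.
    with open(path, 'r') as f:
        translated = f.read()

    # newline='' when reading keeps the original line endings.
    with open(path, 'r', newline='') as f:
        raw = f.read()

print(repr(translated))  # 'first\nsecond\n'
print(repr(raw))         # 'first\r\nsecond\r\n'
```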

- Character encoding.
<br>

The most common character encodings are ASCII and Unicode encodings such as UTF-8. Our code might raise an error if we try to read a file as ASCII when it was saved with a Unicode encoding.
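
A minimal sketch of such an error, assuming a small file that contains one non-ASCII character. The file name is hypothetical, and the `encoding` argument of `open` is used to choose how the bytes are interpreted.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'cafe.txt')  # hypothetical file name

    # Write text containing a non-ASCII character, encoded as UTF-8.
    with open(path, 'w', encoding='utf-8') as f:
        f.write('café')

    # Reading with the matching encoding works.
    with open(path, 'r', encoding='utf-8') as f:
        text = f.read()

    # Reading as ASCII fails, because 'é' is not an ASCII character.
    try:
        with open(path, 'r', encoding='ascii') as f:
            f.read()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

print(text == 'café')  # True
print(ascii_failed)    # True
```

Stating the encoding explicitly avoids depending on the default of the operating system.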

## Reading files

__open('abc.txt')__ is the code to open a file in Python. `open` is a built-in function that requires one argument -- file path. In our case the file path is *'abc.txt'*.

After opening a file and completing any work with it, the file needs to be closed. The *close()* method does this for us. There is a cleaner way to open and close files: the *with* statement, which closes the file automatically when its block ends.
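
The two approaches can be compared side by side. This is only a sketch: it first creates a small *abc.txt* in a temporary folder (using write mode) so that there is something to open, then checks the file's `closed` attribute after each approach.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'abc.txt')

    # Create a small file so that there is something to open.
    with open(path, 'w') as abc_file:
        abc_file.write('a b c')

    # Option 1: open and close by hand.
    abc_file = open(path)
    manual_contents = abc_file.read()
    abc_file.close()
    closed_manually = abc_file.closed

    # Option 2: the with statement closes the file when the block ends.
    with open(path) as abc_file:
        with_contents = abc_file.read()
    closed_by_with = abc_file.closed

print(closed_manually, closed_by_with)  # True True
```

With the *with* statement the file is closed even if an error occurs inside the block, which is easy to forget when calling *close()* by hand.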

In [15]:
with open('abc.txt', 'r') as abc_file:
    # some file processing goes here
    pass

# 'pass' is a placeholder: a with block needs at least one statement,
# otherwise Python raises an IndentationError.
# Running this cell still raises a FileNotFoundError
# unless abc.txt exists in the current directory.

__'r'__ tells Python to open the file for reading only. We replace __'r'__ with __'w'__ if we want to write a file, and with __'a'__ if we want to append information to the end of an existing file.

### Methods to read files:

- .read() -- reads the entire file and returns it as one string.
- .readline() -- reads one line at a time; each call returns the next line.
- .readlines() -- reads all the lines and returns them as a list of strings.
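
The difference between the three methods is easiest to see on a small file. A sketch, assuming a three-line *abc.txt* created in a temporary folder just for this example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'abc.txt')

    # Create a three-line file to read from.
    with open(path, 'w') as abc_file:
        abc_file.write('a\nb\nc\n')

    with open(path, 'r') as abc_file:
        whole_file = abc_file.read()        # everything as one string

    with open(path, 'r') as abc_file:
        first_line = abc_file.readline()    # first call returns the first line
        second_line = abc_file.readline()   # next call returns the next line

    with open(path, 'r') as abc_file:
        all_lines = abc_file.readlines()    # list with one string per line

print(repr(whole_file))                     # 'a\nb\nc\n'
print(repr(first_line), repr(second_line))  # 'a\n' 'b\n'
print(all_lines)                            # ['a\n', 'b\n', 'c\n']
```

Note that each line keeps its trailing `\n`.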

### *for* loop to read line by line

In [None]:
with open('abc.txt', 'r') as abc_file:
    for line in abc_file:
        # each line already ends with '\n', so stop print from adding another
        print(line, end='')

# this code raises a FileNotFoundError unless abc.txt exists in the current directory.

*An exercise with a txt file*

To read csv files:

In [None]:
import csv
with open('123.csv', 'r', newline='') as number_file:
    contents = csv.reader(number_file)
    for row in contents:
        print(row)
# Replace '123.csv' with an existing file in the current directory for this code to work.
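
For a version that runs without an existing file, the sketch below first creates a small *123.csv* in a temporary folder. The column names and numbers are made up, and `csv.writer` is used only for this set-up step.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, '123.csv')

    # Set-up: create a small csv file with made-up values.
    with open(path, 'w', newline='') as number_file:
        writer = csv.writer(number_file)
        writer.writerow(['x', 'y'])
        writer.writerow([1, 2])
        writer.writerow([3, 4])

    # Read the file back; each row becomes a list of strings.
    with open(path, 'r', newline='') as number_file:
        rows = list(csv.reader(number_file))

print(rows)  # [['x', 'y'], ['1', '2'], ['3', '4']]
```

The numbers come back as strings, so they need to be converted (for example with `int`) before doing arithmetic.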

## PRACTICE IN YOUR NOTEBOOK

A *with open* and *for* loop exercise with a *txt* file.

## Writing to Files

In [1]:
with open('fruits.txt', 'w') as fruit_file:
    fruit_file.write('tangerines')

That's it! We created a txt file *fruits* in the current working directory. The *fruits* file contains the string 'tangerines'.

__Warning!__ Opening a file with *'w'* creates a new file. If the file already exists, *'w'* overwrites its entire contents.
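
A short sketch of the overwrite behaviour, run in a temporary folder so that no real file is affected (the second fruit name is made up for the example):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'fruits.txt')

    with open(path, 'w') as fruit_file:
        fruit_file.write('tangerines')

    # Opening the same file with 'w' again empties it before writing.
    with open(path, 'w') as fruit_file:
        fruit_file.write('kiwis')

    with open(path, 'r') as fruit_file:
        contents = fruit_file.read()

print(contents)  # kiwis
```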

To add content to *fruits*:

In [3]:
with open('fruits.txt', 'a') as fruit_file:
    fruit_file.write('dragon fruit')

Right now the *fruits* file contains the following: 
<br>

tangerinesdragon fruit

For content to appear on a new line, start the string with the newline character __\n__:

In [4]:
with open('fruits.txt', 'a') as fruit_file:
    fruit_file.write('\napples')
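
To check the result, the file can be read back. The sketch below repeats the three writes above in a temporary folder (so it can be run on its own) and then reads the lines:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'fruits.txt')

    # The same three steps as above: write once, then append twice.
    with open(path, 'w') as fruit_file:
        fruit_file.write('tangerines')
    with open(path, 'a') as fruit_file:
        fruit_file.write('dragon fruit')
    with open(path, 'a') as fruit_file:
        fruit_file.write('\napples')

    with open(path, 'r') as fruit_file:
        lines = fruit_file.readlines()

print(lines)  # ['tangerinesdragon fruit\n', 'apples']
```

*write()* never adds a line ending by itself, which is why the first two strings end up on the same line.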

# References

- Chapter 10, Gries, Campbell, and Montojo, 2017, *Practical Programming: An Introduction to Computer Science Using Python 3.6*