<img src="images/inmas.png" width=130x align=right />

# Notebook 08 - Input and Output

Material covered in this notebook:
- How to open and close a file
- How to read lines from a text file
- How to write text to a file

### Prerequisite
Notebook 07

### Opening files for reading and writing

The most common way to access a file in Python is to use the built-in <code>open</code> function:

- [`open()`](https://docs.python.org/3/library/functions.html#open) : returns a file object
    - It commonly takes two arguments: `open(filename, mode)`
      - filename : name of the file you want to open
      - mode (optional)
           - `'r'` : Read - reading only (default); raises an error if the file does not exist
           - `'w'` : Write - creates the file if it does not exist, and erases its contents if it does
           - `'x'` : Create - creates the file, and raises an error if the file already exists
           - `'a'` : Append - creates the file if it does not exist, and adds new text at the end if it does
           - `'r+'`: both reading and writing; the file must already exist
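
The behaviour of the different modes can be checked with a small experiment. The following is a minimal sketch, not part of the workshop data: it works in a temporary directory, the file name `demo.txt` is hypothetical, and it uses the `with` statement that is introduced later in this notebook.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo.txt')   # hypothetical file name

    with open(path, 'w') as f:     # 'w' creates the file (or empties an existing one)
        f.write('first\n')

    with open(path, 'a') as f:     # 'a' adds to the end and keeps the existing content
        f.write('second\n')

    try:
        open(path, 'x')            # 'x' refuses to overwrite an existing file
        x_failed = False
    except FileExistsError:
        x_failed = True

    with open(path, 'r') as f:     # 'r' reads the file back
        mode_demo = f.read()

print(repr(mode_demo))
print(x_failed)
```

Opening the same path again with `'w'` instead of `'a'` would have discarded the first line.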
           




### Providing a file name
The first argument is the path to the file, including the file name. For example:

```
file = open('somepath/somefile.xxx', 'r')
```

On Windows, file paths can use backslashes ('\\'), while macOS and Linux use forward slashes ('/'). See the README file associated with this Workshop for a more detailed explanation of file paths.
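
One way to avoid typing separators by hand is the standard `pathlib` module. The sketch below is only an illustration of the two conventions: it uses the "pure" path classes so that the output is the same on every operating system. In everyday code, `pathlib.Path` selects the convention of the machine it runs on.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same two path components, joined under each convention.
posix_path = PurePosixPath('data') / 'sample.txt'
windows_path = PureWindowsPath('data') / 'sample.txt'

print(posix_path)      # data/sample.txt
print(windows_path)    # data\sample.txt
```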

The data we will use is located in the `data` folder inside the folder where you saved the notebooks. To access files in this folder you could include the relative path in the `filename` variable as below:

In [None]:
filename= "data/sample.txt"

### Reading a text file
Once a text file is open, two commonly used methods of the file object can be used to read its contents:
- `read()` : reads up to the specified number of characters from the file (bytes, for files opened in binary mode)
    - If no size is specified, the default is -1, which means read the whole file
- `readline()` : reads a single line from the file, up to and including the newline character

These are called methods, because they are invoked as `f.read()` for a file object `f`.
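
To see how the two methods interact, here is a minimal sketch that reads a few characters and then the rest of the file line by line. It assumes a stand-in copy of the sample text written to a temporary directory, so it does not depend on the workshop's `data` folder.

```python
import os
import tempfile

# Create a small stand-in for data/sample.txt in a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'sample.txt')
    f = open(path, 'w')
    f.write('Hello World!\nWelcome to the INMAS Python workshop.\n')
    f.close()

    f = open(path, 'r')
    first_five = f.read(5)        # the first 5 characters
    rest_of_line = f.readline()   # from the current position to the end of the line
    next_line = f.readline()      # the whole second line
    at_end = f.readline()         # empty string: nothing left to read
    f.close()

print(repr(first_five))
print(repr(rest_of_line))
print(repr(next_line))
print(repr(at_end))
```

Each call continues from where the previous one stopped, and an empty string signals the end of the file.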

Let's look at a full example:

The file *data/sample.txt* contains the following:
```
Hello World!
Welcome to the INMAS Python workshop.
```
Let's open the file and read its contents. The `read()` method returns a single string containing all the file contents.

In this case, every line break is included in the string as a newline character `'\n'`.

In [None]:
filename= "data/sample.txt"
f = open(filename, 'r')
contents = f.read()         # Default to read the contents of the whole file.
f.close()                   # It's important to close the file! It will free up the resources that were tied with the file.
print(contents)

### A more robust way to perform I/O
Using the <code>with</code> statement is better practice, as it automatically closes the file even if the code encounters an exception. The code will run everything in the indented block then close the file object. 

- `with` statement
    - better syntax and frees resources if exceptions occur
    - no need to explicitly call the `close()` method. This is done internally.

The steps are exemplified in this code snippet:
```
    with open('somepath/somefile.xxx', 'r') as file:
        fileContents = file.read()
        
    # End of block - At this point the file object has been automatically closed
    print(fileContents)
```
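
The claim that the file is closed even when an exception occurs can be verified directly. The following is a minimal sketch under stated assumptions: the file lives in a temporary directory, its name `demo.txt` is hypothetical, and the error is raised on purpose.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo.txt')   # hypothetical file name
    with open(path, 'w') as f:
        f.write('some text\n')

    try:
        with open(path, 'r') as f:
            contents = f.read()
            raise ValueError('simulated failure inside the block')
    except ValueError:
        pass

    # f still refers to the file object, which is now closed
    closed_after_error = f.closed

print(closed_after_error)
```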


### Using `with` to read files
Here is an example using the `read()` method, which returns a string.

In [None]:
# Read file using read()
with open(filename, 'r') as f:
    contents = f.read()
    print(contents)

Or with the `readline()` method, which keeps the newline character at the end of each line (thus `end=''` in `print()`):

In [None]:
# Read file using readline()
with open(filename, 'r') as f:
    line = f.readline()
    while line:
        print(line, end='')
        line = f.readline()
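
A file object can also be used directly in a `for` loop, which yields one line per iteration and is a common shorthand for the `readline()` loop. The sketch below assumes an `io.StringIO` object as a stand-in for the open sample file; it behaves like a text file held in memory.

```python
import io

# io.StringIO behaves like an open text file; it stands in for data/sample.txt here.
f = io.StringIO('Hello World!\nWelcome to the INMAS Python workshop.\n')

lines_seen = []
for line in f:              # each iteration yields one line, newline included
    lines_seen.append(line)
    print(line, end='')
```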

### Stripping new line characters
Sometimes it is useful to get rid of the newline characters (i.e., '\n') at the end of the lines. For this we use the `strip()` method of the `str` (string) class, which removes whitespace, including newlines, from both ends of a string.

In [None]:
# Read file using readline() but stripping new line character(s)
with open(filename, 'r') as f:
    line = f.readline()
    while line:    # readline() returns '' only at the end of the file; a blank line is '\n'
        print('Stripped line is "%s"' % line.strip())
        line = f.readline()

Compare this output with the version on the previous slide.
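
Note that `strip()` removes more than the newline. The short sketch below, using a made-up string, contrasts it with `rstrip('\n')`, which removes only trailing newline characters.

```python
line = '   indented text  \n'    # made-up example line

stripped = line.strip()          # whitespace removed from both ends
newline_only = line.rstrip('\n') # only the trailing newline removed
blank = '\n'.strip()             # a blank line becomes an empty string

print(repr(stripped))
print(repr(newline_only))
print(repr(blank))
```

Use `rstrip('\n')` when indentation or trailing spaces carry meaning.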

### Working with CSV files in Python

A common file standard for storing data in columns is the comma-separated-values (CSV) format. There are multiple ways to read CSV files in Python.

We can use either the `csv` module or the Pandas module.

### Using the csv module
While we could use the built-in `open()` function to work with CSV files in Python, there is a dedicated `csv` module that makes working with CSV files much easier.

Before we can use the functions of the `csv` module, we need to import it:

In [None]:
import csv

To read a CSV file in Python, we can use the `csv.reader()` function. Let's open the csv file named `people.csv`:

In [None]:
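people = "data/people.csv"
# The csv documentation recommends opening CSV files with newline=''
with open(people, 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:          # each row is a list of strings
        print(row)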

### Understand what you write and read
With `csv.reader()`, each line gets parsed into a list, and each entry is a string.

Even if numbers were written to the CSV file, they are read back as strings that have to be converted to an integer or a float. Fortunately, Python's conversion functions `int()` and `float()` understand strings such as:

In [None]:
print(int('2'))
print(float('2.4'))

Here each string is converted to a number (by `int()` or `float()`) and then back to text for display (by `print()`).
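
Applied to a whole CSV file, the conversion could look like the sketch below. The column names and values are hypothetical (they are not the contents of *people.csv*), and `io.StringIO` stands in for an open file.

```python
import csv
import io

# Hypothetical CSV content; io.StringIO stands in for an open file.
text = 'name,age,height\nAda,36,1.65\nAlan,41,1.78\n'

reader = csv.reader(io.StringIO(text))
header = next(reader)                 # the first row holds the column names
records = []
for name, age, height in reader:      # every entry arrives as a string
    records.append((name, int(age), float(height)))

print(header)
print(records)
```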

Depending on the number of digits used when floats are written, precision can be lost when they are read back. This is a common source of lost accuracy. Be vigilant!
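
A small illustration of this pitfall, as a sketch: writing a float with three decimals discards digits that cannot be recovered, while the text produced by `repr()` converts back to exactly the same float.

```python
import math

value = math.pi

short_text = '%.3f' % value     # only three decimals are kept
full_text = repr(value)         # enough digits to recover the float exactly

print(short_text)
print(float(short_text) == value)
print(float(full_text) == value)
```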

### Specifying different delimiters
Note that if you open the *people.csv* file in a text editor, all values are separated by commas. The comma is called the delimiter. Suppose instead that our CSV file used a tab as the delimiter. To read such a file, pass the *delimiter* parameter to the `csv.reader()` function, as follows (`\t` is the escape code for a tab):
```
reader = csv.reader(file, delimiter='\t')
```
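
The effect of the `delimiter` parameter can be seen in the following minimal sketch. The tab-separated text is made up for the example, and `io.StringIO` stands in for an open file.

```python
import csv
import io

tab_text = 'name\tcity\nAda\tLondon\n'   # hypothetical tab-separated content

# With the right delimiter, each line is split into separate fields.
rows = list(csv.reader(io.StringIO(tab_text), delimiter='\t'))

# With the default comma delimiter, each line stays a single field.
rows_default = list(csv.reader(io.StringIO(tab_text)))

print(rows)
print(rows_default)
```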

### Using the Pandas module to handle CSV files


Pandas is a popular data science library in Python for data manipulation and analysis. If we are working with large amounts of data, it is often more convenient and efficient to use Pandas to handle CSV files.

In [None]:
import pandas as pd

To read the CSV file using pandas, we can use the `read_csv()` function. This function returns a dataframe that we will explore in detail in a subsequent notebook. Notice the nice printing output.

In [None]:
pd.read_csv(people)

### Writing to a file
As with reading, there are two commonly used file methods for writing:

- `write()` : writes a string to the file
- `writelines()` : writes a list of strings to the file all at once, without adding newlines between them

As with reading, we will use a `with` construct, which closes the file at the end of the code block.

### Writing using `write()`
In this example, we use `write()` to write a list of lines, string by string:

In [None]:
lines2 = ['Hello World!', 'We are happily coding in Python.']

with open("data/sample2.txt", 'w') as f:
    for line in lines2:
        f.write(line)
        f.write('\n')

### Writing using `writelines()`
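Here is an example of writing to a file with `writelines()`, providing the whole list at once. Since `writelines()` does not add newline characters, each string must end with `'\n'` to appear on its own line:

In [None]:
lines3 = ['INMAS\n', 'Python Workshop.\n']

with open("data/sample3.txt", 'w') as f:
    f.writelines(lines3)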
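
Because `writelines()` writes the strings exactly as given, a list without newline characters ends up on a single line. The sketch below shows this and one way to add the separators; it writes to a temporary directory, and the file name `demo.txt` is hypothetical.

```python
import os
import tempfile

lines = ['INMAS', 'Python Workshop.']

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo.txt')   # hypothetical file name

    with open(path, 'w') as f:
        f.writelines(lines)                          # strings are written back to back
    with open(path, 'r') as f:
        joined = f.read()

    with open(path, 'w') as f:
        f.writelines(line + '\n' for line in lines)  # add the separators ourselves
    with open(path, 'r') as f:
        separated = f.read()

print(repr(joined))
print(repr(separated))
```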

### Removing Files
To remove the files we just created, we first need to import the operating system (`os`) module:

In [None]:
import os

The following code removes the files *data/sample2.txt* and *data/sample3.txt* if they exist:

In [None]:
myfiles = ['data/sample2.txt', 'data/sample3.txt']
for file in myfiles:
    if os.path.isfile(file):
        print('removing file "%s"...' % file)
        os.remove(file)
    else:
        print('File "%s" not found!' % file)

### Listing directories
We can then verify that the files were removed. `os.listdir()` returns a list of strings naming all the entries in the directory:

In [None]:
os.listdir('data')
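
The same check can be tried without touching the `data` folder. This is a minimal sketch that creates a file in a temporary directory, lists the directory, removes the file, and lists it again.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'sample2.txt')
    with open(path, 'w') as f:
        f.write('temporary\n')

    before = os.listdir(tmpdir)   # the new file is listed
    os.remove(path)
    after = os.listdir(tmpdir)    # the directory is empty again

print(before)
print(after)
```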

### Key Points
- Using `with` when opening files takes care of resources and closes the file automatically
- There are two ways to read a file: all at once, or line by line
- The same holds for writing files
- Be aware of how data is written to and read from a file, and how text converts back to numbers
- Files can be removed using the `os` module

### What's Next?
- Complete the exercises in the associated exercise notebook [X-08-InputOutput.ipynb](X-08-InputOutput.ipynb)
- Next notebook is [N-09-MiniProject.ipynb](N-09-MiniProject.ipynb)