## File Input and Output

### Introduction

Reading and writing files is a common task in programming, and Python makes both straightforward. You can work with text files as well as binary files.

![rw_file_mode](src/rw_file_mode.png "rw_file_mode")

For text files, you will typically open a file with mode `'r'` to read it and mode `'w'` to write it. Note that `'w'` truncates an existing file before writing.
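
As a rough illustration of the modes discussed above, the sketch below writes, appends to, and reads back a small file. The temporary directory and the name `demo.txt` are assumptions made for this example and are not part of the course data. It uses the `with` form that the next section introduces.

```python
import tempfile
from pathlib import Path

# Work in a throwaway directory so the example does not touch real data.
tmp_dir = Path(tempfile.mkdtemp())
demo_path = tmp_dir / "demo.txt"  # hypothetical file name

# 'w' creates the file, or truncates it if it already exists.
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("first line\n")

# 'a' appends to the end of the file instead of truncating it.
with open(demo_path, "a", encoding="utf-8") as f:
    f.write("second line\n")

# 'r' reads text and returns str objects.
with open(demo_path, "r", encoding="utf-8") as f:
    text = f.read()

# 'rb' reads the same file as raw bytes, without decoding.
with open(demo_path, "rb") as f:
    raw = f.read()

print(type(text).__name__, type(raw).__name__)
print(text.splitlines())
```

Text mode hands you `str`, binary mode hands you `bytes`; which one you want depends on whether the file holds human-readable text or arbitrary data such as images.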

### Reading files

In [4]:
file_path = "src/students.csv"
file = open(file_path, 'r')
for line in file:
    print(line.strip())
file.close()

id,first_name,last_name,date_of_birth,ethnicity,gender,status,entry_academic_period
1,John,Doe,01/2000,Hispanic,M,FT,Fall 2008
2,Jane,Smith,05/2001,Hispanic,F,TRANSFER,Fall 2006
3,Sarah,Thomas,21/2002,Hispanic,M,FTFT,Fall 2006
4,Frank,Brown,13/2002,Race/ethnicity unknown,M,FTFT,Fall 2006
5,Mike,Davis,31/2001,White,F,FTFT,Fall 2007
6,Jennifer,Wilson,01/2002,Asian,M,TRANSFER,Fall 2006
7,Jessica,Garcia,01/2000,White,F,FTFT,Fall 2007
8,Fred,Clark,17/1999,Hispanic,F,FTGRAD,Fall 2010
9,Bob,Lopez,04/1998,White,F,FTFT,Fall 2007


The `with` statement obtains a context manager and uses it as the execution context for the statements inside the `with` block. A context manager guarantees that certain operations are performed when the block exits.


In this case, the context manager guarantees that `file.close()` is called implicitly when the block exits. This makes the developer's life easier: you don't have to remember to close the file you opened explicitly, nor worry about an exception occurring while the file is open. An unclosed file may cause a resource leak, so prefer the `with open()` structure for file I/O.
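
To make the guarantee concrete, here is a minimal sketch that compares manual closing with the `with` form when an exception occurs. The temporary file and the simulated `ValueError` are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

demo_path = Path(tempfile.mkdtemp()) / "demo.txt"  # hypothetical file
demo_path.write_text("some data\n", encoding="utf-8")

# Manual version: try/finally is needed to guarantee that close() runs.
manual = open(demo_path, "r", encoding="utf-8")
try:
    manual.read()
finally:
    manual.close()

# with version: the file is closed even though an exception is raised inside.
try:
    with open(demo_path, "r", encoding="utf-8") as managed:
        raise ValueError("simulated failure while the file is open")
except ValueError:
    pass

print(manual.closed, managed.closed)
```

Both file objects report `closed` as `True`, but the `with` version needs no explicit cleanup code.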

In [6]:
file_path = "src/students.csv"
with open(file_path, "r") as file:
    for line in file:
        print(line.strip())

id,first_name,last_name,date_of_birth,ethnicity,gender,status,entry_academic_period
1,John,Doe,01/2000,Hispanic,M,FT,Fall 2008
2,Jane,Smith,05/2001,Hispanic,F,TRANSFER,Fall 2006
3,Sarah,Thomas,21/2002,Hispanic,M,FTFT,Fall 2006
4,Frank,Brown,13/2002,Race/ethnicity unknown,M,FTFT,Fall 2006
5,Mike,Davis,31/2001,White,F,FTFT,Fall 2007
6,Jennifer,Wilson,01/2002,Asian,M,TRANSFER,Fall 2006
7,Jessica,Garcia,01/2000,White,F,FTFT,Fall 2007
8,Fred,Clark,17/1999,Hispanic,F,FTGRAD,Fall 2010
9,Bob,Lopez,04/1998,White,F,FTFT,Fall 2007


### Writing files

In [12]:
file_path = "src/students_w.csv"
header = ["id","first_name","last_name","date_of_birth","ethnicity"]
content = [
    [1,'John','Doe','01/2000','Hispanic'],
    [2,'Jane','Smith','05/2001','Hispanic'],
    [3,'Sarah','Thomas','21/2002','Hispanic'],
    [4,'Frank','Brown','13/2002','Race/ethnicity unknown'],
    [5,'Mike','Davis','31/2001','White']
]

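with open(file_path, "w") as file:
    # Join the header fields with commas and end the line explicitly
    file.write(",".join(header) + "\n")
    # Each row has to be converted to strings and joined by hand
    for row in content:
        file.write(",".join(str(item) for item in row) + "\n")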

Writing every row by hand this way, iterating over each row and its fields, quickly becomes tedious. The `csv` module handles the delimiters and line endings for you.

In [15]:
import csv

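# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open(file_path, 'w', newline='') as f:
    write = csv.writer(f)
    write.writerow(header)
    # Equivalent to calling write.writerow(row) for each row in content
    write.writerows(content)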
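
The `csv` module can also read such a file back. The sketch below is one possible round trip using `csv.DictReader`; the shortened data and the temporary file `students_demo.csv` are assumptions for this example, not the course files.

```python
import csv
import tempfile
from pathlib import Path

csv_path = Path(tempfile.mkdtemp()) / "students_demo.csv"  # hypothetical file

header = ["id", "first_name", "last_name"]
content = [[1, "John", "Doe"], [2, "Jane", "Smith"]]

# newline="" lets the csv module control line endings itself.
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(content)

# DictReader uses the first row as keys; every value comes back as a string.
with open(csv_path, "r", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

print(rows[0]["first_name"], rows[1]["id"])
```

Note that the `id` values were written as integers but are read back as strings, so convert them with `int()` if you need numbers.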

### [FYI] Working with paths

In [1]:
from pathlib import Path

dir_path = Path('src').resolve()
file_path = Path('src/students_w.csv').resolve()

print(f"src directory: {dir_path}")
# Note: in .py files you can get the path of the current file with Path(__file__)

current_dir = dir_path.parent
print(f"current directory: {current_dir}")

src_dir = current_dir.parent / "src"
print(f"data directory: {src_dir}")

src directory: /Users/tobliao/Documents/workspace/GEMS_Python_Basic/notebooks/src
current directory: /Users/tobliao/Documents/workspace/GEMS_Python_Basic/notebooks
data directory: /Users/tobliao/Documents/workspace/GEMS_Python_Basic/src


#### Checking if path exists

In [2]:

print(f"exists: {dir_path.exists()}")
print(f"is directory: {dir_path.is_dir()}")
print(f"is file: {file_path.is_file()}")

exists: True
is directory: True
is file: True
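
These checks are often combined with file I/O. The following sketch, which assumes a temporary folder and the hypothetical names `output` and `report.csv`, creates a directory if needed and only reads the file once it is known to exist.

```python
import tempfile
from pathlib import Path

base_dir = Path(tempfile.mkdtemp())  # stand-in for a project folder
out_dir = base_dir / "output"        # hypothetical subdirectory
out_file = out_dir / "report.csv"    # hypothetical file name

# Create the directory if it is missing; exist_ok avoids an error on reruns.
out_dir.mkdir(parents=True, exist_ok=True)

existed_before = out_file.exists()
out_file.write_text("id,name\n1,John\n", encoding="utf-8")

# Only read the file when the path really points to a regular file.
if out_file.is_file():
    first_line = out_file.read_text(encoding="utf-8").splitlines()[0]
else:
    first_line = None

print(existed_before, out_file.name, out_file.suffix, first_line)
```

`Path.write_text` and `Path.read_text` open and close the file internally, which is convenient for small files.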
