
# File Reading

In this tutorial, we will cover the following concepts:
*   Opening files
*   Closing files
*   Reading from and writing to text files
*   Splitting lines
*   Reading from and writing to CSV files
*   Reading from a directory of files





## File Opening 

Opening and handling files in Python is more straightforward than in many other languages. To open a file, all you need to do is call Python's built-in ```open()``` function.

In [None]:
file_handler = open("demo_file.txt", "r")

The ```open()``` function accepts one or more inputs. Here, we will focus only on the first two.

Let's look at the line of code above. There are two inputs. The first one, ```demo_file.txt```, is the name of the file that you want to write to or read from. The second input, ```r```, is the opening mode. There are several modes, depending on the type of the file and what you want to do with it.

*   ``` r ```  Opens a file for reading (default).
*   ``` w ```  Opens a file for writing. Creates a new file if it does not exist or truncates the file if it exists.
*   ``` x ```  Opens a file for exclusive creation. If the file already exists, the operation fails.
*   ``` a ```  Opens a file for appending at the end of the file without truncating it. Creates a new file if it does not exist.
*   ``` t ```  Opens in text mode (default).
*   ``` b ```  Opens in binary mode.
*   ``` + ```  Opens a file for updating (reading and writing).
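
To make the difference between these modes concrete, here is a minimal sketch. The file name ```modes_demo.txt``` and the temporary directory are assumptions made for illustration, so that no real file is overwritten.

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary directory.
modes_path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# "w" creates the file (or truncates it if it already exists).
with open(modes_path, "w") as f:
  f.write("first\n")

# "a" keeps the existing content and adds to the end of the file.
with open(modes_path, "a") as f:
  f.write("second\n")

# "x" refuses to overwrite: the file exists, so open() raises FileExistsError.
try:
  with open(modes_path, "x") as f:
    f.write("never written\n")
  exclusive_failed = False
except FileExistsError:
  exclusive_failed = True

# "r" (the default mode) reads the file back.
with open(modes_path, "r") as f:
  modes_content = f.read()

print(repr(modes_content))  # 'first\nsecond\n'
print(exclusive_failed)     # True
```

Opening the same path with ```w``` a second time would have discarded ```first```, which is why ```a``` is the safer choice when adding to an existing file.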

## File Closing

Once we have opened a file, we need to close it when we are done so that the system resources tied to it are released and any buffered writes are saved. We use the ```close()``` method to do that.



In [None]:
file_handler.close()

The above method is not entirely safe. If an error occurs while we are working with the file, the program may stop before it reaches ```close()```, leaving the file open. It is safer to use a ```try``` ... ```finally``` block.

In [None]:
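# Open before the try block: if open() itself fails, there is nothing to close.
file_handler = open("demo_file.txt", "r")
try:
   pass  # perform file operations.
finally:
   file_handler.close()  # runs even if an operation above raises an error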

Even better is to use the ```with``` statement. This ensures that the file is closed when the block inside the ```with``` statement is exited.




In [None]:
with open("demo_file.txt", "r") as file_handler:
  # perform file operations.
  pass  # placeholder so that the block is valid Python

In the rest of this tutorial, we will stick to using the ```with``` statement.
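
As a quick check of that guarantee, the sketch below inspects the file object's ```closed``` attribute inside and outside the ```with``` block, including when an error is raised in the block. The file name ```closed_demo.txt``` and the simulated ```ValueError``` are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary directory.
closed_path = os.path.join(tempfile.mkdtemp(), "closed_demo.txt")

with open(closed_path, "w") as file_handler:
  file_handler.write("hello\n")
  closed_inside = file_handler.closed  # still open inside the block

closed_after = file_handler.closed     # closed as soon as the block ends

# The file is closed even when the block exits because of an error.
try:
  with open(closed_path, "r") as file_handler:
    raise ValueError("simulated failure")
except ValueError:
  pass

closed_after_error = file_handler.closed

print(closed_inside, closed_after, closed_after_error)  # False True True
```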

## File Reading and Writing

Since we do not have a file yet, we will start with writing to files. 

In [1]:
### COMPILE AND RUN THIS CODE ###
with open("demo_file.txt", "w") as file_handler:
  file_handler.write("my first line\n")
  file_handler.write("my second line\n")


Now we will go ahead and open the file for reading and read it.

In [2]:
### COMPILE AND RUN THIS CODE ###

with open("demo_file.txt", "r") as file_handler:
  print(file_handler.read())


my first line
my second line



The ```read()``` function reads the whole file at once. You can also pass it the number of characters you want to read.

In [4]:
### COMPILE AND RUN THIS CODE ###

with open("demo_file.txt", "r") as file_handler:
  print(file_handler.read(2)) # Read the first 2 characters
  print(file_handler.read(5)) # Read the next 5 characters


my
 firs


Note that a space counts as a character (check the output of the run above). There are also other methods to read from files. For example, if you want to read one line at a time, you can use the ```readline()``` function. This function reads the line up to and including the newline character. Because ```print()``` adds a newline of its own, each printed line is followed by a blank line.

In [5]:
### COMPILE AND RUN THIS CODE ###

with open("demo_file.txt", "r") as file_handler:
  print(file_handler.readline()) 
  print(file_handler.readline()) 

my first line

my second line



You can also read all lines into a list using the ```readlines()``` function.

In [6]:
### COMPILE AND RUN THIS CODE ###

with open("demo_file.txt", "r") as file_handler:
  print(file_handler.readlines()) 

['my first line\n', 'my second line\n']


You can also loop through all the lines using a for loop.

In [7]:
### COMPILE AND RUN THIS CODE ###

with open("demo_file.txt", "r") as file_handler:
  for line in file_handler:
    print(line) 

my first line

my second line
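
The blank lines in the output come from the newline that each line already carries. Here is a minimal sketch of two ways to avoid them; the file name ```newline_demo.txt``` and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical scratch file with the same two lines as the demo file.
newline_path = os.path.join(tempfile.mkdtemp(), "newline_demo.txt")
with open(newline_path, "w") as file_handler:
  file_handler.write("my first line\n")
  file_handler.write("my second line\n")

cleaned = []
with open(newline_path, "r") as file_handler:
  for line in file_handler:
    # Option 1: tell print() not to add a second newline.
    print(line, end="")
    # Option 2: remove the trailing newline before using the line.
    cleaned.append(line.rstrip("\n"))

print(cleaned)  # ['my first line', 'my second line']
```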



## Split Lines

Sometimes, after reading a line, we need to process it further to extract each word. We can do that using the ```split``` function. However, we need to know which delimiter separates the words in order to split the line correctly.

Let's go back to our ```demo_file.txt```.



In [4]:
### COMPILE AND RUN THIS CODE ###
with open("demo_file.txt", "w") as file_handler:
  file_handler.write("my first line\n")
  file_handler.write("my second line\n")

with open("demo_file.txt", "r") as file_handler: 
  for line in file_handler: # looping through all the lines in the file 
    
    line = line.strip("\n") # removing the end of line 

    words = line.split() # splitting the line on whitespace (the default delimiter)

    print(words)

['my', 'first', 'line']
['my', 'second', 'line']


The function ```split``` can also take a delimiter as input.

In [6]:
### COMPILE AND RUN THIS CODE ###
with open("demo_file.txt", "w") as file_handler:
  file_handler.write("my,first,line\n")
  file_handler.write("my,second,line\n")

with open("demo_file.txt", "r") as file_handler: 
  for line in file_handler: # looping through all the lines in the file 
    
    line = line.strip("\n") # removing the end of line 

    words = line.split(",") # splitting the line based on a comma delimiter

    print(words)

['my', 'first', 'line']
['my', 'second', 'line']


So, you have to know exactly how your words are organized in the file in order to correctly split the line into words.
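
To illustrate why the delimiter matters, here is a small sketch using made-up lines (the sample strings are assumptions, not taken from the demo file). Splitting on a comma keeps any spaces around the words, so they may need to be stripped separately.

```python
# A made-up line where the commas are surrounded by spaces.
line = "my, first , line\n"

# Splitting on "," alone leaves the surrounding spaces in place.
raw_words = line.strip("\n").split(",")
print(raw_words)         # ['my', ' first ', ' line']

# Stripping each piece removes the leftover spaces.
clean_words = [word.strip() for word in raw_words]
print(clean_words)       # ['my', 'first', 'line']

# With no argument, split() treats any run of whitespace as one delimiter.
whitespace_words = "my   first\tline\n".split()
print(whitespace_words)  # ['my', 'first', 'line']
```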

## Reading from and Writing to CSV Files

Python also allows you to open and read other types of files, such as CSV files and Excel sheets.

To read a CSV file, you have to import the ```csv``` module. You still open the file using the ```open()``` function, but you use the module's own functions for reading and writing.

Let's start with writing to a CSV file and then read from it.

First, you have to create a ```csvwriter``` object to be able to write to the csv file. 

In [None]:
import csv

with open("demo_file.csv", "w", newline="") as csvfile:  # newline="" lets the csv module control line endings
  csvwriter = csv.writer(csvfile) 

The function ```writerow``` is then used to write a complete row into the csv file. 

In [1]:
### COMPILE AND RUN THIS CODE ###

import csv

fields = ["first_name", "last_name", "ID"]

with open("demo_file.csv", "w", newline="") as csvfile:
  csvwriter = csv.writer(csvfile) 
  csvwriter.writerow(fields)

You can also write multiple rows at once to the CSV file using the ```writerows``` function.

In [2]:
### COMPILE AND RUN THIS CODE ###

import csv

fields = ["first_name", "last_name", "ID"]
rows = [["Jack", "John", 1234],
        ["Amy", "Brian", 2345],
        ["Annie", "Jackson", 3456]]

with open("demo_file.csv", "w", newline="") as csvfile:
  csvwriter = csv.writer(csvfile) 
  csvwriter.writerow(fields)
  csvwriter.writerows(rows)

Now, to read from the CSV file, we need to create a ```csvreader``` object, which iterates over the lines of the given CSV file. Each row comes back as a list of strings.

In [3]:
### COMPILE AND RUN THIS CODE ###

import csv

with open("demo_file.csv", "r", newline="") as csvfile:
  csvreader = csv.reader(csvfile)

  for row in csvreader:  # each row is a list of strings
    print(row)


['first_name', 'last_name', 'ID']
['Jack', 'John', '1234']
['Amy', 'Brian', '2345']
['Annie', 'Jackson', '3456']
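
Notice that the IDs come back as strings such as ```'1234'```, even though they were written as numbers. The sketch below shows one possible way to restore them, using ```csv.DictReader``` so that each row is keyed by the header names. The file name ```typed_demo.csv``` and the temporary directory are assumptions made for illustration.

```python
import csv
import os
import tempfile

# Hypothetical scratch CSV file with the same content as above.
csv_path = os.path.join(tempfile.mkdtemp(), "typed_demo.csv")

fields = ["first_name", "last_name", "ID"]
rows = [["Jack", "John", 1234],
        ["Amy", "Brian", 2345],
        ["Annie", "Jackson", 3456]]

with open(csv_path, "w", newline="") as csvfile:
  csvwriter = csv.writer(csvfile)
  csvwriter.writerow(fields)
  csvwriter.writerows(rows)

records = []
with open(csv_path, "r", newline="") as csvfile:
  # DictReader uses the first row as the keys for every following row.
  for row in csv.DictReader(csvfile):
    row["ID"] = int(row["ID"])  # convert the ID back to an integer
    records.append(row)

print(records[0]["first_name"], records[0]["ID"])  # Jack 1234
```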


## Reading from a Directory

If you have multiple files that you need to open and read from, you do not have to type in every file name. Instead, you can use Python functions that list all the file names in a directory, and then open each file in turn.

In [None]:
import os

path = "directory"
files = os.listdir(path)

The ```os.listdir``` function returns the names of all the files and directories in the given folder.

```os.listdir``` returns only the names, not the paths. To open a file, you have to join the directory path to its name, because ```open()``` needs the path to the file.

In [None]:
import os

path = "directory"

files = os.listdir(path)
files_path = [os.path.join(path, f) for f in files]

You can now loop through the ```files_path``` list in order to open all the files in the directory.
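
A minimal sketch of that loop follows. It builds a hypothetical temporary directory containing two text files and one subfolder (all names are assumptions made for illustration), and skips anything that is not a regular file, since ```os.listdir``` also returns directory names.

```python
import os
import tempfile

# Hypothetical directory with two text files and one subfolder.
path = tempfile.mkdtemp()
for name, text in [("a.txt", "alpha\n"), ("b.txt", "beta\n")]:
  with open(os.path.join(path, name), "w") as file_handler:
    file_handler.write(text)
os.mkdir(os.path.join(path, "subfolder"))

contents = {}
for name in sorted(os.listdir(path)):  # sorted for a predictable order
  full_path = os.path.join(path, name)
  if not os.path.isfile(full_path):
    continue                           # skip directories such as "subfolder"
  with open(full_path, "r") as file_handler:
    contents[name] = file_handler.read()

print(contents)  # {'a.txt': 'alpha\n', 'b.txt': 'beta\n'}
```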

## Resources:

https://www.geeksforgeeks.org/reading-csv-files-in-python/

https://www.geeksforgeeks.org/writing-csv-files-in-python/

https://realpython.com/python-csv/

https://www.tutorialspoint.com/python/os_listdir.htm