> *The creation of the lessons in this unit relied heavily on the existing lessons created by Mrs. FitzZaland as well as the [lecture series](https://github.com/milaan9/05_Python_Files) produced by Dr. Milaan Parmar. Additionally, these lessons are largely modelled on the book [Think Python](https://open.umn.edu/opentextbooks/textbooks/43) by Allen Downey.*

# Python File I/O

In this lesson, you'll learn about Python file operations: opening a file, reading from it, writing to it, closing it, and the file methods that you should be aware of.

**[This video](https://www.youtube.com/watch?v=gSbEXZvgyBw) provides a good introduction to reading and writing files with Python.**

<div class="alert alert-info"><h4>Tasks</h4><p>Alert boxes like this will provide you with tasks that you must do while going through this lesson.</p></div>


## Files

When we want to read from or write to a file, we need to open it first. When we are done, it needs to be closed so that the resources that are tied to the file are freed.

## 1. Opening Files in Python

Python has a built-in **`open()`** function to open a file. This function returns a file object (also called a handle) which is used to read or modify the file accordingly.

```python
>>> f = open("test.txt")  # open file in current directory
>>> f = open("C:/Python99/README.txt")   # specifying full path
```

We can specify the **mode** while opening a file. The mode states whether we want to read **`r`**, write **`w`** or append **`a`** to the file. It can also state whether we want to open the file in text mode or binary mode.

> The default is reading in text mode. 

In text mode, we get strings when reading from the file.

Binary mode returns bytes and this is the mode to be used when dealing with non-text files like images or executable files.
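
To make the difference between the two modes concrete, here is a minimal sketch that writes a short word to a file and reads it back once in text mode and once in binary mode. The file name `mode_demo.txt` is made up for this illustration.

```python
# "mode_demo.txt" is a hypothetical file name used only for this example
f = open("mode_demo.txt", "w", encoding="utf-8")
f.write("café")
f.close()

# Text mode: read() gives back a str
f = open("mode_demo.txt", "r", encoding="utf-8")
text = f.read()
f.close()

# Binary mode: read() gives back the raw bytes stored on disk
f = open("mode_demo.txt", "rb")
raw = f.read()
f.close()

print(type(text), len(text))   # <class 'str'> 4
print(type(raw), len(raw))     # <class 'bytes'> 5
```

The string has 4 characters but the file holds 5 bytes, because `é` takes two bytes in UTF-8.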

The keys for each mode are described below:

| Mode | Description |
|:----:| :--- |
| **`r`** | **Read** - Opens a file for reading only. The file pointer is placed at the beginning of the file. This is the default mode.   | 
| **`t`** | **Text** - Opens in text mode. This is the default.   | 
| **`b`** | **Binary** - Opens in binary mode (e.g. images).  | 
| **`x`** | **Create** - Opens a file for exclusive creation. If the file already exists, the operation fails.   | 
| **`rb`** | Opens a file for reading only in binary format. The file pointer is placed at the beginning of the file.   | 
| **`r+`** | Opens a file for both reading and writing. The file pointer is placed at the beginning of the file.   | 
| **`rb+`** | Opens a file for both reading and writing in binary format. The file pointer is placed at the beginning of the file.   |  
| **`w`** | **Write** - Opens a file for writing only. Overwrites the file if the file exists. If the file does not exist, creates a new file for writing.   | 
| **`wb`** | Opens a file for writing only in binary format. Overwrites the file if the file exists. If the file does not exist, creates a new file for writing.   | 
| **`w+`** | Opens a file for both writing and reading. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing.   | 
| **`wb+`** | Opens a file for both writing and reading in binary format. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing.   | 
| **`a`** | **Append** - Opens a file for appending. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.   | 
| **`ab`** | Opens a file for appending in binary format. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.   | 
| **`a+`** | Opens a file for both appending and reading. The file pointer is at the end of the file if the file exists. The file opens in the append mode. If the file does not exist, it creates a new file for reading and writing.   |
| **`ab+`** | Opens a file for both appending and reading in binary format. The file pointer is at the end of the file if the file exists. The file opens in the append mode. If the file does not exist, it creates a new file for reading and writing.   |  
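
The differences between **`x`**, **`a`** and **`w`** are easiest to see by trying them on the same file. The following is a small sketch, not part of the original lesson; the file name `modes_demo.txt` is hypothetical.

```python
import os

name = "modes_demo.txt"  # hypothetical file name for this example

# Start clean so the example behaves the same every time it is run
if os.path.exists(name):
    os.remove(name)

f = open(name, "x", encoding="utf-8")   # 'x' creates the file
f.write("first\n")
f.close()

try:
    f = open(name, "x", encoding="utf-8")  # fails: the file now exists
    f.close()
    created_again = True
except FileExistsError:
    created_again = False

f = open(name, "a", encoding="utf-8")   # 'a' keeps the old content
f.write("second\n")
f.close()

f = open(name, "r", encoding="utf-8")
after_append = f.read()
f.close()

f = open(name, "w", encoding="utf-8")   # 'w' empties the file first
f.write("third\n")
f.close()

f = open(name, "r", encoding="utf-8")
after_write = f.read()
f.close()

print(created_again)        # False
print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_write))    # 'third\n'
```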

In [1]:
f = open("test.txt")   # equivalent to 'r' or 'rt'
print(f)               # shows the file name, mode and encoding
f.close()

<_io.TextIOWrapper name='test.txt' mode='r' encoding='UTF-8'>


As you can see in the example above, printing the opened file shows some information about it: its name, mode and encoding.

An opened file has different reading methods - for example, **`read()`**, **`readline()`**, **`readlines()`**. 

> An opened file has to be closed with the **`close()`** method.

In [2]:
f = open("test.txt",'w')  # write in text mode
print(f)
f.close()

<_io.TextIOWrapper name='test.txt' mode='w' encoding='UTF-8'>


In [3]:
f = open("logo.png",'a+b')  # append (read and write) in binary mode
f.close()

The default encoding is platform dependent. On Windows it is often **`cp1252`**, while on Linux it is usually **`utf-8`**.

When working with files in text mode, it is highly recommended to specify the encoding so our code won't behave differently on different platforms.

In [4]:
f = open("test.txt", mode='r', encoding='utf-8')
f.close()
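
To see why the encoding matters, the sketch below writes a word with **`utf-8`** and then reads it back twice: once with the same encoding and once with **`cp1252`**, which imitates what could happen if a different platform default were used. The file name `encoding_demo.txt` is made up for this illustration.

```python
# "encoding_demo.txt" is a hypothetical file name used only for this example
f = open("encoding_demo.txt", mode="w", encoding="utf-8")
f.write("café")
f.close()

# Reading with the same encoding gives back the original text
f = open("encoding_demo.txt", mode="r", encoding="utf-8")
same = f.read()
f.close()

# Reading with a different encoding misinterprets the bytes of 'é'
f = open("encoding_demo.txt", mode="r", encoding="cp1252")
garbled = f.read()
f.close()

print(same)      # café
print(garbled)   # cafÃ©
```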

## 2. Closing Files in Python

As seen above, when we are done with performing operations on the file, we need to properly close the file.

Closing a file will free up the resources that were tied to the file. It is done using the file object's **`close()`** method.

In [5]:
# Open file
f = open("test.txt", encoding = 'utf-8')
# Perform file operations here...

# Then close
f.close()

This method is not entirely safe. If an exception occurs when we are performing some operation with the file, the code exits without closing the file.
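
One way to guard against this is a `try...finally` block, which runs the `finally` part whether or not an error occurred. This is only a sketch: the file name `safe_demo.txt` and the deliberate division by zero are invented for the example.

```python
# "safe_demo.txt" is a hypothetical file name used only for this example
f = open("safe_demo.txt", "w", encoding="utf-8")
try:
    f.write("some text\n")
    result = 1 / 0            # deliberately cause an error
except ZeroDivisionError:
    print("Something went wrong")
finally:
    f.close()                 # runs whether or not an exception occurred

print(f.closed)               # True
```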

> **The best way to open and close a file is by using the `with` statement.** This ensures that the file is closed when the block inside the **`with`** statement is exited.

We don't need to explicitly call the **`close()`** method. It is done internally.

```python
with open("test.txt", encoding='utf-8') as f:
    # Perform file operations inside the with statement
    pass  # placeholder so that the block is valid Python

# Continue code outside of the with statement; the file is closed here
```

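In the above example, the file is closed as soon as the `with` block ends. The variable `f` still exists afterwards, but it refers to a closed file, so any further read or write on it raises an error.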
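
A quick way to check the state of the file object after a `with` block is to look at its `closed` attribute. The sketch below uses a made-up file name, `with_demo.txt`.

```python
# "with_demo.txt" is a hypothetical file name used only for this example
with open("with_demo.txt", "w", encoding="utf-8") as f:
    f.write("hello\n")
    inside = f.closed      # still open inside the block

outside = f.closed         # closed automatically when the block ended

try:
    f.write("more")        # writing to a closed file is an error
except ValueError as err:
    message = str(err)

print(inside, outside)     # False True
print(message)
```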

### The file Object Attributes

* **file.closed** - Returns `True` if the file is closed, `False` otherwise.
* **file.mode** - Returns access mode with which file was opened.
* **file.name** - Returns name of the file.

For example:

In [8]:
# Open a file
with open("data.txt", "wb") as data:
    print("Name of the file: ", data.name)
    print("Closed or not : ", data.closed)
    print("Opening mode : ", data.mode)


Name of the file:  data.txt
Closed or not :  False
Opening mode :  wb


## 3. Writing to Files in Python

In order to write into a file in Python, we need to open it in write **`w`**, append **`a`** or exclusive creation **`x`** mode.

> We need to be careful with the **`w`** mode, as **it will overwrite the file if it already exists.** In this case, all of the previous information in the file will be erased.

Writing a string or sequence of bytes (for binary files) is done using the **`write()`** method.

For example:

In [9]:
with open("test_1.txt", 'w', encoding='utf-8') as f:
    f.write("my first file\n")
    f.write("This file\n\n")
    f.write("contains three lines\n")

This program will create a new file named **`test_1.txt`** in the current directory if it does not exist. **If it does exist, it is overwritten.**

We must include the newline characters ourselves to distinguish the different lines.
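
For instance, when writing the items of a list, each item needs its own `\n` added. The following is a minimal sketch with a made-up list and a hypothetical file name, `lines_demo.txt`.

```python
lines = ["first line", "second line", "third line"]   # example data

# "lines_demo.txt" is a hypothetical file name used only for this example
with open("lines_demo.txt", "w", encoding="utf-8") as f:
    for line in lines:
        f.write(line + "\n")   # add the newline ourselves

with open("lines_demo.txt", "r", encoding="utf-8") as f:
    content = f.read()

print(repr(content))   # 'first line\nsecond line\nthird line\n'
```

Without the `+ "\n"`, all three items would end up on a single line of the file.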

<div class="alert alert-info"><h4>1.</h4><p>Create a new notebook and name it Lesson11_Tasks.</p></div>

<div class="alert alert-info"><h4>2.</h4><p>In your notebook, add to the following code to write some text to a file.</p></div>

```python
with open("test123.txt", 'w', encoding='utf-8') as f:
    # Write something to this file
```

<div class="alert alert-info"><h4>3.</h4><p>Open the file to see if it worked as you expected.</p></div>

We can append text to a file we previously created by opening it in **`a`** mode:

In [6]:
with open("test_1.txt", 'a', encoding='utf-8') as f:
    f.write('This text has to be appended at the end')

<div class="alert alert-info"><h4>4.</h4><p>In a new code cell, append to the file you just created:</p></div>

```python
with open("test123.txt", 'a', encoding='utf-8') as f:
    # Append some more text to this file
```

<div class="alert alert-info"><h4>5.</h4><p>Open the file to see if it worked as you expected.</p></div>

## 4. Reading Files in Python

To read a file, we must open the file in read **`r`** mode.

We can read the **`test_1.txt`** file we wrote in the above section in the following way:

In [14]:
with open("test_1.txt",'r',encoding = 'utf-8') as f:
    # Read all the characters in the file
    txt = f.read()
    
    # Print the contents
    print(txt)

my first file
This file

contains three lines



We can also read part of the text:

In [21]:
f = open("test_1.txt",'r',encoding = 'utf-8')
# read the first 8 characters
f.read(8)

'my first'

In [22]:
# read the next 5 characters
f.read(5)  

' file'

In [23]:
# read in the rest till end of file
f.read()  

'\nThis file\n\ncontains three lines\n'

In [24]:
# further reading returns an empty string
f.read()  

''

We can see that the string returned by **`read()`** shows each newline as **`'\n'`**. Once we reach the end of the file, any further reading returns an empty string.

We can change our current file cursor (position) using the **`seek()`** method. Similarly, the **`tell()`** method returns our current position (in number of bytes).

In [25]:
f.tell()    # get the current file position

46

In [26]:
f.seek(0)   # bring file cursor to initial position

0

In [27]:
print(f.read())  # read the entire file

my first file
This file

contains three lines



We can read a file line-by-line by looping over the file object with a for loop. This is efficient because the file is read one line at a time instead of being loaded into memory all at once.

<div class="alert alert-info"><h4>6.</h4><p>In a new code cell, read the lines of your file using the following for loop:</p></div>

```python
with open("test123.txt", 'r') as f:
    for line in f:
        print(line)
```

In this program, each line read from the file already ends with a newline character **`\n`**, and **`print()`** adds another one, which results in two newlines when printing.
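
If the extra blank lines are not wanted, two common options are to tell `print()` not to add its own newline, or to strip the newline from each line. The sketch below shows both, using a hypothetical file name, `newline_demo.txt`.

```python
# "newline_demo.txt" is a hypothetical file name used only for this example
with open("newline_demo.txt", "w", encoding="utf-8") as f:
    f.write("one\ntwo\n")

# Option 1: stop print() from adding its own newline
with open("newline_demo.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line, end="")

# Option 2: remove the newline from each line before using it
cleaned = []
with open("newline_demo.txt", "r", encoding="utf-8") as f:
    for line in f:
        cleaned.append(line.rstrip("\n"))

print(cleaned)   # ['one', 'two']
```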

Alternatively, we can use the **`readline()`** method to read individual lines of a file. This method reads up to and including the next newline character.

In [28]:
f.seek(0)  # bring file cursor to initial position
f.readline()

'my first file\n'

In [29]:
f.readline()

'This file\n'

In [30]:
f.readline()

'\n'

In [31]:
f.readline()

'contains three lines\n'

In [32]:
with open("test_1.txt", 'r') as f:
    for line in f.readlines():
        print(line)

my first file

This file



contains three lines



Lastly, the **`readlines()`** method returns a list of all the remaining lines in the file.

<div class="alert alert-info"><h4>7.</h4><p>In a new code cell, open your text file and create a for loop similar to the previous task. This time, rather than iterating through your file "f", iterate through "f.readlines()".</p></div>

Another way to get all the lines as a list is to use **`splitlines()`**, which also removes the newline characters:

In [38]:
with open("test_1.txt", 'r') as f:
    lines =  f.read().splitlines()
    print(lines)

['my first file', 'This file', '', 'contains three lines']
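
The difference between the two approaches shows up when the lists are printed side by side. This is a small sketch with a hypothetical file name, `split_demo.txt`.

```python
# "split_demo.txt" is a hypothetical file name used only for this example
with open("split_demo.txt", "w", encoding="utf-8") as f:
    f.write("a\nb\n")

with open("split_demo.txt", "r", encoding="utf-8") as f:
    with_newlines = f.readlines()             # keeps the '\n' on each line

with open("split_demo.txt", "r", encoding="utf-8") as f:
    without_newlines = f.read().splitlines()  # drops the '\n'

print(with_newlines)      # ['a\n', 'b\n']
print(without_newlines)   # ['a', 'b']
```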


## 5. Python File Methods

There are various methods available with the file object. Some of them have been used in the above examples.

Here is a list of the file methods available in text mode, each with a brief description:

| Method | Description |
|:----| :--- |
| **`close()`** |   Closes an opened file. It has no effect if the file is already closed.   | 
| **`detach()`** |   Separates the underlying binary buffer from the **`TextIOBase`** and returns it.   | 
| **`fileno()`** |   Returns an integer number (file descriptor) of the file.   | 
| **`flush()`** |   Flushes the write buffer of the file stream.   | 
| **`isatty()`** |   Returns **`True`** if the file stream is interactive.   | 
| **`read(n)`** |   Reads at most `n` characters from the file. Reads till end of file if it is negative or `None`.   | 
| **`readable()`** |   Returns **`True`** if the file stream can be read from.   | 
| **`readline(n=-1)`** |   Reads and returns one line from the file. Reads in at most **`n`** characters if specified.   | 
| **`readlines(n=-1)`** |   Reads and returns a list of lines from the file. If **`n`** is specified, stops reading further lines once the total size read so far exceeds **`n`** bytes/characters.   | 
| **`seek(offset, whence=SEEK_SET)`** |   Changes the file position to **`offset`** bytes, in reference to `whence` (start, current, end).   | 
| **`seekable()`** |   Returns **`True`** if the file stream supports random access.   | 
| **`tell()`** |   Returns the current file location.   | 
| **`truncate(size=None)`** |   Resizes the file stream to **`size`** bytes. If **`size`** is not specified, resizes to current location.   | 
| **`writable()`** |   Returns **`True`** if the file stream can be written to.   | 
| **`write(s)`** |   Writes the string **`s`** to the file and returns the number of characters written.   | 
| **`writelines(lines)`** |   Writes a list of **`lines`** to the file.   | 
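
A few of these methods can be tried together on one file opened in **`w+`** mode, which allows both writing and reading. This is only an illustrative sketch; the file name `methods_demo.txt` is hypothetical.

```python
# "methods_demo.txt" is a hypothetical file name used only for this example
with open("methods_demo.txt", "w+", encoding="utf-8") as f:
    can_read = f.readable()            # 'w+' allows reading...
    can_write = f.writable()           # ...and writing
    written = f.write("hello\n")       # number of characters written
    f.writelines(["a\n", "b\n"])       # no newlines are added for us
    f.seek(0)                          # go back to the start of the file
    first = f.readline()               # one line
    rest = f.readlines()               # all the remaining lines

print(can_read, can_write)   # True True
print(written)               # 6
print(repr(first))           # 'hello\n'
print(rest)                  # ['a\n', 'b\n']
```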

## 6. Deleting Files

If we want to remove a file, we can use the **`os`** module from the standard library.

If the file does not exist, the `os.remove()` function will raise an error, so it is good practice to use a condition like this:

In [26]:
import os
if os.path.exists('./files/example.txt'):
    os.remove('./files/example.txt')
else:
    print('The file does not exist')

The file does not exist
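
An alternative to checking first is to attempt the removal and catch the `FileNotFoundError` that `os.remove()` raises when the file is missing. The sketch below creates a throwaway file, with the hypothetical name `delete_demo.txt`, so that it can be run safely.

```python
import os

path = "delete_demo.txt"   # hypothetical file name for this example

# Create a file so that there is something to delete
with open(path, "w", encoding="utf-8") as f:
    f.write("temporary\n")

os.remove(path)                          # first removal succeeds
still_there = os.path.exists(path)

try:
    os.remove(path)                      # second removal: the file is gone
    removed_twice = True
except FileNotFoundError:
    removed_twice = False
    print("The file does not exist")

print(still_there, removed_twice)        # False False
```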


## 7. File with csv Extension

**CSV** stands for **C**omma **S**eparated **V**alues. CSV is a simple file format used to store tabular data, like the data you see in an Excel spreadsheet.

We can easily read a csv file by using the `csv` module:

In [52]:
import csv

with open('fifa_data.csv') as f:
    # Create a reader for the csv file
    csv_reader = csv.reader(f, delimiter=',')
    
    # Iterate through the rows of the csv file
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # The first row holds the column names
            column_names = row
        else:
            # Check for Messi's info
            if 'Messi' in row:
                messi = row

        # Count the row only after checking whether it is the header
        line_count += 1
    print(f'Number of lines:  {line_count}\n\n')
    
# Display Messi's stats
print("Messi's info:")
for col, data in zip(column_names, messi):
    print(f'{col}: {data}')

Number of lines:  18208


Messi's info:
﻿: 0
ID: 158023
Name: L. Messi
Age: 31
Photo: https://cdn.sofifa.org/players/4/19/158023.png
Nationality: Argentina
Flag: https://cdn.sofifa.org/flags/52.png
Overall: 94
Potential: 94
Club: FC Barcelona
Club Logo: https://cdn.sofifa.org/teams/2/light/241.png
Value: €110.5M
Wage: €565K
Special: 2202
Preferred Foot: Left
International Reputation: 5
Weak Foot: 4
Skill Moves: 4
Work Rate: Medium/ Medium
Body Type: Messi
Real Face: Yes
Position: RF
Jersey Number: 10
Joined: Jul 1, 2004
Loaned From: 
Contract Valid Until: 2021
Height: 5'7
Weight: 159lbs
LS: 88+2
ST: 88+2
RS: 88+2
LW: 92+2
LF: 93+2
CF: 93+2
RF: 93+2
RW: 92+2
LAM: 93+2
CAM: 93+2
RAM: 93+2
LM: 91+2
LCM: 84+2
CM: 84+2
RCM: 84+2
RM: 91+2
LWB: 64+2
LDM: 61+2
CDM: 61+2
RDM: 61+2
RWB: 64+2
LB: 59+2
LCB: 47+2
CB: 47+2
RCB: 47+2
RB: 59+2
Crossing: 84
Finishing: 95
HeadingAccuracy: 70
ShortPassing: 90
Volleys: 86
Dribbling: 97
Curve: 93
FKAccuracy: 94
LongPassing: 87
BallControl: 96
Acceleration:
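
The `csv` module also provides `csv.DictReader`, which uses the first row as column names and returns each following row as a dictionary, so the header does not have to be tracked by hand. The sketch below does not use `fifa_data.csv`; it writes a tiny made-up file, `players_demo.csv`, with invented rows so that it can be run on its own.

```python
import csv

# Made-up example data: one header row and two data rows
rows = [
    ["Name", "Age", "Club"],
    ["Player A", "31", "Club X"],
    ["Player B", "27", "Club Y"],
]

# "players_demo.csv" is a hypothetical file name used only for this example.
# The csv module documentation recommends opening csv files with newline="".
with open("players_demo.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

with open("players_demo.csv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)      # the first row becomes the keys
    players = list(reader)          # every other row becomes a dictionary

print(len(players))                 # 2
print(players[1]["Club"])           # Club Y
print(players[0]["Age"])            # 31 (values are read as strings)
```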

## Challenge

**1. Download `Challenge_34.ipynb` from Teams.**

**2. Upload this file into your own *Project* on Deepnote by dragging the `Challenge_34.ipynb` file onto the Notebooks tab on the left-hand side.** 

**3. Also download the txt and csv files in the Challenge_34 folder and upload these to the Files tab in your project.**

**4. Use this notebook to complete Challenge 34 in Deepnote.**