# Files

Python uses file objects to interact with external files on your computer. These file objects can represent any file format, such as an audio file, a text file, an email, or an Excel document. Note that you will probably need to install certain libraries or modules to work with some of these file types, but they are easily available. (We will cover downloading modules later on in the course).

Python has a built-in `open()` function that allows us to open and work with basic file types. First we need a file, though, so we will use some IPython magic to create a text file.

## Writing a File with IPython

In [1]:
%%writefile text1.txt
hello everyone,welcome

Overwriting text1.txt


In [2]:
pwd()

'C:\\Users\\hp\\Documents\\machine learning\\python'

# Python Opening a file

Python has a built-in **`open()`** function to open a file. It takes arguments (also called parameters), such as the file name and the mode, and returns a file object, also called a handle, which is used to read or modify the file. Let's see how this is used:

```python
>>> f = open("test.txt")  # open file in current directory
>>> f = open("C:/Python99/README.txt")   # specifying full path
```

We can specify the mode while opening a file. In mode, we specify whether we want to read **`r`**, write **`w`** or append **`a`** to the file. We can also specify if we want to open the file in text mode or binary mode.

The default is reading in text mode. In this mode, we get strings when reading from the file.

On the other hand, binary mode returns bytes and this is the mode to be used when dealing with non-text files like images or executable files.
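The difference between the two modes can be seen by reading the same file twice. This is a minimal sketch: the file name `demo.txt` and the temporary directory are assumptions made only so the example runs without touching real files, and the `with` statement is used so each file is closed automatically.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        f.write("hello")

    with open(path, "r", encoding="utf-8") as f:  # text mode
        text_data = f.read()

    with open(path, "rb") as f:                   # binary mode
        binary_data = f.read()

print(type(text_data), text_data)      # <class 'str'> hello
print(type(binary_data), binary_data)  # <class 'bytes'> b'hello'
```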

| Mode | Description |
|:----:| :--- |
| **`r`** | **Read** - Opens a file for reading only. The file pointer is placed at the beginning of the file. This is the default mode.   |
| **`t`** | **Text** - Opens in text mode. (default).   |
| **`b`** | **Binary** - Opens in binary mode (e.g. images).  |
| **`x`** | **Create** - Opens a file for exclusive creation. If the file already exists, the operation fails.   |
| **`rb`** | Opens a file for reading only in binary format. The file pointer is placed at the beginning of the file.   |
| **`r+`** | Opens a file for both reading and writing. The file pointer is placed at the beginning of the file.   |
| **`rb+`** | Opens a file for both reading and writing in binary format. The file pointer is placed at the beginning of the file.   |
| **`w`** | **Write** - Opens a file for writing only. Overwrites the file if the file exists. If the file does not exist, creates a new file for writing.   | 
| **`wb`** | Opens a file for writing only in binary format. Overwrites the file if the file exists. If the file does not exist, creates a new file for writing.   | 
| **`w+`** | Opens a file for both writing and reading. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing.   | 
| **`wb+`** | Opens a file for both writing and reading in binary format. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing.   | 
| **`a`** | **Append** - Opens a file for appending. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.   | 
| **`ab`** | Opens a file for appending in binary format. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.   | 
| **`a+`** | Opens a file for both appending and reading. The file pointer is at the end of the file if the file exists. The file opens in the append mode. If the file does not exist, it creates a new file for reading and writing.   |
| **`ab+`** | Opens a file for both appending and reading in binary format. The file pointer is at the end of the file if the file exists. The file opens in the append mode. If the file does not exist, it creates a new file for reading and writing.   |  
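To make a few rows of the table concrete, the sketch below compares `w`, `a` and `x` on the same file. The file name `modes_demo.txt` is hypothetical, and a temporary directory is assumed so nothing is left behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

    # 'w' creates the file (or empties it if it already exists).
    with open(path, "w") as f:
        f.write("first\n")

    # 'a' keeps the existing content and writes at the end.
    with open(path, "a") as f:
        f.write("second\n")
    with open(path) as f:
        after_append = f.read()

    # Opening with 'w' again discards everything written so far.
    with open(path, "w") as f:
        f.write("replaced\n")
    with open(path) as f:
        after_write = f.read()

    # 'x' refuses to open a file that already exists.
    try:
        with open(path, "x"):
            pass
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

print(repr(after_append))  # 'first\nsecond\n'
print(repr(after_write))   # 'replaced\n'
print(exclusive_failed)    # True
```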

In [3]:
#open the text file we made earlier
my_file=open("text1.txt")

In [4]:
my_file

<_io.TextIOWrapper name='text1.txt' mode='r' encoding='cp1252'>

In [5]:
#we can read the file
my_file.read()

'hello everyone,welcome\n'

In [6]:
# But what happens if we try to read it again?
my_file.read()

''

This happens because the reading "cursor" is at the end of the file after having read it, so there is nothing left to read. We can move the "cursor" with `seek()`. For example, to jump to character offset 10:

In [7]:
# seek to offset 10 (seek(0) would return to the start of the file)
my_file.seek(10)

10

In [8]:
my_file.read()

'yone,welcome\n'

In [9]:
my_file.seek(0)

0

In [10]:
my_file.readline()

'hello everyone,welcome\n'
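The cursor behaviour shown in the cells above can be followed step by step with `tell()`, which reports the current position. This is a sketch under assumptions: it recreates the same one-line text in a temporary file (the name `cursor_demo.txt` is hypothetical), and the position value shown holds for this plain ASCII content.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "cursor_demo.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello everyone,welcome\n")

    with open(path, encoding="utf-8") as f:
        first_five = f.read(5)  # read only the first five characters
        position = f.tell()     # the cursor now sits just after them
        rest = f.read()         # read from the cursor to the end
        at_end = f.read()       # nothing left, so an empty string
        f.seek(0)               # back to the start of the file
        again = f.readline()    # the whole first line is available again

print(repr(first_five))  # 'hello'
print(position)          # 5
print(repr(rest))        # ' everyone,welcome\n'
print(repr(at_end))      # ''
print(repr(again))       # 'hello everyone,welcome\n'
```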

## Writing to a File

By default, the open() function only allows us to read the file. To write to it, we need to pass a mode such as 'w' (write only) or 'w+' (write and read). Both modes erase any existing content as soon as the file is opened. For example:

In [11]:
# Pass a mode as the second argument: 'w+' opens the file for writing and reading
my_file=open("text1.txt",mode="w+")

In [12]:
# Write to the file
my_file.write("this is new file")

16

In [13]:
my_file.seek(0)

0

In [14]:
my_file.read()

'this is new file'
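As a rough illustration of why the mode matters, the sketch below first tries to write through a file opened in the default read-only mode, then repeats the write with `w+`. The file name `write_demo.txt` and the temporary directory are assumptions for the sake of a self-contained example.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "write_demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("original text")

    # Default mode is 'r', so writing is rejected.
    with open(path) as f:
        try:
            f.write("new text")
            write_failed = False
        except io.UnsupportedOperation:
            write_failed = True

    # 'w+' empties the file, then allows both writing and reading.
    with open(path, "w+") as f:
        f.write("this is new file")
        f.seek(0)  # move back to the start before reading
        contents = f.read()

print(write_failed)  # True
print(contents)      # this is new file
```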

## Iterating through a File

Let's get a quick preview of a for loop by iterating over a text file. First, let's make a new text file with some IPython magic:


In [15]:
%%writefile text2.txt
first line mohamad
second line naseer

Overwriting text2.txt


In [16]:
my_file=open("text2.txt")

In [17]:
# read the data line by line (each line keeps its trailing newline, so print() leaves a blank line after it)
for line in my_file:
    print(line)

first line mohamad

second line naseer
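The blank lines in the output appear because each line read from the file still ends with `\n`, and `print()` adds a newline of its own. One possible way to avoid this, sketched here with a temporary copy of the same two lines (the path is an assumption), is to strip the trailing newline before printing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "text2.txt")
    with open(path, "w") as f:
        f.write("first line mohamad\nsecond line naseer\n")

    with open(path) as f:
        # Iterating over the file yields one line at a time.
        lines = [line.rstrip("\n") for line in f]

for line in lines:
    print(line)  # no extra blank line between the rows
```

Passing `end=""` to `print()` is an alternative that leaves the lines themselves unchanged.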



# StringIO

`StringIO` is an in-memory, file-like object. It supports most of the operations you would expect from an ordinary file object. A `StringIO` object can be created by passing a string to the constructor; if no string is given, the object starts out empty. In both cases, the cursor initially sits at position zero.

In Python 2, this class lived in a standalone `StringIO` module. That module no longer exists in Python 3; instead, import the class from the `io` module as `io.StringIO`.

In [18]:
s="mohamadnaseer"
s

'mohamadnaseer'

In [19]:
from io import StringIO

In [20]:
#arbitrary string
message = "this is a normal sting"

In [21]:
type(message)

str

In [22]:
# pass the string to the StringIO constructor to get a file-like object
f=StringIO(message)

In [23]:
type(f) 

_io.StringIO

In [24]:
f.read()

'this is a normal sting'

In [25]:
f.write("second line written to file like object")

39

In [26]:
# move the cursor to offset 4, just like with a regular file
f.seek(4)

4

In [27]:
f.read()

' is a normal stingsecond line written to file like object'
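The write in the cells above landed at the end of the text because the preceding `read()` had already moved the cursor there. The sketch below contrasts that with writing to a fresh `StringIO`, whose cursor starts at zero, and uses `getvalue()` to see the whole buffer regardless of cursor position. The sample strings are made up for illustration.

```python
from io import StringIO

# Case 1: read first, so the cursor is at the end and the write appends.
buffer = StringIO("this is a normal string")
buffer.read()
buffer.write(" plus more")
appended = buffer.getvalue()

# Case 2: write immediately, so the cursor is at 0 and the write overwrites.
fresh = StringIO("abcdef")
fresh.write("XY")
overwritten = fresh.getvalue()

print(appended)     # this is a normal string plus more
print(overwritten)  # XYcdef
```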

# Python File Methods

There are various methods available with the file object. Some of them have been used in the above examples.

Here is the complete list of methods in text mode with a brief description:

| Method | Description |
|:----| :--- |
| **`close()`** |   Closes an opened file. It has no effect if the file is already closed.   | 
| **`detach()`** |   Separates the underlying binary buffer from the **`TextIOBase`** and returns it.   | 
| **`fileno()`** |   Returns an integer number (file descriptor) of the file.   | 
| **`flush()`** |   Flushes the write buffer of the file stream.   | 
| **`isatty()`** |   Returns **`True`** if the file stream is interactive.   | 
| **`read(n)`** |   Reads at most `n` characters from the file. Reads till end of file if it is negative or `None`.   | 
| **`readable()`** |   Returns **`True`** if the file stream can be read from.   | 
| **`readline(n=-1)`** |   Reads and returns one line from the file. Reads in at most **`n`** characters if specified.   |
| **`readlines(n=-1)`** |   Reads and returns a list of lines from the file. Reads in at most **`n`** bytes/characters if specified.   |
| **`seek(offset, whence=SEEK_SET)`** |   Changes the file position to **`offset`**, measured relative to `whence` (start, current position, or end).   |
| **`seekable()`** |   Returns **`True`** if the file stream supports random access.   | 
| **`tell()`** |   Returns the current file location.   | 
| **`truncate(size=None)`** |   Resizes the file stream to **`size`** bytes. If **`size`** is not specified, resizes to current location.   |
| **`writable()`** |   Returns **`True`** if the file stream can be written to.   |
| **`write(s)`** |   Writes the string **`s`** to the file and returns the number of characters written.   |
| **`writelines(lines)`** |   Writes a list of **`lines`** to the file.   |
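Several of these methods can be tried out quickly on an in-memory `StringIO` object, which behaves like a text file. This is only a sketch with made-up sample lines; note that `writelines()` does not add line endings, so each string carries its own `\n`.

```python
from io import StringIO

f = StringIO()
f.writelines(["alpha\n", "beta\n", "gamma\n"])  # newlines must be supplied

f.seek(0)
all_lines = f.readlines()  # list with one string per line

f.seek(0)
f.readline()               # consume the first line
position = f.tell()        # cursor sits right after 'alpha\n'
f.truncate()               # drop everything after the cursor
remaining = f.getvalue()

capabilities = (f.readable(), f.writable(), f.seekable())

f.close()
is_closed = f.closed

print(all_lines)        # ['alpha\n', 'beta\n', 'gamma\n']
print(position)         # 6
print(repr(remaining))  # 'alpha\n'
print(capabilities)     # (True, True, True)
print(is_closed)        # True
```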

# Deleting Files
In a previous section we saw how to make and remove a directory using the os module (04_Python_Functions ➞ 007_Python_Function_Module ➞ Python Built-In Modules). To remove a file, we use the os module again.

In [29]:
import os
os.remove("text3.txt")

In [31]:
import os
if os.path.exists('./files/text3.txt'):
    os.remove('./files/text3.txt')
else:
    print('The file does not exist')

The file does not exist
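An alternative to checking `os.path.exists()` first is to attempt the removal and handle the `FileNotFoundError` that `os.remove()` raises for a missing file. The helper name `remove_if_present` is hypothetical, and the temporary directory is an assumption so the sketch can run anywhere.

```python
import os
import tempfile


def remove_if_present(path):
    """Delete path and return True, or return False if it does not exist."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False


with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "text3.txt")
    open(target, "w").close()  # create an empty file to delete

    first_attempt = remove_if_present(target)   # file existed
    second_attempt = remove_if_present(target)  # already gone

print(first_attempt)   # True
print(second_attempt)  # False
```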


# File type

## File with txt Extension

Files with the **txt** extension are a very common form of data, and we covered them in the previous section. Let us move on to JSON files.

## File with json Extension

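JSON stands for **J**ava**S**cript **O**bject **N**otation. It is a text format whose syntax closely resembles a JavaScript object literal or a Python dictionary, so a JSON document is essentially a stringified version of such an object.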

In [32]:
import json

In [33]:
person = {
    "name":"mohamadnaseer",
    "country":"india",
    "city":"Chennai",
    "skills":["Python", "MATLAB","c#"]
}

In [36]:
# let's convert it to JSON
person_json = json.dumps(person, indent=4) # indent sets the spaces per nesting level (commonly 2 or 4) and makes the output easier to read
print(type(person_json))
print(person_json)

# print() shows the text without surrounding quotes, but person_json is a Python string
# JSON is a text format, so the serialized value is always of type str

<class 'str'>
{
    "name": "mohamadnaseer",
    "country": "india",
    "city": "Chennai",
    "skills": [
        "Python",
        "MATLAB",
        "c#"
    ]
}
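The conversion also works in the other direction: `json.loads()` parses a JSON string back into Python objects. A short sketch using the same `person` dictionary:

```python
import json

person = {
    "name": "mohamadnaseer",
    "country": "india",
    "city": "Chennai",
    "skills": ["Python", "MATLAB", "c#"],
}

person_json = json.dumps(person, indent=4)  # dict -> JSON string
restored = json.loads(person_json)          # JSON string -> dict

print(type(restored))      # <class 'dict'>
print(restored == person)  # True
print(restored["skills"])  # ['Python', 'MATLAB', 'c#']
```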


### Saving as JSON File

We can also save our data as a JSON file. To write one, we use the **`json.dump()`** function, which takes the dictionary and the output file object, along with optional arguments such as **`ensure_ascii`** and **`indent`**.

In [37]:
with open('json_example.json', 'w', encoding='utf-8') as f:
    json.dump(person, f, ensure_ascii=False, indent=4)
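To check what was saved, the file can be read back with `json.load()`, the file-based counterpart of `json.loads()`. In this sketch the file is written to a temporary directory, which is an assumption made to keep the example self-contained.

```python
import json
import os
import tempfile

person = {
    "name": "mohamadnaseer",
    "country": "india",
    "city": "Chennai",
    "skills": ["Python", "MATLAB", "c#"],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "json_example.json")

    # Write the dictionary to disk as JSON.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(person, f, ensure_ascii=False, indent=4)

    # Read it back into a Python dictionary.
    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)

print(loaded == person)  # True
print(loaded["city"])    # Chennai
```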

## File with csv Extension

**CSV** stands for **C**omma **S**eparated **V**alues. CSV is a simple file format used to store tabular data, such as a spreadsheet or database. CSV is a very common data format in data science.

For example, create **csv_example.csv** in your working directory with the following contents:

```csv
"name","country","city","skills"
"mohamadnaseer","India","chennai","Python"
"munirahamad","India","chennai","java"
```

In [38]:
import csv

In [46]:
with open('csv_example.csv', newline='') as f:  # newline='' is recommended by the csv module docs
    csv_reader = csv.reader(f, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # the first row holds the column names
            print(f'Column names are: {", ".join(row)}')
            line_count += 1
        else:
            print(f'\t{row[0]} is a teacher. He lives in {row[1]}, {row[2]}. And he knows {row[3]} very well.')
            line_count += 1
    print(f'Number of lines: {line_count}')

Column names are: name, country, city, skills
	mohamadnaseer is a teacher. He lives in India, chennai. And he knows Python very well.
	munirahamad is a teacher. He lives in India, chennai. And he knows java very well.
Number of lines: 3
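Instead of tracking the header row by hand, `csv.DictReader` can use the first row as keys and return each remaining row as a dictionary. This sketch writes the sample rows to a temporary file first (an assumption so it runs on its own) and then reads them back.

```python
import csv
import os
import tempfile

rows = [
    ["name", "country", "city", "skills"],
    ["mohamadnaseer", "India", "chennai", "Python"],
    ["munirahamad", "India", "chennai", "java"],
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "csv_example.csv")

    # Create the sample file.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)

    # The header row becomes the dictionary keys.
    with open(path, newline="", encoding="utf-8") as f:
        records = list(csv.DictReader(f))

for record in records:
    print(f'{record["name"]} lives in {record["city"]} and knows {record["skills"]}.')

print(len(records))  # 2
```

Accessing columns by name keeps the loop readable even if the column order in the file changes.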
