# Introduction

A file is a collection of data stored on a secondary storage device such as a hard disk.

When there is a huge amount of data to be processed, it is better to store the data in a file and read it from a Python program.

When a program is being executed, the data is stored in *Random Access Memory* (RAM).

The CPU can access data stored in RAM much faster than data on a secondary storage device.

RAM is volatile, i.e., the data stored in RAM is lost when the program ends or the computer shuts down.

To store the data permanently, non-volatile storage media such as a hard disk, USB drive, or DVD must be used.

Data on non-volatile storage media is stored in named locations on the media called files.

Think of working with files as working with a notebook:

* To use a notebook, first we open the notebook.

* Then we can read the existing content or write new content.

* After using the notebook, we close it.

The above steps can be applied to files as well.
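
The notebook analogy maps onto three steps in Python: open, read or write, close. The following is a minimal sketch of those steps; the file name `notebook_demo.txt` and the temporary directory are assumptions made only for illustration.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "notebook_demo.txt")

# Open the "notebook" for writing, write to it, then close it.
file = open(path, "w")
file.write("First note")
file.close()

# Open it again, read the existing content, and close it.
file = open(path, "r")
content = file.read()
file.close()

print(content)  # First note
```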

# File path

Files are stored on a storage medium like the hard disk.

The file system stores the files in a hierarchical (tree) structure.

At the top of the tree is one root node.

Under the root node, there can be other files and folders (directories).

Each folder can contain other files and folders. This can go on to a limitless depth.

The type of file is indicated by its extension.

Every file is identified by its path that begins from the root node or the root folder.

In Windows, the root folder is C:\ by default. However, the root folder can also be D:\, E:\, etc.

## Relative Path and Absolute Path

A file path can be either *relative* or *absolute*.

An absolute path always contains the root node and the complete directory list to specify the exact location of the file.

A relative path is specified relative to the program's current working directory.
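
The difference between the two kinds of path can be checked with the standard `os` module. This is a small sketch; the folder name `data` is a hypothetical example, and the file does not need to exist for the path functions to work.

```python
import os

# The current working directory is the starting point for relative paths.
cwd = os.getcwd()

# Hypothetical relative path; nothing is read from or written to disk here.
relative_path = os.path.join("data", "AboutPython.txt")

# abspath() joins the relative path onto the current working directory.
absolute_path = os.path.abspath(relative_path)

print(os.path.isabs(relative_path))  # False
print(os.path.isabs(absolute_path))  # True
print(absolute_path)
```

The last line prints a different result on every machine, because it depends on the directory from which the program is run.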

# Types of Files

Python works with two types of files:

* text files
* binary files

## ASCII Text Files

A text file is a stream of characters that can be sequentially processed by a computer in forward direction.

A text file is usually opened for only one kind of operation, i.e., reading, writing, or appending, at any given time.

In a text file, each line contains zero or more characters.

In a text file, each line ends with one or more characters that specify the end of line.

Python does not impose a fixed maximum on the number of characters in a line of a text file.

When data is written to a text file, each newline character is converted to the platform's line ending: a line feed on Linux and macOS, or a carriage return followed by a line feed on Windows.

When data is read from a text file, each line ending (carriage return, line feed, or both together) is converted back to a single newline character.

In a text file, each line of data normally ends with a newline character. The end of the file is reported to the program as an end-of-file (EOF) condition when there is no more data to read.
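
The newline conversion can be observed by comparing what a program writes with the bytes that end up on disk. The sketch below passes `newline="\r\n"` to `open()` so that the Windows-style conversion happens on any platform; the file name is a hypothetical example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "newline_demo.txt")  # hypothetical file

# newline="\r\n" stores every "\n" that is written as "\r\n" on disk.
# The "with" statement closes the file automatically at the end of the block.
with open(path, "w", newline="\r\n") as file:
    file.write("first\nsecond\n")

# Binary mode shows the bytes exactly as they are stored.
with open(path, "rb") as file:
    raw = file.read()

# Text mode with default settings converts "\r\n" back to "\n".
with open(path, "r") as file:
    text = file.read()

print(raw)                        # b'first\r\nsecond\r\n'
print(text == "first\nsecond\n")  # True
```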

## Binary Files

A binary file contains any type of data, encoded in binary form.

Binary files include word processing documents, PDFs, images, spreadsheets, videos, zip files, and executable programs.

A binary file is a collection of bytes.

While text files can be processed sequentially, binary files can be either processed sequentially or randomly depending on the needs of the application.

Binary files usually take less space than text files to store the same piece of data.

Text files contain only basic characters and do not store any information about the color, font, and size of the text.

Text files can be read in a text editor, while binary files generally need a program that understands their format.
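
As an illustration of the difference in size, the sketch below stores the same number once as text and once as a packed 4-byte integer. The file names and the use of the standard `struct` module are assumptions chosen for this example, not part of the original notes.

```python
import os
import struct
import tempfile

folder = tempfile.mkdtemp()
text_path = os.path.join(folder, "number.txt")    # hypothetical names
binary_path = os.path.join(folder, "number.bin")

number = 1234567890

# Text file: the number is stored as ten digit characters.
# The "with" statement closes each file automatically.
with open(text_path, "w") as file:
    file.write(str(number))

# Binary file: the same number is packed into a 4-byte integer.
with open(binary_path, "wb") as file:
    file.write(struct.pack("<i", number))

print(os.path.getsize(text_path))    # 10
print(os.path.getsize(binary_path))  # 4

# Reading the bytes back and unpacking them recovers the original value.
with open(binary_path, "rb") as file:
    (restored,) = struct.unpack("<i", file.read())

print(restored == number)  # True
```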

# Opening and Closing Files

## Python functions and methods

Python has many built-in functions and methods to manipulate files.

The built-in functions and methods basically work on a file object.

### The *open()* function

Before reading from or writing to a file, the file must be opened using the *open* function.

The two most commonly used parameters of the function are:

* file_name - a string value that specifies the name of the file to be accessed.

* access_mode - the mode in which the file has to be opened i.e., read, write, append etc. The default mode is read.

The function returns a file object, which is used to read from or write to the file.

#### Example

In [1]:
file = open("AboutPython.txt", 'r')

In [2]:
print(file)

<_io.TextIOWrapper name='AboutPython.txt' mode='r' encoding='cp1252'>


In [3]:
file.close()

In [4]:
print(file)

<_io.TextIOWrapper name='AboutPython.txt' mode='r' encoding='cp1252'>


#### Access Modes

| Mode | Purpose |
| ---- | ------- |
| r    | Default mode of opening the file. Opens the file for reading only. The file pointer is placed at the beginning of the file. |
| rb   | Opens the file for reading only in binary format. The file pointer is placed at the beginning of the file. |
| r+   | Opens the file for reading and writing. The file pointer is placed at the beginning of the file. |
| rb+  | Opens the file for reading and writing in binary format. The file pointer is placed at the beginning of the file. |
| w    | Opens the file for writing only. If the file doesn't exist, a new file is created for writing. If the file already exists and has some data, it is overwritten. |
| wb   | Opens the file for writing only in binary format. If the file doesn't exist, a new file is created for writing. If the file already exists and has some data, it is overwritten. |
| w+   | Opens the file for reading and writing. If the file doesn't exist, a new file is created for reading and writing. If the file already exists and has some data, it is overwritten. |
| wb+  | Opens the file for reading and writing in binary format. If the file doesn't exist, a new file is created for reading and writing. If the file already exists and has some data, it is overwritten. |
| a    | Opens the file for appending. The file pointer is placed at the end of the file, if the file exists. If the file does not exist, a new file is created for writing. |
| ab   | Opens the file for appending in binary format. The file pointer is placed at the end of the file, if the file exists. If the file does not exist, a new file is created for writing. |
| a+   | Opens the file for reading and appending. The file pointer is placed at the end of the file, if the file exists. If the file does not exist, a new file is created for reading and writing. |
| ab+  | Opens the file for reading and appending in binary format. The file pointer is placed at the end of the file, if the file exists. If the file does not exist, a new file is created for reading and writing. |
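
The effect of the most common modes can be compared by opening the same file several times. This is a sketch on a hypothetical file, `modes_demo.txt`, created in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")  # hypothetical file

# The "with" statement closes the file automatically after each block.
with open(path, "w") as file:    # "w" creates the file
    file.write("one")

with open(path, "w") as file:    # "w" again discards the old content
    file.write("two")

with open(path, "a") as file:    # "a" keeps the content and adds to the end
    file.write("three")

with open(path, "r+") as file:   # "r+" starts at the beginning, nothing is erased
    file.write("TWO")

with open(path, "r") as file:
    result = file.read()

print(result)  # TWOthree
```

Note that `r+` replaces characters in place from the start of the file, whereas `w` empties the file before anything is written.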

### The file object attributes

Once a file is successfully opened, a *file* object is returned.

Using the *file* object, different types of information related to the file can be accessed.

#### Example

In [5]:
file = open("AboutPython.txt", 'r')

In [6]:
print("The name of the file is: ", file.name)

The name of the file is:  AboutPython.txt


In [7]:
print("The file is closed: ", file.closed)

The file is closed:  False


In [8]:
print("The file has been opened in ", file.mode, "mode.")

The file has been opened in  r mode.


### The *close()* method

The *close()* method is used to close the file object.

Once a file object is closed, any attempt to read from or write to the file through that object results in an error.

Data written to a file may be held in a buffer until the file is closed, so a file that is left open stands a chance of corruption and data loss.

As a good programming habit, always explicitly use the *close()* method to close a file.
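
The error raised by using a closed file object is a `ValueError`. A short sketch, using a hypothetical file in a temporary directory:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "closed_demo.txt")  # hypothetical file

file = open(path, "w")
file.write("some text")
file.close()

try:
    file.write("more text")   # the file object is already closed
except ValueError as error:
    message = str(error)
    print("Error:", message)
```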

#### Example

In [9]:
file.closed

False

In [10]:
file.close()

In [11]:
file.closed

True

### *write()* method

The *write()* method is used to write a string to an already opened file.

The string can include numbers, special characters, or symbols.

While writing data to a file, the *write()* method does not add a newline character to the end of the string.

#### Example

In [12]:
file = open("AboutPython.txt", "w")

In [13]:
file.write("Python is a programming language.")

33

In [14]:
file.close()

#### Example

In [15]:
file = open("AboutPython.txt", "w")

In [16]:
file.write("Python is powerful and fast.")

28

In [17]:
file.close()
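
Because *write()* adds nothing on its own, line endings have to be written explicitly, and values that are not strings have to be converted first. A minimal sketch, with a hypothetical file name:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")  # hypothetical file

# The "with" statement closes the file automatically at the end of the block.
with open(path, "w") as file:
    file.write("Python is powerful.\n")      # "\n" ends the line explicitly
    file.write("Version: " + str(3) + "\n")  # numbers must be converted to str
    written = file.write("Done")             # write() returns the character count

with open(path, "r") as file:
    lines = file.readlines()

print(written)  # 4
print(lines)    # ['Python is powerful.\n', 'Version: 3\n', 'Done']
```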

### *writelines()* method

The *writelines()* method is used to write a list of strings to a file. It does not add line separators between the strings.

#### Example

In [18]:
lines = ['Python runs everywhere', 'Python is friendly', 'Python is easy to learn', 'Python is open']

In [19]:
file = open("AboutPython.txt", "w")

In [20]:
file.writelines(lines)

In [21]:
file.close()

#### Example

In [22]:
lines = ['Python runs everywhere. ', 'Python is friendly. ', 'Python is easy to learn. ',
         'Python is open.']

file = open("AboutPython.txt", "w")

file.writelines(lines)

file.close()
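
To place each string on its own line, a newline can be appended to every item as it is passed to *writelines()*. The following sketch assumes a hypothetical file name and a shortened list of lines:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")  # hypothetical file

lines = ["Python runs everywhere.", "Python is friendly.", "Python is open."]

# The "with" statement closes the file automatically at the end of the block.
with open(path, "w") as file:
    # writelines() writes the strings as they are, so add "\n" to each one.
    file.writelines(line + "\n" for line in lines)

with open(path, "r") as file:
    stored = file.readlines()

print(len(stored))  # 3
print(stored[0])    # Python runs everywhere.
```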

### Appending to a file

File objects have no separate method for appending. To write more data to the end of an existing file, open it in append mode and use the *write()* method.

To append to a file, use 'a' or 'ab' mode depending on whether the file is a text file or binary file.

Opening a file in 'w' or 'wb' mode and writing data into it overwrites the existing data.

#### Example

In [23]:
file = open("AboutPython.txt", 'a')

In [24]:
file.write("Python is simple yet poweful language.")

38

In [25]:
file.close()

### *read()* method

The *read()* method is used to read a string from an already opened file.

The *read()* method starts reading from the current position of the file pointer and continues till the end of the file.

The *read()* method accepts an optional parameter, *count*, which specifies the maximum number of characters (in text mode) or bytes (in binary mode) to read from the opened file.

#### Example

In [26]:
file = open("AboutPython.txt", 'r')

In [27]:
text = file.read()

In [28]:
print(text)

Python runs everywhere. Python is friendly. Python is easy to learn. Python is open.Python is simple yet poweful language.


In [29]:
file.close()

**Note:** Opening a file for reading that doesn't exist results in an error.

#### Example

In [30]:
file = open("AboutSQL.txt", 'r')

FileNotFoundError: [Errno 2] No such file or directory: 'AboutSQL.txt'
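
A program can handle this error instead of stopping. The sketch below uses `try` and `except` around the call to *open()*; the path is a hypothetical one inside an empty temporary directory, so the file is known to be missing.

```python
import os
import tempfile

# Hypothetical path that is known not to exist.
missing = os.path.join(tempfile.mkdtemp(), "AboutSQL.txt")

try:
    file = open(missing, "r")
except FileNotFoundError:
    found = False
    print("The file does not exist:", os.path.basename(missing))
else:
    # This branch runs only when the file was opened successfully.
    found = True
    file.close()
```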

#### Example

In [46]:
file = open("AboutPython.txt", 'r')

In [47]:
file.read(6) # Passing the value for the count parameter

'Python'

In [48]:
file.close()
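
Passing a count makes it possible to read a large file in fixed-size pieces. This is a sketch with a hypothetical file and a chunk size of 10 characters chosen only for illustration:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "chunks_demo.txt")  # hypothetical file

# The "with" statement closes the file automatically at the end of the block.
with open(path, "w") as file:
    file.write("Python is easy to learn.")

chunks = []
with open(path, "r") as file:
    while True:
        chunk = file.read(10)   # read at most 10 characters
        if chunk == "":         # an empty string signals the end of the file
            break
        chunks.append(chunk)

print(chunks)  # ['Python is ', 'easy to le', 'arn.']
```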

### *readline()* method

The *readline()* method is used to read a single line from the file.

The *readline()* method returns an empty string when the end of the file has been reached.

The *readline()* method returns a string containing only a single newline character when a blank line is encountered in the file.

#### Example

In [49]:
file = open('AboutPython.txt', 'r')

In [50]:
file.readline()

'Python runs everywhere. Python is friendly. Python is easy to learn. Python is open.Python is simple yet poweful language.'

In [51]:
file.readline()

''

In [52]:
file.readline()

''

In [53]:
file.readline()

''

In [54]:
file.close()

**Note:** After reading a line from the file, the file pointer automatically moves to the start of the next line.
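
The difference between a blank line and the end of the file is easier to see on a file with several lines. A minimal sketch, assuming a hypothetical three-line file whose second line is blank:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "readline_demo.txt")  # hypothetical file

# The "with" statement closes the file automatically at the end of the block.
with open(path, "w") as file:
    file.write("first line\n\nthird line\n")   # the second line is blank

results = []
with open(path, "r") as file:
    while True:
        line = file.readline()
        if line == "":          # empty string: end of the file
            break
        results.append(line)    # a blank line arrives as "\n", not ""

print(results)  # ['first line\n', '\n', 'third line\n']
```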

### *readlines()* method

The *readlines()* method is used to read all the lines in the file.

#### Example

In [55]:
file = open('AboutPython.txt', 'r')

In [56]:
file.readlines()

['Python runs everywhere. Python is friendly. Python is easy to learn. Python is open.Python is simple yet poweful language.']

In [57]:
file.close()

### Loop over file object

An efficient way to read a file is to loop over the file object, which yields one line at a time.

#### Example

In [58]:
file = open('AboutPython.txt', 'r')

In [59]:
for line in file:
    print(line)

Python runs everywhere. Python is friendly. Python is easy to learn. Python is open.Python is simple yet poweful language.


In [60]:
file.close()
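
Each line produced by the loop still carries its newline character, which is why *print()* shows an extra blank line. The sketch below strips the newline and numbers the lines; the file name and contents are hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "loop_demo.txt")  # hypothetical file

# The "with" statement closes the file automatically at the end of the block.
with open(path, "w") as file:
    file.write("Python runs everywhere.\nPython is friendly.\n")

numbered = []
with open(path, "r") as file:
    for number, line in enumerate(file, start=1):
        text = line.rstrip("\n")   # drop the line ending kept by the loop
        numbered.append(f"{number}: {text}")

print(numbered)  # ['1: Python runs everywhere.', '2: Python is friendly.']
```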

**Note:** The *read()* and *readline()* methods return an empty string when end-of-file (EOF) is reached, while *readlines()* returns an empty list.

### Opening Files using *with* keyword

Opening a file with the *with* keyword has the following advantages:

* The file is properly closed after it is used, even if an error occurs during a read or write operation.

* The file is properly closed after it is used, even when the programmer forgets to close it explicitly.

#### Example

In [63]:
with open('AboutPython.txt', 'r') as file:
    for line in file:
        print(line)

print()
print("Is the file closed? ", file.closed)

Python runs everywhere. Python is friendly. Python is easy to learn. Python is open.Python is simple yet poweful language.

Is the file closed?  True
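
The first advantage can be demonstrated by raising an error on purpose inside the *with* block. This is a sketch; the file name and the simulated `RuntimeError` are assumptions made for the demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")  # hypothetical file

try:
    with open(path, "w") as file:
        file.write("partial data")
        raise RuntimeError("simulated failure")   # error inside the block
except RuntimeError:
    pass

# The file was closed on the way out of the block, despite the error.
print(file.closed)  # True

with open(path, "r") as check:
    saved = check.read()

print(saved)  # partial data
```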


### Splitting words

The string *split()* method splits a line into a list of words. When called without arguments, it splits on whitespace.

#### Example

In [67]:
with open("AboutPython.txt", "r") as file:
    line = file.readline()
    
    print(line)
    
    print()
    
    print(line.split()) # Split on whitespace

Python runs everywhere. Python is friendly. Python is easy to learn. Python is open.Python is simple yet poweful language.

['Python', 'runs', 'everywhere.', 'Python', 'is', 'friendly.', 'Python', 'is', 'easy', 'to', 'learn.', 'Python', 'is', 'open.Python', 'is', 'simple', 'yet', 'poweful', 'language.']
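
*split()* also accepts a separator, which is useful for lines whose fields are divided by a specific character. A short sketch on a made-up comma-separated line:

```python
line = "Python,3.12,open source\n"   # hypothetical comma-separated line

# strip() removes the trailing newline before splitting.
fields = line.strip().split(",")
print(fields)  # ['Python', '3.12', 'open source']

# The second argument limits the number of splits made from the left.
first, rest = line.strip().split(",", 1)
print(first)   # Python
print(rest)    # 3.12,open source
```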


# File Positions

The file management system associates a pointer with every file, known as *file pointer*.

The *file pointer* facilitates the movement across the file for reading or writing the data.

The *file pointer* specifies a location from where the current read or write operation is initiated.

Once the read/write operation is completed, the pointer is automatically updated.

## Python methods to tell or set the position of the file pointer

### *tell()* method

The *tell* method tells the current position within the file at which the next read or write operation will occur.

It is specified as the number of bytes from the beginning of the file.

When a file is opened for reading, the file pointer is positioned at the location 0, which is the beginning of the file.

#### Example

In [4]:
with open("AboutPython.txt", "r") as file:
    print("Position of the file pointer before read: ", file.tell())
       
    print(file.read(10))
    
    print("Position of the file pointer after reading: ", file.tell())

Position of the file pointer before read:  0
Python run
Position of the file pointer after reading:  10


### *seek()* method

The *seek* method is used to set/move the position of the file pointer to a new location.

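The syntax is *seek(offset, whence)*.

The *offset* argument indicates the number of bytes to be moved.

The optional *whence* argument specifies the reference position from where the bytes are to be moved. The default is 0.

| whence | Reference Position |
| ------ | ------------------ |
| 0      | From the beginning of the file |
| 1      | From the current position of the file |
| 2      | From the end of the file |

**Note:** In a file opened in text mode, a non-zero *offset* is allowed only relative to the beginning of the file; *seek(0, 2)*, which moves to the end of the file, is still permitted.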

#### Example

In [15]:
with open("AboutPython.txt", "r") as file:
    print("Position of the file pointer before read: ", file.tell())
    
    print(file.readline())
    
    file.seek(5, 0)
    
    print("Position of the file pointer after seeking: ", file.tell())
    
    print(file.readline())
    
    file.seek(0, 2)
    
    print("Position of the file pointer after seeking: ", file.tell())
    
    print(file.readline())

Position of the file pointer before read:  0
Python runs everywhere.
Position of the file pointer after seeking:  5
n runs everywhere.
Position of the file pointer after seeking:  23
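
All three reference positions can be used freely when the file is opened in binary mode. The sketch below uses the named constants `os.SEEK_SET`, `os.SEEK_CUR`, and `os.SEEK_END`, which stand for 0, 1, and 2; the file name is a hypothetical example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")  # hypothetical file

with open(path, "wb") as file:
    file.write(b"Python runs everywhere.")

with open(path, "rb") as file:
    file.seek(7, os.SEEK_SET)      # 7 bytes from the beginning
    word = file.read(4)

    file.seek(1, os.SEEK_CUR)      # skip 1 byte from the current position
    position = file.tell()

    file.seek(-11, os.SEEK_END)    # 11 bytes before the end
    tail = file.read()

print(word)      # b'runs'
print(position)  # 12
print(tail)      # b'everywhere.'
```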



# Renaming and Deleting Files

Python's *os* module provides functions to rename and delete files.

In [1]:
import os

## The *rename()* function

The *os.rename()* function takes two arguments: the current file name and the new file name.

### Example

In [2]:
os.rename("AboutPython.txt", "About.txt")

## The *remove()* function

The *os.remove()* function is used to delete a file.

The function takes the file name as the argument and deletes that file.

### Example

In [3]:
os.remove("About.txt")
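
Both functions raise an error when the named file is missing, so it can help to check first with `os.path.exists()`. A minimal sketch, using hypothetical file names in a temporary directory:

```python
import os
import tempfile

folder = tempfile.mkdtemp()
old_path = os.path.join(folder, "draft.txt")   # hypothetical names
new_path = os.path.join(folder, "final.txt")

with open(old_path, "w") as file:
    file.write("temporary content")

os.rename(old_path, new_path)
print(os.path.exists(old_path))  # False
print(os.path.exists(new_path))  # True

# Check that the file exists before deleting it to avoid FileNotFoundError.
if os.path.exists(new_path):
    os.remove(new_path)

print(os.path.exists(new_path))  # False
```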