## File I/O

A file is a named location on disk that stores related information. It is used to keep data permanently in non-volatile storage (e.g. a hard disk).

Random access memory (RAM) is volatile and loses its data when the computer is turned off, so we use files to keep data available for future use.

When we want to read from or write to a file, we need to open it first. When we are done, it needs to be closed so that the resources tied to the file are freed.

**File Operations:**

1. Open a file
2. Read or write (perform operation)
3. Close the file

### Opening a File

Python has a built-in function **open()** to open a file. This function returns a file object, also called a handle, which is then used to read or modify the file.

In [1]:
# open file in current directory
f = open('example.txt')

We can specify the mode while opening a file. In mode, we specify whether we want to **read 'r'**, **write 'w'** or **append 'a'** to the file. We also specify if we want to open the file in **text mode or binary mode**. 

### Python File Modes

**'r'** - Open a file for reading. (default)

**'w'** - Open a file for writing. Creates a new file if it doesn't exist, or truncates (empties) the file if it exists.

**'x'** - Open a file for exclusive creation. If the file already exists, the operation fails. 

**'a'** - Open for appending at the end of the file without truncating it. Creates a new file if it doesn't exist.

**'t'** - Open in text mode. (default)

**'b'** - Open in binary mode. 

**'+'** - Open a file for updating (reading and writing)
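
The effect of these modes can be checked with a small experiment. The following is a minimal sketch, not part of the original notebook: the file name `modes_demo.txt` is hypothetical, and a temporary directory is used so that no real files are touched.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are affected.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

    # 'x' creates the file and fails if it already exists.
    with open(path, "x", encoding="utf-8") as f:
        f.write("first\n")

    try:
        with open(path, "x", encoding="utf-8") as f:
            pass
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

    # 'a' keeps the existing content and writes at the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("second\n")

    # 'r+' opens an existing file for both reading and writing.
    with open(path, "r+", encoding="utf-8") as f:
        text_after_append = f.read()

    # 'w' empties the file before writing.
    with open(path, "w", encoding="utf-8") as f:
        f.write("replaced\n")

    # 'rb' returns raw bytes instead of str.
    with open(path, "rb") as f:
        raw_bytes = f.read()

print(exclusive_failed)          # True
print(repr(text_after_append))   # 'first\nsecond\n'
print(type(raw_bytes))           # <class 'bytes'>
```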

In [2]:
# create the file handle with all default options
f = open('example.txt')

In [3]:
# open the file in read mode
f = open('example.txt', 'r')

In [4]:
print(f)

<_io.TextIOWrapper name='example.txt' mode='r' encoding='cp1252'>


In [5]:
# open the file in write mode
f = open('example.txt', 'w')

In [6]:
print(f)

<_io.TextIOWrapper name='example.txt' mode='w' encoding='cp1252'>


The default encoding is platform dependent. On Windows it is often 'cp1252', whereas on Linux it is usually 'utf-8'.

So, we must not rely on the default encoding, or else our code will behave differently on different platforms. 

Hence, when working with files in text mode, it is highly recommended to specify the encoding type. 
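
As an illustration of why the encoding matters, the sketch below writes a string with `encoding="utf-8"` and reads it back twice: once with the same encoding and once with a deliberately mismatched one. The file name and the choice of `latin-1` for the mismatch are assumptions made only for this demonstration.

```python
import os
import tempfile

text = "café\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "encoded.txt")  # hypothetical file name

    # Write with an explicit encoding instead of the platform default.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Reading with the same encoding returns the original text.
    with open(path, "r", encoding="utf-8") as f:
        round_trip = f.read()

    # Reading the same bytes with a different encoding garbles the text.
    with open(path, "r", encoding="latin-1") as f:
        mismatched = f.read()

print(repr(round_trip))   # 'café\n'
print(repr(mismatched))   # 'cafÃ©\n'
```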

### Closing a File

Closing a file will free up the resources that were tied with the file and is done using the **close()** method. 

Python has a garbage collector to clean up unreferenced objects, but we must not rely on it to close the file. 

In [7]:
f = open('example.txt')

In [8]:
f.close()

This method is not entirely safe. If an exception occurs when we are performing some operation with the file, the code exits without closing the file.

A safer way is to use a try...finally block.

In [9]:
# exception handling
# open the file before the try block so that f is always defined in finally
f = open('example.txt')
try:
    # perform file operations
    pass
finally:
    f.close()

This way, we are guaranteed that the file is properly closed even if an exception is raised, causing program flow to stop.

The best way to do this is with the **with** statement. It ensures that the file is closed when the block inside with is exited, whether normally or because of an exception. 

We don't need to explicitly call the close() method. It is done internally. 

        with open('example.txt', encoding='utf-8') as f:
            # perform file operations
            pass
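
To see that the with statement really closes the file, we can inspect the file object's `closed` attribute after the block. This is a small sketch under the assumption of a temporary file; the simulated `ValueError` is only there to show that the file is closed even when the block exits through an exception.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")

    # Normal exit from the block: the file is closed automatically.
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello\n")
    closed_after_block = f.closed

    # Exit through an exception: the file is still closed.
    try:
        with open(path, encoding="utf-8") as g:
            raise ValueError("simulated failure")
    except ValueError:
        closed_after_error = g.closed

print(closed_after_block, closed_after_error)  # True True
```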

### Write to a File 

In order to write into a file we need to open it in **write 'w', append 'a' or exclusive creation 'x' mode**.

We need to be careful with the 'w' mode, as it overwrites the file if it already exists. All previous data is erased. 

Writing a string (or a sequence of bytes for binary files) is done using the **write()** method. This method returns the number of characters (or bytes, in binary mode) written to the file. 

In [11]:
f = open('test.txt', 'w')
f.write("This is a First File\n")
f.write("Contains two lines\n")
f.close()

This program will create a new file named 'test.txt' if it doesn't exist. If it does exist, it is overwritten.
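
The same steps can be written with the with statement, which also makes it easy to look at the value returned by write(). The sketch below is an illustration only: it uses a temporary directory, and it additionally shows the standard `writelines()` method and the 'a' mode, neither of which appears in the cell above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")

    with open(path, "w", encoding="utf-8") as f:
        # write() returns the number of characters written.
        count = f.write("This is a First File\n")
        # writelines() writes each string in the list; it does not add newlines.
        f.writelines(["Contains two lines\n"])

    # 'a' adds to the end instead of erasing the existing content.
    with open(path, "a", encoding="utf-8") as f:
        f.write("Appended line\n")

    with open(path, encoding="utf-8") as f:
        contents = f.read()

print(count)      # 21
print(contents)
```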

### Reading From a File

There are various methods available for this purpose. We can use the **read(size)** method to read up to size characters (or bytes, in binary mode). If the size parameter is not specified, it reads and returns everything up to the end of the file.

In [12]:
f = open("test.txt", 'r')
f.read()

'This is a First File\nContains two lines\n'

In [14]:
f = open("test.txt", 'r')

# it will read the first 4 characters
f.read(4)

'This'

In [15]:
# it will read the next 10 characters from the current cursor location.
f.read(10)

' is a Firs'

We can change our current file cursor (position) using the **seek()** method. 

Similarly, the **tell()** method returns our current position (in number of bytes).

In [16]:
# returns the current cursor location
f.tell()

14

In [17]:
# if we want to bring the file cursor to initial position
f.seek(0)

0

In [18]:
# read the entire file
print(f.read())

This is a First File
Contains two lines
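
Positions reported by tell() are easiest to reason about in binary mode, where they are plain byte offsets. The following sketch assumes a temporary file with the same two lines as above and uses `os.SEEK_END`, a standard constant for seeking relative to the end of the file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")

    with open(path, "wb") as f:
        f.write(b"This is a First File\nContains two lines\n")

    with open(path, "rb") as f:
        first = f.read(4)            # read the first 4 bytes
        pos_after_read = f.tell()    # the cursor is now at offset 4

        f.seek(10)                   # jump to byte offset 10
        word = f.read(5)

        f.seek(-6, os.SEEK_END)      # 6 bytes before the end of the file
        tail = f.read()

print(first, pos_after_read)  # b'This' 4
print(word)                   # b'First'
print(tail)                   # b'lines\n'
```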



We can read a file line by line using a for loop. This is memory-efficient, because only one line is handed to the loop at a time instead of the whole file being loaded at once.

In [19]:
f.seek(0)

for line in f:
    print(line)

This is a First File

Contains two lines
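
The blank lines in the output above appear because each line read from the file still ends with '\n', and print() adds a newline of its own. One way to avoid this, sketched here with a temporary file, is to strip the trailing newline before printing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")

    with open(path, "w", encoding="utf-8") as f:
        f.write("This is a First File\nContains two lines\n")

    stripped_lines = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Each line still carries its trailing newline; remove it.
            stripped_lines.append(line.rstrip("\n"))

for line in stripped_lines:
    print(line)
```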



Alternatively, we can use the **readline()** method to read individual lines of a file. This method reads up to and including the next newline character. 

In [20]:
f = open("test.txt", "r")
f.readline()

'This is a First File\n'

In [21]:
f.readline()

'Contains two lines\n'

In [22]:
f.readline()

''

The **readlines()** method returns a list of the remaining lines of the file, starting from the current position. All of these reading methods return empty values when the end of file (EOF) is reached.

In [26]:
f.seek(0)

f.readlines()
f.close()
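
The behaviour at the end of the file can be verified directly. In this sketch (again using a temporary file as an assumption), readline() consumes the first line, readlines() returns what is left, and every further read returns an empty value.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")

    with open(path, "w", encoding="utf-8") as f:
        f.write("This is a First File\nContains two lines\n")

    with open(path, encoding="utf-8") as f:
        first_line = f.readline()        # consumes the first line
        remaining = f.readlines()        # only the lines that are left

        # The cursor is now at EOF, so every read returns an empty value.
        at_eof_read = f.read()
        at_eof_readline = f.readline()
        at_eof_readlines = f.readlines()

print(repr(first_line))   # 'This is a First File\n'
print(remaining)          # ['Contains two lines\n']
print(repr(at_eof_read), repr(at_eof_readline), at_eof_readlines)  # '' '' []
```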

## Renaming And Deleting Files in Python

Besides reading and writing, we may also need to **rename** or **delete** a file. Python's **os** module provides functions for both operations. 

To use them, we first import the **os** module in our Python script.

In [24]:
import os

In [27]:
# rename a file from 'test.txt' to 'sample.txt'
os.rename("test.txt", "sample.txt")

In [30]:
f = open("sample.txt", 'r')
f.readline()
f.close()

In [31]:
# delete a file 'sample.txt'
os.remove("sample.txt")

In [32]:
f = open("sample.txt", 'r')
f.readline()

FileNotFoundError: [Errno 2] No such file or directory: 'sample.txt'
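
In a script, we usually want to handle the missing file instead of letting the program stop. The sketch below repeats the rename and delete steps inside a temporary directory and catches the `FileNotFoundError`; the use of `os.path.exists()` to check the result is an addition for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    old_path = os.path.join(tmp_dir, "test.txt")
    new_path = os.path.join(tmp_dir, "sample.txt")

    with open(old_path, "w", encoding="utf-8") as f:
        f.write("This is a First File\n")

    # Rename the file, then confirm that only the new name exists.
    os.rename(old_path, new_path)
    renamed = os.path.exists(new_path) and not os.path.exists(old_path)

    # Delete the file; opening it afterwards raises FileNotFoundError.
    os.remove(new_path)
    try:
        with open(new_path, encoding="utf-8") as f:
            f.readline()
        open_failed = False
    except FileNotFoundError:
        open_failed = True

print(renamed, open_failed)  # True True
```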

## Python Directory and File Management

If there are a large number of files to handle in our Python program, we can arrange our code within different directories to make things more manageable.

A directory or folder is a collection of files and subdirectories. Python has the **os** module, which provides us with many useful methods to work with directories (and files as well). 


**Get Current Directory**


We can get the present working directory using the **getcwd()** method.

This method returns the current working directory in the form of a string.

In [44]:
import os

os.getcwd()

'F:\\Data Science Rahul\\AppliedAICourse_Deepak\\Deepak_Lab'

### Changing Directory

We can change the current working directory using the **chdir()** method. 

The new path that we want to change to must be supplied as a string to this method. We can use either the forward slash (/) or the backslash (\) to separate path elements. Inside an ordinary Python string, each backslash must be doubled, as in the example below. 

In [43]:
os.chdir("F:\\Data Science Rahul\\AppliedAICourse_Deepak\\")

In [42]:
os.getcwd()

'F:\\Data Science Rahul\\AppliedAICourse_Deepak'
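
Because chdir() changes the working directory for the whole program, it is often a good idea to switch back afterwards. The following is a minimal sketch of that pattern, assuming a temporary directory as the target; `os.path.samefile()` is used for the comparison because some systems report the same directory under different path spellings.

```python
import os
import tempfile

original_dir = os.getcwd()

with tempfile.TemporaryDirectory() as tmp_dir:
    try:
        os.chdir(tmp_dir)
        # Compare directories rather than strings, since paths may be aliased.
        moved = os.path.samefile(os.getcwd(), tmp_dir)
    finally:
        # Always return to where we started, even if an error occurred.
        os.chdir(original_dir)

restored = os.getcwd() == original_dir
print(moved, restored)  # True True
```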

### List Directories and Files

All files and subdirectories inside a directory can be listed using the **listdir()** method.

In [45]:
os.listdir(os.getcwd())

['.ipynb_checkpoints',
 '1. Keywords and identifiers.ipynb',
 '10. Python List.ipynb',
 '11. Tuples.ipynb',
 '12. Python Sets.ipynb',
 '13. Python Dictionary.ipynb',
 '14. Strings.ipynb',
 '15. Python Functions.ipynb',
 '16. Python Modules.ipynb',
 '17. Python Packages.ipynb',
 '18. File Handling.ipynb',
 '2. Comments, Indentation and Statements.ipynb',
 '3. Variables and Data Types in Python.ipynb',
 '4. Standard Input and Output.ipynb',
 '5. Operators.ipynb',
 '6. Control flow if else.ipynb',
 '7. Control flow while loop.ipynb',
 '8. Control flow for loop.ipynb',
 '9. Control flow break and continue.ipynb',
 'example.txt']
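
listdir() returns only names, without saying which entries are files and which are directories. One way to tell them apart, sketched here with a temporary directory and hypothetical entry names, is to combine it with `os.path.isfile()` and `os.path.isdir()`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create one subdirectory and one file to list (hypothetical names).
    os.mkdir(os.path.join(tmp_dir, "subdir"))
    with open(os.path.join(tmp_dir, "example.txt"), "w", encoding="utf-8") as f:
        f.write("hello\n")

    # listdir() gives no ordering guarantee, so sort for stable output.
    entries = sorted(os.listdir(tmp_dir))

    files = [n for n in entries if os.path.isfile(os.path.join(tmp_dir, n))]
    dirs = [n for n in entries if os.path.isdir(os.path.join(tmp_dir, n))]

print(entries)  # ['example.txt', 'subdir']
print(files)    # ['example.txt']
print(dirs)     # ['subdir']
```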

### Making New Directory

We can make a new directory using the **mkdir()** method. 

This method takes in the path of the new directory. If the full path is not specified, the new directory is created in the current working directory. 

In [46]:
# create an empty directory
os.mkdir('test')
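
Calling mkdir() on a directory that already exists raises `FileExistsError`. The sketch below shows this inside a temporary directory, and also shows `os.makedirs()`, a related standard function (not covered in the text above) that creates missing parent directories and can ignore existing ones.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    new_dir = os.path.join(tmp_dir, "test")
    os.mkdir(new_dir)

    # A second mkdir() on the same path fails.
    try:
        os.mkdir(new_dir)
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True

    # makedirs() creates intermediate directories as needed;
    # exist_ok=True makes repeated calls harmless.
    nested = os.path.join(tmp_dir, "a", "b", "c")
    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)
    nested_exists = os.path.isdir(nested)

print(second_call_failed, nested_exists)  # True True
```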

### Removing Directory

The **rmdir()** method removes a directory, but only if it is empty.

In order to remove a non-empty directory we can use the **rmtree()** method inside the **shutil** module.

In [47]:
# remove an empty directory
os.rmdir('test')

In [48]:
# shutil provides rmtree(), used further below to remove a non-empty directory
import shutil

# create an empty directory
os.mkdir('test')

In [49]:
# change the current directory path to the new directory
os.chdir('./test')

In [50]:
os.getcwd()

'F:\\Data Science Rahul\\AppliedAICourse_Deepak\\Deepak_Lab\\test'

In [52]:
# create a text file inside the new directory
f = open("testfile.txt", "w")

f.write("Hello World")

f.close()

In [53]:
os.chdir("../")

os.getcwd()

'F:\\Data Science Rahul\\AppliedAICourse_Deepak\\Deepak_Lab'

In [54]:
# this will return an error because the directory is not empty
os.rmdir('test')

OSError: [WinError 145] The directory is not empty: 'test'

In [55]:
# remove a non-empty directory
shutil.rmtree('test')

In [56]:
os.getcwd()

'F:\\Data Science Rahul\\AppliedAICourse_Deepak\\Deepak_Lab'
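
The whole sequence can also be run without changing the working directory. This is a compact sketch under the assumption of a temporary parent directory: it builds a non-empty directory, shows that rmdir() refuses to remove it, and then removes it with shutil.rmtree().

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "test")
    os.mkdir(target)

    # Put a file inside so that the directory is not empty.
    with open(os.path.join(target, "testfile.txt"), "w", encoding="utf-8") as f:
        f.write("Hello World")

    # rmdir() only works on empty directories.
    try:
        os.rmdir(target)
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

    # rmtree() removes the directory together with everything inside it.
    shutil.rmtree(target)
    removed = not os.path.exists(target)

print(rmdir_failed, removed)  # True True
```

Since rmtree() deletes everything below the given path without asking, it is worth double-checking the path before calling it.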