# Files

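Python's built-in `open()` function and the file objects it returns let you create, read, write, and append to files. Renaming and deleting files are handled by standard-library modules such as `os` and `pathlib`.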

### Advantages of File Handling
- **Versatility**: File handling in Python allows you to perform a wide range of operations, such as creating, reading, writing, appending, renaming, and deleting files.
- **Flexibility**: File handling in Python is highly flexible, as it allows you to work with different file types (e.g. text files, binary files, CSV files, etc.), and to perform different operations on files (e.g. read, write, append, etc.).
- **User-friendly**: Python provides a user-friendly interface for file handling, making it easy to create, read, and manipulate files.
- **Cross-platform**: Python file-handling functions work across different platforms (e.g. Windows, Mac, Linux), allowing for seamless integration and compatibility.

### Disadvantages of File Handling
- **Error-prone**: File handling operations in Python can be prone to errors, especially if the code is not carefully written or if there are issues with the file system (e.g. file permissions, file locks, etc.).
- **Security risks**: File handling in Python can also pose security risks, especially if the program accepts user input that can be used to access or modify sensitive files on the system.
- **Complexity**: File handling in Python can be complex, especially when working with more advanced file formats or operations. Careful attention must be paid to the code to ensure that files are handled properly and securely.
- **Performance**: File handling operations in Python can be slower than in lower-level compiled languages, especially when dealing with large files or performing complex operations.
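The security risk mentioned above can be reduced by checking user-supplied paths before opening them. The following is a minimal sketch, not a complete defense: `resolve_inside` is a hypothetical helper name, and it assumes all permitted files live under a single base directory.

```python
import tempfile
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Hypothetical helper: resolve user_path under base_dir or raise ValueError."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    try:
        # relative_to raises ValueError when candidate is not under base
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"{user_path!r} points outside {base}") from None
    return candidate


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    safe = resolve_inside(tmp, "notes.txt")
    print(safe.name)

    try:
        resolve_inside(tmp, "../secret.txt")  # tries to escape the base directory
        rejected = False
    except ValueError:
        rejected = True
    print(rejected)
```

Resolving both paths first means that `..` segments and absolute paths are normalized before the containment check runs.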

### open() function

f = open(filename, mode)

The `mode` argument accepts the following values:

1. r: Open an existing file for reading (the default). Raises `FileNotFoundError` if the file does not exist.
2. w: Open a file for writing. If the file already exists, its contents are erased (truncated); if it does not exist, it is created.
3. a: Open a file for appending. Existing data is kept and new data is written at the end; the file is created if it does not exist.
4. r+: Open an existing file for reading and writing. The file is not truncated; writes overwrite existing data starting at the current position, which is the beginning of the file right after opening.
5. w+: Open a file for writing and reading. Existing data is erased; the file is created if it does not exist.
6. a+: Open a file for appending and reading. Existing data is kept; the file is created if it does not exist.
7. x: Open for exclusive creation, failing with `FileExistsError` if the file already exists.
8. t: Text mode (the default). It is combined with another mode (e.g., "rt", "wt", "at") rather than used on its own.
9. rb: Opens the file in binary mode, but is otherwise identical to r mode.
10. rb+: Similar to r+ mode, but opens the file in binary mode.
11. ab: Opens the file in binary mode, but is otherwise identical to a mode.
12. ab+: Opens the file in binary mode, but is otherwise identical to a+ mode.
13. wb: Same as w mode, but opens the file in binary mode.
14. wb+: Same as w+ mode, but opens the file in binary mode.
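The differences between these modes can be seen side by side in a small experiment. This is an illustrative sketch that works in a temporary directory; the file name `modes_demo.txt` is made up for the example.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes_demo.txt")  # hypothetical file name

    with open(path, "w") as f:   # creates the file (or truncates it)
        f.write("first\n")
    with open(path, "a") as f:   # appends at the end
        f.write("second\n")
    with open(path) as f:        # default mode is "rt"
        after_append = f.read()

    with open(path, "r+") as f:  # overwrites in place, starting at position 0
        f.write("FIRST")
    with open(path) as f:
        after_r_plus = f.read()

    try:
        open(path, "x")          # exclusive creation fails: the file exists
        x_failed = False
    except FileExistsError:
        x_failed = True

    with open(path, "rb") as f:  # binary mode returns bytes, not str
        raw = f.read()

print(repr(after_append))
print(repr(after_r_plus))
print(x_failed, type(raw).__name__)
```

Note that `r+` replaced only the five characters it wrote and left the rest of the file intact, whereas `w` would have erased everything first.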

In [1]:
# Make a new file in output mode
f = open('data.txt', 'w')
f.write('Hello\n')
f.write('World!\n')
f.close()

d = open('data.txt')
d.readlines()

['Hello\n', 'World!\n']

In [2]:
# Append to the existing file (the file is created if it does not exist)
f = open('data.txt', 'a')
f.write('Hi Students!\n')
f.close()

d = open('data.txt')
d.readlines()

['Hello\n', 'World!\n', 'Hi Students!\n']

In [3]:
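# r+ does not truncate: the 25 characters written here overwrite the first
# 25 characters of the 26-character file, leaving only its final '\n'
f = open('data.txt', 'r+')
f.write('Hi Student, How are you!\n')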
f.close()

d = open('data.txt')
d.readlines()

['Hi Student, How are you!\n', '\n']

In [4]:
f = open('data.txt', 'w+')
f.write('Hi Student, How are you!.Is Elnur here?\n')
f.close()

d = open('data.txt')
d.readlines()

['Hi Student, How are you!.Is Elnur here?\n']

In [5]:
f = open('data.txt', 'a+')
f.write('Hi Student, How are you!.Is Ali here?\n')
f.close()

d = open('data.txt')
d.readlines()

['Hi Student, How are you!.Is Elnur here?\n',
 'Hi Student, How are you!.Is Ali here?\n']

In [6]:
f = open('data.txt', 'a+')
# writelines expects an iterable of strings; it does not add newlines itself
f.writelines(['Hi Student, How are you!\n', 'Is Nazim here?\n'])
f.close()

d = open('data.txt')
d.readlines()

['Hi Student, How are you!.Is Elnur here?\n',
 'Hi Student, How are you!.Is Ali here?\n',
 'Hi Student, How are you!\n',
 'Is Nazim here?\n']

In [7]:
# The default mode is 'r' (read-only), so writing raises UnsupportedOperation
f = open('data.txt')
f.write('Hi Students!\n')
f.close()

d = open('data.txt')
d.readlines()

UnsupportedOperation: not writable

In [None]:
file = open("binary_file.txt", "wb")
binary_text = "binary text".encode('utf-8')
file.write(binary_text)
file.close()
# name and mode remain available on the file object after it is closed
print("Name of the file:", file.name)
print("Opening mode:", file.mode)

#### with open

In [None]:
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
with open("sm.txt", "w") as f:
    f.writelines(lines)
    
# Read the file back; the with block closes it automatically
with open('sm.txt') as d:
    print(d.readlines())
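The main benefit of `with open(...)` is that the file is closed when the block ends, even if an exception is raised inside it. The sketch below checks this through the file object's `closed` attribute; the file name and the simulated error are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sm_demo.txt")  # hypothetical file name

    with open(path, "w") as f:
        f.write("Line 1\n")
    closed_after_block = f.closed  # the with block has already closed f

    try:
        with open(path) as g:
            raise RuntimeError("simulated failure")  # error inside the block
    except RuntimeError:
        pass
    closed_after_error = g.closed  # g was closed despite the exception

print(closed_after_block, closed_after_error)
```

With a plain `open()` call, the same guarantee would need an explicit `try`/`finally` around `close()`.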

#### Using try/except in file operations

In [None]:
fp = open("data1.txt", "r")
print(fp.read())
fp.close()

In [None]:
# Opening the file with relative path
try:
    fp = open("data1.txt", "r")
    print(fp.read())
    fp.close()
except FileNotFoundError:
    print("Please check the path.")

In [None]:
fp = open("sample3.txt", "w")
print(fp.write("Hello world"))
fp.close()

In [None]:
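# Opening the file with a relative path. In 'w' mode the file is created if
# it is missing, so FileNotFoundError is raised only when the parent directory
# does not exist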
try:
    fp = open("sample3.txt", "w")
    print(fp.write("Hello"))
    fp.close()
except FileNotFoundError:
    print("Please check the path.")

In [None]:
try:
    fp = open('sample3.txt', 'r')
    print(fp.read())
    fp.close()
except Exception as e:
    print("Error: ", e)
finally:
    print("Exit")

In [None]:
try:
    fp = open('Sample3.txt', 'r')
    print(fp.read())
    fp.close()
except Exception as e:
    print("Error: ", e)
finally:
    print("Exit")
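Catching `Exception` works, but handling specific exception types makes it clearer what went wrong. The following sketch wraps this in a small function; `read_text_or_none` and the file name are hypothetical names chosen for illustration.

```python
import os
import tempfile


def read_text_or_none(path):
    """Hypothetical helper: return the file's text, or None if it cannot be read."""
    try:
        with open(path, "r") as fp:
            return fp.read()
    except FileNotFoundError:
        print("File not found:", path)
    except PermissionError:
        print("Permission denied:", path)
    return None


with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "present.txt")  # hypothetical file name
    with open(existing, "w") as fp:
        fp.write("Hello")

    found = read_text_or_none(existing)
    missing = read_text_or_none(os.path.join(tmp, "absent.txt"))

print(found, missing)
```

Because the file is opened in a `with` block, no explicit `close()` or `finally` clause is needed here.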

In [None]:
try:
    # Creating a new file
    with open("sample3.txt", "x") as fp:
        fp.write("Hello World! I am a new file")

    # Reading the contents of the new file; the with block closes it afterwards
    with open("sample3.txt", "r") as fp:
        print(fp.read())
except FileExistsError:
    print("The file already exists")

In [None]:
try:
    # Creating a new file
    with open("Sample3.txt", "x") as fp:
        fp.write("Hello World! I am a new file")

    # Reading the contents of the new file; the with block closes it afterwards
    with open("Sample3.txt", "r") as fp:
        print(fp.read())
except FileExistsError:
    print("The file already exists")

In [None]:
with open("sample3.txt", "r+") as fp:
    # reading the contents before writing
    print(fp.read())

    # Writing new content to this file
    fp.write("\nAdding this new content")
new_f = open("sample3.txt")
new_f.readlines()

In [None]:
with open("sample3.txt", "w+") as fp:
    # w+ truncates the file on open, so this prints an empty string
    print(fp.read())

    # Writing new content to this file
    fp.write("\nAdding this new content")
with open("sample3.txt") as new_f:
    print(new_f.readlines())
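In the read/write modes, reads and writes share a single file position, which is why reading right after opening with `w+` or right after writing returns nothing. A minimal sketch of moving the position back with `seek(0)`, using an invented file name in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "seek_demo.txt")  # hypothetical file name

    with open(path, "w") as fp:
        fp.write("old content")

    with open(path, "w+") as fp:
        before = fp.read()      # empty: w+ erased the old content on open
        fp.write("new content")
        at_end = fp.read()      # empty: the position is at the end of the file
        fp.seek(0)              # move back to the beginning
        after_seek = fp.read()  # now the freshly written text is returned

print(repr(before), repr(at_end), repr(after_seek))
```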

# Working with pickle and joblib files

**Serialization** (also known as **pickling** in Python) is the process of converting an object into a format that can be easily stored or transmitted, such as a byte stream or a file. This process allows you to save the state of an object so that it can be reconstructed later. The opposite process, where you convert the serialized data back into the original object, is called **deserialization** (or **unpickling** in Python).
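The round trip described above can be tried in memory, without any file, using `pickle.dumps` and `pickle.loads`. The dictionary below is just sample data for the sketch.

```python
import pickle

original = {"x": [4, 2, 1.5, 1], "foo": True}

payload = pickle.dumps(original)   # serialize to a bytes object
restored = pickle.loads(payload)   # deserialize back into a Python object

print(type(payload).__name__)      # the serialized form is bytes
print(restored == original)        # same contents
print(restored is original)        # but a new, independent object
```

`pickle.dump` and `pickle.load` do the same thing with a file object opened in binary mode instead of a bytes value.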

### Difference between Pickle and Joblib

Both pickle and joblib are Python libraries used for object serialization, but they have some differences:

1. **Availability**:
   - `pickle` is part of the Python Standard Library, so it is available by default in all Python installations.
   - `joblib` is a third-party library that needs to be installed separately (`pip install joblib`).

2. **Performance**:
   - `joblib` is optimized for serializing large NumPy arrays, making it faster and more memory-efficient than `pickle` for such data types.
   - For most basic Python objects and small to medium-sized data, the performance difference between `pickle` and `joblib` may not be significant.

3. **Compatibility with External Code**:
   - `pickle` can serialize most Python objects, including instances of user-defined classes. Functions and classes are stored by reference to their importable name, so lambdas cannot be pickled and the defining module must be importable when loading.
   - `joblib` builds on `pickle`, so it handles the same kinds of objects. Its main addition is optimized handling of large NumPy arrays, which is why it is commonly used for `scikit-learn` models.

4. **File Size**:
   - `joblib` may produce smaller serialized files compared to `pickle`, especially for large NumPy arrays, due to its efficient handling of memory buffers.

5. **Dependencies**:
   - `pickle` has no external dependencies beyond the Python Standard Library, making it lightweight and easy to use.
   - `joblib` is a separate package; NumPy is an optional dependency that is needed only for its optimized array handling.

6. **Version Compatibility**:
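   - The `pickle` protocols are backward compatible: a newer Python can read pickles written by an older one. The reverse is not guaranteed, because an older Python may not support a newer protocol, and loading always requires the object's class to be importable.
   - `joblib` is a third-party library, so files may not load correctly across different versions of `joblib`, Python, or the libraries whose objects were saved.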

In summary, both `pickle` and `joblib` serialize Python objects. `joblib` is optimized for performance and memory efficiency with large NumPy arrays and `scikit-learn` objects, while `pickle` needs no installation and is sufficient for most other Python objects. Your choice between them depends on your specific use case and requirements.
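The version-compatibility point can be illustrated with the `protocol` argument of `pickle.dumps`. This is a small sketch with sample data; it assumes that choosing an older protocol is acceptable when a file must be readable by an older Python.

```python
import pickle

data = {"weights": [1.2, 3.4], "bias": 0.9}  # sample data for the sketch

# An older protocol is readable by older Python versions;
# the highest protocol is the newest one this interpreter supports.
old_format = pickle.dumps(data, protocol=2)
new_format = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# loads() detects the protocol automatically
from_old = pickle.loads(old_format)
from_new = pickle.loads(new_format)

print(from_old == data, from_new == data)
print(pickle.DEFAULT_PROTOCOL <= pickle.HIGHEST_PROTOCOL)
```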


In [None]:
import pickle

# make an example object to pickle
some_obj = {'x':[4,2,1.5,1], 'y':[32,[101],17], 'foo':True, 'spam':False}

In [None]:
with open('my1.pickle', 'wb') as f:
    pickle.dump(some_obj, f)

In [None]:
# Inspect the raw bytes: pickle data is binary, so it is not human-readable
with open('my1.pickle', 'rb') as f:
    print(f.read())

In [None]:
print(some_obj)

In [None]:
#del some_obj

### Loading

In [4]:
with open('my1.pickle', 'rb') as f:
    loaded_obj = pickle.load(f)

print('loaded_obj is', loaded_obj)

FileNotFoundError: [Errno 2] No such file or directory: 'my1.pickle'

In [5]:
import pandas as pd

df = pd.DataFrame([range(11), range(100,110)], columns=list('abcdefghijk'))

df

     a    b    c    d    e    f    g    h    i    j     k
0    0    1    2    3    4    5    6    7    8    9  10.0
1  100  101  102  103  104  105  106  107  108  109   NaN


In [6]:
with open("new.pickle", "wb") as f:
    pickle.dump(df, f)

In [7]:
with open("new.pickle", "rb") as f:    
    a = pickle.load(f)

In [8]:
a

     a    b    c    d    e    f    g    h    i    j     k
0    0    1    2    3    4    5    6    7    8    9  10.0
1  100  101  102  103  104  105  106  107  108  109   NaN


In [9]:
df.to_pickle('my_df.pickle')

del df

In [10]:
with open('my_df.pickle', 'rb') as f:
    loaded_obj = pickle.load(f)

print('loaded_obj is', loaded_obj)

loaded_obj is      a    b    c    d    e    f    g    h    i    j     k
0    0    1    2    3    4    5    6    7    8    9  10.0
1  100  101  102  103  104  105  106  107  108  109   NaN


In [11]:
import joblib
import warnings
warnings.filterwarnings('ignore')

In [12]:
# Load the object using joblib
with open('model_log.joblib', 'rb') as f:
    loaded_obj = joblib.load(f)

# Print the loaded object
print('loaded_obj is', loaded_obj)

FileNotFoundError: [Errno 2] No such file or directory: 'model_log.joblib'

In [13]:
# Load the object using joblib
with open('model_svm.joblib', 'rb') as f:
    loaded_obj = joblib.load(f)

# Print the loaded object
print('loaded_obj is', loaded_obj)

FileNotFoundError: [Errno 2] No such file or directory: 'model_svm.joblib'

In [14]:
# Pickle usage

import pickle

data = {'name': 'Alice', 'age': 25}

with open('data.pkl', 'wb') as file:
    pickle.dump(data, file)

In [15]:
with open('data.pkl', 'rb') as file:
    loaded_data = pickle.load(file)

print(loaded_data)  

{'name': 'Alice', 'age': 25}


In [16]:
# Joblib usage

!pip install joblib
from joblib import dump

model = {'weights': [1.2, 3.4], 'bias': 0.9}

dump(model, 'model.joblib')



['model.joblib']

In [17]:
from joblib import load

loaded_model = load('model.joblib')
print(loaded_model)

{'weights': [1.2, 3.4], 'bias': 0.9}


In [18]:
# Writing in Binary mode

with open("data.bin", "wb") as f:
    f.write("Hello".encode("utf-8"))

In [19]:
# Reading in Binary mode
with open("data.bin", "rb") as f:
    content = f.read()
    print(content.decode("utf-8"))  

Hello


In [20]:
with open("example.txt", "r") as f:
    print(f.name)  
    print(f.mode)  

example.txt
r


In [21]:
# Recommended: open files with a context manager so they are closed automatically
with open("example.txt", "r") as file:
    content = file.read()
    print(content)

Hello, world!
New line added.


In [22]:
# Try except with file handling 
try:
    with open("example.txt", "r") as file:
        print(file.read())
except FileNotFoundError:
    print("File not found.")

Hello, world!
New line added.
