## 📁 File IO

Working with files is a very common task in Python, and the basics are simple.

The `open` function is used to deal with files. It takes the *file path* as its first argument and the *mode* as its second. The file path can be relative or absolute. The mode is a string of flag characters that controls how the file will be used, for example, `'w'` for writing, `'r'` for reading, `'a'` for appending, etc., and it defaults to `'r'`.
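
As a minimal sketch of how the mode flags differ, the snippet below writes to, appends to, and then reads back a throwaway file. The temporary folder and the name `demo.txt` are assumptions made for illustration only, so that the example does not touch the notebook's own data.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary folder (not part of the notebook's data)
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w")   # 'w' creates the file, or empties it if it already exists
f.write("first line\n")
f.close()

f = open(path, "a")   # 'a' keeps the existing content and writes at the end
f.write("second line\n")
f.close()

f = open(path)        # no mode given, so the default 'r' is used
content = f.read()
f.close()

print(content)
```

Opening the same file again with `'w'` would discard both lines, which is why `'a'` is the safer choice for logs and similar files.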

In [3]:
file = open('./data/myfile.txt', 'w')  # create (or overwrite) myfile.txt inside the data folder of the working directory; the folder must already exist

We can write anything to the file:

In [4]:
file.write("Hello!\nI am writing this file using Python!")  # use writelines() to write a list of strings
file.close()
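
The `writelines()` method mentioned in the comment can be sketched as follows. The file name `lines_demo.txt` and the temporary folder are assumptions for illustration. Note that `writelines()` does not add line endings, so each string has to carry its own `\n`.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration
path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")

lines = ["Hello!\n", "I am writing this file using Python!\n"]

f = open(path, "w")
f.writelines(lines)   # writes every string as-is, no newline is added for you
f.close()

f = open(path, "r")
read_back = f.readlines()   # a list with one string per line, '\n' included
f.close()

print(read_back)
```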

We can then read the file contents back using the `readlines()` method, which reads all of the file's lines at once.

In [5]:
f = open('./data/myfile.txt', 'r') # open file in reading mode

# loop over the file lines
# (each line keeps its trailing '\n', so print() shows an empty line after it)
for line in f.readlines():
    print(line)

# do not forget to close the file handle
f.close()

Hello!

I am writing this file using Python!


**GOOD PRACTICE:** Always use `open` in a `with` statement so that the file is released automatically, even if an error occurs inside the block!

In [7]:
# the file object returned by open() defines __exit__(self, ...), which closes the file when the block ends
with open('./data/myfile.txt', 'r') as fileReader:
    for line in fileReader:
        print(line.rstrip())

Hello!
I am writing this file using Python!
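
To make the role of `__exit__` more concrete, here is a minimal sketch of a hand-written context manager. The class `ManagedResource` is hypothetical and only records what happens to it; real file objects do their cleanup (closing the file) at the same point.

```python
class ManagedResource:
    """Hypothetical resource that records its own lifecycle."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        # Runs when the with block starts; the return value is bound by `as`
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the with block ends, even if the block raised an exception
        self.events.append("exit")
        return False  # False means: do not swallow the exception


resource = ManagedResource()
try:
    with resource as r:
        r.events.append("body")
        raise ValueError("something went wrong inside the block")
except ValueError:
    pass

print(resource.events)  # ['enter', 'body', 'exit']
```

The cleanup step still ran although the block raised an error, which is exactly what we want for files.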


### Multi-threading (Bonus)

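Threads cannot run Python code truly in parallel in standard CPython because of a design decision in the interpreter: the `Global Interpreter Lock` (GIL), which lets only one thread execute Python bytecode at a time.
- Two known conditions under which the lock is released:
    - C extensions (those that choose to release it)
    - Blocking IO calls (print, network requests, etc.)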
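
A minimal sketch of the second condition: the four threads below each block for 0.2 seconds, yet the whole run takes roughly 0.2 seconds rather than 0.8, because a blocked thread gives up the lock. Here `time.sleep` is an assumption standing in for a real blocking IO call such as a network request.

```python
import threading
import time

def blocking_task():
    # Stand-in for blocking IO; the lock is released while the thread sleeps
    time.sleep(0.2)

threads = [threading.Thread(target=blocking_task) for _ in range(4)]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"4 threads blocking 0.2 s each finished in {elapsed:.2f} s")
```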

In [6]:
import threading

num_list = []

def add_E_fun():
    for _ in range(100000):
        num_list.append("E")

def add_O_fun():
    for _ in range(100000):
        num_list.append("O")

# 1. Create threads for each function
thread1 = threading.Thread(target=add_E_fun)
thread2 = threading.Thread(target=add_O_fun)

# 2. Start both threads
thread1.start()
thread2.start()

# 3. Wait for both threads to finish
thread1.join()
thread2.join()

print(num_list)

print("Both threads have finished.")

['E', 'E', 'E', 'E', 'E', 'E', 'E', 'E', 'E', 'E', ... (output truncated)

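What actually happens here is time-sharing (interleaving): the two threads take turns holding the GIL, and only one of them runs Python code at any moment. In Python 3 a running thread is asked to give up the lock after a switch interval (5 ms by default); Python 2 switched every 100 bytecode instructions or so.
- This implies no performance gain for CPU-bound code like this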
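
As a rough sketch, the interleaving can be observed directly: the snippet below repeats the experiment above, then counts how often the list changes from one letter to the other. It also prints the switch interval that CPython reports through `sys.getswitchinterval()`. The helper names `add_letter` and `switches` are made up for this illustration, and the number of switches varies from run to run.

```python
import sys
import threading

num_list = []

def add_letter(letter):
    for _ in range(100000):
        num_list.append(letter)

t1 = threading.Thread(target=add_letter, args=("E",))
t2 = threading.Thread(target=add_letter, args=("O",))
t1.start()
t2.start()
t1.join()
t2.join()

# Count the places where the letter changes, i.e. where the other thread took over
switches = sum(1 for a, b in zip(num_list, num_list[1:]) if a != b)

print("switch interval (seconds):", sys.getswitchinterval())
print("items appended:", len(num_list))
print("changes between E and O:", switches)
```

No item is lost, since both threads always append all of their letters; only the order depends on when the threads are switched.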

The following is where `threading` can be useful: one thread waits on file IO while the other does CPU-bound work.

In [13]:
import threading
import random

def bubble_sort(arr):
    print("START BUBBLE SORT!")
    n = len(arr)
    for i in range(n):
        for j in range(0, n-i-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]
    print("DONE BUBBLE SORT!")

def print_every_second():
    # Despite its name, this function does not sleep: it re-reads the file five times.
    # Reading a file is blocking IO, which gives the other thread a chance to run.
    for i in range(5):
        with open('./data/myfile.txt', 'r') as fileReader:
            for line in fileReader:
                print(line.rstrip())

def generate_random_list(size):
    return [random.randint(0, 1000) for _ in range(size)]

# 1. Create threads for each function
thread1 = threading.Thread(target=print_every_second)
thread2 = threading.Thread(target=bubble_sort, args=(generate_random_list(9999),))

# 2. Start both threads
thread1.start()
thread2.start()

# 3. Wait for both threads to finish
thread1.join()
thread2.join()

print("Both threads have finished.")

START BUBBLE SORT!Hello!
I am writing this file using Python! at iteration 0Hello!
I am writing this file using Python! at iteration 1Hello!
I am writing this file using Python! at iteration 2Hello!
I am writing this file using Python! at iteration 3Hello!
I am writing this file using Python! at iteration 4
Hello!
I am writing this file using Python! at iteration 0Hello!
I am writing this file using Python! at iteration 1Hello!
I am writing this file using Python! at iteration 2Hello!
I am writing this file using Python! at iteration 3Hello!
I am writing this file using Python! at iteration 4
Hello!
I am writing this file using Python! at iteration 0Hello!
I am writing this file using Python! at iteration 1Hello!
I am writing this file using Python! at iteration 2Hello!
I am writing this file using Python! at iteration 3Hello!
I am writing this file using Python! at iteration 4
Hello!
I am writing this file using Python! at iteration 0Hello!
I am writing this file using Python! at iter

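For true parallelism, the `multiprocessing` module can be used (meh): each process gets its own interpreter and its own GIL, at the cost of extra startup time and of copying data between processes.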

# REAL Python Starts Here
- Until now, all we discussed was the syntax of crude Python operations.
- Crude Python is slow. Even a simple `for` loop can be too inefficient!
<div align="center"> <img src="https://i.redd.it/y6laupmei9u91.jpg"/> </div>

- It is important to know the syntax and to avoid syntax errors, but crude Python is slower than most programming languages. That is why libraries like NumPy, Pandas, and Matplotlib matter: they not only give you ready-made functionality, they are also **MUCH** faster than the equivalent plain Python operations.
    - They use C extensions to enhance the speed.
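
A minimal sketch of the same effect without installing anything: in CPython the built-in `sum` is implemented in C, so it can stand in here for a library routine (this substitution is an assumption made to keep the example self-contained; NumPy is not used). The function `python_loop_sum` is a made-up name, and the timings depend on the machine.

```python
import timeit

numbers = list(range(100_000))

def python_loop_sum(values):
    # Plain Python: every iteration is executed by the interpreter
    total = 0
    for v in values:
        total += v
    return total

loop_result = python_loop_sum(numbers)
builtin_result = sum(numbers)   # the loop runs inside C code

loop_time = timeit.timeit(lambda: python_loop_sum(numbers), number=20)
builtin_time = timeit.timeit(lambda: sum(numbers), number=20)

print("same result:", loop_result == builtin_result)
print(f"Python loop:  {loop_time:.4f} s")
print(f"built-in sum: {builtin_time:.4f} s")
```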

**For illustration only**:
```C
// In function.c
int myFunction(int num)
{
    return (num == 0) ? 0 : ((num & (num - 1)) == 0 ? 1 : 0);
}
```
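
For readers less familiar with C, the function above can be mirrored in plain Python as a rough sketch. For non-negative inputs it returns 1 when `num` is a power of two and 0 otherwise, because a power of two has exactly one bit set and `num & (num - 1)` clears that bit. The name `my_function` is made up for this illustration.

```python
def my_function(num):
    """Plain Python mirror of the C myFunction shown above."""
    if num == 0:
        return 0
    # A power of two has a single 1 bit, so clearing the lowest set bit leaves 0
    return 1 if (num & (num - 1)) == 0 else 0

for n in (0, 1, 12, 16):
    print(n, my_function(n))
```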

```python
# in main.py
import ctypes

# 1. Load the shared library built from function.c with:
#    cc -fPIC -shared -o libfun.so function.c
fun = ctypes.CDLL("./libfun.so")  # or the full path to the file

# 2. Add type information
fun.myFunction.argtypes = [ctypes.c_int]
fun.myFunction.restype = ctypes.c_int

# 3. Call the function
NUM = 16
returnValue = fun.myFunction(NUM)  # 1, because 16 is a power of two
```
**NOTICE:** In upcoming labs, you may be evaluated on performance, so make sure you understand these libraries very well.

<div align="center">
<img src="https://www.newus.in/static/media/Core-python-at-newus-Dharmsala.0fc3b7c72cdea81baba4.gif" width=600/>
</div>