# File Handling

File handling is a crucial part of Data Engineering. <br>
It is necessary to know how to read and write files in different formats.<br>
The most common formats are CSV, JSON, and XLSX.<br>
Common big data formats include Avro, Parquet, and ORC.<br>

In Python, the `open()` function is used to interact with files.<br><br>
Syntax:
```
open(file_name, mode)
```

### File Modes
- Create   (`x`): creates a new file, and raises a `FileExistsError` if the file already exists
- Read     (`r`): reads the content of an existing file (this is the default mode)
- Append   (`a`): adds to the end of a file, creating the file if it does not exist
- Write    (`w`): creates the file if it does not exist, and overwrites its content if it does
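
A minimal sketch of how the four modes behave. It assumes a scratch file in a temporary directory; the `modes_demo.txt` name is made up for illustration and is not part of the exercises.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'modes_demo.txt')  # hypothetical file name

# 'x' creates the file, and fails if it already exists.
f = open(path, 'x')
f.write('first line\n')
f.close()

try:
    open(path, 'x')
    created_twice = True
except FileExistsError:
    created_twice = False

# 'a' keeps the existing content and adds to the end.
f = open(path, 'a')
f.write('second line\n')
f.close()

f = open(path, 'r')
after_append = f.read()
f.close()

# 'w' empties the file before writing.
f = open(path, 'w')
f.write('replaced\n')
f.close()

f = open(path, 'r')
after_write = f.read()
f.close()

print(created_twice)   # False
print(after_append)
print(after_write)
```

After the append the file holds both lines; after reopening in `'w'` mode only the replacement text remains.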

### Create a `.txt` file

    always ensure you close a file after it is opened
    to close a file, use the .close() method

### Exercise 1

create a .txt file in your current working directory<br>
==> Do not forget to close your file 

In [6]:
# create a .txt file in your current working directory
# ==> Do not forget to close your file
school = open('students_lists.txt', 'x')

In [7]:
school.close()

### Write to a `.txt` file
If the given file does not exist, a new file with the specified name will be created<br>
If the file already exists, its content will be OVERWRITTEN

    First, open the file in write mode ('w')


In [8]:
file = open('demo.txt', 'w')

    use the .write() method (with a string as an argument) to write to the file

In [9]:
file.write('Hello, Everyone')
file.write("\nIt's a good day!")

17

In [10]:
# always remember to close your file

file.close()

### Exercise 2

Create a new file and write to that file a string of multiple lines<br>
==> Remember to close your file

In [11]:

nate = open('nate', 'w')

In [12]:
nate.write("""
My name is Nate""")

16

In [13]:
nate.close()

### Read a `.txt` file

To read a `.txt` file, the specified file path has to exist; otherwise `open()` raises a `FileNotFoundError`<br>
Specify the `r` mode when opening the file

In [14]:
file = open('demo.txt', 'r')

#### read the content of a file

- `.read()`      - returns the entire content of the file as a single `string`
- `.readline()`  - returns the content of the file *one line at a time* as a `string`
- `.readlines()` - returns the content of the file as a `list` of `strings`, one per line

In [15]:
print(file.read()) # returns all of the content of the file
file.close()

Hello, Everyone
It's a good day!


In [16]:
file = open('demo.txt', 'r')
file.readline()  # returns one line at a time in a string

'Hello, Everyone\n'

In [17]:
print(file.readline())

It's a good day!


In [18]:
file.close()

In [19]:
file = open('demo.txt', 'r')
file_content = file.readlines() # assign the content of the file to a variable
print(file_content)

['Hello, Everyone\n', "It's a good day!"]


#### we can loop through the `file_content` variable

In [20]:
for line in file_content:
    print('This is a new line: ', line)

This is a new line:  Hello, Everyone

This is a new line:  It's a good day!


In [21]:
file.close()
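
The blank lines between the printed lines above appear because each string returned by `.readlines()` keeps its trailing `'\n'`, and `print()` adds another one. A small sketch of one way to avoid that, reusing the list shown in the output of cell 19 (the `cleaned` name is only for illustration):

```python
# The list that readlines() returned for demo.txt in the cell above.
lines = ['Hello, Everyone\n', "It's a good day!"]

# Strip the trailing newline before printing so no blank lines appear.
for line in lines:
    print('This is a new line: ', line.rstrip('\n'))

# The same idea as a list, if the cleaned lines are needed later.
cleaned = [line.rstrip('\n') for line in lines]
```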

### Recap (Create, Read, Write)

In [22]:
# Create and write to a file
filename = 'new_file.txt'   # store the file name in a variable

new_file = open(filename, 'w')     # we can create a file directly in the write mode
new_file.write('''It's a beautiful day today!
Life is awesome!
Don't be scared to fail.
Just try try try again!
''')
new_file.close()

In [23]:
# read the file

new_file = open(filename, 'r')
for line in new_file.readlines():
    print(line)   # each line keeps its trailing '\n', so a blank line follows it
new_file.close()

It's a beautiful day today!

Life is awesome!

Don't be scared to fail.

Just try try try again!



In [24]:
new_file = open(filename, 'r')
first_line = new_file.readline()
new_file.close()   # close the file once the line has been read
first_line

"It's a beautiful day today!\n"

### Append a `.txt` file

The append mode allows you to write to a file without overwriting it.<br>
The provided content is placed at the end of the file<br>
Specify the `'a'` mode when opening the file

In [25]:
file_name = 'demo.txt'
reopened_file = open(file_name, 'a')
reopened_file.write('\nI hope you have a good day!') 
reopened_file.close() # don't forget to close your file

### Handling files with the `with` statement

The `with` statement ensures the file is closed once the indented block finishes, even if an error occurs inside it

In [26]:
with open('demo.txt', 'r') as file:
    filecontent = file.readlines()
    for line in filecontent:
        print(line.strip())

Hello, Everyone
It's a good day!
I hope you have a good day!
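
One way to check that the file really is closed is to look at the file object's `closed` attribute. The sketch below assumes a scratch file in a temporary directory (`with_demo.txt` is a made-up name) and also raises an error on purpose to show that the file is still closed afterwards.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')  # hypothetical file name

with open(path, 'w') as demo_file:
    demo_file.write('Hello, Everyone')
    inside = demo_file.closed      # still open inside the block

outside = demo_file.closed         # closed as soon as the block ends

# The file is also closed when the block exits because of an error.
try:
    with open(path, 'r') as demo_file:
        raise ValueError('something went wrong while processing')
except ValueError:
    closed_after_error = demo_file.closed

print(inside, outside, closed_after_error)   # False True True
```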


### Exercise 3
Write a function that takes a `file name` as an argument<br>
The function should:<br>
`Read` the specified file, and <br>
`Print` the content of the file

### Exercise 4

Part 1:<br>
    Define a function that takes another function as an argument<br>
    The function should:<br>
        `execute` the argument function, and<br>
        `return` the name of the argument function, and the result of that function in a `dictionary`<br><br>
Part 2:<br>
    convert the dictionary to a string and save it to a .txt file

In [39]:
# define the main function here
def execute_function(func):
    """Execute `func` and return its name and result in a dictionary."""
    return {'name': func.__name__, 'result': str(func())}

In [None]:
# define the argument function here
# (greet is only an example; any function that takes no arguments will do)
def greet():
    return 'Hello, Everyone'

In [None]:
# write the "save file" statement here
execute = execute_function(greet)
print('The results are:', execute)

# convert the dictionary to a string and write it into a .txt file
with open('output.txt', 'w') as f:
    f.write(str(execute) + '\n')

### Other file modes:
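'r+' - read and write (the file must already exist)<br>
'a+' - append and read (writes always go to the end of the file)<br>
'w+' - write and read (the file is created, or emptied if it exists)<br>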

`'r+' - read and write`

In [None]:
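with open('file_rw.txt', 'r+') as file1:   # the file must already exist
    for line in file1.readlines():
        print(line)
    file1.seek(0)   # move the cursor to the start; the next write overwrites from there
    file1.write('\nToday is a good day\nRemain Happy!\nBye!')
    file1.seek(0)   # go back to the start to read the updated content
    print(file1.read())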

`'a+' - append and read`

In [None]:
with open('file_ar.txt', 'a+') as file1:
    file1.write('\nToday is a good day\nRemain Happy!\nBye!')
    file1.seek(0)
    print(file1.read())

`'w+' - write and read`

In [None]:
with open('file_wr.txt', 'w+') as file1:
    file1.write('\nToday is a good day\nRemain Happy!\nBye!')
    file1.seek(0)
    print(file1.read())
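
The differences between the three modes are easiest to see on one small file. The sketch below assumes a scratch file in a temporary directory (`plus_modes.txt` is a made-up name) and applies each mode in turn.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'plus_modes.txt')   # hypothetical file name

# 'r+' needs an existing file.
try:
    open(path, 'r+')
    r_plus_error = None
except FileNotFoundError as error:
    r_plus_error = type(error).__name__

# 'w+' creates the file (or empties it) and allows reading back.
with open(path, 'w+') as f:
    f.write('abcdef')
    f.seek(0)
    after_w_plus = f.read()

# 'r+' keeps the content; writing at position 0 overwrites in place.
with open(path, 'r+') as f:
    f.write('XY')
    f.seek(0)
    after_r_plus = f.read()

# 'a+' always writes at the end; seek(0) is needed to read everything.
with open(path, 'a+') as f:
    f.write('!')
    f.seek(0)
    after_a_plus = f.read()

print(r_plus_error)    # FileNotFoundError
print(after_w_plus)    # abcdef
print(after_r_plus)    # XYcdef
print(after_a_plus)    # XYcdef!
```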

### Read More: 
Python Docs: https://docs.python.org/3/library/functions.html#open<br>
Python Docs: https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files<br>

## CSVs - writer()

    import the csv module

In [40]:
import csv

    create a csv file and add rows to the file

In [41]:
with open('demo.csv', 'w', newline='') as csv_file:
    file = csv.writer(csv_file, delimiter=',')
    file.writerow(['Name', 'Email', 'Phone Number']) # header
    file.writerow(['Funmi', 'funmi@gmail.com', '08077997788']) # row 1
    file.writerow(['Musa', 'musa@example.com', '']) # row 2

In [42]:
with open('demo_II.csv', 'w', newline='') as csv_file:
    file = csv.writer(csv_file, delimiter=',')
    header = ['Name', 'Email', 'Phone Number']
    row1 = ['Funmi', 'funmi@gmail.com', '08077997788']
    row_2 = ['Musa', 'musa@example.com', '']
    file.writerows([header,row1 ,row_2 ])

## CSVs - reader()

In [43]:
file_name = 'demo_II.csv'
with open(file_name, mode='r', newline='') as csvfile:   # newline='' is recommended for csv files
    filereader = csv.reader(csvfile, delimiter=',')
    for row in filereader:
        print(row)

['Name', 'Email', 'Phone Number']
['Funmi', 'funmi@gmail.com', '08077997788']
['Musa', 'musa@example.com', '']


In [44]:
with open('demo_II.csv', 'a', newline='') as csv_file:
    file = csv.writer(csv_file, delimiter=',')
    row3 = ['Alheri', 'alheri@gmail.com', '08077227788']
    row4 = ['Kareem', 'kareem@outlook.com', '+44112233545']
    file.writerows([row3,row4])

## CSVs - DictReader()

In [45]:
with open('demo.csv', mode='r', newline='') as csvfile:   # newline='' is recommended for csv files
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row)

{'Name': 'Funmi', 'Email': 'funmi@gmail.com', 'Phone Number': '08077997788'}
{'Name': 'Musa', 'Email': 'musa@example.com', 'Phone Number': ''}


## CSVs - DictWriter()

In [46]:
with open('demo.csv', 'a', newline='') as csvfile:
    file = csv.DictWriter(csvfile, fieldnames=['Name', 'Email', 'Phone Number'])
    row3 = {'Name': 'Kamal', 'Email': 'kamal@outlook.com', 'Phone Number': ''}
    row4 = {'Name': 'Fatimah', 'Email': 'fatimah@newuser.com', 'Phone Number': '09011223344'}
    row5 = {'Name': 'Rabiu', 'Email': 'rabiu@gmail.com', 'Phone Number': '+442341235544'}
    file.writerows([row3, row4, row5])
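
The cell above appends to a file that already has a header row. When `DictWriter` starts a new file, the header has to be written explicitly with `writeheader()`. A minimal sketch, assuming a scratch file in a temporary directory (`contacts.csv` is a made-up name), that writes one row and reads it back with `DictReader`:

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'contacts.csv')   # hypothetical file name
fieldnames = ['Name', 'Email', 'Phone Number']

with open(path, 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()   # writes the 'Name,Email,Phone Number' header row
    writer.writerow({'Name': 'Kamal', 'Email': 'kamal@outlook.com', 'Phone Number': ''})

# DictReader uses the header row as the keys of each row.
with open(path, 'r', newline='') as csvfile:
    rows = list(csv.DictReader(csvfile))

print(rows)
```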

### Exercise 5

Read the demo.csv file and print the names of persons whose phone numbers start with +44<br>
You can make use of any of the csv reader methods

In [47]:
# Code your solution here

### Exercise 6
Read the demo.csv file and print the names of persons who have no phone numbers<br>
You can make use of any of the csv reader methods

In [None]:
# Code your solution here

