# Week 7: Files, Files, Files!

## Reading and Writing to Files

### Reading a file

To read data from a file in Python, we can use the **open()** function with the file mode **'r'** (read mode). The **with** statement closes the file automatically as soon as its indented block ends. The **.read()** method reads the whole content of the file and returns it as a single string, which we can store in a variable.

In [2]:
with open('data.txt', 'r') as file:
    data = file.read()
    print(data)

sorry for forgetting the file


In [7]:
file = open("data.txt", "r")
data = file.read()
print(data)

sorry for forgetting the file
line 2


In [4]:
file.close()

In [9]:
splitted = data.split("\n")
print(splitted)

['sorry for forgetting the file', 'line 2']
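Splitting on `"\n"` works, but it leaves an empty string at the end of the list when the file finishes with a newline. The sketch below shows two alternatives: `.splitlines()` and looping over the file object line by line. The file name `example.txt` and the temporary folder are assumptions made only so the example can run on its own.

```python
import os
import tempfile

# Hypothetical file, created in a temporary folder so the example is self-contained.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "example.txt")
with open(path, "w") as file:
    file.write("sorry for forgetting the file\nline 2\n")

with open(path, "r") as file:
    data = file.read()

# The trailing newline produces an extra empty string with split("\n").
print(data.split("\n"))
# splitlines() does not produce that extra empty string.
print(data.splitlines())

# A file object can also be looped over directly, one line at a time.
with open(path, "r") as file:
    lines = [line.rstrip("\n") for line in file]
print(lines)
```

Looping over the file is usually the gentler option for large files, because it does not need the whole content in memory at once.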


### Writing to a File (txt)

To write data to a file in Python, we can use the **open()** function with the file mode **'w'** (write mode). If no file with the name you pass to open() exists, Python creates a new file with that name.

You use the **.write()** method to write information.

**Important:** if you want to add information to a file that already contains text, open it with the file mode **"a"** (append mode). Otherwise you will **overwrite the entire content of the file**.

In [11]:
with open('output.txt', 'w') as file:
    file.write('Hello world 2')
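To make the difference between **'w'** and **'a'** concrete, here is a small sketch that writes, appends, and then writes again to the same file. The file name `notes.txt` is hypothetical, and the file lives in a temporary folder so nothing on your computer is overwritten.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "notes.txt")

with open(path, "w") as file:  # 'w' creates the file (or empties an existing one)
    file.write("first line\n")

with open(path, "a") as file:  # 'a' keeps the existing text and adds at the end
    file.write("second line\n")

with open(path, "r") as file:
    after_append = file.read()

with open(path, "w") as file:  # opening with 'w' again wipes the old content
    file.write("only line\n")

with open(path, "r") as file:
    after_write = file.read()

print(repr(after_append))
print(repr(after_write))
```

Note that `.write()` does not add a line break for you, so the `"\n"` has to be part of the string.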

## Reading and Writing CSV files

CSV stands for comma-separated values. A CSV file is like a table: each line is a row, and the columns are usually separated by the "," character.

### Reading csv files

To read data from a CSV file in Python, we can use the **csv** module and its **reader** object.

We call **csv.reader(file_variable_name)** to get a reader that yields each row of the CSV as a list of strings.

In [17]:
import csv

with open('data.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader: # we can iterate over the rows and print them as lists
        print(row)


['Name', 'Age']
['John', '25']
['Sarah', '30']
['James', '99']
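The first row printed above is the header, not data. One possible way to keep the two apart is to call the built-in `next()` on the reader once before looping, as sketched below. The file `people.csv` is a hypothetical stand-in created in a temporary folder so the example runs on its own.

```python
import csv
import os
import tempfile

# Hypothetical CSV file, created only for this example.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "people.csv")
with open(path, "w", newline="") as file:
    file.write("Name,Age\nJohn,25\nSarah,30\n")

with open(path, "r", newline="") as file:
    reader = csv.reader(file)
    header = next(reader)  # consumes the first row (the column names)
    rows = list(reader)    # everything that is left is data

print(header)
print(rows)
```

Every value comes back as a string, including the ages, so numbers need an explicit `int()` or `float()` before doing arithmetic with them.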


### Writing csv files

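To write data to a CSV file in Python, we can again use the **csv** module. We create a writer object with **csv.writer(file_variable_name)** and then write with **.writerow(row)** (one row, given as a list) or **.writerows(rows)** (several rows, given as a list of lists). When opening the file for writing, pass `newline=''` as in the cells below; without it, extra blank lines can appear between rows on Windows.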

**Important:** if you want to add rows to a file that already contains data, open it with the file mode **"a"** (append mode). If you use the **"w"** mode, you will overwrite the content of the file.

In [19]:
import csv


single_row = ["Mary", 99]

with open('data.csv', 'a', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(single_row)

with open('data.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)


['Name', 'Age']
['John', '25']
['Sarah', '30']
['James', '99']
['Mary', '99']
['Mary', '99']


In [20]:
multiple_rows = [["Roger", 41], ["Superman", 203], ["My imagination has limits", 0]]

with open('data.csv', 'a', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(multiple_rows)

with open('data.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)


['Name', 'Age']
['John', '25']
['Sarah', '30']
['James', '99']
['Mary', '99']
['Mary', '99']
['Roger', '41']
['Superman', '203']
['My imagination has limits', '0']


## What if the file uses a different separator?

You can specify which character is used to separate columns in your file with the "delimiter" argument.

In [22]:
with open('atseparated.csv', 'r') as file:
    reader = csv.reader(file, delimiter="@")
    for row in reader:
        print(row)

['Name', 'Age']
['Zeus', '20000']
['Pikachu', '30']
['Python', '32']
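The `delimiter` argument is also accepted by `csv.writer`, so the same idea works when writing. The sketch below writes a semicolon-separated file and reads it back twice, once with the matching delimiter and once with the default comma, to show what goes wrong when they do not match. The file name `semicolon.csv` is hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical file in a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "semicolon.csv")

rows = [["Name", "Age"], ["Zeus", 20000], ["Pikachu", 30]]
with open(path, "w", newline="") as file:
    writer = csv.writer(file, delimiter=";")
    writer.writerows(rows)

# Reading with the matching delimiter splits each row into columns.
with open(path, "r", newline="") as file:
    read_back = list(csv.reader(file, delimiter=";"))

# Reading with the default "," leaves each row as one single column.
with open(path, "r", newline="") as file:
    wrong_delimiter = list(csv.reader(file))

print(read_back)
print(wrong_delimiter)
```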


## DictReader to read csv files

The **DictReader** class from the **csv** module provides a convenient way to read CSV files into dictionaries. Each row of the CSV file is represented as a dictionary, where the keys are the column headers (taken from the first row) and the values are the corresponding values in that row. You call **csv.DictReader(file_variable_name)** to create a *DictReader* for your CSV file.

In [27]:
import csv

with open('data.csv', 'r') as file:
    reader = csv.DictReader(file)
    list_of_dictionary = []
    for row in reader:
        print(row['Name'], row['Age'])
        print(row)
        list_of_dictionary.append(dict(row))

print(list_of_dictionary)


John 25
{'Name': 'John', 'Age': '25'}
Sarah 30
{'Name': 'Sarah', 'Age': '30'}
James 99
{'Name': 'James', 'Age': '99'}
Mary 99
{'Name': 'Mary', 'Age': '99'}
Mary 99
{'Name': 'Mary', 'Age': '99'}
Roger 41
{'Name': 'Roger', 'Age': '41'}
Superman 203
{'Name': 'Superman', 'Age': '203'}
My imagination has limits 0
{'Name': 'My imagination has limits', 'Age': '0'}
[{'Name': 'John', 'Age': '25'}, {'Name': 'Sarah', 'Age': '30'}, {'Name': 'James', 'Age': '99'}, {'Name': 'Mary', 'Age': '99'}, {'Name': 'Mary', 'Age': '99'}, {'Name': 'Roger', 'Age': '41'}, {'Name': 'Superman', 'Age': '203'}, {'Name': 'My imagination has limits', 'Age': '0'}]


In the example above, each row is a dictionary, allowing us to access the values by column name (e.g., **row['Name']** and **row['Age']**).
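As the output shows, the values in each dictionary are strings, even when they look like numbers. A minimal sketch of converting one column while reading is given below. It assumes a small hypothetical file `people.csv` with the same `Name` and `Age` columns, and it also uses the reader's `fieldnames` attribute to list the column names.

```python
import csv
import os
import tempfile

# Hypothetical CSV file with the same columns as the example above.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "people.csv")
with open(path, "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows([["Name", "Age"], ["John", 25], ["Sarah", 30]])

with open(path, "r", newline="") as file:
    reader = csv.DictReader(file)
    column_names = reader.fieldnames  # the keys every row will have
    people = []
    for row in reader:
        row["Age"] = int(row["Age"])  # turn the string '25' into the number 25
        people.append(dict(row))

print(column_names)
print(people)
```

The conversion happens inside the `with` block because the reader can only fetch rows while the file is still open.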

## Reading and Writing JSON files

JSON (JavaScript Object Notation) is a lightweight data interchange format that is easy for humans to read and write and easy for machines to parse and generate. It is widely used to transmit data between a server and a web application.

JSON represents data as key-value pairs in a hierarchical structure (like a Python dictionary!). It supports various data types such as strings, numbers, booleans, arrays, and objects. The data is organized into nested structures, making it flexible for representing complex data relationships.



```json
{
  "name": "John Doe",
  "age": 30,
  "city": "New York",
  "skills": ["Python", "JavaScript", "HTML", "CSS"],
  "contact": {
    "email": "john.doe@example.com",
    "phone": "123-456-7890"
  }
}

```

To read and write data from a JSON file in Python, we can use the **json** module.



`json.dumps()`: Converts a Python object to a JSON string.

In [82]:
import json

data = {"name": "John", "age": 30}
json_string = json.dumps(data)
print(json_string)


{"name": "John", "age": 30}


`json.dump()`: Writes a Python object as a JSON string to a file.

In [112]:
import json

data = {"name": "John", "age": 30}
with open("data.json", "w") as file:
    json.dump(data, file)

`json.loads()`: Parses a JSON string and converts it into a Python object (in this example, a dictionary).

In [114]:
import json

json_string = '{"name": "John", "age": 30}'
data = json.loads(json_string)
print(data["name"])  # Output: John


John
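Which Python object you get back depends on the JSON text. The sketch below parses a shortened version of the nested example from earlier in this section and shows how JSON objects, arrays, booleans, and `null` map to Python types. The `active` and `nickname` keys are not part of the original example; they are added here only to illustrate `true` and `null`.

```python
import json

# Shortened version of the earlier example; "active" and "nickname" are
# hypothetical keys added to show how true and null are converted.
json_string = """
{
  "name": "John Doe",
  "age": 30,
  "skills": ["Python", "JavaScript"],
  "contact": {"email": "john.doe@example.com"},
  "active": true,
  "nickname": null
}
"""

person = json.loads(json_string)

print(type(person))                # a JSON object becomes a dict
print(person["skills"][0])         # a JSON array becomes a list
print(person["contact"]["email"])  # nested objects become nested dicts
print(person["active"], person["nickname"])  # true -> True, null -> None

# If the top level of the JSON text is an array, the result is a list.
numbers = json.loads("[1, 2, 3]")
print(numbers)
```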


`json.load()`: Reads JSON text from a file and parses it into a Python object.

In [115]:
import json

with open("data.json", "r") as file:
    data = json.load(file)
print(data["name"])

John


`json.dump()` and `json.load()` can be used together to copy JSON data from one file to another.

In [116]:
import json

with open("data.json", "r") as input_file, open("output.json", "w") as output_file:
    data = json.load(input_file)
    json.dump(data, output_file)



`json.dumps()` can accept additional parameters such as `indent` and `sort_keys` for printing things in a nicer way and sorting the keys in the output.

In [118]:
import json

data = {"name": "John", "age": 30}
json_string = json.dumps(data, indent=4, sort_keys=True)
json_string_ugly = json.dumps(data)
print(json_string)
print(json_string_ugly)


{
    "age": 30,
    "name": "John"
}
{"name": "John", "age": 30}
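The same `indent` and `sort_keys` parameters are accepted by `json.dump()`, which is handy when the file is meant to be opened by a person. A minimal sketch follows; the file name `pretty.json` is hypothetical and the file is created in a temporary folder.

```python
import json
import os
import tempfile

# Hypothetical output file in a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "pretty.json")

data = {"name": "John", "age": 30}
with open(path, "w") as file:
    json.dump(data, file, indent=4, sort_keys=True)

# Look at the raw text that ended up in the file.
with open(path, "r") as file:
    text = file.read()
print(text)

# The extra whitespace does not change the data that is loaded back.
with open(path, "r") as file:
    loaded = json.load(file)
print(loaded == data)
```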


# Exercises:


## Exercise 7.1

Write a function `read_csv(filename)` that takes a filename as input and loads the data from the CSV file into a DictReader. The function should return the DictReader.

## Exercise 7.2

Write a function `count_csv_row(csv_file)` that reads a CSV file and returns the total number of rows in the file.

## Exercise 7.3

Write a function `retrieve_info_from_json(json_file_path, specific_key)` that reads a JSON file and returns the value connected to a specific key.

## Exercise 7.4

Write a function `replace_in_txt(txt_file, word_to_be_replaced, replacement)` that reads a text file and replaces all occurrences of a word with another word.


## Exercise 7.5

Write a function `calculate_column_average(csv_file, column_name)` that takes a CSV file and a column name as input. The function calculates and returns the average value of the specified column in the CSV file.

## Exercise 7.6

Create a Jupyter notebook where you document and test all the previous functions.


## Data for testing

Here is an example CSV file, `data_ex.csv`, that you can use to test the functions above:

```
name,age,city
John,30,New York
Emily,25,Los Angeles
Michael,35,Chicago
Sophia,28,Houston
Daniel,32,San Francisco
Olivia,27,Miami
James,31,Seattle
Emma,29,Boston
William,26,Atlanta
Ava,33,Dallas
Benjamin,24,Denver
Isabella,30,Austin
Mason,28,Phoenix
Mia,29,Detroit
Elijah,27,Philadelphia
```

Copy this into a plain-text editor (such as Notepad) and save it as "data_ex.csv".

Here is an example JSON file you can use to test the functions above:

```
{
  "name": "John Doe",
  "age": 30,
  "city": "New York",
  "skills": ["Python", "JavaScript", "HTML", "CSS"],
  "email": "john.doe@example.com"
}
```
Copy this into a plain-text editor (such as Notepad) and save it as "data_ex.json".
