# Introduction to Computer Programming and Numerical Methods

> **Edited by S. Makiharju** Associate Professor, UC Berkeley <br>
> **Mohamad M. Hallal, PhD** <br> Teaching Professor, UC Berkeley

[![License](https://img.shields.io/badge/license-CC%20BY--NC--ND%204.0-blue)](https://creativecommons.org/licenses/by-nc-nd/4.0/)
***

# File Input and Output Operations

1. [**TXT Files**](#s1)
2. [**CSV Files**](#s2)
3. [**Additional Reading**](#s3)

***

# 0. Motivation

While a program is running, its data is stored in random access memory (RAM). RAM is fast, but it is also volatile, meaning data disappear when the program ends or the computer shuts down. To preserve data for future use, share it with collaborators, or interact with external resources, we have to store it as a **file**. A file is a structured block of data stored within the file system of the computer's operating system. For instance, this lecture notebook itself is a file. In this section, we will learn how to use Python to read and write data in several common file formats.

**Learning objectives:**

* Read and write TXT files using built-in methods and NumPy
* Read and write CSV files using built-in methods and NumPy

# 1. TXT Files <a id="s1"></a>

A **text** file, often with a **.txt** extension, is a file containing plain text without any special formatting like bold text, fonts, or font sizes. Despite their simplicity, text files serve as the foundation for storing and sharing information in a wide range of applications. Even though these files contain plain text, programs usually expect a file to follow a specific structure or format; that is, the data are organized in a specific way. 

In general, working with text files in Python involves three fundamental steps:

1. Opening the file
2. Performing any operations (reading, writing, or modifying)
3. Closing the file

## 1.1. Opening TXT Files

To use a file, we must open it first. When opening the file, we must decide whether we want to read data from the file or write data to it. This can be achieved with the `open()` function, which is most commonly used with two arguments as shown in the following syntax:

```python
open(filename, mode)
```

where:
* `filename`: a string including the file name or the path and name of the file, such as `'example.txt'`
* `mode`: a string defining the way in which the file will be used. See the table below for examples.

<div class="alert alert-block alert-warning"> <b>NOTE!</b> Both <code>filename</code> and <code>mode</code> are strings, so both should be between quotes.</div>

| `mode` | Use and description                                                                                          |
| :----- | :----------------------------------------------------------------------------------------------------------- |
| `'r'`  | **Read** (default mode) – Opens the file for reading. Raises a `FileNotFoundError` exception if file does not exist.  |
| `'w'`  | **Write** – Opens the file for writing. Creates the file if it does not exist. Overwrites (erases) the file if it already exists. |
| `'a'`  | **Append** – Opens the file for appending. Creates the file if it does not exist. Appends to the end of the file if it already exists. |
| `'x'`  | **Create** – Creates the file. Raises an error if it already exists. |

The `mode` argument is optional; the default `r` will be assumed if it's omitted. There are other modes besides `r`, `w`, `a`, and `x` that we will discuss later.

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Try to open a file using <code>open('resources/example.txt')</code>. This will raise an error because 'example.txt' does not exist in your directory.</div>

In [1]:
open('resources/example.txt')

FileNotFoundError: [Errno 2] No such file or directory: 'resources/example.txt'

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Try to write a file using <code>open('resources/example.txt', 'w')</code>. This will not raise an error because this mode creates the file if it does not exist.</div>

In [3]:
open('resources/example.txt', 'w')

<_io.TextIOWrapper name='resources/example.txt' mode='w' encoding='cp1252'>

<div class="alert alert-block alert-warning"> <b>NOTE!</b> If you look at your directory, you will see that a file named 'example.txt' has been created. If you click on it, you will see that it is empty – we just opened it for writing but we haven't written anything to it yet. We will later see how to write to a file. </div>

Opening a file using the `open()` function requires explicitly closing the file using the `close()` method. Closing the file is essential to ensure that all resources are released and any written data is saved.

```python
f = open(filename, mode)
# perform any operations
f.close()
```

This approach can introduce errors if not handled properly: if an error occurs before `f.close()` is reached, the file is left open. Alternatively, another common way to work with files in Python is using the `with` statement, which ensures that the file is properly managed (i.e., opened and closed). Here's the syntax:

```python
with open(filename, mode) as f:
    # perform any operations
```
where `f` is a variable name that represents the file object returned by the `open()` function.

It's important to note the indentation, as all operations involving the file must be properly nested within the `with` block.

Using the `with` statement to open files is the preferred method in Python as it ensures the file is automatically closed once operations are completed, reducing the risk of resource leaks and potential data loss. For clean, readable, and reliable file handling, always use the `with` statement.

Although there are other ways to open files in Python (see [Additional Reading](#s3)), we will use the recommended method based on the `with` statement in conjunction with the `open()` function.
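To make the benefit of `with` concrete, here is a minimal sketch showing that the file is closed, and the written text saved, even when an error interrupts the block. The temporary folder, the file name `demo.txt`, and the simulated `ValueError` are assumptions made for illustration only.

```python
import os
import tempfile

# work in a throwaway folder so no real files are touched
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

    try:
        with open(path, 'w') as f:
            f.write('some text')
            # simulate something going wrong before the block ends
            raise ValueError('simulated failure')
    except ValueError:
        pass

    # the with statement closed the file despite the error
    closed_after_error = f.closed

    # the text written before the error was still saved
    with open(path) as f:
        saved_text = f.read()

print(closed_after_error)  # True
print(saved_text)          # some text
```

The `.closed` attribute of a file object reports whether the file has been closed.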


## 1.2. Reading TXT Files

After we open a file in reading mode, we can retrieve its contents in various ways. One approach is to use the `.read()` method after the file object, which returns the **entire** contents of the file as a **single string**:

```python
variable_name = f.read()
```

<div class="alert alert-block alert-warning"> <b>NOTE!</b> If we used <code>with open(filename) as myfile:</code> to open the file, we would use <code>myfile.read()</code> to read it. The <code>open()</code> function returns a file object that we assigned to the variable after <code>as</code>. This object provides the methods we use to read from and write to the file.</div>

Let's demonstrate this by reading the contents of `zen_of_python.txt` and storing them in a variable. This file is located in a folder called `resources`.

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Open the existing file <code>'resources/zen_of_python.txt'</code> in reading mode and then store all the contents in the file to variable <code>zen</code>. Print <code>zen</code> as well as the last character of it. Note that the file was imported to your directory when you loaded this notebook.</div>

In [1]:
# open the file
# equivalent to open(..., mode = 'r')
with open('resources/zen_of_python.txt') as f:
    
    # read it using .read() method
    zen = f.read()

# print the file data
print(zen)

# print the last character
print(f'Last character: {zen[-1]}')

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Last character: !


As mentioned earlier, the `.read()` method returns the entire contents of the file as a single string. This may not be ideal for large files. Alternatively, to read the entire contents but save each line as a separate string in a list, we can use the `.readlines()` method.

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Open the existing file <code>'resources/zen_of_python.txt'</code> in reading mode. Save each line in a list named <code>zen</code>. Print <code>zen</code> as well as the last element of it. Check the type of <code>zen</code>.</div>

In [3]:
# open the file
# equivalent to open(..., mode = 'r')
with open('resources/zen_of_python.txt') as f:
    
    # read it using .readlines() method
    zen = f.readlines()

# print the file data
print(zen)

# print the last element
print(f'Last element: {zen[-1]}')

# check its type
print(type(zen))

['The Zen of Python, by Tim Peters\n', '\n', 'Beautiful is better than ugly.\n', 'Explicit is better than implicit.\n', 'Simple is better than complex.\n', 'Complex is better than complicated.\n', 'Flat is better than nested.\n', 'Sparse is better than dense.\n', 'Readability counts.\n', "Special cases aren't special enough to break the rules.\n", 'Although practicality beats purity.\n', 'Errors should never pass silently.\n', 'Unless explicitly silenced.\n', 'In the face of ambiguity, refuse the temptation to guess.\n', 'There should be one-- and preferably only one --obvious way to do it.\n', "Although that way may not be obvious at first unless you're Dutch.\n", 'Now is better than never.\n', 'Although never is often better than *right* now.\n', "If the implementation is hard to explain, it's a bad idea.\n", 'If the implementation is easy to explain, it may be a good idea.\n', "Namespaces are one honking great idea -- let's do more of those!"]
Last element: Namespaces are one honking great idea -- let's do more of those!
<class 'list'>

<div class="alert alert-block alert-success"> <b>TIP!</b> The <code>\n</code> character stands for new line. If you see it in a string, that means that the current line ends at that point and a new line starts right after it.</div>

<div class="alert alert-block alert-warning"> <b>NOTE!</b> There is another similar method, <code>.readline()</code> (without s), which returns one line of the file at a time. Each time you call <code>.readline()</code>, it returns the <strong>next line</strong>. So the first time you call it, it returns the first line. The second time, it returns the second line, and so on. Calls made to <code>.readline()</code> after reaching the end of the file will return an empty string (<code>''</code>).</div>
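The behavior of `.readline()` described in the note can be sketched as a loop that stops at the first empty string. To keep the example self-contained, it uses `io.StringIO`, which behaves like an open text file held in memory; the three lines of text are made up for illustration.

```python
import io

# io.StringIO stands in for a file opened in reading mode
f = io.StringIO('first line\nsecond line\nthird line')

lines = []
while True:
    line = f.readline()
    if line == '':  # an empty string means the end of the file was reached
        break
    lines.append(line)

print(lines)  # ['first line\n', 'second line\n', 'third line']
```

A blank line inside a file is returned as `'\n'`, not as `''`, so the loop only stops at the true end of the file.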

Another approach to reading a file line by line is using a `for` loop. The object returned by `open()` is iterable, meaning we can iterate over its contents like this:

```python
with open(filename, mode) as f:

    for line in f:
        print(line)
```

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Open the existing file <code>'resources/zen_of_python.txt'</code> in reading mode and then iterate over its contents and print each line.</div>

In [5]:
# open the file
# equivalent to open(..., mode = 'r')
with open('resources/zen_of_python.txt') as f:

    # iterate over its contents and print each line
    for line in f:
        print(line)

The Zen of Python, by Tim Peters



Beautiful is better than ugly.

Explicit is better than implicit.

Simple is better than complex.

Complex is better than complicated.

Flat is better than nested.

Sparse is better than dense.

Readability counts.

Special cases aren't special enough to break the rules.

Although practicality beats purity.

Errors should never pass silently.

Unless explicitly silenced.

In the face of ambiguity, refuse the temptation to guess.

There should be one-- and preferably only one --obvious way to do it.

Although that way may not be obvious at first unless you're Dutch.

Now is better than never.

Although never is often better than *right* now.

If the implementation is hard to explain, it's a bad idea.

If the implementation is easy to explain, it may be a good idea.

Namespaces are one honking great idea -- let's do more of those!
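The blank lines in the output above appear because each line read from the file already ends with `\n`, and `print()` adds a second newline of its own. A minimal sketch of two common ways to avoid the extra spacing, using an in-memory `io.StringIO` object in place of the real file (an assumption made so the example runs on its own):

```python
import io

# two lines standing in for the contents of a text file
f = io.StringIO('Beautiful is better than ugly.\nExplicit is better than implicit.\n')

cleaned = []
for line in f:
    # option 1: tell print() not to add its own newline
    print(line, end='')

    # option 2: strip the trailing newline before using the string
    cleaned.append(line.rstrip('\n'))

print(cleaned)
# ['Beautiful is better than ugly.', 'Explicit is better than implicit.']
```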


## 1.3. Writing TXT Files

When writing data to a file, you have to decide whether to start a new version of the file or append data to the end of what's already there. If you choose `mode='w'`, any existing data in the file will be overwritten (erased). If you choose `mode='a'`, data will be appended (added) to the end of any existing data in the file. Let's discuss `mode='w'` first and then `mode='a'`.

When using `mode='w'`, remember:
1. If the file doesn't exist, a new file is created.
2. If the file already exists, its contents are erased, and new content is added.


After we open a file in writing mode, we can write a string to the file using the `.write()` method after the file object:

```python
f.write('some text')
```

<div class="alert alert-block alert-warning"> <b>NOTE!</b> The <code>write()</code> argument must be of type <code>str</code>. So if you have numeric values, use string formatting to store them in a text file.</div>

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Open the file <code>'resources/example.txt'</code> in writing mode and write five lines, where each line should have: 'This is line #', and replace # with the line number, starting from 0.</div>

In [7]:
# open the file
with open('resources/example.txt', 'w') as f:

    # write the data
    for i in range(5):
        f.write(f'This is line {i}')

If you check the file `resources/example.txt`, you will see that the data are all on the same line. Python will **not** automatically write each string to a new line. To write each string to a new line, use `\n` at the end of each string.

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Open the file <code>'resources/example.txt'</code> in writing mode and write five lines, where each line should have: 'This is line #', and replace # with the line number, starting from 0. Each statement should appear on a single line.</div>

In [9]:
# open the file
with open('resources/example.txt', 'w') as f:
    
    # write the data, each on a new line
    for i in range(5):
        f.write(f'This is line {i}\n')
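
The note above says `.write()` only accepts strings. Here is a minimal sketch of storing numbers with string formatting and converting them back with `float()` after reading. The temporary folder, the file name `numbers.txt`, and the sample values are assumptions for illustration.

```python
import os
import tempfile

values = [1.2, 2.2, 3]  # sample numeric data

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'numbers.txt')  # hypothetical file name

    # format each number as a string with two decimal places, one per line
    with open(path, 'w') as f:
        for value in values:
            f.write(f'{value:.2f}\n')

    # read the text back
    with open(path) as f:
        text = f.read()

# convert each line from str back to float
recovered = [float(line) for line in text.splitlines()]

print(repr(text))  # '1.20\n2.20\n3.00\n'
print(recovered)   # [1.2, 2.2, 3.0]
```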

As mentioned earlier, if you open an existing file in writing mode, its contents will be erased, and any new content will replace what was there previously.

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Open the existing file <code>'resources/example.txt'</code> in writing mode and write: 'This is another line'.</div>

In [11]:
# open the file
with open('resources/example.txt', 'w') as f:
    
    # write additional data
    f.write('This is another line')

If you check the file `resources/example.txt`, you will see that all the previous data were erased and were overwritten by the new string. To avoid overwriting data, we can append to the file.
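Another safeguard against accidental overwriting is the `'x'` mode from the table in Section 1.1, which refuses to open a file that already exists. A minimal sketch, assuming a temporary folder and a hypothetical file name `demo.txt`:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

    # 'x' creates the file because it does not exist yet
    with open(path, 'x') as f:
        f.write('first version\n')

    # a second 'x' on the same path raises an error instead of overwriting
    try:
        with open(path, 'x') as f:
            f.write('second version\n')
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

    # the original contents are untouched
    with open(path) as f:
        contents = f.read()

print(second_create_failed)  # True
print(repr(contents))        # 'first version\n'
```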

## 1.4. Appending TXT Files

When using `mode = 'a'`, data are appended (added) to the end of any previously existing data in the file. If the file does not already exist, a new file is created, similar to `mode='w'`. The `.write()` method is used in the same way for appending as it is for writing.

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Open the file <code>'resources/example.txt'</code> in writing mode and write five lines, where each line should have: 'This is line #', and replace # with the line number, starting from 0. Each statement should appear on a single line. Close the file and check it. Then, open the file again in appending mode and append: 'This is another line'.</div>

In [13]:
# open the file
with open('resources/example.txt', 'w') as f:

    # write the data, each on a new line
    for i in range(5):
        f.write(f"This is line {i}\n")

In [15]:
# open the file
with open('resources/example.txt', 'a') as f:

    # write additional data
    f.write('This is another line')

If you check the file `resources/example.txt`, you will see that now the previous data were not erased. Instead, the new string was appended to the end of the file.

## 1.5. Other Modes

In addition to the reading, writing, and appending modes we've discussed, there are other modes available for opening files in Python. These modes offer more flexibility and control over how files are accessed and manipulated.

One common scenario is when we need to perform multiple operations on the same file, such as reading from it and then writing to it. However, attempting to perform incompatible operations on a file opened in a specific mode can lead to errors.

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Open the file <code>'resources/example.txt'</code> in appending mode and try to read all its contents as a single string using <code>.read()</code>.</div>

This will result in an error because the file was opened in appending mode, which restricts operations to appending data to the file and does not allow reading its contents.

In [17]:
# open a file in appending mode then try to read it using read() method
with open('resources/example.txt', 'a') as f:
    zen = f.read()

UnsupportedOperation: not readable

To perform multiple operations on a file, there are other modes that could be used, as shown in the table below.

| `mode` | Use and description                                                                                          |
| :----- | :----------------------------------------------------------------------------------------------------------- |
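| `'r+'` | Opens the file for reading and writing. Raises an exception if file does not exist. Writing starts at the current position: if you read the whole file first and then write, the new data is added at the end of the file. Otherwise, writing starts at the beginning and overwrites existing characters in place without erasing the rest of the file. |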
| `'a+'` | Opens the file for reading and appending. Creates the file if it does not exist. Appends to the end of the file when writing. |
| `'w+'` | Opens the file for reading and writing. Creates the file if it does not exist. Overwrites (erases) the file if it already exists. |

See the diagram below for differences between these modes.

<br>

<center><figure>
  <img src="https://d20p74l5mne5au.cloudfront.net/ExWNT-white-bg.png" style="width:75%">
  <figcaption style="text-align:center"><strong> <br> Mode options for opening files: </strong> <a href="https://stackoverflow.com/questions/1466000/">https://stackoverflow.com/</a></figcaption>  
</figure></center>

<br>

In the diagram above, truncate is equivalent to overwrite or erase.
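The differences between these modes are easier to see in a small experiment. The sketch below uses the `.seek(0)` method, which is not covered in this section, to move back to the start of the file before reading. The temporary folder and the file name `demo.txt` are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

    # start with a file that has one line
    with open(path, 'w') as f:
        f.write('line 0\n')

    # 'a+' allows reading as well as appending
    with open(path, 'a+') as f:
        f.write('line 1\n')
        f.seek(0)  # go back to the start before reading
        after_append = f.read()

    # 'r+' starts at the beginning, so writing first overwrites in place
    with open(path, 'r+') as f:
        f.write('LINE')
        f.seek(0)
        after_overwrite = f.read()

print(repr(after_append))     # 'line 0\nline 1\n'
print(repr(after_overwrite))  # 'LINE 0\nline 1\n'
```

Note that `'r+'` replaced only the first four characters; the rest of the file was kept, unlike `'w+'`, which erases everything when the file is opened.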

## 1.6. Using NumPy Package

When reading and writing data in the above examples, we had to work with strings. Thus, when working with numerical data, dealing with text files can be cumbersome as it often requires converting data between strings and numeric values. Alternatively, we can leverage the NumPy package, which provides convenient functions for reading and writing numerical data directly from and to arrays.

To save an array to a text file using NumPy, we can use the `numpy.savetxt()` function, which has the following syntax:

```python
numpy.savetxt(filename, array, fmt='%.18e', header='', ...)
```

where:
* `filename`: a string specifying the file name or the path and name of the file enclosed in quotes: `'example.txt'`.
* `array`: an array containing the data we want to write to the file. Most frequently, the input will be an `ndarray`. However, technically, `np.savetxt()` will accept any "array-like" object, such as a Python list.
* `fmt`: a string specifying the format for the saved data. This parameter is optional and has default value `'%.18e'`, which writes each number in scientific notation with 18 decimal places. Other options include `'%.3f'` for three decimal places and `'%i'` for integer format. Custom format specifications can also be provided. For more information about format specification, read [the official documentation](https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting).
* `header`: an optional string that will be written at the beginning of the file, allowing you to add labels or comments. By default, the header line is prefixed with `# `.
* `...`: additional optional arguments. See [the official documentation](https://numpy.org/doc/stable/reference/generated/numpy.savetxt.html) for more details.

Note that `numpy.savetxt()` creates the file if it does not exist and it overwrites (erases) the file if it already exists. 

Let's use $y = \begin{pmatrix} 1.2 & 2.2 & 3 \\ 4.14 & 5.65 & 7.42 \\ \end{pmatrix}$ as an example.

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Create a 2-D list for <code>y</code> then save it as a file <code>'resources/array_y.txt'</code> with 2 decimal places. Add a header that says 'Col1 Col2 Col3'.</div> 

In [19]:
import numpy as np

# create list of lists y
y = [[1.2, 2.2, 3], [4.14, 5.65, 7.42]]

# save y
np.savetxt('resources/array_y.txt', y, fmt='%.2f', header='Col1 Col2 Col3')

Notice that we were able to save the data without changing it to `str`.

To read numerical data from a text file into an array, we can utilize the `numpy.loadtxt()` function. This function has the following syntax:

```python
numpy.loadtxt(filename, ...)
```

where:
* `filename`: a string specifying the file name or the path and name of the file enclosed in quotes: `'example.txt'`.
* `...`: additional optional arguments. For example, you can skip a number of rows or read specific columns, among others. See [the official documentation](https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html) for more details.

If the specified file does not exist, `numpy.loadtxt()` will raise an error. Additionally, the data within the text file must be numeric; otherwise, an error will occur during the loading process.

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Read the file <code>'resources/array_y.txt'</code> then assign its contents to variable <code>z</code> and print it.</div> 

In [23]:
z = np.loadtxt('resources/array_y.txt')
print(z)
print(type(z))

[[1.2  2.2  3.  ]
 [4.14 5.65 7.42]]
<class 'numpy.ndarray'>


When using `numpy.loadtxt()`, the contents of the text file are automatically converted to numeric data. This function also handles headers by default: any line that starts with the `#` sign, which `numpy.savetxt()` adds in front of the header, is treated as a comment and skipped when loading the data. This simplifies reading data files with headers, ensuring that only the numeric data is loaded into the array.
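To see how the header is stored and then skipped, the sketch below saves the same array `y`, reads the first line of the file as plain text, and loads the data back. The temporary folder is an assumption so that the example does not touch the `resources` folder.

```python
import os
import tempfile

import numpy as np

y = np.array([[1.2, 2.2, 3], [4.14, 5.65, 7.42]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'array_y.txt')

    # save with a header; savetxt prefixes it with '# ' by default
    np.savetxt(path, y, fmt='%.2f', header='Col1 Col2 Col3')

    # look at the raw first line of the file
    with open(path) as f:
        first_line = f.readline()

    # loadtxt skips the commented header line and returns only the numbers
    z = np.loadtxt(path)

print(repr(first_line))  # '# Col1 Col2 Col3\n'
print(z.shape)           # (2, 3)
```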

# 2. CSV Files <a id="s2"></a>

One of the simplest and most common file formats for storing scientific data is the **Comma-Separated Values** (CSV) file, which has a **.csv** file extension. A CSV file is very useful to store large tabular data, which can include both numbers and text. If you open the text file `resources/array_y.txt`, you can see that by default, values on the same row are separated by a white space. In CSV files, the elements are separated by a comma `,`, and hence the name comma-separated values. The character that separates values in a file is known as a **delimiter**.

Let's look at the file `resources/FAOSTAT.csv`. When opened, you'll notice the header row followed by the data rows. Each value is delimited by a comma. You can also open this file (and any CSV file) using Microsoft Excel to visualize the rows and columns.

> The FAOSTAT Temperature change on land domain disseminates statistics of mean surface temperature change by country, with annual updates. This dissemination covers the period 1961–2022. Statistics are available for monthly temperature anomalies (among others), i.e., temperature change with respect to a baseline climatology, corresponding to the period 1951–1980. Data are based on the publicly available Global Surface Temperature Change data distributed by the National Aeronautics and Space Administration Goddard Institute for Space Studies (NASA-GISS). Source: Food and Agriculture Organization of the United Nations [(FAO)](https://www.fao.org/faostat/en/#data/ET).

## 2.1. Using `csv` Module

Python provides a built-in `csv` module that can handle reading and writing CSV files – you can see its details in [this documentation](https://docs.python.org/3/library/csv.html). To utilize this module, start by importing it:

```python
import csv
```
 
To work with a CSV file using the `csv` module, you have to open the file using `open()` and specify `mode` (read, write, append), then perform any operations, and finally close the file, similar to TXT files. 

To read a CSV file, we can use the following syntax:

```python
with open(filename, mode) as f:
    variable_name = csv.reader(f)  # if opening in 'r' mode
    for line in variable_name:  # iterate over each row returned by the reader
        print(line)
```
        
<div class="alert alert-block alert-danger"> <b>TRY IT!</b> Read the file <code>'resources/FAOSTAT.csv'</code> then print its first 10 rows.</div>  

In [25]:
import csv

with open('resources/FAOSTAT.csv') as f:
    data = csv.reader(f) 
    for index, line in zip(range(10), data):
        print(line)

['Area', 'Year', 'Temperature change (celsius)']
['Afghanistan', '1961', '-0.113']
['Afghanistan', '1962', '-0.164']
['Afghanistan', '1963', '0.847']
['Afghanistan', '1964', '-0.764']
['Afghanistan', '1965', '-0.244']
['Afghanistan', '1966', '0.226']
['Afghanistan', '1967', '-0.371']
['Afghanistan', '1968', '-0.423']
['Afghanistan', '1969', '-0.539']


The `csv` module has other functions for writing CSV files, such as `csv.writer()`. 
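As a minimal sketch of `csv.writer()`, the example below writes a header and one data row taken from the output above, then reads them back. The temporary folder and the file name `sample.csv` are assumptions for illustration. The `newline=''` argument is what the `csv` documentation recommends when opening files for this module.

```python
import csv
import os
import tempfile

rows = [
    ['Area', 'Year', 'Temperature change (celsius)'],
    ['Afghanistan', '1961', '-0.113'],
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.csv')  # hypothetical file name

    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(rows[0])    # write a single row (the header)
        writer.writerows(rows[1:])  # write all remaining rows at once

    # read the file back to confirm what was written
    with open(path, newline='') as f:
        rows_back = list(csv.reader(f))

print(rows_back == rows)  # True
```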

While the `csv` module provides basic functionality for working with CSV files, it interprets all elements as strings. For more flexibility and convenience, we can leverage other packages such as NumPy and Pandas.
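Because every element arrives as a string, numeric columns have to be converted by hand when using the `csv` module. A minimal sketch, using an in-memory `io.StringIO` object holding the first rows shown above in place of the real file:

```python
import csv
import io

# the first few rows of the data, held in memory for illustration
f = io.StringIO(
    'Area,Year,Temperature change (celsius)\n'
    'Afghanistan,1961,-0.113\n'
    'Afghanistan,1962,-0.164\n'
)

reader = csv.reader(f)
header = next(reader)  # read the header row separately

years = []
changes = []
for area, year, change in reader:
    years.append(int(year))        # '1961' -> 1961
    changes.append(float(change))  # '-0.113' -> -0.113

print(years)    # [1961, 1962]
print(changes)  # [-0.113, -0.164]
```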

## 2.2. Using NumPy Package

When working with CSV files that only contain numeric data (no text), we can use the same functions that we used for reading and writing TXT files. However, we need to specify `delimiter = ','` to indicate that the file is in CSV format, as shown in the syntax below:

```python
array = np.loadtxt(filename, delimiter=',', ...) # read a csv file and save it as an array
np.savetxt(filename, array, fmt='%.3f', header='', delimiter=',', ...) # save a variable array as a csv file
```

<div class="alert alert-block alert-warning"> <b>NOTE!</b> The extension for <code>filename</code> should be <code>.csv</code>.</div>

If we attempt to use `np.loadtxt('resources/FAOSTAT.csv', delimiter=',')`, it will raise a `ValueError` because the file includes text (area names), which cannot be converted to float. If the file contains text, we can use additional arguments to skip rows and/or columns that include text and only read the numeric data. Specifically:

```python
array = np.loadtxt(filename, delimiter=',', skiprows=0, usecols=None, ...)
```

where:
* `skiprows`: Skip a certain number of rows. For example, `skiprows = 2` will skip the first two rows. The default, 0, results in no rows being skipped.
* `usecols`: Specify which column(s) to read, with 0 being the first. For example, `usecols = (1,4)` will extract the 2nd and 5th columns. The default, None, results in all columns being read.

The above function takes several optional arguments, which you can read about in the [function's documentation](https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html).

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Read the last two columns of the file <code>'resources/FAOSTAT.csv'</code> using <code>np.loadtxt()</code>. You will have to skip the first row, which includes the header. Save the data as an array <code>data</code> then print its values.</div>  

In [37]:
import numpy as np

data = np.loadtxt('resources/FAOSTAT.csv', skiprows = 1, usecols=(1, 2), delimiter = ',')
print(data)

[[ 1.961e+03 -1.130e-01]
 [ 1.962e+03 -1.640e-01]
 [ 1.963e+03  8.470e-01]
 ...
 [ 2.020e+03  3.890e-01]
 [ 2.021e+03 -1.250e-01]
 [ 2.022e+03 -4.900e-01]]


<div class="alert alert-block alert-warning"> <b>NOTE!</b> Be careful when using <code>np.loadtxt()</code>. If, for example, an area has the following name: <code>China, Hong Kong SAR</code>, then the <code>,</code> will be interpreted as a column separator and will create problems.</div>
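The comma problem in the note can be illustrated by comparing a naive split on commas with the `csv` module, which treats a quoted field as a single value. This is a sketch under the assumption that the area name is enclosed in double quotes in the file; the row and its temperature value are made up for illustration.

```python
import csv
import io

# hypothetical row whose first field contains a comma inside quotes
line = '"China, Hong Kong SAR",1961,0.1'

# splitting on every comma breaks the area name into two pieces
naive = line.split(',')

# csv.reader keeps the quoted field together
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive))  # 4
print(parsed)      # ['China, Hong Kong SAR', '1961', '0.1']
```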

Similarly, we can save a NumPy array as a CSV file using `np.savetxt(filename, array, fmt='%.3f', header='', delimiter=',', ...)`. 

Let's use $y = \begin{pmatrix} 1.2 & 2.2 & 3 \\ 4.14 & 5.65 & 7.42 \\ \end{pmatrix}$ as an example.

<div class="alert alert-block alert-info"> <b>TRY IT!</b> Create a new array for <code>y</code> then save it as a file <code>'resources/array_y.csv'</code> with 2 decimal places. Add a header that says 'Col1,Col2,Col3'.</div> 

In [39]:
y = [[1.2, 2.2, 3], [4.14, 5.65, 7.42]]
np.savetxt('resources/array_y.csv', y, delimiter=',', fmt='%.2f', header='Col1,Col2,Col3')

# 3. Additional Reading <a id="s3"></a>

## 3.1. Opening and Closing Files

When working with TXT and CSV files in Python, it's crucial to open and close files properly to ensure data integrity and avoid resource leaks. There are two primary methods to open files: using the `with` statement and using the `open()` function directly.

In all of the previous examples, we opened files using the following syntax:

```python
with open(filename, mode) as f:
    # perform any operations
```

This is the preferred method to open files in Python, as it ensures that the file is automatically closed once all operations within the block are completed, even if an error occurs during file processing.

Alternatively, files can be opened using the `open()` function directly (without using `with`). However, this method requires explicitly closing the file using the `close()` method. Closing the file is essential to ensure that all resources are released and any written data is saved.

```python
f = open(filename, mode)
# perform any operations
f.close()
```

The explicit approach can introduce bugs if not handled properly: if `close()` is forgotten, or an error occurs before it runs, the file stays open and written data may not be saved. For clean, readable, and reliable file handling, always use the `with` statement.
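For comparison, here is a sketch of what careful explicit closing looks like: a `try`/`finally` block guarantees that `close()` runs even if an operation fails, which is roughly what the `with` statement does for us. The temporary folder and the file name `demo.txt` are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

    f = open(path, 'w')
    try:
        # perform any operations
        f.write('some text')
    finally:
        # runs whether or not the operations above raised an error
        f.close()

    is_closed = f.closed

print(is_closed)  # True
```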


## 3.2. CSV Files Using Pandas Package

While NumPy is convenient for handling CSV files containing only numeric data, another powerful tool for working with CSV files, especially those with mixed data types, is the Pandas package. Pandas is a widely used data science library in Python for data manipulation and analysis. To use Pandas, we need to import it first, typically using the alias `pd`: `import pandas as pd`.

One advantage of using Pandas is its ability to handle data of any type, including strings, floats, integers, etc. To read a CSV file using Pandas, you can use the `pd.read_csv()` function:

```python
df = pd.read_csv(filename)
```

The `pd.read_csv()` function reads a CSV file named `filename` (which should be a string including the file extension, e.g., `'example.csv'`) and assigns it to a Pandas DataFrame `df`.

<div class="alert alert-block alert-danger"> <b>TRY IT!</b> Read the file <code>'resources/FAOSTAT.csv'</code> using <code>pd.read_csv()</code> and save it as <code>df</code>. Show the first few rows of <code>df</code> using <code>df.head()</code>.</div>  

In [None]:
import pandas as pd

df = pd.read_csv('resources/FAOSTAT.csv')

df.head()

To write a CSV file using Pandas, we can use the `df.to_csv()` method:

```python
df.to_csv(filename)
```

where:
* `df`: variable name containing the Pandas DataFrame
* `filename`: a string specifying the file name or the path and name of the file to be saved enclosed in quotes: `'example.csv'`

The `to_csv()` function also accepts several optional arguments, which you can explore further in the [function's documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html).

<div class="alert alert-block alert-danger"> <b>TRY IT!</b> Create a list that includes in each row the name of a person and their favorite number. Then, convert it to a DataFrame using <code>pd.DataFrame()</code>. Finally, save it as a CSV file called <code>'resources/fav_num.csv'</code> using <code>.to_csv()</code>.</div> 

In [None]:
# initialize list of lists
data = [['John', 5], ['Doe', 15]]

# creating a data frame
df = pd.DataFrame(data, columns = ['Name', 'Number'])

# writing data frame to a CSV file
df.to_csv('resources/fav_num.csv')

## 3.3. Pickle Files

So far we have discussed how to store data using text and CSV files. In some cases, we want to store data while preserving their data type (tuples, lists, dictionaries, etc.) to use them later or send them to colleagues. In this case, we can use **pickle** files, which serialize objects so that they can be saved into a file and loaded again later. Pickle files usually have a **.pickle** or **.pkl** extension, and they are specific to Python.

In order to use pickle, we need to import it first: `import pickle`.

In general, to work with pickle files, you have to open the file using `open()` and specify `mode`, then perform any operations, and finally close the file. Alternatively, you can use a `with` statement, which closes the file automatically.

### 3.3.1. Writing Pickle Files

To write a pickle file, we can use the `pickle.dump()` function, which has the following syntax:

```python
with open(filename, mode='wb') as f:
    pickle.dump(data, f)
```

where:
* `filename`: a string including the file name or the path and name of the file to be saved between quotes: `'example.pickle'`
* `data`: object(s) that you want to save. If you have multiple objects, then `data` should be a list or tuple that includes the objects you want to save.

<div class="alert alert-block alert-danger"> <b>TRY IT!</b> Save the DataFrame <code>df</code> as a pickle file <code>'resources/fav_num.pickle'</code>. </div>  

In [None]:
import pickle

with open('resources/fav_num.pickle', mode='wb') as f:
    pickle.dump(df, f)

### 3.3.2. Reading Pickle Files

To read a pickle file, we can use the `pickle.load()` function, which has the following syntax:

```python
with open(filename, mode='rb') as f:
    data = pickle.load(f)
```

<div class="alert alert-block alert-danger"> <b>TRY IT!</b> Load the pickle file you saved above.</div>  

In [None]:
with open('resources/fav_num.pickle', mode='rb') as f:
    data = pickle.load(f)
    print(data.head())
    print(type(data))

Pickle files are specifically designed for Python and therefore cannot be easily read by other programming languages. In contrast, text and CSV files can be easily shared and opened in other programming languages (R, MATLAB, Java, and so on).

<div class="alert alert-block alert-warning"> <b>NOTE!</b> The <code>pickle</code> module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.</div>
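As mentioned in Section 3.3.1, several objects can be saved together by wrapping them in a tuple. The sketch below saves a string, a list, and a dictionary in one pickle file and loads them back with their types preserved. The object names, the file name `objects.pickle`, and the temporary folder are assumptions for illustration.

```python
import os
import pickle
import tempfile

# three objects of different types (made up for illustration)
name = 'John'
numbers = [5, 15]
settings = {'units': 'celsius'}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'objects.pickle')  # hypothetical file name

    # 'wb' = write in binary mode, which pickle requires
    with open(path, mode='wb') as f:
        pickle.dump((name, numbers, settings), f)

    # 'rb' = read in binary mode; unpack the tuple into separate variables
    with open(path, mode='rb') as f:
        loaded_name, loaded_numbers, loaded_settings = pickle.load(f)

print(type(loaded_numbers))   # <class 'list'>
print(type(loaded_settings))  # <class 'dict'>
```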

## 3.4. JSON Files

**JSON**, which stands for **JavaScript Object Notation**, is another file format. A JSON file usually ends with the extension **.json**. Unlike pickle, which is Python dependent, JSON is a language-independent data format, which makes it attractive to use. It also often takes less space on disk and can be faster to manipulate than pickle. Therefore, it is a good option for storing your data. If you are interested in reading more about JSON files, refer to the [textbook](https://pythonnumericalmethods.berkeley.edu/notebooks/chapter11.04-JSON-Files.html). 
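As a brief illustration, Python's built-in `json` module follows the same pattern as pickle, with `json.dump()` for writing and `json.load()` for reading, except that the file is opened in text mode. This is a minimal sketch; the dictionary contents, the file name `record.json`, and the temporary folder are assumptions for illustration.

```python
import json
import os
import tempfile

# a dictionary made up for illustration
record = {'name': 'John', 'favorite_numbers': [5, 15], 'active': True}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'record.json')  # hypothetical file name

    # write the dictionary as JSON text
    with open(path, 'w') as f:
        json.dump(record, f)

    # read it back into a Python dictionary
    with open(path) as f:
        loaded = json.load(f)

print(loaded == record)  # True
```

JSON supports only basic types such as dictionaries, lists, strings, numbers, booleans, and `None`, so objects like a Pandas DataFrame cannot be passed to `json.dump()` directly.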