# Introduction

Managing large datasets effectively is a vital skill in scientific computing, where data often grows too large to inspect or process by hand. From reading experimental measurements and analyzing extensive datasets to exporting results, Python offers a flexible and powerful set of tools for file handling. This section explores techniques for reading, processing, and writing files in commonly used formats, with physics-inspired examples to make the concepts practical and accessible.

## Loading and Saving CSV Files

CSV files are a widely used format for storing structured data in a tabular form, where rows represent individual records and columns represent different variables. The values are separated by commas, hence the name: Comma-Separated Values.

In practice, you may encounter CSV files when handling experimental measurements, simulation outputs, or datasets from online sources. Their simplicity and broad compatibility make them a popular choice for data storage.

Python provides two main tools for working with CSV files:

- the built-in csv module and
- the Pandas library.

### Using the csv Module

A module, simply put, is a file containing Python code, such as functions, classes, and variables, that other code can import. Any `.py` file can serve as a module. The built-in `csv` module, for example, provides functions and classes for reading and writing CSV files.
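As a small illustration of the idea, the sketch below imports the `csv` module and checks that it is a module object whose names are reached with dot notation. It only inspects the module and does not read or write any file.

```python
import csv

# A module is an ordinary object; its contents are accessed with a dot.
module_type = type(csv).__name__
print(module_type)            # module

# csv.reader and csv.writer are two of the callables the module provides.
print(callable(csv.reader))   # True
print(callable(csv.writer))   # True
```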

### Reading a CSV File

In [4]:
import csv # import the csv module

# open and read a CSV file
with open('data.csv', mode='r', newline='', encoding='utf-8-sig') as file:
    # mode='r' opens the file for reading.
    # newline='' lets the csv module handle line endings itself,
    # as the csv documentation recommends.
    # encoding='utf-8-sig' decodes the file as UTF-8 and drops a leading
    # byte-order mark (BOM), which some spreadsheet exports add.

    reader = csv.reader(file) # create a reader object
    for row in reader: # iterate over the rows in the file
        print(row)
        # print(row[0], row[1], row[2])

['Time', 'Position']
['0', '0']
['1', '5']
['2', '20']
['3', '45']
['4', '80']
['5', '100']


The `csv.reader()` function takes a file object and returns a reader object that you can loop over row by row. The example above reads a file named data.csv and prints its contents. Each row arrives as a list of strings, so numeric values such as `'5'` must be converted with `int()` or `float()` before you do any arithmetic with them. Unlike a list of lists, the reader can only be iterated once per open file.

To read a CSV file in Python, follow these steps:
1. Open the file using the `open()` function in the appropriate mode (e.g., 'r' for reading), and use the `with` statement to ensure the file is automatically closed after use.
2. Create a reader object using the `csv.reader()` function to process the file’s contents.
3. Iterate over the reader object to access the data row by row.
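The three steps above can be sketched as follows. This is a minimal sketch under a few assumptions: the file name `sample_data.csv` and the temporary directory are hypothetical stand-ins so the example runs on its own, and the header row is skipped with `next()` before the numeric values are converted to `float`.

```python
import csv
import os
import tempfile

# Create a small stand-in file (hypothetical name) so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample_data.csv")
with open(path, mode="w", newline="", encoding="utf-8") as file:
    file.write("Time,Position\n0,0\n1,5\n2,20\n")

times, positions = [], []

# Step 1: open the file inside a with statement.
with open(path, mode="r", newline="", encoding="utf-8-sig") as file:
    # Step 2: create the reader object.
    reader = csv.reader(file)
    header = next(reader)  # consume the header row before the data
    # Step 3: iterate over the remaining rows.
    for row in reader:
        # Every field is a string, so convert explicitly.
        times.append(float(row[0]))
        positions.append(float(row[1]))

print(header)     # ['Time', 'Position']
print(times)      # [0.0, 1.0, 2.0]
print(positions)  # [0.0, 5.0, 20.0]
```

Keeping the columns in separate lists makes them easy to pass to plotting or fitting routines later.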

The `with` statement is strongly recommended when working with files, because it closes the file as soon as the code block finishes, even if an error occurs inside the block. Without it, the file may remain open, which can leak file handles, leave written data unflushed, or keep the file locked on some systems. Using the `with` statement is a best practice for file handling in Python.
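To make the behaviour concrete, the sketch below compares `with` against a roughly equivalent `try`/`finally` block and checks the file's `closed` attribute afterwards. The scratch file `demo.txt` is a hypothetical name used only for this illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical scratch file

# With a with statement, the file is closed when the block exits.
with open(path, mode="w", encoding="utf-8") as file:
    file.write("0,0\n")
closed_after_with = file.closed

# Roughly equivalent manual version: close() must be called explicitly,
# and finally guarantees it runs even if reading raises an exception.
file = open(path, mode="r", encoding="utf-8")
try:
    first_line = file.readline()
finally:
    file.close()
closed_after_finally = file.closed

print(closed_after_with, closed_after_finally)  # True True
```

The `with` form is shorter and harder to get wrong, which is why it is preferred.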

This approach allows you to handle simple CSV files without needing external libraries. However, for more advanced operations, the Pandas library provides a more powerful and flexible option.

### Writing to a CSV File

You can write data to a CSV file with `csv.writer()`, which returns a writer object. The writer provides `writerow()` for writing a single row and `writerows()` for writing several rows at once. For example, to save a list of experimental measurements, put a header row of column names first, follow it with the data rows, and write everything to a file such as output.csv.

In [7]:
import csv
new_data = [["Time", "Position"], [0, 0], [1, 5], [2, 20]] # data to write to the file

with open("output.csv", mode="w", newline="") as file: # open the file
    csv_writer = csv.writer(file) # create a writer object
    csv_writer.writerows(new_data) # write the data to the file

In the example above, `csv.writer()` takes the file object; it also accepts an optional `delimiter` argument (a comma by default) that sets the character placed between values. Opening the file with `newline=""` prevents extra blank lines from appearing between rows on some platforms. Because `new_data` already starts with the header row, a single `writerows()` call writes the whole table.
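The sketch below shows `writerow()` and `writerows()` used together, with a semicolon as an example `delimiter`. As an assumption for illustration, it writes into an in-memory `io.StringIO` buffer instead of a file on disk so that the resulting text can be inspected directly.

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for a file opened with newline=""
writer = csv.writer(buffer, delimiter=";")  # semicolon instead of the default comma

writer.writerow(["Time", "Position"])        # one row: the header
writer.writerows([[0, 0], [1, 5], [2, 20]])  # several rows: the measurements

text = buffer.getvalue()
print(repr(text))  # 'Time;Position\r\n0;0\r\n1;5\r\n2;20\r\n'
```

The `\r\n` at the end of each row is the writer's default line terminator, which is why the file itself should be opened with `newline=""`.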