# Programming with Python

## Lecture 29: File I/O

### Armen Gabrielyan

#### Yerevan State University
#### Portmind

# `os.path` module

The `os.path` module in Python provides functions for working with file paths and manipulating path-related strings.

- `os.path.join(path, *paths)`: This function joins one or more path components intelligently. It concatenates the specified paths using the appropriate path separator for the underlying operating system.
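- `os.path.abspath(path)`: This function returns a normalized, absolute version of a path. It prepends the current working directory to a relative path and collapses references to the current and parent directories (`.` and `..`). It does not resolve symbolic links; use `os.path.realpath` for that.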

For more information, see https://docs.python.org/3/library/os.path.html

## `os.path.join(path, *paths)`

In [None]:
import os

sample_input_path = os.path.join("resources", "lecture29", "sample_input.txt")
sample_input_path

In [None]:
import os

resources_path = os.path.join("resources")
lecture_29_path = os.path.join(resources_path, "lecture29")

sample_input_path = os.path.join(lecture_29_path, "sample_input.txt")
sample_input_path

## `os.path.abspath(path)`

In [None]:
import os

os.path.abspath("resources")

In [None]:
os.path.abspath(sample_input_path)
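As a minimal sketch of what `os.path.abspath` does: on most platforms it behaves like normalizing the path joined onto the current working directory. The relative path below is only illustrative and does not need to exist on disk.

```python
import os

# An illustrative relative path containing a ".." component
relative_path = os.path.join("resources", "lecture29", "..", "lecture29", "sample_input.txt")

# abspath prepends the current working directory and collapses "." and ".."
absolute_path = os.path.abspath(relative_path)

# The same result, spelled out step by step
manual_path = os.path.normpath(os.path.join(os.getcwd(), relative_path))

print(absolute_path)
print(absolute_path == manual_path)
print(os.path.isabs(absolute_path))
```

Neither call touches the file system, which is why symbolic links are left unresolved.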

# `pathlib` module

The `pathlib` module in Python provides an object-oriented approach to working with file system paths. It was introduced in Python 3.4 and offers a more intuitive and expressive way to manipulate paths compared to the traditional string-based path operations.

The `pathlib` module provides the `Path` class, which represents a file or directory path. Nevertheless, the `os.path` module is still widely used.

In [None]:
from pathlib import Path

resources_path = Path("resources")
lecture_29_path = resources_path / "lecture29"

sample_input_path = lecture_29_path / "sample_input.txt"
sample_input_path
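Because a `Path` is an object rather than a plain string, its components are available as attributes. The following sketch builds the same path as above and inspects it; only `exists()` consults the file system, so the other lines work even if the file is missing.

```python
from pathlib import Path

sample_input_path = Path("resources") / "lecture29" / "sample_input.txt"

print(sample_input_path.name)      # file name with extension
print(sample_input_path.stem)      # file name without extension
print(sample_input_path.suffix)    # extension, including the dot
print(sample_input_path.parent)    # containing directory
print(sample_input_path.exists())  # whether the path exists on disk
```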

# Write to a file

There are several methods you can use to write to a file in Python. Here are some common approaches:

- `.write(string)` method
- `.writelines(seq)` method

# `.write(string)` method

This method writes the provided string to a file.

In [None]:
output_path = lecture_29_path / "output.txt"

In [None]:
with open(output_path, "w") as fp:
    fp.write("Hello world!\n")

In [None]:
with open(output_path, "w") as fp:
    fp.write("We are learning Python!\n")

# `.writelines(seq)` method

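This method writes each string in the provided sequence to a file. It does not add line separators, so every string must end with `\n` if it should occupy its own line.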

In [None]:
lorem_ipsum = [
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit.\n",
    "Mauris orci magna, ullamcorper eget accumsan ut, varius et justo.\n",
    "Duis ultricies eleifend magna sagittis finibus.\n",
]

with open(output_path, "w") as fp:
    fp.writelines(lorem_ipsum)
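Despite its name, `.writelines` does not insert newline characters. The sketch below illustrates this with a temporary directory, so it does not touch the lecture files; the names `words` and `demo_path` are made up for the example.

```python
import os
import tempfile

words = ["alpha", "beta", "gamma"]

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "writelines_demo.txt")

    # No separators are added: the strings are written back to back
    with open(demo_path, "w") as fp:
        fp.writelines(words)
    with open(demo_path, "r") as fp:
        joined = fp.read()

    # Add the newline characters explicitly to get one item per line
    with open(demo_path, "w") as fp:
        fp.writelines(word + "\n" for word in words)
    with open(demo_path, "r") as fp:
        lines = fp.readlines()

print(repr(joined))
print(lines)
```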

# Different modes of file operations

- **`r` (read only)**: Opens the file for reading only. The file position starts at the beginning of the file. If the file doesn't exist, a `FileNotFoundError` is raised.
- **`r+` (read and write)**: Opens the file for both reading and writing without erasing its contents. The file position starts at the beginning of the file. If the file doesn't exist, a `FileNotFoundError` is raised.
- **`w` (write only)**: Opens the file for writing only. If the file already exists, its contents are erased; otherwise a new file is created. The file position starts at the beginning of the file.
- **`w+` (write and read)**: Opens the file for both writing and reading. If the file already exists, its contents are erased; otherwise a new file is created. The file position starts at the beginning of the file.
- **`a` (append only)**: Opens the file for writing only. If the file does not exist, a new file is created. The file position starts at the end of the file, and all newly written data is appended after the existing data.
- **`a+` (append and read)**: Opens the file for both reading and writing. If the file does not exist, a new file is created. The file position starts at the end of the file, so call `fp.seek(0)` before reading the existing content. All newly written data is appended after the existing data.
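The differences between these modes can be sketched on a throwaway file. The example below uses a temporary directory and a hypothetical file name, `modes_demo.txt`, so it leaves the lecture resources untouched.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "modes_demo.txt")

    # "r" on a missing file raises FileNotFoundError
    try:
        open(demo_path, "r")
        missing_error = None
    except FileNotFoundError as error:
        missing_error = error

    # "w" creates the file, and erases its contents on every later open
    with open(demo_path, "w") as fp:
        fp.write("first\n")
    with open(demo_path, "w") as fp:
        fp.write("second\n")

    # "a" keeps the existing content and writes at the end
    with open(demo_path, "a") as fp:
        fp.write("third\n")

    # "r+" does not erase anything; writing starts at position 0
    with open(demo_path, "r+") as fp:
        fp.write("SECOND")

    with open(demo_path, "r") as fp:
        final_content = fp.read()

print(type(missing_error).__name__)
print(repr(final_content))
```

The final content shows all three effects at once: `"first\n"` was erased by the second `"w"`, `"third\n"` was appended, and `"r+"` overwrote the first six characters in place.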

In [None]:
# Reading the entire file
with open(sample_input_path, "r") as fp:
    content = fp.read()
    print(content)

In [None]:
non_existent_path = lecture_29_path / "non_existent_file.txt"

# Trying to read from a non-existent file raises FileNotFoundError
with open(non_existent_path, "r") as fp:
    content = fp.read()
    print(content)

In [None]:
# Reading the entire file, then writing at the current position (the end of the file)
with open(sample_input_path, "r+") as fp:
    content = fp.read()
    print(content)
    
    fp.write("Hello world!\n")

In [None]:
# Appending to the end of the file
with open(output_path, "a") as fp:
    fp.write("Hello world!\n")

# Byte mode

To open a file in byte mode, add the `"b"` character to the mode string (for example, `"rb"` or `"wb"`). In byte mode, reads return `bytes` objects and writes require them.

In [None]:
# Reading the entire file in byte mode
with open(sample_input_path, "rb") as fp:
    content = fp.read()
    print(content)
    print(type(content))

In [None]:
output_bytes_path = lecture_29_path / "output_bytes.txt"

# Write to the file in byte mode
with open(output_bytes_path, "wb") as fp:
    fp.write(b"Hello world!\n")

Byte mode is particularly useful for interacting with binary data, such as images.

In [None]:
cat_image_path = lecture_29_path / "cat.jpeg"

# Reading image file in byte mode
with open(cat_image_path, "rb") as fp:
    content = fp.read(10)
    print(content)
    print(type(content))
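Text mode and byte mode are two views of the same file: text mode encodes and decodes strings for you, while byte mode exposes the raw bytes. The following sketch assumes UTF-8 encoding and uses a temporary file with a made-up name.

```python
import os
import tempfile

text = "naïve café"

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "bytes_demo.txt")

    # Text mode: the string is encoded to bytes on the way to disk
    with open(demo_path, "w", encoding="utf-8") as fp:
        fp.write(text)

    # Byte mode: the raw bytes are returned without any decoding
    with open(demo_path, "rb") as fp:
        raw = fp.read()

# Decoding the bytes by hand recovers the original string
decoded = raw.decode("utf-8")

print(raw)
print(len(text), len(raw))
print(decoded == text)
```

The byte count is larger than the character count because each accented character takes more than one byte in UTF-8.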

# JavaScript Object Notation

**J**ava**S**cript **O**bject **N**otation (**JSON**) is a lightweight data interchange format that is easy for humans to read and write and also easy for machines to parse and generate. JSON is widely used to transmit data between computer systems.

JSON is based on key-value pairs and supports a variety of data types, including strings, numbers, booleans, arrays, and objects. It has a syntax similar to JavaScript object literals.

JSON is widely supported in various programming languages and is used for data storage, configuration files, and API responses, among other things. It provides a standardized format for data exchange and is often preferred due to its simplicity and readability.

Here's an example of a simple JSON object:

```json
{
  "name": "John Doe",
  "age": 30,
  "isStudent": false,
  "hobbies": ["reading", "traveling", "photography"],
  "address": {
    "street": "123 Main St",
    "city": "Exampleville",
    "country": "USA"
  }
}
```

# `json` module

The built-in `json` module provides functions for working with JSON data.

The most commonly used functions are:

- `json.dumps(obj, **)`
- `json.dump(obj, fp, **)`
- `json.loads(s, **)`
- `json.load(fp, **)`

In [None]:
import json

# JSON serialization

**JSON serialization** refers to the process of converting data from a programming language's native data structures into JSON format.

# `json.dumps(obj, **)` function

This function serializes the `obj` into a JSON formatted string.

In [None]:
data = {
  "name": "John Doe",
  "age": 30,
  "isStudent": False,
  "hobbies": ["reading", "traveling", "photography"],
  "address": {
    "street": "123 Main St",
    "city": "Exampleville",
    "country": "USA"
  }
}

json.dumps(data)

In [None]:
json.dumps(data, indent=4)

In [None]:
data_path = lecture_29_path / "data.json"

In [None]:
with open(data_path, "w") as fp:
    json_data = json.dumps(data)
    fp.write(json_data)

In [None]:
with open(data_path, "w") as fp:
    json_data = json.dumps(data, indent=4)
    fp.write(json_data)

# `json.dump(obj, fp, **)` function

This function serializes the `obj` into a JSON formatted string and writes it to the file-like object `fp`.

In [None]:
data = {
  "name": "John Doe",
  "age": 30,
  "isStudent": False,
  "hobbies": ["reading", "traveling", "photography"],
  "address": {
    "street": "123 Main St",
    "city": "Exampleville",
    "country": "USA"
  }
}

In [None]:
with open(data_path, "w") as fp:
    json.dump(data, fp)

In [None]:
with open(data_path, "w") as fp:
    json.dump(data, fp, indent=4)
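The `fp` argument does not have to be a file on disk; any object with a `.write()` method works. The sketch below writes into an in-memory `io.StringIO` buffer and also shows two optional keyword arguments, `sort_keys` and `ensure_ascii`. The `record` dictionary is made up for illustration.

```python
import io
import json

record = {"name": "Zoë", "city": "Yerevan", "age": 30}

# StringIO behaves like a text file that lives in memory
buffer = io.StringIO()
json.dump(record, buffer, sort_keys=True, ensure_ascii=False)
sorted_text = buffer.getvalue()

# With the defaults, non-ASCII characters are escaped and insertion order is kept
default_text = json.dumps(record)

print(sorted_text)
print(default_text)
```

Both strings describe the same data; they differ only in key order and in how the non-ASCII character is written.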

# JSON deserialization

**JSON deserialization** refers to the process of converting a JSON string into its corresponding data structures in a programming language.

# `json.loads(s, **)`

This function deserializes a JSON string `s` into a Python object.

In [None]:
json_data = '{"name": "Jane Smith", "age": 25, "city": "San Francisco"}'

obj = json.loads(json_data)
print(obj)
print(type(obj))

In [None]:
with open(data_path, "r") as fp:
    json_data = fp.read()
    obj = json.loads(json_data)
    print(obj)
    print(type(obj))

# `json.load(fp, **)`

This function deserializes a JSON string read from the file-like object `fp` into a Python object.

In [None]:
with open(data_path, "r") as fp:
    obj = json.load(fp)
    print(obj)
    print(type(obj))
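Serializing and then deserializing does not always give back an identical Python object, because JSON has fewer data types than Python. The following sketch, with a made-up `original` dictionary, shows the most common differences.

```python
import json

original = {
    "point": (1, 2),      # tuples become JSON arrays
    1: "integer key",     # keys are always strings in JSON
    "missing": None,      # None becomes null
    "isStudent": False,   # False becomes false
}

json_text = json.dumps(original)
restored = json.loads(json_text)

print(json_text)
print(restored)
print(restored == original)
```

After the round trip, the tuple comes back as a list and the integer key comes back as the string `"1"`, so the two dictionaries compare as unequal.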

# Comma Separated Values (CSV)

**CSV (Comma Separated Values)** files are a common file format used for storing tabular data. They provide a way to represent structured data in a plain text format, where each line of the file represents a row of data and the values within each row are separated by commas (or other specified delimiters).

In a CSV file, the first line often contains the headers or field names, which describe the data in each column. The subsequent lines contain the actual data values, with each value corresponding to a specific column.

For example:

```csv
Name,Age,City
John,25,London
Alice,30,New York
Bob,35,Paris
```

CSV files are widely used for data interchange between different applications and systems. They are relatively simple and can be easily read and written by various programming languages and spreadsheet software. CSV files are often used in data analysis, data import/export, and database operations.

It's important to note that while the name suggests "comma-separated values," other delimiters like tabs, semicolons, or pipe symbols can also be used, depending on the specific requirements or conventions of the data.

# `csv` module

The `csv` module in Python provides functionality for reading and writing CSV (Comma Separated Values) files. It simplifies the process of working with tabular data stored in CSV format.

In [None]:
import csv

# Reading a CSV file

The `csv.reader` function can be used to read data from a CSV file. It yields each row as a list of strings.

In [None]:
data_csv_path = lecture_29_path / "data.csv"

In [None]:
with open(data_csv_path, "r", newline="") as fp:  # newline="" is recommended by the csv docs
    reader = csv.reader(fp)
    for row in reader:
        print(row)

In [None]:
with open(data_csv_path, "r", newline="") as fp:
    reader = csv.reader(fp)
    for index, row in enumerate(reader):
        if index == 0:
            print(f"Headers: {row}")
        else:
            print(f"Row #{index}: {row}")
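As noted earlier, delimiters other than commas are also common. The sketch below reads semicolon-separated data by passing the `delimiter` argument to `csv.reader`. It uses an in-memory string instead of a file, and the sample text is made up for the example.

```python
import csv
import io

semicolon_text = "Name;Age;City\nJohn;25;London\nAlice;30;New York\n"

# io.StringIO lets csv.reader treat the string like an open text file
reader = csv.reader(io.StringIO(semicolon_text), delimiter=";")
rows = list(reader)

print(rows)
```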

# Reading a CSV file into a dictionary

The `csv.DictReader` class can be used to read each row of a CSV file into a dictionary. By default, the values in the first row are used as the keys.

In [None]:
with open(data_csv_path, "r", newline="") as fp:
    reader = csv.DictReader(fp)
    for row in reader:
        print(row["Name"], row["Age"], row["City"])
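The `csv` module does not guess data types: every value comes back as a string, including numbers. The following sketch, using the sample data from above as an in-memory string, converts the `Age` column explicitly before doing arithmetic.

```python
import csv
import io

csv_text = "Name,Age,City\nJohn,25,London\nAlice,30,New York\nBob,35,Paris\n"

reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

print(reader.fieldnames)
print(type(rows[0]["Age"]))

# Convert the strings to integers before computing with them
ages = [int(row["Age"]) for row in rows]
average_age = sum(ages) / len(ages)

print(average_age)
```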

# Writing to a CSV file

The `csv.writer` function can be used to write data to a CSV file.

In [None]:
output_csv_path = lecture_29_path / "output.csv"

In [None]:
data = [
    ['Name', 'Age', 'City'],
    ['John', '25', 'London'],
    ['Alice', '30', 'New York'],
    ['Bob', '35', 'Paris']
]

# newline="" prevents extra blank lines between rows on Windows
with open(output_csv_path, "w", newline="") as fp:
    writer = csv.writer(fp)
    writer.writerows(data)

# Writing to a CSV file from a dictionary

The `csv.DictWriter` class can be used to write a sequence of dictionaries to a CSV file. The `fieldnames` argument sets the column order.

In [None]:
data = [
    {"Name": "John", "Age": "25", "City": "London"},
    {"Name": "Alice", "Age": "30", "City": "New York"},
    {"Name": "Bob", "Age": "35", "City": "Paris"}
]

fieldnames = ["Name", "Age", "City"]

with open(output_csv_path, "w", newline="") as file:
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(data)
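One reason to use the `csv` module instead of joining strings with commas is that it quotes values for you. The sketch below writes fields that contain a comma and double quotes to an in-memory buffer and reads them back; the sample rows are made up for illustration.

```python
import csv
import io

rows = [
    ["Name", "Quote"],
    ["Alice", "Hello, world"],
    ["Bob", 'She said "hi"'],
]

# newline="" keeps the line endings exactly as the writer produces them
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerows(rows)
csv_text = buffer.getvalue()

print(csv_text)

# Reading the text back recovers the original values
restored = list(csv.reader(io.StringIO(csv_text, newline="")))
print(restored == rows)
```

Fields containing the delimiter or a quote character are wrapped in double quotes, and embedded quotes are doubled.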

# Further learning

- Further Python
    - Package management: https://packaging.python.org/en/latest/tutorials/installing-packages/
    - Deep dive into OOP
    - Type hints, types, type checking
    - Asynchronous programming
    - etc.
    - Book: Luciano Ramalho, Fluent Python, 2nd edition, O'Reilly Media, 2022
- Scientific computing and data analysis libraries
    - NumPy: https://numpy.org/
    - SciPy: https://scipy.org/
    - Pandas: https://pandas.pydata.org/
    - Polars (alternative dataframe library): https://www.pola.rs/
    - Matplotlib: https://matplotlib.org/
- Web
    - WSGI: https://peps.python.org/pep-3333/
    - Gunicorn: https://gunicorn.org/
    - Flask: https://flask.palletsprojects.com/en/2.2.x/
    - ASGI: https://asgi.readthedocs.io/en/latest/index.html
    - Uvicorn: http://www.uvicorn.org/
    - FastAPI: https://fastapi.tiangolo.com/

# Virtual environment

1. Create a virtual environment

```bash
python -m venv .venv
```

2. Activate the environment (the command below is for Linux and macOS; on Windows, run `.venv\Scripts\activate` instead)

```bash
source .venv/bin/activate
```

# NumPy basic examples

Install NumPy

```bash
pip install numpy
```

In [None]:
!pip install numpy

In [None]:
import numpy as np

In [None]:
# One-dimensional array

a = np.array([1, 2, 3, 4, 5])

print(a)
print(type(a))

In [None]:
# Multi-dimensional array

b = np.array([[1, 2, 3], [4, 5, 6]])

print(b)

In [None]:
# Accessing array elements

print(a[0])
print(b[1, 2])

In [None]:
# Array shape and size

print(a.shape)
print(b.shape)
print(a.size)

In [None]:
# Array arithmetic

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

In [None]:
# Element-wise addition

result = x + y

print(result)

In [None]:
# Element-wise multiplication

result = x * y

print(result)

In [None]:
# Dot product

dot_product = np.dot(x, y)

print(dot_product)

# SciPy basic examples

Install SciPy

```bash
pip install scipy
```

In [None]:
!pip install scipy

In [None]:
# Numerical Integration

from scipy import integrate

# Define the function to integrate
def f(x):
    return x**2

# Integrate the function from 0 to 1
result, error = integrate.quad(f, 0, 1)
print("Result:", result)
print("Error:", error)

In [None]:
# Linear Algebra

import numpy as np
from scipy import linalg

# Create a 2x2 matrix
A = np.array([[1, 2], [3, 4]])

# Compute the inverse of the matrix
inv_A = linalg.inv(A)
print("Inverse of A:")
print(inv_A)

# Compute the determinant of the matrix
det_A = linalg.det(A)
print("Determinant of A:", det_A)

In [None]:
# Optimization

from scipy import optimize

# Define the objective function to minimize
def f(x):
    return (x - 1) ** 2

# Minimize the function
result = optimize.minimize(f, 0)
print("Minimum value:", result.fun)
print("Minimum point:", result.x)

# Pandas basic examples

Install Pandas

```bash
pip install pandas
```

In [None]:
!pip install pandas

In [None]:
import pandas as pd

In [None]:
# Creating a DataFrame

data = {'Name': ['John', 'Emma', 'Mike', 'Sophia'],
        'Age': [25, 28, 22, 31],
        'City': ['New York', 'San Francisco', 'London', 'Paris']}

df = pd.DataFrame(data)
df

In [None]:
# Selecting a column

df['Name']

In [None]:
# Selecting multiple columns

df[['Name', 'City']]

In [None]:
# Checking conditions

df['Age'] > 25

In [None]:
# Filtering rows based on conditions

df[df['Age'] > 25]

In [None]:
# Performing aggregations

df['Age'].mean(), df['Age'].median()

In [None]:
df_path = lecture_29_path / "df.csv"

In [None]:
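# Save a dataframe to a file
# index=False skips the row index, so no extra "Unnamed: 0" column appears when reading back

df.to_csv(df_path, index=False)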

In [None]:
# Read a dataframe from a file

df = pd.read_csv(df_path)
df

# Matplotlib basic examples

Install Matplotlib

```bash
pip install matplotlib
```

In [None]:
!pip install matplotlib

In [None]:
# Line Plot

import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create a figure and axis
fig, ax = plt.subplots()

# Plot the data
ax.plot(x, y)

# Set labels and title
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_title('Line Plot')

# Display the plot
plt.show()

In [None]:
# Scatter Plot

import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create a figure and axis
fig, ax = plt.subplots()

# Plot the data
ax.scatter(x, y)

# Set labels and title
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_title('Scatter Plot')

# Display the plot
plt.show()

In [None]:
# Bar Chart

import matplotlib.pyplot as plt

# Sample data
categories = ['A', 'B', 'C', 'D']
values = [10, 15, 7, 12]

# Create a figure and axis
fig, ax = plt.subplots()

# Plot the data
ax.bar(categories, values)

# Set labels and title
ax.set_xlabel('Categories')
ax.set_ylabel('Values')
ax.set_title('Bar Chart')

# Display the plot
plt.show()

In [None]:
# Histogram

import matplotlib.pyplot as plt
import numpy as np

# Generate random data
data = np.random.randn(1000)

# Create a figure and axis
fig, ax = plt.subplots()

# Plot the histogram
ax.hist(data, bins=30)

# Set labels and title
ax.set_xlabel('Value')
ax.set_ylabel('Frequency')
ax.set_title('Histogram')

# Display the plot
plt.show()