## 1. Working with the file system (`os`, `os.path`)

### `os`

The `os` module contains functions to get information on local directories, files, processes, and environment variables.

`os.getcwd()` - returns the current working directory

In [40]:
import os
current_path = os.getcwd()
print(current_path)

/Users/iulia/PycharmProjects/python-training-mar2025/docs


`os.listdir(path)` - returns a list of all the entries in the directory given by `path`

In [41]:
os.listdir(current_path)

['my_module.py',
 '09. More data structures.ipynb',
 'images',
 '01. Introduction.ipynb',
 'output.json',
 '13. Working with databases.ipynb',
 'users.json',
 '__pycache__',
 '12. Working with different data formats.ipynb',
 'output.csv',
 'data.csv',
 '14. Decorators.ipynb',
 'data.json',
 '07. Modules.ipynb',
 'books.csv',
 'clients.json',
 '06. Methods on known data types.ipynb',
 '15. Object-Oriented Programming.ipynb',
 '02. Python basics.ipynb',
 'file_example_out.txt',
 '.ipynb_checkpoints',
 '08. Introduction to File Handling.ipynb',
 '04. Lists and Tuples.ipynb',
 'transactions.csv',
 '11. Introduction to Object-Oriented Programming.ipynb',
 'ceva',
 '05. Functions.ipynb',
 '10. Exception Handling.ipynb',
 '99. Project.ipynb',
 'file_example_in.txt',
 '03. Control Flow.ipynb']

`os.mkdir(path)` - creates a directory

`os.makedirs(path)` - creates a directory recursively, including any missing intermediate directories

In [42]:
os.mkdir('testdir')
assert 'testdir' in os.listdir(current_path)
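
The difference between `os.mkdir()` and `os.makedirs()` shows up when a parent directory is missing. The following is a minimal sketch that works inside a temporary directory; the names `outer` and `inner` are made up for illustration.

```python
import os
import tempfile

# Work inside a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, "outer", "inner")  # hypothetical names

    # os.mkdir() creates only the last component, so it fails
    # because the parent "outer" does not exist yet.
    try:
        os.mkdir(nested)
        mkdir_failed = False
    except FileNotFoundError:
        mkdir_failed = True

    # os.makedirs() creates "outer" first and then "inner".
    os.makedirs(nested)
    # exist_ok=True turns a repeated call into a no-op
    # instead of raising FileExistsError.
    os.makedirs(nested, exist_ok=True)

    nested_exists = os.path.isdir(nested)

print(mkdir_failed, nested_exists)
```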

`os.chdir(path)` - changes the current working directory to `path`

In [43]:
os.chdir('testdir')
print('Items in testdir:', os.listdir())
os.chdir(current_path)

Items in testdir: []


`os.rename(source, dest)` - renames a file or directory

In [44]:
os.rename('testdir', 'new_testdir')
assert 'testdir' not in os.listdir(current_path)
assert 'new_testdir' in os.listdir(current_path)

`os.remove(path)` - removes a file

`os.rmdir(path)` - removes the directory `path`, which must be empty

`os.removedirs(path)` - removes the directory `path`, then removes each parent directory that is left empty

In [45]:
os.rmdir('new_testdir')
assert 'new_testdir' not in os.listdir(current_path)
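
The three removal functions behave differently on non-empty directories. The sketch below assumes a temporary directory with made-up names (`a`, `b`, `note.txt`, `keep.txt`) and is only meant to illustrate the order of operations.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    inner = os.path.join(base, "a", "b")           # hypothetical names
    file_path = os.path.join(inner, "note.txt")
    keep_path = os.path.join(base, "keep.txt")

    os.makedirs(inner)
    with open(file_path, "w") as f:
        f.write("temporary")
    with open(keep_path, "w"):
        pass  # an empty file that keeps `base` non-empty

    # os.rmdir() refuses to remove a directory that still has entries.
    try:
        os.rmdir(inner)
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

    os.remove(file_path)   # delete the file first
    # Removes "b", then "a" (now empty), and stops at `base`,
    # which still contains keep.txt.
    os.removedirs(inner)

    remaining = os.listdir(base)

print(rmdir_failed, remaining)
```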

`os.walk(path)` - directory tree generator. For each directory in the tree rooted at `path`, yields a 3-tuple `dirpath, dirnames, filenames`:

* `dirpath` is a string, the path to the directory.
* `dirnames` is a list of the names of the subdirectories in `dirpath` (excluding '.' and '..').
* `filenames` is a list of the names of the non-directory files in `dirpath`.

In [46]:
for dirpath, dirnames, filenames in os.walk('.'):
    print(dirpath, dirnames, filenames)

. ['images', '__pycache__', '.ipynb_checkpoints', 'ceva'] ['my_module.py', '09. More data structures.ipynb', '01. Introduction.ipynb', 'output.json', '13. Working with databases.ipynb', 'users.json', '12. Working with different data formats.ipynb', 'output.csv', 'data.csv', '14. Decorators.ipynb', 'data.json', '07. Modules.ipynb', 'books.csv', 'clients.json', '06. Methods on known data types.ipynb', '15. Object-Oriented Programming.ipynb', '02. Python basics.ipynb', 'file_example_out.txt', '08. Introduction to File Handling.ipynb', '04. Lists and Tuples.ipynb', 'transactions.csv', '11. Introduction to Object-Oriented Programming.ipynb', '05. Functions.ipynb', '10. Exception Handling.ipynb', '99. Project.ipynb', 'file_example_in.txt', '03. Control Flow.ipynb']
./images ['.ipynb_checkpoints'] ['M-X.png', 'python_installation_windows.png', 'pycharm_installation_mac.png', 'python_installation.png', 'A-L.png', 'pycharm_installation_windows.png', 'Y-Z.png']
./images/.ipynb_checkpoints [] ['A
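
A common use of `os.walk()` is collecting files that match some condition. The sketch below builds a small temporary tree (the file names are invented for the example) and gathers every `.txt` file as a path relative to the root.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny tree: two files at the top level, one in "sub".
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", "b.csv", os.path.join("sub", "c.txt")):
        with open(os.path.join(root, rel), "w"):
            pass  # create an empty file

    txt_files = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".txt"):
                # Join dirpath and name to get the full path of the file.
                full_path = os.path.join(dirpath, name)
                txt_files.append(os.path.relpath(full_path, root))

txt_files.sort()
print(txt_files)
```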

### `os.path`

`os.path` contains functions for manipulating filenames and directory names.

`os.path.exists(path)` - returns `True` if `path` exists

In [47]:
os.path.exists(current_path)

True

`os.path.isfile(path)` - returns `True` if `path` is an existing regular file

In [48]:
os.path.isfile(current_path)

False

`os.path.isdir(path)` - returns `True` if `path` is an existing directory

In [49]:
os.path.isdir(current_path)

True

`os.path.split(path)` - splits a pathname; returns a tuple `(head, tail)`, where `tail` is everything after the final slash

In [50]:
os.path.split(current_path)

('/Users/iulia/PycharmProjects/python-training-mar2025', 'docs')

`os.path.join(path, *paths)` - joins two or more pathname components, inserting `os.sep` as needed

In [51]:
os.path.join(current_path, 'testdir', 'innerdir')

'/Users/iulia/PycharmProjects/python-training-mar2025/docs/testdir/innerdir'

## Exercises 1

1. Write a Python program that creates a directory `outdir` at the current location and a directory `innerdir` inside `outdir`. Create an empty file inside `innerdir`. Use `os.walk()` to print the directory tree for `outdir`. Remove the directories and the file.
   
## 2. Working with paths in the file system (`pathlib`)

The `pathlib` module provides an object-oriented approach to handling file system paths. It allows you to work with file and directory paths in a more intuitive and Pythonic way than traditional string manipulation.

The main class in this module is `Path`, which represents a file or directory path:

In [52]:
from pathlib import Path

We instantiate this class to create paths:

In [53]:
current_dir = Path()
root_path = Path("/")
relative_path = Path("images/pycharm_installation_mac.png")

Attributes of `Path` objects:

In [54]:
relative_path.parent  # parent directory

PosixPath('images')

In [55]:
relative_path.stem  # final path component, without its suffix

'pycharm_installation_mac'

In [56]:
relative_path.suffix  # file extension

'.png'

`Path` objects can be used to build new paths:

In [57]:
python_path = root_path / "usr" / "bin" / "python3"
python_path

PosixPath('/usr/bin/python3')
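
A few more attributes and methods are often useful when taking paths apart or deriving new ones. The sketch below uses only pure path operations, so the file does not need to exist; the `.jpg` suffix is just an illustrative choice.

```python
from pathlib import Path

image = Path("images") / "pycharm_installation_mac.png"

# .name is the final component, including the suffix.
print(image.name)

# .parts splits the path into its components.
print(image.parts)

# with_suffix() returns a new path; the original object is unchanged.
converted = image.with_suffix(".jpg")
print(converted.name)
print(image.suffix)
```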

Methods of `Path` objects to check various properties:

In [58]:
python_path.exists()

True

In [59]:
python_path.is_file()

True

In [60]:
python_path.is_dir()

False

The `Path` class also provides, as methods, many of the directory-manipulation functions from the `os` module:

In [61]:
new_dir = current_dir / "new_dir"
new_dir.mkdir(exist_ok=True)

for directory in current_dir.iterdir():
    if directory.is_dir():
        print(directory)

images
__pycache__
new_dir
.ipynb_checkpoints
ceva


In [62]:
for txt_file in current_dir.glob("**/*.txt"):
    print(txt_file)

file_example_out.txt
file_example_in.txt
.ipynb_checkpoints/file_example_in-checkpoint.txt
.ipynb_checkpoints/file_example_out-checkpoint.txt


In [63]:
new_dir.rmdir()

In [64]:
for dirpath, dirnames, filenames in current_dir.walk():  # Path.walk() requires Python 3.12+
    print(dirpath, dirnames, len(filenames))

. ['images', '__pycache__', '.ipynb_checkpoints', 'ceva'] 27
images ['.ipynb_checkpoints'] 7
images/.ipynb_checkpoints [] 2
__pycache__ [] 2
.ipynb_checkpoints [] 18
ceva [] 0
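
To make the correspondence with the `os` functions concrete, here is a rough sketch of a create, write, rename, and delete cycle using only `Path` methods. It runs inside a temporary directory, and the names `reports`, `2025`, `note.txt` and `renamed.txt` are assumptions made for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    nested = base / "reports" / "2025"           # hypothetical names

    # parents=True behaves like os.makedirs().
    nested.mkdir(parents=True, exist_ok=True)

    note = nested / "note.txt"
    note.write_text("hello")                     # create the file and write to it
    content = note.read_text()

    target = nested / "renamed.txt"
    note.rename(target)                          # like os.rename()
    names = [p.name for p in nested.iterdir()]   # like os.listdir()

    target.unlink()                              # like os.remove()
    nested.rmdir()                               # like os.rmdir(); must be empty
    still_there = nested.exists()

print(content, names, still_there)
```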


## Exercises 2

1. Solve the same exercise as above (Exercises 1), but using the `pathlib` module.

## 3. Working with JSON format

### What is JSON?

JSON (JavaScript Object Notation) is a lightweight data format used for exchanging data between systems. It is easy for humans to read and write and easy for machines to parse and generate. JSON is based on key-value pairs and supports basic data types such as strings, numbers, arrays, and objects.

Example:  
```json
{
  "name": "Alice",
  "age": 30,
  "isEmployed": true,
  "skills": ["Python", "SQL", "JavaScript"],
  "address": {
    "city": "New York",
    "zipcode": "10001"
  }
}
```

### `json` module

The `json` module in Python provides methods to **encode** (convert Python objects to JSON) and **decode** (convert JSON to Python objects).


- `json.loads()` - parses a JSON string and converts it into a Python object.

In [65]:
import json

json_string = '{"name": "Alice", "age": 30}'
data = json.loads(json_string)

In [66]:
data

{'name': 'Alice', 'age': 30}

In [67]:
type(data)

dict
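
When the input is not valid JSON, `json.loads()` raises `json.JSONDecodeError`. The helper below, `parse_or_none`, is a hypothetical name used only for this sketch; it shows one possible way to handle the error.

```python
import json


def parse_or_none(text):
    """Return the decoded object, or None if `text` is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # The exception reports where parsing failed.
        print(f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return None


good = parse_or_none('{"name": "Alice"}')
# JSON requires double quotes, so this string is rejected.
bad = parse_or_none("{'name': 'Alice'}")
print(good, bad)
```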

- `json.dumps()` - converts a Python object into a JSON string.

In [68]:
json_string = json.dumps(data)

In [69]:
json_string

'{"name": "Alice", "age": 30}'
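
Decoding and encoding map JSON types to Python types and back. The short sketch below, with sample values invented for illustration, shows the most common conversions and two cases where a round trip does not return the original object.

```python
import json

text = '{"isEmployed": true, "manager": null, "skills": ["Python", "SQL"], "rating": 4.5}'
decoded = json.loads(text)

# true -> True, null -> None, array -> list, number -> int or float
print(type(decoded["isEmployed"]).__name__)
print(decoded["manager"])
print(type(decoded["skills"]).__name__)
print(type(decoded["rating"]).__name__)

# Tuples are encoded as JSON arrays, so they come back as lists.
tuple_round_trip = json.loads(json.dumps({"point": (1, 2)}))

# JSON object keys are always strings, so integer keys come back as str.
key_round_trip = json.loads(json.dumps({1: "one"}))

print(tuple_round_trip, key_round_trip)
```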

- `json.load()` - reads JSON data from a file and converts it into a Python object.

In [70]:
with open("data.json", "r") as f:
    data = json.load(f)

In [71]:
data

{'name': 'Mia', 'hobbies': ['painting', 'jogging']}

In [72]:
data["age"] = 20
data["hobbies"].append("cooking")
data

{'name': 'Mia', 'hobbies': ['painting', 'jogging', 'cooking'], 'age': 20}

- `json.dump()` - writes a Python object as JSON data to a file.

In [73]:
with open("output.json", "w") as f:
    json.dump(data, f)

- Formatting with `indent` and `sort_keys`: both the `dump` and `dumps` functions accept the optional parameters `indent` and `sort_keys`, which add indentation and sort keys alphabetically.

In [74]:
formatted_json = json.dumps(data, indent=4, sort_keys=True)
print(formatted_json)

{
    "age": 20,
    "hobbies": [
        "painting",
        "jogging",
        "cooking"
    ],
    "name": "Mia"
}
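
The same parameters work with `json.dump()` when writing to a file. The sketch below writes to a temporary file (the name `pretty.json` is an assumption) and reads it back to confirm that formatting does not change the data.

```python
import json
import os
import tempfile

record = {"name": "Mia", "hobbies": ["painting", "jogging"], "age": 20}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pretty.json")      # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2, sort_keys=True)

    with open(path, encoding="utf-8") as f:
        text = f.read()

# Indentation and key order affect only the text, not the decoded object.
restored = json.loads(text)
print(text)
print(restored == record)
```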


## Exercises 3
1. Using the [`users.json`](users.json) file:
   - open it and decode the JSON data inside it into a Python object
   - keep only the users that have an `"email"` key and encode the resulting object as a JSON string; print the string to the console
   - keep only the users with ages between 20 and 40 and write the resulting object to a JSON file, using the `indent` and `sort_keys` parameters.

## 4. Working with CSV format

### What is CSV?
 
CSV (Comma-Separated Values) is a simple file format used to store tabular data, such as a spreadsheet or database. Each line in a CSV file represents a data record, and each record consists of fields separated by a delimiter (commonly a comma).

Example:
```csv
Name,Age,City
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,Chicago
```

### `csv` module

The `csv` module in Python provides functionality to read, write, and manipulate CSV files easily. It supports different delimiters and quoting styles; the file encoding is chosen when the file is opened.

- `csv.reader()` - reads a CSV file and returns an iterable object for processing each row as a list.

In [75]:
import csv

with open("data.csv", "r", newline="") as file:  # newline="" is recommended for csv files
    reader = csv.reader(file)
    for row in reader:
        print(row)

['Name', 'Age', 'City']
['Alice', '30', 'New York']
['Bob', '25', 'Los Angeles']
['Charlie', '35', 'Chicago']
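
Note that `csv.reader()` returns every field as a string, including the header row. The sketch below reads the same sample data from an in-memory `io.StringIO` object (an assumption made to keep the example self-contained), skips the header, and converts the ages to integers.

```python
import csv
import io

csv_text = (
    "Name,Age,City\n"
    "Alice,30,New York\n"
    "Bob,25,Los Angeles\n"
    "Charlie,35,Chicago\n"
)

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)                      # consume the header row
ages = [int(row[1]) for row in reader]     # convert the "Age" field to int
average_age = sum(ages) / len(ages)

print(header)
print(ages, average_age)
```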


- `csv.writer()` - writes data to a CSV file row by row.

In [76]:
data = [
    ["Name", "Age", "City"],
    ["Alice", 30, "New York"],
    ["Bob", 25, "Los Angeles"],
    ["Charlie", 35, "Chicago"]
]

with open("output.csv", "w", newline="") as file:
    writer = csv.writer(file)
    # Alternative: writer.writerows(data) writes all rows in a single call
    for line in data:
        writer.writerow(line)
        print(f"{line} written to CSV file.")

['Name', 'Age', 'City'] written to CSV file.
['Alice', 30, 'New York'] written to CSV file.
['Bob', 25, 'Los Angeles'] written to CSV file.
['Charlie', 35, 'Chicago'] written to CSV file.
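
Both `csv.reader()` and `csv.writer()` accept a `delimiter` argument. The sketch below uses a semicolon and an in-memory buffer instead of a file (both are choices made for this example) and shows that a field containing the delimiter is quoted automatically.

```python
import csv
import io

# newline="" keeps the writer's line endings unchanged, as with real files.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, delimiter=";")
writer.writerow(["Name", "Note"])
writer.writerow(["Alice", "likes; semicolons"])

text = buffer.getvalue()
print(text)

# Reading with the same delimiter restores the original fields.
rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=";"))
print(rows)
```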


- `csv.DictReader()` - reads a CSV file and converts each row into a dictionary, with column headers as keys.

In [77]:
with open("data.csv", "r", newline="") as file:  # newline="" is recommended for csv files
    reader = csv.DictReader(file)
    for row in reader:
        print(row)

{'Name': 'Alice', 'Age': '30', 'City': 'New York'}
{'Name': 'Bob', 'Age': '25', 'City': 'Los Angeles'}
{'Name': 'Charlie', 'Age': '35', 'City': 'Chicago'}


- `csv.DictWriter()` - writes dictionaries to a CSV file, using specified `fieldnames` as headers.

In [78]:
data = [
    {"Name": "Alice", "Age": 30, "City": "New York"},
    {"Name": "Bob", "Age": 25, "City": "Los Angeles"},
    {"Name": "Charlie", "Age": 35, "City": "Chicago"}
]

with open("output.csv", "w", newline="") as file:
    fieldnames = ["Name", "Age", "City"]
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(data)
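
`DictReader` and `DictWriter` combine naturally when filtering rows or dropping columns. The following sketch works on in-memory buffers and uses an arbitrary age threshold of 30; `extrasaction="ignore"` tells `DictWriter` to skip keys that are not listed in `fieldnames` instead of raising `ValueError`.

```python
import csv
import io

source = io.StringIO(
    "Name,Age,City\n"
    "Alice,30,New York\n"
    "Bob,25,Los Angeles\n"
    "Charlie,35,Chicago\n",
    newline="",
)

# Keep only the rows whose Age is at least 30 (values are read as strings).
selected = [row for row in csv.DictReader(source) if int(row["Age"]) >= 30]

output = io.StringIO(newline="")
writer = csv.DictWriter(output, fieldnames=["Name", "City"], extrasaction="ignore")
writer.writeheader()
writer.writerows(selected)   # the "Age" key is ignored

result_lines = output.getvalue().splitlines()
print(result_lines)
```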

## Exercises 4
1. Using [`books.csv`](books.csv), do the following:
   - read the CSV file
   - create two other CSV files: `mathematics_books.csv` and `computer_science_books.csv`, containing only books in each genre (_Genre_ column should be equal to _mathematics_ or _computer_science_, respectively), with all columns in _books.csv_ except _Genre_