# Session 10: OS methods, Context managers 

## OS methods

When interacting with a computer, we are used to a graphical user interface (GUI) that allows us to point and click on files and folders. However, when we are writing code, we need to be able to interact with the computer using its own language.

The operating system (OS) is the software that manages the computer's resources and allows us to interact with it. The OS is responsible for managing the computer's memory, processes, and all of the software and hardware on the computer.

We can talk to the operating system from Python using the `os` module. The `os` module is part of the Python standard library, so we don't need to install anything to use it.

In [1]:
import os

What can we do with the `os` module? We can do things like:

- Navigate the file system
- Get file information
- Create and remove directories
- Create and remove files
- Move and rename files

In this course, we'll focus on navigating the file system and reading and writing files.
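
As a rough illustration of the operations listed above, the following sketch creates, renames, and removes a file and a folder. It works inside a temporary directory so nothing on the real disk is touched; the names `demo_folder`, `notes.txt`, and `renamed.txt` are made up for the example. It also uses the `with` statement, which is covered later in this session.

```python
import os
import tempfile

# Work inside a throwaway directory so the real file system is left alone.
with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, "demo_folder")       # hypothetical folder name
    os.mkdir(folder)                                 # create a directory

    old_path = os.path.join(folder, "notes.txt")     # hypothetical file name
    with open(old_path, "w") as f:                   # create a file
        f.write("hello")

    size = os.path.getsize(old_path)                 # file information: size in bytes
    print(size)                                      # 5

    new_path = os.path.join(folder, "renamed.txt")
    os.rename(old_path, new_path)                    # move / rename the file
    listing = os.listdir(folder)
    print(listing)                                   # ['renamed.txt']

    os.remove(new_path)                              # remove the file
    os.rmdir(folder)                                 # remove the (now empty) directory
    still_there = os.path.exists(folder)
    print(still_there)                               # False
```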

### Navigating the file system

The file system is the way that files and folders are organized on a computer. The file system has a root directory, and every file and folder in the file system is located somewhere inside that root directory. 

Desktop applications let us point at the files and folders we want to work with. In Python, we have to state explicitly where they are located. We do this by giving Python the path to the file or folder we want to work with.

There are two types of paths we can use to tell Python where to find a file or folder:
* Absolute paths: An absolute path is the full path to a file or folder. It contains all the information needed to locate the file or folder, starting from the root directory. For example, `/Users/username/Documents/` is an absolute path.
* Relative paths: A relative path is a path relative to the current working directory. For example, if the current working directory is `/Users/username/Documents/`, then the relative path `my_file.txt` refers to the file `/Users/username/Documents/my_file.txt`.
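
The difference between the two kinds of path can be checked with the functions in `os.path`. This is a minimal sketch: `my_file.txt` is the example name used above and does not need to exist, because these functions only manipulate the path string.

```python
import os

relative = "my_file.txt"                  # relative path from the example above
absolute = os.path.abspath(relative)      # resolved against the current working directory

print(os.path.isabs(relative))            # False
print(os.path.isabs(absolute))            # True

# The absolute path is the working directory joined with the relative path.
same = absolute == os.path.normpath(os.path.join(os.getcwd(), relative))
print(same)                               # True
```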


In [2]:
os.getcwd()

'/Users/jorge/Library/Mobile Documents/com~apple~CloudDocs/IE/FIRST TERM/PYTHON FOR DATA ANALYSIS I/S10'

In [11]:
os.chdir('/Users/dgarhdez/pda1_apr23/SESSION_10')

The previous cells show how to use the `os` module to get the current working directory with `os.getcwd()` and to change it with `os.chdir()`. The current working directory is the directory that Python is operating in. When we use relative paths, Python looks for the file or folder starting from the current working directory.
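
To see how the working directory affects relative paths, here is a small sketch under stated assumptions: it uses a temporary directory and a made-up folder name, `sub_folder_for_demo`, and it returns to the starting directory at the end.

```python
import os
import tempfile

start = os.getcwd()                                      # remember where we started

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub_folder_for_demo"))   # hypothetical folder name

    os.chdir(tmp)                                        # move into the temporary directory
    found = os.path.isdir("sub_folder_for_demo")         # relative path, resolved against tmp

    os.chdir(start)                                      # go back before tmp is deleted

print(found)                   # True
print(os.getcwd() == start)    # True
```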

### Exercise 1

Change the current working directory to the `SESSION_09` folder.

In [14]:
# os.chdir('/Users/dgarhdez/pda1_apr23/SESSION_10')

In [15]:
# where python is looking at now
print(os.getcwd())

/Users/dgarhdez/pda1_apr23/SESSION_10


In [16]:
# change directory to the immediate parent directory
os.chdir("..")
print(os.getcwd())

/Users/dgarhdez/pda1_apr23


In [17]:
# print all the directories and files in the current directory
print(os.listdir())

['SESSION_15', 'SESSION_12', '.DS_Store', 'SESSION_13', 'SESSION_14', 'Untitled.ipynb', 'SESSION_09', 'SESSION_07', 'SESSION_01', 'SESSION_06', 'SESSION_08', 'SESSION_11', 'SESSION_16', 'SESSION_20', 'SESSION_18', 'SESSION_19', 'SESSION_17', 'SESSION_10', '.ipynb_checkpoints', 'SESSION_03', 'SESSION_04', 'SESSION_05', 'SESSION_02']


In [18]:
# change directory to 'SESSION_09'
os.chdir("SESSION_09")
print(os.getcwd())

/Users/dgarhdez/pda1_apr23/SESSION_09
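
The listing returned by `os.listdir()` mixes files and folders and comes in no particular order. One possible way to keep only the folders is to sort the names and test each one with `os.path.isdir()`. The layout below (two folders and one file in a temporary directory) is invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # hypothetical layout: two folders and one file
    os.mkdir(os.path.join(tmp, "SESSION_10"))
    os.mkdir(os.path.join(tmp, "SESSION_09"))
    open(os.path.join(tmp, "Untitled.ipynb"), "w").close()

    entries = sorted(os.listdir(tmp))                 # sort for a predictable order
    folders = [name for name in entries
               if os.path.isdir(os.path.join(tmp, name))]

print(entries)   # ['SESSION_09', 'SESSION_10', 'Untitled.ipynb']
print(folders)   # ['SESSION_09', 'SESSION_10']
```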


We have learned how to move through the file system using the `os` module. Now, we will learn how to read and write files.

## I/O and Context Manager

Just as we can double-click a file to open it, we can open a file from Python and read its contents into our session.

### Handling files with `open()`

We can read a file using the `open()` function. The two arguments we will use are the path to the file (either absolute or relative) and the mode. The mode tells Python whether we want to read from the file, write to the file, or append to the file. The mode argument is optional; if we don't provide it, Python opens the file for reading (`'r'`).

The following modes are available:

* `'r'`: Open a text file for reading. The file must already exist. The stream is positioned at the beginning of the file.
* `'r+'`: Open for reading and writing. The file must already exist. The stream is positioned at the beginning of the file.
* `'w'`: Open for writing. The file is created if it does not exist; otherwise it is truncated to zero length. The stream is positioned at the beginning of the file.
* `'w+'`: Open for reading and writing. The file is created if it does not exist; otherwise it is truncated. The stream is positioned at the beginning of the file.
* `'a'`: Open for writing. The file is created if it does not exist. The stream is positioned at the end of the file, so new text is added after the existing content.
* `'a+'`: Open for reading and writing. The file is created if it does not exist. The stream is positioned at the end of the file, so new text is added after the existing content.
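
The difference between `'w'`, `'a'`, and `'r'` is easiest to see on a scratch file. The sketch below assumes a made-up file name, `modes_demo.txt`, inside a temporary directory, and it uses the `with` statement that is introduced later in this session.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes_demo.txt")   # hypothetical file name

    with open(path, "w") as f:      # 'w' creates the file (or empties it)
        f.write("first\n")
    with open(path, "a") as f:      # 'a' adds to the end of the file
        f.write("second\n")
    with open(path, "r") as f:
        after_append = f.read()

    with open(path, "w") as f:      # 'w' again: the previous content is discarded
        f.write("third\n")
    with open(path) as f:           # mode omitted, so the file is opened with 'r'
        after_rewrite = f.read()

print(repr(after_append))     # 'first\nsecond\n'
print(repr(after_rewrite))    # 'third\n'
```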

The syntax for using `open()` is the following:

```python
f = open(path, mode)
# ... do something with f ...
f.close()
```

In [20]:
# change directory to 'SESSION_10'
os.chdir("../SESSION_10")
print(os.getcwd())

/Users/dgarhdez/pda1_apr23/SESSION_10


In [29]:
# read text_file_1.txt using open()
f = open('text_file_1.txt', 'r')

# read the content of the file and store it in a variable
content = f.read()

# close the file
f.close()

# print the content of the file
print(content)

Hi there!

This is a dummy text file.


This way of opening files is not recommended, because we have to remember to close the file when we are done with it. If we forget, the file stays open for as long as the Python session runs: it keeps holding operating-system resources, and text we have written may not reach the disk until the file is closed.

And even if we remember to close the file, what happens if there is an error while we are working with the file? We might not get to the `f.close()` line, and the file will remain open.
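
Without a context manager, one way to guarantee the `close()` call is a `try` / `finally` block. This is only a sketch of the manual approach: the scratch file comes from `tempfile.mkstemp()` so the example does not depend on `text_file_1.txt`, and the division by zero stands in for any error that could occur while the file is open.

```python
import os
import tempfile

# hypothetical scratch file, created only for this example
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

f = open(path, "w")
try:
    f.write("some text")
    result = 1 / 0                  # simulate an error while the file is open
except ZeroDivisionError:
    print("something went wrong")
finally:
    f.close()                       # runs whether or not an error occurred

is_closed = f.closed
print(is_closed)                    # True
os.remove(path)                     # clean up the scratch file
```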

To avoid these problems, we can use a context manager.

A context manager is a Python object that sets something up when a block of code starts and cleans it up when the block ends. The `with` statement is how we use one: the file object returned by `open()` is a context manager, so Python automatically closes the file when the `with` block finishes, even if an error occurs inside it.

In [3]:
# only use this from now on
with open('text_file_1.txt', 'r') as f:
    content = f.read()
    print(content)

Hi there!

This is a dummy text file.
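
To check that the file really is closed when something goes wrong inside the `with` block, we can raise an error on purpose and inspect `f.closed` afterwards. The file name `demo.txt` and the `ValueError` are made up for this sketch.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")          # hypothetical file name

    try:
        with open(path, "w") as f:
            f.write("partial")
            raise ValueError("simulated failure")  # error inside the with block
    except ValueError as err:
        message = str(err)

    closed_after_error = f.closed                  # the file was closed anyway

    with open(path) as f2:
        saved = f2.read()                          # text written before the error

print(closed_after_error)    # True
print(saved)                 # partial
```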


### Exercise 2

Let's write some text into 'text_file_1.txt'. Use the `with` statement to open the file and write the text.

In [4]:
text_to_add = "This is a new line of text"

# open the file in append mode
with open('text_file_1.txt', 'a') as f:

    # write the text to the file
    f.write(text_to_add)

# open the file in read mode
with open('text_file_1.txt', 'r') as f:

    # read the content of the file and store it in a variable
    content = f.read()

    # print the content of the file
    print(content)

Hi there!

This is a dummy text file.This is a new line of text
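
In the output above, the new text lands on the same line as the previous one because `write()` does not add a line break by itself. A possible fix is to write the `'\n'` characters explicitly. The sketch below works on a copy with a made-up name, `text_file_copy.txt`, and assumes the original content ends without a trailing newline, as the output suggests.

```python
import os
import tempfile

text_to_add = "This is a new line of text"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "text_file_copy.txt")   # hypothetical file name

    # assumed starting content: no line break after the last sentence
    with open(path, "w") as f:
        f.write("Hi there!\n\nThis is a dummy text file.")

    # start on a fresh line, and end with a line break for the next append
    with open(path, "a") as f:
        f.write("\n" + text_to_add + "\n")

    with open(path) as f:
        lines = f.read().splitlines()

print(lines)
# ['Hi there!', '', 'This is a dummy text file.', 'This is a new line of text']
```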


### Exercise 3

Now let's create a new file called 'text_file_2.txt' and write some text into it. Use the `with` statement to open the file and write the text.

In [5]:
# create a string with two lines of text, separated by a newline character
new_content = 'This is a new line of text\nThis is another line of text'

# open the file in write mode
with open('text_file_2.txt', 'w') as f:

    # write the string to the file
    f.write(new_content)
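
If the text really is stored as a list of strings, one option is to join the items with `'\n'` before writing. This is a sketch under that assumption; it writes to a temporary directory and reads the file back to confirm what was written.

```python
import os
import tempfile

# the same two lines as above, this time as a list of strings
lines = ["This is a new line of text", "This is another line of text"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "text_file_2.txt")

    with open(path, "w") as f:
        f.write("\n".join(lines))      # one line break between consecutive items

    with open(path) as f:
        written = f.read()

print(repr(written))
# 'This is a new line of text\nThis is another line of text'
```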

### Exercise 4

Open the `ecomm.csv` file and read its content into a list. Each row should be an element in the list.

In [2]:
with open('ecomm.csv', 'r') as f:
    content = f.read()

content[:200]

'product_id,reporting_date,country_code,units,price,sales,cogs,discounts,refunds,advertising_fees,returns,sde,fees\n17,2022-01-01,us,48,31.429791666999996,1508.63,-254.48,-121.2,-24.36,-305.42,3.0,385.5'

In [3]:
header = content.split('\n')[0].split(',')
header

['product_id',
 'reporting_date',
 'country_code',
 'units',
 'price',
 'sales',
 'cogs',
 'discounts',
 'refunds',
 'advertising_fees',
 'returns',
 'sde',
 'fees']

In [8]:
# split the content into lines (skipping empty ones), then each line into its fields
rows = [line.split(',') for line in content.split('\n') if line]
rows

[['product_id',
  'reporting_date',
  'country_code',
  'units',
  'price',
  'sales',
  'cogs',
  'discounts',
  'refunds',
  'advertising_fees',
  'returns',
  'sde',
  'fees'],
 ['17',
  '2022-01-01',
  'us',
  '48',
  '31.429791666999996',
  '1508.63',
  '-254.48',
  '-121.2',
  '-24.36',
  '-305.42',
  '3.0',
  '385.53',
  '-428.24'],
 ['3',
  '2022-01-01',
  'de',
  '1',
  '46.21',
  '46.21',
  '-16.12',
  '0.0',
  '',
  '-2.05',
  '',
  '15.79',
  '-12.25'],
 ['32',
  '2022-01-01',
  'de',
  '39',
  '21.330512821',
  '831.89',
  '-178.62',
  '-3.58',
  '',
  '-79.76',
  '',
  '269.24',
  '-300.69'],
 ['9',
  '2022-01-01',
  'de',
  '6',
  '8.048333332999999',
  '48.29',
  '-8.82',
  '-2.51',
  '',
  '-12.27',
  '',
  '4.89',
  '-19.799999999999997'],
 ['5',
  '2022-01-01',
  'nl',
  '1',
  '17.15',
  '17.15',
  '-5.47',
  '-0.52',
  '',
  '-0.44',
  '',
  '4.91',
  '-5.8100000000000005'],
 ['30',
  '2022-01-01',
  'us',
  '1',
  '21.09',
  '21.09',
  '-7.14',
  '0.0',
  '',
  '-
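
As an alternative to splitting the text by hand, the standard library's `csv` module can do the row and field splitting. This is not the approach used in the cells above, only a sketch of a different technique. The sample text is a reduced, four-column version of the first rows of `ecomm.csv`, read from memory with `io.StringIO` so the example does not need the file.

```python
import csv
import io

# reduced sample with the first four columns of the first two data rows
sample = (
    "product_id,reporting_date,country_code,units\n"
    "17,2022-01-01,us,48\n"
    "3,2022-01-01,de,1\n"
)

# csv.reader yields one list of strings per row
rows = list(csv.reader(io.StringIO(sample)))
print(rows[0])      # ['product_id', 'reporting_date', 'country_code', 'units']
print(len(rows))    # 3

# splitting by hand leaves an empty string after the final line break
manual = sample.split("\n")
print(repr(manual[-1]))    # ''
```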


### Exercise 5

Take the list of rows and convert them into a list of dictionaries with the following structure:

```
{"product_id": 17, "reporting_date": "2022-01-01", etc...}
```

Where the keys are each one of the elements in the header row.

In [10]:
lst = []
for row in rows[1:]:  # skip the header row
    # ignore incomplete rows, such as an empty line at the end of the file
    if len(row) != len(header):
        continue
    dicc = {}
    for i, col in enumerate(header):
        # use the column name as key and the matching cell as value
        dicc[col] = row[i]
    lst.append(dicc)

# the values are still strings at this point, e.g. '17' rather than 17
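
A more compact way to pair a header with a row is `dict(zip(header, row))`, followed by converting the fields that should be numbers. The sketch below uses a reduced four-column version of one row from the data, and `to_float` is a hypothetical helper introduced here to turn empty cells into `None`.

```python
def to_float(value):
    """Hypothetical helper: '' becomes None, anything else is parsed as a float."""
    return float(value) if value != "" else None


# reduced example: four of the columns, with values taken from the row for product 3
header = ["product_id", "country_code", "sales", "refunds"]
row = ["3", "de", "46.21", ""]

record = dict(zip(header, row))                  # pair each column name with its cell
record["product_id"] = int(record["product_id"])
record["sales"] = to_float(record["sales"])
record["refunds"] = to_float(record["refunds"])  # empty cell becomes None

print(record)
# {'product_id': 3, 'country_code': 'de', 'sales': 46.21, 'refunds': None}
```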

## Homework

### Exercise 1
Using the list of dictionaries built from `ecomm.csv`, find the date with the highest sales.

### Exercise 2
Find the `product_id` that gets the most returns

### Exercise 3
Find the `product_id` that gets the most sales

### Exercise 4
Find the country with the highest sde