[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/scott2b/PythonReview/blob/main/notebooks/Python.04.FileIO.ipynb)

# Basic file i/o in Python

## Resources and context blocks

When we talk about resources in programming, we mean external things that a program connects to: database connections, web connections and other networked resources, and operating system and filesystem resources such as sockets and files.


---
### ⚠️ **Pro tip!** close your resources!

Always be sure to close any resources you open. The best way to do this is to open resources in a `with` block.

---

For the most part, file i/o is simple:

**open a file**:

```
f = open('/path/to/my/file')
```

**close the file**:

```
f.close()
```

Even better, do your file work inside a managed context block, which closes the file for you even if an error occurs partway through. In Python, we do this using the `with` statement:

```
with open(my_filepath) as f:
    pass # do something with f here
# <<-- Python will close the file for you here
```
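To see what the `with` statement does on your behalf, here is a minimal sketch that compares it with an explicit `try`/`finally`. The file name `example.txt` and the temporary directory are assumptions made only so the snippet runs anywhere, including outside Colab.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file, created in a temporary directory for illustration
tmp_dir = Path(tempfile.mkdtemp())
my_filepath = tmp_dir / "example.txt"
my_filepath.write_text("hello\n")

# Manual version: close() must run even if reading raises an exception
f = open(my_filepath)
try:
    manual_contents = f.read()
finally:
    f.close()

# Managed version: the with statement closes the file when the block exits
with open(my_filepath) as g:
    managed_contents = g.read()

# Both file objects report that they are closed
print(f.closed, g.closed)
```

Both approaches leave the file closed, but the `with` version is shorter and harder to get wrong.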


In the Colab runtime environment, we do not have direct access to the filesystem on your own computer. The runtime has its own temporary disk, but files stored there do not persist after the session ends. To work with your own files, mount your personal Google Drive and access them there.

To mount your Google Drive in Colab, select the Folder icon in the left sidebar, then click the Mount Drive icon. This will insert code like the following into your notebook:

```
from google.colab import drive
drive.mount('/content/drive')
```

```
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
```


You can list the contents of your drive as follows:

```
import os
os.listdir('drive')
```

## Working with pathlib

The standard library's `pathlib` module has some nice features for working with file paths, including `glob` for listing files that match a pattern and the `/` operator for constructing a path.

```
from pathlib import Path
my_drive = Path("drive/MyDrive")
project_dir = my_drive / "MyProject"
csv_files = list(project_dir.glob('*.csv'))
csv_files
```

```
[PosixPath('drive/MyDrive/MyProject/verizon.adspend.csv'),
 PosixPath('drive/MyDrive/MyProject/macys.adspend.csv'),
 PosixPath('drive/MyDrive/MyProject/netflix.adspend.csv'),
 PosixPath('drive/MyDrive/MyProject/nike.adspend.csv'),
 PosixPath('drive/MyDrive/MyProject/hulu.adspend.csv')]
```
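If you do not have a Drive folder to experiment with, the same `/` and `glob` behavior can be tried against a scratch directory. This is only a sketch: the directory and the file names below are made up for illustration and are created in a temporary location.

```python
import tempfile
from pathlib import Path

# Hypothetical project directory, created in a temporary location for illustration
project_dir = Path(tempfile.mkdtemp()) / "MyProject"
project_dir.mkdir()

# Create a few empty files so there is something to match
for name in ["a.adspend.csv", "b.adspend.csv", "notes.txt"]:
    (project_dir / name).touch()

# glob yields matches in no guaranteed order; sorted() makes the result stable
csv_names = sorted(p.name for p in project_dir.glob("*.csv"))
print(csv_names)
```

Only the two `.csv` files match the pattern. Sorting is worth the extra call whenever later code depends on which file comes first, as with `csv_files[0]`.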

```
first_file = csv_files[0]
with first_file.open() as adspend:
    for _ in range(3):
        print(adspend.readline().strip())
```

```
TIME PERIOD,PRODUCT,TOTAL DOLS (000),NETWORK TV DOLS (000),CABLE TV DOLS (000),SYNDICATION DOLS (000),SPOT TV DOLS (000),MAGAZINES DOLS (000),SUNDAY MAGS DOLS (000),NATL NEWSP DOLS (000),NEWSPAPER DOLS (000),NETWORK RADIO DOLS (000),NAT SPOT RADIO DOLS (000),OUTDOOR DOLS (000)
"WEEK OF OCT 07, 2013 (B)",Verizon : Business,72.8,0,0,0,0,0,0,0,0,0,72.8,0
"WEEK OF OCT 07, 2013 (B)",Verizon : Consumer Wireless Service,11768.8,8350.9,2214.8,346,611.7,0,0,0,8.6,0,236.7,0
```


## Processing a csv file

Python's `csv.DictReader` makes it easy to work with csv data rows. It reads the column names from the first row of the file, then converts each following row to a dictionary with those column names as the keys.

```
from csv import DictReader

with first_file.open() as adspend:
    reader = DictReader(adspend)
    for i, row in enumerate(reader):
        # Print every 1000th row to sample the file without flooding the output
        if i % 1000 == 0:
            print(row['PRODUCT'], row['TOTAL DOLS (000)'])
```

```
Verizon : Business 72.8
Verizon : Business 175
Verizon FIOS : Cable Service 10.3
Verizon FIOS : Business 58.6
Verizon FIOS : Cable Service 8.5
```
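One thing to keep in mind is that `DictReader` returns every value as a string, so numeric columns need converting before you do arithmetic on them. The sketch below totals a column per product. The sample rows are made up for illustration (they only reuse two column names from the file above), and they are read from an in-memory `io.StringIO` so the snippet runs without any file.

```python
import io
from collections import defaultdict
from csv import DictReader

# Made-up sample rows that reuse two column names from the file above
sample = io.StringIO(
    "PRODUCT,TOTAL DOLS (000)\n"
    "Product A,72.5\n"
    "Product B,10.0\n"
    "Product A,27.5\n"
)

totals = defaultdict(float)
for row in DictReader(sample):
    # Every value arrives as a string, so convert before adding
    totals[row["PRODUCT"]] += float(row["TOTAL DOLS (000)"])

print(dict(totals))
```

When reading a real file instead of a `StringIO`, the `csv` module documentation recommends opening it with `newline=''` so that line endings inside quoted fields are handled correctly.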
