# Computer Programming and Algorithms

## Week 6.1: Reading files

<img src="img/full-colour-logo-UoB.png" alt="Bristol" style="width: 300px;"/>

# Aims

In this video we will learn:

* What is a file?
* What is a file path?
* How to open a file using a computer program

# What is a file? 

A file is a set of bytes (a unit of data that is eight binary digits long) used to store data. 
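To make the idea of bytes concrete, the short sketch below writes a two-character text file and reads it back in binary mode. The file name `example.txt` and its contents are assumptions chosen purely for illustration.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.txt')  # hypothetical file name

    # Write two characters of text to the file.
    with open(path, 'w', encoding='utf-8') as f:
        f.write('Hi')

    # Read the file back in binary mode ('rb') to see the raw bytes.
    with open(path, 'rb') as f:
        raw = f.read()

print(raw)        # the bytes stored in the file
print(list(raw))  # each byte shown as an integer between 0 and 255
print(len(raw))   # the number of bytes in the file
```

Each character of plain English text is stored here as one byte, so the two-character file is two bytes long.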

What the data represents depends on the file type, which is indicated by the file extension. 

Examples of file types and file extensions (there are hundreds more):
- unformatted text (.txt, .dat)
- formatted text (.docx)
- spreadsheet/tabulated data (.xlsx, .csv)
- image (.png, .img)



# What is a File Path?

The file path is a string object that represents the location of a file on an operating system. 

This may include its location within a series of nested folders. 

We will refer to folders as __directories__.

# What is a File Path?

A file path has three parts:
1. <span style="color:blue">__Directory Path__</span>: the location of the directory on the file system that contains the file. Nested directories are separated by a:
    - forward slash `/` (Unix)
    - backslash `\` (Windows)
2. <span style="color:red">__File Name__</span>: the name of the file
3. <span style="color:green">__File Extension__</span>: the end of the file path, beginning with a period (.), used to indicate the file type
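The three parts can also be pulled out of a path in Python. The sketch below uses the standard library module `pathlib`; the paths are the example paths used in this section, and the variable names are only illustrative.

```python
from pathlib import PurePosixPath, PureWindowsPath

# A Unix-style path: directories are separated by a forward slash.
unix_path = PurePosixPath('/home/YourUsername/Documents/myfile.txt')
print(unix_path.parent)  # directory path
print(unix_path.stem)    # file name
print(unix_path.suffix)  # file extension

# A Windows-style path: the r'...' (raw string) stops Python from
# treating the backslashes as escape characters.
windows_path = PureWindowsPath(r'C:\Users\YourUsername\Documents\myfile.txt')
print(windows_path.parent)
print(windows_path.stem)
print(windows_path.suffix)
```

The `Pure...` classes only manipulate the text of a path, so this runs on any operating system and does not need the file to exist.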

***

Examples: <br>File Path to myfile.txt from the root (top-most) directory of the file system. 

'<span style="color:blue">C:\Users\YourUsername\Documents\ </span><span style="color:red">myfile</span><span style="color:green">.txt</span>'
<br><span style="color:black">(Windows)</span>

'<span style="color:blue">/home/YourUsername/Documents/</span><span style="color:red">myfile</span><span style="color:green">.txt</span>'
<br><span style="color:black">(Unix)</span>

The __root directory__ is the (top-most) directory of the file system. 

__Windows__: <br>Each drive has its own root directory:
- `C:\` is the root directory of the C: drive.
- `D:\` is the root directory of the D: drive.

__Linux/Unix/Mac__: <br>There is a single root directory for the entire file system, denoted by a forward slash `/`.

The path can be either:
- __Global (Absolute):__ The path to a file from the root directory of the file system. 
- __Local (Relative):__ The path to a file relative to the current *working directory* (the directory where the program is being run) 
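As a rough illustration of the difference, the sketch below asks Python for the current working directory and converts a local path into a global one. The file name `README.txt` is only an example; the file does not need to exist for this to run.

```python
import os

cwd = os.getcwd()             # the current working directory
relative_path = 'README.txt'  # a local (relative) path

# abspath joins the relative path on to the current working directory.
absolute_path = os.path.abspath(relative_path)

print(cwd)
print(absolute_path)
print(os.path.isabs(relative_path))  # False: does not start at a root directory
print(os.path.isabs(absolute_path))  # True: starts at a root directory
```

The global path printed will differ from computer to computer, which is one reason local paths are convenient in shared code.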

Consider the file system below. <br>Assume the directory `YourUsername` is in `C:\Users` (Windows) or `/home` (Unix):

```python
YourUsername/
|
|--- Documents/
        |
        |--- main.py
        |--- README.txt 
```

We are going to run the program `main.py`, which will use the file path to `README.txt`.

***
__Global path__ (root to README.txt) 
```
'C:\Users\YourUsername\Documents\README.txt'

'/home/YourUsername/Documents/README.txt'
```
***

__Local path__ (main.py to README.txt)
```
'README.txt'
```

Consider the file system below. <br>Assume the directory `YourUsername` is in `C:\Users` (Windows) or `/home` (Unix):

```python
YourUsername/
|
|--- Documents/
        |
        |--- main.py
        |--- README.txt 
        |--- data/ 
               |
               |--- temperature.csv
```

We are going to run the program `main.py`, which will use the file path to `temperature.csv`, which is in the subdirectory `data`.

***
__Global path__ (root to temperature.csv) 
```
'C:\Users\YourUsername\Documents\data\temperature.csv'

'/home/YourUsername/Documents/data/temperature.csv'
```
***
__Local path__ (main.py to temperature.csv)
```
'data/temperature.csv'
```

In the following examples we will use the local file path because it is:
- shorter
- unchanged by the location of the files on the file system, provided their location relative to each other doesn't change


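We will use the notation for Unix systems (forward slash `/`), so remember to change this to a backslash `\` if you are using Windows. In Python code, write a backslash as `\\` or use a raw string such as `r'data\temperature.csv'`, because `\t` on its own means a tab character. Python's `open` also accepts forward slashes on Windows.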
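One way to avoid typing the separator by hand is to let Python build the path. The sketch below uses `os.path.join` and `pathlib.Path` from the standard library; it is one possible approach, not the only one.

```python
import os
from pathlib import Path

# os.path.join inserts the separator used by the current operating system.
joined = os.path.join('data', 'temperature.csv')
print(joined)  # 'data/temperature.csv' on Unix, 'data\temperature.csv' on Windows

# pathlib does the same job using the / operator.
path = Path('data') / 'temperature.csv'
print(path)
```

Either result can be passed straight to `open`.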

# How to open a file using a computer program

Consider the file system below

```python
Week_6/
|
|--- Example_1/
        |
        |--- program_1.py
        |--- README.txt 
```


We can open the file `README.txt` in `program_1.py` using:
```python
file = open('README.txt')
```

When you are finished reading in the contents of a file, the file needs to be closed:
```python
file.close()
```
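It is easy to forget to close a file. A common alternative is the `with` statement, which closes the file automatically at the end of the indented block. The sketch below first creates a small `README.txt` in a temporary directory (its contents are made up) so that the example can be run anywhere.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a small file to read (contents are illustrative only).
    path = os.path.join(tmp_dir, 'README.txt')
    with open(path, 'w') as f:
        f.write('Hello from README\n')

    # The file is open only inside the indented block...
    with open(path) as file:
        contents = file.read()

    # ...and has been closed automatically by this point.
    print(file.closed)
    print(contents)
```

Here `file.closed` is `True` after the block, even though `file.close()` was never called explicitly.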

Consider the file system below

```python
Week_6/
|
|--- Example_1/
        |
        |--- program_1.py
        |--- README.txt 
        |--- data/ 
               |
               |--- temperature.csv
```

We can open the file `temperature.csv` in `program_1.py` using:
```python
file = open('data/temperature.csv')
file.close()
```
The second line closes the file once we have finished with it.
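To try this without setting up the directories by hand, the sketch below rebuilds the `Example_1` layout in a temporary directory, makes `Example_1` the current working directory, and reads the file using the local path. The contents of `temperature.csv` are made-up values for illustration.

```python
import os
import tempfile

start_dir = os.getcwd()  # remember where we started
with tempfile.TemporaryDirectory() as tmp_dir:
    # Rebuild Example_1/data/temperature.csv with made-up contents.
    example_dir = os.path.join(tmp_dir, 'Example_1')
    os.makedirs(os.path.join(example_dir, 'data'))
    with open(os.path.join(example_dir, 'data', 'temperature.csv'), 'w') as f:
        f.write('day,temperature\n1,12.5\n2,14.0\n')

    # Pretend program_1.py is being run from inside Example_1.
    os.chdir(example_dir)
    try:
        file = open('data/temperature.csv')
        lines = file.readlines()  # one string per line of the file
        file.close()
    finally:
        os.chdir(start_dir)  # go back before the temporary directory is deleted

print(lines)
```

The local path only works because the current working directory is `Example_1`; run from anywhere else, the same `open` call would raise a `FileNotFoundError`.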

# Downstream and Upstream files 

__Downstream files__: Files that exist in the current working directory, or any of its subdirectories.

__Upstream files__: Files that exist in a higher-level directory than your current working directory.

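__Import path:__
The list of directories (`sys.path`) that Python searches when a program uses `import` to load another Python module. 
- Includes the directory containing the program being run
- Excludes upstream directories by default 

We can add an upstream directory to the import path using:
```python
import sys
sys.path.append('../')
```
Where one directory upstream is denoted by `../`.

The import path only affects `import`. The function `open` always resolves a local path against the current working directory, so an upstream file is opened by starting its path with `../`.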

Consider the file system below

```python
Week_6/
|
|--- Example_2/
        |
        |--- rainfall.csv
        |--- my_program/ 
               |
               |--- program_2.py
```
The file `rainfall.csv` is *upstream* of the program `program_2.py`.

We can open the file `rainfall.csv` in `program_2.py` using:
```python
# '../' means one directory upstream of the current working directory
# (assumes program_2.py is run from inside my_program/)
file = open('../rainfall.csv')
file.close()
```

The same approach can be used to access directories that are downstream of an upstream directory but not downstream of the current working directory:

```python
Week_6/
|
|--- Example_2/
        |
        |--- rainfall.csv
        |--- my_program/ 
               |
               |--- program_2.py
        |--- my_data/ 
               |
               |--- wind_speed.csv
```

We can open the file `wind_speed.csv` in `program_2.py` using:
```python
# Go up one directory ('../'), then down into my_data/
# (assumes program_2.py is run from inside my_program/)
file = open('../my_data/wind_speed.csv')
file.close()
```
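As a check on how upstream paths behave, the sketch below rebuilds the `Example_2` layout in a temporary directory, makes `my_program` the current working directory, and opens both files. It relies on `open` resolving a local path against the current working directory, with `..` meaning the directory one level up. The file contents are placeholders.

```python
import os
import tempfile

start_dir = os.getcwd()  # remember where we started
with tempfile.TemporaryDirectory() as tmp_dir:
    # Rebuild the Example_2 directory tree with placeholder contents.
    example_dir = os.path.join(tmp_dir, 'Example_2')
    os.makedirs(os.path.join(example_dir, 'my_program'))
    os.makedirs(os.path.join(example_dir, 'my_data'))
    with open(os.path.join(example_dir, 'rainfall.csv'), 'w') as f:
        f.write('rainfall\n')
    with open(os.path.join(example_dir, 'my_data', 'wind_speed.csv'), 'w') as f:
        f.write('wind_speed\n')

    # Pretend program_2.py is being run from inside my_program.
    os.chdir(os.path.join(example_dir, 'my_program'))
    try:
        with open('../rainfall.csv') as file:             # one level up
            rainfall = file.read()
        with open('../my_data/wind_speed.csv') as file:   # up, then down
            wind_speed = file.read()
    finally:
        os.chdir(start_dir)  # go back before the temporary directory is deleted

print(rainfall)
print(wind_speed)
```

No change to `sys.path` is needed for either file, because the route to each file is written into the path itself.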


### Need to see some more examples? 
https://w3schoolsua.github.io/python/python_file_handling_en.html#gsc.tab=0
<br>https://www.geeksforgeeks.org/file-handling-python/
<br>https://realpython.com/read-write-files-python/#file-paths

### Want to take a quiz?
https://realpython.com/quizzes/read-write-files-python/
<br>https://pynative.com/python-file-handling-quiz/

### Want some more advanced information?
https://pynative.com/python/file-handling/#:~:text=To%20read%20or%20write%20a,It%20returns%20the%20file%20object.