# SAO/LIP Python Primer Course Lecture 7

In this notebook, you will learn about:
- File paths and the `os` library
- I/O in base Python
- The `pandas` library
- Reading and viewing files in `pandas`
- Manipulating datasets

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/acorreia61201/SAOPythonPrimer/blob/main/lectures/Lecture7.ipynb)

At the end of our discussion on `numpy`, we started covering the concept of *I/O*, or *input/output*: reading data from files and writing data to files. I/O lets you save, share, and load large datasets. In this lecture, we'll cover two more ways to do this: using base Python, and using the external library `pandas`.

## File Paths

Before we get into that, it's important to understand how files are organized on your computer. This is vital for opening files that are not inside your current directory. We can use the built-in `os` library to see how this works.

In [1]:
import os

All files on an operating system are located in a *directory*, more commonly known as a folder. Directories can contain files or subdirectories, which themselves may have their own files and subdirectories. The sequence of directories describing where a file sits on your system is known as its *path*.
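As a minimal sketch of how a path breaks down into directories and a file name, the snippet below builds a path with `os.path.join` and takes it apart again. The names `course`, `week1`, and `notes.txt` are hypothetical; nothing is read from or written to disk.

```python
import os

# Build a path from its parts; os.path.join inserts the separator
# that is correct for the operating system.
path = os.path.join('course', 'week1', 'notes.txt')  # hypothetical names

parent = os.path.dirname(path)   # everything before the final component
name = os.path.basename(path)    # the final component

print(path)
print(parent)
print(name)
```

Joining `parent` and `name` again gives back the original `path`.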

This notebook, for example, has a path. When you run it, the directory it is stored in is typically the *current working directory*, the directory Python uses as its starting point when looking for files. In Python, we can view the current working directory with `os.getcwd()`:

In [2]:
lecs = os.getcwd()
lecs

'/home/acorreia7/SAOPythonPrimer/lectures'

This current working directory is populated with all of the lectures for this week. To view them all, we use the *list* command, which we call in Python with `os.listdir()`:

In [3]:
os.listdir()

['Lecture3.ipynb',
 '.ipynb_checkpoints',
 'Lecture1.ipynb',
 'Lecture5.ipynb',
 'plt_sine_eg.png',
 'Lecture7.ipynb',
 'Lecture4.ipynb',
 'Lecture2.ipynb',
 'Lecture6.ipynb']
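`os.listdir()` returns entries in arbitrary order, and it does not say which entries are files and which are directories. A small sketch that sorts the listing and labels each entry, run on whatever directory you are currently in (so your output will differ from the listing above):

```python
import os

entries = sorted(os.listdir())   # sort for a stable, readable order

for entry in entries:
    # os.path.isdir returns True when the entry is a directory
    kind = 'dir ' if os.path.isdir(entry) else 'file'
    print(kind, entry)
```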

Every file in a directory has a unique *file name*. The file name usually ends in a *file extension*, the part after the last period, which tells your operating system how to interpret and open the file. For example, extensions like `png` and `jpg` are interpreted as images, while extensions like `txt` are interpreted as plain-text files. Each lecture above has the extension `ipynb`, the standard for a Jupyter notebook.
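To pull the extension out of a file name in code, one option is `os.path.splitext`. The sketch below applies it to three names from the listing above. Note that the extension it returns includes the leading period, and that a name starting with a period (like `.ipynb_checkpoints`) is treated as having no extension.

```python
import os

names = ['Lecture7.ipynb', 'plt_sine_eg.png', '.ipynb_checkpoints']

# splitext returns a (stem, extension) pair for each name
parts = [os.path.splitext(name) for name in names]

for stem, ext in parts:
    print(repr(stem), repr(ext))
```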

We can create a new directory in the current working directory using `os.mkdir()`:

In [4]:
os.mkdir('example_dir')

We can see the new directory using `os.listdir()`:

In [5]:
os.listdir()

['Lecture3.ipynb',
 '.ipynb_checkpoints',
 'Lecture1.ipynb',
 'Lecture5.ipynb',
 'plt_sine_eg.png',
 'Lecture7.ipynb',
 'example_dir',
 'Lecture4.ipynb',
 'Lecture2.ipynb',
 'Lecture6.ipynb']

Notice that if we try to create a directory whose name is already taken, we get a `FileExistsError`:

In [6]:
os.mkdir('example_dir')

FileExistsError: [Errno 17] File exists: 'example_dir'
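If a cell may be run more than once, this error can be avoided. The sketch below shows two common options: catching `FileExistsError` with `try`/`except`, and calling `os.makedirs` with `exist_ok=True`. It works inside a temporary directory (created with the standard `tempfile` module, which this lecture does not otherwise use) so it leaves nothing behind; the name `demo_dir` is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, 'demo_dir')  # hypothetical name

    os.mkdir(target)                 # first call succeeds

    try:
        os.mkdir(target)             # second call raises FileExistsError
        raised = False
    except FileExistsError:
        raised = True

    # exist_ok=True makes the call do nothing if the directory already exists
    os.makedirs(target, exist_ok=True)

    still_there = os.path.isdir(target)

print(raised, still_there)
```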

We can move our current working directory to this new directory using `os.chdir()`:

In [7]:
os.chdir('example_dir')
os.getcwd()

'/home/acorreia7/SAOPythonPrimer/lectures/example_dir'

Let's see what's in this new directory:

In [8]:
os.listdir()

[]

It returns an empty list. This makes sense; we haven't added anything to it yet. We can, however, view what's in the previous directory (or any directory on the system, for that matter) using an *absolute path*, a path that starts from the top of the file system. The variable `lecs` we saved earlier holds one:

In [9]:
os.listdir(lecs)

['Lecture3.ipynb',
 '.ipynb_checkpoints',
 'Lecture1.ipynb',
 'Lecture5.ipynb',
 'plt_sine_eg.png',
 'Lecture7.ipynb',
 'example_dir',
 'Lecture4.ipynb',
 'Lecture2.ipynb',
 'Lecture6.ipynb']
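To check in code whether a string is an absolute path, `os.path.isabs` can be used, and `os.path.join` can build the full path to each entry in a listing. A short sketch using the current working directory (the variable names `here` and `full_paths` are just for illustration):

```python
import os

here = os.getcwd()
print(os.path.isabs(here))   # os.getcwd() gives an absolute path

# Full paths to everything in the current working directory
full_paths = [os.path.join(here, entry) for entry in os.listdir(here)]
for p in full_paths:
    print(p)
```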

There are two special strings that act as *relative paths*, meaning they are interpreted relative to the current working directory. The string `.` refers to the current working directory itself:

In [10]:
os.listdir('.')

[]

The string `..` refers to the parent directory, the one directly above the current working directory:

In [11]:
os.listdir('..')

['Lecture3.ipynb',
 '.ipynb_checkpoints',
 'Lecture1.ipynb',
 'Lecture5.ipynb',
 'plt_sine_eg.png',
 'Lecture7.ipynb',
 'example_dir',
 'Lecture4.ipynb',
 'Lecture2.ipynb',
 'Lecture6.ipynb']
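To see which directories `.` and `..` actually stand for, one option is `os.path.abspath`, which converts a relative path into an absolute one based on the current working directory. A minimal sketch:

```python
import os

current = os.path.abspath('.')    # absolute form of the current working directory
parent = os.path.abspath('..')    # absolute form of its parent

print(current)
print(parent)
```

The second path printed should be the first one with its final directory removed.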

We can use this to easily move up one directory in the path:

In [13]:
os.chdir('..')
os.getcwd()

'/home/acorreia7/SAOPythonPrimer/lectures'
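Because `os.chdir()` changes the working directory for the whole Python session, it is easy to forget to move back. One possible pattern, sketched here with a hypothetical helper named `list_in`, remembers the starting directory and restores it in a `finally` block, which runs even if something goes wrong:

```python
import os

def list_in(directory):
    """List `directory` after moving into it, then return to where we started."""
    start = os.getcwd()
    try:
        os.chdir(directory)
        return os.listdir()
    finally:
        os.chdir(start)    # runs whether or not an error occurred

before = os.getcwd()
contents = list_in('..')
after = os.getcwd()

print(before == after)
```

For simply listing a directory, passing the path straight to `os.listdir()` as shown earlier is shorter; the pattern is more useful when several steps need to run from inside another directory.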