# Filesystem

Reading and writing files requires some understanding of the computer's filesystem.

Example directory structure (Linux, Unix, macOS):
```
/                    # Root
  home/              # Directory or Folder
    alice/           # Sub-directory or Sub-folder
      fileA.csv
      project1/
        fileB.csv
        fileC.csv
      project2/
        fileD.csv
    bob/
      file1.txt
      projectA/
        file2.txt
```
Windows uses `\` instead of `/` as the path separator, and the root is a drive letter such as `C:\`.

### Absolute and relative paths

***Absolute paths*** start with the root directory (`/` or `C:\`)
* The *absolute path* to fileA.csv is `/home/alice/fileA.csv` (Linux, Unix, macOS) or `C:\home\alice\fileA.csv` (Windows)

***Relative paths*** start from the current directory, not root.

If the current directory is `alice/`...
* the *relative path* to fileA.csv is `fileA.csv` 
* the *relative path* to fileB.csv is `project1/fileB.csv`  (Windows: `project1\fileB.csv`)
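
The difference between the two kinds of path can be checked in Python. The sketch below uses `pathlib`, a standard-library module that this notebook does not otherwise cover, so treat it as an optional aside. The "pure" path classes only work on the text of the path and never look at the disk, so the example should behave the same on any operating system.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Paths taken from the example directory tree above
posix_abs = PurePosixPath('/home/alice/fileA.csv')
windows_abs = PureWindowsPath(r'C:\home\alice\fileA.csv')
posix_rel = PurePosixPath('project1/fileB.csv')

# Absolute paths start at the root; relative paths do not
print(posix_abs.is_absolute())    # True
print(windows_abs.is_absolute())  # True
print(posix_rel.is_absolute())    # False
```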

Shortcuts in relative paths:
* `.` means current directory
* `..` means parent of current directory

If the current directory is `project1/`...
* the *relative path* to fileA.csv is `../fileA.csv`  
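
As a rough check of the `.` and `..` shortcuts, the standard-library `posixpath` module (the `/`-separated flavour of `os.path`) can compute and simplify such paths as plain text. The directories below come from the example tree and do not need to exist on your computer.

```python
import posixpath  # the POSIX flavour of os.path; uses '/' on any OS

# Absolute locations taken from the example tree above
project1 = '/home/alice/project1'
file_a = '/home/alice/fileA.csv'

# Relative path to fileA.csv when the current directory is project1/
rel = posixpath.relpath(file_a, start=project1)
print(rel)   # ../fileA.csv

# '.' and '..' are resolved when a path is normalised
up_one = posixpath.normpath('/home/alice/project1/../fileA.csv')
here = posixpath.normpath('./project1/fileB.csv')
print(up_one)   # /home/alice/fileA.csv
print(here)     # project1/fileB.csv
```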

#### Exercise (basic)
* What is the absolute path to file2.txt?
* What is the relative path from `bob/` to file2.txt?
* What is the relative path from `projectA/` to file1.txt?

#### Exercise (advanced)
* What is the relative path from `project2/` to fileA.csv?
* What is the relative path from `project2/` to file1.txt?

## Navigating the filesystem in Python

The `os` module provides functions for navigating the filesystem.

In [None]:
import os

# Current working directory
os.getcwd()

In [None]:
# Change working directory to 'datasets' subdirectory
os.chdir('datasets')
# os.chdir() also works with absolute paths

# Show the new location
os.getcwd()

In [None]:
# Change back to parent directory
os.chdir('..')
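
Changing directory and then going back with `'..'` only works if nothing goes wrong in between. One possible pattern, sketched here with a hypothetical helper called `working_directory` (not part of the `os` module), is to remember the starting directory and always return to it. A temporary directory stands in for a real one so the example can run anywhere.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily change the working directory (hypothetical helper)."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always go back, even if the code inside the block raised an error
        os.chdir(previous)

start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:   # stand-in for a real directory
    with working_directory(tmp):
        inside = os.getcwd()                 # we are inside tmp here

# Back where we started
print(os.getcwd() == start)   # True
```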

In [None]:
# List contents (files and subdirectories) of current directory
os.listdir()

In [None]:
# List contents of a particular directory
os.listdir(path='datasets')
# absolute paths work too

The `os.path` module contains functions to work with file paths.

In [None]:
# Path to a file that may or may not exist
path = 'datasets/titanic.csv'

# Check if the file or directory exists
if os.path.exists(path):
    print('The file or directory exists!')

In [None]:
# Split a path into its directory and file names
print(f'Filename:  {os.path.basename(path)}')
print(f'Directory: {os.path.dirname(path)}')
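
A few other `os.path` functions are often useful alongside `basename` and `dirname`. The short sketch below builds the same `datasets/titanic.csv` path with `os.path.join`, which inserts the separator of the current operating system, then splits off the extension and converts the path to an absolute one. The file does not need to exist for any of these calls.

```python
import os

# Build a path with the separator of the current operating system
path = os.path.join('datasets', 'titanic.csv')

# Split the filename into its stem and extension
stem, extension = os.path.splitext(os.path.basename(path))
print(stem, extension)   # titanic .csv

# Convert a relative path to an absolute one (based on os.getcwd())
absolute = os.path.abspath(path)
print(os.path.isabs(path))       # False
print(os.path.isabs(absolute))   # True
```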

### Find files matching a pattern with `glob`

Sometimes we only want the files that match a pattern, for example all files ending in '.csv'. The [`glob`](https://docs.python.org/3/library/glob.html) module does this.

Wildcard pattern matching rules:
* `?` matches any *single* character
* `*` matches zero or more characters
* `[1-4]` matches one digit: 1, 2, 3, or 4
* `[c-e]` matches one letter: c, d, or e
* `[C-E]` matches one letter: C, D, or E

Examples
* `*.csv` is all files ending with '.csv'
* `file[2-4].csv` will find file2.csv, file3.csv, and file4.csv, but not file1.csv.
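* `glob.glob('*')` finds all files and subdirectories in the current directory except hidden ones (names starting with `.`), which `os.listdir()` does include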
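
Patterns can be tried out on a plain list of names, without touching the disk, using the standard-library `fnmatch` module, which `glob` relies on for filename matching. The filenames below are made up for illustration.

```python
import fnmatch

# Made-up filenames, used only to try the patterns
names = ['file1.csv', 'file2.csv', 'file3.csv', 'file10.csv', 'notes.txt']

all_csv = fnmatch.filter(names, '*.csv')
two_to_four = fnmatch.filter(names, 'file[2-4].csv')
one_char = fnmatch.filter(names, 'file?.csv')

print(all_csv)       # ['file1.csv', 'file2.csv', 'file3.csv', 'file10.csv']
print(two_to_four)   # ['file2.csv', 'file3.csv']
print(one_char)      # ['file1.csv', 'file2.csv', 'file3.csv']
```

Note that `file?.csv` does not match file10.csv, because `?` stands for exactly one character.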

`glob` returns a list in arbitrary order, so sort it if the order matters.

In [None]:
import glob

# Find all files with extension .ipynb
ipynb_files = glob.glob('*.ipynb')

# Sort the filenames
ipynb_files.sort()

print(ipynb_files)
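
A pattern may also include a directory. The sketch below creates a temporary directory with a few made-up, empty files so that it can run anywhere. It shows that the matches keep the directory prefix, which `os.path.basename` can strip off again.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files with made-up names
    for name in ['b.csv', 'a.csv', 'notes.txt']:
        open(os.path.join(tmp, name), 'w').close()

    # Matches include the directory part of the pattern
    matches = sorted(glob.glob(os.path.join(tmp, '*.csv')))

    # Keep only the filenames
    csv_names = [os.path.basename(m) for m in matches]

print(csv_names)   # ['a.csv', 'b.csv']
```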

#### Exercise

Write a `glob` command to find files that match these patterns:
* all files in the current directory whose names contain the letter 'm'
* all files whose names contain 'm' and end with '.ipynb'
* all files in the 'datasets' subdirectory

In [None]:
# Test your code here

# Magic commands

Jupyter ["magic commands"](https://ipython.readthedocs.io/en/stable/interactive/magics.html) enable limited interaction with the operating system outside of Python. They work within Jupyter notebooks and IPython, but not in pure Python programs. Magic commands begin with `%` or `%%`. 

| Command | Purpose | Example |
|---------|---------|---------|
| `%pwd` | print the current working directory | |
| `%ls` | list contents of the current directory | |
| `%cd` | change the current working directory | `%cd datasets`, `%cd -` (last visited directory) |
| `%conda` | run the conda package manager | `%conda install xarray` |
| `%timeit` | time execution of a single command | `%timeit import numpy as np` |
| `%%timeit` | time execution of a cell | |
| `%who` | list all variables | |
| `%whos` | list all variables, with more info | |
See the [documentation](https://ipython.readthedocs.io/en/stable/interactive/magics.html) for other magic commands.

In [None]:
# Current working directory
%pwd

In [None]:
# Contents of current directory
%ls

In [None]:
# Change directory
%cd datasets

In [None]:
# Content of new directory
%ls

In [None]:
# Change to previous directory
%cd -

In [None]:
%%timeit
l = []
for i in range(1000):
    l.append(i**2)
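
Because magic commands do not work in pure Python programs, a script needs another way to time code. One option is the standard-library `timeit` module. The sketch below wraps the same loop in a function (the name `squares` is only for illustration) and times 1000 calls. The numbers will differ from machine to machine and from what `%%timeit` reports.

```python
import timeit

def squares():
    """Same loop as the %%timeit cell above, wrapped in a function."""
    result = []
    for i in range(1000):
        result.append(i**2)
    return result

# Total time in seconds for 1000 calls of squares()
n_calls = 1000
seconds = timeit.timeit(squares, number=n_calls)

print(f'{seconds / n_calls * 1e6:.1f} microseconds per call')
```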