## Paths

Separators
- Windows uses `\` 
- Unix uses `/`

If a file name contains a space, you need to escape the space with a backslash in the shell:

`my\ file.txt`

Solution = don't put spaces in file names in the first place
- use `-`

`my-file.txt`
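
When a name with spaces cannot be avoided, the standard library can do the quoting for you. A minimal sketch using `shlex.quote` (the file names are only illustrative):

```python
import os
import shlex

# The path separator for the current operating system
# (a forward slash on Unix, a backslash on Windows)
print(os.sep)

# shlex.quote returns a string that is safe to paste into a POSIX shell
with_space = shlex.quote('my file.txt')
without_space = shlex.quote('my-file.txt')

print(with_space)     # 'my file.txt'  (wrapped in single quotes)
print(without_space)  # my-file.txt    (no quoting needed)
```

The name with a hyphen passes through unchanged, which is one reason to prefer it.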

In [1]:
import os

os.getcwd()

'/home/stas/dsr/dsr-classes/python/basics'

## Importing packages

https://realpython.com/absolute-vs-relative-python-imports/

Module = any file with a `.py` extension

Package = folder with modules in it

We can import packages in various ways:

In [2]:
import numpy

from numpy import array

import numpy as np
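
The three import styles all bind the same objects, just under different names. A small sketch using the standard library `math` module (chosen here only so it runs without installing anything):

```python
import math             # access as math.sqrt
from math import sqrt   # access as sqrt
import math as m        # access as m.sqrt

print(math.sqrt(16))  # 4.0
print(sqrt(16))       # 4.0
print(m.sqrt(16))     # 4.0

# All three names point at the same function object
same = math.sqrt is sqrt is m.sqrt
print(same)  # True
```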

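How does Python know where to look for these packages?  It searches the directories listed in `sys.path`, which is built from the current directory, the `PYTHONPATH` environment variable and the installation defaults: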

In [3]:
import sys

sys.path

['/home/stas/anaconda3/envs/dsr/lib/python36.zip',
 '/home/stas/anaconda3/envs/dsr/lib/python3.6',
 '/home/stas/anaconda3/envs/dsr/lib/python3.6/lib-dynload',
 '',
 '/home/stas/anaconda3/envs/dsr/lib/python3.6/site-packages',
 '/home/stas/anaconda3/envs/dsr/lib/python3.6/site-packages/IPython/extensions',
 '/home/stas/.ipython']

If we want to import from Python scripts that are not on this path, we first need to add their directory.  Without that, the import fails:

In [4]:
#  this won't work
from example import hello_world

ModuleNotFoundError: No module named 'example'

If we add `dsr-classes/python/import-example` to `sys.path`, we can load modules from it:

In [5]:
sys.path.append('../import-example')

#  works now
from example import hello_world

hello_world()

Hello world!


We can see why it works by inspecting `sys.path` again:

In [6]:
sys.path

['/home/stas/anaconda3/envs/dsr/lib/python36.zip',
 '/home/stas/anaconda3/envs/dsr/lib/python3.6',
 '/home/stas/anaconda3/envs/dsr/lib/python3.6/lib-dynload',
 '',
 '/home/stas/anaconda3/envs/dsr/lib/python3.6/site-packages',
 '/home/stas/anaconda3/envs/dsr/lib/python3.6/site-packages/IPython/extensions',
 '/home/stas/.ipython',
 '../import-example']
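
The same mechanism can be tried without the course repository. This is a sketch under assumptions: `demo_example` is a hypothetical module name, written into a temporary directory so the example is self-contained:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny module into a directory that is not on sys.path
    Path(tmp, 'demo_example.py').write_text(
        "def hello_world():\n    return 'Hello world!'\n"
    )

    sys.path.append(tmp)
    importlib.invalidate_caches()  # make sure the new file is seen
    demo_example = importlib.import_module('demo_example')
    message = demo_example.hello_world()

    sys.path.remove(tmp)  # tidy up so the change does not leak

print(message)  # Hello world!
```

Changes to `sys.path` only last for the current Python session, so they need to be repeated in every notebook or script that relies on them.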

## The `$HOME` environment variable

This is a Unix environment variable - we can view it using the bash command `echo`:

In [7]:
#  ! = run bash command in Jupyter
!echo $HOME

/home/stas


We can also access it in Python:

In [8]:
home = os.environ['HOME']

home

'/home/stas'
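
`os.environ['HOME']` raises a `KeyError` when the variable is not set, which is typically the case on Windows. One possible fallback, sketched here, is `os.path.expanduser`:

```python
import os

# .get returns the fallback instead of raising KeyError when HOME is missing;
# os.path.expanduser('~') resolves the home directory on Unix and Windows
home = os.environ.get('HOME', os.path.expanduser('~'))

print(home)
```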

Using this directory to store data is useful - paths built from `$HOME` keep your notebooks & packages portable to any Unix machine, including cloud instances.

Let's make a directory.  We can do this using:

In [9]:
os.makedirs?

Signature: os.makedirs(name, mode=511, exist_ok=False)
Docstring:
makedirs(name [, mode=0o777][, exist_ok=False])

Super-mkdir; create a leaf directory and all intermediate ones.  Works like
mkdir, except that any intermediate path segment (not just the rightmost)
will be created if it does not exist. If the target directory already
exists, raise an OSError if exist_ok is False. Otherwise no exception is
raised.  This is recursive.
File:      ~/anaconda3/envs/dsr/lib/python3.6/os.py
Type:      function


We need to pass a path into `makedirs`.

A fragile way to do this (it works, but is not recommended) is to add the strings together
- one problem is that you have to hard-code the correct separator for the operating system

In [10]:
path = home + '/learning-python'
path

'/home/stas/learning-python'
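
To see how easily string concatenation goes wrong, here is a small sketch. It uses `posixpath`, the Unix implementation behind `os.path`, so the output is the same on every operating system:

```python
import posixpath

home = '/home/stas'

# Forgetting the separator silently produces the wrong path
forgot_separator = home + 'learning-python'
print(forgot_separator)  # /home/staslearning-python

# join inserts exactly one separator, whether or not home ends with one
joined = posixpath.join(home, 'learning-python')
joined_trailing = posixpath.join(home + '/', 'learning-python')
print(joined)           # /home/stas/learning-python
print(joined_trailing)  # /home/stas/learning-python
```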

## `os.path`

`os.path` is the classic & common way to deal with paths in Python.

In [13]:
#dir(os.path)

We can use `os.path.join` to form our path:

In [14]:
path = os.path.join(home, 'learning-python')

os.path.exists(path)

False

Let's make our directory:

In [15]:
os.makedirs(path, exist_ok=True)

os.path.exists(path)

True
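
`os.makedirs` also creates any missing intermediate folders. A sketch in a temporary directory (the folder names are made up for illustration) so nothing is left behind in `$HOME`:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # join accepts any number of path segments
    nested = os.path.join(tmp, 'learning-python', 'data', 'raw')

    existed_before = os.path.exists(nested)
    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)  # second call does nothing, no error
    is_dir_after = os.path.isdir(nested)

print(existed_before, is_dir_after)  # False True

# Taking a path apart again
print(os.path.basename(nested))                   # raw
print(os.path.basename(os.path.dirname(nested)))  # data
```

Without `exist_ok=True` the second `makedirs` call would raise an `OSError`, as the docstring above describes.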

Another common use case is iterating over all files in a directory - we can get a list using `os.listdir`:

In [18]:
os.listdir(home)

['.gconf',
 '.thunderbird',
 '.PyCharmCE2019.3',
 '.jupyter',
 '.ssh',
 '.java',
 '.profile',
 '.pam_environment',
 'Music',
 '.gnome',
 '.local',
 '.vim',
 '.gnupg',
 '.bash_history',
 '.bash_logout',
 'PycharmProjects',
 '.viminfo',
 'Desktop',
 'anaconda3',
 'Videos',
 '.wget-hsts',
 'snap',
 '.cache',
 '.ipython',
 '.keras',
 '.gitconfig',
 'learning-python',
 '.boto',
 '.config',
 'sumo-1.3.1',
 '.ICEauthority',
 '.bashrc',
 '.pki',
 'Templates',
 '.git-credentials',
 '.gsutil',
 'Documents',
 'Pictures',
 '.sudo_as_admin_successful',
 'Downloads',
 '.mozilla',
 'Jupyter',
 'dsr',
 'Public',
 '.conda']
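
`os.listdir` returns bare names (not full paths) in arbitrary order, and mixes files, folders and hidden entries. A sketch of one way to filter the result, using made-up entries in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Illustrative contents: one folder, one hidden folder, one file
    os.makedirs(os.path.join(tmp, 'Documents'))
    os.makedirs(os.path.join(tmp, '.cache'))
    open(os.path.join(tmp, 'notes.txt'), 'w').close()

    names = sorted(os.listdir(tmp))

    # Join each name back onto the directory before testing it
    visible_dirs = [
        name for name in names
        if not name.startswith('.') and os.path.isdir(os.path.join(tmp, name))
    ]

print(names)         # ['.cache', 'Documents', 'notes.txt']
print(visible_dirs)  # ['Documents']
```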

The problem with `os.listdir` is that it only lists one directory - to list a directory tree recursively we can use `os.walk`:

In [24]:
for root, dirs, files in os.walk('../'):
    print(root, dirs, files)

../ ['top-down', 'basics', '.ipynb_checkpoints', 'assets', 'import-example'] ['readme.md']
../top-down ['.ipynb_checkpoints'] ['using-an-api.ipynb', 'linear-programming.ipynb', 'readme.md', 'web-scraping.ipynb']
../top-down/.ipynb_checkpoints [] ['web-scraping-checkpoint.ipynb', 'linear-programming-checkpoint.ipynb']
../basics ['__pycache__', '.ipynb_checkpoints'] ['2.pep8.ipynb', '4.paths-and-importing.ipynb', '6.dicts-and-sets.ipynb', '7.functions.ipynb', '5.iterables-and-files.ipynb', '8.classes.ipynb', 'readme.md', '3.strings.ipynb', 'answers.py', '1.intro.ipynb']
../basics/__pycache__ [] ['answers.cpython-36.pyc']
../basics/.ipynb_checkpoints [] ['7.functions-checkpoint.ipynb', '4.paths-and-importing-checkpoint.ipynb', 'answers-checkpoint.py', '3.strings-checkpoint.ipynb', '1.intro-checkpoint.ipynb', '8.classes-checkpoint.ipynb', '2.pep8-checkpoint.ipynb', '6.dicts-and-sets-checkpoint.ipynb', '5.iterables-and-files-checkpoint.ipynb']
../.ipynb_checkpoints [] ['readme-checkpoint.md
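
In practice you often want full file paths and want to skip folders such as `.ipynb_checkpoints`. A sketch of both, run against a small made-up tree in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny illustrative tree
    os.makedirs(os.path.join(tmp, 'basics', '.ipynb_checkpoints'))
    for rel in ['readme.md',
                'basics/answers.py',
                'basics/.ipynb_checkpoints/answers-checkpoint.py']:
        open(os.path.join(tmp, *rel.split('/')), 'w').close()

    found = []
    for root, dirs, files in os.walk(tmp):
        # Editing dirs in place stops os.walk descending into those folders
        dirs[:] = [d for d in dirs if d != '.ipynb_checkpoints']
        for name in files:
            full = os.path.join(root, name)
            # Store paths relative to tmp, with / so output matches everywhere
            found.append(os.path.relpath(full, tmp).replace(os.sep, '/'))

found = sorted(found)
print(found)  # ['basics/answers.py', 'readme.md']
```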

## `pathlib`

[Python 3's pathlib Module: Taming the File System - Real Python](https://realpython.com/python-pathlib/)

`pathlib` was introduced in Python 3.4.  It takes an object-oriented approach, centered around a `Path` object:

In [25]:
from pathlib import Path

#  In Unix, `.` refers to the current working directory
p = Path('./intro.ipynb')

p

PosixPath('intro.ipynb')

We can get the file extension:

In [26]:
p.suffix

'.ipynb'

The filename without its extension:

In [27]:
p.stem

'intro'

The user's `$HOME` (`home` is a class method, so `Path.home()` works too):

In [28]:
p.home()

PosixPath('/home/stas')
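
The attributes that take a path apart can be seen side by side in a short sketch. It uses `PurePosixPath`, which only manipulates the string and never touches the file system, and an illustrative path:

```python
from pathlib import PurePosixPath

p = PurePosixPath('/home/stas/dsr/1.intro.ipynb')

print(p.name)    # 1.intro.ipynb
print(p.stem)    # 1.intro  (only the last suffix is removed)
print(p.suffix)  # .ipynb
print(p.parent)  # /home/stas/dsr

# with_suffix returns a new path, the original is unchanged
as_script = p.with_suffix('.py')
print(as_script)  # /home/stas/dsr/1.intro.py
```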

We can look at the methods & attributes on the `Path` object that don't have an `_` anywhere in their name (this also hides useful methods such as `is_dir` and `read_text`):

In [29]:
[f for f in dir(p) if '_' not in f]

['absolute',
 'anchor',
 'chmod',
 'cwd',
 'drive',
 'exists',
 'expanduser',
 'glob',
 'group',
 'home',
 'iterdir',
 'joinpath',
 'lchmod',
 'lstat',
 'match',
 'mkdir',
 'name',
 'open',
 'owner',
 'parent',
 'parents',
 'parts',
 'rename',
 'replace',
 'resolve',
 'rglob',
 'rmdir',
 'root',
 'samefile',
 'stat',
 'stem',
 'suffix',
 'suffixes',
 'touch',
 'unlink']
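
To keep public names that contain an underscore, one option is to filter only on a leading `_`. A minimal sketch:

```python
from pathlib import Path

p = Path('.')

# Drop private and dunder names, keep names such as is_dir and read_text
public = [name for name in dir(p) if not name.startswith('_')]

print('is_dir' in public, 'read_text' in public)  # True True
print(p.is_dir())  # True, the current working directory is a directory
```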

We can create files using `touch`:

In [30]:
p = Path('./test.temp')
p.touch()
!ls

1.intro.ipynb	 4.paths-and-importing.ipynb  7.functions.ipynb  __pycache__
2.pep8.ipynb	 5.iterables-and-files.ipynb  8.classes.ipynb	 readme.md
3.strings.ipynb  6.dicts-and-sets.ipynb       answers.py	 test.temp


And delete files using `unlink`:

In [31]:
p.unlink()
!ls

1.intro.ipynb	 4.paths-and-importing.ipynb  7.functions.ipynb  __pycache__
2.pep8.ipynb	 5.iterables-and-files.ipynb  8.classes.ipynb	 readme.md
3.strings.ipynb  6.dicts-and-sets.ipynb       answers.py
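
Instead of reading the `ls` output, we can check with `exists`. A sketch of the same create and delete cycle inside a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / 'test.temp'

    p.touch()                  # create an empty file
    after_touch = p.exists()

    p.unlink()                 # delete it again
    after_unlink = p.exists()

print(after_touch, after_unlink)  # True False
```

Calling `unlink` on a file that does not exist raises `FileNotFoundError`; from Python 3.8 onwards, `unlink(missing_ok=True)` suppresses that.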


We can read files without a context manager (no `with open(...)` block):

In [32]:
Path('./readme.md').read_text()

'A series of notebooks designed to teach Python from the bottom up.  The notes are designed for students with no Python experience.\n\n## Further reading\n\n[My personal collection of Python resources](https://github.com/ADGEfficiency/programming-resources/tree/master/python)\n\n[The Python Tutorial](https://docs.python.org/3/tutorial/)\n\n[An Effective Python Environment: Making Yourself at Home - Real Python](https://realpython.com/effective-python-environment/)\n'
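
`write_text` is the matching method for writing. The sketch below writes a file and reads it back in one call and with a context manager, to show that both give the same result (the file content is illustrative):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / 'readme.md'

    # write_text opens, writes and closes the file in one call
    p.write_text('## Further reading\n', encoding='utf-8')

    one_call = p.read_text(encoding='utf-8')

    # The equivalent using a context manager
    with p.open(encoding='utf-8') as f:
        with_context = f.read()

print(one_call == with_context)  # True
```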

Joining paths can be done using the `/` operator (Python's division syntax):

In [33]:
Path.home() / 'test_dir'

PosixPath('/home/stas/test_dir')
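
The `/` operator can be chained, and `pathlib` has its own counterparts to `os.makedirs` and `os.walk`. A sketch in a temporary directory, with made-up folder and file names:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)

    # Chained / is equivalent to joinpath with several segments
    target = base / 'test_dir' / 'nested'
    same = target == base.joinpath('test_dir', 'nested')

    # Counterpart of os.makedirs(..., exist_ok=True)
    target.mkdir(parents=True, exist_ok=True)

    (target / 'a.py').touch()
    (base / 'b.py').touch()

    # rglob searches recursively, much like looping over os.walk
    found = sorted(f.relative_to(base).as_posix() for f in base.rglob('*.py'))

print(same)   # True
print(found)  # ['b.py', 'test_dir/nested/a.py']
```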

## Exercise

In your `$HOME` directory, write a loop that:
- creates a folder `practice`
- creates 10 folders inside this directory (`practice/0`, `practice/1` ...)
- creates a `.py` file inside each, named with double the folder number (`practice/0/0.py`, `practice/1/2.py`, `practice/2/4.py` ...)

Write a second loop that:
- gets the names of all the files you created
- copies a file into `practice` if the number in its name is evenly divisible by 4 (`4.py`, `8.py` etc.)

Then remove all the numbered folders (`practice/0`, `practice/1` ...)