# Introduction to python for hydrologists &mdash; sys, os, and shutil
These four packages are part of the standard Python library and provide very useful functionality for working with your operating system and files.  This notebook explores these packages and demonstrates some of their functionality.  Online documentation is at [sys](https://docs.python.org/2/library/sys.html "sys doc"), [os](https://docs.python.org/2/library/os.html "os doc"), [shutil](https://docs.python.org/2/library/shutil.html "shutil doc"), and [subprocess](https://docs.python.org/2/library/subprocess.html "subprocess doc").

Important things to cover:
* sys: platform
* os: path, chdir, getcwd, listdir
* shutil: copy, copytree, rmtree
  
This notebook was modified from a USGS Intro to python for hydrologists course: https://github.com/mnfienen/python-usgs-training

In [1]:
import sys
import os
import shutil

## Sys Module

System-specific parameters and functions.

The following cells simply print some of the sys methods and attributes that you might find useful.

When working on collaborative projects, you often will run into issues with different operating systems. For example, Windows uses \ in paths whereas Mac and Linux use /. ```sys.platform``` detects the platform, which is handy for OS-aware programming.

In [10]:
sys.platform

'darwin'
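
As a minimal sketch of OS-aware programming, the cell below branches on the value of `sys.platform`. The helper name `describe_platform` is hypothetical (it is not part of the course material), and the prefixes checked are the common values `'win32'`, `'darwin'`, and `'linux'`; other platforms fall through to a generic label.

```python
import sys

def describe_platform(platform_name=sys.platform):
    """Return a human-readable OS name for a sys.platform string.

    Hypothetical helper for illustration only.
    """
    if platform_name.startswith('win'):
        return 'Windows'
    elif platform_name == 'darwin':
        return 'macOS'
    elif platform_name.startswith('linux'):
        return 'Linux'
    # anything else (e.g. 'cygwin') is reported as-is
    return 'other ({})'.format(platform_name)

print(describe_platform())
```

In practice you rarely need to branch on the platform just to build paths; `os.path.join` picks the right separator for you.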

## sys.path

If you haven't seen `sys.path` mentioned in a Python script already, you will soon.  `sys.path` is a list of directories that Python searches, in order, when you import a module or package.  If you want to use a Python package that is not installed in the main Python folder, you can add the directory containing your module to `sys.path`.

In [11]:
print(sys.path)

# Or more elegantly
for pth in sys.path:
    print(pth)

['/Users/aleaf/Documents/GitHub/python-usgs-training/notebooks/part1_python_intro', '/Users/aleaf/anaconda3/envs/pyclass/lib/python37.zip', '/Users/aleaf/anaconda3/envs/pyclass/lib/python3.7', '/Users/aleaf/anaconda3/envs/pyclass/lib/python3.7/lib-dynload', '', '/Users/aleaf/anaconda3/envs/pyclass/lib/python3.7/site-packages', '/Users/aleaf/Documents/GitHub/flopy', '/Users/aleaf/Documents/GitHub/modflow-export', '/Users/aleaf/Documents/GitHub/gisutils', '/Users/aleaf/anaconda3/envs/pyclass/lib/python3.7/site-packages/IPython/extensions', '/Users/aleaf/.ipython']
/Users/aleaf/Documents/GitHub/python-usgs-training/notebooks/part1_python_intro
/Users/aleaf/anaconda3/envs/pyclass/lib/python37.zip
/Users/aleaf/anaconda3/envs/pyclass/lib/python3.7
/Users/aleaf/anaconda3/envs/pyclass/lib/python3.7/lib-dynload

/Users/aleaf/anaconda3/envs/pyclass/lib/python3.7/site-packages
/Users/aleaf/Documents/GitHub/flopy
/Users/aleaf/Documents/GitHub/modflow-export
/Users/aleaf/Documents/GitHub/gisutils
/

A common way that we add a folder to sys.path is as follows:

    pathtomymodule = os.path.join('..')
    if pathtomymodule not in sys.path:
        sys.path.append(pathtomymodule)

This will allow us to import any modules or packages that are up one directory from the current working directory.  Keep this in mind as we use this throughout the class exercises.
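
To see the effect without touching your own folders, here is a small self-contained sketch. It writes a tiny module into a temporary directory, appends that directory to `sys.path`, and then imports the module. The module name `my_hydro_tools` and the function `cfs_to_cms` are made up for this illustration.

```python
import os
import sys
import tempfile

# create a throwaway folder holding a tiny module (hypothetical name)
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'my_hydro_tools.py'), 'w') as f:
    f.write('def cfs_to_cms(q_cfs):\n'
            '    return q_cfs * 0.0283168\n')

# same pattern as above: only add the folder if it is not already listed
if module_dir not in sys.path:
    sys.path.append(module_dir)

# the import now succeeds because python searches every entry in sys.path
import my_hydro_tools
print(my_hydro_tools.cfs_to_cms(100.0))
```

Changes to `sys.path` only last for the current Python session, so the append has to be repeated in each script or notebook that needs it.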

## os Module
Module for providing portable operating system functionality.

In [12]:
print('os.name: ', os.name)

os.name:  posix


In [14]:
cwd = os.getcwd()
print(cwd)

/Users/aleaf/Documents/GitHub/python-usgs-training/notebooks/part1_python_intro


In [15]:
#list all the entries in the specified directory. 
mylistofitems = os.listdir(os.getcwd())
for thingy in mylistofitems:
    if os.path.isdir(thingy):
        print('directory: ', thingy)
    else:
        print('file: ', thingy)

directory:  extracted_data
file:  09_sys-os.ipynb
file:  02_functions.ipynb
file:  TheisExercise.pdf
file:  06_numpy.ipynb
file:  .DS_Store
file:  08_namespace.ipynb
file:  Pandas_weather_timeseries_Wunderground.ipynb
directory:  images
file:  Untitled.ipynb
file:  Pandas_NWIS.ipynb
file:  04_objects.ipynb
file:  Pandas_ColoradoRiver-FFT.ipynb
file:  gis_vector_msn_crime.ipynb
file:  05_files.ipynb
file:  TheisExercise.tex
file:  03_scripts.ipynb
file:  mtsthelens.pdf
directory:  .ipynb_checkpoints
file:  Matplotlib_StHelens.ipynb
file:  gis_raster_mt_rainier_glaciers.ipynb
file:  xarray_mt_rainier_precip.ipynb
file:  Pandas_ColoradoRiver.ipynb
directory:  data
file:  tmp
file:  01_basics.ipynb
file:  junk.zip
file:  LeesFerryOnePlot.pdf


In [16]:
# Example of changing the working directory
old_wd = os.getcwd()

# Go up one directory
os.chdir('..')
cwd = os.getcwd()
print('Now in: ', cwd)

# Change back to original
os.chdir(old_wd)
cwd = os.getcwd()
print('Switched back to: ', cwd)

Now in:  /Users/aleaf/Documents/GitHub/python-usgs-training/notebooks
Switched back to:  /Users/aleaf/Documents/GitHub/python-usgs-training/notebooks/part1_python_intro
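
If the code that runs in the other directory raises an error, the cell above would leave you stranded there. One possible safeguard, shown here as a sketch rather than the course's own method, is to wrap the work in `try`/`finally` so the original directory is always restored. The variable names are illustrative.

```python
import os

start_dir = os.getcwd()
try:
    # go up one directory and do some work there
    os.chdir('..')
    inside = os.getcwd()
    print('Working in: ', inside)
finally:
    # this runs even if the work above fails
    os.chdir(start_dir)

print('Back in: ', os.getcwd())
```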


## Glob
The glob library provides handy shorthand for listing files using patterns and wildcard (*) characters.

https://en.wikipedia.org/wiki/Glob_(programming)

**Note!** The order of the files returned by `glob` is platform-dependent. In general, if your code depends on a specific ordering of a list, it is best to explicitly sort it yourself using `sorted()` or `.sort()`, instead of depending on the behavior of an imported module.  
https://arstechnica.com/information-technology/2019/10/chemists-discover-cross-platform-python-scripts-not-so-cross-platform/

In [17]:
import glob

In [18]:
# list all of the Jupyter notebooks in the current working directory
glob.glob('*.ipynb')

['09_sys-os.ipynb',
 '02_functions.ipynb',
 '06_numpy.ipynb',
 '08_namespace.ipynb',
 'Pandas_weather_timeseries_Wunderground.ipynb',
 'Untitled.ipynb',
 'Pandas_NWIS.ipynb',
 '04_objects.ipynb',
 'Pandas_ColoradoRiver-FFT.ipynb',
 'gis_vector_msn_crime.ipynb',
 '05_files.ipynb',
 '03_scripts.ipynb',
 'Matplotlib_StHelens.ipynb',
 'gis_raster_mt_rainier_glaciers.ipynb',
 'xarray_mt_rainier_precip.ipynb',
 'Pandas_ColoradoRiver.ipynb',
 '01_basics.ipynb']

In [19]:
sorted(glob.glob('*.ipynb'))

['01_basics.ipynb',
 '02_functions.ipynb',
 '03_scripts.ipynb',
 '04_objects.ipynb',
 '05_files.ipynb',
 '06_numpy.ipynb',
 '08_namespace.ipynb',
 '09_sys-os.ipynb',
 'Matplotlib_StHelens.ipynb',
 'Pandas_ColoradoRiver-FFT.ipynb',
 'Pandas_ColoradoRiver.ipynb',
 'Pandas_NWIS.ipynb',
 'Pandas_weather_timeseries_Wunderground.ipynb',
 'Untitled.ipynb',
 'gis_raster_mt_rainier_glaciers.ipynb',
 'gis_vector_msn_crime.ipynb',
 'xarray_mt_rainier_precip.ipynb']

## os.path

os.path is a very widely used submodule of os.  In fact we use it in almost all of the class notebooks and scripts to deal with file system paths.  Some common os.path functions are:

    os.path.join()    # build a path from its parts (can be absolute or relative, relative is much easier to type...)
    os.path.exists()  # check if path exists
    os.path.isdir()   # check if path is a directory

os has other handy tricks, such as creating an empty directory (```os.mkdir()```), removing a file (```os.remove()```), and removing an empty directory (```os.rmdir()```).
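
A quick sketch of these functions on a made-up relative path follows. The file name is hypothetical and does not need to exist; `os.path.dirname`, `os.path.basename`, and `os.path.splitext` are not listed above but belong to the same submodule and are often used alongside `os.path.join`.

```python
import os

# build a path from its parts; the separator matches your operating system
path = os.path.join('data', 'streamflow', 'gage_01.csv')  # hypothetical file
print(path)

# take the path apart again
print(os.path.dirname(path))    # folder portion
print(os.path.basename(path))   # file name portion
root, ext = os.path.splitext(os.path.basename(path))
print(root, ext)                # name and extension

# check what is actually on disk
print(os.path.exists(path))         # True only if the file really exists
print(os.path.isdir(os.getcwd()))   # the working directory is a directory
```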

Let's put it all together...

In [6]:
#we want to create a directory

#first let's list what's in the current directory again
print(os.listdir(os.getcwd()))

#let's give it a name
new_dir = os.path.join('new_directory')

#check if it exists, and if so, let's remove it
#(note: os.rmdir only works on an empty directory and raises an error otherwise)
if os.path.exists(new_dir):
    os.rmdir(new_dir)
    print('dir existed, so we removed it')
os.mkdir(new_dir)

#now let's list the contents of the directory again
print(os.listdir(os.getcwd()))

['.DS_Store', 'Pandas_weather_timeseries_Wunderground.ipynb', 'new_directory', '03_scripts.ipynb', '03_functions_scripts.ipynb', '.ipynb_checkpoints', 'Matplotlib_StHelens.ipynb', '02_sys_os_shutil.ipynb', 'Pandas_ColoradoRiver.ipynb', '04_numpy.ipynb', '01_basics.ipynb', '05_pandas.ipynb']
dir existed, so we removed it
['.DS_Store', 'Pandas_weather_timeseries_Wunderground.ipynb', 'new_directory', '03_scripts.ipynb', '03_functions_scripts.ipynb', '.ipynb_checkpoints', 'Matplotlib_StHelens.ipynb', '02_sys_os_shutil.ipynb', 'Pandas_ColoradoRiver.ipynb', '04_numpy.ipynb', '01_basics.ipynb', '05_pandas.ipynb']


## shutil Module
shutil is a high-level file management module for copying, moving, and deleting files and directories.

The functions from shutil that you may find useful are:

    shutil.copy2(from_path, to_path)     # copy a single file, keeping its metadata (e.g. modification time) where possible
    shutil.copytree(from_path, to_path)  # recursively copy a directory and its contents to a new location
    shutil.move()                        # move a file or directory
    shutil.rmtree()                      # remove a directory and everything in it (os.rmdir only removes empty directories). obviously, you need to be careful with this one!
    
Give these guys a shot and see what they do.  Remember, you can always get help by typing:

    help(shutil.copy)
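
The difference between `os.rmdir` and `shutil.rmtree` is easiest to see on a directory that is not empty. The sketch below works entirely inside a temporary folder so nothing of yours is deleted; the folder and file names are made up for illustration.

```python
import os
import shutil
import tempfile

# build a scratch area containing a non-empty directory
scratch = tempfile.mkdtemp()
work_dir = os.path.join(scratch, 'work')
os.mkdir(work_dir)
with open(os.path.join(work_dir, 'notes.txt'), 'w') as f:
    f.write('temporary file\n')

# os.rmdir refuses to remove a directory that still has contents
try:
    os.rmdir(work_dir)
    rmdir_worked = True
except OSError:
    rmdir_worked = False
print('os.rmdir succeeded: ', rmdir_worked)

# shutil.move relocates the whole directory and returns the new path
moved_dir = shutil.move(work_dir, os.path.join(scratch, 'work_moved'))
print(os.listdir(moved_dir))

# shutil.rmtree deletes the directory and everything inside it
shutil.rmtree(scratch)
print('scratch still exists: ', os.path.exists(scratch))
```

There is no undo for `shutil.rmtree`, so it is worth printing the path and checking it before running the call on real data.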


In [8]:
#let's give it a try. let's copy the environment.yml file into the new directory we just made
print(os.getcwd())
shutil.copy2(os.path.join('..','..','environment.yml'),os.path.join(new_dir,'environment.yml'))
os.listdir(new_dir)


/Users/kmarkovich/Desktop/PhysHydro_Fall25/notebooks/intro_to_python


['environment.yml']

In [None]:
#now try copying the contents of a directory to new_dir
