# Interaction with the file system

## classic: the `os` module

Python comes with a lot of pre-installed modules (the Python standard library), which greatly extend the language. This is also known as «batteries included».

The `os` module is probably the most frequently used module. It is part of the **standard Python library** and offers many system-independent file operations. To use this module, you have to `import` it. Importing a module means that you bind the _symbol_ `os` in your script. Typically, you will be using quite a few Python modules, as they are the strength of the language. Import statements should be placed at the top of your script.

**importing the `os` module**

In [1]:
import os

print("now 'os' is a module:", os)
os = "something else"
print("now 'os' is:", os)

import os
print("now 'os' is a module again:", os)

now 'os' is a module: <module 'os' from '/Users/vermeul/.pyenv/versions/3.6.9/lib/python3.6/os.py'>
now 'os' is: something else
now 'os' is a module again: <module 'os' from '/Users/vermeul/.pyenv/versions/3.6.9/lib/python3.6/os.py'>


### Sidenote 1: symbols

Python offers **no protection for imported symbols**. Even built-in functions such as `print` can be overwritten:

In [13]:
# type(print) is a class, not a string, so compare its name
if type(print).__name__ == 'builtin_function_or_method':
    print_original = print
def new_print(*items_to_print):
    print_original(">>>>>>", *items_to_print, "<<<<<<")
print = new_print

now the original `print` function is shadowed:

In [14]:
print("hello", "world")

>>>>>> hello world <<<<<<


better back to the original...

In [15]:
print = print_original
print("hello, here we are again")

hello, here we are again
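
A built-in that has been overwritten can also be recovered without saving it first, because the original object stays reachable through the standard `builtins` module. The following is a minimal sketch; `tagged_print` is a hypothetical name chosen for illustration.

```python
import builtins

def tagged_print(*items):
    # Call the untouched built-in through the builtins module
    builtins.print(">>>>>>", *items, "<<<<<<")

print = tagged_print        # shadows the built-in in this namespace only
print("hello", "world")
shadowed = print is not builtins.print

print = builtins.print      # rebind the name to the real built-in
restored = print is builtins.print
print("hello, here we are again")
```

The assignment only rebinds a name in the current namespace; the built-in function itself is never modified.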


### Sidenote 2: get information about a module and its methods in Jupyter

**put a question mark `?` directly after any function or module name** and execute the cell to display its docstring.

In [21]:
import os
os?

Type:        module
String form: <module 'os' from '/Users/vermeul/.pyenv/versions/3.6.9/lib/python3.6/os.py'>
File:        ~/.pyenv/versions/3.6.9/lib/python3.6/os.py
Docstring:
OS routines for NT or Posix depending on what system we're on.

This exports:
  - all functions from posix or nt, e.g. unlink, stat, etc.
  - os.path is either posixpath or ntpath
  - os.name is either 'posix' or 'nt'
  - os.curdir is a string representing the current directory (always '.')
  - os.pardir is a string representing the parent directory (always '..')
  - os.sep is the (or a most common) pathname separator ('/' or '\\')
  - os.extsep is the extension separator (always '.')
  - os.altsep is the alternate pathname separator (None or '/')
  - os.pathsep is the component separator used in $PATH etc
  - os.linesep is the line separator in text files ('\r' or '\n' or '\r\n')
  - os.defpath is the default search path for executables
  - os.devnull is the file 

In [23]:
os.path.exists?

Signature: os.path.exists(path)
Docstring: Test whether a path exists.  Returns False for broken symbolic links
File:      ~/.pyenv/versions/3.6.9/lib/python3.6/genericpath.py
Type:      function


In [24]:
print?

Docstring:
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
Type:      builtin_function_or_method



**use Jupyter’s TAB completion to list all methods**

enter the following cell, then press the TAB key after the dot: the available attributes and functions of the module appear as a vertical list.

In [None]:
os.
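
The `?` suffix and TAB completion are Jupyter/IPython features. Outside a notebook, roughly the same information can be obtained with the built-in `dir()` and the standard `inspect` module. This is a sketch only; the variable names are illustrative.

```python
import inspect
import os

# First line of the docstring, similar to what `os.path.exists?` shows
summary = inspect.getdoc(os.path.exists).splitlines()[0]
print(summary)

# Parameter names of a function, similar to the "Signature" field
params = list(inspect.signature(os.makedirs).parameters)
print(params)

# Public names in the module, similar to TAB completion after `os.`
public_names = [name for name in dir(os) if not name.startswith('_')]
print(len(public_names), "public names, e.g.", public_names[:5])
```

In an interactive session, `help(os.path.exists)` prints the full docstring in a pager.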

### Sidenote 3: install new modules from PyPI (= the Python Package Index)

simply use the `pip` command-line tool, which ships with Python

In [25]:
!pip install pandas



the same, using the `-m` option to run `pip` as a module of a specific Python interpreter (here `python3`)

In [27]:
!python3 -m pip install pandas



**list installed packages**

In [28]:
!pip freeze

alabaster==0.7.12
alembic==1.4.2
appnope==0.1.0
argon2-cffi==20.1.0
astroid==2.4.2
async-generator==1.10
attrs==19.3.0
Babel==2.8.0
backcall==0.2.0
beautifulsoup4==4.9.1
bleach==3.1.5
blis==0.4.1
bs4==0.0.1
cachetools==4.1.1
catalogue==1.0.0
certifi==2020.6.20
certipy==0.1.3
cffi==1.14.1
chardet==3.0.4
chartpress==0.6.0
click==7.1.2
codecov==2.1.9
colorama==0.4.3
CommonMark==0.5.4
contextvars==2.4
coverage==5.3
cryptography==3.0
cycler==0.10.0
cymem==2.0.3
DateTime==4.3
de-core-news-sm @ https://github.com/explosion/spacy-models/releases/download/de_core_news_sm-2.3.0/de_core_news_sm-2.3.0.tar.gz
decorator==4.4.2
defusedxml==0.6.0
docker==4.3.0
docutils==0.16
english==2020.7.0
entrypoints==0.3
escapism==1.0.1
ethz-iam-webservice==0.3.1
-e git+git@gitlab.ethz.ch:jnowotny/edition-louis-ginzberg.git@1549d7495c3dbbcaef717900f27be93b1caa6560#egg=ginzberg2tei&subdirectory=Ginzberg2Tei
gitdb==4.0.5
GitPython==3.1.8
google-auth==1.21.1
h11==0.11.0
html5lib==1.1
httpcore==0.12.1
httpx==0.16.1
i

**The not so easy way to list all installed packages from within Python**

In [31]:
import pkg_resources
print("\n".join([
    "{}=={}".format(i.key, i.version) for i in pkg_resources.working_set
    ])
)

zope.interface==5.1.0
zipp==3.1.0
yapf==0.30.0
wrapt==1.12.1
wos==0.2.5
widgetsnbextension==3.5.1
wheel==0.35.1
websocket-client==0.57.0
webencodings==0.5.1
wcwidth==0.2.5
wasabi==0.8.0
urllib3==1.25.10
typed-ast==1.4.1
txt2tags==3.7
twine==3.2.0
traitlets==4.3.3
tqdm==4.48.2
tornado==6.0.4
toml==0.10.1
thinc==7.4.1
texttable==1.6.2
testpath==0.4.4
terminado==0.8.3
tabulate==0.8.7
suds-py3==1.4.1.0
sudospawner==0.5.2
srsly==1.0.2
sqlalchemy==1.3.18
sphinxcontrib-serializinghtml==1.1.4
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-htmlhelp==1.0.3
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-applehelp==1.0.2
sphinx==3.2.1
sphinx-copybutton==0.3.0
spacy==2.3.2
soupsieve==2.0.1
snowballstemmer==2.0.0
sniffio==1.2.0
smmap==3.0.4
six==1.15.0
simplejson==3.17.2
shortuuid==1.0.1
setuptools==40.6.2
send2trash==1.5.0
semver==2.10.2
ruamel.yaml==0.16.10
ruamel.yaml.clib==0.2.0
rsa==4.6
rfc3986==1.4.0
requests==2.24.0
requests-toolbelt==0.9.1
requests-oauthlib==1.3.0
recommon
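
On newer interpreters the same list can be produced with `importlib.metadata` from the standard library, which avoids `pkg_resources` (deprecated in recent setuptools releases). The sketch below assumes Python 3.8 or later, which is newer than the 3.6.9 interpreter shown in the outputs above.

```python
from importlib import metadata  # assumption: Python 3.8+

# Build "name==version" strings for every installed distribution
installed = sorted(
    "{}=={}".format(dist.metadata["Name"], dist.version)
    for dist in metadata.distributions()
)
print("\n".join(installed))
```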

### back to our `os` module...

**current working directory**

In [143]:
os.getcwd()

'/Users/vermeul/Python-for-SysAdmins/ws2'

**all files in a directory**

In [26]:
os.listdir('.')

['01_interaction_with_the_file_system.ipynb',
 '02_open_and_close_files.ipynb',
 'README.md',
 'Python for Sysadmins2.md',
 'Python for Sysadmins3.md',
 '.ipynb_checkpoints']
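
`os.listdir` returns only names and does not say which entries are files and which are directories. One way to tell them apart is `os.scandir`, sketched here in a temporary directory so that it runs anywhere; the names `notes.txt` and `subdir` are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Create one file and one directory to list
    open(os.path.join(workdir, 'notes.txt'), 'w').close()
    os.mkdir(os.path.join(workdir, 'subdir'))

    files, dirs = [], []
    with os.scandir(workdir) as entries:
        for entry in entries:
            # Each entry knows its own type without an extra os.stat call
            (dirs if entry.is_dir() else files).append(entry.name)

print("files:", sorted(files))
print("dirs: ", sorted(dirs))
```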

**create, rename and delete a file**

In [39]:
!touch _testfile

In [40]:
os.path.exists('_testfile')

True

In [41]:
os.rename('_testfile', 'testfile')

In [42]:
os.path.exists('_testfile')

False

In [137]:
# the file was renamed above, so remove it under its new name
os.remove('testfile')
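
The create, rename and delete sequence can also be written without the `!touch` shell escape, which only works inside Jupyter/IPython. A minimal sketch in a temporary directory, using the same file names:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    old_path = os.path.join(workdir, '_testfile')
    new_path = os.path.join(workdir, 'testfile')

    # Opening a file for writing and closing it creates an empty file
    with open(old_path, 'w'):
        pass
    existed_before = os.path.exists(old_path)

    os.rename(old_path, new_path)
    old_gone = not os.path.exists(old_path)
    new_there = os.path.exists(new_path)

    # After the rename, the file must be removed under its new name
    os.remove(new_path)
    removed = not os.path.exists(new_path)

print(existed_before, old_gone, new_there, removed)
```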

**file permissions**

In [121]:
!touch _test_file_permissions

In [122]:
os.stat('_test_file_permissions')

os.stat_result(st_mode=33188, st_ino=21840500, st_dev=16777221, st_nlink=1, st_uid=502, st_gid=20, st_size=0, st_atime=1605827131, st_mtime=1605827131, st_ctime=1605827131)

In [123]:
os.stat('_test_file_permissions').st_mode

33188

get the octal representation of the file permission

In [124]:
oct(os.stat('_test_file_permissions').st_mode)

'0o100644'

shorten the octal representation

In [125]:
oct(os.stat('_test_file_permissions').st_mode & 0o777)

'0o644'

change file permissions

In [126]:
os.chmod('_test_file_permissions', 0o666)
oct(os.stat('_test_file_permissions').st_mode & 0o777)

'0o666'

In [127]:
os.remove('_test_file_permissions')
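
Instead of masking with `0o777` by hand, the standard `stat` module provides helpers for the same job. The sketch below assumes a POSIX system (on Windows only the read-only bit is honoured by `os.chmod`); the file name `perm_demo` is illustrative.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, 'perm_demo')
    open(path, 'w').close()

    os.chmod(path, 0o640)
    mode = os.stat(path).st_mode

    # S_IMODE strips the file-type bits, like `mode & 0o7777`
    perm_bits = stat.S_IMODE(mode)
    # filemode renders the mode the way `ls -l` does
    symbolic = stat.filemode(mode)

print(oct(perm_bits), symbolic)
```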

**change file ownership**

In [130]:
!touch _test_file_ownership

In [131]:
os.stat('_test_file_ownership').st_uid

502

In [132]:
os.stat('_test_file_ownership').st_gid

20

In [133]:
os.getgroups()

[20, 502, 12, 61, 79, 80, 81, 98, 33, 100, 204, 250, 395, 398, 103, 400, 701]

In [134]:
os.chown('_test_file_ownership', os.getuid(), 400)

In [135]:
os.stat('_test_file_ownership').st_gid

400

In [136]:
os.remove('_test_file_ownership')
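
The numeric IDs returned by `os.stat` can be translated into names with the standard `pwd` and `grp` modules, and `os.chown` accepts `-1` for an ID that should stay unchanged. This is a sketch for POSIX systems only; it re-applies the file's current group, because changing to another group requires membership in that group or root privileges.

```python
import grp
import os
import pwd
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, 'ownership_demo')
    open(path, 'w').close()
    info = os.stat(path)

    # Look up names; fall back to the number if no database entry exists
    try:
        owner_name = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner_name = str(info.st_uid)
    try:
        group_name = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group_name = str(info.st_gid)

    # -1 leaves the owner untouched; only the group argument is applied
    os.chown(path, -1, info.st_gid)
    gid_after = os.stat(path).st_gid

print("owner:", owner_name, "group:", group_name)
```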

**directories**

In [69]:
os.mkdir('tmp')

In [68]:
os.makedirs('tmp2/some/more/dirs')

In [70]:
os.rmdir('tmp')

In [71]:
os.removedirs('tmp2/some/more/dirs')
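
Two details are worth knowing here: `os.makedirs` raises an error if the directory already exists unless `exist_ok=True` is passed, and `os.rmdir` / `os.removedirs` only remove empty directories. A sketch of both, using `shutil.rmtree` for a non-empty tree (the file name `data.txt` is illustrative):

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    nested = os.path.join(workdir, 'tmp2', 'some', 'more', 'dirs')

    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)   # second call is a no-op, no error

    # Put a file inside so the directory is no longer empty
    open(os.path.join(nested, 'data.txt'), 'w').close()
    try:
        os.rmdir(nested)
        rmdir_failed = False
    except OSError:
        rmdir_failed = True               # rmdir refuses non-empty directories

    # rmtree deletes the directory and everything below it
    shutil.rmtree(os.path.join(workdir, 'tmp2'))
    tree_gone = not os.path.exists(os.path.join(workdir, 'tmp2'))

print(rmdir_failed, tree_gone)
```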

## interesting alternatives: the `pathlib` and `shutil` modules

**import pathlib and shutil**

In [34]:
import pathlib
import shutil

In [158]:
info = pathlib.Path('/home/ihritik/Desktop/file.txt') 

In [159]:
info.exists()

False

In [160]:
info = pathlib.Path('.')

In [161]:
info.exists()

True
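
Beyond `exists()`, path objects make it convenient to build and take apart paths. The sketch below uses `PurePosixPath`, which never touches the file system and behaves the same on every platform; the path itself is made up.

```python
from pathlib import PurePosixPath

# The / operator joins path components
path = PurePosixPath('/home') / 'user' / 'file.txt'

name = path.name                      # 'file.txt'
stem = path.stem                      # 'file'
suffix = path.suffix                  # '.txt'
parent = str(path.parent)             # '/home/user'
renamed = str(path.with_suffix('.md'))  # '/home/user/file.md'

print(name, stem, suffix, parent, renamed)
```

With `pathlib.Path` the same attributes are available, plus the methods that access the file system.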

**create a file, test if it exists, delete the file**

In [53]:
file = pathlib.Path('_pathlib_testfile')
file.touch()

In [54]:
file.exists()

True

In [64]:
file.unlink()

**open file, write content, close file**

In [70]:
file = pathlib.Path('_pathlib_file_with_content')

In [71]:
file.write_text('here comes some content')

23

In [72]:
file.read_text()

'here comes some content'

overwrite content

In [74]:
file.write_text('some more content')

17

In [75]:
file.read_text()

'some more content'

In [76]:
file.unlink()

In [77]:
file.exists()

False
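
`write_text` always replaces the whole file. To append instead, a path object can be opened in `'a'` mode with its `open()` method. A minimal sketch in a temporary directory, with an explicit encoding (an addition not used in the cells above):

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    file = pathlib.Path(workdir) / '_pathlib_file_with_content'

    # write_text returns the number of characters written
    written = file.write_text('here comes some content', encoding='utf-8')

    # Append rather than overwrite
    with file.open('a', encoding='utf-8') as handle:
        handle.write('\nsome more content')

    lines = file.read_text(encoding='utf-8').splitlines()

print(written, lines)
```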

**change ownership of a file**

In [178]:
info = pathlib.Path('_pathlib_ownership_testfile')
info.touch()

In [179]:
print("owner:", info.owner())
print("group:", info.group())

owner: vermeul
group: staff


In [208]:
shutil.chown(path='_pathlib_ownership_testfile', group='everyone')

In [209]:
info.group()

'everyone'

In [211]:
info.unlink()

## match file patterns (1): `glob`
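
This section has no cells in the source, so the following is only a sketch of what the standard `glob` module does: it expands shell-style wildcards against the file system. All file names are invented for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Build a small tree: a.txt, b.txt, c.md and sub/d.txt
    os.mkdir(os.path.join(workdir, 'sub'))
    for name in ('a.txt', 'b.txt', 'c.md', os.path.join('sub', 'd.txt')):
        open(os.path.join(workdir, name), 'w').close()

    # '*' matches within one directory only
    top_level = sorted(
        os.path.basename(p)
        for p in glob.glob(os.path.join(workdir, '*.txt'))
    )

    # '**' with recursive=True also descends into subdirectories
    recursive = sorted(
        os.path.relpath(p, workdir)
        for p in glob.glob(os.path.join(workdir, '**', '*.txt'), recursive=True)
    )

print(top_level)
print(recursive)
```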

## match file patterns (2): `fnmatch`
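
This section is also empty in the source. As a sketch: the standard-library module for this is `fnmatch`, and it applies the same wildcard syntax to plain strings without touching the file system. The names below are taken from the `os.listdir` output earlier in this notebook.

```python
import fnmatch

names = [
    '01_interaction_with_the_file_system.ipynb',
    '02_open_and_close_files.ipynb',
    'README.md',
    'Python for Sysadmins2.md',
]

# filter() keeps the names that match the pattern
notebooks = fnmatch.filter(names, '*.ipynb')

# Character classes work as in the shell
numbered = fnmatch.filter(names, '0[0-9]_*')

# fnmatchcase() is always case-sensitive, regardless of platform
lower_matches = fnmatch.fnmatchcase('README.md', '*.md')
upper_matches = fnmatch.fnmatchcase('README.MD', '*.md')

print(notebooks, numbered, lower_matches, upper_matches)
```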