# Files and paths

At the start of the course, we learned how to manipulate strings and how to read and write files. In this lecture, we go over a few Python features that make it easier to work with lists of files and to format data into strings (useful, for example, when constructing filenames or writing data).

## The ``glob`` module

On the Linux command line, you can list all files matching a pattern, e.g.:

    $ ls *.py

This lists all files whose names end in ``.py``.

The built-in [glob](http://docs.python.org/3/library/glob.html) module lets you do something similar from Python. The main function in the ``glob`` module is also called ``glob``.

This function can be given a pattern (such as ``*.py``) and will return a list of filenames that match:

In [1]:
import glob
glob.glob('*.ipynb')

['00. About the course.ipynb',
 '01. What is Python.ipynb',
 '02. How to run Python code.ipynb',
 '03. Using the IPython notebook.ipynb',
 '04. Numbers, String, and Lists.ipynb',
 '05. Booleans, Tuples, and, Dictionaries.ipynb',
 '06. Control Flow.ipynb',
 '06.B Comments.ipynb',
 '07. Functions.ipynb',
 '08. Reading and writing files.ipynb',
 '09. Modules and Variable Scope.ipynb',
 '10. Introduction to Numpy.ipynb',
 '11. Introduction to Matplotlib.ipynb',
 '12. Files and paths.ipynb',
 '13. String Formatting.ipynb',
 '14. Python variables - behind the scenes.ipynb',
 '15. Fitting models to data.ipynb',
 '16. Interpolation and Integration.ipynb',
 '17. Understanding Python errors.ipynb',
 '18. Accessing remote resources.ipynb',
 '19. Object-oriented programming.ipynb',
 'Heat Conduction.ipynb',
 'Interactive Differential Equations.ipynb',
 'Practice Problem - Cryptography.ipynb',
 'Practice Problem - Monte-Carlo Error Propagation.ipynb',
 'Practice Problem - Radioactive Decay.ipynb',
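Besides ``*``, patterns can also use ``?`` (exactly one character) and ``[...]`` (one character from a set). The following is a minimal sketch of how these behave; the file names are hypothetical and are created in a temporary directory so that the example does not depend on the contents of the current directory. The order in which ``glob`` returns matches is not guaranteed, so the results are sorted here.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical, empty files created only for this demonstration.
    # os.path.join (covered in the os module section) builds each path.
    for name in ['run1.txt', 'run2.txt', 'run10.txt', 'notes.md']:
        open(os.path.join(tmpdir, name), 'w').close()

    # '*' matches any number of characters
    all_txt = sorted(glob.glob(os.path.join(tmpdir, '*.txt')))

    # '?' matches exactly one character, so 'run10.txt' is not matched
    one_digit = sorted(glob.glob(os.path.join(tmpdir, 'run?.txt')))

    # '[0-9]' matches a single digit; two of them match 'run10.txt' only
    two_digits = sorted(glob.glob(os.path.join(tmpdir, 'run[0-9][0-9].txt')))

    # Keep only the file names, without the temporary directory prefix
    all_txt_names = [os.path.basename(p) for p in all_txt]
    one_digit_names = [os.path.basename(p) for p in one_digit]
    two_digits_names = [os.path.basename(p) for p in two_digits]

print(all_txt_names)     # ['run1.txt', 'run10.txt', 'run2.txt']
print(one_digit_names)   # ['run1.txt', 'run2.txt']
print(two_digits_names)  # ['run10.txt']
```

Note that sorting is alphabetical, which is why ``run10.txt`` comes before ``run2.txt``.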


## The ``os`` module

The [os](http://docs.python.org/3/library/os.html) module allows you to interact with the system, and also contains utilities to construct or analyse file paths. The ``os.path`` sub-module is particularly useful when working with files. For example,

In [2]:
import os
os.path.exists('test.py')

True

can be used to find out if a file exists.
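As a small sketch of how this check behaves, the example below tests a path before and after the file is created, and also uses ``os.path.isfile`` and ``os.path.isdir`` to tell files and directories apart. The file name is hypothetical, and a temporary directory is used so that nothing is written to the current directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file inside the temporary directory
    path = os.path.join(tmpdir, 'test.py')

    # The file has not been written yet
    exists_before = os.path.exists(path)

    with open(path, 'w') as f:
        f.write('print("hello")\n')

    # Now the file is there
    exists_after = os.path.exists(path)

    # exists() is true for both files and directories;
    # isfile() and isdir() distinguish between the two
    path_is_file = os.path.isfile(path)
    path_is_dir = os.path.isdir(path)
    tmpdir_is_dir = os.path.isdir(tmpdir)

print(exists_before, exists_after)  # False True
print(path_is_file, path_is_dir, tmpdir_is_dir)  # True False True
```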

When constructing the path to a file, for example ``data/file.txt``, you normally have to worry about whether the code will run on Linux/Mac or on Windows, since Linux and Mac separate path components with ``/`` while Windows uses ``\``. However, the ``os`` module allows you to construct file paths without worrying about this:

In [3]:
os.path.join('data', 'file.txt')

'data/file.txt'
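``os.path`` can also take paths apart again. The sketch below joins more than two components and then uses ``os.path.dirname``, ``os.path.basename`` and ``os.path.splitext`` to analyse the result; the directory and file names are made up for illustration, and the values in the comments are what you would see on Linux or Mac.

```python
import os

# join() accepts any number of components
path = os.path.join('data', 'run1', 'file.txt')  # 'data/run1/file.txt' on Linux/Mac

# Split the path into its directory part and its file name
directory = os.path.dirname(path)   # 'data/run1' on Linux/Mac
filename = os.path.basename(path)   # 'file.txt'

# Split the file name into its stem and its extension
stem, extension = os.path.splitext(filename)  # 'file', '.txt'

print(path)
print(directory)
print(filename, stem, extension)
```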

This can be combined with ``glob``, for example:

    glob.glob(os.path.join('data', '*.txt'))
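To illustrate how this combination might be used in practice, the following sketch creates a hypothetical ``data`` directory inside a temporary directory, writes a few small files into it, and then loops over the ``.txt`` files to count the lines in each one. The file names and contents are invented for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical data directory with two text files and one other file
    data_dir = os.path.join(tmpdir, 'data')
    os.mkdir(data_dir)
    contents = {'a.txt': '1\n2\n3\n', 'b.txt': '4\n5\n', 'readme.md': 'ignored\n'}
    for name, text in contents.items():
        with open(os.path.join(data_dir, name), 'w') as f:
            f.write(text)

    # Find every .txt file in the data directory and count its lines
    line_counts = {}
    for path in sorted(glob.glob(os.path.join(data_dir, '*.txt'))):
        with open(path) as f:
            line_counts[os.path.basename(path)] = len(f.readlines())

print(line_counts)  # {'a.txt': 3, 'b.txt': 2}
```

Because the pattern is built with ``os.path.join``, the same code should work unchanged on Linux, Mac and Windows.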

The ``os`` module also has other useful functions, which you can find out about in the [documentation](http://docs.python.org/3/library/os.html).

## Exercise

The ``os.path.getsize`` function can be used to find the size of a file in bytes. Loop over all the files in the current directory using ``glob`` and, for each one, print the filename and its size in kilobytes (1 kilobyte = 1024 bytes):

In [4]:

# your solution here
for f in glob.glob('*'):
    print(f, str(os.path.getsize(f) / 1024) + ' kB')

00. About the course.ipynb 8.6689453125 kB
01. What is Python.ipynb 5.298828125 kB
02. How to run Python code.ipynb 5.1640625 kB
03. Using the IPython notebook.ipynb 24.1962890625 kB
04. Numbers, String, and Lists.ipynb 36.2587890625 kB
05. Booleans, Tuples, and, Dictionaries.ipynb 14.2822265625 kB
06. Control Flow.ipynb 22.1845703125 kB
06.B Comments.ipynb 4.3525390625 kB
07. Functions.ipynb 12.6884765625 kB
08. Reading and writing files.ipynb 21.2978515625 kB
09. Modules and Variable Scope.ipynb 20.181640625 kB
10. Introduction to Numpy.ipynb 51.3017578125 kB
11. Introduction to Matplotlib.ipynb 635.876953125 kB
12. Files and paths.ipynb 10.1767578125 kB
13. String Formatting.ipynb 10.7998046875 kB
14. Python variables - behind the scenes.ipynb 22.828125 kB
15. Fitting models to data.ipynb 162.712890625 kB
16. Interpolation and Integration.ipynb 81.3857421875 kB
17. Understanding Python errors.ipynb 25.904296875 kB
18. Accessing remote resources.ipynb 650.2490234375 kB
19. Object-ori