# ListFiles

Version 0.1 Dec 2019

This notebook demonstrates how to use the `os` and `glob` modules to find and list files in different folders.

The `os` module covers a wide range of low-level operating system functions, such as finding files and directories, creating and deleting files and directories, and working with environment variables. Some of these should be used with care! Here we will be concerned only with the functions for finding and listing files.

The `glob` module is far more focused in scope, concentrating on finding and listing files. It does this in a way that is independent of the operating system.

Both of these modules are part of the Python standard library, so they can be used without any additional installation or setup. They just need to be imported in the usual way.

For further information, refer to the Python documentation:

https://docs.python.org/3/library/os.html

https://docs.python.org/3/library/glob.html


### 1. Listing files with `os`

To simply list the contents of a folder or directory, `os.listdir()` is the quickest and easiest way.

This will list everything (files and subfolders) in the folder specified by the pathname. In Windows, pathnames start with a drive letter. Macintosh pathnames work in a similar way, but without the drive letter.

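One quirk of Python is that the backslash character ' \\ ', which Windows uses as the separator in pathnames, has a special meaning inside ordinary strings (it is the escape character), so a pathname typed with single backslashes may not be read as you intended. The simplest way to specify a Windows pathname in Python is to use the forward slash character ' / ' instead. Alternatively, you can double each backslash or use a raw string (a string literal prefixed with `r`).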
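As a small illustration of the backslash issue, the sketch below writes the same Windows-style pathname in three ways and checks that they describe the same folder. The variable names are chosen for this example only, and the pathname is a shortened form of the one used in this notebook.

```python
# Three ways to write the same Windows pathname as a Python string.
forward = "C:/OU/SXPS288/DataFiles"       # forward slashes, which Windows accepts
escaped = "C:\\OU\\SXPS288\\DataFiles"    # each backslash doubled
raw = r"C:\OU\SXPS288\DataFiles"          # raw string: backslashes are taken literally

# The escaped and raw forms are the identical string.
print(escaped == raw)

# The forward-slash form differs only in the separator character.
print(forward.replace("/", "\\") == raw)
```

Forward slashes are usually the least error-prone choice, because the same string style then works on Windows and Macintosh alike.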

For example, suppose that you have a folder `C:\OU\SXPS288` set up to hold all of your work on SXPS288, and within that you have a folder structure for the different investigations, including a folder `C:\OU\SXPS288\DataFiles\ReferenceSpectra` containing some spectra files that you have downloaded.

(On a Macintosh, the corresponding pathname might be something like: `/Users/yourname/OU/SXPS288/DataFiles/ReferenceSpectra`)

In [None]:
import os

# This will list everything (files and subfolders) in a specified folder.
# Modify the pathname in this example to match the location of the files on your computer.

lstFiles = os.listdir("C:/OU/SXPS288/DataFiles/ReferenceSpectra")

for filename in lstFiles:
    print(filename)
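Because `os.listdir()` returns subfolders as well as files, you may sometimes want to keep only the files. The following is a minimal sketch of one way to do that with `os.path.isfile()`. It builds a temporary folder with made-up contents (`a.csv`, `b.jdx` and `subfolder` are hypothetical names) so that it runs on any computer; in practice you would use your own pathname instead.

```python
import os
import tempfile

# Create a throwaway folder with two files and one subfolder (hypothetical names).
with tempfile.TemporaryDirectory() as folder:
    os.mkdir(os.path.join(folder, "subfolder"))
    for name in ("a.csv", "b.jdx"):
        open(os.path.join(folder, name), "w").close()

    # os.listdir() returns names in arbitrary order, so sort them for display.
    entries = sorted(os.listdir(folder))

    # os.path.isfile() needs the full pathname, so join the folder and the name.
    files_only = [e for e in entries if os.path.isfile(os.path.join(folder, e))]

print(entries)
print(files_only)
```

Note that `os.listdir()` returns bare names, which is why each one is joined to the folder with `os.path.join()` before testing it.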

### 2. Listing files with `glob`

While `os.listdir()` is quick and easy, it is also indiscriminate. As you may have found, it simply lists everything in the location specified by the pathname. If you have a mixture of different files or a large number of files, you may wish to search for a specific subset or group of files (for instance, to list only the .csv files).

You can do this using `glob.glob()`, which allows you to include a wildcard search pattern in the pathname.
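To get a feel for how wildcard patterns behave before trying them on real folders, you can test them against a plain list of names. The sketch below uses the standard library `fnmatch` module, which follows the same shell-style wildcard rules for `*`, `?` and `[...]`. The filenames are invented for this example, and this is only an illustration of the pattern rules, not a replacement for `glob.glob()`.

```python
import fnmatch

# A made-up list of filenames to test patterns against.
names = ["water.csv", "co2.csv", "co2.jdx", "run1.csv", "run2.csv", "notes.txt"]

# '*' matches any run of characters (including none).
csv_files = fnmatch.filter(names, "*.csv")

# Match every file called co2, whatever its extension.
co2_files = fnmatch.filter(names, "co2.*")

# '?' matches exactly one character.
run_files = fnmatch.filter(names, "run?.csv")

print(csv_files)
print(co2_files)
print(run_files)
```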

In [None]:
import glob

# Glob allows you to add a wildcard to the pathname.  Only files matching the pattern will be listed.
# Modify the pathname in this example to match the location of the files on your computer

strPath = "C:/OU/SXPS288/DataFiles/ReferenceSpectra/"
nPathLen = len(strPath)

# Search for .csv files.  Change the filter (e.g. to "*.jdx") to search for files of a different type.
dirlist = glob.glob(strPath + "*.csv")

# Unlike os.listdir(), glob() returns the entire pathname
# To get just the file names, this loop removes the first part of the pathname before printing
# Try printing pathname instead to see the actual strings returned by glob()
for pathname in dirlist:
    filename = pathname[nPathLen:]
    print(filename)
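Slicing off the first part of the pathname works when you know its exact length. As an alternative, `os.path.basename()` returns the final component of a pathname regardless of how long the folder part is. The following is a minimal sketch under the assumption that a temporary folder with made-up files (`a.csv`, `b.csv`, `c.jdx`) stands in for your own data folder.

```python
import glob
import os
import tempfile

# Create a throwaway folder containing three empty files (hypothetical names).
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.csv", "b.csv", "c.jdx"):
        open(os.path.join(folder, name), "w").close()

    # glob.glob() returns full pathnames for every match of the pattern.
    pathnames = glob.glob(os.path.join(folder, "*.csv"))

    # os.path.basename() keeps only the file name at the end of each pathname.
    filenames = sorted(os.path.basename(p) for p in pathnames)

print(filenames)
```

The results are sorted here because `glob.glob()` does not guarantee any particular order.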

### 3. Exercise: filters and wildcards

Experiment with the example in the previous cell.  Try changing the pathname to list files in different folders, and use the wildcard to select files of different types, or files with certain characters in the filename. 

You may also want to try building a list or a dict of the filenames instead of printing them out.

In [None]:
### Write your solution here