# Programming with Python
## Analyzing Data from Multiple Files
Questions
* How can I do the same operations on many different files?

Objectives
* Use a library function to get a list of filenames that match a simple wildcard pattern.
* Use a `for` loop to process multiple files.

## Using `glob`

In [None]:
# Import the glob module
import glob

In [None]:
# Get and print the list of inflammation CSV files (the order is not guaranteed)
print(glob.glob('../data/inflammation*.csv'))
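
`glob.glob` returns a list of the paths that match a pattern, where `*` stands for zero or more characters. The sketch below illustrates this in a temporary directory so that it runs without the lesson data. The file names are hypothetical and created only for the example.

```python
import glob
import os
import tempfile

# Hypothetical file names, created only for this illustration
names = ['inflammation-02.csv', 'inflammation-01.csv', 'small-01.csv', 'notes.txt']

with tempfile.TemporaryDirectory() as tmpdir:
    # Create empty files with the names above
    for name in names:
        open(os.path.join(tmpdir, name), 'w').close()

    # Only names starting with 'inflammation' and ending with '.csv' match
    pattern = os.path.join(tmpdir, 'inflammation*.csv')
    matches = glob.glob(pattern)

    # Strip the directory part and sort, since glob gives no ordering guarantee
    matched_names = sorted(os.path.basename(m) for m in matches)

print(matched_names)
```

`small-01.csv` and `notes.txt` are left out because they do not fit the pattern.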

## Loading Files in Sorted Order

In [None]:
import numpy
import matplotlib.pyplot as plt

In [None]:
# Get the first three file names from a sorted list
filenames = sorted(glob.glob('../data/inflammation*.csv'))
filenames = filenames[:3]
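
As a small illustration of the two steps above: `sorted` returns a new list in alphabetical order and leaves the original untouched, and the slice `[:3]` keeps the first three items. The list below is a hypothetical stand-in for what `glob.glob` might return.

```python
# Hypothetical, unordered result such as glob.glob might return
unordered = ['inflammation-03.csv', 'inflammation-10.csv',
             'inflammation-01.csv', 'inflammation-02.csv']

# sorted() builds a new list; 'unordered' itself is not modified
ordered = sorted(unordered)

# Keep only the first three names
first_three = ordered[:3]

print(first_three)
```

Zero-padded numbers (`01`, `02`, ..., `10`) matter here: sorting is done on text, so without the padding `inflammation-10.csv` would come before `inflammation-2.csv`.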

In [None]:
# For each file name
for filename in filenames:
    # Print the name
    print(filename)

    # Load the data with the current file name
    data = numpy.loadtxt(fname=filename, delimiter=',')

    # Create and show the figure with three sub-figures
    fig = plt.figure(figsize=(10.0, 3.0))

    axes1 = fig.add_subplot(1, 3, 1)
    axes2 = fig.add_subplot(1, 3, 2)
    axes3 = fig.add_subplot(1, 3, 3)

    axes1.set_ylabel('Average')
    axes1.plot(numpy.mean(data, axis=0))

    axes2.set_ylabel('Max')
    axes2.plot(numpy.max(data, axis=0))

    axes3.set_ylabel('Min')
    axes3.plot(numpy.min(data, axis=0))

    fig.tight_layout()
    plt.show()

### Exercise - Plotting All Mean Values
Using the sorted list of `inflammation*.csv` files, fill a 2D NumPy array
with the average value for each day (one column per day) and each file (one row per file).
Display the final result as an image.

In [None]:
filenames = sorted(glob.glob('../data/inflammation*.csv'))
# One row per file, one column per day (each file is assumed to hold 40 days of data)
all_mean_values = numpy.zeros((len(filenames), 40))

# For each file
for i, filename in enumerate(filenames):
    # Read the data from file
    data = numpy.loadtxt(fname=filename, delimiter=',')
    # Compute daily mean values and assign to a whole row
    all_mean_values[i, :] = numpy.mean(data, axis=0)

# Create the image
image = plt.imshow(all_mean_values)

plt.xlabel('Day #')
plt.ylabel('Group #')
plt.title('Daily average of each group of patients')

plt.show()
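
The `40` in the solution above assumes that every file has 40 day columns. One possible variation, sketched here under that caveat, collects the per-file means in a list and lets NumPy infer the array shape. The synthetic files and their sizes are made up for the example so that it runs without the lesson data.

```python
import glob
import os
import tempfile

import numpy

with tempfile.TemporaryDirectory() as tmpdir:
    # Write three small synthetic files (2 patients x 4 days each);
    # file k is filled with the constant value k
    for k in range(3):
        synthetic = numpy.full((2, 4), float(k))
        path = os.path.join(tmpdir, f'inflammation-{k + 1:02d}.csv')
        numpy.savetxt(path, synthetic, delimiter=',')

    filenames = sorted(glob.glob(os.path.join(tmpdir, 'inflammation*.csv')))

    # Daily mean (over patients) for each file, gathered in a list
    row_means = [numpy.mean(numpy.loadtxt(fname=f, delimiter=','), axis=0)
                 for f in filenames]

# Stack the rows: the shape is (number of files, number of days)
all_mean_values = numpy.array(row_means)

print(all_mean_values.shape)
```

This only works if all files have the same number of columns. The resulting array can be passed to `plt.imshow` exactly as in the solution.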

## Key points
* `import glob`
* `glob.glob(pattern)` returns a list of file paths matching the pattern, in no guaranteed order
  * Pattern example: `'file_*.txt'`
* `sorted(list)` returns a new, sorted list
* Use a `for`-loop to process one file at a time