# Practical Python for Scientists and Engineers

Welcome!  The goal of these tutorials is to help you get familiar with basic aspects of Python that will allow you to be more productive in everyday work.  We will work on skills that will let you graph, manipulate, and manage data.  Our goal will be to take things one step at a time, learning only what is needed to accomplish a specific task.  The philosophy behind these tutorials is learning by doing, rather than learning to let you do something later.  Hopefully you will start learning tools right from day one that will be useful in other settings.  By the end of these tutorials, you will be able to make complicated applications that load and save data to and from files, manipulate data, run numerical simulations, make complex visualizations and more! 

## Tutorial 5: For Loops
In this tutorial, we will introduce the concept of loops as a means for repeating an action.  Loops are very useful in many situations; in particular, they can be helpful for reading and writing data files or repeating a complex set of procedures multiple times.  

By the end of this tutorial you will be able to automate the process of adding multiple datasets to a graph. 

<u>In this tutorial we will cover:</u>
- What are loops?
- Using a counter in a loop.
- Using loops with Numpy arrays.
- Plotting multiple data sets with a for loop.

### Step 1: What are loops?
Loops are a basic part of every programming language.  They let you automate an action that you would otherwise have to repeat by hand.  For example, imagine that you were performing an analysis of a remote sensing image that required you to count how many pixels in the image were urbanized versus agricultural or forested areas.  You could use a loop to: (i) visit a pixel in the image, (ii) make a determination of the land use at that pixel, (iii) record the result, and (iv) move to the next pixel in the image.  By repeating these four commands, it would be possible to visit and classify every pixel in the image.    

There are two different ways that this loop could be done.  

You could say:  

><b>while</b> <i>you have not yet visited every pixel in the image</i>
1. choose a pixel that has not yet been classified  
2. classify the land use for that pixel  
3. record the result in an array  
4. repeat  

Or you could alternatively say:

><b>for</b> <i>pixel 1 to pixel N (where N is the last pixel in the image)</i>
1. go to the next pixel in a list of pixels
2. classify the land use for that pixel
3. record the result in an array
4. stop after you classify pixel N

While these seem pretty similar, in the first case you don't need to know how many pixels are in the image.  Instead, the desired actions continue being repeated *while* there are still pixels that have yet to be classified.  In contrast, you need to specify all the pixels in the image from the beginning in the second case.  Thus these two cases represent two fundamentally different ways of constructing loops: <b>while loops</b> and <b>for loops</b>.  In this tutorial, we will focus on *for loops*.
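
To make the distinction concrete, here is a minimal sketch of both versions.  The `pixels` list and the brightness threshold of 100 are made-up stand-ins for a real image and a real land use classifier, so treat this as an illustration of the two loop styles rather than an actual remote sensing workflow.

```python
# Hypothetical data: each number stands in for one pixel of an image
pixels = [12, 200, 45, 180, 90]

# --- for loop: the full list of pixels is given to the loop up front ---
for_results = []
for pixel in pixels:
    is_urban = pixel > 100          # made-up rule: bright pixels count as urban
    for_results.append(is_urban)    # record the result

# --- while loop: keep going while unclassified pixels remain ---
remaining = list(pixels)            # a copy of the list that we can shrink
while_results = []
while len(remaining) > 0:
    pixel = remaining.pop(0)        # choose (and remove) the next unclassified pixel
    is_urban = pixel > 100
    while_results.append(is_urban)

print(for_results)
print(while_results)
```

Both versions record the same results; the difference is only in how the loop decides when to stop.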

### for loop  
*For loops are used to repeat an action <b>for</b> a fixed number of values.  In Python, for loops progress through a sequence of values provided in a list until the loop has been repeated for each value in the list.  This is a little different from other languages where the loop is controlled by a counter that increments during each iteration.*

<u>Structure of a For Loop</u>  
To create a for loop you use the keyword <b>`for`</b> followed by: a variable that will be used as a *`counter`*, the keyword <b>`in`</b>, the *`list`* used for iterating, and a colon (<b>:</b>).  The commands that are to be repeated follow the first line as indented code.  It is important to indent the lines below the first one so that Python knows which actions are to be repeated.  

><b>for</b> <i>counter</i> <b>in</b> <i>list</i>:  
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*commands to be repeated*  (one command per line, must be indented)

As the loop progresses, the counter variable will take on a new value from the list after each iteration.

Let's examine a simple example:    

In [None]:
for i in [1,2,3,4]:
    print('hello world!')
    
    
print('this line is not repeated because it is not indented and therefore out of the for loop')

Since there were four numbers in the list, this for loop just repeated the command `print('hello world!')` four times.  

We can see what value the counter variable *i* is taking on by adding an additional print statement:  

In [None]:
for i in [1,2,3,4]:
    print('hello world!')
    print(i)
print('still not indented and still not repeated!')

The result shows that at each iteration, the counter variable *i* takes on a different value from the list.

#### Challenge Problem 5.1
Create a loop that will print out a data table showing the values of $y=x^2$ between 0 and 10.
The table should look something like this (but you don't need to print the title):  

$x$ | $x^2$
----|--------
0 | 0
1 | 1
2 | 4
3 | 9
4 | 16
5 | 25
6 | 36
7 | 49
8 | 64
9 | 81
10 | 100



In [None]:
#add your code here:



It is notable that the values in the list don't need to be numbers.

In [None]:
for i in ['apple','banana', 'orange','raspberry']:
    print('hello world!')

In this case we get exactly the same behavior as before even though the values in the list are different.  If we add the second print statement, we can see that the counter takes on each value from the list as we iterate through the loop:  

In [None]:
for i in ['apple','banana', 'orange','raspberry']:
    print('hello world!')
    print(i)

One of the things that makes loops powerful is that you can use the changing value of the counter within the loop itself.    
For example, try to imagine what this loop would do before you run it:

In [None]:
for i in [1,2,3,4]:
    print('hello world! '*i)

This is a good example illustrating how the behavior of the loop can be altered by the values given in the list.  Here is another example:

In [None]:
for i in ['apples','bananas', 'oranges','raspberries']:
    print('I like to eat ' + i)

This example illustrates how meaningful changes of output can be achieved with a loop, but it isn't that practical to have the data you want to report (like what fruit you like to eat) coded in the loop itself.  

We can improve on this by replacing the list in the loop with a variable that contains the list.  This makes our code more general because it will then be possible to change the data that is provided to the loop without changing any code in the loop itself.   

For example, let's modify what we had in the code cell from above by replacing the list with a variable called *`fruit_list`* containing the actual list with the data:

In [None]:
fruit_list = ['apples','bananas','oranges','raspberries']

for i in fruit_list:
    print('I like to eat ' + i)

As a next step, let's see how we can now change the data assigned to the variable to change the output of the code without actually modifying the code used for the loop:

In [None]:
#let's change the data in the list:
fruit_list = ['pears', 'pineapples', 'mangos','peaches','apricots','blueberries']

#now repeat the two lines for the loop from above without making ANY changes (i.e., you can just copy/paste to loop)



Notice that with exactly the same code you can get a very different output (even a different number of lines printed!) that might reflect the results of two different users.  

#### Challenge Problem 5.2:
In the example above we ended up having to copy and paste the code for the loop.  This is undesirable as it means that if you made an error or want to change something later you would have to find two places in your code that need to be updated.  We have learned in previous tutorials that you can use functions as a tool for generalizing code that you might want to use over and over again.  

In this challenge problem, you will modify the code cell below by adding the appropriate information to a function definition that will allow you to print out the fruit data for two different users without needing to repeat the commands for the loop twice.   

When you correctly edit and run the code cell below, your output should look like this:

I like to eat apples   
I like to eat bananas   
I like to eat oranges  
I like to eat raspberries  
I like to eat pears  
I like to eat pineapples  
I like to eat mangos  
I like to eat peaches  
I like to eat apricots  
I like to eat blueberries  

In [None]:
#specify each user's data (do not modify these):
user1 = ['apples','bananas','oranges','raspberries']
user2 = ['pears', 'pineapples', 'mangos','peaches','apricots','blueberries']


#define the function:
def     (    ):                      #fill in the missing info in this line
                                     #add code to this line
                                     #add code to this line
        
#use the function to print out the user data:
print_fruit(    )                    #complete the missing info in this line
print_fruit(    )                    #complete the missing info in this line

### Step 2: Using a counter in a loop.
Sometimes you will want to keep track of where you are in the loop by creating a variable that increments during each cycle:

In [None]:
fruit_list = ['apples','bananas','oranges','raspberries']

my_index = 0        #this line initializes the index counter by setting it to zero

for i in fruit_list:
    print('I like to eat ' + i)
    print('my_index = ' + str(my_index)) #this line prints out the current value of the counter
    
    my_index = my_index + 1   #this line increments the value stored in my_index by 1
    
    print('item #'+str(my_index)+' is '+i)
    print('')  #this line was added to create a space between the printouts for different iterations of the loop

While the example above illustrates the use of a counter in a loop, it isn't very likely that you will want to simply print out a series of values like this (though it is one application of loops!).  A more practical problem is when you wish to extract a value from an array (or list).

Consider our fruit loop above.  Now imagine that we want to print which day of the week we like each fruit.  We could then do something like this:

In [None]:
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
fruit_list = ['apples','bananas','oranges','raspberries']

my_index = 0

for i in fruit_list:
    print('I like to eat '+i+' on '+days[my_index])
    my_index = my_index + 1

In this example, the type of fruit is assigned to the counter *`i`* automatically through each iteration of the loop.  We have no easy way, however, to also cycle through the days of the week, so the counter was needed for this.  There are two other ways we could have handled this problem.

First, you could use what you have already learned in this tutorial to restructure the loop so that you proceed through a list of index values in the loop, rather than the actual fruit list as below:

In [None]:
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
fruit_list = ['apples','bananas','oranges','raspberries']

for i in [0,1,2,3]:
    print('I like to eat '+fruit_list[i]+' on '+days[i])  #i is used as the index for both lists

In this case, the value of *`i`* cycles through 0, 1, 2, and 3, so we can use it to represent the index value we want to access in the data without needing to use a separate counter at all.  
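
Typing out `[0,1,2,3]` by hand only works when you already know how long the list is.  One way around this, sketched below, is to let Python build the index values with the built-in `range()` and `len()` functions, neither of which has been introduced in this tutorial so far.  The `sentences` list is an addition made here only so that the results can be inspected afterwards.

```python
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
fruit_list = ['apples', 'bananas', 'oranges', 'raspberries']

sentences = []
# len(fruit_list) is 4, so range(4) supplies the index values 0, 1, 2, 3
for i in range(len(fruit_list)):
    sentence = 'I like to eat ' + fruit_list[i] + ' on ' + days[i]
    sentences.append(sentence)
    print(sentence)
```

If an item is later added to or removed from `fruit_list`, the loop adjusts on its own because the index values are rebuilt from the length of the list each time the code runs.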

An alternative approach is to use the `enumerate()` function. 

### for *count*, *values* in enumerate(*list*):
*The `enumerate()` function is most often used with for loops in Python.  It automatically creates a loop counter for you that keeps track of how many iterations of the loop have been completed.  There are therefore two variables available in the loop: the first variable (i.e., "count" above) stores the value of the counter, just like the my_index variable we created earlier, and the second variable (i.e., "values") stores the actual data from the list for this iteration.*
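
To see what `enumerate()` actually hands to the loop, it can help to convert its output to a list and print it.  This is a small sketch using a shortened, made-up fruit list; the optional `start` argument shown at the end is part of standard Python but is not used elsewhere in this tutorial.

```python
fruits = ['apples', 'bananas', 'oranges']   # shortened list for illustration

# each item produced by enumerate() is a (count, value) pair
pairs = list(enumerate(fruits))
print(pairs)

# by default the count starts at 0; start=1 makes it begin at 1 instead
pairs_from_one = list(enumerate(fruits, start=1))
print(pairs_from_one)
```

Writing `for count, value in enumerate(fruits):` simply unpacks each of these pairs into two variables at every iteration.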

We can use this approach to replace our counter as shown below.  Note that in this case we use the counter (here named my_index) just like we did before; we just don't need to initialize and update the value stored in the variable any more, since Python is now doing that for us.

In [None]:
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
fruit_list = ['apples','bananas','oranges','raspberries']

for my_index,i in enumerate(fruit_list):
    print('I like to eat '+i+' on '+days[my_index])

There is one very important advantage of this approach.  Notice that nowhere do we specify the number of items in the list!  This makes our loop much more general since we can run our code with any number of items representing different data sets and it will still work great.  Let's see this in action by putting the loop into a function and then playing around with it a bit:

In [None]:
#define the function:
def eatprint(input1, input2):
    for my_index,i in enumerate(input1):            
        print('I like to eat '+i+' on '+input2[my_index])
        
        #Note: The lines in the loop are identical to those above other than 
        #replacing "fruit_list" with "input1" and "days" with "input2".  We actually did not
        #need to do this as the variables defined within a function are distinct from those 
        #outside of it, but I did so here to emphasize the fact that variables inside and
        #outside of a function are distinct from each other (though it is still possible to 
        #reference a variable outside of the function from within a function, so be careful)

In [None]:
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
fruit_list = ['apples','bananas','oranges','raspberries']

eatprint(fruit_list,days)

The function works just like the loop above, which is great!  We can now use it for different kinds of data:

In [None]:
eatprint(['hamburgers', 'french fries'], ['an airplane','France'])

In [None]:
eatprint(['with friends'],['a regular basis!'])

Lots of ways that we can use the function!  Let's try something silly and interchange days with fruit_list.  Before running the code cell below, predict what the output will look like.

In [None]:
eatprint(days,fruit_list)

The output was printed as you probably expected it to (with the days and fruit in reversed position in the sentence).  But why did we now get an error when the function was working great before and we got our expected output?

The *'list index out of range'* error message gives us a clue.  Let's take a careful look at how our function runs by breaking it down line by line.

`def eatprint(input1, input2):`  This line defines the function and input arguments.  
> `    for my_index,i in enumerate(input1):`   
Here we start the loop using the variable `input1` - this is important!  
> >`        print('I like to eat '+i+' on '+input2[my_index])`  
Finally, we ask for the value of `input2` at the current value of my_index.


Since we called the function with `days`, which includes 5 items, as `input1` and `fruit_list`, which only includes 4 items, as `input2`, we can now begin to understand the problem. 

When we call the function as: `eatprint(fruit_list,days)` we have a situation where `input1` only has 4 values, so the loop can only run 4 times.  If we examine the output from where we first made this function call above, we can see that the final item in the `days` list (i.e., Friday) was never used.

In contrast, when we call the function as: `eatprint(days, fruit_list)` we have a situation where the loop is iterating over the elements of `days` (since it is input1).  Given that there are 5 items in this list, the loop will iterate over values for `my_index` of 0, 1, 2, 3, and 4.  For the first four times through the loop, i.e., my_index = 0, 1, 2, 3, we have no problems at all because `fruit_list` has four items.  For the final time through the loop, however, my_index = 4 yet there is no value of `fruit_list` at that index position since the list only has 4 items, thus the *list index out of range* statement given by Python.



There are a couple of reasons that we took a detailed look at this example:
1. It is important to realize that the commands within a loop are completed independently for each iteration of the loop.  There is no advance check to make sure that the loop makes sense for all of the values in the list you used to create it.  For example, in our problem above you saw that the output was printed for the first four iterations of the loop and an error didn't occur until the fifth time through the loop.  This can sometimes make it hard to find problems in the code if you are just using a small subset of data to test it.
2. Always remember to think about what the value of the counter is when you are referencing index positions in an array or list.  This becomes increasingly complicated (and confusing) when you go to higher array dimensions (e.g., a data table or 3D volume of data).    
3. It is a good idea to always use arrays (or lists) that have the same size if your intent is to loop through all values.  There are cases, however, where this might not apply (e.g., if you are referencing values from a large data set).
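
One simple way to act on the third point is to compare the lengths of the two lists before looping.  The sketch below is only one possible safeguard (it silently ignores any extra items in the longer list, which may or may not be what you want); it relies on the built-in `len()`, `min()`, and `range()` functions, and the `sentences` list is an addition made here for inspecting the results.

```python
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
fruit_list = ['apples', 'bananas', 'oranges', 'raspberries']

print(len(days), len(fruit_list))   # the two lists have different lengths

# only loop over index values that exist in BOTH lists
n_items = min(len(days), len(fruit_list))

sentences = []
for i in range(n_items):
    sentences.append('I like to eat ' + days[i] + ' on ' + fruit_list[i])
    print(sentences[i])
```

Because the loop stops at the length of the shorter list, the fifth iteration that caused the *list index out of range* error never happens.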

### Step 3: Using loops with Numpy arrays.
Let's switch to an example where we are working with numerical data in arrays.  Notice that everything we have done so far with loops and lists is part of standard Python, so we still have to import Numpy before we can use arrays.

We will start with two simple 1-dimensional arrays:

In [None]:
import numpy as np   #Numpy must be imported before we can create arrays

a = np.array([2, 3, 13, 1, 93, 2, 1, 10])
print('a=')
print(a)

b = np.r_[0:8:1]
print('b=')
print(b)

Consider the simple case where we wanted to find the product of the values in the two arrays:

In [None]:
c = np.zeros(a.shape)           #define an array to store the calculated values
for ind,value in enumerate(a):  #set up the loop to go through each element of a
    c[ind] = value*b[ind]       #for each iteration, get the value of b at ind and assign the product to c at position ind
    
print('Here is the result:')
print(c)


You can see that the final result of c is indeed the product of the elements of a and b.
Remember that the beauty of using Numpy arrays is that we don't actually have to do that kind of looping because Numpy does it for us automatically.  In other words, for arrays we can just type:

In [None]:
c2 = a*b

print('Here is the result from numpy:')
print(c2)

Exactly the same, but a lot easier to type and understand using arrays!  Avoiding loops for basic operations is exactly the point of using Numpy arrays to make things easier for us.  

In general, you should try to find ways to use Numpy array functions rather than loops whenever you can for doing mathematical operations (such as that here).

Now let's try working with some 2-dimensional arrays:

In [None]:
import numpy as np
A = np.r_[1:17:1]
A = A.reshape([4,4])
print('A= ')
print(A)

We could now use a loop to change all the values in a row or column of the array:

In [None]:
for i in [0,1,2,3]:
    A[i,0] = 100
    
print(A)

But it is still easier to just use Numpy array indexing to do the same thing:

In [None]:
A[:,0] = 200
print(A)

We can use a double loop to go through the rows and columns of a two-dimensional array just like we did for the one-dimensional array:

In [None]:
Nrows = A.shape[0]  #number of rows in A
Ncols = A.shape[1]  #number of columns in A

C = np.zeros([Nrows,Ncols])

for i in np.r_[0:Nrows:1]:
    for j in np.r_[0:Ncols:1]:
        C[i,j] = 1/A[i,j]
        
print(C)

But again this would have just been a lot easier by taking advantage of the fact that you can do math with Numpy arrays:

In [None]:
B = np.ones([Nrows,Ncols])  #create an array of ones that is the same shape as A
C2 = B/A  #divide B by A element by element (i.e., calculate 1/A for every entry)

print(C2)

The bottom line is that, whenever you can, you should take advantage of Numpy arrays and the ability to do math with them directly.  In general, your code will be more compact, easier to read, and will run faster when you use Numpy arrays and their functions.  You will still come across cases where it makes more sense to use loops, and you certainly can do so when you need to.
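
If you would like to check the speed claim for yourself, the sketch below times the same calculation done with a loop and with a single array operation.  The array size of 100,000 values is an arbitrary choice, and the exact timings will differ from one computer to another, so no particular numbers are promised here.

```python
import time
import numpy as np

values = np.arange(100000, dtype=float)   # arbitrary test data: 0, 1, 2, ..., 99999

# version 1: square each value inside a for loop
t_start = time.perf_counter()
squares_loop = np.zeros(values.shape)
for ind, value in enumerate(values):
    squares_loop[ind] = value * value
loop_seconds = time.perf_counter() - t_start

# version 2: let Numpy square the whole array at once
t_start = time.perf_counter()
squares_array = values * values
array_seconds = time.perf_counter() - t_start

print('loop version took  ' + str(loop_seconds) + ' seconds')
print('array version took ' + str(array_seconds) + ' seconds')
print('same answer? ' + str(np.array_equal(squares_loop, squares_array)))
```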

#### Flight of the Bumblebee!
Here is an example of a problem where you can't rely on a simple array calculation: the path that a bee takes during a flight!  Well, I don't know if this is actually how bees fly, but we will approximate their motion using Brownian motion, where the direction they move in has a random component to it.  We can't do a simple array calculation because where they are going next depends on where they have been, and since their path is random, there is no way to predict it.

In this example, let's try to estimate where the bee is located at 120 different points in time.  Here is a simple model for the bee's flight path:  
> position now = position at previous time + average velocity times length of time between updates + random perturbation in flight path
 
 We could write this a little more mathematically as:
 > $x(t_{i}) = x(t_{i-1}) + v_x*{\Delta}t + \delta_{x}$
 
 where:  
 $x(t_i)$ = the bee's position at the current observation time $t_i$  
 $x(t_{i-1})$ = the bee's position at the previous observation time $t_{i-1}$  
 $v_x$ = the average velocity of the bee in the x direction  
 ${\Delta}t = t_i - t_{i-1}$ = the length of time between observations  
 $\delta_x$ = a perturbation in position along the average flight path due to random effects (e.g., wind, pretty flowers, etc.)
 
One last thing that we will need is to decide how big to make the deviations from the bee's flight path.  Because these are random deviations, we will use a random number generator, i.e., `np.random.normal()`.  Since this isn't a statistics or biology tutorial, we'll just use a Gaussian random number generator for this example, because it can create both positive and negative numbers that follow a bell curve (normal distribution), by default with a mean of zero and standard deviation of 1.  We will use a parameter (`drift`) to control how large the random deviations of the bee are at each step.  Thus we will have:

> $\delta_x = drift*rnd$   (where *rnd* is a random number drawn using the np.random.normal() function)


So our final formula for the flight path of the bee in the x-direction is:  
 > $x(t_{i}) = x(t_{i-1}) + v_x*{\Delta}t + drift*rnd$

To make things extra interesting, let's use the same type of formula for the bee's position in the y direction.  In this case, everything would be the same, but replace the x's with y's in the formula.
 > $y(t_{i}) = y(t_{i-1}) + v_y*{\Delta}t + drift*rnd$
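
Before building the full loop, it may help to work through a single update by hand.  The numbers below are assumptions chosen for illustration; in particular, `rnd = 0.3` is a made-up stand-in for one draw from the random number generator.

```python
x_previous = 0.0   # position at the previous observation time (m)
vx = 1.0           # average x-velocity (m/s)
Dt = 0.5           # time between observations (s)
drift = 0.1        # scale of the random perturbation
rnd = 0.3          # made-up random number (normally drawn with np.random.normal())

# x(t_i) = x(t_{i-1}) + v_x*Dt + drift*rnd
x_new = x_previous + vx*Dt + drift*rnd
print(x_new)       # approximately 0.53
```

The bee moves 0.5 m because of its average velocity and roughly another 0.03 m because of the random perturbation.  The next update would start from this new position.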


 These two equations can be used to update the position of the bee over time - a loop is perfect for this!  But we need to know where the bee is at time $t_{i-1}$ to predict where it will go at time $t_i$.  So we will have to pick a starting position for the bee before we can start calculating step by step where the bee is going using a loop.  In addition, we need to know the average velocity of the bee in the x and y directions.  
 
One last issue is that if we want to use a loop to update the position of the bee over time, we need to initialize the array that will hold these positions first just like we did for the loop counter we made earlier, i.e., we can't update something that doesn't yet exist.
 
 So before we start coding, let's map out in words what we need to do to solve this problem:
 1. assign the parameters needed for the bee flight model ($v_x, v_y, {\Delta}t$, and *drift*) 
 2. assign the initial position of the bee ($x_o$ and $y_o$)
 3. define the number of steps through time for which we will observe the bee
 4. initialize an array to store the position results
 5. for each time step, calculate the new position of the bee from the old position
 6. repeat until the bee's position has been calculated for all time steps
 7. plot the path of the bee!
 
 OK!  Let's try coding this up!

In [None]:
#1. assign the parameters needed for the bee flight model:
vx = 1     #let's assume that the bee is flying at an average x-velocity of 1m/s
vy = 1     #                               and an average y-velocity of 1m/s (feel free to change these!)
Dt = 0.5   #Let's use a time step of 0.5 seconds between observations
drift = 0.1 #parameter controlling how much the bee drifts from its course (scales the random perturbation at each step)

#2. assign the initial position of the bee (let's assume that this is zero)
xo = 0
yo = 0

#3. define the number of steps through time for which we will observe the bee (let's observe the bee for 60 seconds)
obs_times = np.r_[0:60:Dt] 

#4. initialize an array to store the position results
positions = np.zeros([obs_times.shape[0],2])  #we will use two columns - one for x and one for y positions

#set the initial bee position in the array (note that we do this to be general in case we change xo and yo later)
positions[0,0] = xo 
positions[0,1] = yo
 
#5. for each time step....     (here comes the loop!)
for i in np.r_[1:positions.shape[0]:1]:   #we start the counter at 1 here instead of 0 because the first row is xo, yo
    #...calculate the new position of the bee from the old position
    positions[i,0] = positions[i-1,0] + vx*Dt + drift*np.random.normal()  #calculate the new x position
    positions[i,1] = positions[i-1,1] + vy*Dt + drift*np.random.normal()  #calculate the new y position
#6. repeat (the loop continues until all time steps have been met)
    
#7. plot the path of the bee!
import matplotlib.pyplot as plt
plt.plot(positions[:,0],positions[:,1],'.-')
plt.xlabel('x position')
plt.ylabel('y position')
plt.title('Flight of the Bumblebee! (drift='+str(drift)+')' )
plt.axis([-100, 100, -100, 100])

With the *`drift`* parameter set to 0.1 we can see that our bee is pretty down to business!  Try changing the drift parameter to 1, 2, and 5 and rerun the code cell multiple times for each case to see what the effect of the parameter is on the flight of the bumblebee!  Clearly the higher the drift parameter, the more distracted the bee!
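
Rather than editing the `drift` value and rerunning the cell by hand, you could also let a loop step through several drift values for you.  The sketch below is one way this might look, not the only one: it repeats the bee model for each value in a list and reports how far the bee ended up from where it would have been with no random drift at all.  The call to `np.random.seed(0)` and the `offsets` list are additions made here only so that the example is repeatable and easy to inspect.

```python
import numpy as np

np.random.seed(0)            # fix the random numbers so the example is repeatable

vx = 1                       # average x-velocity (m/s)
vy = 1                       # average y-velocity (m/s)
Dt = 0.5                     # time step between observations (s)
obs_times = np.r_[0:60:Dt]   # 120 observation times
drift_values = [0.1, 1, 2, 5]

offsets = []
for drift in drift_values:                       # outer loop: one pass per drift value
    positions = np.zeros([obs_times.shape[0], 2])
    for i in np.r_[1:positions.shape[0]:1]:      # inner loop: step the bee through time
        positions[i,0] = positions[i-1,0] + vx*Dt + drift*np.random.normal()
        positions[i,1] = positions[i-1,1] + vy*Dt + drift*np.random.normal()

    # where the bee would have finished with no random perturbations
    n_steps = positions.shape[0] - 1
    x_straight = vx*Dt*n_steps
    y_straight = vy*Dt*n_steps

    # distance between the actual end point and the drift-free end point
    offset = np.sqrt((positions[-1,0] - x_straight)**2 + (positions[-1,1] - y_straight)**2)
    offsets.append(offset)
    print('drift = ' + str(drift) + ', end point offset = ' + str(offset))
```

Since the perturbations are random, a larger drift value does not guarantee a larger offset on any single run, but it tends to produce one when the experiment is repeated many times.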

### Challenge Problem 5.3
The bee problem is a perfect example of where a function would be useful.  Copy the code from above to the cell below and modify it so that you can call a function as `beeflight(vx,vy,Dt,drift,xo,yo)` that will produce the same output that you were observing above.

In [None]:
#enter your code for the beeflight function here:




















#calling the function as follows should produce the same type of output as above:
beeflight(1,1,0.5,2,0,0)  #you can change the inputs of the function to change the bee flight

### Step 4: Plotting multiple data sets with a for loop.
One case where loops can work well is when you need to plot multiple data sets.  We will build on our graphing experience from the last four tutorials here to plot a graph with multiple data values on it.

Let's start by recalling the final challenge problem from Tutorial 4.  Remember that the goal was to plot user-specified portions of the data showing river levels for different stations, where the data were stored in different data files (make sure that you still have those data files from the challenge problem loaded in your Jupyter server - if you are using the same computer, they should still be there unless you deleted them).

Though your final solution may have been slightly different, my function for that problem looked like this:

In [None]:
def usgsplot(data,tstart,tend,river,sname):
    """
    usgsplot(data,tstart,tend,river,sname)
    This function plots a subset of data from a USGS stream gauge specified by the user.
    
       Input arguments:
       data = the full data record for the monitoring station
       tstart = index value for the first observation time to plot
       tend = index value for the final observation time to plot
       river = text string containing the name of the river
       sname = text string containing the name of the observation station
       
       Output arguments:
       None  (but a graph is created)
       
       usgsplot v.1.0 (created by S.Moysey)
    """
       
    pdata = data[tstart:tend,:]   #select the data to plot based on the specified start and end periods
    plt.plot(pdata[:,0],pdata[:,1])  #plot the selected data

    #add axis labels:
    plt.xlabel('Days since Jan.1, 2021') 
    plt.ylabel('River Stage [m]')
    
    #add a custom title
    plt.title(river+' River at '+sname+' - Observation Period: Day '+str(pdata[0,0])+' to Day '+str(pdata[pdata.shape[0]-1,0]))


You may recall that we loaded the data to plot from data files, loading them like this:

In [None]:
#load the data for the Fayetteville station
data = np.genfromtxt('Tutorial4_Fayetteville2021_stagedata.csv', delimiter=',', skip_header=1)

#we could then use our function to plot the data:
usgsplot(data,100,6000,'Cape Fear','Fayetteville')

In the challenge problem we had used three different data files from three different stations and manually loaded and plotted the data for each one.  Let's use loops to automate that process. 

In [None]:
#let's use a list to store the file name for each of the data files:
data_files = ['Tutorial4_Fayetteville2021_stagedata.csv','Tutorial4_Lillington2021_stagedata.csv', 'Tutorial4_Greenville2021_stagedata.csv' ]
#let's also define lists for the river and station name for each data set (the order should match the list of files above):
station_name = ['Fayetteville', 'Lillington', 'Greenville']
river_name = ['Cape Fear', 'Cape Fear', 'Tar'] #note that we need to repeat Cape Fear since both the Fayetteville
                                               #and Lillington stations are on that river

#let's also specify the start and stop position in the data that we are going to plot:
start = 100
stop = 6000

#now let's create the for loop to plot the data:
for ind,fname in enumerate(data_files):  #notice that we are giving the list of filenames to iterate on
    data = np.genfromtxt(fname, delimiter=',', skip_header=1) #fname contains the filename from the list for the current iteration
    usgsplot(data,start,stop,river_name[ind],station_name[ind])

plt.legend(station_name)

To make this really general, we could now also put our new loop in a function:

In [None]:
def plotusgsfiles(data_files, station_name, river_name, start, stop):
    """
    plotusgsfiles(data_files, station_name, river_name, start, stop)
    This function plots the data record from multiple USGS stream gauge files on a single plot.
    
    Input Arguments:
    data_files - a list containing the filename of each data file
    station_name - a list containing the name of each station associated with the data files (must have an entry for each data file)
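    river_name - a list containing the name of each river associated with the data files (must have an entry for each data file)
    start - index value for the first observation time to plot
    stop - index value for the final observation time to plot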
    
    Output Arguments:
    None (makes a plot)
    
    plotusgsfiles v.1.0 (created by S. Moysey)
    """
    import numpy as np
    import matplotlib.pyplot as plt
    
    for ind,fname in enumerate(data_files):  #notice that we are giving the list of filenames to iterate on
        data = np.genfromtxt(fname, delimiter=',', skip_header=1) #fname contains the filename from the list for the current iteration
        usgsplot(data,start,stop,river_name[ind],station_name[ind])

    plt.legend(station_name)  #add the legend once, after the loop has plotted every data set

In [None]:
plotusgsfiles(data_files,station_name,river_name,100,6000)  #figure made with just one command!

So imagine that plotting these data sets was a common task that you needed to perform (for example, perhaps as part of your job you need to make this identical plot every month with the most recent data available).  If you saved these two functions (i.e., `usgsplot()` and `plotusgsfiles()`) in a module, you could very easily create these professional plots in a consistent way with little chance of data errors in just a matter of seconds every month!  Talk about impressing your boss!
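
If you don't have the stream gauge files handy, you can still try out the load-in-a-loop pattern.  The sketch below swaps the CSV files for two small in-memory text blocks (wrapped with `io.StringIO` so that `np.genfromtxt()` can read them like files); the station names and all of the numbers are invented purely for illustration, and the plotting step is left out to keep the example short.

```python
import io
import numpy as np

# made-up stand-ins for two data files (header row, then day and stage columns)
file_text = ['day,stage\n1,2.0\n2,2.5\n3,3.0\n',
             'day,stage\n1,1.0\n2,1.5\n3,0.5\n']
station_name = ['Station A', 'Station B']   # hypothetical station names

max_stage = []
for ind, text in enumerate(file_text):
    fake_file = io.StringIO(text)            # behaves like an open data file
    data = np.genfromtxt(fake_file, delimiter=',', skip_header=1)
    max_stage.append(data[:,1].max())        # largest value in the stage column
    print(station_name[ind] + ': ' + str(data.shape[0]) + ' rows, max stage = ' + str(max_stage[ind]))
```

The structure is the same as in the loop over `data_files`: the loop variable supplies the data source, and the counter from `enumerate()` picks out the matching station name.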

### Challenge Problem 5.4
For your final challenge problem you will create a function to load and read a data file where observations are listed in rows with each column representing a different variable (the first column will always be the values for the x-axis).  Your function must be able to plot the data for all of the columns on a new plot (<b>tip:</b> to create a new figure window, use the command `figure()` from the matplotlib.pyplot module).  

Your function should work for both the [Tutorial5_example_data.csv](https://www.dropbox.com/s/cjy5muk0229fpk0/Tutorial5_example_data.csv?dl=0) and [Tutorial5_example_data2.csv](https://www.dropbox.com/s/grl3jmxnfh9yarv/Tutorial5_example_data2.csv?dl=0) files, which each have a different number of columns.  (Download the files from these links and then upload them to your Jupyter server to access them as you have done in other tutorials.) 

Here are some output examples of what your output graphs should look like:  
[Output Example for Column 3 of Tutorial_example_data1.csv](https://www.dropbox.com/s/p1g8s215o19xesr/Tutorial5_dataset1_output.png?dl=0)  
[Output Example for Column 2 of Tutorial_example_data2.csv](https://www.dropbox.com/s/diowc21pnusaga3/Tutorial5_dataset2_output.png?dl=0)

(Note: I call the first column that contains the x-values "column 1")

In [None]:
#add your code for the function here:



















In [None]:
#your function should create 19 plots for the first data set and 3 plots for the second data set
plotdata('Tutorial5_example_data.csv')
plotdata('Tutorial5_example_data2.csv')