## Sara the Turtle: `Pandas` Redux
Recall the script we wrote earlier to process the ARGOS tracking data for Sara the Turtle. Here, we'll demonstrate how `pandas` can greatly simplify that task. More specifically, we'll look at the ability of `pandas` to extract and summarize data stored in a DataFrame.

In [None]:
#Set the user date
userDate = '7/3/2003'

In [None]:
#Set the location classes
locClasses = ['1','2','3']

In [None]:
#import the modules
import pandas as pd

In [None]:
#set a variable to the path where the tracking data lives
dataFilename = '../Data/Sara.txt'  #forward slashes avoid backslash-escape problems

#### Importing the CSV file into a DataFrame
Here we pull the data into a DataFrame. We need to set a few extra parameters in the `read_csv` statement because of the format of our "csv" file.
* First, we need to skip the comment lines, which is done with the `comment=` argument.
* Second, we need to specify the delimiter with the `delimiter=` argument, since it isn't the default of a comma.
* Third, these data have a 'uid' column with values that uniquely identify each row; we'll use this as the index of our DataFrame rather than the default of sequentially increasing integers. This is done with the `index_col` argument.
* Lastly, we have a few numeric columns that should be imported as strings since they represent nominal values. We do this with the `dtype` argument, passing a dictionary that lists the columns whose default data types we want to override on import.

In [None]:
#Open the data as a pandas dataframe. 
df = pd.read_csv(dataFilename,
                 comment='#',   #Skip lines that start with '#'
                 delimiter='\t', #Set the delimiter as <tab>
                 index_col='uid',
                 dtype={'uid':'str','tag_id':'str'}
                )
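
To see what these arguments do without needing the tracking file, here is a minimal sketch that reads a few made-up records from an in-memory string. The text in `sample_text`, its values, and the names `sample_text` and `sample_df` are assumptions for illustration only; they are not taken from Sara's actual data.

```python
import io
import pandas as pd

# Hypothetical stand-in for the tracking file: comment lines start with '#',
# fields are tab-separated, and 'uid' identifies each record.
sample_text = (
    "# Sample ARGOS-style records (made up for illustration)\n"
    "uid\ttag_id\tutc\tlc\tlat1\tlon1\n"
    "001\t0042\t7/3/2003 9:13\t1\t33.9\t-77.9\n"
    "002\t0042\t7/3/2003 17:32\tB\t33.8\t-77.8\n"
)

# Same arguments as the notebook's read_csv call, applied to the sample text
sample_df = pd.read_csv(io.StringIO(sample_text),
                        comment='#',      # skip lines that start with '#'
                        delimiter='\t',   # fields are separated by tabs
                        index_col='uid',  # use 'uid' as the row index
                        dtype={'uid': 'str', 'tag_id': 'str'})

# 'tag_id' keeps its leading zeros because it was read as a string
print(sample_df['tag_id'].tolist())
print(sample_df.dtypes)
```

Without the `dtype` override, `tag_id` would be parsed as an integer and its leading zeros would be lost.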

Now, let's check the data types in what we just imported...

In [None]:
df.dtypes

In [None]:
df.head()

#### Extracting a subset of the data
Subsetting data in `pandas` is a two-step process. The first step is to create a **row mask**: a single column of Boolean values, one per row, indicating whether that row meets a criterion we specify. The second step is to apply that mask to the DataFrame so that only the rows where it is True are kept. Below, we'll create a row mask of rows that match the user-provided date. There are more sophisticated ways of dealing with dates, but for our purposes, selecting rows whose `utc` value starts with the date string the user provides works.

In [None]:
#Create a mask of user dates
dateMask = df['utc'].str.startswith(userDate)
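
As one example of a more robust way of handling dates, the sketch below parses the timestamps with `pd.to_datetime` and compares calendar dates instead of text. The sample values and the `'%m/%d/%Y %H:%M'` format are assumptions for illustration; the real `utc` column may be formatted differently.

```python
import pandas as pd

# Hypothetical timestamps; note the second one is zero-padded
sample = pd.DataFrame({'utc': ['7/3/2003 9:13', '07/03/2003 17:32', '7/4/2003 1:05']})

# A text prefix match misses the zero-padded record for the same day
prefix_mask = sample['utc'].str.startswith('7/3/2003')

# Parsing to datetimes and comparing the (midnight-normalized) dates does not
parsed = pd.to_datetime(sample['utc'], format='%m/%d/%Y %H:%M')
date_mask = parsed.dt.normalize() == pd.Timestamp('2003-07-03')

print(prefix_mask.tolist())
print(date_mask.tolist())
```

Parsed dates also make range queries possible, such as selecting every record within a given week, which a string match cannot do.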

Here we'll create another row mask, this time selecting rows with `lc` values within our list. 

In [None]:
#Create a mask of location classes
lcMask = df['lc'].isin(locClasses)

And now we *apply* these masks, using the `&` operator, which compares the two masks row by row, to select and return rows where *both* masks are true.

In [None]:
#Filter the records that match the above masks
dfOutput = df[dateMask & lcMask]
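
The following sketch shows how two masks combine on a small made-up table. The records are hypothetical and only reuse the notebook's column names.

```python
import pandas as pd

# Hypothetical records reusing the notebook's column names
sample = pd.DataFrame({
    'utc': ['7/3/2003 9:13', '7/3/2003 17:32', '7/4/2003 1:05'],
    'lc':  ['1', 'B', '2'],
})

date_mask = sample['utc'].str.startswith('7/3/2003')
lc_mask = sample['lc'].isin(['1', '2', '3'])

# '&' combines the masks row by row; Python's 'and' would raise a ValueError here
both = date_mask & lc_mask
print(both.tolist())

# Conditions written inline need parentheses, because '&' binds tighter than '=='
inline = sample[(sample['lc'] == '1') & sample['utc'].str.startswith('7/3/2003')]
print(len(inline))
```

Only the first record passes both tests: the second has a location class outside the accepted values, and the third falls on a different date.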

In [None]:
#How many rows meet these criteria
print(len(dfOutput))

In [None]:
print("On {}, Sara the turtle was found at:".format(userDate))
for i,row in dfOutput.iterrows():
    print('  {0} deg Lat; {1} deg lon'.format(row['lat1'],row['lon1']))
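
Beyond listing each record, the filtered rows can also be summarized. The sketch below computes a mean position from a made-up table standing in for `dfOutput`; the coordinates and the names `sample_output` and `mean_pos` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical filtered records standing in for dfOutput
sample_output = pd.DataFrame({'lat1': [33.9, 33.7], 'lon1': [-77.9, -77.5]})

# Column-wise mean of the two coordinate columns
mean_pos = sample_output[['lat1', 'lon1']].mean()

print('Mean position: {0:.2f} deg Lat; {1:.2f} deg lon'.format(
    mean_pos['lat1'], mean_pos['lon1']))
```

A simple average of latitude and longitude is only a rough summary, but it is reasonable when the points are close together.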