**Welcome to the second session of NYC Opportunity Data School!**

Today we're going to learn two things:

* Data Structure
* Pandas library
    * Viewing/Inspecting
    * Selection/Indexing
    * Grouping data
    * Working with Date format data



# What is a data structure? 

* A way to store data, together with methods to retrieve and manipulate it 

* Examples in Python:

    * List, Dictionary, tuple, set, string
    * Array
    * **Series, DataFrame** 
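
As a rough illustration of the list above, the sketch below stores the same three numbers in a few of these structures. The values and the names `scores`, `scores_by_name`, and `scores_series` are made up for this example.

```python
import pandas as pd

# The same three (made-up) values held in different structures
scores = [90, 85, 77]                                 # list: ordered, accessed by position
scores_by_name = {'ann': 90, 'bob': 85, 'cat': 77}    # dictionary: accessed by key
scores_series = pd.Series(scores_by_name)             # Series: labeled one-dimensional array

print(scores[0])               # first element of the list
print(scores_by_name['bob'])   # value stored under the key 'bob'
print(scores_series.mean())    # a Series comes with built-in methods such as mean()
```

Each structure holds the same data; what differs is how you look values up and which methods are available.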

In [1]:
import pandas as pd

# pandas Data Structure 

pandas provides two main data structures: Series and DataFrame. 

* Series in pandas: a one-dimensional labeled array with homogeneous data. All elements of a Series share the same data type. 
* DataFrame in pandas: a two-dimensional labeled data structure, usually represented in tabular format. Each column represents an attribute and each row represents one observation (a person or an object). 
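
To make the two definitions concrete, here is a minimal sketch that builds a small DataFrame by hand and pulls one column out of it. The dog names and ages are invented, and `toy` and `age_column` are hypothetical names used only here.

```python
import pandas as pd

# Hypothetical toy data: each row is one dog, each column is one attribute
toy = pd.DataFrame({
    'name': ['REX', 'BELLA', 'MAX'],
    'age': [3, 5, 2],
})

age_column = toy['age']            # a single column of a DataFrame is a Series

print(type(toy).__name__)          # DataFrame
print(type(age_column).__name__)   # Series
print(age_column.dtype)            # all elements of the Series share one dtype
print(toy.shape)                   # (number of rows, number of columns)
```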

In [2]:
## importing CSV file into pandas dataframe

url = 'https://data.cityofnewyork.us/resource/5hyw-n69x.csv'
dog_data = pd.read_csv(url)

## pd.read_excel("filename.xlsx")

To take a peek at a dataframe, we can use the methods .head() or .tail() to look at the first or last 5 rows of the table. 

In [3]:
dog_data.head()

Unnamed: 0,animalbirth,animalgender,animalname,borough,breedname,censustract2010,citycouncildistrict,communitydistrict,congressionaldistrict,licenseexpireddate,licenseissueddate,nta,rownumber,statesenatorialdistrict,zipcode
0,2000-01-01T00:00:00.000,M,SHADOW,Brooklyn,Beagle,1014.0,46.0,318.0,8.0,2016-01-30T00:00:00.000,2014-12-29T00:00:00.000,BK50,1753,19.0,11236
1,2011-10-01T00:00:00.000,M,ROCCO,Brooklyn,Boxer,756.0,45.0,314.0,9.0,2016-01-30T00:00:00.000,2015-01-07T00:00:00.000,BK43,2415,17.0,11210
2,2005-09-01T00:00:00.000,M,LUIGI,Bronx,Maltese,516.0,13.0,210.0,14.0,2016-02-02T00:00:00.000,2015-01-17T00:00:00.000,BX10,3328,34.0,10464
3,2013-08-01T00:00:00.000,F,PETUNIA,Brooklyn,Pug,419.0,34.0,304.0,7.0,2016-03-28T00:00:00.000,2015-03-01T00:00:00.000,BK78,7537,18.0,11221
4,2008-10-01T00:00:00.000,M,ROMEO,Bronx,Maltese,65.0,17.0,201.0,15.0,2016-03-09T00:00:00.000,2015-03-09T00:00:00.000,BX34,8487,32.0,10451


In [4]:
# Create a Series 
s = pd.Series([3, -5, 7, 4]) 
s

0    3
1   -5
2    7
3    4
dtype: int64

In [5]:
# Create a Series with index

s_index = pd.Series([3, -5, 7, 4], index = ['a', 'b', 'c', 'd'])
s_index

a    3
b   -5
c    7
d    4
dtype: int64
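
Once a Series has an index, values can be looked up by label as well as by position. The sketch below reuses the same values as `s_index`; the variable names `by_label`, `by_position`, and `subset` are just for illustration.

```python
import pandas as pd

s_index = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])

by_label = s_index['b']          # look up by index label
by_position = s_index.iloc[2]    # look up by integer position (third element)
subset = s_index[['a', 'd']]     # a list of labels returns a smaller Series

print(by_label, by_position)
print(subset)
```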

## Viewing/Inspecting data 
 
 * info()
 * head()/tail()
 * describe()
 * value_counts()
 * pd.crosstab()

In [6]:
## Index, data structure, datatype of each column, number of non-null (non-missing) values 
dog_data.info() 

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 15 columns):
animalbirth                1000 non-null object
animalgender               1000 non-null object
animalname                 1000 non-null object
borough                    1000 non-null object
breedname                  1000 non-null object
censustract2010            974 non-null float64
citycouncildistrict        974 non-null float64
communitydistrict          974 non-null float64
congressionaldistrict      974 non-null float64
licenseexpireddate         1000 non-null object
licenseissueddate          1000 non-null object
nta                        974 non-null object
rownumber                  1000 non-null int64
statesenatorialdistrict    974 non-null float64
zipcode                    1000 non-null int64
dtypes: float64(5), int64(2), object(8)
memory usage: 117.3+ KB


In [7]:
## First 5 rows of the Dataframe 
dog_data.head() 

## dog_data.head(10) ## First 10 rows of the Dataframe

Unnamed: 0,animalbirth,animalgender,animalname,borough,breedname,censustract2010,citycouncildistrict,communitydistrict,congressionaldistrict,licenseexpireddate,licenseissueddate,nta,rownumber,statesenatorialdistrict,zipcode
0,2000-01-01T00:00:00.000,M,SHADOW,Brooklyn,Beagle,1014.0,46.0,318.0,8.0,2016-01-30T00:00:00.000,2014-12-29T00:00:00.000,BK50,1753,19.0,11236
1,2011-10-01T00:00:00.000,M,ROCCO,Brooklyn,Boxer,756.0,45.0,314.0,9.0,2016-01-30T00:00:00.000,2015-01-07T00:00:00.000,BK43,2415,17.0,11210
2,2005-09-01T00:00:00.000,M,LUIGI,Bronx,Maltese,516.0,13.0,210.0,14.0,2016-02-02T00:00:00.000,2015-01-17T00:00:00.000,BX10,3328,34.0,10464
3,2013-08-01T00:00:00.000,F,PETUNIA,Brooklyn,Pug,419.0,34.0,304.0,7.0,2016-03-28T00:00:00.000,2015-03-01T00:00:00.000,BK78,7537,18.0,11221
4,2008-10-01T00:00:00.000,M,ROMEO,Bronx,Maltese,65.0,17.0,201.0,15.0,2016-03-09T00:00:00.000,2015-03-09T00:00:00.000,BX34,8487,32.0,10451


In [8]:
## Summary statistics for numerical columns

dog_data.describe() 

Unnamed: 0,censustract2010,citycouncildistrict,communitydistrict,congressionaldistrict,rownumber,statesenatorialdistrict,zipcode
count,974.0,974.0,974.0,974.0,1000.0,974.0,1000.0
mean,5360.5154,20.867556,245.823409,10.422998,8485.01,24.120123,10626.766
std,17259.628015,16.25345,132.361031,2.74278,23867.322548,6.415948,589.369198
min,1.0,1.0,101.0,3.0,1.0,10.0,7030.0
25%,93.0,5.0,107.0,8.0,250.75,20.0,10025.0
50%,189.0,18.0,211.0,11.0,500.5,26.0,10462.5
75%,539.0,35.0,318.0,12.0,750.25,28.0,11222.0
max,157101.0,51.0,595.0,16.0,121295.0,36.0,11697.0


In [9]:
## Return a series with unique values and counts
dog_data.borough.value_counts()

Manhattan        403
Brooklyn         255
Queens           179
Bronx            104
Staten Island     57
BROOKLYN           2
Name: borough, dtype: int64
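
The counts above list `BROOKLYN` and `Brooklyn` as two separate values because string comparison is case-sensitive. One possible way to merge them, sketched here on a small made-up Series rather than on `dog_data` itself, is to normalize the capitalization before counting.

```python
import pandas as pd

# Small made-up sample that mimics the inconsistent capitalization seen above
borough = pd.Series(['Brooklyn', 'BROOKLYN', 'Manhattan', 'Brooklyn'])

# str.title() capitalizes the first letter of each word and lowercases the rest
cleaned = borough.str.title()
counts = cleaned.value_counts()

print(counts)
```

Whether this kind of cleanup is appropriate depends on the data; here it is only an assumption that the two spellings refer to the same borough.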

In [10]:
## Return a series with unique values and counts
dog_data.communitydistrict.value_counts() ## Excludes NA values by default

107.0    74
108.0    59
106.0    52
104.0    51
306.0    38
102.0    34
401.0    33
301.0    32
302.0    32
208.0    26
103.0    25
501.0    23
310.0    23
101.0    22
405.0    21
111.0    20
406.0    20
503.0    19
112.0    19
402.0    18
210.0    17
304.0    15
407.0    15
109.0    14
318.0    14
502.0    13
303.0    13
313.0    13
311.0    13
411.0    13
307.0    12
209.0    12
315.0    11
211.0    11
105.0    11
308.0    10
404.0    10
201.0    10
414.0     9
413.0     8
314.0     8
110.0     8
409.0     8
206.0     7
410.0     7
309.0     5
207.0     5
312.0     5
403.0     5
412.0     4
408.0     4
202.0     4
316.0     4
204.0     4
305.0     3
212.0     3
203.0     2
595.0     2
205.0     1
Name: communitydistrict, dtype: int64

In [11]:
## View unique values and counts, including missing values
dog_data.communitydistrict.value_counts(dropna = False)

 107.0    74
 108.0    59
 106.0    52
 104.0    51
 306.0    38
 102.0    34
 401.0    33
 301.0    32
 302.0    32
 208.0    26
NaN       26
 103.0    25
 310.0    23
 501.0    23
 101.0    22
 405.0    21
 111.0    20
 406.0    20
 503.0    19
 112.0    19
 402.0    18
 210.0    17
 304.0    15
 407.0    15
 109.0    14
 318.0    14
 502.0    13
 303.0    13
 313.0    13
 311.0    13
 411.0    13
 307.0    12
 209.0    12
 315.0    11
 211.0    11
 105.0    11
 201.0    10
 308.0    10
 404.0    10
 414.0     9
 314.0     8
 413.0     8
 409.0     8
 110.0     8
 410.0     7
 206.0     7
 309.0     5
 207.0     5
 312.0     5
 403.0     5
 202.0     4
 204.0     4
 316.0     4
 412.0     4
 408.0     4
 305.0     3
 212.0     3
 595.0     2
 203.0     2
 205.0     1
Name: communitydistrict, dtype: int64
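
The effect of `dropna` is easier to see on a tiny example. The sketch below uses an invented Series with three missing values and also shows `isna().sum()` as a direct way to count them.

```python
import numpy as np
import pandas as pd

# Made-up values: three real entries and three missing ones
district = pd.Series([107.0, 107.0, 108.0, np.nan, np.nan, np.nan])

without_na = district.value_counts()             # NaN rows are ignored
with_na = district.value_counts(dropna=False)    # NaN gets its own row
n_missing = district.isna().sum()                # direct count of missing values

print(without_na.sum(), with_na.sum(), n_missing)
```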

In [12]:
## Create a crosstab table by two categorical variables

pd.crosstab(dog_data.animalgender, dog_data.borough)

borough,BROOKLYN,Bronx,Brooklyn,Manhattan,Queens,Staten Island
animalgender,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
F,0,44,117,186,76,26
M,2,60,138,217,103,31


In [13]:
pd.crosstab(dog_data.animalgender, dog_data.borough, margins = True) ## Including total values 

borough,BROOKLYN,Bronx,Brooklyn,Manhattan,Queens,Staten Island,All
animalgender,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
F,0,44,117,186,76,26,449
M,2,60,138,217,103,31,551
All,2,104,255,403,179,57,1000


In [14]:
## Number of unique dog names by borough and animalgender 

pd.crosstab(dog_data.animalgender, dog_data.borough, dog_data.animalname, aggfunc = pd.Series.nunique)

## official documentation: https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.crosstab.html

borough,BROOKLYN,Bronx,Brooklyn,Manhattan,Queens,Staten Island
animalgender,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
F,,43.0,104.0,162.0,69.0,26.0
M,2.0,56.0,126.0,200.0,95.0,30.0
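
Besides raw counts, `pd.crosstab` can also report proportions through its `normalize` argument. The following is a small sketch on invented data; the `toy` dataframe and its values are assumptions made for the example.

```python
import pandas as pd

# Made-up data: two female and four male dogs across two boroughs
toy = pd.DataFrame({
    'gender': ['F', 'F', 'M', 'M', 'M', 'M'],
    'borough': ['Bronx', 'Queens', 'Bronx', 'Bronx', 'Bronx', 'Queens'],
})

counts = pd.crosstab(toy['gender'], toy['borough'])
# normalize='index' divides each row by its row total, so every row sums to 1
row_share = pd.crosstab(toy['gender'], toy['borough'], normalize='index')

print(counts)
print(row_share)
```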


## Selection/ Indexing 

* Retrieve part of a dataframe based on conditions, column names, etc. 
* Selection is possible by position, by column name (label), and by condition

In [15]:
## Getting a single column 

# df['colname'] or df.colname : select a single column as a Series

dog_data['animalgender']

0      M
1      M
2      M
3      F
4      M
5      M
6      M
7      F
8      F
9      M
10     F
11     F
12     F
13     M
14     M
15     F
16     F
17     M
18     M
19     M
20     F
21     M
22     F
23     M
24     F
25     M
26     M
27     M
28     M
29     F
      ..
970    F
971    F
972    M
973    M
974    F
975    M
976    F
977    F
978    F
979    M
980    F
981    F
982    M
983    M
984    F
985    M
986    M
987    M
988    F
989    F
990    M
991    F
992    M
993    M
994    M
995    M
996    M
997    F
998    F
999    F
Name: animalgender, Length: 1000, dtype: object

In [16]:
# Select a subset of columns with all rows 

sm_dog_data = dog_data[['animalgender', 'animalname', 'zipcode']]

sm_dog_data.head()

Unnamed: 0,animalgender,animalname,zipcode
0,M,SHADOW,11236
1,M,ROCCO,11210
2,M,LUIGI,10464
3,F,PETUNIA,11221
4,M,ROMEO,10451


In [17]:
## Return number of rows and columns
sm_dog_data.shape

(1000, 3)

In [18]:
## Select subset of rows with all columns

sm_dog_data_2 = dog_data[0:100]
sm_dog_data_2.shape

(100, 15)

In [19]:
## TOGETHER: Select a subset of a Dataframe with conditions for both rows and columns
## Note: .loc slices by label and includes the end label, so 0:100 returns 101 rows

sm_dog_data_3 = dog_data.loc[0:100, ['animalgender', 'animalname', 'zipcode']]
sm_dog_data_3.shape

(101, 3)
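
The two previous cells return 100 and 101 rows for what looks like the same range. The difference comes from positional slicing excluding the end point while label-based `.loc` slicing includes it. A minimal sketch on a made-up ten-row dataframe:

```python
import pandas as pd

toy = pd.DataFrame({'value': range(10)})   # default index labels 0 to 9

by_slice = toy[0:5]        # positional slice: end position excluded
by_iloc = toy.iloc[0:5]    # positional slice via .iloc: end position excluded
by_loc = toy.loc[0:5]      # label-based slice via .loc: end label included

print(len(by_slice), len(by_iloc), len(by_loc))
```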

In [20]:
## I want to have only Manhattan dogs.

mn_dog = dog_data.loc[dog_data['borough'] == 'Manhattan']
mn_dog.shape

(403, 15)

**Exercise: I want to have Manhattan or Queens dogs** 




**Exercise: I want to have only Manhattan girl dogs** 

**Exercise: I want to have dogs in my zipcode** 

**Hint** 

**1. cheat sheet**

* dog_data.loc[(condition1) & (condition2)] 

* df[df.colname.isin(['val1', 'val2'])]

**2. Python Operators**

* &: and 
* | : or 
* ~ : not
* == : equal 
* != : not equal 
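
The operators in the hint can be tried out on a toy dataframe before applying them to `dog_data`. The columns `color` and `weight` below are invented for this sketch. Note that each condition is wrapped in parentheses, because `&` and `|` bind more tightly than `==` and `>`.

```python
import pandas as pd

# Made-up data, unrelated to the dog dataset
toy = pd.DataFrame({
    'color': ['red', 'blue', 'red', 'green'],
    'weight': [1, 5, 7, 3],
})

both = toy.loc[(toy['color'] == 'red') & (toy['weight'] > 2)]     # and
either = toy.loc[(toy['color'] == 'blue') | (toy['weight'] > 6)]  # or
negated = toy.loc[~(toy['color'] == 'red')]                       # not
members = toy[toy['color'].isin(['blue', 'green'])]               # membership test

print(len(both), len(either), len(negated), len(members))
```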

In [21]:
## Filter rows based on logic 

# df[df.colname > 1000] : rows where the colname column is greater than 1000

In [22]:
dog_data[dog_data.rownumber > 1000].head() 

Unnamed: 0,animalbirth,animalgender,animalname,borough,breedname,censustract2010,citycouncildistrict,communitydistrict,congressionaldistrict,licenseexpireddate,licenseissueddate,nta,rownumber,statesenatorialdistrict,zipcode
0,2000-01-01T00:00:00.000,M,SHADOW,Brooklyn,Beagle,1014.0,46.0,318.0,8.0,2016-01-30T00:00:00.000,2014-12-29T00:00:00.000,BK50,1753,19.0,11236
1,2011-10-01T00:00:00.000,M,ROCCO,Brooklyn,Boxer,756.0,45.0,314.0,9.0,2016-01-30T00:00:00.000,2015-01-07T00:00:00.000,BK43,2415,17.0,11210
2,2005-09-01T00:00:00.000,M,LUIGI,Bronx,Maltese,516.0,13.0,210.0,14.0,2016-02-02T00:00:00.000,2015-01-17T00:00:00.000,BX10,3328,34.0,10464
3,2013-08-01T00:00:00.000,F,PETUNIA,Brooklyn,Pug,419.0,34.0,304.0,7.0,2016-03-28T00:00:00.000,2015-03-01T00:00:00.000,BK78,7537,18.0,11221
4,2008-10-01T00:00:00.000,M,ROMEO,Bronx,Maltese,65.0,17.0,201.0,15.0,2016-03-09T00:00:00.000,2015-03-09T00:00:00.000,BX34,8487,32.0,10451


## Write to CSV/Excel file 



In [23]:
mn_dog.to_csv('manhattan_dog.csv')

In [24]:
mn_dog.to_excel('manhattan_dog.xlsx')
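
By default `to_csv` also writes the row index as an extra first column, which then reappears as an `Unnamed: 0` column when the file is read back. A minimal sketch of avoiding that with `index=False`, using a made-up dataframe and a temporary directory (the file name `toy_dogs.csv` is hypothetical):

```python
import os
import tempfile

import pandas as pd

toy = pd.DataFrame({'name': ['REX', 'BELLA'], 'zipcode': [10001, 11201]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'toy_dogs.csv')   # hypothetical file name
    toy.to_csv(path, index=False)                   # leave the row labels out of the file
    reloaded = pd.read_csv(path)                    # read the file back in

print(reloaded.columns.tolist())
```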

In [25]:
## To get the current working directory

import os
os.getcwd()

'C:\\Users\\sujlee\\GitHub\\data_school'

## Grouping Data 

The 'group by' process involves one or more of the following steps:

 * **Splitting** the data into groups based on some criteria 
 * **Applying** a function to each group independently (count, sum, mean, etc)
 * **Combining** the results into a data structure 
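
The three steps can be spelled out by hand on a small example and compared with what `groupby` does in a single call. This is only a sketch on invented data; `manual` and `built_in` are names chosen for the illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    'group': ['a', 'b', 'a', 'b'],
    'value': [1, 10, 3, 30],
})

# Split: iterate over one sub-dataframe per group
# Apply: take the mean of 'value' within each piece
# Combine: collect the results into a single Series
manual = pd.Series({key: part['value'].mean() for key, part in toy.groupby('group')})

# groupby(...).mean() performs the same three steps in one call
built_in = toy.groupby('group')['value'].mean()

print(manual)
print(built_in)
```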

In [None]:
## Templates: replace df, 'colname', and 'col2' with your own dataframe and column names

# df.groupby('colname')
# df.groupby('colname')['col2'].mean() # returns the mean of the values in col2 grouped by colname 

## MultiIndex 
# df.groupby(['colname1', 'colname2']) 



In [26]:
## read a new dataset on Titanic passengers 

titanic = pd.read_csv('titanic.csv', sep='|')
titanic.head()

Unnamed: 0,pclass,survived,name,sex,age,sibsp,parch,ticket,fare,cabin,embarked,boat,body,dest
0,1,1,"Allen, Miss. Elisabeth Walton",female,29.0,0,0,24160,211.3375,B5,S,2.0,,"St Louis, MO"
1,1,1,"Allison, Master. Hudson Trevor",male,0.9167,1,2,113781,151.55,C22 C26,S,11.0,,"Montreal, PQ / Chesterville, ON"
2,1,0,"Allison, Miss. Helen Loraine",female,2.0,1,2,113781,151.55,C22 C26,S,,,"Montreal, PQ / Chesterville, ON"
3,1,0,"Allison, Mr. Hudson Joshua Creighton",male,30.0,1,2,113781,151.55,C22 C26,S,,135.0,"Montreal, PQ / Chesterville, ON"
4,1,0,"Allison, Mrs. Hudson J C (Bessie Waldo Daniels)",female,25.0,1,2,113781,151.55,C22 C26,S,,,"Montreal, PQ / Chesterville, ON"


In [27]:
## What is the average fare by class?  

titanic.groupby('pclass')['fare'].mean()

pclass
1    87.508992
2    21.179196
3    13.302889
Name: fare, dtype: float64

In [28]:
## How many people survived by class?

titanic.groupby('pclass')['survived'].sum()

pclass
1    200
2    119
3    181
Name: survived, dtype: int64

In [29]:
## How many people survived by class and gender?

titanic.groupby(['pclass', 'sex'])['survived'].sum()

pclass  sex   
1       female    139
        male       61
2       female     94
        male       25
3       female    106
        male       75
Name: survived, dtype: int64
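
Grouping by two columns returns a Series with a two-level index (a MultiIndex). The sketch below, on a few made-up rows rather than the real Titanic file, shows two common ways to work with such a result: reshaping it into a table with `unstack()` and selecting one entry with a tuple of labels.

```python
import pandas as pd

# Made-up rows that only borrow the column names from the Titanic data
toy = pd.DataFrame({
    'pclass': [1, 1, 2, 2, 2],
    'sex': ['female', 'male', 'female', 'male', 'male'],
    'survived': [1, 0, 1, 0, 1],
})

by_class_sex = toy.groupby(['pclass', 'sex'])['survived'].sum()  # two-level index
as_table = by_class_sex.unstack()             # inner level ('sex') becomes the columns
one_value = by_class_sex.loc[(2, 'male')]     # select one entry with a tuple of labels

print(as_table)
print(one_value)
```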

In [30]:
## What if we want to combine this result back with the original dataframe?

titanic.groupby('pclass')['fare'].transform('mean')

0       87.508992
1       87.508992
2       87.508992
3       87.508992
4       87.508992
5       87.508992
6       87.508992
7       87.508992
8       87.508992
9       87.508992
10      87.508992
11      87.508992
12      87.508992
13      87.508992
14      87.508992
15      87.508992
16      87.508992
17      87.508992
18      87.508992
19      87.508992
20      87.508992
21      87.508992
22      87.508992
23      87.508992
24      87.508992
25      87.508992
26      87.508992
27      87.508992
28      87.508992
29      87.508992
          ...    
1279    13.302889
1280    13.302889
1281    13.302889
1282    13.302889
1283    13.302889
1284    13.302889
1285    13.302889
1286    13.302889
1287    13.302889
1288    13.302889
1289    13.302889
1290    13.302889
1291    13.302889
1292    13.302889
1293    13.302889
1294    13.302889
1295    13.302889
1296    13.302889
1297    13.302889
1298    13.302889
1299    13.302889
1300    13.302889
1301    13.302889
1302    13.302889
1303    13

In [31]:
## Add the series above as a new column in the original titanic dataframe 

titanic['avg_fare_class'] = titanic.groupby('pclass')['fare'].transform('mean')

In [32]:
## check the result! 

titanic.head()

Unnamed: 0,pclass,survived,name,sex,age,sibsp,parch,ticket,fare,cabin,embarked,boat,body,dest,avg_fare_class
0,1,1,"Allen, Miss. Elisabeth Walton",female,29.0,0,0,24160,211.3375,B5,S,2.0,,"St Louis, MO",87.508992
1,1,1,"Allison, Master. Hudson Trevor",male,0.9167,1,2,113781,151.55,C22 C26,S,11.0,,"Montreal, PQ / Chesterville, ON",87.508992
2,1,0,"Allison, Miss. Helen Loraine",female,2.0,1,2,113781,151.55,C22 C26,S,,,"Montreal, PQ / Chesterville, ON",87.508992
3,1,0,"Allison, Mr. Hudson Joshua Creighton",male,30.0,1,2,113781,151.55,C22 C26,S,,135.0,"Montreal, PQ / Chesterville, ON",87.508992
4,1,0,"Allison, Mrs. Hudson J C (Bessie Waldo Daniels)",female,25.0,1,2,113781,151.55,C22 C26,S,,,"Montreal, PQ / Chesterville, ON",87.508992
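
The key difference between an aggregation and `transform` is the shape of the result: an aggregation returns one row per group, while `transform` returns one row per original row, which is why it can be assigned straight back as a column. A small sketch on invented fares (the column `fare_vs_class_avg` is a hypothetical follow-up calculation):

```python
import pandas as pd

toy = pd.DataFrame({
    'pclass': [1, 1, 3, 3],
    'fare': [100.0, 50.0, 10.0, 20.0],
})

per_group = toy.groupby('pclass')['fare'].mean()            # one row per group
per_row = toy.groupby('pclass')['fare'].transform('mean')   # one row per original row

toy['avg_fare_class'] = per_row
# Hypothetical derived column: how far each fare is from its class average
toy['fare_vs_class_avg'] = toy['fare'] - toy['avg_fare_class']

print(len(per_group), len(per_row))
print(toy)
```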


# Working with Date format data

To use pandas' date-related tools, the date column must first be in a format that pandas can handle (datetime64) rather than plain strings. 

In [33]:
dog_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 15 columns):
animalbirth                1000 non-null object
animalgender               1000 non-null object
animalname                 1000 non-null object
borough                    1000 non-null object
breedname                  1000 non-null object
censustract2010            974 non-null float64
citycouncildistrict        974 non-null float64
communitydistrict          974 non-null float64
congressionaldistrict      974 non-null float64
licenseexpireddate         1000 non-null object
licenseissueddate          1000 non-null object
nta                        974 non-null object
rownumber                  1000 non-null int64
statesenatorialdistrict    974 non-null float64
zipcode                    1000 non-null int64
dtypes: float64(5), int64(2), object(8)
memory usage: 117.3+ KB


In [34]:
## Convert string to timestamp format 

dog_data['animalbirth'] = pd.to_datetime(dog_data['animalbirth'])

dog_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 15 columns):
animalbirth                1000 non-null datetime64[ns]
animalgender               1000 non-null object
animalname                 1000 non-null object
borough                    1000 non-null object
breedname                  1000 non-null object
censustract2010            974 non-null float64
citycouncildistrict        974 non-null float64
communitydistrict          974 non-null float64
congressionaldistrict      974 non-null float64
licenseexpireddate         1000 non-null object
licenseissueddate          1000 non-null object
nta                        974 non-null object
rownumber                  1000 non-null int64
statesenatorialdistrict    974 non-null float64
zipcode                    1000 non-null int64
dtypes: datetime64[ns](1), float64(5), int64(2), object(7)
memory usage: 117.3+ KB
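
The conversion above works because every value in `animalbirth` is a valid date string. When a column may contain values that cannot be parsed, one option is `errors='coerce'`, which turns them into `NaT` (a missing timestamp) instead of raising an error. A minimal sketch on a made-up Series:

```python
import pandas as pd

# Two valid date strings in the same style as the dog data, plus one invalid entry
raw = pd.Series(['2000-01-01T00:00:00.000', '2011-10-01T00:00:00.000', 'not a date'])

# errors='coerce' replaces unparseable strings with NaT instead of raising
parsed = pd.to_datetime(raw, errors='coerce')

print(parsed)
print(parsed.isna().sum())   # number of values that could not be parsed
```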


In [35]:
## Extract the year from a timestamp column (.dt.month and .dt.day work the same way)

dog_data['birth_yr'] = dog_data['animalbirth'].dt.year

dog_data.head()

Unnamed: 0,animalbirth,animalgender,animalname,borough,breedname,censustract2010,citycouncildistrict,communitydistrict,congressionaldistrict,licenseexpireddate,licenseissueddate,nta,rownumber,statesenatorialdistrict,zipcode,birth_yr
0,2000-01-01,M,SHADOW,Brooklyn,Beagle,1014.0,46.0,318.0,8.0,2016-01-30T00:00:00.000,2014-12-29T00:00:00.000,BK50,1753,19.0,11236,2000
1,2011-10-01,M,ROCCO,Brooklyn,Boxer,756.0,45.0,314.0,9.0,2016-01-30T00:00:00.000,2015-01-07T00:00:00.000,BK43,2415,17.0,11210,2011
2,2005-09-01,M,LUIGI,Bronx,Maltese,516.0,13.0,210.0,14.0,2016-02-02T00:00:00.000,2015-01-17T00:00:00.000,BX10,3328,34.0,10464,2005
3,2013-08-01,F,PETUNIA,Brooklyn,Pug,419.0,34.0,304.0,7.0,2016-03-28T00:00:00.000,2015-03-01T00:00:00.000,BK78,7537,18.0,11221,2013
4,2008-10-01,M,ROMEO,Bronx,Maltese,65.0,17.0,201.0,15.0,2016-03-09T00:00:00.000,2015-03-09T00:00:00.000,BX34,8487,32.0,10451,2008


In [36]:
## Re-format the timestamp any way you want

dog_data['animalbirth1'] = dog_data['animalbirth'].dt.strftime('%m/%d/%Y') 
dog_data.head(2)

Unnamed: 0,animalbirth,animalgender,animalname,borough,breedname,censustract2010,citycouncildistrict,communitydistrict,congressionaldistrict,licenseexpireddate,licenseissueddate,nta,rownumber,statesenatorialdistrict,zipcode,birth_yr,animalbirth1
0,2000-01-01,M,SHADOW,Brooklyn,Beagle,1014.0,46.0,318.0,8.0,2016-01-30T00:00:00.000,2014-12-29T00:00:00.000,BK50,1753,19.0,11236,2000,01/01/2000
1,2011-10-01,M,ROCCO,Brooklyn,Boxer,756.0,45.0,314.0,9.0,2016-01-30T00:00:00.000,2015-01-07T00:00:00.000,BK43,2415,17.0,11210,2011,10/01/2011


In [37]:
## Get the weekday from a timestamp

dog_data['birth_day'] = dog_data['animalbirth'].dt.strftime('%a')
dog_data.head()

Unnamed: 0,animalbirth,animalgender,animalname,borough,breedname,censustract2010,citycouncildistrict,communitydistrict,congressionaldistrict,licenseexpireddate,licenseissueddate,nta,rownumber,statesenatorialdistrict,zipcode,birth_yr,animalbirth1,birth_day
0,2000-01-01,M,SHADOW,Brooklyn,Beagle,1014.0,46.0,318.0,8.0,2016-01-30T00:00:00.000,2014-12-29T00:00:00.000,BK50,1753,19.0,11236,2000,01/01/2000,Sat
1,2011-10-01,M,ROCCO,Brooklyn,Boxer,756.0,45.0,314.0,9.0,2016-01-30T00:00:00.000,2015-01-07T00:00:00.000,BK43,2415,17.0,11210,2011,10/01/2011,Sat
2,2005-09-01,M,LUIGI,Bronx,Maltese,516.0,13.0,210.0,14.0,2016-02-02T00:00:00.000,2015-01-17T00:00:00.000,BX10,3328,34.0,10464,2005,09/01/2005,Thu
3,2013-08-01,F,PETUNIA,Brooklyn,Pug,419.0,34.0,304.0,7.0,2016-03-28T00:00:00.000,2015-03-01T00:00:00.000,BK78,7537,18.0,11221,2013,08/01/2013,Thu
4,2008-10-01,M,ROMEO,Bronx,Maltese,65.0,17.0,201.0,15.0,2016-03-09T00:00:00.000,2015-03-09T00:00:00.000,BX34,8487,32.0,10451,2008,10/01/2008,Wed
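
Apart from `strftime`, the `.dt` accessor exposes date parts directly as attributes and methods. The sketch below uses two of the birth dates shown above in a standalone Series; the variable names are chosen only for this example.

```python
import pandas as pd

birth = pd.to_datetime(pd.Series(['2000-01-01', '2005-09-01']))

month = birth.dt.month               # month as a number
day = birth.dt.day                   # day of the month
weekday_name = birth.dt.day_name()   # full weekday name
weekday_num = birth.dt.dayofweek     # Monday=0 ... Sunday=6

print(month.tolist(), day.tolist())
print(weekday_name.tolist(), weekday_num.tolist())
```

Unlike `strftime`, which always returns strings, `dt.month`, `dt.day`, and `dt.dayofweek` return numbers, which is convenient for sorting and grouping.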


**Documentation on strftime**

http://strftime.org/