## AcademyXi Data Analysis - Data Manipulation
### Workshop B - Data manipulation in practice
In this workshop module, we will go through a number of ways in which you can use Python for data manipulation. 

Think of this as the beginning of a rich, rewarding journey. Some of the material below may seem difficult (or easy), depending on your experience with Python and Pandas.

If you're not sure about how some part of the code below is working, try reviewing the documentation for the method or function (e.g. [here](https://pandas.pydata.org/docs/index.html)). Alternatively, think about what the code has done to the underlying data, and go back to the code to see if you can understand the steps it's taken to do so.

Good luck!

### Preparation

This section prepares the notebook: it installs the required packages, imports the libraries and loads the data.

In [None]:
%%capture tmp
# Install additional libraries required (fsspec and s3fs) to load files through AWS S3
# (%%capture is a cell magic, so it has to be the first line of the cell)
!pip install fsspec s3fs

# Import libraries to be used
import plotly.express as px
import pandas as pd
import numpy as np

In [None]:
# Load data from S3
df = pd.read_csv("s3://databyjp/academyxi/Datafiniti_Womens_Shoes_sm.csv")

In [None]:
# Check that the file has been properly loaded
df.head()

Unnamed: 0,id,dateAdded,dateUpdated,asins,brand,categories,primaryCategories,colors,dimension,ean,imageURLs,keys,manufacturer,manufacturerNumber,name,prices.amountMax,prices.amountMin,prices.availability,prices.color,prices.condition,prices.currency,prices.dateAdded,prices.dateSeen,prices.isSale,prices.merchant,prices.offer,prices.returnPolicy,prices.shipping,prices.size,prices.sourceURLs,sizes,sourceURLs,upc,weight
0,AVpfEf_hLJeJML431ueH,2015-05-04T12:13:08Z,2018-01-29T04:38:43Z,,Naturalizer,"Clothing,Shoes,Women's Shoes,All Women's Shoes...",Shoes,"Silver,Cream Watercolor Floral",,,https://i5.walmartimages.com/asr/861ca6cf-fa55...,"naturalizer/47147sc022,017136472311,womensnatu...",,47147SC022,Naturalizer Danya Women N/S Open Toe Synthetic...,55.99,55.99,,UWomens M Regular,,USD,2017-03-28T11:40:25Z,"2017-03-25T09:19:24.819Z,2017-03-25T09:19:19.600Z",False,Overstock.com,,,,S,https://www.overstock.com/Clothing-Shoes/Women...,"6W,9W,7.5W,12W,8.5M,9N,9M,9.5M,10.5M,10W,8.5W,...",https://www.walmart.com/ip/Naturalizer-Danya-W...,17136472311,
1,AVpi74XfLJeJML43qZAc,2017-01-27T01:23:39Z,2018-01-03T05:21:54Z,,MUK LUKS,"Clothing,Shoes,Women's Shoes,Women's Casual Sh...",Shoes,Grey,,33977050000.0,https://i5.walmartimages.com/asr/421de5d5-3a74...,"mukluks/00173650206,033977045743,muklukswomens...",Muk Luks,0017365020-6,MUK LUKS Womens Jane Suede Moccasin,47.0,35.25,In Stock,Grey,New,USD,2018-01-03T05:21:54Z,"2017-12-08T14:24:00.000Z,2017-11-01T02:52:00.000Z",True,Walmart.com,,,Standard,6,https://www.walmart.com/ip/MUK-LUKS-Womens-Jan...,107698,https://www.walmart.com/ip/MUK-LUKS-Womens-Jan...,33977045743,
2,AVpi74XfLJeJML43qZAc,2017-01-27T01:23:39Z,2018-01-03T05:21:54Z,,MUK LUKS,"Clothing,Shoes,Women's Shoes,Women's Casual Sh...",Shoes,Grey,,33977050000.0,https://i5.walmartimages.com/asr/421de5d5-3a74...,"mukluks/00173650206,033977045743,muklukswomens...",Muk Luks,0017365020-6,MUK LUKS Womens Jane Suede Moccasin,35.25,35.25,In Stock,Grey,New,USD,2017-12-06T05:02:42Z,"2017-11-10T15:11:00.000Z,2017-11-18T08:00:00.000Z",False,Slippers Dot Com,,,Value,6,https://www.walmart.com/ip/MUK-LUKS-Womens-Jan...,107698,https://www.walmart.com/ip/MUK-LUKS-Womens-Jan...,33977045743,
3,AVpjXyCc1cnluZ0-V-Gj,2017-01-27T01:25:56Z,2018-01-04T11:52:35Z,,MUK LUKS,"Clothing,Shoes,Women's Shoes,All Women's Shoes...","Shoes,Shoes",Black,6.0 in x 6.0 in x 1.0 in,33977050000.0,https://i5.walmartimages.com/asr/950d38a5-0113...,"033977045903,muklukswomensdawnsuedescuffslippe...",Muk Luks,0017366001-6,MUK LUKS Womens Dawn Suede Scuff Slipper,24.75,24.75,In Stock,Black,New,USD,2018-01-04T11:52:35Z,2017-12-07T16:37:00.000Z,False,Slippers Dot Com,,,Value,6,https://www.walmart.com/ip/MUK-LUKS-Womens-Daw...,107698,https://www.walmart.com/ip/MUK-LUKS-Womens-Daw...,33977045903,
4,AVphGKLPilAPnD_x1Nrm,2017-01-27T01:25:56Z,2018-01-18T03:55:18Z,,MUK LUKS,"Clothing,Shoes,Women's Shoes,All Women's Shoes...",Shoes,Grey,6.0 in x 6.0 in x 1.0 in,33977050000.0,https://i5.walmartimages.com/asr/5e137bc3-c900...,"mukluks/00173660206,033977045958,0033977045958...",,0017366020-6,MUK LUKS Womens Dawn Suede Scuff Slipper,33.0,30.39,In Stock,Grey,New,USD,2017-12-04T21:35:47Z,2017-11-17T21:15:00.000Z,True,Walmart.com,,,Expedited,6,https://www.walmart.com/ip/MUK-LUKS-Womens-Daw...,107698,https://www.walmart.com/ip/MUK-LUKS-Womens-Daw...,33977045958,


In [None]:
# Show summary information about the DataFrame, as well as individual columns
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 34 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   id                   1000 non-null   object 
 1   dateAdded            1000 non-null   object 
 2   dateUpdated          1000 non-null   object 
 3   asins                3 non-null      object 
 4   brand                1000 non-null   object 
 5   categories           1000 non-null   object 
 6   primaryCategories    1000 non-null   object 
 7   colors               339 non-null    object 
 8   dimension            35 non-null     object 
 9   ean                  112 non-null    float64
 10  imageURLs            1000 non-null   object 
 11  keys                 1000 non-null   object 
 12  manufacturer         67 non-null     object 
 13  manufacturerNumber   254 non-null    object 
 14  name                 1000 non-null   object 
 15  prices.amountMax     1000 non-null   fl

## Sort / filter data

Sorting and filtering data in a Pandas DataFrame is easy and powerful. Take a look at some common ways to do it below.

### Sort data with Python and Pandas

In [None]:
# We will be using just a few columns, so let's make a copy of the DataFrame with only those
sdf = df[["id", "prices.merchant", "prices.amountMax"]].copy()

In [None]:
# .sort_values is a method you will use often. It can take a single column name, like so:
sdf.sort_values("prices.amountMax")

Unnamed: 0,id,prices.merchant,prices.amountMax
132,AVpfWLtyilAPnD_xaGle,Walmart.com,5.87
136,AVpgOcmuilAPnD_xpVft,Walmart.com,5.87
162,AVphqVhoLJeJML43dZ2a,Walmart.com,8.97
133,AVphQm0-ilAPnD_x3RCO,Walmart.com,9.62
134,AVpe_2h81cnluZ0-boAU,Walmart.com,12.44
...,...,...,...
511,AV-ncuMfHh53nbDR_Vej,,120.00
144,AV_HAsJHHh53nbDR_7AF,,130.00
145,AV_HAsJHHh53nbDR_7AF,,130.00
166,AVpim4qG1cnluZ0-O5LN,Overstock.com,145.95


In [None]:
# Or provide multiple column names as a list; the data is sorted by the first column, then by the next one within ties
sdf.sort_values(["prices.merchant", "prices.amountMax"])

Unnamed: 0,id,prices.merchant,prices.amountMax
165,AVpfKjuI1cnluZ0-fS-R,AmazingBasics,37.41
164,AVpibldAilAPnD_xEGpy,Big Deal Hunter,39.99
142,AVph34uzilAPnD_x-STz,DAILYWEAR SPORTSWEAR CORP.,15.88
0,AVpfEf_hLJeJML431ueH,Overstock.com,55.99
166,AVpim4qG1cnluZ0-O5LN,Overstock.com,145.95
...,...,...,...
521,AV-ncuMfHh53nbDR_Vej,,120.00
522,AV-ncuMfHh53nbDR_Vej,,120.00
523,AV-ncuMfHh53nbDR_Vej,,120.00
144,AV_HAsJHHh53nbDR_7AF,,130.00


In [None]:
# By default, .sort_values method sorts the data in ascending order. 
# To sort in descending order, add the ascending=False argument.
sdf.sort_values(["prices.merchant", "prices.amountMax"], ascending=False)

Unnamed: 0,id,prices.merchant,prices.amountMax
14,AVpgNGD6ilAPnD_xpAke,Walmart.com,69.0
12,AVph9abqilAPnD_x_Nrp,Walmart.com,68.0
13,AVph9abqilAPnD_x_Nrp,Walmart.com,68.0
8,AVpfLXyhilAPnD_xWmNc,Walmart.com,59.0
9,AVpfeWdJ1cnluZ0-lXYU,Walmart.com,59.0
...,...,...,...
540,AV_EeA7_KZqtpbFMTSsJ,,24.0
541,AV_EeA7_KZqtpbFMTSsJ,,24.0
542,AV_EeA7_KZqtpbFMTSsJ,,24.0
543,AV_EeA7_KZqtpbFMTSsJ,,24.0


In [None]:
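# To sort one column in ascending order and another in descending order, the methods can be chained
# (with the order of columns reversed). Caution: this only works if the second sort is stable
# (kind="stable"); the default is not, which is why the Overstock.com rows below are not in
# descending price order. A single call with ascending=[True, False] is more reliable.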
sdf.sort_values("prices.amountMax", ascending=False).sort_values("prices.merchant")

Unnamed: 0,id,prices.merchant,prices.amountMax
165,AVpfKjuI1cnluZ0-fS-R,AmazingBasics,37.41
164,AVpibldAilAPnD_xEGpy,Big Deal Hunter,39.99
142,AVph34uzilAPnD_x-STz,DAILYWEAR SPORTSWEAR CORP.,15.88
0,AVpfEf_hLJeJML431ueH,Overstock.com,55.99
166,AVpim4qG1cnluZ0-O5LN,Overstock.com,145.95
...,...,...,...
540,AV_EeA7_KZqtpbFMTSsJ,,24.00
539,AV_EeA7_KZqtpbFMTSsJ,,24.00
538,AV_EeA7_KZqtpbFMTSsJ,,24.00
544,AV_EeA7_KZqtpbFMTSsJ,,24.00
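
Chaining is one route to a mixed-direction sort; another is a single call to `.sort_values` with a list passed to `ascending`, one entry per column. The sketch below assumes a made-up `toy` DataFrame with simplified column names (`merchant`, `amount_max`) standing in for the workshop data.

```python
import pandas as pd

# Hypothetical toy data; not taken from the workshop file
toy = pd.DataFrame({
    "merchant": ["Walmart.com", "Overstock.com", "Walmart.com", "Overstock.com"],
    "amount_max": [47.0, 55.99, 59.0, 145.95],
})

# Merchant ascending (A to Z), price descending within each merchant
mixed = toy.sort_values(["merchant", "amount_max"], ascending=[True, False])
print(mixed)
```

Because both sort keys are handled in one call, the result does not depend on the stability of the sorting algorithm.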


### Filter data with Python and Pandas
Three useful ways to access particular rows of a Pandas DataFrame are by:

- row number;
- `index` value; or
- column values.

Let's take a look at each below.

#### Filter by rows
The `.iloc` indexer can be used to slice the data by integer position, i.e. by the order in which the rows are currently arranged.

In [None]:
# Get the first 10 rows
sdf.iloc[:10]

Unnamed: 0,id,prices.merchant,prices.amountMax
0,AVpfEf_hLJeJML431ueH,Overstock.com,55.99
1,AVpi74XfLJeJML43qZAc,Walmart.com,47.0
2,AVpi74XfLJeJML43qZAc,Slippers Dot Com,35.25
3,AVpjXyCc1cnluZ0-V-Gj,Slippers Dot Com,24.75
4,AVphGKLPilAPnD_x1Nrm,Walmart.com,33.0
5,AVpg91ziilAPnD_xziOo,Walmart.com,14.0
6,AVpjGKXyLJeJML43r8BH,,24.0
7,AVpjGKXyLJeJML43r8BH,,24.0
8,AVpfLXyhilAPnD_xWmNc,Walmart.com,59.0
9,AVpfeWdJ1cnluZ0-lXYU,Walmart.com,59.0


In [None]:
# Get the row at position 15 (the 16th row, since positions start at 0)
sdf.iloc[15]  # Note that a single row is returned as a Series rather than a DataFrame

id                  AVpgBzl41cnluZ0-vSQf
prices.merchant              Walmart.com
prices.amountMax                      50
Name: 15, dtype: object

#### Filter by index
Each row of a DataFrame includes an `index` value, which acts as a name for each row. 

This might simply be a meaningless number, but it can be more: for example a date or a user ID, which allows for convenient selection of subsets.
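
As a rough illustration of a meaningful index, the sketch below uses a made-up `toy` DataFrame with a hypothetical `user_id` column (the workshop data has no such column) and promotes that column to the index with `.set_index`.

```python
import pandas as pd

# Hypothetical toy data; "user_id" is an illustrative column name
toy = pd.DataFrame({
    "user_id": ["u01", "u02", "u03"],
    "amount_max": [55.99, 47.0, 35.25],
})

# Use the user ID as the row label
by_user = toy.set_index("user_id")

row = by_user.loc["u02"]              # a single label returns a Series
subset = by_user.loc[["u01", "u03"]]  # a list of labels returns a DataFrame
print(row)
print(subset)
```

With labels like these, rows can be selected by name rather than by where they happen to sit in the table.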

In [None]:
# Get rows where the index label is 10 or smaller (.loc slices include the end label)
sdf.loc[:10,:]

Unnamed: 0,id,prices.merchant,prices.amountMax
0,AVpfEf_hLJeJML431ueH,Overstock.com,55.99
1,AVpi74XfLJeJML43qZAc,Walmart.com,47.0
2,AVpi74XfLJeJML43qZAc,Slippers Dot Com,35.25
3,AVpjXyCc1cnluZ0-V-Gj,Slippers Dot Com,24.75
4,AVphGKLPilAPnD_x1Nrm,Walmart.com,33.0
5,AVpg91ziilAPnD_xziOo,Walmart.com,14.0
6,AVpjGKXyLJeJML43r8BH,,24.0
7,AVpjGKXyLJeJML43r8BH,,24.0
8,AVpfLXyhilAPnD_xWmNc,Walmart.com,59.0
9,AVpfeWdJ1cnluZ0-lXYU,Walmart.com,59.0
10,AVpfeWdJ1cnluZ0-lXYU,Walmart.com,59.0


The example above looks much like the `.iloc` one, but because `.loc` works on index labels rather than positions, we can do things like:

In [None]:
# Select rows where the index is 10 or smaller, and the merchant is Walmart
sdf[sdf["prices.merchant"]=="Walmart.com"].loc[:10]

Unnamed: 0,id,prices.merchant,prices.amountMax
1,AVpi74XfLJeJML43qZAc,Walmart.com,47.0
4,AVphGKLPilAPnD_x1Nrm,Walmart.com,33.0
5,AVpg91ziilAPnD_xziOo,Walmart.com,14.0
8,AVpfLXyhilAPnD_xWmNc,Walmart.com,59.0
9,AVpfeWdJ1cnluZ0-lXYU,Walmart.com,59.0
10,AVpfeWdJ1cnluZ0-lXYU,Walmart.com,59.0
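
The difference between labels and positions becomes visible once rows have been filtered out. The following sketch, on a made-up `toy` DataFrame with simplified column names, applies the same slice through `.loc` and `.iloc` to a filtered frame.

```python
import pandas as pd

# Hypothetical toy data; not taken from the workshop file
toy = pd.DataFrame({
    "merchant": ["Overstock.com", "Walmart.com", "Slippers Dot Com",
                 "Walmart.com", "Walmart.com"],
    "amount_max": [55.99, 47.0, 35.25, 33.0, 14.0],
})

# After filtering, the remaining index labels are 1, 3 and 4
walmart = toy[toy["merchant"] == "Walmart.com"]

by_label = walmart.loc[:3]      # labels up to and including 3
by_position = walmart.iloc[:3]  # the first three rows, whatever their labels
print(by_label)
print(by_position)
```

Here `.loc[:3]` keeps the rows labelled 1 and 3, while `.iloc[:3]` keeps all three remaining rows.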


#### Filter by column data

You may have noticed the code `sdf[sdf["prices.merchant"]=="Walmart.com"]` above when showing how to filter data by the index. 

This code uses a Boolean Series produced by `sdf["prices.merchant"]=="Walmart.com"`, in which each row is marked as `True` or `False`, based on whether the row's "prices.merchant" column value is "Walmart.com".

This is an extremely powerful method of data filtering, as any number of logical (and/or) operations can be combined using these Boolean arrays as you will see below.

Pay attention to how the query is constructed using parentheses to combine logical operations: each comparison needs its own pair, because `&` and `|` bind more tightly than comparison operators such as `==` and `>`. If you are not sure, I find it helpful to articulate what each clause within one set of parentheses is doing, and to consider each conditional (& = AND, | = OR) clause.
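
One way to keep such queries readable is to give each Boolean mask a name before combining them. The sketch below assumes a made-up `toy` DataFrame with simplified column names; the mask names are illustrative only.

```python
import pandas as pd

# Hypothetical toy data; not taken from the workshop file
toy = pd.DataFrame({
    "merchant": ["Walmart.com", "Walmart.com", "Overstock.com", None],
    "amount_max": [59.0, 5.87, 145.95, 130.0],
})

# Each mask is a Series of True/False values, one per row
is_walmart = toy["merchant"] == "Walmart.com"
is_extreme = (toy["amount_max"] > 60) | (toy["amount_max"] < 10)

both = toy[is_walmart & is_extreme]  # AND: both conditions hold
not_walmart = toy[~is_walmart]       # NOT: ~ inverts a mask
print(both)
print(not_walmart)
```

Named masks also make it easier to inspect an intermediate result when a filter returns something unexpected.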

In [None]:
# Get the portion of the dataframe where "prices.merchant" has "Walmart.com" values
sdf[sdf["prices.merchant"]=="Walmart.com"]

Unnamed: 0,id,prices.merchant,prices.amountMax
1,AVpi74XfLJeJML43qZAc,Walmart.com,47.00
4,AVphGKLPilAPnD_x1Nrm,Walmart.com,33.00
5,AVpg91ziilAPnD_xziOo,Walmart.com,14.00
8,AVpfLXyhilAPnD_xWmNc,Walmart.com,59.00
9,AVpfeWdJ1cnluZ0-lXYU,Walmart.com,59.00
...,...,...,...
201,AVpe43H3LJeJML43xkBD,Walmart.com,25.76
202,AVpfk9jZ1cnluZ0-nUd_,Walmart.com,23.99
203,AVph_sRxilAPnD_x_kaA,Walmart.com,23.99
206,AVphbNXp1cnluZ0-CU4j,Walmart.com,19.74


In [None]:
# Get the portion of the dataframe where "prices.merchant" has "Walmart.com" values, 
# and the prices.amountMax is above 50
sdf[(sdf["prices.merchant"]=="Walmart.com") & (sdf["prices.amountMax"] > 50)]

Unnamed: 0,id,prices.merchant,prices.amountMax
8,AVpfLXyhilAPnD_xWmNc,Walmart.com,59.0
9,AVpfeWdJ1cnluZ0-lXYU,Walmart.com,59.0
10,AVpfeWdJ1cnluZ0-lXYU,Walmart.com,59.0
12,AVph9abqilAPnD_x_Nrp,Walmart.com,68.0
13,AVph9abqilAPnD_x_Nrp,Walmart.com,68.0
14,AVpgNGD6ilAPnD_xpAke,Walmart.com,69.0
19,AVpivUm2ilAPnD_xHGfN,Walmart.com,59.0


In [None]:
# Get the portion of the dataframe where "prices.merchant" has "Walmart.com" values
# and the prices.amountMax is above 60 or less than 10
sdf[(sdf["prices.merchant"]=="Walmart.com") & ((sdf["prices.amountMax"] > 60) | (sdf["prices.amountMax"] < 10))]

Unnamed: 0,id,prices.merchant,prices.amountMax
12,AVph9abqilAPnD_x_Nrp,Walmart.com,68.0
13,AVph9abqilAPnD_x_Nrp,Walmart.com,68.0
14,AVpgNGD6ilAPnD_xpAke,Walmart.com,69.0
132,AVpfWLtyilAPnD_xaGle,Walmart.com,5.87
133,AVphQm0-ilAPnD_x3RCO,Walmart.com,9.62
136,AVpgOcmuilAPnD_xpVft,Walmart.com,5.87
162,AVphqVhoLJeJML43dZ2a,Walmart.com,8.97


In [None]:
# Get the portion of the dataframe where "prices.merchant" is missing
# and the prices.amountMax is above 120
sdf[(sdf["prices.merchant"].isna()) & (sdf["prices.amountMax"] > 120)]

Unnamed: 0,id,prices.merchant,prices.amountMax
144,AV_HAsJHHh53nbDR_7AF,,130.0
145,AV_HAsJHHh53nbDR_7AF,,130.0


In [None]:
# Get the portion of the dataframe where "prices.merchant" contains string "com"
# and the prices.amountMax is above 60
# (na=False treats rows with a missing merchant as non-matches rather than NaN)
sdf[(sdf["prices.merchant"].str.contains("com", na=False)) & (sdf["prices.amountMax"] > 60)]

Unnamed: 0,id,prices.merchant,prices.amountMax
11,AVpf38PKLJeJML43FRwO,Shoebuy.com,95.0
12,AVph9abqilAPnD_x_Nrp,Walmart.com,68.0
13,AVph9abqilAPnD_x_Nrp,Walmart.com,68.0
14,AVpgNGD6ilAPnD_xpAke,Walmart.com,69.0
166,AVpim4qG1cnluZ0-O5LN,Overstock.com,145.95


As you can see, Pandas provides flexible and powerful data filtering tools. This just scratches the surface of the large array of ways in which you can filter data in Pandas. 

To learn more, check out [this tutorial](https://pandas.pydata.org/pandas-docs/dev/getting_started/intro_tutorials/03_subset_data.html) from Pandas, and other methods such as `.query` ([reference](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html)), `.filter` ([reference](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.filter.html)) and how to test for patterns in strings ([reference](https://pandas.pydata.org/pandas-docs/stable/user_guide/text.html#testing-for-strings-that-match-or-contain-a-pattern)).
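
As a small taste of `.query`, the sketch below writes the same filter twice, once with Boolean masks and once as a query string. It assumes a made-up `toy` DataFrame whose column names are plain identifiers; names containing dots, such as `prices.merchant`, are not valid Python identifiers and would need the backtick quoting described in the `.query` reference.

```python
import pandas as pd

# Hypothetical toy data with identifier-friendly column names
toy = pd.DataFrame({
    "merchant": ["Walmart.com", "Walmart.com", "Overstock.com"],
    "amount_max": [59.0, 14.0, 145.95],
})

# Boolean-mask version
masked = toy[(toy["merchant"] == "Walmart.com") & (toy["amount_max"] > 50)]

# Equivalent .query version; `and` / `or` can be used inside the string
queried = toy.query('merchant == "Walmart.com" and amount_max > 50')
print(queried)
```

Which form reads better is largely a matter of taste; the mask form is the one used throughout this workshop.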

## Data type conversions

Now, let's take a look at how to convert data types within Pandas.

### Data types - simple conversion

To convert one data type to another in a DataFrame, the `.astype` method can be used ([read more](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.astype.html)). Take a look below:

In [None]:
# Floating point to String
df["prices.amountMax"].astype(str)

0      55.99
1       47.0
2      35.25
3      24.75
4       33.0
       ...  
995    64.99
996    64.99
997    64.99
998    64.99
999    64.99
Name: prices.amountMax, Length: 1000, dtype: object

In [None]:
# Floating point to Integer (truncates toward zero rather than rounding)
df["prices.amountMax"].astype(int)

0      55
1      47
2      35
3      24
4      33
       ..
995    64
996    64
997    64
998    64
999    64
Name: prices.amountMax, Length: 1000, dtype: int64

In [None]:
# Boolean to Binary / Integer
df["prices.isSale"].astype(int)

0      0
1      1
2      0
3      0
4      1
      ..
995    0
996    0
997    0
998    0
999    0
Name: prices.isSale, Length: 1000, dtype: int64
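
When converting floats to integers, it matters whether the fractional part is dropped or rounded. The sketch below compares the two on a made-up `prices` Series (the values, including the negative one, are illustrative only).

```python
import pandas as pd

# Hypothetical values chosen to show the difference
prices = pd.Series([55.99, 35.25, -1.7])

truncated = prices.astype(int)        # drops the fractional part
rounded = prices.round().astype(int)  # rounds to the nearest whole number first
print(truncated.tolist())
print(rounded.tolist())
```

If the nearest whole number is what you want, calling `.round()` before `.astype(int)` is the safer habit.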

### Data types - converting dates

Without manual intervention, dates and/or times are usually loaded as strings. However, they are best handled in a native datetime format as it allows date/time specific operations.

Take a look at a few examples below:

What happens when we manipulate the data as-is, without converting it to a datetime object?

In [None]:
# Grab the year data - first four characters of the "dateAdded" column
df["dateAddedYr"] = df["dateAdded"].str[:4]
print(df["dateAddedYr"])

0      2015
1      2017
2      2017
3      2017
4      2017
       ... 
995    2017
996    2017
997    2017
998    2017
999    2017
Name: dateAddedYr, Length: 1000, dtype: object


In [None]:
# What happens if we operate on the year column?
df["dateAddedYr"] * 2

0      20152015
1      20172017
2      20172017
3      20172017
4      20172017
         ...   
995    20172017
996    20172017
997    20172017
998    20172017
999    20172017
Name: dateAddedYr, Length: 1000, dtype: object

In [None]:
# So let's convert the data to integers
df["dateAddedYr"] = df["dateAddedYr"].astype(int)

However, many operations are easier if the column is converted to a datetime type.

In [None]:
# "2015-05-04T12:13:08Z" is in the standard ISO 8601 datetime format, so the column can be converted to datetime objects directly.
df["dateAdded"] = pd.to_datetime(df["dateAdded"])
df["dateAdded"]

0     2015-05-04 12:13:08+00:00
1     2017-01-27 01:23:39+00:00
2     2017-01-27 01:23:39+00:00
3     2017-01-27 01:25:56+00:00
4     2017-01-27 01:25:56+00:00
                 ...           
995   2017-08-04 22:20:57+00:00
996   2017-08-04 22:20:57+00:00
997   2017-08-04 22:20:57+00:00
998   2017-08-04 22:20:57+00:00
999   2017-08-04 22:20:57+00:00
Name: dateAdded, Length: 1000, dtype: datetime64[ns, UTC]
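
One example of a date/time-specific operation is subtracting two datetime columns. The sketch below assumes a made-up two-row `toy` DataFrame that reuses timestamp strings seen in the data preview; the `daysLive` column name is hypothetical.

```python
import pandas as pd

# Hypothetical toy data using timestamps in the same ISO 8601 format
toy = pd.DataFrame({
    "dateAdded": ["2015-05-04T12:13:08Z", "2017-01-27T01:23:39Z"],
    "dateUpdated": ["2018-01-29T04:38:43Z", "2018-01-03T05:21:54Z"],
})

# Convert both string columns to timezone-aware datetimes
for col in ["dateAdded", "dateUpdated"]:
    toy[col] = pd.to_datetime(toy[col])

# Subtracting datetimes gives timedeltas; .dt.days keeps the whole days
toy["daysLive"] = (toy["dateUpdated"] - toy["dateAdded"]).dt.days
print(toy)
```

The same subtraction on the original string columns would raise an error, which is one reason to convert early.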

In [None]:
# Once the column has been converted to datetime objects, their properties can be accessed through the `.dt` accessor
print(df["dateAdded"].dt.year)  # Year
print(df["dateAdded"].dt.timetz)  # Time of day, with timezone information
print(df["dateAdded"].dt.dayofweek)  # Day of week (Monday=0, Sunday=6)

0      2015
1      2017
2      2017
3      2017
4      2017
       ... 
995    2017
996    2017
997    2017
998    2017
999    2017
Name: dateAdded, Length: 1000, dtype: int64
0      12:13:08+00:00
1      01:23:39+00:00
2      01:23:39+00:00
3      01:25:56+00:00
4      01:25:56+00:00
            ...      
995    22:20:57+00:00
996    22:20:57+00:00
997    22:20:57+00:00
998    22:20:57+00:00
999    22:20:57+00:00
Name: dateAdded, Length: 1000, dtype: object
0      0
1      4
2      4
3      4
4      4
      ..
995    4
996    4
997    4
998    4
999    4
Name: dateAdded, Length: 1000, dtype: int64


In [None]:
# This can now be used to easily filter our data
# Let's say we want to see all data added in 2017 or later, and on a Sunday (dayofweek 6).
# (To include Saturdays as well, the condition would be dayofweek >= 5.)
df[(df["dateAdded"].dt.year >= 2017) & (df["dateAdded"].dt.dayofweek >= 6)]

Unnamed: 0,id,dateAdded,dateUpdated,asins,brand,categories,primaryCategories,colors,dimension,ean,imageURLs,keys,manufacturer,manufacturerNumber,name,prices.amountMax,prices.amountMin,prices.availability,prices.color,prices.condition,prices.currency,prices.dateAdded,prices.dateSeen,prices.isSale,prices.merchant,prices.offer,prices.returnPolicy,prices.shipping,prices.size,prices.sourceURLs,sizes,sourceURLs,upc,weight,dateAddedYr
30,AV-KM264uC1rwyj_gSfJ,2017-11-05 03:20:19+00:00,2018-04-21T12:44:41Z,,SKECHERS,"Womens,Womens Shoes,Clothing,Women's Shoes,All...",Shoes,"Taupe,White",,,https://media.kohlsimg.com/is/image/kohls/2938...,"190211832214,skechersrumblershotshotwomenswedg...",,38562blk,Skechers 38562BLK Women's RUMBLERS - HOTSHOT S...,49.99,44.99,,Taupe,,USD,,"2017-12-19T10:00:00Z,2017-12-18T12:00:00Z,2017...",False,,,,,8,https://www.kohls.com/product/prd-2938910/skec...,567891011,https://www.walmart.com/ip/Skechers-Cali-Women...,190212000000,,2017
31,AV-KM264uC1rwyj_gSfJ,2017-11-05 03:20:19+00:00,2018-04-21T12:44:41Z,,SKECHERS,"Womens,Womens Shoes,Clothing,Women's Shoes,All...",Shoes,"Taupe,White",,,https://media.kohlsimg.com/is/image/kohls/2938...,"190211832214,skechersrumblershotshotwomenswedg...",,38562blk,Skechers 38562BLK Women's RUMBLERS - HOTSHOT S...,49.99,49.99,,White,,USD,,"2018-01-16T12:00:00Z,2018-01-15T08:00:00Z,2018...",False,,,,,9,https://www.kohls.com/product/prd-2938910/skec...,567891011,https://www.walmart.com/ip/Skechers-Cali-Women...,190212000000,,2017
32,AV-KM264uC1rwyj_gSfJ,2017-11-05 03:20:19+00:00,2018-04-21T12:44:41Z,,SKECHERS,"Womens,Womens Shoes,Clothing,Women's Shoes,All...",Shoes,"Taupe,White",,,https://media.kohlsimg.com/is/image/kohls/2938...,"190211832214,skechersrumblershotshotwomenswedg...",,38562blk,Skechers 38562BLK Women's RUMBLERS - HOTSHOT S...,49.99,49.99,,White,,USD,,"2018-01-16T12:00:00Z,2018-01-15T08:00:00Z,2018...",False,,,,,7,https://www.kohls.com/product/prd-2938910/skec...,567891011,https://www.walmart.com/ip/Skechers-Cali-Women...,190212000000,,2017
177,AVpe8ltm1cnluZ0-adgq,2017-01-22 06:22:02+00:00,2018-01-04T11:53:00Z,,Brinley Co.,"Clothing,Shoes,Women's Shoes,All Women's Shoes",Shoes,"Black,Brown,Chestnut",,8.701910e+11,https://i5.walmartimages.com/asr/bb4bbc35-f809...,"870191304146,0870191304146,brinley/bethblk070,...",Brinley Co,BETH-BLK-070,Brinley Co. Womens Round Toe Buckle Detail Boots,48.99,32.88,In Stock,Black,New,USD,2018-01-04T11:53:00Z,2017-12-07T16:41:00.000Z,True,Walmart.com,,,Expedited,7,https://www.walmart.com/ip/Brinley-Co-Womens-R...,"8.5,10,7,6,6.5,7.5,9,8",https://www.walmart.com/ip/Brinley-Co-Womens-R...,870191000000,,2017
192,AVpfstKl1cnluZ0-poJ4,2017-01-22 06:22:04+00:00,2018-01-03T05:22:07Z,,Brinley Co.,"Clothing,Shoes,Women's Shoes,All Women's Shoes","Shoes,Shoes","Stone,Black,Brown,Grey",,8.701920e+11,https://i5.walmartimages.com/asr/b7b69cdd-5898...,brinleywomensfauxleatherstackedheelfringeankle...,Brinley Co.,AKARA-BLK-060,Brinley Co. Womens Faux Leather Stacked Heel F...,44.99,23.88,In Stock,Black,New,USD,2018-01-03T05:22:07Z,2017-12-08T14:43:00.000Z,True,Walmart.com,,,Expedited,6,https://www.walmart.com/ip/Brinley-Co-Womens-F...,"8.5,10,7,6,6.5,7.5,9,8,11",https://www.walmart.com/ip/Brinley-Co-Womens-F...,870192000000,,2017
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
987,AV-KM9V2uC1rwyj_gSl1,2017-11-05 03:20:16+00:00,2018-02-13T19:03:15Z,,SKECHERS,"Clothing,Shoes,Women's Shoes,All Women's Shoes...",Shoes,,,,https://i5.walmartimages.com/asr/753af2dc-04a3...,"190872215555,skechersflexappeal20boldmovewomen...",,,Women's Flex Appeal 2.0-Bold Move Black/Charco...,64.99,64.99,,Charcoal,,USD,2017-11-27T10:48:18Z,"2017-11-02T11:24:00.000Z,2017-10-31T23:52:00.0...",False,,,,,5.5,https://www.kohls.com/product/prd-2978127/skec...,"5,5.5,6,6.5,7,7.5,8,8.5,9,9.5,10,11",http://www.walmart.com/ip/Women-s-Flex-Appeal-...,"190872215555,190872215746,190872215630,1908722...",,2017
988,AV-KM9V2uC1rwyj_gSl1,2017-11-05 03:20:16+00:00,2018-02-13T19:03:15Z,,SKECHERS,"Clothing,Shoes,Women's Shoes,All Women's Shoes...",Shoes,,,,https://i5.walmartimages.com/asr/753af2dc-04a3...,"190872215555,skechersflexappeal20boldmovewomen...",,,Women's Flex Appeal 2.0-Bold Move Black/Charco...,64.99,64.99,,Black Charcoal,,USD,2017-11-27T10:48:18Z,"2017-11-02T11:24:00.000Z,2017-10-31T23:52:00.0...",False,,,,,9,https://www.kohls.com/product/prd-2978127/skec...,"5,5.5,6,6.5,7,7.5,8,8.5,9,9.5,10,11",http://www.walmart.com/ip/Women-s-Flex-Appeal-...,"190872215555,190872215746,190872215630,1908722...",,2017
989,AV-KM9V2uC1rwyj_gSl1,2017-11-05 03:20:16+00:00,2018-02-13T19:03:15Z,,SKECHERS,"Clothing,Shoes,Women's Shoes,All Women's Shoes...",Shoes,,,,https://i5.walmartimages.com/asr/753af2dc-04a3...,"190872215555,skechersflexappeal20boldmovewomen...",,,Women's Flex Appeal 2.0-Bold Move Black/Charco...,64.99,64.99,,Charcoal,,USD,2017-11-27T10:48:18Z,"2017-11-02T11:24:00.000Z,2017-10-31T23:52:00.0...",False,,,,,8,https://www.kohls.com/product/prd-2978127/skec...,"5,5.5,6,6.5,7,7.5,8,8.5,9,9.5,10,11",http://www.walmart.com/ip/Women-s-Flex-Appeal-...,"190872215555,190872215746,190872215630,1908722...",,2017
990,AV-KM9V2uC1rwyj_gSl1,2017-11-05 03:20:16+00:00,2018-02-13T19:03:15Z,,SKECHERS,"Clothing,Shoes,Women's Shoes,All Women's Shoes...",Shoes,,,,https://i5.walmartimages.com/asr/753af2dc-04a3...,"190872215555,skechersflexappeal20boldmovewomen...",,,Women's Flex Appeal 2.0-Bold Move Black/Charco...,64.99,64.99,,Black Charcoal,,USD,2017-11-27T10:48:18Z,"2017-11-02T11:24:00.000Z,2017-10-31T23:52:00.0...",False,,,,,8.5,https://www.kohls.com/product/prd-2978127/skec...,"5,5.5,6,6.5,7,7.5,8,8.5,9,9.5,10,11",http://www.walmart.com/ip/Women-s-Flex-Appeal-...,"190872215555,190872215746,190872215630,1908722...",,2017
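
To pick out both weekend days, one option is `.isin` on the day-of-week values. This is a minimal sketch on a made-up `toy` DataFrame; two of the timestamps are invented for illustration, as noted in the comments.

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    "dateAdded": pd.to_datetime([
        "2017-01-27T01:23:39Z",  # Friday
        "2017-11-04T09:00:00Z",  # Saturday (made-up timestamp)
        "2017-11-05T03:20:19Z",  # Sunday
        "2015-05-03T12:00:00Z",  # Sunday, but before 2017 (made-up timestamp)
    ])
})

is_recent = toy["dateAdded"].dt.year >= 2017
is_weekend = toy["dateAdded"].dt.dayofweek.isin([5, 6])  # Saturday=5, Sunday=6

weekend_rows = toy[is_recent & is_weekend]
print(weekend_rows)
```

Listing the days explicitly makes the intent clearer than a numeric comparison, and it extends naturally to other sets of days.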


If you will be working with date/time data, we recommend reading [this tutorial](https://pandas.pydata.org/docs/getting_started/intro_tutorials/09_timeseries.html), and [this reference guide](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html).