# SI370 Day 3: Loading and manipulating data in pandas
## Please download Day3.zip from Canvas -> Files and start working on the first Exercise


# Reminders


## Learning Objectives
* load CSV files
* load JSON files
* use pd.read_html to extract tables from web pages
* load data from simple APIs 
* load data from a SQL database
* handle missing data (dropna and fillna)
* use vectorized string functions
* Pandas refresher (or introduction)
* explain how pandas operations differ from "traditional" python
* be able to load a CSV file into a Pandas DataFrame
* explain how to extract columns from a DataFrame
* sort a DataFrame
* assign a column as the index of a DataFrame
* filter a DataFrame according to some criteria
* explain how boolean masks work in filtering DataFrames

### IMPORTANT: Replace ```?``` in the following code with your uniqname.

In [1]:
MY_UNIQNAME = '?'

## <font color="magenta">Exercise 1 (2 points):</font>
### a. Sign up for a Kaggle account (https://www.kaggle.com/).  Record your Kaggle username in the following markdown cell


Replace this with your Kaggle username

### b. Browse the Kaggle datasets (https://www.kaggle.com/datasets) and list two or three that you find interesting.  Explain why you find them interesting.

Insert your answer here.

# Today's focus: Loading (and manipulating) data using pandas

In [1]:
import pandas as pd

Recall the ```pd.read_csv``` function that we used to load data sets in previous classes:

In [2]:
menu = pd.read_csv('data/menu.csv') 

That works great for well-formatted CSV files, but what happens when you get something that looks like the ```data/avocado_eu.csv``` file?
Go ahead and browse that in JupyterLab's CSV browser.

You'll notice a new drop-down menu labelled "Delimiter".  Go ahead and change that to ```;```.

Referring back to your readings and the [read_csv documentation online](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html), complete the following exercise:


Read the data/avocado_eu.csv file into a pandas DataFrame and show the first 5 rows.


In [3]:
avocado = pd.read_csv('data/avocado_eu.csv', delimiter=';')
avocado.head(5)

Unnamed: 0.1,Unnamed: 0,Date,AveragePrice,Total Volume,4046,4225,4770,Total Bags,Small Bags,Large Bags,XLarge Bags,type,year,region
0,0,2015-12-27,133,6423662,103674,5445485,4816,869687,860362,9325,0,conventional,2015,Albany
1,1,2015-12-20,135,5487698,67428,4463881,5833,950556,940807,9749,0,conventional,2015,Albany
2,2,2015-12-13,93,11822022,7947,10914967,1305,814535,804221,10314,0,conventional,2015,Albany
3,3,2015-12-06,108,7899215,11320,7197641,7258,581116,56774,13376,0,conventional,2015,Albany
4,4,2015-11-29,128,510396,94148,4383839,7578,618395,598626,19769,0,conventional,2015,Albany


You'll notice that, unless you did something special in the previous read_csv invocation, the decimal points don't look quite right.  Go ahead and find the right option to convert commas to periods when loading a CSV file.

## Exercise 2 (1 point):
Read the data/avocado_eu.csv file using the correct delimiter and decimal character into a dataframe and show the first 5 rows:

In [5]:
avocado = pd.read_csv('data/avocado_eu.csv', delimiter=';', decimal=',')
avocado.head(5)

Unnamed: 0.1,Unnamed: 0,Date,AveragePrice,Total Volume,4046,4225,4770,Total Bags,Small Bags,Large Bags,XLarge Bags,type,year,region
0,0,2015-12-27,1.33,64236.62,1036.74,54454.85,48.16,8696.87,8603.62,93.25,0.0,conventional,2015,Albany
1,1,2015-12-20,1.35,54876.98,674.28,44638.81,58.33,9505.56,9408.07,97.49,0.0,conventional,2015,Albany
2,2,2015-12-13,0.93,118220.22,794.7,109149.67,130.5,8145.35,8042.21,103.14,0.0,conventional,2015,Albany
3,3,2015-12-06,1.08,78992.15,1132.0,71976.41,72.58,5811.16,5677.4,133.76,0.0,conventional,2015,Albany
4,4,2015-11-29,1.28,51039.6,941.48,43838.39,75.78,6183.95,5986.26,197.69,0.0,conventional,2015,Albany
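To see the effect of both options without the course data file, here is a minimal sketch on a made-up two-row sample in the same European style. The sample text and the names `raw`, `naive`, and `fixed` are illustrative assumptions, not part of the lab data.

```python
from io import StringIO

import pandas as pd

# Made-up sample in the same style as avocado_eu.csv:
# semicolon-separated fields, comma as the decimal mark.
raw = (
    "Date;AveragePrice;Total Volume;region\n"
    "2015-12-27;1,33;64236,62;Albany\n"
    "2015-12-20;1,35;54876,98;Albany\n"
)

# Only the delimiter is set, so "1,33" cannot be parsed as a number
# and the column is kept as text.
naive = pd.read_csv(StringIO(raw), delimiter=';')

# Setting decimal=',' as well lets pandas parse the prices as floats.
fixed = pd.read_csv(StringIO(raw), delimiter=';', decimal=',')

print(naive['AveragePrice'].dtype)
print(fixed['AveragePrice'].dtype)
```

Checking `dtypes` after loading is a quick way to spot a column that was read as text when you expected numbers.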


# Counting the number of values

Sometimes, you'll want to count the number of times each value occurs.  For example, we might want to know the number of times each 'type' is reported in our avocado data.  Use the ```value_counts()``` method on a Series to do so:

In [6]:
avocado['type'].value_counts()

conventional    9126
organic         9123
Name: type, dtype: int64
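If you would rather see proportions than raw counts, `value_counts()` also accepts `normalize=True`. The following is a small sketch on a made-up Series (the data and variable names are assumptions for illustration only):

```python
import pandas as pd

# Toy data standing in for the avocado 'type' column.
types = pd.Series(['conventional', 'organic', 'conventional', 'organic', 'conventional'])

counts = types.value_counts()                # absolute counts, most frequent first
shares = types.value_counts(normalize=True)  # the same counts as fractions that sum to 1

print(counts)
print(shares)
```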

# Loading JSON data

In addition to CSV files, data is commonly stored in JSON (JavaScript Object Notation) files.  The ```pd.read_json``` function loads such a file into a DataFrame:

In [6]:
nfl_football_players = pd.read_json('data/nfl_football_profiles.json')

In [7]:
nfl_football_players.head()

Unnamed: 0,birth_date,birth_place,college,current_salary,current_team,death_date,draft_position,draft_round,draft_team,draft_year,height,high_school,hof_induction_year,name,player_id,position,weight
0,1967-05-12,"Bay City, TX",Baylor,,,,34.0,2.0,Seattle Seahawks,1990.0,6-0,"Van Vleck, TX",,Robert Blackmon,1809,DB,208.0
1,1970-07-20,"Louisville, KY",Kentucky,,,,85.0,4.0,Seattle Seahawks,1993.0,6-3,"Holy Cross, KY",,Dean Wells,23586,LB,248.0
2,1990-08-14,"Newton, MA",Oregon,1075000.0,Miami Dolphins,,46.0,2.0,Buffalo Bills,2013.0,6-3,"Los Gatos, CA",,Kiko Alonso,355,ILB,238.0
3,1948-04-22,"Dallas, TX",North Texas,,,1999-10-15,126.0,5.0,New Orleans Saints,1970.0,6-2,"W.W. Samuell, TX",,Steve Ramsey,18182,QB,210.0
4,1988-02-27,"Neptune, NJ",Miami (FL),,,,,,,,6-0,"Neptune, NJ",,Cory Nelms,16250,CB,195.0
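To get a feel for how JSON maps onto rows and columns, here is a minimal sketch that reads a made-up JSON array from a string instead of a file. The two records and the name `players` are hypothetical; the real file's layout may differ.

```python
from io import StringIO

import pandas as pd

# Two made-up records; a JSON array of objects becomes one row per object.
raw_json = (
    '[{"name": "Player A", "height": "6-0", "current_salary": null},'
    ' {"name": "Player B", "height": "6-3", "current_salary": "1,075,000"}]'
)

players = pd.read_json(StringIO(raw_json))

# JSON null becomes a missing value; the salary stays a string with commas.
print(players)
```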


And, just for fun, show the player with the highest Current Salary from that dataset:

In [8]:
nfl_football_players.sort_values('current_salary', ascending=False).head(1)

Unnamed: 0,birth_date,birth_place,college,current_salary,current_team,death_date,draft_position,draft_round,draft_team,draft_year,height,high_school,hof_induction_year,name,player_id,position,weight
6454,1993-01-17,"Ibadan, Nigeria",Georgia Tech,993150,Los Angeles Chargers,,50.0,2.0,San Diego Chargers,2014.0,6-3,"Archbishop Carroll, DE",,Jeremiah Attaochu,721,OLB,252.0


# Fixing up the data
Assuming you did something like sort_values on one of the original columns, you probably got the wrong result.

Looking a bit more closely at the results, you'll notice that the values in the current_salary column are strings that contain commas, so they are sorted as text rather than as numbers.  Remembering that we have made the shift from pythonic to pandorable, we can leverage the impressive-sounding "vectorized string functions" mentioned in Section XXX of the McKinney book.  Specifically, we can use the str.replace(...) method.  Note that had we used read_csv to load the file, we could have used the ```thousands=``` option and avoided all this, but sometimes data doesn't come in a convenient format.
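For comparison, this is roughly what the ```thousands=``` route looks like when the data does arrive as CSV. The CSV text below is made up for illustration and is not one of the lab files.

```python
from io import StringIO

import pandas as pd

# Made-up CSV; the salaries are quoted because they contain commas.
raw = (
    'name,current_salary\n'
    'Player A,"1,075,000"\n'
    'Player B,"23,943,600"\n'
)

# thousands=',' tells the parser to treat commas inside numbers as separators.
salaries = pd.read_csv(StringIO(raw), thousands=',')

print(salaries.dtypes)
```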

One way to apply functions is to operate on a column and then assign the results to another column.  For example, if we wanted to eliminate commas, we could replace them with empty strings:


In [9]:
nfl_football_players['current_salary'].str.replace(',', '')

0           None
1           None
2        1075000
3           None
4           None
5           None
6        1762000
7           None
8           None
9           None
10          None
11          None
12          None
13        774294
14          None
15          None
16          None
17          None
18          None
19          None
20          None
21          None
22          None
23          None
24          None
25          None
26          None
27          None
28          None
29          None
          ...   
25013       None
25014       None
25015       None
25016       None
25017       None
25018       None
25019       None
25020       None
25021       None
25022       None
25023       None
25024       None
25025       None
25026       None
25027       None
25028       None
25029       None
25030       None
25031       None
25032       None
25033       None
25034       None
25035       None
25036       None
25037       None
25038       None
25039       None
25040       None

We can then assign the results to a column in the original dataframe (in this case I'm calling the column current_salary_nocommas):

In [10]:
nfl_football_players['current_salary_nocommas'] = nfl_football_players['current_salary'].str.replace(',', '')

But you'll notice that the new column still holds strings, and we want floats so that we can sort numerically.  We can use the astype() method to convert it:

In [11]:
nfl_football_players['current_salary_cleaned'] = nfl_football_players['current_salary_nocommas'].astype(float)

In [13]:
nfl_football_players.head(2)

Unnamed: 0,birth_date,birth_place,college,current_salary,current_team,death_date,draft_position,draft_round,draft_team,draft_year,height,high_school,hof_induction_year,name,player_id,position,weight,current_salary_nocommas,current_salary_cleaned
0,1967-05-12,"Bay City, TX",Baylor,,,,34.0,2.0,Seattle Seahawks,1990.0,6-0,"Van Vleck, TX",,Robert Blackmon,1809,DB,208.0,,
1,1970-07-20,"Louisville, KY",Kentucky,,,,85.0,4.0,Seattle Seahawks,1993.0,6-3,"Holy Cross, KY",,Dean Wells,23586,LB,248.0,,


And now we can re-run our command to sort by salary and get the correct result:

In [14]:
nfl_football_players.sort_values('current_salary_cleaned', ascending=False).head(1)

Unnamed: 0,birth_date,birth_place,college,current_salary,current_team,death_date,draft_position,draft_round,draft_team,draft_year,height,high_school,hof_induction_year,name,player_id,position,weight,current_salary_nocommas,current_salary_cleaned
17756,1988-08-19,"Holland, MI",Michigan St.,23943600,Washington Redskins,,102.0,4.0,Washington Redskins,2012.0,6-3,"Holland Christian, MI",,Kirk Cousins,4644,QB,214.0,23943600,23943600.0
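To see why the first sort went wrong, the sketch below repeats the whole cleanup on four made-up players. It uses `pd.to_numeric(..., errors='coerce')` in place of `astype(float)`; that is a swapped-in alternative, not the approach used above, and the player names are hypothetical.

```python
import pandas as pd

# Made-up players; salaries are strings with commas, as in the JSON file.
toy = pd.DataFrame({
    'name': ['Player A', 'Player B', 'Player C', 'Player D'],
    'current_salary': ['993,150', '23,943,600', None, '1,075,000'],
})

# Sorting the raw strings compares character by character, so '9...' wins.
top_by_string = toy.sort_values('current_salary', ascending=False).iloc[0]['name']

# Strip the commas, then convert; errors='coerce' turns anything unparseable into NaN.
no_commas = toy['current_salary'].str.replace(',', '', regex=False)
toy['current_salary_cleaned'] = pd.to_numeric(no_commas, errors='coerce')

top_by_number = toy.sort_values('current_salary_cleaned', ascending=False).iloc[0]['name']

print(top_by_string)   # Player A
print(top_by_number)   # Player B
```

The string sort puts '993,150' first because '9' sorts after '2' and '1', which is the same effect that produced the misleading result earlier.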


# Dropping missing values

In addition to the "all" or "any" functionality described in McKinney section 7.1, it's sometimes useful to drop a row only if a certain column or columns have missing data.  To do this, use the subset= option with dropna().  So, for example, to drop all players for whom we do not have salary information, we could use the following code:

In [15]:
nfl_football_players_salaries = nfl_football_players.dropna(subset=['current_salary_cleaned'])

In [16]:
nfl_football_players_salaries.head()

Unnamed: 0,birth_date,birth_place,college,current_salary,current_team,death_date,draft_position,draft_round,draft_team,draft_year,height,high_school,hof_induction_year,name,player_id,position,weight,current_salary_nocommas,current_salary_cleaned
2,1990-08-14,"Newton, MA",Oregon,1075000,Miami Dolphins,,46.0,2.0,Buffalo Bills,2013.0,6-3,"Los Gatos, CA",,Kiko Alonso,355,ILB,238.0,1075000,1075000.0
6,1992-10-27,"Cincinnati, OH",Louisville,1762000,Buffalo Bills,,73.0,3.0,Buffalo Bills,2014.0,6-1,"Northwest, OH",,Preston Brown,2701,ILB,251.0,1762000,1762000.0
13,1993-06-14,"Cleveland, OH",Michigan,774294,Seattle Seahawks,,63.0,2.0,Seattle Seahawks,2015.0,6-2,"Glenville, OH",,Frank Clark,3966,DE,270.0,774294,774294.0
37,1987-03-16,"Bellville, TX",SMU,6750000,Denver Broncos,,82.0,3.0,Pittsburgh Steelers,2010.0,5-11,"Bellville, TX",,Emmanuel Sanders,19449,WR,186.0,6750000,6750000.0
53,1988-10-27,"Lakeland, FL",Louisville,3750000,New York Jets,,126.0,4.0,New York Jets,2011.0,5-11,"Lake Gibson, FL",,Bilal Powell,17858,RB,204.0,3750000,3750000.0
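The difference between a plain `dropna()`, `dropna(subset=...)`, and `fillna()` is easiest to see on a tiny table. The following sketch uses made-up rows, and the replacement values passed to `fillna` are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Made-up data with gaps in two different columns.
toy = pd.DataFrame({
    'name': ['Player A', 'Player B', 'Player C'],
    'current_team': ['Team X', None, None],
    'current_salary_cleaned': [1075000.0, np.nan, 774294.0],
})

any_missing = toy.dropna()                                   # drops B and C
salary_only = toy.dropna(subset=['current_salary_cleaned'])  # drops only B

# fillna with a dict fills each column with its own replacement value.
filled = toy.fillna({'current_team': 'no team', 'current_salary_cleaned': 0})

print(len(any_missing), len(salary_only), len(filled))
```

None of these calls modifies `toy`; each returns a new DataFrame.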


# Creating dummy variables

We might, on occasion, want to "bin" or "discretize" a variable.  For example, we might want to take the previous dataframe and add dummy variables that map onto whether the salaries are "small" (< \$1M) , "medium" (\$1M - \$10M), or "large" (> \$10M).  We could do something like the following:

In [17]:
bins = [0,1000000,10000000,1000000000]

In [18]:
dummies = pd.get_dummies(pd.cut(nfl_football_players_salaries['current_salary_cleaned'],bins,labels=['small','medium','large']))

In [19]:
dummies.head()

Unnamed: 0,small,medium,large
2,0,1,0
6,0,1,0
13,1,0,0
37,0,1,0
53,0,1,0


In [21]:
nfl_cats = pd.concat([nfl_football_players_salaries,dummies],axis=1)

In [22]:
nfl_cats.tail()

Unnamed: 0,birth_date,birth_place,college,current_salary,current_team,death_date,draft_position,draft_round,draft_team,draft_year,...,hof_induction_year,name,player_id,position,weight,current_salary_nocommas,current_salary_cleaned,small,medium,large
24885,1994-02-12,"San Antonio, TX",Memphis,880741,Denver Broncos,,26.0,1.0,Denver Broncos,2016.0,...,,Paxton Lynch,13753,QB,244.0,880741,880741.0,1,0,0
24917,1991-05-13,"Fairfield, CA",TCU,860000,Los Angeles Chargers,,25.0,1.0,San Diego Chargers,2014.0,...,,Jason Verrett,22916,CB,189.0,860000,860000.0,1,0,0
24923,1993-01-21,"Sacramento, CA",Stanford,772413,New England Patriots,,64.0,2.0,New England Patriots,2015.0,...,,Jordan Richards,18586,SS,211.0,772413,772413.0,1,0,0
24967,1989-11-27,"St. Paul, MN",Notre Dame,887058,Minnesota Vikings,,13.0,1.0,Arizona Cardinals,2012.0,...,,Michael Floyd,7063,WR,220.0,887058,887058.0,1,0,0
24984,1988-03-22,"Longview, TX",Washington St.,4500000,Jacksonville Jaguars,,,,,,...,,Chris Ivory,10701,RB,222.0,4500000,4500000.0,0,1,0
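Two details of the binning step are worth checking on a small example: which bin a value on an edge falls into, and what type the dummy columns have. The sketch below uses made-up salaries with the same bins and labels as above.

```python
import pandas as pd

# Made-up salaries, including one that sits exactly on a bin edge.
salaries = pd.Series([774294.0, 1000000.0, 4500000.0, 23943600.0])

bins = [0, 1000000, 10000000, 1000000000]
labels = ['small', 'medium', 'large']

# By default pd.cut uses right-closed intervals, e.g. (0, 1000000],
# so a salary of exactly 1,000,000 is labelled 'small'.
sizes = pd.cut(salaries, bins, labels=labels)

# Recent pandas versions return True/False columns by default;
# dtype=int asks for 0/1 columns instead.
dummies = pd.get_dummies(sizes, dtype=int)

print(sizes.tolist())
print(dummies)
```

If you want the left edge to be included instead, `pd.cut` accepts `right=False`.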


# Scraping Tables from HTML

The ```pd.read_html``` function returns a list of DataFrames read from an HTML source.  The following line will return a list of DataFrames from https://en.wikipedia.org/wiki/List_of_largest_sports_contracts

In [23]:
contracts_scraped = pd.read_html('https://en.wikipedia.org/wiki/List_of_largest_sports_contracts',header=0)

In [24]:
len(contracts_scraped)

1

To get the first table, you'll need to pull off the 0th element:

In [25]:
contracts = contracts_scraped[0]
contracts.head()

Unnamed: 0,Rank,Player,Team,Sport,Length of contract,Contract value (USD),Average per year (USD),Average per game4 (USD),Ref[1]
0,01,Giancarlo Stanton,Miami Marlins*,Baseball,13 years (2014–2027),"$325,000,000","$25,000,000","$154,320.99",[1]
1,02,Alex Rodriguez1R,New York Yankees*,Baseball,10 years (2008–2017),"$275,000,000","$27,500,000","$169,753.09",[2]
2,03,Alex Rodriguez2R,Texas Rangers*,Baseball,10 years (2001–2010),"$252,000,000","$25,200,000","$155,555.56",[3]
3,04,Miguel Cabrera,Detroit Tigers,Baseball,8 years (2016–2023),"$247,000,000","$31,000,000","$191,358.02",[4]
4,05 (tie),Robinson Cano,Seattle Mariners,Baseball,10 years (2014–2023),"$240,000,000","$24,000,000","$148,148.15",[5]
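Scraped tables usually arrive as text, so the dollar amounts above are strings rather than numbers. As a sketch of how the vectorized string functions apply here, the snippet below cleans three values copied from the table; the variable names are illustrative, and the live Wikipedia table may have changed since this output was produced.

```python
import pandas as pd

# A few values copied from the 'Contract value (USD)' column above.
values = pd.Series(['$325,000,000', '$275,000,000', '$252,000,000'])

# The character class [$,] matches either a dollar sign or a comma.
# regex=True is stated explicitly because the default has changed across pandas versions.
cleaned = values.str.replace(r'[$,]', '', regex=True).astype(float)

print(cleaned.max())
```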


## Exercise 3 (2 points): 

Count the number of players from each sport in the List of Largest Sports Contracts 

Hint:  see value_counts() description above

In [31]:
sports = contracts['Sport']
sports.value_counts()

Baseball                55
Basketball              26
American football       15
Auto racing              2
Association football     1
Hockey                   1
Name: Sport, dtype: int64

In [32]:
contracts['Sport'].value_counts()

Baseball                55
Basketball              26
American football       15
Auto racing              2
Association football     1
Hockey                   1
Name: Sport, dtype: int64

For the final exercise, we're going to return to the nfl_football_players dataframe we created earlier.  The following question looks deceptively easy.
We'd like you to tackle this one in your groups.  First, use the whiteboard to come up with a plan for the exercise.
List the steps you'll follow, and then highlight the steps that you think you'll need to look up extra documentation for.  Members of the first group to complete this exercise **and help another group complete it by explaining how to do it** will receive **two bonus marks**.

## Exercise 4 (5 points): 
Create a new dataframe that contains all the columns in the nfl_football_players dataframe as well as an additional column that contains each player's height in centimeters. Show the first 5 rows of your result.

hint: 1 inch = 2.54 cm

hint: you can use the vectorized string function str.split() to separate feet and inches from the original dataframe column

hint: remember to cast strings to numeric types if you're going to perform math on them

hint: you might want to create an intermediate (temporary) DataFrame to help you keep things clear instead of attempting to do this in one line

In [33]:
nfl_football_players.columns

Index(['birth_date', 'birth_place', 'college', 'current_salary',
       'current_team', 'death_date', 'draft_position', 'draft_round',
       'draft_team', 'draft_year', 'height', 'high_school',
       'hof_induction_year', 'name', 'player_id', 'position', 'weight',
       'current_salary_nocommas', 'current_salary_cleaned'],
      dtype='object')

In [35]:
nfl_football_players.height.head(5)

0    6-0
1    6-3
2    6-3
3    6-2
4    6-0
Name: height, dtype: object

In [40]:
heights = nfl_football_players.height.str.split('-')

In [41]:
heights.head()

0    [6, 0]
1    [6, 3]
2    [6, 3]
3    [6, 2]
4    [6, 0]
Name: height, dtype: object

In [56]:
heights = nfl_football_players.height.str.split('-', expand=True)

In [57]:
heights.head()

Unnamed: 0,0,1
0,6,0
1,6,3
2,6,3
3,6,2
4,6,0


In [54]:
heights = nfl_football_players.height.str.split('-', expand = True)
nfl_football_players['cm'] = (heights[0].astype(float)*12+heights[1].astype(float))*2.54

In [58]:
nfl_football_players.head(1)

Unnamed: 0,birth_date,birth_place,college,current_salary,current_team,death_date,draft_position,draft_round,draft_team,draft_year,height,high_school,hof_induction_year,name,player_id,position,weight,current_salary_nocommas,current_salary_cleaned,cm
0,1967-05-12,"Bay City, TX",Baylor,,,,34.0,2.0,Seattle Seahawks,1990.0,6-0,"Van Vleck, TX",,Robert Blackmon,1809,DB,208.0,,,182.88
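As a check on the arithmetic, a height of 6-0 is 6 × 12 + 0 = 72 inches, and 72 × 2.54 = 182.88 cm, which matches the row above. The sketch below repeats the conversion on a made-up Series and uses `pd.to_numeric` with `errors='coerce'` so that missing or malformed heights become NaN; that is one possible variation, not the required solution.

```python
import pandas as pd

# Made-up heights in the same "feet-inches" format, with one missing value.
height = pd.Series(['6-0', '6-3', None, '5-11'])

parts = height.str.split('-', expand=True)         # column 0 = feet, column 1 = inches
feet = pd.to_numeric(parts[0], errors='coerce')    # missing entries become NaN
inches = pd.to_numeric(parts[1], errors='coerce')

cm = (feet * 12 + inches) * 2.54

print(cm.round(2).tolist())
```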


# APIs and requests (FYI only)
You've covered the ```requests``` package in previous courses.  This example shows what you can do with an API that returns JSON:

In [22]:
import requests

In [23]:
url = 'https://api.github.com/repos/pandas-dev/pandas/issues'

In [24]:
resp = requests.get(url)
resp

<Response [200]>

In [25]:
data = resp.json()

In [26]:
data[0]['title']

'BUG: SparseDataFrame coerces input to dense matrix if string-type index is given'

In [27]:
issues = pd.DataFrame(data)
issues.head()

Unnamed: 0,assignee,assignees,author_association,body,closed_at,comments,comments_url,created_at,events_url,html_url,...,milestone,node_id,number,pull_request,repository_url,state,title,updated_at,url,user
0,,[],NONE,"#### Code Sample, a copy-pastable example if p...",,0,https://api.github.com/repos/pandas-dev/pandas...,2018-09-07T19:15:41Z,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/22630,...,,MDU6SXNzdWUzNTgxODEwODU=,22630,,https://api.github.com/repos/pandas-dev/pandas,open,BUG: SparseDataFrame coerces input to dense ma...,2018-09-07T19:16:54Z,https://api.github.com/repos/pandas-dev/pandas...,"{'login': 'scottgigante', 'id': 8499679, 'node..."
1,,[],NONE,#### Code Sample\r\n\r\n```python\r\nimport pa...,,1,https://api.github.com/repos/pandas-dev/pandas...,2018-09-07T17:34:04Z,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/22629,...,,MDU6SXNzdWUzNTgxNTE3Mjc=,22629,,https://api.github.com/repos/pandas-dev/pandas,open,read_excel ignores `sheet_name` parameter in P...,2018-09-07T19:31:00Z,https://api.github.com/repos/pandas-dev/pandas...,"{'login': 'alexHolistX', 'id': 43070785, 'node..."
2,,[],NONE,- [X] tests added / passed\r\n- [X] passes `gi...,,5,https://api.github.com/repos/pandas-dev/pandas...,2018-09-07T15:07:08Z,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/pull/22628,...,,MDExOlB1bGxSZXF1ZXN0MjEzOTU3NDg3,22628,{'url': 'https://api.github.com/repos/pandas-d...,https://api.github.com/repos/pandas-dev/pandas,open,BUG: Some sas7bdat files with many columns are...,2018-09-07T19:51:57Z,https://api.github.com/repos/pandas-dev/pandas...,"{'login': 'troels', 'id': 3203, 'node_id': 'MD..."
3,,[],NONE,"#### Code Sample, a copy-pastable example if p...",,1,https://api.github.com/repos/pandas-dev/pandas...,2018-09-07T13:10:56Z,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/22627,...,,MDU6SXNzdWUzNTgwNjEyMzI=,22627,,https://api.github.com/repos/pandas-dev/pandas,open,Series.reorder_levels docstring includes extra...,2018-09-07T15:55:01Z,https://api.github.com/repos/pandas-dev/pandas...,"{'login': 'tschm', 'id': 2046079, 'node_id': '..."
4,,[],CONTRIBUTOR,Off the back of discussion in this [PR](https:...,,0,https://api.github.com/repos/pandas-dev/pandas...,2018-09-06T23:27:53Z,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/22624,...,,MDU6SXNzdWUzNTc4NjUyODc=,22624,,https://api.github.com/repos/pandas-dev/pandas,open,Refactor test_sql.py,2018-09-06T23:38:20Z,https://api.github.com/repos/pandas-dev/pandas...,"{'login': 'alimcmaster1', 'id': 16733618, 'nod..."


In [28]:
issues.columns

Index(['assignee', 'assignees', 'author_association', 'body', 'closed_at',
       'comments', 'comments_url', 'created_at', 'events_url', 'html_url',
       'id', 'labels', 'labels_url', 'locked', 'milestone', 'node_id',
       'number', 'pull_request', 'repository_url', 'state', 'title',
       'updated_at', 'url', 'user'],
      dtype='object')

In [29]:
issues = pd.DataFrame(data, columns=['number', 'title','labels', 'state'])
issues.head()

Unnamed: 0,number,title,labels,state
0,22630,BUG: SparseDataFrame coerces input to dense ma...,[],open
1,22629,read_excel ignores `sheet_name` parameter in P...,[],open
2,22628,BUG: Some sas7bdat files with many columns are...,"[{'id': 76811, 'node_id': 'MDU6TGFiZWw3NjgxMQ=...",open
3,22627,Series.reorder_levels docstring includes extra...,"[{'id': 134699, 'node_id': 'MDU6TGFiZWwxMzQ2OT...",open
4,22624,Refactor test_sql.py,"[{'id': 211029535, 'node_id': 'MDU6TGFiZWwyMTE...",open
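Columns such as `user` hold nested dictionaries, which are awkward to filter or sort on. One option is `pd.json_normalize`, which flattens nested objects into separate columns. The records below are made up to mimic the shape of the GitHub response, so the sketch runs without a network call.

```python
import pandas as pd

# Made-up records shaped like the GitHub response: 'user' is a nested object.
data = [
    {'number': 1, 'title': 'First issue', 'state': 'open',
     'user': {'login': 'alice', 'id': 101}},
    {'number': 2, 'title': 'Second issue', 'state': 'open',
     'user': {'login': 'bob', 'id': 102}},
]

# json_normalize expands nested dicts into dotted column names.
flat = pd.json_normalize(data)

print(flat[['number', 'title', 'user.login']])
```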


# Accessing databases (FYI only)
This section is preparation for later in the course; we won't be covering this in today's lab.
You'll need to download the database on your own (see link in the following paragraph).

Note that some datasets are available only in SQLite format.  It's useful to know how to load those databases into pandas.
The following dataset is a collection of about 500,000 fine food reviews from Amazon (https://www.kaggle.com/snap/amazon-fine-food-reviews/home).  In the past, you may have used the sqlite3 library. That's fine, but it's often easier to use SQLAlchemy, which plays nicely with pandas.  SQLAlchemy also makes it easy to switch the database "backend" so you can also access PostgreSQL, MySQL, or other databases.

In [34]:
import sqlalchemy as sqla

In [35]:
db = sqla.create_engine('sqlite:///data/fine_food_reviews.sqlite')

In [36]:
df = pd.read_sql('select score, count(*) from Reviews group by score', db)

In [37]:
df.head()

Unnamed: 0,Score,count(*)
0,1,52268
1,2,29769
2,3,42640
3,4,80655
4,5,363122


In [38]:
ff_df = pd.read_sql('select * from Reviews',db)

In [39]:
ff_df.head(5)

Unnamed: 0,Id,ProductId,UserId,ProfileName,HelpfulnessNumerator,HelpfulnessDenominator,Score,Time,Summary,Text
0,1,B001E4KFG0,A3SGXH7AUHU8GW,delmartian,1,1,5,1303862400,Good Quality Dog Food,I have bought several of the Vitality canned d...
1,2,B00813GRG4,A1D87F6ZCVE5NK,dll pa,0,0,1,1346976000,Not as Advertised,Product arrived labeled as Jumbo Salted Peanut...
2,3,B000LQOCH0,ABXLMWJIXXAIN,"Natalia Corres ""Natalia Corres""",1,1,4,1219017600,"""Delight"" says it all",This is a confection that has been around a fe...
3,4,B000UA0QIQ,A395BORC6FGVXV,Karl,3,3,2,1307923200,Cough Medicine,If you are looking for the secret ingredient i...
4,5,B006K2ZZ7K,A1UQRSCLF8GW1T,"Michael D. Bigham ""M. Wassir""",0,0,5,1350777600,Great taffy,Great taffy at a great price. There was a wid...
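If you want to try `pd.read_sql` before downloading the full database, the sketch below builds a tiny in-memory SQLite database with Python's built-in `sqlite3` module instead of SQLAlchemy. The table contents are made up and only loosely mirror the Reviews table above.

```python
import sqlite3

import pandas as pd

# In-memory database with a made-up Reviews table (not the Kaggle data).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE Reviews (Id INTEGER PRIMARY KEY, Score INTEGER, Summary TEXT)')
conn.executemany(
    'INSERT INTO Reviews (Score, Summary) VALUES (?, ?)',
    [(5, 'Great taffy'), (1, 'Not as Advertised'), (5, 'Good Quality Dog Food')],
)
conn.commit()

# pandas accepts a sqlite3 connection directly, as well as a SQLAlchemy engine.
score_counts = pd.read_sql(
    'SELECT Score, COUNT(*) AS n FROM Reviews GROUP BY Score ORDER BY Score',
    conn,
)
conn.close()

print(score_counts)
```

Giving the count an alias (`AS n`) produces a column name that is easier to work with than `count(*)`.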
