# Practice your skills loading data from a CSV file
In this exercise, you will load a CSV file and perform some operations on it to extract data. If you already know the Pandas library, you can use that. If not, you can use the `csv` library with the ready-to-use example this notebook provides.

In [21]:
from csv import DictReader
import pandas as pd
# Open the CSV file and read it into a list of dictionaries, ignoring Unicode decoding errors

with open('sample_data/wine-ratings-small.csv', encoding='utf-8', errors='ignore') as f:
    reader = DictReader(f)
    wines = list(reader)

# The wines variable is now a list of dictionaries, one for each row in the CSV file. This is the sample output of a single entry:
# {'': '1',
#  'name': 'Laurenz V Charming Gruner Veltliner 2014',
#  'grape': '',
#  'region': 'Kamptal, Austria',
#  'variety': 'White Wine',
#  'rating': '90.0',
#  'notes': ''}



Looping over a list of dictionaries can be tricky with plain Python. Specialized libraries like Pandas make this much easier, but the downside is that you need to learn a new library. The following code is a bit more verbose, but it's a good exercise for learning how to work with dictionaries in Python.


In [22]:
# This example creates a new list that only has wines from Napa Valley. The new list is called napa_wines:
napa_wines = []
for wine in wines:
    if 'Napa' in wine['region']:
        napa_wines.append(wine)

napa_wines[:5]

[{'': '24',
  'name': 'Lava Vine Winery Napa Valley Cabernet Sauvignon 2014',
  'grape': '',
  'region': 'Napa Valley, California',
  'variety': 'Red Wine',
  'rating': '91.0',
  'notes': 'A wonderful representation of how amazing the 2014 vintage could be and how to balance Napa Valley’s Intensity. A ripe cherry and cassis entry with dusted cocoa and a touch of graham cracker entice. The silky rich entry is so balanced there seems to be no separation from mid-palate through the lengthy finish with multitudes of fruit and spice to accompany. This Cabernet Sauvignon is fully integrated, super complex and will age beautifully.'},
 {'': '25',
  'name': 'Lava Vine Winery Napa Valley Reserve Cabernet Sauvignon 2012',
  'grape': '',
  'region': 'Napa Valley, California',
  'variety': 'Red Wine',
  'rating': '92.0',
  'notes': 'Black berries and hints of strawberry invite with muddled cherry cola spices. Deep earthy tones on a silk entry proceed to a rich, full mid-palate. Expressive tannins 

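**NOTE**: If you want to use ratings, remember that the `csv` library reads every value as a string, so you need to convert the ratings to numbers before making numerical comparisons. The ratings are stored as strings like `'90.0'`, so use `float()`; calling `int('90.0')` raises a `ValueError`.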
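
As a minimal sketch of that conversion, the snippet below filters a small hand-written list shaped like `wines`. The three sample rows, the `min_rating` threshold, and the `top_rated` name are made up for illustration and are not part of the dataset. Rows whose rating cannot be parsed are skipped.

```python
# Hypothetical sample rows shaped like the DictReader output above
sample_wines = [
    {'name': 'Wine A', 'region': 'Napa Valley, California', 'rating': '91.0'},
    {'name': 'Wine B', 'region': 'Kamptal, Austria', 'rating': '88.0'},
    {'name': 'Wine C', 'region': 'Austria', 'rating': ''},
]

min_rating = 90  # illustrative threshold, not from the notebook

top_rated = []
for wine in sample_wines:
    try:
        # DictReader yields strings, so convert before comparing
        rating = float(wine['rating'])
    except ValueError:
        continue  # skip rows with a missing or malformed rating
    if rating >= min_rating:
        top_rated.append(wine)

print([w['name'] for w in top_rated])
```

The same loop should work on the real `wines` list, assuming every row has a `rating` key.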

## Using Pandas
Alternatively, you can use the Pandas library to load the CSV file and then extract the data. You'll need to install the Pandas library first. You can do this with the following command:

```bash
pip install pandas
```

Then, you can use the following code to load the CSV file and extract the data:

```python
import pandas as pd

df = pd.read_csv('sample_data/wine-ratings-small.csv')
df.head()
```

In [68]:
import pandas as pd
df = pd.read_csv("sample_data/wine-ratings-small.csv", index_col=0) # read the CSV file and use its first column as the index
df.head() # show the first 5 rows of the DataFrame

Unnamed: 0,name,grape,region,variety,rating,notes
0,Laurenz V Charming Gruner Veltliner 2013,,"Kamptal, Austria",White Wine,90.0,Aromas of ripe apples and a typical Veltliner ...
1,Laurenz V Charming Gruner Veltliner 2014,,"Kamptal, Austria",White Wine,90.0,Aromas of ripe apples and a typical Veltliner ...
2,Laurenz V Singing Gruner Veltliner 2007,,Austria,White Wine,90.0,"A very attractive fruit bouquet yields apple, ..."
3,Laurenz V Singing Gruner Veltliner 2010,,Austria,White Wine,88.0,"A very attractive fruit bouquet yields apple, ..."
4,Laurenz V Singing Gruner Veltliner 2011,,Austria,White Wine,88.0,"A very attractive fruit bouquet yields apple, ..."


## Manipulate data with Pandas or as a dictionary
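At this point, you can keep working with Pandas if you know how to use it. Otherwise, you can convert the DataFrame to a list of dictionaries, one per row, and work with plain Python. Passing `'records'` gives you that row-by-row shape; calling `to_dict()` with no argument instead returns one nested dictionary per column.

```python
dict_data = df.to_dict('records')
```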
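
To see how the two orientations differ, here is a minimal sketch on a tiny made-up DataFrame. The `df_demo`, `by_column`, and `by_row` names and the two sample rows are assumptions for illustration only.

```python
import pandas as pd

# Tiny hypothetical DataFrame standing in for the wine data
df_demo = pd.DataFrame(
    {'name': ['Wine A', 'Wine B'], 'rating': [90.0, 88.0]}
)

# Default orientation: {column: {index: value}}
by_column = df_demo.to_dict()

# 'records' orientation: one {column: value} dictionary per row
by_row = df_demo.to_dict(orient='records')

print(by_column['name'][0])  # look up by column, then by row index
print(by_row[0]['name'])     # look up by row position, then by column
```

The `'records'` shape matches what `DictReader` produced earlier, so the same looping code can be reused on it.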

In [69]:
dict_data = df.to_dict('records')
# With 'records', you get a list of dictionaries, one for each row in the dataframe.
# Each dictionary maps column names to that row's values, so you index the list by
# row position and then look up a value by column name.

dict_data[0] # get the first row as a dictionary
# sample output:
# {'name': 'Laurenz V Charming Gruner Veltliner 2013',
#  'grape': nan,
#  'region': 'Kamptal, Austria',
#  'variety': 'White Wine',
#  'rating': 90.0,
#  'notes': 'Aromas of ripe apples and a typical Veltliner spiciness marry to create a fascinating fruit bouquet. On the palate, the wine is soft and juicy, supported by a fine fruit acidity. Very harmonious, allowing for perfectly smooth drinking. Simply charming!'}

{'name': 'Laurenz V Charming Gruner Veltliner 2013',
 'grape': nan,
 'region': 'Kamptal, Austria',
 'variety': 'White Wine',
 'rating': 90.0,
 'notes': 'Aromas of ripe apples and a typical Veltliner spiciness marry to create a fascinating fruit bouquet. On the palate, the wine is soft and juicy, supported by a fine fruit acidity. Very harmonious, allowing for perfectly smooth drinking. Simply charming!'}

### <b>Exercise</b>: retrieve data by variety of wines

In [65]:
red_wines = []

for item in dict_data:
    if 'Red' in str(item['variety']):
        red_wines.append(item)

red_wines[:5]

[{'name': 'Lava Cap American River Red',
  'grape': nan,
  'region': 'El Dorado, Sierra Foothills, California',
  'variety': 'Red Wine',
  'rating': 90.0,
  'notes': 'This wine was created as a table wine. We wanted the wine to be enjoyable and distinguishable as an El Dorado AVA wine, but varietal character takes a back seat to the the smooth and flavorful structure. During the blending process, we selected wines with supple tannins, smooth spice, and a decadent body that work together while never dominating with one varietal character. Great fruit, great body, a little toasty oak, and the smooth, rich Lava Cap finish! '},
 {'name': 'Lava Cap Barbera 2010',
  'grape': nan,
  'region': 'Sierra Foothills, California',
  'variety': 'Red Wine',
  'rating': 90.0,
  'notes': 'The plump, rich cherry, raspberry and plum fruit is immediately appealing when first poured, feeling soft and rich. The robust fruit coupled with acidity and soft tannin makes this one of the best, most versatile food 
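
If you prefer to stay in Pandas, the same kind of filter can be sketched with a boolean mask. This is one possible approach, not the notebook's own solution; `df_demo`, `red_mask`, and `red_df` are hypothetical names, and the sample rows are made up.

```python
import pandas as pd

# Tiny hypothetical DataFrame with one missing variety
df_demo = pd.DataFrame({
    'name': ['Wine A', 'Wine B', 'Wine C'],
    'variety': ['Red Wine', 'White Wine', None],
})

# na=False treats missing varieties as non-matches instead of propagating NaN
red_mask = df_demo['variety'].str.contains('Red', na=False)
red_df = df_demo[red_mask]

print(red_df['name'].tolist())
```

The `na=False` argument plays the same role as the `str(...)` call in the loop above: it keeps missing values from breaking the comparison.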

#### Serialize data to JSON file

In [67]:
import json

with open('sample_data/red_wines.json', 'w') as f:
    json.dump(red_wines, f)

#### Deserialize data from JSON file to dictionaries

In [75]:
records = pd.read_json('sample_data/red_wines.json')
dict_records = records.to_dict('records')
dict_records[:5]

[{'name': 'Lava Cap American River Red',
  'grape': nan,
  'region': 'El Dorado, Sierra Foothills, California',
  'variety': 'Red Wine',
  'rating': 90,
  'notes': 'This wine was created as a table wine. We wanted the wine to be enjoyable and distinguishable as an El Dorado AVA wine, but varietal character takes a back seat to the the smooth and flavorful structure. During the blending process, we selected wines with supple tannins, smooth spice, and a decadent body that work together while never dominating with one varietal character. Great fruit, great body, a little toasty oak, and the smooth, rich Lava Cap finish! '},
 {'name': 'Lava Cap Barbera 2010',
  'grape': nan,
  'region': 'Sierra Foothills, California',
  'variety': 'Red Wine',
  'rating': 90,
  'notes': 'The plump, rich cherry, raspberry and plum fruit is immediately appealing when first poured, feeling soft and rich. The robust fruit coupled with acidity and soft tannin makes this one of the best, most versatile food wine
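
As an alternative that avoids Pandas, the standard `json` module can read the file back directly. The sketch below is a self-contained round trip on a made-up record written to a temporary file; `sample_records`, `path`, and `loaded` are hypothetical names. Note that `NaN` is not part of the JSON standard, but Python's `json` module writes and reads it by default.

```python
import json
import math
import os
import tempfile

# Hypothetical record shaped like the red_wines entries; 'grape' is missing (NaN)
sample_records = [
    {'name': 'Wine A', 'grape': float('nan'), 'variety': 'Red Wine', 'rating': 90.0},
]

# Write to a temporary file so the sketch does not touch sample_data/
path = os.path.join(tempfile.mkdtemp(), 'red_wines_demo.json')
with open(path, 'w', encoding='utf-8') as f:
    json.dump(sample_records, f)

# json.load returns plain Python lists and dictionaries
with open(path, encoding='utf-8') as f:
    loaded = json.load(f)

print(loaded[0]['name'], loaded[0]['rating'])
print(math.isnan(loaded[0]['grape']))
```

Unlike the `pd.read_json` output above, where the rating came back as the integer `90`, `json.load` keeps `90.0` as a float because it does no type inference across rows.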