**This notebook is an exercise in the [Pandas](https://www.kaggle.com/learn/pandas) course. You can reference the tutorial at [this link](https://www.kaggle.com/residentmario/creating-reading-and-writing).**

---


# Introduction

The first step in most data analytics projects is reading the data file. In this exercise, you'll create Series and DataFrame objects, both by hand and by reading data files.

Run the code cell below to load libraries you will need (including code to check your answers).

In [39]:
import pandas as pd
pd.set_option('display.max_rows', 5)
from learntools.core import binder; binder.bind(globals())
from learntools.pandas.creating_reading_and_writing import *
print("Setup complete.")

Setup complete.


# Exercises

## 1.

In the cell below, create a DataFrame `fruits` that looks like this:

![](https://storage.googleapis.com/kaggle-media/learn/images/Ax3pp2A.png)

In [40]:
import pandas as pd
fruits = pd.DataFrame([[30, 21]], columns=['Apples', 'Bananas'])
q1.check()

<span style="color:#33cc33">Correct</span>

In [41]:
q1.hint()
q1.solution()

<span style="color:#3366cc">Hint:</span> Use the `pd.DataFrame` constructor to create the DataFrame.

<span style="color:#33cc99">Solution:</span> 
```python
fruits = pd.DataFrame([[30, 21]], columns=['Apples', 'Bananas'])
```
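As a side note, the same one-row table can also be built from a dictionary that maps column names to lists of values. The sketch below assumes only pandas itself; the name `fruits_from_dict` is introduced here purely for illustration.

```python
import pandas as pd

# Row-oriented form: one inner list per row, column names passed separately.
fruits = pd.DataFrame([[30, 21]], columns=['Apples', 'Bananas'])

# Column-oriented form: each key is a column name, each value is that column's data.
# `fruits_from_dict` is a hypothetical name used only for this comparison.
fruits_from_dict = pd.DataFrame({'Apples': [30], 'Bananas': [21]})

# Compare the two constructions element by element.
print(fruits.equals(fruits_from_dict))
```

The row-oriented form tends to read better when you think in records, while the dictionary form is convenient when the data already lives in per-column lists.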

## 2.

Create a DataFrame `fruit_sales` that matches the diagram below:

![](https://storage.googleapis.com/kaggle-media/learn/images/CHPn7ZF.png)

In [42]:
fruit_sales = pd.DataFrame([[35, 21], [41, 34]], columns=['Apples', 'Bananas'],
                index=['2017 Sales', '2018 Sales'])
q2.check()

<span style="color:#33cc33">Correct</span>

In [43]:
q2.hint()
q2.solution()

<span style="color:#3366cc">Hint:</span> Set the row labels in the DataFrame by using the `index` parameter in `pd.DataFrame`.

<span style="color:#33cc99">Solution:</span> 
```python
fruit_sales = pd.DataFrame([[35, 21], [41, 34]], columns=['Apples', 'Bananas'],
                index=['2017 Sales', '2018 Sales'])
```
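One way to confirm that the row and column labels match the diagram is to inspect the DataFrame's `shape`, `index`, and `columns` attributes. This is a minimal sketch that rebuilds the same DataFrame so it can run on its own.

```python
import pandas as pd

fruit_sales = pd.DataFrame([[35, 21], [41, 34]], columns=['Apples', 'Bananas'],
                           index=['2017 Sales', '2018 Sales'])

print(fruit_sales.shape)             # (number of rows, number of columns)
print(fruit_sales.index.tolist())    # row labels set through `index`
print(fruit_sales.columns.tolist())  # column labels set through `columns`
```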

## 3.

Create a variable `ingredients` with a Series that looks like:

```
Flour     4 cups
Milk       1 cup
Eggs     2 large
Spam       1 can
Name: Dinner, dtype: object
```

In [44]:
quantities = ['4 cups', '1 cup', '2 large', '1 can']
items = ['Flour', 'Milk', 'Eggs', 'Spam']
ingredients = pd.Series(quantities, index=items, name='Dinner')
q3.check()

<span style="color:#33cc33">Correct</span>

In [45]:
q3.hint()
q3.solution()

<span style="color:#3366cc">Hint:</span> Note that the Series must be named `"Dinner"`. Use the `name` keyword-arg when creating your series.

<span style="color:#33cc99">Solution:</span> 
```python
quantities = ['4 cups', '1 cup', '2 large', '1 can']
items = ['Flour', 'Milk', 'Eggs', 'Spam']
ingredients = pd.Series(quantities, index=items, name='Dinner')
```
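A Series can also be created from a dictionary, in which case the keys become the index labels. The sketch below is one possible alternative, not the course's reference answer; the name `ingredients_from_dict` is hypothetical.

```python
import pandas as pd

# Keys become the index labels and values become the data;
# the dictionary's insertion order is preserved.
# `ingredients_from_dict` is a hypothetical name used only for illustration.
ingredients_from_dict = pd.Series(
    {'Flour': '4 cups', 'Milk': '1 cup', 'Eggs': '2 large', 'Spam': '1 can'},
    name='Dinner',
)

print(ingredients_from_dict)
```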

## 4.

Read the following CSV dataset of wine reviews into a DataFrame called `reviews`:

![](https://storage.googleapis.com/kaggle-media/learn/images/74RCZtU.png)

The file path to the CSV file is `../input/wine-reviews/winemag-data_first150k.csv`. The first few lines look like:

```
,country,description,designation,points,price,province,region_1,region_2,variety,winery
0,US,"This tremendous 100% varietal wine[...]",Martha's Vineyard,96,235.0,California,Napa Valley,Napa,Cabernet Sauvignon,Heitz
1,Spain,"Ripe aromas of fig, blackberry and[...]",Carodorum Selección Especial Reserva,96,110.0,Northern Spain,Toro,,Tinta de Toro,Bodega Carmen Rodríguez
```

In [46]:
file_path = '../input/wine-reviews/winemag-data_first150k.csv'
reviews = pd.read_csv(file_path, index_col='Unnamed: 0')  # Use the "Unnamed: 0" column as the index
q4.check()


<span style="color:#33cc33">Correct</span>

In [47]:
q4.hint()
q4.solution()

<span style="color:#3366cc">Hint:</span> Note that the csv file begins with an unnamed column of increasing integers. We want this to be used as the index. Check out the description of the `index_col` keyword argument in [the docs for `read_csv`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html).

<span style="color:#33cc99">Solution:</span> 
```python
reviews = pd.read_csv('../input/wine-reviews/winemag-data_first150k.csv', index_col=0)
```
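To see what `index_col` changes without needing the full dataset, the sketch below reads a tiny in-memory CSV with the same layout (a header that starts with an empty column name). The sample rows and the names `sample_csv`, `with_default`, and `with_index` are assumptions made for this illustration; they are not part of the wine reviews file.

```python
import io

import pandas as pd

# Hypothetical two-row sample that mimics the layout of the real file:
# the header begins with a comma, so the first column has no name.
sample_csv = io.StringIO(
    ",country,points\n"
    "0,US,96\n"
    "1,Spain,96\n"
)

# Without index_col, pandas keeps the unnamed column as regular data
# and labels it "Unnamed: 0".
with_default = pd.read_csv(sample_csv)

# Rewind the buffer, then tell pandas to use the first column as the index.
sample_csv.seek(0)
with_index = pd.read_csv(sample_csv, index_col=0)

print(with_default.columns.tolist())
print(with_index.columns.tolist())
```

Passing `index_col=0` selects the column by position, while `index_col='Unnamed: 0'` selects it by the label pandas generates for a nameless column.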

## 5.

Run the cell below to create and display a DataFrame called `animals`:

In [48]:
animals = pd.DataFrame({'Cows': [12, 20], 'Goats': [22, 19]}, index=['Year 1', 'Year 2'])
animals

In the cell below, write code to save this DataFrame to disk as a CSV file with the name `cows_and_goats.csv`.

In [49]:
animals.to_csv("cows_and_goats.csv")
q5.check()

<span style="color:#33cc33">Correct</span>

In [50]:
q5.hint()
q5.solution()

<span style="color:#3366cc">Hint:</span> Use [`to_csv`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html) to save a DataFrame to a CSV file.

<span style="color:#33cc99">Solution:</span> 
```python
animals.to_csv("cows_and_goats.csv")
```
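By default, `to_csv` writes the index as the first column, so reading the file back with `index_col=0` should restore the original table. The sketch below checks that round trip in a temporary directory; the DataFrame `herd`, its values, and the file name `herd.csv` are hypothetical and used only for this demonstration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data, used only to demonstrate the round trip.
herd = pd.DataFrame({'Cows': [1, 2], 'Goats': [3, 4]}, index=['Year 1', 'Year 2'])

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'herd.csv')
    herd.to_csv(csv_path)                          # the index is written as the first column
    restored = pd.read_csv(csv_path, index_col=0)  # read that column back as the index

# Compare the restored table with the original.
print(restored.equals(herd))
```

If the row labels are not worth keeping, `to_csv` also accepts `index=False` to leave them out of the file.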

# Keep going

Move on to learn about **[indexing, selecting and assigning](https://www.kaggle.com/residentmario/indexing-selecting-assigning)**.