# Creating, Reading, and Writing

## Data Frames (DataFrame)

A DataFrame is a table. The dictionary syntax used here, which resembles a JSON object, creates a DataFrame in which each key is a column name and each value is the list of entries for that column. Dictionaries are a very common structure in Python, and the resulting table is laid out much like a table in a SQL database: named columns, with one record per row.

In [3]:
import pandas as pd

pd.DataFrame({'Yes': [50, 21], 'No': [131, 2]})

   Yes   No
0   50  131
1   21    2


In [5]:
pd.DataFrame({'Bob': ['I liked it', 'It was awful'], 'Sue': ['Pretty good', 'Bland']})

            Bob          Sue
0    I liked it  Pretty good
1  It was awful        Bland
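
To make the mapping from dictionary keys to columns concrete, the following minimal sketch inspects the Yes/No DataFrame from earlier in this section. The variable name `votes` is introduced here only for illustration and does not appear in the original cells.

```python
import pandas as pd

# Hypothetical variable name; the data matches the Yes/No example above.
votes = pd.DataFrame({'Yes': [50, 21], 'No': [131, 2]})

# Each dictionary key became a column label.
columns = list(votes.columns)

# Each dictionary value became that column's entries, in order.
yes_entries = votes['Yes'].tolist()

print(columns)
print(yes_entries)
```

The row labels (0 and 1) were not given in the dictionary; pandas assigned them automatically.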


## Series

A Series, by contrast, is a sequence of data values. If a DataFrame is a table, a Series is a list. It's effectively like a single column from a DataFrame. A DataFrame can contain multiple Series, and a Series can contain multiple values.

In [6]:
pd.Series([1, 2, 3, 4, 5])

0    1
1    2
2    3
3    4
4    5
dtype: int64
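
As a quick check of the claim that a Series is like a single column of a DataFrame, this sketch selects one column from a small DataFrame and looks at its type. The names `df` and `yes_column` are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Illustrative DataFrame reusing the Yes/No data from the previous section.
df = pd.DataFrame({'Yes': [50, 21], 'No': [131, 2]})

# Selecting a single column with square brackets returns a Series.
yes_column = df['Yes']

print(type(yes_column).__name__)
print(yes_column.name)
```

The selected Series keeps the column label as its name and shares the DataFrame's row labels as its index.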

A Series can have a name, which acts as its column name, and an index, which is the list of row labels. If you don't specify an index when creating a Series, pandas uses a default integer index starting at 0.

In [11]:
pd.Series([30, 35, 40], index=['2015 Sales','2016 Sales', '2017 Sales'], name='Product A')

2015 Sales    30
2016 Sales    35
2017 Sales    40
Name: Product A, dtype: int64
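
The difference between the default index and a custom one can be sketched as follows. The variable names `default_indexed` and `labelled` are assumptions made for this example; the values are the ones used in the cell above.

```python
import pandas as pd

# No index given: pandas falls back to integer labels 0, 1, 2.
default_indexed = pd.Series([30, 35, 40])

# Explicit index and name, as in the cell above.
labelled = pd.Series(
    [30, 35, 40],
    index=['2015 Sales', '2016 Sales', '2017 Sales'],
    name='Product A',
)

print(list(default_indexed.index))

# With a custom index, values can be looked up by their row label.
print(labelled['2016 Sales'])
print(labelled.name)
```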

## Reading Files

Most of the time we will _not_ be creating our own data by hand, but instead reading it in from a file. The `read_csv` function reads a CSV file into a DataFrame. The `shape` attribute gives the DataFrame's dimensions as a `(rows, columns)` tuple, and the `head` method returns the first few rows (five by default).

In [18]:
wines = pd.read_csv('./wines.csv')
wines.shape

(4, 3)

In [16]:
wines.head()

   Unnamed: 0 country  price
0           0     USA   55.0
1           1  Canada    NaN
2           2  Brazil   33.0
3           3     NaN   73.0
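
The `Unnamed: 0` column in the output above typically appears when a CSV file was saved together with its row index. One way to handle it, sketched here, is to pass `index_col=0` to `read_csv`. The contents of `wines.csv` are not shown in this section, so the `csv_text` string below is an assumption reconstructed from the displayed output, and `io.StringIO` stands in for the file so the example is self-contained.

```python
import io
import pandas as pd

# Assumed file contents, reconstructed from the output shown above.
csv_text = ",country,price\n0,USA,55.0\n1,Canada,\n2,Brazil,33.0\n3,,73.0\n"

# Default read: the unlabeled first column is loaded as 'Unnamed: 0'.
raw = pd.read_csv(io.StringIO(csv_text))

# index_col=0 tells pandas to use the first column as the row index instead.
wines = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(raw.shape)
print(wines.shape)

# Writing: to_csv returns the CSV text when no path is given;
# passing a path such as 'wines_out.csv' (hypothetical) would write a file.
round_trip = wines.to_csv()
print(round_trip)
```

Empty fields in the CSV are read as missing values, which pandas displays as `NaN`.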
