# Tutorial 1

## Getting started

To use pandas, you'll typically start with the following line of code.

In [1]:
import pandas as pd

## Creating data
There are two core objects in pandas: the DataFrame and the Series.

### DataFrame
A DataFrame is a table. It contains an array of individual entries, each of which has a certain value. Each entry corresponds to a row (or record) and a column.

For example, consider the following simple DataFrame:

In [4]:
fruits = pd.DataFrame({'Apples': [50, 21], 'Bananas': [131, 2]})

fruits

   Apples  Bananas
0      50      131
1      21        2


In this example, the "0, Apples" entry has the value of 50. The "0, Bananas" entry has a value of 131, and so on.
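To check which value sits at a given row and column, a lookup along the following lines can help. This is a minimal sketch: the `.loc` accessor is a standard pandas feature but has not been introduced in this tutorial so far, and the variable names `apples_row0` and `bananas_row0` are only illustrative.

```python
import pandas as pd

# Same data as the fruits example above.
fruits = pd.DataFrame({'Apples': [50, 21], 'Bananas': [131, 2]})

# .loc[row_label, column_label] returns the single entry at that position.
apples_row0 = fruits.loc[0, 'Apples']
bananas_row0 = fruits.loc[0, 'Bananas']

print(apples_row0)   # 50
print(bananas_row0)  # 131
```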

DataFrame entries are not limited to integers. For instance, here's a DataFrame with a column of strings alongside a column of integers:

In [6]:
bank_branches = pd.DataFrame({'Country': ['US', 'UK', 'France', 'Italy', 'Monaco', 'Switzerland'],
                              'Branches': [31, 20, 2, 4, 45, 24]})
bank_branches

       Country  Branches
0           US        31
1           UK        20
2       France         2
3        Italy         4
4       Monaco        45
5  Switzerland        24
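Because each column holds its own kind of data, it can be useful to inspect the column types. The sketch below assumes the helper functions in `pd.api.types`, which are part of pandas but not covered in this tutorial; the names `country_is_text` and `branches_is_int` are made up for illustration. The exact label printed for the text column depends on the pandas version, so no expected output is shown for `dtypes`.

```python
import pandas as pd

bank_branches = pd.DataFrame({
    'Country': ['US', 'UK', 'France', 'Italy', 'Monaco', 'Switzerland'],
    'Branches': [31, 20, 2, 4, 45, 24],
})

# Each column keeps its own data type.
print(bank_branches.dtypes)

# Check the two columns separately: one holds text, the other integers.
country_is_text = pd.api.types.is_string_dtype(bank_branches['Country'])
branches_is_int = pd.api.types.is_integer_dtype(bank_branches['Branches'])

print(country_is_text)  # True
print(branches_is_int)  # True
```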


The dictionary-list constructor assigns values to the column labels, but just uses an ascending count from 0 (0, 1, 2, 3, ...) for the row labels. Sometimes this is OK, but oftentimes we will want to assign these labels ourselves.

The list of row labels used in a DataFrame is known as an Index. We can assign values to it by using an index parameter in our constructor:

In [7]:
pd.DataFrame({'Bob': ['I liked it.', 'It was awful.'], 
              'Sue': ['Pretty good.', 'Bland.']},
             index=['Product A', 'Product B'])

                     Bob           Sue
Product A    I liked it.  Pretty good.
Product B  It was awful.        Bland.
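To see the difference between default and custom row labels, the two cases can be compared side by side. This is only a sketch: the variable name `reviews` is hypothetical (the original cell does not assign the DataFrame to a name), and `.loc` is a standard pandas accessor used here for label-based lookup.

```python
import pandas as pd

# Default row labels: an ascending count from 0.
fruits = pd.DataFrame({'Apples': [50, 21], 'Bananas': [131, 2]})
default_labels = list(fruits.index)
print(default_labels)  # [0, 1]

# Custom row labels supplied through the index parameter.
reviews = pd.DataFrame({'Bob': ['I liked it.', 'It was awful.'],
                        'Sue': ['Pretty good.', 'Bland.']},
                       index=['Product A', 'Product B'])
custom_labels = list(reviews.index)
print(custom_labels)  # ['Product A', 'Product B']

# Rows can then be looked up by their label.
bob_on_b = reviews.loc['Product B', 'Bob']
print(bob_on_b)  # It was awful.
```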


### Series
A Series, by contrast, is a sequence of data values. If a DataFrame is a table, a Series is a list. And in fact you can create one with nothing more than a list:

In [8]:
pd.Series([0,1,32,323])

0      0
1      1
2     32
3    323
dtype: int64
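The list-like nature of a Series can be checked with a quick round trip. A minimal sketch, assuming the illustrative names `values` and `s`:

```python
import pandas as pd

values = [0, 1, 32, 323]
s = pd.Series(values)

# A Series has a length, like a list.
print(len(s))       # 4

# tolist() converts the values back into a plain Python list.
print(s.tolist())   # [0, 1, 32, 323]

# Like a DataFrame, it gets row labels counting up from 0 by default.
print(list(s.index))  # [0, 1, 2, 3]
```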

A Series is, in essence, a single column of a DataFrame. So you can assign row labels to the Series the same way as before, using an index parameter. However, a Series does not have a column name; it has only one overall name:

In [9]:
pd.Series([30, 35, 40], index=['2015 Sales', '2016 Sales', '2017 Sales'], name='Product A')

2015 Sales    30
2016 Sales    35
2017 Sales    40
Name: Product A, dtype: int64
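One way to see what the overall name is for: when a Series is converted to a DataFrame, the name becomes the column label. The sketch below assumes the illustrative variable names `sales` and `as_frame`; `to_frame()` is a standard Series method not covered in this tutorial.

```python
import pandas as pd

sales = pd.Series([30, 35, 40],
                  index=['2015 Sales', '2016 Sales', '2017 Sales'],
                  name='Product A')

print(sales.name)            # Product A
print(sales['2016 Sales'])   # 35

# Converting to a DataFrame turns the Series name into the column label.
as_frame = sales.to_frame()
print(list(as_frame.columns))  # ['Product A']
```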

The Series and the DataFrame are intimately related. It's helpful to think of a DataFrame as actually being just a bunch of Series "glued together". We'll see more of this in the next section of this tutorial.
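The "glued together" picture can be illustrated in both directions. This is a rough sketch rather than the only way to do it: it uses `pd.concat` with `axis=1` to place two named Series side by side, and the names `apples`, `bananas`, and `column` are chosen here for illustration.

```python
import pandas as pd

apples = pd.Series([50, 21], name='Apples')
bananas = pd.Series([131, 2], name='Bananas')

# Gluing side by side: each Series becomes one column,
# and its name becomes the column label.
fruits = pd.concat([apples, bananas], axis=1)
print(list(fruits.columns))  # ['Apples', 'Bananas']

# Going the other way: selecting a column gives back a Series.
column = fruits['Apples']
print(type(column).__name__)  # Series
print(column.name)            # Apples
print(column.equals(apples))  # True
```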