<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Series-and-Dataframes" data-toc-modified-id="Series-and-Dataframes-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Series and Dataframes</a></span><ul class="toc-item"><li><span><a href="#Series" data-toc-modified-id="Series-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Series</a></span><ul class="toc-item"><li><span><a href="#Create-a-Series" data-toc-modified-id="Create-a-Series-1.1.1"><span class="toc-item-num">1.1.1&nbsp;&nbsp;</span>Create a Series</a></span></li><li><span><a href="#Using-the-Series-index" data-toc-modified-id="Using-the-Series-index-1.1.2"><span class="toc-item-num">1.1.2&nbsp;&nbsp;</span>Using the Series index</a></span></li></ul></li><li><span><a href="#Dataframes" data-toc-modified-id="Dataframes-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Dataframes</a></span><ul class="toc-item"><li><span><a href="#Create-a-DataFrame" data-toc-modified-id="Create-a-DataFrame-1.2.1"><span class="toc-item-num">1.2.1&nbsp;&nbsp;</span>Create a DataFrame</a></span></li><li><span><a href="#Using-the-DataFrame-index" data-toc-modified-id="Using-the-DataFrame-index-1.2.2"><span class="toc-item-num">1.2.2&nbsp;&nbsp;</span>Using the DataFrame index</a></span></li></ul></li></ul></li></ul></div>

From the pandas documentation: https://pandas.pydata.org/docs/getting_started/overview.html

The two primary data structures of pandas, Series (1-dimensional) and DataFrame (2-dimensional), 
handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. For R users, DataFrame provides everything that R’s data.frame provides and much more. pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other 3rd party libraries.

Here are just a few of the things that pandas does well:

- Easy handling of missing data (represented as NaN) in floating point as well as non-floating point data

- Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects

- Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data

- Intuitive merging and joining data sets

- Flexible reshaping and pivoting of data sets

- Robust IO tools for loading data from flat files (CSV and delimited), Excel files, databases, and saving / loading data from the ultrafast HDF5 format
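
Two of the items above, missing-data handling and split-apply-combine, can be sketched in a few lines. The `orders` frame, its `region` and `units` columns, and all values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value in the "units" column
orders = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "units": [10.0, np.nan, 4.0, 6.0],
})

# Missing values are skipped by default in reductions such as sum()
total_units = orders["units"].sum()

# Split by region, apply sum() to each group, combine into one Series
units_by_region = orders.groupby("region")["units"].sum()

print(total_units)
print(units_by_region)
```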

In [None]:
import numpy as np
import pandas as pd
from numpy.random import randn


# Series and Dataframes

## Series

**Series** is a one-dimensional labeled array capable of holding any data type. The **axis labels** are collectively referred to as the index. The basic method to create a Series is to call:

s = pd.Series(data, index=index)

### Create a Series

In [None]:
# Use the Series constructor: s = pd.Series(data, index=index)
# Shift + Tab to see other parameters

s = pd.Series(np.random.randn(5), index=["a", "b", "c", "d", "e"])
s

In [None]:
# Index is optional

s = pd.Series(randn(5))  # Don't need np.random because randn was imported.
s

In [None]:
# A list, array or dictionary can be used to create a series.

my_list = [5,3,0]
my_arr = np.array([5,3,0])
my_dictionary = {'a':5,'b':3,'c':0}

In [None]:
# Use a list w/o an index

pd.Series(my_list)

In [None]:
# Use a list w/ an index

pd.Series(my_list, index=['a','b','c'])

In [None]:
# Use a list w/ a list for the index
i_names = ['a','b','c']   # one label per value (a flat list, not a nested one)

pd.Series(my_list, index=i_names)

In [None]:
# Use an array
my_arr = np.array([5,3,0])
pd.Series(my_arr, index=['a','b','c'])

In [None]:
# Use a dictionary
my_dictionary = {'a':5,'b':3,'c':0}
pd.Series(my_dictionary, index=['a','b','c'])

# What happens if the index list is changed to hold x,y and z?
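
One way to explore the question above: when a dictionary is passed together with an index, pandas looks up each index label among the dictionary keys, so a label with no matching key comes back as NaN. A minimal sketch (the names `matched` and `unmatched` are only for illustration):

```python
import pandas as pd

my_dictionary = {'a': 5, 'b': 3, 'c': 0}

# Index labels match the dictionary keys: the values are carried over
matched = pd.Series(my_dictionary, index=['a', 'b', 'c'])

# No key matches 'x', 'y' or 'z': every value is missing (NaN)
unmatched = pd.Series(my_dictionary, index=['x', 'y', 'z'])

print(matched)
print(unmatched)
```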

In [None]:
# Using strings
my_cities = ['Chicago','Atlanta','Boston']
pd.Series(my_cities, i_names)

In [None]:
# Use the cities as the labels
my_cities = ['Chicago','Atlanta','Boston']
state = ['IL','GA','MA']
cities = pd.Series(state, my_cities)
cities

### Using the Series index


In [None]:
cities['Chicago']
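
Beyond a single label lookup, a Series can also be addressed by integer position or by a list of labels. A short sketch, rebuilding the `cities` Series so the block runs on its own (the result variable names are illustrative):

```python
import pandas as pd

cities = pd.Series(['IL', 'GA', 'MA'], index=['Chicago', 'Atlanta', 'Boston'])

by_label = cities.loc['Atlanta']          # explicit label-based lookup
by_position = cities.iloc[0]              # first element, whatever its label
several = cities[['Chicago', 'Boston']]   # a list of labels returns a Series
has_boston = 'Boston' in cities           # membership tests check the index

print(by_label, by_position, has_boston)
print(several)
```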

## Dataframes

**DataFrame** is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects. It is generally the most commonly used pandas object. 
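
The "dict of Series" view can be made concrete. In the sketch below (the numbers are made up, and `shipments` is an illustrative name), each Series becomes a column and rows are aligned on the index labels, so a label missing from one Series shows up as NaN in that column:

```python
import pandas as pd

sent = pd.Series([120, 80, 95], index=['IL', 'GA', 'MA'])
used = pd.Series([100, 60], index=['IL', 'GA'])   # no entry for 'MA'

# Each dictionary key becomes a column; rows align on the index labels
shipments = pd.DataFrame({'Sent': sent, 'Used': used})

print(shipments)
print(shipments.dtypes)
```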

### Create a DataFrame

In [None]:
np.random.seed(1234)  
df = pd.DataFrame(randn(4,5),index=['IL','GA','MA','VT'],columns=['Sent','Used','Expired','Lost','Destroyed'])
df

In [None]:
# A little shortcut
np.random.seed(1234)  
df = pd.DataFrame(randn(4,5),index='IL GA MA VT'.split(),columns='S U E L D'.split())
df

In [None]:
# Create a DataFrame

data = {
    'apples': [3, 2, 0, 1], 
    'oranges': [0, 3, 7, 2]
}
sales = pd.DataFrame(data)
sales

In [None]:
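# Select by integer position: row 1, column 1 (the end of each slice is excluded).
# sales has only two columns (positions 0 and 1), so a column slice of 2:3 would select nothing.
sales.iloc[1:2, 1:2]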

### Using the DataFrame index

**.loc[] and .iloc[]**

`.loc[]` selects by label, while `.iloc[]` selects by integer position.

In [None]:
df

In [None]:
# Select a column
df['S']

In [None]:
# Select multiple columns
df[['S','E']]            # Outer brackets: the indexing operator, which expects an argument; inner brackets: the list of column names passed in

In [None]:
# Getting a row by label
df.loc['IL']

In [None]:
# Getting a row by integer position
df.iloc[0]

In [None]:
# Getting the rows at positions 1 and 2 (the end position, 3, is excluded)
df.iloc[1:3]
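
A sketch of how the two indexers differ, using a small frame with fixed, made-up values rather than random numbers so the results are easy to check. Slicing with `.loc` includes the end label, while slicing with `.iloc` excludes the end position.

```python
import numpy as np
import pandas as pd

# 4 x 3 frame holding the integers 0..11 (illustrative values)
df = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=['IL', 'GA', 'MA', 'VT'],
    columns=['S', 'U', 'E'],
)

cell_by_label = df.loc['GA', 'U']          # row label, column label
cell_by_position = df.iloc[1, 1]           # same cell by integer position

label_slice = df.loc['GA':'MA']            # end label 'MA' is included
position_slice = df.iloc[1:3]              # end position 3 is excluded

subset = df.loc[['IL', 'VT'], ['S', 'E']]  # lists pick specific rows and columns

print(cell_by_label, cell_by_position)
print(label_slice.equals(position_slice))
print(subset)
```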