# Learning Pandas
<a href="https://pandas.pydata.org/">Pandas</a>, a portmanteau of "panel" and "data", is a Python data analysis library.



### Creating a Series
There are a couple of ways to create a <code>Series</code> from scratch.

In [3]:
import numpy as np
import pandas as pd 

#### Creating a dictionary 

Sample data from CashBox
Track user balances with <code>test_balance_data</code>, a standard Python dictionary. Each key is a username and each value is that user's current account balance.

In [4]:
test_balance_data = {
    'pasan': 20.00, 
    'treasure': 20.18,
    'ashley': 1.05, 
    'craig': 42.42
}

#### The <code>Series</code> accepts any dict-like argument
Read the <a href="https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.Series.html">pandas.Series</a> documentation for more information.

Notice that the labels are set from <code>test_balance_data.keys()</code> and the values are set from <code>test_balance_data.values()</code>.

In [5]:
balances = pd.Series(test_balance_data)

In [6]:
print(balances)

pasan       20.00
treasure    20.18
ashley       1.05
craig       42.42
dtype: float64


In [7]:
print(pd.Series([1,2,3]))

0    1
1    2
2    3
dtype: int64
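The list-based <code>Series</code> above receives a default integer index. As a minimal sketch, the constructor's <code>index</code> parameter can supply labels when the values come from a list. The names <code>usernames</code>, <code>amounts</code>, <code>labeled</code>, and <code>from_dict</code> are illustrative and are not part of the original notebook.

```python
import pandas as pd

# Hypothetical sample data mirroring test_balance_data
usernames = ['pasan', 'treasure', 'ashley', 'craig']
amounts = [20.00, 20.18, 1.05, 42.42]

# Pass the labels explicitly through the index parameter
labeled = pd.Series(amounts, index=usernames)

# Build the same Series from the equivalent dictionary for comparison
from_dict = pd.Series(dict(zip(usernames, amounts)))
print(labeled.equals(from_dict))
```

Both routes should produce the same labels and values; the dictionary form is simply more compact when the data already lives in key/value pairs.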


### Accessing a Series 
There are multiple ways to get to the data stored in your <code>Series</code>. Let's explore the <code>balances</code> <code>Series</code>.

Remember, the <code>Series</code> is indexed by username. The label is the username, the value is that user's cash balance. 

In [10]:
# Setup
import pandas as pd

In [12]:
sample = {
    'neptune': 2.793, 
    'earth': 92.96,
    'uranus': 1.784, 
    'jupiter': 483.8
}

distances = pd.Series(sample)

# iloc slices by position like a standard Python list: start is inclusive, stop is exclusive
distances.iloc[0:2]

neptune     2.793
earth      92.960
dtype: float64

In [14]:
# Labels are exposed as attributes on the Series (if they are valid Python identifiers)
distances.earth

92.96

In [16]:
# Check to see if 'pluto' is in the index of the Series
'pluto' in distances

False

In [17]:
# Positional access works like a list; use iloc to make it explicit,
# since plain distances[-1] on a label-indexed Series is deprecated in recent pandas
distances.iloc[-1]

483.8

In [19]:
# loc slices by label, and both endpoints are inclusive
distances.loc['earth':'jupiter']

earth       92.960
uranus       1.784
jupiter    483.800
dtype: float64
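The access patterns above can be gathered into one short sketch. It rebuilds <code>distances</code> from the same sample values and contrasts the two slicing rules. The use of <code>Series.get</code> with a default is an addition for illustration and does not appear in the original cells; the variable names <code>by_position</code>, <code>by_label</code>, and <code>pluto</code> are hypothetical.

```python
import pandas as pd

distances = pd.Series({
    'neptune': 2.793,
    'earth': 92.96,
    'uranus': 1.784,
    'jupiter': 483.8,
})

# Positional slice: the stop position is excluded, so two rows come back
by_position = distances.iloc[0:2]

# Label slice: both endpoints are included, so three rows come back
by_label = distances.loc['earth':'jupiter']

# get() returns a default instead of raising KeyError for a missing label
pluto = distances.get('pluto', default=None)

print(len(by_position), len(by_label), pluto)
```

Checking membership with <code>in</code> and then indexing works too; <code>get</code> just folds the two steps into one call.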

## Vectorization and Broadcasting 
NumPy is very fast because it relies heavily on <a href="http://enhancedatascience.com/2018/05/07/machine-learning-explained-vectorization-matrix-operations/">vectorization</a>. Vectorization allows us to avoid explicit Python loops by working on an entire set of values at once.

The looping still happens, but it runs in optimized, precompiled code at a very low level rather than in the Python interpreter.
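As a rough sketch of the idea, the snippet below adds the same amount to every value twice: once with an explicit Python loop and once with a single vectorized expression, where the scalar is broadcast across the whole array. The names <code>values</code>, <code>looped</code>, <code>vectorized</code>, and <code>with_bonus</code> are assumptions made for this example, and the amounts are borrowed from the balance sample data.

```python
import numpy as np
import pandas as pd

values = np.array([20.00, 20.18, 1.05, 42.42])

# Looping: Python visits one element at a time
looped = np.array([v + 5 for v in values])

# Vectorized: one expression covers the whole array,
# and the scalar 5 is broadcast to every element
vectorized = values + 5

print(np.allclose(looped, vectorized))

# The same expression works on a Series, and the labels are preserved
balances = pd.Series(values, index=['pasan', 'treasure', 'ashley', 'craig'])
with_bonus = balances + 5
print(list(with_bonus.index))
```

On four values the speed difference is negligible; the gap tends to matter once the arrays hold many thousands of elements.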
