___

<a href='https://www.prosperousheart.com/'> <img src='files/learn to code online.png' /></a>
___

# Series

A Series is the first data type you will meet when using **pandas**, and it is the building block for DataFrames.

A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). The difference is that a Series can carry axis labels, which also means it can be **indexed** by label!

A Series can hold just about any Python object type: strings, numbers, functions, etc.

In [1]:
# import NumPy & pandas
import numpy as np
import pandas as pd

In [2]:
# create data objects
labels = ['a', 'b', 'c']    # list of labels
temp_data = [10, 20, 30]    # list of data
arr = np.array(temp_data)    # NumPy array
d = {'a': 10, 'b': 20, 'c':30}    # dictionary

In [3]:
arr

array([10, 20, 30])

In [4]:
# CREATING A PANDAS SERIES (no labels)
# -------------------------
# there are several options when creating your Series
# be sure to use SHIFT + TAB to see more
pd.Series(data = temp_data)

0    10
1    20
2    30
dtype: int64

In [5]:
# CREATING A PANDAS SERIES (with labels)
# this lets you look up data by its label
# -------------------------
pd.Series(data = temp_data, index=labels)  # pd.Series(temp_data, labels) would also work

a    10
b    20
c    30
dtype: int64

In [6]:
# CREATING A PANDAS SERIES (using NumPy array - no labels)
# --------------------------------------------
pd.Series(arr) # pd.Series(np.array([10, 20, 30])) == pd.Series([10,20,30])

0    10
1    20
2    30
dtype: int32

In [7]:
# CREATING A PANDAS SERIES (using NumPy array - with labels)
# --------------------------------------------
pd.Series(arr, labels)

a    10
b    20
c    30
dtype: int32

In [8]:
# CREATING A PANDAS SERIES (using a dict)
# --------------------------------------------
print(d)
pd.Series(d)

{'a': 10, 'b': 20, 'c': 30}


a    10
b    20
c    30
dtype: int64

In [9]:
# CREATING A PANDAS SERIES (using functions)
# holds references to the functions
# the Series falls back to dtype object to store them
# --------------------------------------------
pd.Series(data=[sum, print, len])

0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object
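
As a quick check on the earlier point that a Series sits on top of a NumPy array, the sketch below pulls the two parts of a labelled Series apart. The variable names (`ser`, `values`, `index_labels`) are only illustrative and are not part of the notebook above.

```python
import numpy as np
import pandas as pd

# a small labelled Series, same data as the cells above
ser = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

values = ser.to_numpy()         # the underlying data as a NumPy array
index_labels = list(ser.index)  # the axis labels that NumPy arrays lack

print(type(values))
print(index_labels)
print(ser['b'])                 # label-based lookup
```

The labels live in `ser.index`, separate from the data, which is what makes label-based lookups possible.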

# Index

The key to understanding a Series is its index. These names (or numbers) make lookups very fast, because the index works much like a hash table or dictionary.

`pd.Series(data, index)`

The index must be a collection such as a list, even when there is only one label (the data itself may be a single value). Otherwise you will get a **TypeError** like the following:

In [17]:
pd.Series(1, "label1")

TypeError: Index(...) must be called with a collection of some kind, 'label1' was passed
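
A minimal sketch of the fix, assuming a recent pandas version: wrapping the label in a list is enough, and the bare-string form can be caught as a `TypeError`. The names `ok` and `raised` are only illustrative.

```python
import pandas as pd

# A single value with a one-item list of labels is accepted.
ok = pd.Series(1, index=["label1"])
print(ok)

# A bare string as the index is rejected.
try:
    pd.Series(1, "label1")
    raised = False
except TypeError as err:
    raised = True
    print("TypeError:", err)
```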

<hr>
The following code is proper notation for a Series. Be sure that you have the same number of labels as data points; otherwise you will receive a different error (a **ValueError**).

In [10]:
ser1 = pd.Series([0, 1, 2, 3], ['USA', 'Italy', 'Japan', 'Scotland'])
ser1

USA         0
Italy       1
Japan       2
Scotland    3
dtype: int64

In [11]:
ser2 = pd.Series([0, 1, 4, 3], ['USA', 'Italy', 'Germany', 'Scotland'])
ser2

USA         0
Italy       1
Germany     4
Scotland    3
dtype: int64
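
The length requirement mentioned earlier can be checked with a short sketch like the one below. It assumes a recent pandas version, where a mismatch raises `ValueError`; the variable names are made up for illustration.

```python
import pandas as pd

data = [0, 1, 2]                                       # three data points
too_many_labels = ['USA', 'Italy', 'Japan', 'Scotland']  # four labels

try:
    pd.Series(data, too_many_labels)
    mismatch_error = None
except ValueError as err:
    mismatch_error = err
    print("ValueError:", err)

# Trimming the labels to the length of the data fixes the problem.
fixed = pd.Series(data, too_many_labels[:len(data)])
print(fixed)
```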

Grabbing information out of a Series works much like bracket lookup on a Python dictionary. The value in the brackets is matched against the index, so if it is a number it must match a label in your Series.

In [12]:
ser1['USA'] # series[index_label]

0

In [13]:
ser3 = pd.Series(labels)
print(ser3)
ser3[1]

0    a
1    b
2    c
dtype: object


'b'
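
To push the dictionary comparison a little further, here is a hedged sketch of a few other lookup styles. It rebuilds `ser1` under the name `ser` so it runs on its own; `'France'` is just a label that is deliberately missing.

```python
import pandas as pd

ser = pd.Series([0, 1, 2, 3], ['USA', 'Italy', 'Japan', 'Scotland'])

print('Japan' in ser)         # membership tests the index, like dict keys
print(ser.get('France'))      # like dict.get: None when the label is missing
print(ser.get('France', -1))  # ... or a default of your choice
print(ser.loc['Italy'])       # explicit lookup by label
print(ser.iloc[1])            # explicit lookup by position

# Plain brackets with a missing label raise KeyError, as a dict would.
try:
    ser['France']
    missing = False
except KeyError:
    missing = True
print(missing)
```

Using `.loc` and `.iloc` makes it explicit whether a number is meant as a label or as a position, which matters once the index itself contains integers.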

Basic operations between Series are usually aligned on the index.

For example, if you add two Series and a label appears in only one of them, pandas puts a **null** (**NaN**) value in its place.

In [14]:
print(ser1)
print()
print(ser2)
ser1 + ser2

USA         0
Italy       1
Japan       2
Scotland    3
dtype: int64

USA         0
Italy       1
Germany     4
Scotland    3
dtype: int64


Germany     NaN
Italy       2.0
Japan       NaN
Scotland    6.0
USA         0.0
dtype: float64
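
If the NaN values are not what you want, one option is `Series.add` with its `fill_value` parameter, which treats a label missing from one side as the given value. The sketch below rebuilds `ser1` and `ser2` so it runs on its own; choosing `0` as the fill value is an assumption that suits addition.

```python
import pandas as pd

ser1 = pd.Series([0, 1, 2, 3], ['USA', 'Italy', 'Japan', 'Scotland'])
ser2 = pd.Series([0, 1, 4, 3], ['USA', 'Italy', 'Germany', 'Scotland'])

# Labels present in only one Series are treated as 0 instead of NaN.
total = ser1.add(ser2, fill_value=0)
print(total)
```

Japan and Germany now keep their own values (2.0 and 4.0) instead of becoming NaN.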

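**NOTE:** In the result above, the integers were converted into floats. This happens because NaN is a floating-point value, so an integer Series that gains missing values is upcast to float64. True division also returns floats, which ensures you do not lose information to integer rounding. When every label matches, adding two integer Series still gives an integer Series.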
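
A short sketch to check when the conversion to floats actually happens. It rebuilds `ser1` and `ser2` so it runs on its own; the result names are illustrative.

```python
import pandas as pd

ser1 = pd.Series([0, 1, 2, 3], ['USA', 'Italy', 'Japan', 'Scotland'])
ser2 = pd.Series([0, 1, 4, 3], ['USA', 'Italy', 'Germany', 'Scotland'])

same_labels = ser1 + ser1   # every label matches, no NaN needed
mixed_labels = ser1 + ser2  # unmatched labels produce NaN
halved = ser1 / 2           # true division

print(same_labels.dtype)    # stays an integer dtype
print(mixed_labels.dtype)   # float, because NaN is a float
print(halved.dtype)         # float, because of true division
```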

## More Indexing Help

If you would like to learn more, please <a href='http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html'>read the documentation</a>.

Indexing a 2d matrix can be a bit confusing at first, especially when you start to add in step size.

Try a Google image search for **_NumPy indexing_** to find useful images, like this one:

<a href="https://www.oreilly.com/library/view/python-for-data/9781449323592/ch04.html"><img src='IMGs/NumPy_indexing_0.png' width=500></a>

... or even this!

<a href="https://www.oreilly.com/library/view/python-for-data/9781449323592/ch04.html"><img src='IMGs/NumPy_indexing_1.png' width=500></a>
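
For a hands-on version of the same idea, here is a small sketch of 2D indexing with and without a step size. The 5x5 `grid` array is made up for illustration; it is not taken from the images.

```python
import numpy as np

# 5x5 grid holding 0..24, so each value reveals its own position
grid = np.arange(25).reshape(5, 5)

row = grid[1]             # second row
elem = grid[1, 2]         # row 1, column 2
block = grid[:2, 1:3]     # first two rows, columns 1 and 2
stepped = grid[::2, ::2]  # every other row and every other column

print(row)
print(elem)
print(block)
print(stepped)
```

The slice syntax is `start:stop:step` for each axis, separated by a comma, so `grid[::2, ::2]` keeps rows 0, 2, 4 and columns 0, 2, 4.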