___

<a href='http://www.pieriandata.com'><img src='../Pierian_Data_Logo.png'/></a>
___
<center><em>Copyright by Pierian Data Inc.</em></center>
<center><em>For more information, visit us at <a href='http://www.pieriandata.com'>www.pieriandata.com</a></em></center>

# Series

The first main data type we will learn about in pandas is the Series. Let's import pandas and explore the Series object.

A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that <font color="Yellow">a Series can have axis labels, meaning it can be indexed by a label instead of just a numeric location</font>. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.

Let's explore this concept through some examples:

## Imports

In [128]:
import numpy as np
import pandas as pd

## Creating a Series from Python Objects

### Index and Data Lists

We can create a Series from Python lists (or from NumPy arrays).

In [129]:
myindex = ['USA','Canada','Mexico']
mydata = [1776,1867,1821]

<font color="yellow">A Series can be accessed by integer position or by label.</font>

In [130]:
myser = pd.Series(data=mydata)
myser

0    1776
1    1867
2    1821
dtype: int64

In [131]:
myser = pd.Series(data=mydata,index=myindex)
myser

USA       1776
Canada    1867
Mexico    1821
dtype: int64

In [132]:
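# Accessing an element by integer position
# (.iloc is explicit; plain myser[0] on a labeled index is deprecated in recent pandas versions)
myser.iloc[0]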

1776

In [133]:
# Accessing an element by label
myser['USA']

1776
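The two access styles above can also be written with the explicit indexers `.iloc` (integer position) and `.loc` (label). The following is a minimal sketch; the variable name `founded` is made up for illustration and simply rebuilds the same data.

```python
import pandas as pd

# Illustrative name; same data as myser above
founded = pd.Series([1776, 1867, 1821], index=['USA', 'Canada', 'Mexico'])

by_position = founded.iloc[0]            # first element, regardless of labels
by_label = founded.loc['USA']            # element whose label is 'USA'
last = founded.iloc[-1]                  # negative positions count from the end
subset = founded.loc[['USA', 'Mexico']]  # a list of labels returns a Series

print(by_position, by_label, last)
print(subset)
```

Being explicit about position versus label avoids ambiguity, which matters most when the index itself contains integers.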

In [134]:
ran_data = np.random.randint(0,100,4)
ran_data

array([40, 55, 52, 58])

In [135]:
names = ['Andrew','Bobo','Claire','David']

In [136]:
ages = pd.Series(ran_data,names)
ages

Andrew    40
Bobo      55
Claire    52
David     58
dtype: int64
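As noted in the introduction, a Series is not limited to numeric data. A small sketch of that idea, using made-up values and labels chosen only for illustration:

```python
import pandas as pd

# Hypothetical mixed data: a string, an integer, a float, and a list
mixed = pd.Series(['text', 42, 3.14, [1, 2]], index=['a', 'b', 'c', 'd'])

print(mixed.dtype)  # mixed Python objects are stored with the generic object dtype
print(mixed['d'])   # label access works the same way as for numeric data
```

Keep in mind that an object-dtype Series gives up most of the speed benefits of NumPy, so numeric data is usually best kept in a numeric dtype.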

### From a Dictionary

In [137]:
# Label : data
ages = {'Sammy':5,'Frank':10,'Spike':7}
ages

{'Sammy': 5, 'Frank': 10, 'Spike': 7}

In [138]:
pd.Series(ages)

Sammy     5
Frank    10
Spike     7
dtype: int64
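When building a Series from a dictionary, you can also pass an `index` to choose and order the keys you want. The sketch below assumes a hypothetical extra label, `'Rex'`, that is not in the dictionary, to show what happens to missing keys.

```python
import pandas as pd

ages = {'Sammy': 5, 'Frank': 10, 'Spike': 7}

# 'Rex' is a made-up label that does not exist in the dictionary
selected = pd.Series(ages, index=['Spike', 'Sammy', 'Rex'])

print(selected)
# 'Frank' is dropped because it was not requested,
# and 'Rex' is filled with NaN because it has no matching key
```

Because NaN is a floating-point value, the resulting Series is stored as `float64` rather than `int64`.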

# Key Ideas of a Series

## Named Index

In [139]:
# Imaginary Sales Data for 1st and 2nd Quarters for Global Company
q1 = {'Japan': 80, 'China': 450, 'India': 200, 'USA': 250}
q2 = {'Brazil': 100,'China': 500, 'India': 210,'USA': 260}

In [140]:
# Convert into Pandas Series
sales_Q1 = pd.Series(q1)
sales_Q2 = pd.Series(q2)

In [141]:
sales_Q1

Japan     80
China    450
India    200
USA      250
dtype: int64

In [142]:
# Call values based on Named Index
sales_Q1['Japan']

80

In [143]:
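# Integer-based location information is also retained!
# (.iloc is explicit; plain sales_Q1[0] on a labeled index is deprecated in recent pandas versions)
sales_Q1.iloc[0]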

80

**Be careful with potential errors!**

In [144]:
# Wrong name (raises KeyError)
# sales_Q1['France']

In [145]:
# Accidental extra space (raises KeyError)
# sales_Q1['USA ']

In [146]:
# Capitalization mistake (raises KeyError)
# sales_Q1['usa']
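Each of the lookups above fails with a `KeyError` because labels must match exactly. Here is one possible way to guard against that, sketched with a rebuilt copy of the Q1 data (the name `sales_q1` is illustrative):

```python
import pandas as pd

sales_q1 = pd.Series({'Japan': 80, 'China': 450, 'India': 200, 'USA': 250})

# Option 1: catch the error
try:
    sales_q1['France']
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Option 2: test membership first ("in" checks the index labels, not the values)
has_usa = 'USA' in sales_q1
has_lowercase_usa = 'usa' in sales_q1

# Option 3: .get() returns a default instead of raising
fallback = sales_q1.get('usa', 0)

print(lookup_failed, has_usa, has_lowercase_usa, fallback)
```

Which option fits best depends on whether a missing label is an expected case or a genuine mistake in your data.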

## Operations

In [147]:
# Grab just the index keys
sales_Q1.keys()

Index(['Japan', 'China', 'India', 'USA'], dtype='object')

In [148]:
# Operations are broadcast across the entire Series
sales_Q1 * 2

Japan    160
China    900
India    400
USA      500
dtype: int64

In [149]:
sales_Q2 / 100

Brazil    1.0
China     5.0
India     2.1
USA       2.6
dtype: float64

## Between Series

In [150]:
# Notice how pandas marks labels that are missing from either Series with NaN
sales_Q1 + sales_Q2

Brazil      NaN
China     950.0
India     410.0
Japan       NaN
USA       510.0
dtype: float64
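The result above follows from pandas aligning the two Series by label before adding, not by position. A minimal sketch with two made-up Series (`a` and `b`) whose labels are in opposite order:

```python
import pandas as pd

# Hypothetical data: same labels, opposite order
a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
b = pd.Series([10, 20, 30], index=['z', 'y', 'x'])

total = a + b

# Values are matched by label: x pairs 1 with 30, z pairs 3 with 10
print(total['x'], total['y'], total['z'])
```

A purely positional addition would have paired 1 with 10 and 3 with 30, so label alignment is worth keeping in mind whenever two Series come from different sources.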

In [151]:
# fill_value stands in for the missing operand before the addition is performed
sales_Q1.add(sales_Q2,fill_value=0)

Brazil    100.0
China     950.0
India     410.0
Japan      80.0
USA       510.0
dtype: float64
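Note that `fill_value` is applied before the addition, which is not the same as filling the NaN values afterwards. The sketch below rebuilds the two quarters under the illustrative names `q1_sales` and `q2_sales` to compare both approaches.

```python
import pandas as pd

q1_sales = pd.Series({'Japan': 80, 'China': 450, 'India': 200, 'USA': 250})
q2_sales = pd.Series({'Brazil': 100, 'China': 500, 'India': 210, 'USA': 260})

# Missing side treated as 0 BEFORE adding: Japan keeps 80, Brazil keeps 100
filled_before = q1_sales.add(q2_sales, fill_value=0)

# NaN replaced AFTER adding: Japan and Brazil both become 0
filled_after = (q1_sales + q2_sales).fillna(0)

# The result is float64 because of the intermediate NaN handling;
# cast back to integers if that is what you need
as_int = filled_before.astype('int64')

print(filled_before['Japan'], filled_after['Japan'])
print(as_int)
```

For sales totals, the first form is usually the one you want, since a country missing from one quarter should still contribute its sales from the other.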

That is all we need to know about Series. Up next: DataFrames!