___

<a href='http://www.pieriandata.com'> <img src='../Pierian_Data_Logo.png' /></a>
___
# Series

The first main data type we will learn about in pandas is the Series. Let's import pandas and explore the Series object.

A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of just a numeric position. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.

We can access the data points using these labels.

Let's explore this concept through some examples:

In [1]:
import numpy as np
import pandas as pd
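
To make the comparison with NumPy concrete, here is a minimal sketch. The variable names and values are illustrative and not part of the original notebook, and it assumes a pandas version that provides `to_numpy()` (0.24 or later).

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the original notebook)
values = np.array([1.5, 2.5, 3.5])
s = pd.Series(values, index=['x', 'y', 'z'])

# A NumPy array is addressed by integer position only
first_from_array = values[0]

# A Series can also be addressed by label
first_from_series = s['x']

# The underlying data can be pulled back out as a NumPy array
underlying = s.to_numpy()
print(type(underlying))
print(first_from_array, first_from_series)
```

Both lookups return the same value; the Series simply adds a labeled index on top of the array data.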

### Creating a Series

You can convert a list, a NumPy array, or a dictionary to a Series:

In [16]:
labels = ['a','b','c'] # this is a list

my_list = [10,20,30] # list of numerical values

arr = np.array([10,20,30]) # this is a NumPy array

d = {'a':10,'b':20,'c':30} # this is a dictionary

In [17]:
arr # NumPy array

array([10, 20, 30])

In [18]:
d # this is a Dictionary

{'a': 10, 'b': 20, 'c': 30}

**Using Lists**

* Creating a pandas Series from a Python list
* pd.Series: in typical usage, it takes data and index as arguments

In [19]:
pd.Series(data=my_list) # we are just specifying the data for the Pandas series. It uses default index numbers

0    10
1    20
2    30
dtype: int64

In [21]:
# You can specify what you want the index to be.
# If the number of data values and the number of index labels don't match, this raises an error.
# Here, we specify the labels instead of using the default index numbers.
# Unlike a NumPy array, a Series lets us access the data values using the labels.

pd.Series(data=my_list,index=labels)

a    10
b    20
c    30
dtype: int64
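
The length-mismatch error mentioned in the comments above can be reproduced with a short sketch. The `short_labels` list is a hypothetical example, not part of the original notebook; the exact error message varies between pandas versions, so only the exception type is relied on here.

```python
import pandas as pd

my_list = [10, 20, 30]
short_labels = ['a', 'b']  # hypothetical: only two labels for three values

try:
    pd.Series(data=my_list, index=short_labels)
    raised = False
except ValueError as err:
    # pandas refuses to build a Series whose data and index lengths differ
    raised = True
    print(err)
```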

In [22]:
# You don't need to write data= or index= explicitly; both can be passed positionally.
pd.Series(my_list,labels)

a    10
b    20
c    30
dtype: int64

**NumPy Arrays**

* You can create a pandas Series from a NumPy array in exactly the same way as from a Python list.

In [23]:
pd.Series(arr)

0    10
1    20
2    30
dtype: int32

In [24]:
pd.Series(arr,labels)

a    10
b    20
c    30
dtype: int32

**Dictionary**

* When creating a pandas Series from a dictionary, pandas automatically uses the dictionary's keys as the index labels.

In [25]:
d

{'a': 10, 'b': 20, 'c': 30}

In [9]:
pd.Series(d)

a    10
b    20
c    30
dtype: int64
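
As a quick check of the key-to-label mapping, the following sketch rebuilds the same dictionary and inspects the resulting index. The names `from_dict` and `index_labels` are illustrative and not part of the original notebook.

```python
import pandas as pd

d = {'a': 10, 'b': 20, 'c': 30}
from_dict = pd.Series(d)

# The index labels are taken from the dictionary keys
index_labels = list(from_dict.index)
print(index_labels)

# Each value stays reachable under its original key
print(from_dict['b'])
```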

### Data in a Series

A pandas Series can hold a variety of object types:

In [26]:
# A pandas Series can hold any type of object as its data points.
pd.Series(data=labels)

0    a
1    b
2    c
dtype: object

In [28]:
# Even functions can be used as data points in a pandas Series (although you are unlikely to need this).
pd.Series([sum,print,len])

0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object
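
The object dtype shown above is what pandas falls back to when the values do not share a single numeric type. Here is a minimal sketch, using illustrative values that are not part of the original notebook:

```python
import pandas as pd

# All integers: pandas picks an integer dtype
numbers = pd.Series([10, 20, 30])

# Mixed Python types: pandas stores generic Python objects
mixed = pd.Series([10, 'twenty', 30.0])

print(numbers.dtype)
print(mixed.dtype)

# Each element keeps its original Python type
second_item = mixed.iloc[1]
print(type(second_item))
```

Object Series are flexible, but numeric operations on them are generally slower than on a Series with a numeric dtype.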

### Using an Index

The key to using a Series is understanding its index. Pandas uses these index names or numbers to allow fast lookups of information (the index works like a hash table or dictionary).

Let's see some examples of how to grab information from a Series. We will first create two Series, myseries and myseries2, and later two more, ser1 and ser2:

In [42]:
myseries = pd.Series(my_list, labels)

In [43]:
myseries

a    10
b    20
c    30
dtype: int64

In [44]:
myseries2 = pd.Series(arr)

In [45]:
myseries2

0    10
1    20
2    30
dtype: int32

In [49]:
# This Series doesn't have any labels associated with it, so we can access values using the default index numbers.
myseries2[1]

20

In [34]:
# Index labels in a pandas Series work just like hash table or dictionary keys.
myseries['a']

10
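
The square-bracket lookups above can also be written with the explicit `.loc` (label-based) and `.iloc` (position-based) accessors. These accessors are not used in the original notebook; this is a small sketch of how they relate to the examples, with illustrative variable names.

```python
import pandas as pd

myseries = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

by_label = myseries.loc['b']     # explicit label-based lookup
by_position = myseries.iloc[1]   # explicit position-based lookup

# Membership tests check the index labels, like dictionary keys
has_a = 'a' in myseries
has_z = 'z' in myseries

print(by_label, by_position, has_a, has_z)
```

Being explicit about label versus position avoids ambiguity when the index itself consists of integers.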

In [35]:
ser1 = pd.Series(data=[1,2,3,4],index=['USA', 'Germany','USSR', 'Japan'])

In [36]:
ser1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [37]:
ser2 = pd.Series(data=[1,2,5,4],index=['USA', 'Germany','Italy', 'Japan'])

In [38]:
ser2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

In [40]:
ser1['USA']

1

In [41]:
ser2['Italy']

5

Operations are also performed based on the index:

In [50]:
# When doing operations, pandas tries to match up the data points based on index labels/index numbers.
# Labels present in only one Series produce NaN, which is why the result is float64.
ser1 + ser2

Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN
dtype: float64
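
If the NaN entries for unmatched labels are not wanted, one option is the `Series.add` method with `fill_value`. This is not part of the original notebook; the sketch below assumes that treating a missing label as 0 is acceptable for your data.

```python
import pandas as pd

ser1 = pd.Series(data=[1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])
ser2 = pd.Series(data=[1, 2, 5, 4], index=['USA', 'Germany', 'Italy', 'Japan'])

# Plain addition: labels found in only one Series become NaN
total = ser1 + ser2
missing_labels = sorted(total[total.isna()].index)
print(missing_labels)

# Treat a label missing from one Series as 0 before adding
filled = ser1.add(ser2, fill_value=0)
print(filled)
```

With `fill_value=0`, USSR and Italy keep the value from the one Series in which they appear instead of becoming NaN.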

Let's stop here for now and move on to DataFrames, which will expand on the concept of Series!
# Great Job!