## Introduction to Pandas Series

_Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language._

_Its two main data structures are_ __DataFrame__ and __Series__.

In this notebook, we'll be looking into __Pandas Series__

Note: It's highly recommended to go through my tutorial on __Pandas DataFrame__ before continuing with this one. It will give you a better understanding of __Pandas Series__.

### What is Pandas Series?

Technically, Pandas Series is a one-dimensional labeled array capable of holding any data type. 

In layman's terms, a Pandas Series is like a single column in an Excel sheet. In the picture below, the __Name__, __Age__ and __Designation__ columns each represent a Series.

_So, in terms of Pandas data structures, a Series represents a single column in memory, which either stands alone or belongs to a Pandas DataFrame._

![alt text](excel_sheet.png "Title")

### Pandas Series are Independent of DataFrame

A Series can exist on its own, without a DataFrame. We can create a Series (i.e. a 1-D column array) and use it directly for data analysis.

### How to use Pandas Series?

In [None]:
# The logistics
import pandas as pd
import numpy as np

A Pandas Series can be created from a Python list or a NumPy array. Unlike a Python list, a Series has a single dtype for all of its values; when the values are of mixed types, pandas falls back to the generic `object` dtype. That's why a NumPy array, which is also homogeneous, is a natural candidate for creating a Pandas Series.

In [None]:
series_list = pd.Series([1,2,3,4,5,6])
series_np = pd.Series(np.array([10,20,30,40,50,60]))

In [None]:
series_list

In [None]:
series_np
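As a minimal sketch of the single-dtype rule, the snippet below builds three small Series and inspects their `dtype`. The variable names and sample values are made up for illustration.

```python
import pandas as pd

# Only integers: pandas picks an integer dtype
ints = pd.Series([1, 2, 3])

# Integers mixed with a float: every value is upcast to float64
mixed_numeric = pd.Series([1, 2.5, 3])

# Numbers mixed with a string: pandas falls back to the generic object dtype
mixed_object = pd.Series([1, "two", 3.0])

print(ints.dtype)
print(mixed_numeric.dtype)
print(mixed_object.dtype)
```

In all three cases the Series reports exactly one dtype, even when the input list was heterogeneous.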

As with a Pandas DataFrame, a Series also gets a default row index (0, 1, 2, ...). We can replace this default index with a sequence of our choice, numeric or otherwise.

In [None]:
series_index = pd.Series(np.array([10,20,30,40,50,60]), index=np.arange(0,12,2))

In [None]:
series_index

In [None]:
series_index = pd.Series(np.array([10,20,30,40,50,60]), index=['a', 'b', 'c', 'd', 'e', 'f' ] )

In [None]:
series_index

We can print the index separately as

In [None]:
series_index.index
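One reason to set a custom index is that values can then be looked up by label. The sketch below uses a small, made-up Series to contrast label-based access (`.loc`) with position-based access (`.iloc`).

```python
import pandas as pd

# Hypothetical sample data with string labels as the index
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = s.loc["b"]        # look up by index label
by_position = s.iloc[1]      # look up by integer position
subset = s.loc[["a", "c"]]   # several labels at once returns a Series

print(by_label, by_position)
print(subset)
```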

### Pandas Series from python Dictionary

As we saw when creating a Pandas DataFrame, it is easy to build a DataFrame from a Python dictionary: the __keys__ map to column headers, while the __values__ supply the column values.

__So how does it map while creating the Pandas Series?__

If we create a Series from a python dictionary, the __key__ becomes the row index while the __value__ becomes the value at that row index.


In [None]:
t_dict = {'a' : 1, 'b': 2, 'c':3}

series_dict = pd.Series(t_dict)

In [None]:
series_dict
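The key-to-index mapping can be checked directly. This is a small sketch with a made-up dictionary; it assumes a Python version where dictionaries keep insertion order (3.7+).

```python
import pandas as pd

# Hypothetical data: dictionary keys will become the row index
prices = {"apple": 3, "banana": 1, "cherry": 7}
s = pd.Series(prices)

print(list(s.index))   # the dictionary keys, in insertion order
print(s["banana"])     # the value stored under the 'banana' label
```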

So far so good, but what if a value in the dictionary is a list? In that case, the whole list is stored as a single value against that row index.

In [None]:
t_dict = {'a' : [1,2,3], 'b': [4,5], 'c':6, 'd': "Hello World"}
series_dict = pd.Series(t_dict)

In [None]:
series_dict
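To confirm that a list is kept as one element rather than being expanded, we can inspect the element and the dtype. A minimal sketch with illustrative data:

```python
import pandas as pd

# Two of the values are lists, one is a plain integer
nested = pd.Series({"a": [1, 2, 3], "b": [4, 5], "c": 6})

print(len(nested))          # one row per dictionary key
print(type(nested["a"]))    # the element itself is still a Python list
print(nested.dtype)         # mixed Python objects are stored as object dtype
```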

### Extracting Series from a DataFrame

Though a Pandas Series is extremely useful by itself, most of the time we'll end up using DataFrame and Series together.

As a Series represents a 1-D column array, we can use it to extract individual columns of a Pandas DataFrame. Here is what the code looks like.

Let's create a Pandas DataFrame first


In [None]:
my_dict = {'name': ["a", "b", "c", "d", "e"],
           'age': [10, 20, 30, 40, 50],
           'designation': ["CEO", "VP", "SVP", "AM", "DEV"]}
df = pd.DataFrame(my_dict, index=["First -> ", "Second -> ", "Third -> ", "Fourth -> ", "Fifth -> "])


In [None]:
df

In [None]:
df.name

In [None]:
df.age

In [None]:
df.designation

In [None]:
df['name']

__Each column is nothing but a Series__

In [None]:
# Confirm that a single column is a pandas Series
type(df['name'])

In [None]:
series_name = df.name
series_age = df.age
series_designation = df.designation

In [None]:
series_name

In [None]:
series_age

In [None]:
series_designation
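A column pulled out of a DataFrame keeps a link to where it came from: its `name` is the column header and its index is the DataFrame's row index. The sketch below checks this on a small, made-up DataFrame.

```python
import pandas as pd

# Hypothetical DataFrame with a custom row index
people = pd.DataFrame(
    {"name": ["a", "b", "c"], "age": [10, 20, 30]},
    index=["first", "second", "third"],
)

age = people["age"]

print(type(age))                        # a pandas Series
print(age.name)                         # the column header
print(age.index.equals(people.index))   # same row labels as the DataFrame
```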

### Getting Series by Iterating through a DataFrame

It's possible to iterate through the DataFrame and get every column as a Series

In [None]:
series_col = []
for col_name in df.columns:
    series_col.append(df[col_name])    
    
print(series_col)
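An alternative sketch uses `DataFrame.items()`, which yields each column as a `(column name, Series)` pair, so the name and the data arrive together. The DataFrame here is a small stand-in for illustration.

```python
import pandas as pd

# Hypothetical two-column DataFrame
people = pd.DataFrame({"name": ["a", "b"], "age": [10, 20]})

# Collect every column as a Series, keyed by its column name
columns = {}
for col_name, col_series in people.items():
    columns[col_name] = col_series

print(list(columns))
print(columns["age"])
```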

### Creating DataFrame using Series (Standalone or Combination)

A Pandas DataFrame is nothing but a collection of one or more Series. We can generate a DataFrame from a single Series or by combining multiple Series.

For example, let's generate a DataFrame by combining `series_name` and `series_age`

In [None]:
series_name

In [None]:
series_age

In [None]:
df_from_series = pd.DataFrame([series_name, series_age])

In [None]:
df_from_series

_The difference here is that each Series becomes a row: the row index labels of the Series turn into column headers, and the Series names turn into the row index.
This is similar to the transpose of a matrix._

_NOTE: This does NOT apply if we create a DataFrame directly from a single Series_

In [None]:
df_from_series_single = pd.DataFrame(series_name)
df_from_series_single

_However, things change if we wrap the Series in a list before passing it to DataFrame_

In [None]:
df_from_series_single = pd.DataFrame([series_name])
df_from_series_single

_The same behaviour is observed when we pass a list of Python dictionaries to create a DataFrame: each dictionary becomes a row_

In [None]:
t_dict

In [None]:
ds = pd.DataFrame([t_dict])
ds

In [None]:
ds = pd.DataFrame([t_dict, t_dict ], index=[1,2])

In [None]:
ds
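When the Series should stay as columns rather than become rows, a few options are available. The sketch below compares them on two small, made-up Series; which one fits best depends on the situation, and this is only one way to lay it out.

```python
import pandas as pd

# Hypothetical named Series sharing the default 0..2 index
names = pd.Series(["a", "b", "c"], name="name")
ages = pd.Series([10, 20, 30], name="age")

# A list of Series: each Series becomes a row
as_rows = pd.DataFrame([names, ages])

# Option 1: concatenate along the column axis
via_concat = pd.concat([names, ages], axis=1)

# Option 2: pass a dictionary mapping column name to Series
via_dict = pd.DataFrame({"name": names, "age": ages})

# Option 3: transpose the row-wise DataFrame
via_transpose = as_rows.T

print(as_rows.shape)       # rows = number of Series
print(via_concat.shape)    # columns = number of Series
print(via_concat.dtypes)
```

Options 1 and 2 keep the dtype of each Series, whereas transposing a row-wise DataFrame of mixed types tends to leave the columns as `object`.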

### Series from a Scalar

A Series can also be created from a scalar value:

In [None]:
series_scalar = pd.Series(10)
series_scalar
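If an `index` is supplied along with the scalar, pandas repeats the scalar for every index label. A short sketch with illustrative labels:

```python
import pandas as pd

# Without an index: a Series holding a single value
single = pd.Series(10)

# With an index: the scalar is repeated once per label
repeated = pd.Series(10, index=["a", "b", "c"])

print(len(single))
print(repeated)
```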

### Series Helper Functions

Just like a Pandas DataFrame, a Series comes with a rich set of helper functions. Note that the column helper functions of a Pandas DataFrame generally work on a Pandas Series as well.

In [None]:
# Getting the mean of a Series
series_age.mean()

In [None]:
# Getting the size of the Series
series_age.size

In [None]:
# Getting all unique items in a series
series_designation.unique()

In [None]:
# Getting a python list out of a Series
series_name.tolist()
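A few more helpers in the same spirit are sketched below on made-up data: `value_counts()` and `nunique()` for categorical values, and `min()`, `max()` and `median()` for numeric ones.

```python
import pandas as pd

# Hypothetical sample data with a repeated designation
roles = pd.Series(["DEV", "VP", "DEV", "CEO", "DEV"])
ages = pd.Series([10, 20, 30, 40, 50])

# How often each distinct value occurs
counts = roles.value_counts()
print(counts)

# Number of distinct values
print(roles.nunique())

# Basic numeric summaries
print(ages.min(), ages.max(), ages.median())
```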

### Iterating over series items

Like most Python containers, a Series is iterable, so we can loop over its values directly:

In [None]:
for value in series_name:
    print(value)

We can also iterate through row index numbers as 

In [None]:
for idx in series_name.index:
    print(idx)
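To get the index label and the value together in one loop, `Series.items()` yields `(label, value)` pairs. A minimal sketch on a small, made-up Series:

```python
import pandas as pd

# Hypothetical Series with string labels
s = pd.Series(["a", "b", "c"], index=["first", "second", "third"])

# Collect (label, value) pairs while iterating
pairs = []
for label, value in s.items():
    pairs.append((label, value))
    print(label, value)
```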

That covers the basic usage of Pandas Series.