# Series

The first main pandas data type we will learn about is the Series. Let's import pandas and explore the Series object.

A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of just a numeric position. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.

Let's explore this concept through some examples:

In [1]:
import numpy as np

In [3]:
import pandas as pd

### Creating a Series

You can convert a list, NumPy array, or dictionary to a Series:

In [4]:
labels = ['a', 'b', 'c']
my_data = [10,20,30]
arr = np.array(my_data)
d = {'a':10, 'b':20, 'c':30}

In [5]:
labels #List of labels

['a', 'b', 'c']

In [6]:
my_data  #A simple list

[10, 20, 30]

In [7]:
arr   #Array

array([10, 20, 30])

In [8]:
d  #Dictionary

{'a': 10, 'b': 20, 'c': 30}

**Using Lists**

In [9]:
pd.Series(data = my_data)

0    10
1    20
2    30
dtype: int64

In [10]:
#We can replace the default indexes (i.e. 0, 1, 2) with labels, giving us a label-indexed Series
pd.Series(data=my_data,index=labels) 

a    10
b    20
c    30
dtype: int64

In [11]:
#It can also be written like this: data and index can be passed positionally, so we don't need to spell out data=my_data, index=labels every time
pd.Series(my_data,labels)  

a    10
b    20
c    30
dtype: int64

**Using NumPy Arrays**

In [12]:
pd.Series(arr) #Passing a numpy array would convert it to a pandas series

0    10
1    20
2    30
dtype: int32

In [13]:
pd.Series(arr,labels) #Passing a NumPy array together with labels to use as the index

a    10
b    20
c    30
dtype: int32

**Using a Dictionary**

In [14]:
#Passing a dictionary: the keys become the index and the values become the data
pd.Series(d) 

a    10
b    20
c    30
dtype: int64
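
As a small sketch that goes one step beyond the cell above: when a dictionary is passed together with an explicit `index`, pandas looks up each requested label among the dictionary keys, in the order given. The label `'z'` is made up for illustration; because it is not a key in `d`, its value comes back as NaN.

```python
import pandas as pd

d = {'a': 10, 'b': 20, 'c': 30}

# Keys are matched against the requested index labels, in the order given.
# 'z' is a hypothetical label that is not a key in d, so its value is NaN.
s = pd.Series(d, index=['c', 'a', 'z'])
print(s)
```

Note that the key `'b'` is dropped here, since it was not requested in `index`.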

### Data in a Series

A pandas Series can hold a variety of object types:

In [15]:
labels

['a', 'b', 'c']

In [16]:
pd.Series(data=labels)

0    a
1    b
2    c
dtype: object
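
A minimal sketch of the same idea with mixed types (the values below are arbitrary examples, not from the original notebook): when the elements do not share a single numeric type, pandas falls back to the generic `object` dtype and keeps each Python object as it is.

```python
import pandas as pd

# An integer, a string, and a float stored side by side
mixed = pd.Series([1, 'two', 3.0])

print(mixed.dtype)
# Each element keeps its original Python type
print([type(value).__name__ for value in mixed])
```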

##### We can even store functions as data points in a pandas Series. We would probably never need this, but let's see how it works:

In [21]:
pd.Series(data=[sum,print,len]) 

0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object

## Using an Index

#### How to grab information using the index of a pandas Series

In [22]:
ser1 = pd.Series([1,2,3,4],['USA', 'Germany', 'USSR', 'Japan'])

In [23]:
ser1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [25]:
ser2 = pd.Series([1,2,5,4],['USA', 'Germany', 'Italy', 'Japan'])

In [26]:
ser2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

In [28]:
ser1['USA']

1

In [29]:
labels

['a', 'b', 'c']

In [30]:
ser3 = pd.Series(data=labels)

In [31]:
ser3

0    a
1    b
2    c
dtype: object

In [32]:
ser3[0]

'a'
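
Square brackets look up by index label in the cells above (`'USA'` for `ser1`, the default integer label `0` for `ser3`). As a hedged sketch, the explicit accessors `.loc` (by label) and `.iloc` (by integer position) make the intent unambiguous. The variable names `by_label`, `by_position`, and `several` are only for illustration.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], ['USA', 'Germany', 'USSR', 'Japan'])

by_label = ser1.loc['Germany']         # look up by index label
by_position = ser1.iloc[1]             # look up by 0-based integer position
several = ser1.loc[['USA', 'Japan']]   # a list of labels returns a Series

print(by_label, by_position)
print(several)
```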

In [33]:
ser1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [34]:
ser2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

#### Operations are also aligned on the index:
When we add two pandas Series, the values are added for every index label that appears in both. For labels that appear in only one of the two Series, the result is NaN.

In [36]:
ser1+ser2

Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN
dtype: float64
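
If NaN is not what you want for labels that appear in only one Series, one option is the `Series.add` method with `fill_value`, which substitutes a default for the missing side before adding. This is a sketch of an alternative, not something the notebook itself uses; the name `total` is illustrative.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], ['USA', 'Germany', 'USSR', 'Japan'])
ser2 = pd.Series([1, 2, 5, 4], ['USA', 'Germany', 'Italy', 'Japan'])

# Labels present in only one Series are treated as 0 on the other side,
# so 'Italy' and 'USSR' keep their own values instead of becoming NaN.
total = ser1.add(ser2, fill_value=0)
print(total)
```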

#### Note: The result above is float64 because NaN is a floating-point value, so pandas converts the integers to float when index alignment introduces missing values. Arithmetic between integer Series whose labels all match keeps the integer dtype.

#### In this course, we will not be working much with Series. Our main focus will be on DataFrames, but working with DataFrames requires some basics of Series.