# Series

**The first main data type we will learn about for pandas is the Series data type. Let's import Pandas and explore the Series object.**

A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of just a numeric position. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.

Let's explore this concept through some examples:

In [22]:
# Series is the first main data type we'll work with in pandas; from here we'll build towards working
# with DataFrames in the next notebook.
import numpy as np
import pandas as pd

In [23]:
labels = ['a','b','c']
my_data = [10,20,30]
arr = np.array(my_data)
d = {'a':10,'b':20,'c':30}

In [24]:
print(labels)
print(my_data)
print(arr)
print(d)

['a', 'b', 'c']
[10, 20, 30]
[10 20 30]
{'a': 10, 'b': 20, 'c': 30}


In [25]:
pd.Series(data=my_data)  # Series takes a wide variety of parameters; the most important are data and index.

0    10
1    20
2    30
dtype: int64

In [26]:
pd.Series(data=my_data, index=labels)  # Series with a labelled index.

a    10
b    20
c    30
dtype: int64

In [27]:
pd.Series(my_data, labels)  # Same result: data and index are the first two positional parameters.

a    10
b    20
c    30
dtype: int64

In [28]:
pd.Series(arr, labels)  # You can pass a NumPy array as well as a Python list; the Series keeps the array's dtype, which may be int32 on some platforms.

a    10
b    20
c    30
dtype: int32

In [29]:
pd.Series(d)  # Passing a dictionary automatically uses its keys as the index labels and its values as the data.

a    10
b    20
c    30
dtype: int64

In [30]:
# A Series is not limited to numeric data; here it holds strings (shown as dtype object in the output below).
pd.Series(data=labels)

0    a
1    b
2    c
dtype: object
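
To illustrate the point that a Series can hold arbitrary Python objects, here is a minimal sketch. The variable name `mixed` and the index labels are made up for illustration and do not appear elsewhere in this notebook.

```python
import pandas as pd

# Mixed Python objects (int, str, float, a built-in function) in one Series.
# pandas falls back to the generic object dtype when the values have no common type.
mixed = pd.Series([1, 'two', 3.0, len], index=['w', 'x', 'y', 'z'])

print(mixed.dtype)

# The stored function is retrieved by its label and can be called like any other object.
print(mixed['z']('abc'))
```

Object-dtype Series are flexible, but they generally lose the speed of vectorised numeric operations, so they are best kept for genuinely heterogeneous data.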

# Index

In [31]:
# The key to using a Series is understanding its index. pandas uses these index names or numbers to
# allow very fast lookup of information, much like a hash table or dictionary.
ww1 = pd.Series([1,2,3,4],['USA','Germany','USSR','Japan'])

In [32]:
ww1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [35]:
ww2 = pd.Series([1,2,5,4],['USA','Germany','Italy','Japan'])

In [37]:
ww2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

* Grabbing information out of a Series is similar to getting a value from a Python dictionary.
* We pass 'USA' as a string because the index labels are strings. If the Series has integer indices such as 0, 1, 2, we pass an integer instead. It depends on the data type of the index.
* Usually the index is either integer or string.
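
As a hedged aside, pandas also offers the explicit accessors `.loc` (label-based) and `.iloc` (position-based), which this notebook does not otherwise use. The sketch below rebuilds `ww1` so it runs on its own; the names `by_label`, `by_position` and `has_italy` are illustrative only.

```python
import pandas as pd

ww1 = pd.Series([1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])

by_label = ww1.loc['Germany']    # look up by index label
by_position = ww1.iloc[1]        # look up by integer position (second element)
has_italy = 'Italy' in ww1       # membership test checks the index labels, like a dict's keys

print(by_label, by_position, has_italy)
```

Using `.loc` and `.iloc` makes the intent explicit, which helps when the index itself consists of integers and plain square brackets could be ambiguous.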


In [38]:
ww1['USA']

1

In [40]:
ww3 = pd.Series(data=labels)  # Using the labels list as the data

In [41]:
ww3

0    a
1    b
2    c
dtype: object

In [42]:
# Grabbing info out of a Series with an integer index is similar to indexing a NumPy array.
ww3[1]

'b'

# Basic Operation on Series

In [44]:
ww1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [45]:
ww2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

**Addition of ww1 and ww2**

The command below matches up the operation based on the index: USA 1 from ww1 and USA 1 from ww2 give USA 2. Where no match is found, as with USSR and Italy, which are not common to both Series, the result is NaN, meaning Not a Number.


In [50]:
ww1+ww2

Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN
dtype: float64
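
If NaN for unmatched labels is not what you want, one option is `Series.add` with its `fill_value` parameter, which treats a label missing from one side as the given value. This is a sketch of an alternative, not something the notebook itself does; `total` is an illustrative name.

```python
import pandas as pd

ww1 = pd.Series([1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])
ww2 = pd.Series([1, 2, 5, 4], index=['USA', 'Germany', 'Italy', 'Japan'])

# Labels present in only one Series are treated as 0 on the other side,
# so USSR and Italy keep their own values instead of becoming NaN.
total = ww1.add(ww2, fill_value=0)
print(total)
```

Whether filling with 0 is appropriate depends on the data: it assumes a missing label really means "nothing to add" rather than "unknown".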

# Note

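**>> When an operation on pandas objects introduces missing values (NaN), as in the addition above, integers are converted to floats. NaN is a floating-point value that an integer dtype cannot represent, so the conversion ensures no information is lost.**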


**>> A Series has a label index and a data point for each label.**
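
The integer-to-float behaviour described in the note can be checked with a small sketch. The Series `a`, `b` and `c` below are made-up examples: when the indexes match exactly the result keeps an integer dtype, and when they differ the NaN values force a float dtype.

```python
import pandas as pd

a = pd.Series([1, 2], index=['x', 'y'])
b = pd.Series([10, 20], index=['x', 'y'])   # same labels as a
c = pd.Series([10, 20], index=['x', 'z'])   # 'z' is not in a, 'y' is not in c

same = a + b   # every label matches, so no NaN is needed
diff = a + c   # 'y' and 'z' have no partner and become NaN

print(same.dtype)   # an integer dtype
print(diff.dtype)   # a float dtype
```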