## Series

A Series is a one-dimensional pandas object that stores a single column of data. You can think of it as one column extracted from a DataFrame.

A Series is very similar to a NumPy array; the main difference is that it carries an index label for each observation.
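A minimal sketch of that difference, using made-up values and labels (the names `arr` and `s` are illustrative only and do not appear elsewhere in this notebook):

```python
import numpy as np
import pandas as pd

# A plain NumPy array is addressed by integer position only.
arr = np.array([10, 20, 30])

# A Series holds the same values plus an index label for each one.
s = pd.Series(arr, index=['x', 'y', 'z'])

print(arr[1])     # access by position
print(s['y'])     # access by label
print(s.iloc[1])  # positional access is still available through .iloc
```

All three lines print the same element; only the way it is addressed differs.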

In [2]:
import numpy as np
import pandas as pd
import random

Relationship between a series and a dataframe

If you extract any single column from a DataFrame, the resulting object is a Series.

In [5]:
df = pd.DataFrame(np.random.randint(1, 100, (5,4)), columns = list('abcd'))
df

    a   b   c   d
0  91  99  70  47
1  68  40  69  93
2  25  91  95  38
3  40  30  93  25
4   7  64  46  81


In [6]:
df['a']

0    91
1    68
2    25
3    40
4     7
Name: a, dtype: int32

In [7]:
type(df['a'])

pandas.core.series.Series

You can slice the resulting Series to get specific elements

In [8]:
df['a'][0:3]

0    91
1    68
2    25
Name: a, dtype: int32

To get the underlying NumPy array, use .values

In [9]:
df['a'][0:3].values

array([91, 68, 25])

You can further convert it to a list

In [10]:
df['a'][0:3].values.tolist()

[91, 68, 25]
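As a hedged aside, the same conversions can be written with `Series.to_numpy()` and `Series.tolist()`, which skips the intermediate `.values` step. The sketch below uses a small hand-built Series (`col` is a hypothetical stand-in for `df['a'][0:3]`, since the DataFrame above holds random values):

```python
import numpy as np
import pandas as pd

# Stand-in for df['a'][0:3]; the values are copied from the output above.
col = pd.Series([91, 68, 25], name='a')

as_array = col.to_numpy()  # NumPy array of the values
as_list = col.tolist()     # plain Python list, no intermediate array needed

print(type(as_array), as_array)
print(type(as_list), as_list)
```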

Creating a standalone series object

In [11]:
data = np.arange(10)
index = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

ser = pd.Series(data = data, name = 'numbers') #name is optional
ser

# if you don't provide an index, pandas creates a default integer index starting from 0

0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
Name: numbers, dtype: int32

In [12]:
#providing index

ser = pd.Series(data = data, index = index, name = 'numbers') #name is optional
ser

a    0
b    1
c    2
d    3
e    4
f    5
g    6
h    7
i    8
j    9
Name: numbers, dtype: int32

In [13]:
type(ser)

pandas.core.series.Series

In [14]:
ser*2 # arithmetic on a Series is vectorized: every element is multiplied by 2
# by contrast, multiplying a list by 2 repeats the list twice

a     0
b     2
c     4
d     6
e     8
f    10
g    12
h    14
i    16
j    18
Name: numbers, dtype: int32
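To make the contrast with a list concrete, here is a small sketch with illustrative data (`values` and `ser_small` are names chosen for this example only):

```python
import pandas as pd

values = [0, 1, 2]             # illustrative data
ser_small = pd.Series(values)

doubled_list = values * 2        # list repetition
doubled_series = ser_small * 2   # element-wise multiplication

print(doubled_list)              # [0, 1, 2, 0, 1, 2]
print(doubled_series.tolist())   # [0, 2, 4]
```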

How to extract an item from a Series

In [15]:
ser['b'] # to get a particular element, put its index label in square brackets

1

To extract more than one item, put the labels in a list and pass that list inside the square brackets

In [None]:
# this won't work: a Series is one-dimensional, so it accepts a single key, and 'a', 'b' is read as one tuple key that is not in the index

#ser['a', 'b']

So wrap all the labels in an extra pair of square brackets, i.e. pass them as a list

In [16]:
ser[['a', 'b']]

a    0
b    1
Name: numbers, dtype: int32
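The sketch below puts the working and the failing form side by side on a small illustrative Series (`ser_demo` is a hypothetical name). The exact exception raised for the tuple key has varied between pandas versions, so both `KeyError` and `ValueError` are caught here as an assumption:

```python
import pandas as pd

ser_demo = pd.Series([0, 1, 2], index=['a', 'b', 'c'], name='numbers')

# A list of labels returns a Series containing just those labels.
subset = ser_demo[['a', 'b']]
print(subset)

# Without the inner brackets, 'a', 'b' is a single tuple key ('a', 'b').
failed = False
try:
    ser_demo['a', 'b']
except (KeyError, ValueError) as exc:
    failed = True
    print(type(exc).__name__, exc)
```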

You can extract index as well

In [17]:
#method 1
ser.index

Index(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'], dtype='object')

In [18]:
#method 2
ser.keys()

Index(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'], dtype='object')

Also, if you extract a single column from a DataFrame, it becomes a Series, so you can think of a DataFrame as a column-wise arrangement of Series.
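One way to see this column-wise view in the other direction is to assemble a DataFrame from individual Series. This is a minimal sketch with made-up data; `col_a`, `col_b` and `frame` are illustrative names:

```python
import pandas as pd

col_a = pd.Series([1, 2, 3], name='a')
col_b = pd.Series([4, 5, 6], name='b')

# pd.concat with axis=1 places the Series side by side as columns,
# using each Series' name as the column label.
frame = pd.concat([col_a, col_b], axis=1)

print(frame)
print(type(frame['a']))  # each column comes back out as a Series
```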

You can create a Series from a dict as well; the dict keys become the index labels

In [26]:
d1 = {'a' : 0, 'b' : 1, 'c' : 3}
d2 = {'b' : 0, 'c' : 1, 'd' : 3}

In [27]:
ser1 = pd.Series(d1)
ser2 = pd.Series(d2)
ser1

a    0
b    1
c    3
dtype: int64

In [28]:
ser2

b    0
c    1
d    3
dtype: int64

Addition

In [29]:
ser1 + ser2
# the two Series are aligned on their index labels; labels present in only one of them give NaN

a    NaN
b    1.0
c    4.0
d    NaN
dtype: float64

To avoid NaN, pass fill_value so that a value missing from one Series is treated as zero in the computation,
i.e. 0 is used for ser2 at index a and for ser1 at index d

In [30]:
ser1.add(ser2, fill_value = 0)

a    0.0
b    1.0
c    4.0
d    3.0
dtype: float64
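The other arithmetic methods accept `fill_value` in the same way. As a small sketch, the example below rebuilds the two Series under new names (`s1`, `s2`, chosen here so the cell is self-contained) and compares plain `+` with `add` and `sub`:

```python
import pandas as pd

s1 = pd.Series({'a': 0, 'b': 1, 'c': 3})
s2 = pd.Series({'b': 0, 'c': 1, 'd': 3})

plain = s1 + s2                     # NaN where a label exists on one side only
filled = s1.add(s2, fill_value=0)   # missing side treated as 0 before adding
diff = s1.sub(s2, fill_value=0)     # same idea for subtraction

print(plain.isna().tolist())  # [True, False, False, True]
print(filled.tolist())        # [0.0, 1.0, 4.0, 3.0]
print(diff.tolist())          # [0.0, 1.0, 2.0, -3.0]
```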