Let's import the NumPy and Pandas libraries

In [1]:
import numpy as np  # we're using this to use np.nan; missing values in Pandas objects are represented by np.nan
import pandas as pd

Imagine you have gone abroad and tracked your spending in dollars on a monthly basis. You want to convert the amounts to rupees and sort the months from highest to lowest expenditure.

In [2]:
monthly_expenses_in_dollars = pd.Series(data = [700, 550, 480, 625, 670, 515], index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'])
monthly_expenses_in_dollars

Jan    700
Feb    550
Mar    480
Apr    625
May    670
Jun    515
dtype: int64

In [3]:
monthly_expenses_in_rupees = monthly_expenses_in_dollars.mul(84)  # assuming a conversion rate of Rs. 84 to the dollar
monthly_expenses_in_rupees

Jan    58800
Feb    46200
Mar    40320
Apr    52500
May    56280
Jun    43260
dtype: int64

In [4]:
monthly_expenses_in_rupees_descending = monthly_expenses_in_rupees.sort_values(ascending = False)
monthly_expenses_in_rupees_descending

Jan    58800
May    56280
Apr    52500
Feb    46200
Jun    43260
Mar    40320
dtype: int64

Can we achieve this in a single line of code? Yes, methods on Series can be chained.

In [5]:
answer = monthly_expenses_in_dollars.mul(84).sort_values(ascending = False)
answer

Jan    58800
May    56280
Apr    52500
Feb    46200
Jun    43260
Mar    40320
dtype: int64
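The chained call above uses `.mul()`; the `*` operator gives the same numbers. Here is a minimal sketch comparing the two, assuming the same data and the same illustrative rate of 84. The names `rate`, `via_method`, `via_operator`, `same_result`, and `chained` are introduced only for this example.

```python
import pandas as pd

# Same data as the example above.
monthly_expenses_in_dollars = pd.Series(
    data=[700, 550, 480, 625, 670, 515],
    index=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'],
)

rate = 84  # assumed conversion rate, as above

via_method = monthly_expenses_in_dollars.mul(rate)
via_operator = monthly_expenses_in_dollars * rate

# Both approaches produce the same Series.
same_result = via_method.equals(via_operator)
print(same_result)

# The operator form needs parentheses before another method can be chained.
chained = (monthly_expenses_in_dollars * rate).sort_values(ascending=False)
print(chained)
```

The method form tends to read more naturally inside a chain, since no extra parentheses are needed.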

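You can chain as many methods as you'd like, as long as each method returns a new Series. For that reason, avoid `inplace = True` anywhere in a chain: such a call typically returns `None`, and even as the final method it would only modify a temporary intermediate object, so the result is lost.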
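To see why `inplace = True` does not mix with chaining, here is a small sketch using `sort_values()`. The Series `expenses` and the names `chained` and `lost` are made up for this illustration, and the behaviour shown is what `sort_values()` does with `inplace=True`; other methods may differ.

```python
import pandas as pd

expenses = pd.Series([550, 700, 480], index=['Jan', 'Feb', 'Mar'])

# Without inplace, each call returns a new Series, so the chain can continue.
chained = expenses.mul(84).sort_values(ascending=False)

# With inplace=True, sort_values() returns None and sorts only the temporary
# Series produced by mul(); nothing usable is left to assign or chain on.
lost = expenses.mul(84).sort_values(ascending=False, inplace=True)

print(chained.index.tolist())
print(lost)
print(expenses.tolist())  # the original Series is unchanged
```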

In [6]:
my_series = pd.Series([10, 20, np.nan, 40, 50])
my_series

0    10.0
1    20.0
2     NaN
3    40.0
4    50.0
dtype: float64

In [7]:
s1 = my_series.fillna(0)
s1

0    10.0
1    20.0
2     0.0
3    40.0
4    50.0
dtype: float64

In [8]:
s2 = s1.loc[s1 > 10]
s2

1    20.0
3    40.0
4    50.0
dtype: float64

In [10]:
s3 = s2.mul(3)
s3

1     60.0
3    120.0
4    150.0
dtype: float64

In [11]:
s4 = s3.sort_values(ascending = False)
s4

4    150.0
3    120.0
1     60.0
dtype: float64

In [12]:
s5 = s4.reset_index(drop = True)
s5

0    150.0
1    120.0
2     60.0
dtype: float64

In [13]:
# All these steps can be achieved in a single line of code.
# The lambda receives the filled Series, so the filter matches the s1 > 10 step above.
my_series.fillna(0).loc[lambda s: s > 10].mul(3).sort_values(ascending = False).reset_index(drop = True)

0    150.0
1    120.0
2     60.0
dtype: float64
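Long chains are often easier to read when wrapped in parentheses with one method per line. The sketch below also shows why the filter inside a chain is safer as a callable passed to `.loc`: the callable is evaluated on the intermediate Series, whereas a mask built from `my_series` still reflects the original, unfilled values. The fill value of 100 is an assumption chosen only to make the difference visible.

```python
import numpy as np
import pandas as pd

my_series = pd.Series([10, 20, np.nan, 40, 50])

# Filter with a callable: s is the filled Series, so the filled value survives.
result = (
    my_series
    .fillna(100)                  # illustrative fill value above the threshold
    .loc[lambda s: s > 10]
    .mul(3)
    .sort_values(ascending=False)
    .reset_index(drop=True)
)

# Filter with a mask built from the original Series: the NaN row compares as
# False, so the filled value is dropped even though it is now 100.
result_outer_mask = (
    my_series
    .fillna(100)
    .loc[my_series > 10]
    .mul(3)
    .sort_values(ascending=False)
    .reset_index(drop=True)
)

print(result.tolist())
print(result_outer_mask.tolist())
```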

Let's look at some Series methods that preview or summarize a Series

In [14]:
# Let's create a Series
data = [125, 113, 118, 124, 125, 120, 120, 118, 117, 118]
stocks = pd.Series(data = data, index = ['Day' + str(i) for i in range(len(data))], name = 'Closing stock price')
stocks

Day0    125
Day1    113
Day2    118
Day3    124
Day4    125
Day5    120
Day6    120
Day7    118
Day8    117
Day9    118
Name: Closing stock price, dtype: int64

In [15]:
stocks.head()

Day0    125
Day1    113
Day2    118
Day3    124
Day4    125
Name: Closing stock price, dtype: int64

In [16]:
stocks.head(n = 3)

Day0    125
Day1    113
Day2    118
Name: Closing stock price, dtype: int64

In [17]:
stocks.tail()

Day5    120
Day6    120
Day7    118
Day8    117
Day9    118
Name: Closing stock price, dtype: int64

In [18]:
stocks.tail(3)

Day7    118
Day8    117
Day9    118
Name: Closing stock price, dtype: int64

In [19]:
stocks.describe()

count     10.00000
mean     119.80000
std        3.88158
min      113.00000
25%      118.00000
50%      119.00000
75%      123.00000
max      125.00000
Name: Closing stock price, dtype: float64
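Each row of the `describe()` output can also be computed on its own. The following sketch rebuilds the same Series (with a default integer index, for brevity) and compares a few individual statistics against the summary; the variable names are chosen for this example only.

```python
import pandas as pd

data = [125, 113, 118, 124, 125, 120, 120, 118, 117, 118]
stocks = pd.Series(data, name='Closing stock price')

summary = stocks.describe()

count = stocks.count()   # number of non-missing values
mean = stocks.mean()
std = stocks.std()       # sample standard deviation (ddof=1), as in describe()
quartiles = stocks.quantile([0.25, 0.5, 0.75])

print(count, mean, round(std, 5))
print(quartiles)
```

This is handy when only one or two of the statistics are needed rather than the whole summary.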

Let's drop duplicate entries from a Series

In [20]:
stocks

Day0    125
Day1    113
Day2    118
Day3    124
Day4    125
Day5    120
Day6    120
Day7    118
Day8    117
Day9    118
Name: Closing stock price, dtype: int64

In [21]:
stocks.duplicated()  # only flags the duplicates as True, not the first appearance of the value

Day0    False
Day1    False
Day2    False
Day3    False
Day4     True
Day5    False
Day6     True
Day7     True
Day8    False
Day9     True
Name: Closing stock price, dtype: bool

In [22]:
stocks.drop_duplicates()  # keeps the first appearance of each value and drops later repeats

Day0    125
Day1    113
Day2    118
Day3    124
Day5    120
Day8    117
Name: Closing stock price, dtype: int64

In [23]:
stocks

Day0    125
Day1    113
Day2    118
Day3    124
Day4    125
Day5    120
Day6    120
Day7    118
Day8    117
Day9    118
Name: Closing stock price, dtype: int64

In [24]:
stocks.drop_duplicates(inplace = True)  # modifies stocks itself instead of returning a new Series

In [25]:
stocks

Day0    125
Day1    113
Day2    118
Day3    124
Day5    120
Day8    117
Name: Closing stock price, dtype: int64
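Both `duplicated()` and `drop_duplicates()` accept a `keep` parameter that controls which appearance counts as the original. The examples above rely on the default, `keep='first'`. Here is a short sketch of the other two options, rebuilding the original Series since `stocks` was modified in place above.

```python
import pandas as pd

data = [125, 113, 118, 124, 125, 120, 120, 118, 117, 118]
stocks = pd.Series(data, index=['Day' + str(i) for i in range(len(data))])

# keep='last' retains the final appearance of each value.
keep_last = stocks.drop_duplicates(keep='last')

# keep=False drops every value that appears more than once.
keep_none = stocks.drop_duplicates(keep=False)

print(keep_last)
print(keep_none)
```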

You can apply a function to every element of a Series using `.apply()` or `.map()`

In [31]:
series = pd.Series([10, 20, 30, 40, 50])
series
series.reset_index(drop = True).idxmin()  # index label of the smallest value (10), which is 0 here

0

In [27]:
series.apply(lambda x: x ** 2)

0     100
1     400
2     900
3    1600
4    2500
dtype: int64

In [28]:
series.map(lambda x: x ** 2)

0     100
1     400
2     900
3    1600
4    2500
dtype: int64
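For simple arithmetic like squaring, the element-wise lambda is not strictly needed: an ordinary vectorized expression gives the same values. This is a sketch for comparison, not a claim about which form the examples above should use; `.apply()` and `.map()` remain useful when the function has no vectorized equivalent.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])

squared_apply = series.apply(lambda x: x ** 2)
squared_map = series.map(lambda x: x ** 2)
squared_vectorized = series ** 2   # operates on the whole Series at once

print(squared_vectorized.tolist())
```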

In [29]:
my_dict = {10: 'Hello', 30: 'World'}
series.map(my_dict)  # .map() can also be used to perform value replacements using a dictionary

0    Hello
1      NaN
2    World
3      NaN
4      NaN
dtype: object
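As the output above shows, values without a dictionary entry become `NaN`. If a default is preferable, one option is to pass a function that looks the value up with `dict.get()`. In this sketch the fallback label `'Other'` is a made-up placeholder, not something from the example above.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])
my_dict = {10: 'Hello', 30: 'World'}

# Plain dictionary mapping: unmatched values become NaN.
mapped = series.map(my_dict)

# Function-based mapping: unmatched values get a default label instead.
mapped_with_default = series.map(lambda x: my_dict.get(x, 'Other'))

print(mapped.isna().tolist())
print(mapped_with_default.tolist())
```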