# Combining multiple operations

We’ve covered different ways to manipulate strings. Some tasks require a series of operations rather than a single one. We could run each operation on a separate line, but the Pandas library lets us chain them into a single expression, which is usually more concise and avoids naming every intermediate result.

![image.png](attachment:d2f4345d-c700-4abc-91b8-e536b4f127aa.png)

We can combine multiple string manipulation operations into a single chained operation. For instance, we can extract the state part from the city column and convert it to lowercase letters in a single line of code.

In [1]:
import pandas as pd

staff = pd.read_csv("staff.csv")

print(staff["city"].str.split(",", expand=True)[1].str.lower())

0     tx
1     ca
2     tx
3     fl
4     ca
5     ga
Name: 1, dtype: object


In [2]:
staff["city"]

0        Houston, TX
1       San Jose, CA
2         Dallas, TX
3          Miami, FL
4    Santa Clara, CA
5        Atlanta, GA
Name: city, dtype: object
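
To make the chaining concrete, here is a minimal sketch that compares the step-by-step version with the chained one. It assumes a small inline DataFrame in place of staff.csv (the city values are copied from the output above), and the names parts, state, state_lower, chained, and clean are illustrative only. It also shows that splitting on "," keeps the space after the comma, which one more link in the chain, str.strip, can remove.

```python
import pandas as pd

# Stand-in for the "city" column of staff.csv (an assumption for this sketch).
staff = pd.DataFrame({
    "city": ["Houston, TX", "San Jose, CA", "Dallas, TX",
             "Miami, FL", "Santa Clara, CA", "Atlanta, GA"]
})

# Step by step: every intermediate result gets its own name.
parts = staff["city"].str.split(",", expand=True)  # DataFrame with columns 0 and 1
state = parts[1]                                   # second piece, e.g. " TX"
state_lower = state.str.lower()                    # e.g. " tx"

# Chained: the same result in a single expression.
chained = staff["city"].str.split(",", expand=True)[1].str.lower()

# The split keeps the space after the comma, so strip it as one more step.
clean = staff["city"].str.split(",", expand=True)[1].str.strip().str.lower()

print(repr(chained[0]))  # ' tx'
print(clean.tolist())    # ['tx', 'ca', 'tx', 'fl', 'ca', 'ga']
```

The printed Series hides the leading space because every value has one, so checking a single value with repr is a quick way to spot it.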

Consider a case where we need to change the name of the <u>“field quality”</u> department to <u>“quality.”</u> The department column of the staff DataFrame mixes lowercase and uppercase letters, so we first need to convert the values to a single case and then do the replacement.

Let’s perform this task with a chained operation. The first cell below displays the DataFrame; line 5 of the cell after it performs the chained operation and stores the result in staff1, leaving staff itself unchanged.

In [3]:
staff

Unnamed: 0,name,city,date_of_birth,start_date,salary,department
0,John Doe,"Houston, TX",1998-11-04,2018-08-11,"$65,000",Accounting
1,Jane Doe,"San Jose, CA",1995-08-05,2017-08-24,"$70,000",Field Quality
2,Matt smith,"Dallas, TX",1996-11-25,2020-04-16,"$58,500",human resources
3,Ashley Harris,"Miami, FL",1995-01-08,2021-02-11,"$49,500",accounting
4,Jonathan targett,"Santa Clara, CA",1998-08-14,2020-09-01,"$62,000",field quality
5,Hale Cole,"Atlanta, GA",2000-10-24,2021-10-20,"$54,500",engineering


In [16]:
import pandas as pd

staff = pd.read_csv("staff.csv")

staff1 = staff["department"].str.lower().replace("field quality", "quality")

In [17]:
staff

Unnamed: 0,name,city,date_of_birth,start_date,salary,department
0,John Doe,"Houston, TX",1998-11-04,2018-08-11,"$65,000",Accounting
1,Jane Doe,"San Jose, CA",1995-08-05,2017-08-24,"$70,000",Field Quality
2,Matt smith,"Dallas, TX",1996-11-25,2020-04-16,"$58,500",human resources
3,Ashley Harris,"Miami, FL",1995-01-08,2021-02-11,"$49,500",accounting
4,Jonathan targett,"Santa Clara, CA",1998-08-14,2020-09-01,"$62,000",field quality
5,Hale Cole,"Atlanta, GA",2000-10-24,2021-10-20,"$54,500",engineering


In [18]:
staff1

0         accounting
1            quality
2    human resources
3         accounting
4            quality
5        engineering
Name: department, dtype: object
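
One detail in the chain above is worth a short sketch: after str.lower, the call is Series.replace, not str.replace. As far as the pandas API goes, Series.replace with a plain string swaps values that match in full, while str.replace swaps matching substrings. The Series below is a made-up example; the value "Field Quality Lab" is hypothetical and does not appear in staff.csv.

```python
import pandas as pd

# Illustrative data; "Field Quality Lab" is a hypothetical department name.
departments = pd.Series(["Field Quality", "field quality", "Field Quality Lab"])

lowered = departments.str.lower()

# Series.replace: replaces only values that equal "field quality" exactly.
whole_value = lowered.replace("field quality", "quality")

# str.replace: replaces the substring wherever it occurs (literal match here).
substring = lowered.str.replace("field quality", "quality", regex=False)

print(whole_value.tolist())  # ['quality', 'quality', 'field quality lab']
print(substring.tolist())    # ['quality', 'quality', 'quality lab']
```

For the staff data both calls give the same result, because every matching value is exactly "field quality" once lowercased. Neither call modifies the original column; to keep the change, the result would have to be assigned back, for example to staff["department"].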

Chained operations aren’t limited to string manipulation methods. We can combine operations of different types as well.

For instance, **line 5** in the following code snippet performs a filtering operation with the query method, extracts the year from the start_date column by taking its first four characters, and converts the result to integers.

In [20]:
import pandas as pd

staff = pd.read_csv("staff.csv")

print(staff.query("name > 'John Doe'").start_date.str[:4].astype("int"))

2    2020
4    2020
Name: start_date, dtype: int64


In [24]:
a = (staff.query("name > 'John Doe'"))
a["name"]

2          Matt smith
4    Jonathan targett
Name: name, dtype: object

The code a = (staff.query("name > 'John Doe'")) filters the staff DataFrame down to the rows whose name value is greater than 'John Doe' in a string comparison. The comparison is lexicographic and case-sensitive, so it keeps the names that would come after 'John Doe' in a sorted list, with uppercase letters ordered before lowercase ones.
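
The effect of case on this comparison can be checked with a small sketch. The DataFrame below is an assumption made for illustration: "jane doe" in lowercase is a hypothetical value that is not in staff.csv, and the variable names are illustrative.

```python
import pandas as pd

# Illustrative names; the lowercase "jane doe" is hypothetical.
names = pd.DataFrame({"name": ["John Doe", "Matt smith", "jane doe", "Jane Doe"]})

# Case-sensitive: lowercase "j" sorts after uppercase "J", so "jane doe" passes.
case_sensitive = names.query("name > 'John Doe'")["name"].tolist()

# Lowercasing both sides first gives a case-insensitive comparison.
case_insensitive = names[names["name"].str.lower() > "john doe"]["name"].tolist()

print(case_sensitive)    # ['Matt smith', 'jane doe']
print(case_insensitive)  # ['Matt smith']
```

If the data mixes cases, as the name column here does, normalizing the case before comparing is one way to get the ordering a reader would expect.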

As we’ve seen in the above examples, it’s possible to combine as many operations as needed. Chained operations come in handy in many cases and help us write more concise scripts with fewer intermediate variables.
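
When a chain grows long, one common layout is to wrap it in parentheses and put each step on its own line. The sketch below repeats the filtering example in that form. It assumes a three-row inline DataFrame instead of staff.csv, and start_years is an illustrative name.

```python
import pandas as pd

# Stand-in for three rows of staff.csv (an assumption for this sketch).
staff = pd.DataFrame({
    "name": ["John Doe", "Matt smith", "Jonathan targett"],
    "start_date": ["2018-08-11", "2020-04-16", "2020-09-01"],
})

# Parentheses let the chain span several lines, one operation per line.
start_years = (
    staff
    .query("name > 'John Doe'")  # keep rows whose name sorts after 'John Doe'
    .start_date                  # select the start_date column
    .str[:4]                     # first four characters, i.e. the year
    .astype("int")               # convert the year strings to integers
)

print(start_years.tolist())  # [2020, 2020]
```

Each line can carry its own comment, and a step can be commented out while debugging without rewriting the rest of the chain.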