# Upper and lower case

Python is **case-sensitive**: it treats "Accounting" and "accounting" as two different strings. To analyze text data reliably, we first need to bring the strings into a consistent format. Thankfully, it’s quite simple to change a string’s case in Pandas.
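The effect of case sensitivity is easy to reproduce. The following is a minimal sketch that uses a small, made-up Series of department names (the values are illustrative and are not read from staff.csv):

```python
import pandas as pd

# Hypothetical values for illustration; not read from staff.csv
departments = pd.Series(["Accounting", "accounting", "ACCOUNTING"])

# A case-sensitive comparison matches only the exact spelling
exact_matches = (departments == "accounting").sum()

# Lowercasing first puts every value in the same format
normalized_matches = (departments.str.lower() == "accounting").sum()

print(exact_matches, normalized_matches)  # prints: 1 3
```

Without normalization, only one of the three spellings is counted, even though all three refer to the same department.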

The names in our staff data mix upper and lower case letters inconsistently. Let’s convert all of them to lower case.

In [1]:
import pandas as pd

staff = pd.read_csv("staff.csv")

staff["name_lower"] = staff["name"].str.lower()

print(staff[["name","name_lower"]])

               name        name_lower
0          John Doe          john doe
1          Jane Doe          jane doe
2        Matt smith        matt smith
3     Ashley Harris     ashley harris
4  Jonathan targett  jonathan targett
5         Hale Cole         hale cole


The <font color='red'>lower</font> method under the <font color='red'>str</font> accessor converts all characters to lowercase. The <font color='red'>upper</font> method does the opposite: it converts all characters to uppercase.
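As a quick illustration of the opposite conversion, here is a short sketch. It assumes a small Series built from a few of the names shown in the output above, so that it runs without staff.csv:

```python
import pandas as pd

# A few names copied from the output above, so the example runs without staff.csv
names = pd.Series(["John Doe", "Matt smith", "Jonathan targett"])

# str.upper() returns a new Series; the original is left unchanged
names_upper = names.str.upper()

print(names_upper.tolist())  # prints: ['JOHN DOE', 'MATT SMITH', 'JONATHAN TARGETT']
```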

Another method that we can use is <font color='red'>capitalize</font>. It converts the first character of each string to upper case. Let’s use it to capitalize the values in the department and name columns.

In [3]:
import pandas as pd

staff = pd.read_csv("staff.csv")

print(staff["department"].str.capitalize())
print(staff["name"].str.capitalize())

0         Accounting
1      Field quality
2    Human resources
3         Accounting
4      Field quality
5        Engineering
Name: department, dtype: object
0            John doe
1            Jane doe
2          Matt smith
3       Ashley harris
4    Jonathan targett
5           Hale cole
Name: name, dtype: object


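In addition to converting the first letter to upper case, the <font color='red'>capitalize</font> method ensures all other letters are lowercase. Thus, if there’s an uppercase letter other than the first one, it’s converted to lowercase. This is why "John Doe" becomes "John doe" in the output above: only the first character of the whole string is capitalized, not the first character of each word.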
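The lowercasing behavior is easier to see with values that contain uppercase letters in other positions. The sketch below uses hypothetical, inconsistently cased department names rather than the actual contents of staff.csv:

```python
import pandas as pd

# Hypothetical, inconsistently cased values for illustration
departments = pd.Series(["HUMAN RESOURCES", "field Quality", "engineering"])

# First character becomes uppercase, every other character becomes lowercase
capitalized = departments.str.capitalize()

print(capitalized.tolist())  # prints: ['Human resources', 'Field quality', 'Engineering']
```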

Python strings also have built-in <font color='red'>upper</font> and <font color='red'>lower</font> methods. They work on a single string, though, so we can’t call them directly on an entire column of a <font color='red'>DataFrame</font>.

For instance, we can apply the built-in <font color='red'>upper</font> method to the value in the first row of the department column.

In [5]:
import pandas as pd

staff = pd.read_csv("staff.csv")

print(staff["department"][0].upper())

ACCOUNTING


When we work on tabular data (data in tables), it’s much more convenient and efficient to use the string methods under the <font color='red'>str</font> accessor. They allow us to perform operations on the entire column at once. Make sure to write <font color='red'>str</font> before the name of the method, as in <font color='red'>staff["name"].str.lower()</font>.
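To make the difference concrete, the sketch below calls the method both ways on a small, made-up Series. The variable names are chosen for illustration only:

```python
import pandas as pd

# Hypothetical values for illustration; not read from staff.csv
departments = pd.Series(["Accounting", "Field quality", "Engineering"])

# The built-in method exists on str objects, not on a Series,
# so calling it directly on the column raises an AttributeError
try:
    departments.upper()
    called_directly = True
except AttributeError:
    called_directly = False

# The str accessor applies the method to every element of the column
upper_departments = departments.str.upper()

print(called_directly)             # prints: False
print(upper_departments.tolist())  # prints: ['ACCOUNTING', 'FIELD QUALITY', 'ENGINEERING']
```

Forgetting the accessor is a common source of this error, so it is worth checking for <font color='red'>str</font> first whenever a string method fails on a column.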