- The code column in raw data often needs to be cleaned.
- When the code column is used as the key for merging with other data,  
  codes whose number of digits is not uniform are a common problem.

- In that case, use the zfill() or rjust() string method  
  to pad the codes to a uniform number of digits or format.
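
To illustrate why the padding matters for a merge, here is a minimal sketch. The two tables (`left`, `right`) and their columns are hypothetical and not part of the sample data; the only assumption is that one source stores codes as integers while the other stores them as 5-character strings.

```python
import pandas as pd

# Hypothetical tables: `left` holds integer codes, `right` holds zero-padded strings.
left = pd.DataFrame({"code": [1000, 1001, 2000], "value": [10, 20, 30]})
right = pd.DataFrame({"code": ["01000", "01001", "02000"], "name": ["a", "b", "c"]})

# Convert each code to a string and pad it to 5 characters so both keys match.
left["code"] = left["code"].apply(lambda x: str(x).zfill(5))

# With keys in the same format, every row of `left` finds its partner in `right`.
merged = left.merge(right, on="code", how="left")
print(merged)
```

Without the padding step, the key columns would differ in both type and width, so the rows could not be matched.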

In [1]:
import pandas as pd
import numpy as np

In [2]:
df = pd.read_excel("data/sample_data_code_2.xlsx")
print(df.shape)
df

(6, 1)


Unnamed: 0,code
0,1000
1,1001
2,1002
3,2000
4,2001
5,2002


- The zfill() method pads a string with "0" on the left until it reaches the desired number of digits.
- Because the "code" column of the sample data is numeric,  
  each value is first converted to a string before the method is applied.

In [3]:
# Convert each code to a string, then left-pad with "0" to 5 and 7 characters
df['code_zfill_5'] = df['code'].apply(lambda x: str(x).zfill(5))
df['code_zfill_7'] = df['code'].apply(lambda x: str(x).zfill(7))

df

Unnamed: 0,code,code_zfill_5,code_zfill_7
0,1000,01000,0001000
1,1001,01001,0001001
2,1002,01002,0001002
3,2000,02000,0002000
4,2001,02001,0002001
5,2002,02002,0002002
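
As a possible alternative to `apply()`, pandas also offers a vectorized `.str.zfill()` accessor. The sketch below uses a hypothetical Series with mixed widths (not the sample file) and also shows that zfill() only pads and never truncates, so values wider than the target are worth checking.

```python
import pandas as pd

# Hypothetical codes with mixed widths, stored as integers.
codes = pd.Series([7, 42, 1000, 123456])

# Cast to string first, then pad through the .str accessor.
padded = codes.astype(str).str.zfill(5)
print(padded.tolist())

# zfill() leaves longer strings unchanged, so flag anything wider than 5 characters.
too_long = padded[padded.str.len() > 5]
print(too_long.tolist())
```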


- The rjust() method is more flexible.
- It can pad the left side of a string with any character, not only "0",  
  to reach the desired number of digits.

In [4]:
df['code_rjust_5'] = df['code'].apply(lambda x : str(x).rjust(5, "0"))
df['code_rjust_A'] = df['code'].apply(lambda x : str(x).rjust(5, "A"))

df

Unnamed: 0,code,code_zfill_5,code_zfill_7,code_rjust_5,code_rjust_A
0,1000,01000,0001000,01000,A1000
1,1001,01001,0001001,01001,A1001
2,1002,01002,0001002,01002,A1002
3,2000,02000,0002000,02000,A2000
4,2001,02001,0002001,02001,A2001
5,2002,02002,0002002,02002,A2002
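
One difference between the two methods is worth knowing when values may carry a sign: Python's built-in `str.zfill()` keeps a leading sign in front of the zeros, while `str.rjust()` treats the sign as an ordinary character. The sketch below uses hypothetical values (not the sample file) and also shows the vectorized `.str.rjust()` accessor in pandas.

```python
import pandas as pd

# Hypothetical codes, not taken from the sample file.
codes = pd.Series([1000, 2002])

# Vectorized rjust(): pad on the left with "A" up to 5 characters.
padded = codes.astype(str).str.rjust(5, "A")
print(padded.tolist())

# Built-in str methods on a signed value: zfill() is sign-aware, rjust() is not.
signed_zfill = "-42".zfill(5)
signed_rjust = "-42".rjust(5, "0")
print(signed_zfill, signed_rjust)
```

For plain non-negative codes such as those in the sample data, `zfill(5)` and `rjust(5, "0")` produce the same result.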
