## Rename Columns in Dataframe

Let us understand how to rename columns using a mapping between source field names and target field names.

* We can use `rename` to rename the columns. It has a keyword argument called `columns`, which accepts either a dict or a function. We will pass a dict.
  * Each key in the dict is a source column name and each value is the corresponding target column name.
  * We need to build that dict from **column_mapping**, keeping only the fields that have a target name.
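
As a minimal sketch of the idea, the example below applies `rename` to a small hand-built DataFrame. The data and the names `sample_df` and `sample_mapping` are made up for illustration and are not taken from the customers file.

```python
import pandas as pd

# Toy data, invented for illustration only
sample_df = pd.DataFrame({
    'customer_first_name': ['Ann', 'Bob'],
    'customer_last_name': ['Lee', 'Ray'],
    'customer_id': [1, 2],
})

# Keys are source column names, values are target column names
sample_mapping = {
    'customer_first_name': 'FirstName',
    'customer_last_name': 'LastName',
}

# rename returns a new DataFrame; sample_df itself is left unchanged
renamed_df = sample_df.rename(columns=sample_mapping)
print(list(renamed_df.columns))  # ['FirstName', 'LastName', 'customer_id']
```

Columns that do not appear in the dict, such as `customer_id` here, keep their original names.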

In [None]:
import pandas as pd
customers = pd.read_csv('/data/ecomm/customers/part-00000')

In [None]:
column_mapping_str = '''{
    "customer_first_name": {"target_field_name": "FirstName", "is_required": true},
    "customer_last_name": {"target_field_name": "LastName", "is_required": true},
    "customer_email": {"target_field_name": "Email", "is_required": true},
    "product_name": {"is_required": false},
    "product_subscription": {"is_required": false}
}'''

In [None]:
import json
column_mapping = json.loads(column_mapping_str)

In [None]:
column_mapping

In [None]:
list(filter(lambda col: col[1]['is_required'], column_mapping.items()))

In [None]:
# Keep only the (source name, properties) pairs where is_required is true
required_columns_list = list(filter(lambda col: col[1]['is_required'], column_mapping.items()))

In [None]:
required_columns_list

In [None]:
col = required_columns_list[0]

In [None]:
col

In [None]:
type(col)

In [None]:
col[1]

In [None]:
col[1]['target_field_name']

In [None]:
(col[0], col[1]['target_field_name'])
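
The indexing with `col[0]` and `col[1]` works because each element of `column_mapping.items()` is a two-element tuple. A sketch of the same step with tuple unpacking, which some readers find easier to follow, is shown here. The `col` value is a hand-written stand-in for the first required entry, and `source_name` and `properties` are illustrative names.

```python
# Hand-written stand-in for one element of column_mapping.items()
col = ('customer_first_name', {'target_field_name': 'FirstName', 'is_required': True})

# Unpack the tuple instead of indexing with col[0] and col[1]
source_name, properties = col

# Build the (source, target) pair used later to create the rename dict
pair = (source_name, properties['target_field_name'])
print(pair)  # ('customer_first_name', 'FirstName')
```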

In [None]:
dict(map(lambda col: (col[0], col[1]['target_field_name']), required_columns_list))

In [None]:
# Get the dict with source and target field names
required_columns_mapping = dict(map(lambda col: (col[0], col[1]['target_field_name']), required_columns_list))
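
The `filter` and `map` steps can also be combined into a single dict comprehension. This is only an alternative sketch of the same logic, not a required change. The JSON string is repeated so that the block runs on its own.

```python
import json

column_mapping = json.loads('''{
    "customer_first_name": {"target_field_name": "FirstName", "is_required": true},
    "customer_last_name": {"target_field_name": "LastName", "is_required": true},
    "customer_email": {"target_field_name": "Email", "is_required": true},
    "product_name": {"is_required": false},
    "product_subscription": {"is_required": false}
}''')

# Filter on is_required and pick the target name in one pass
required_columns_mapping = {
    source: props['target_field_name']
    for source, props in column_mapping.items()
    if props['is_required']
}
print(required_columns_mapping)
```

Filtering before reading `target_field_name` matters here, because the entries that are not required have no such key.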

In [None]:
customers.rename?

In [None]:
# rename changes only the columns that appear as keys in the dict passed.
# All other columns keep their names from the original dataframe.
# The result has all the columns, with new names for the 3 columns that are part of the dict.
customers.rename(columns=required_columns_mapping)

In [None]:
customers.rename(columns=required_columns_mapping).columns
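
By default, `rename` silently skips dict keys that do not match any column, which can hide a typo in the mapping. The sketch below uses toy data and a hypothetical `customer_phone` key to show the `errors` argument (available in pandas 0.25 and later), which makes such mismatches fail loudly.

```python
import pandas as pd

# Toy data, invented for illustration only
sample_df = pd.DataFrame({'customer_email': ['a@example.com']})

# 'customer_phone' is a hypothetical key with no matching column
mapping = {'customer_email': 'Email', 'customer_phone': 'Phone'}

# Default errors='ignore': the unmatched key is skipped without any warning
default_columns = list(sample_df.rename(columns=mapping).columns)
print(default_columns)  # ['Email']

# errors='raise': the unmatched key results in a KeyError
missing_key_raised = False
try:
    sample_df.rename(columns=mapping, errors='raise')
except KeyError as exc:
    missing_key_raised = True
    print('KeyError:', exc)
```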

In [2]:
import pandas as pd
customers = pd.read_csv('/data/ecomm/customers/part-00000')

column_mapping_str = '''{
    "customer_first_name": {"target_field_name": "FirstName", "is_required": true},
    "customer_last_name": {"target_field_name": "LastName", "is_required": true},
    "customer_email": {"target_field_name": "Email", "is_required": true},
    "product_name": {"is_required": false},
    "product_subscription": {"is_required": false}
}'''

import json
column_mapping = json.loads(column_mapping_str)

# Names of the fields that are not required
columns_to_be_dropped = [col for col, props in column_mapping.items() if not props['is_required']]
required_columns_list = list(filter(lambda col: col[1]['is_required'], column_mapping.items()))
required_columns_mapping = dict(map(lambda col: (col[0], col[1]['target_field_name']), required_columns_list))

# Drop the fields that are not required and rename the rest as per the mapping
customers_target = customers.drop(columns=columns_to_be_dropped).rename(columns=required_columns_mapping)

In [3]:
customers_target

| Unnamed: 0 | FirstName | LastName | Email |
|---|---|---|---|
| 0 | Cassaundra | Collinson | ccollinson0@alibaba.com |
| 1 | Rozamond | Oene | roene1@technorati.com |
| 2 | Gus | Hawick | ghawick2@dagondesign.com |
| 3 | Delano | Ashbey | dashbey3@purevolume.com |
| 4 | Fara | Simondson | fsimondson4@umn.edu |
| 5 | Myrilla | Gates | mgates5@sina.com.cn |
| 6 | Arabela | Tweedlie | atweedlie6@comcast.net |
| 7 | Loise | Schindler | lschindler7@discovery.com |
| 8 | Storm | McBrearty | smcbrearty8@ovh.net |
| 9 | Westley | Matityahu | wmatityahu9@altervista.org |
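
pandas assigns names of the form `Unnamed: N` to columns whose header cell is empty, so the `Unnamed: 0` column above most likely comes from a row index that was saved into the file without a header. That is an assumption, since the file contents are not shown here. If it holds, one option is to read that column as the index with `index_col=0`. The sketch below demonstrates this on invented CSV text rather than the real file.

```python
import io
import pandas as pd

# Invented CSV text whose first header cell is empty
csv_text = (
    ',customer_first_name,customer_email\n'
    '0,Ann,ann@example.com\n'
    '1,Bob,bob@example.com\n'
)

# Default read: the headerless first column is named 'Unnamed: 0'
default_df = pd.read_csv(io.StringIO(csv_text))
print(list(default_df.columns))

# index_col=0: the first column becomes the index instead of a data column
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(indexed_df.columns))
```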
