# How to...clean a company's name

This notebook shows how to use the entitymatching API to clean attributes that contain a company's name. The cleaning process relies on two internal, customized dictionaries that define the cleaning rules to apply:
- Dictionary of legal terms: defines the replacement rules that normalize the legal forms of businesses. For instance, a company name that has LT as its legal form is normalized to LIMITED.
- Dictionary of cleaning rules: defines the cleaning rules to be applied, for example removing numbers, punctuation, etc.
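
To make the role of the two dictionaries concrete, here is a minimal, self-contained sketch of a dictionary-driven cleaner. It is not the library's implementation: the names `LEGAL_TERMS`, `CLEANING_RULES` and `sketch_clean_name`, the dictionary entries, and the regular expressions are all assumptions chosen for illustration. Only the parameter names `normalize_legal_terms` and `output_lettercase` mirror the calls shown later in this notebook.

```python
import re

# Hypothetical legal-terms dictionary: abbreviation -> normalized legal form.
# The real dictionary is internal to the library; these entries are illustrative.
LEGAL_TERMS = {
    "lt": "limited",
    "ltd": "limited",
    "inc": "incorporated",
    "corp": "corporation",
}

# Hypothetical cleaning rules, applied in order as (pattern, replacement).
CLEANING_RULES = [
    (re.compile(r"\S+@\S+"), " "),                 # remove email addresses
    (re.compile(r"(?:https?://|www\.)\S+"), " "),  # remove URLs
    (re.compile(r"\d+"), " "),                     # remove numbers
    (re.compile(r"[^\w\s]|_"), " "),               # remove punctuation
    (re.compile(r"\s+"), " "),                     # collapse runs of whitespace
]


def sketch_clean_name(name, normalize_legal_terms=True, output_lettercase="lower"):
    """Illustrative cleaner: apply the rules, then optionally normalize legal terms."""
    text = name.lower()
    for pattern, replacement in CLEANING_RULES:
        text = pattern.sub(replacement, text)
    words = text.split()
    if normalize_legal_terms:
        # Replace each word that is a known legal abbreviation
        words = [LEGAL_TERMS.get(word, word) for word in words]
    text = " ".join(words)
    if output_lettercase == "upper":
        return text.upper()
    if output_lettercase == "title":
        return text.title()
    return text


sample = " [89]\tGRAND BUDAPEST HOTEL adm@budapest.com %& 7((888)) www.gbhotel.com lt"
print(sketch_clean_name(sample))                               # grand budapest hotel limited
print(sketch_clean_name(sample, normalize_legal_terms=False))  # grand budapest hotel lt
```

Keeping the rules as ordered data rather than hard-coded logic means a rule can be added, removed or reordered without touching the cleaning function, which is presumably why the library keeps them in dictionaries.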

In [3]:
# Add the location of the API (relative to this notebook) to the import path
import sys
sys.path.append('../../')

In [4]:
# Import the module for cleaning company names
from financialcleaner import company

In [5]:
# Example of a company name with unicode, punctuation, a URL and an email address
company_name = ' [89]	GRAND BUDAPEST HOTEL adm@budapest.com %& 7((888)) www.gbhotel.com lt'
print('COMPANY NAME TO CLEAN: {}'.format(company_name))

COMPANY NAME TO CLEAN:  [89]	GRAND BUDAPEST HOTEL adm@budapest.com %& 7((888)) www.gbhotel.com lt


## 1. Using the default cleaning

With the default parameters, the function normalizes the legal terms and returns the clean name in lower case.

In [6]:
# Call the cleaning function
clean_name = company.get_clean_name(company_name)
print(clean_name)

grand budapest hotel limited


## 2. Not applying the normalization of legal terms

In [7]:
# Call the cleaning function, leaving the legal terms as they are
clean_name = company.get_clean_name(company_name, normalize_legal_terms=False)
print(clean_name)

grand budapest hotel lt


## 3. Returning the clean name in upper case

In [8]:
# Call the cleaning function, asking for upper-case output
clean_name = company.get_clean_name(company_name, output_lettercase='upper')
print(clean_name)

GRAND BUDAPEST HOTEL LIMITED


## 4. Returning the clean name with the first letter of each word capitalized

In [9]:
# Call the cleaning function, asking for title-case output
clean_name = company.get_clean_name(company_name, output_lettercase='title')
print(clean_name)

Grand Budapest Hotel Limited
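
The examples above clean a single string, whereas an attribute usually holds many names. A minimal sketch of applying a cleaning function to a whole list of values follows. The helper `clean_names` and the `stand_in_cleaner` function are hypothetical and exist only so the block runs on its own; in this notebook one would presumably pass `company.get_clean_name` as the `cleaner` argument instead.

```python
def clean_names(names, cleaner, **options):
    """Apply `cleaner` to every string in `names`; pass other values (e.g. None) through."""
    return [cleaner(n, **options) if isinstance(n, str) else n for n in names]


# Stand-in cleaner (an assumption, not the library function): it only trims
# whitespace and sets the letter case.
def stand_in_cleaner(name, output_lettercase="lower"):
    text = " ".join(name.split())
    return text.upper() if output_lettercase == "upper" else text.lower()


raw = ["  Grand   Budapest Hotel ", None, "ACME  corp"]
cleaned = clean_names(raw, stand_in_cleaner, output_lettercase="upper")
print(cleaned)  # ['GRAND BUDAPEST HOTEL', None, 'ACME CORP']
```

Passing missing values through unchanged keeps the output aligned with the input, so the cleaned list can be written back next to the original attribute.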
