## Problem Statement

You need to build a model that is able to classify customer complaints based on the products/services. By doing so, you can segregate these tickets into their relevant categories and, therefore, help in the quick resolution of the issue.

You will be doing topic modelling on the <b>.json</b> data provided by the company. Since this data is not labelled, you need to apply NMF to analyse patterns and classify tickets into the following five clusters based on their products/services:

* Credit card / Prepaid card

* Bank account services

* Theft/Dispute reporting

* Mortgages/loans

* Others


With the help of topic modelling, you will be able to map each ticket onto its respective department/category. You can then use this data to train any supervised model such as logistic regression, decision tree or random forest. Using this trained model, you can classify any new customer complaint support ticket into its relevant department.

## Pipeline steps to be performed:

You need to perform the following eight major tasks to complete the assignment:

1.  Data loading

2. Text preprocessing

3. Exploratory data analysis (EDA)

4. Feature extraction

5. Topic modelling

6. Model building using supervised learning

7. Model training and evaluation

8. Model inference

## Importing the necessary libraries

In [45]:
import json
import numpy as np
import pandas as pd
import re, nltk, spacy, string
import en_core_web_sm
from tqdm import tqdm
nlp = en_core_web_sm.load()
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

from plotly.offline import plot
import plotly.graph_objects as go
import plotly.express as px

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from pprint import pprint

from nltk.stem import WordNetLemmatizer
# hide warnings
import warnings
warnings.filterwarnings('ignore')

In [109]:
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize
from nltk import pos_tag

#nltk.download('punkt')
#nltk.download('averaged_perceptron_tagger')
#nltk.download('wordnet')
#nltk.download('averaged_perceptron_tagger_eng')

In [5]:
# change the display properties of pandas
pd.set_option('display.max_colwidth', 100)
#pd.set_option('display.max_columns', None)
#pd.set_option('display.max_rows', None)

## Loading the data

The data is in JSON format and we need to convert it to a dataframe.

In [7]:
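# Open the JSON file and parse it into a Python object;
# the context manager closes the file afterwards
with open('complaints-2021-05-14_08_16.json') as f:
    data = json.load(f)

# Flatten the nested records into a dataframe
df = pd.json_normalize(data)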
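
The `_source.` prefix on most column names comes from how `pd.json_normalize` flattens nested records. The following is a minimal sketch of that behaviour; the two records are made-up stand-ins, not rows from the actual file.

```python
import pandas as pd

# Hypothetical records shaped like the export: metadata at the top level,
# complaint fields nested under "_source". The values are invented.
records = [
    {"_id": "1", "_source": {"product": "Mortgage", "complaint_what_happened": ""}},
    {"_id": "2", "_source": {"product": "Credit card", "complaint_what_happened": "I was charged twice"}},
]

# Nested keys are flattened into dotted column names such as "_source.product".
demo = pd.json_normalize(records)
print(demo.columns.tolist())
```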

## Data preparation

In [9]:
# Inspect the dataframe to understand the given data.
df.head(5)


Unnamed: 0,_index,_type,_id,_score,_source.tags,_source.zip_code,_source.complaint_id,_source.issue,_source.date_received,_source.state,...,_source.company_response,_source.company,_source.submitted_via,_source.date_sent_to_company,_source.company_public_response,_source.sub_product,_source.timely,_source.complaint_what_happened,_source.sub_issue,_source.consumer_consent_provided
0,complaint-public-v2,complaint,3211475,0.0,,90301,3211475,Attempts to collect debt not owed,2019-04-13T12:00:00-05:00,CA,...,Closed with explanation,JPMORGAN CHASE & CO.,Web,2019-04-13T12:00:00-05:00,,Credit card debt,Yes,,Debt is not yours,Consent not provided
1,complaint-public-v2,complaint,3229299,0.0,Servicemember,319XX,3229299,Written notification about debt,2019-05-01T12:00:00-05:00,GA,...,Closed with explanation,JPMORGAN CHASE & CO.,Web,2019-05-01T12:00:00-05:00,,Credit card debt,Yes,Good morning my name is XXXX XXXX and I appreciate it if you could help me put a stop to Chase B...,Didn't receive enough information to verify debt,Consent provided
2,complaint-public-v2,complaint,3199379,0.0,,77069,3199379,"Other features, terms, or problems",2019-04-02T12:00:00-05:00,TX,...,Closed with explanation,JPMORGAN CHASE & CO.,Web,2019-04-02T12:00:00-05:00,,General-purpose credit card or charge card,Yes,I upgraded my XXXX XXXX card in XX/XX/2018 and was told by the agent who did the upgrade my anni...,Problem with rewards from credit card,Consent provided
3,complaint-public-v2,complaint,2673060,0.0,,48066,2673060,Trouble during payment process,2017-09-13T12:00:00-05:00,MI,...,Closed with explanation,JPMORGAN CHASE & CO.,Web,2017-09-14T12:00:00-05:00,,Conventional home mortgage,Yes,,,Consent not provided
4,complaint-public-v2,complaint,3203545,0.0,,10473,3203545,Fees or interest,2019-04-05T12:00:00-05:00,NY,...,Closed with explanation,JPMORGAN CHASE & CO.,Referral,2019-04-05T12:00:00-05:00,,General-purpose credit card or charge card,Yes,,Charged too much interest,


In [11]:
df.shape

(78313, 22)

In [13]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 78313 entries, 0 to 78312
Data columns (total 22 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   _index                             78313 non-null  object 
 1   _type                              78313 non-null  object 
 2   _id                                78313 non-null  object 
 3   _score                             78313 non-null  float64
 4   _source.tags                       10900 non-null  object 
 5   _source.zip_code                   71556 non-null  object 
 6   _source.complaint_id               78313 non-null  object 
 7   _source.issue                      78313 non-null  object 
 8   _source.date_received              78313 non-null  object 
 9   _source.state                      76322 non-null  object 
 10  _source.consumer_disputed          78313 non-null  object 
 11  _source.product                    78313 non-null  obj

In [15]:
#print the column names
df.columns

Index(['_index', '_type', '_id', '_score', '_source.tags', '_source.zip_code',
       '_source.complaint_id', '_source.issue', '_source.date_received',
       '_source.state', '_source.consumer_disputed', '_source.product',
       '_source.company_response', '_source.company', '_source.submitted_via',
       '_source.date_sent_to_company', '_source.company_public_response',
       '_source.sub_product', '_source.timely',
       '_source.complaint_what_happened', '_source.sub_issue',
       '_source.consumer_consent_provided'],
      dtype='object')

In [17]:
#Assign new column names
df.rename(columns={'_index':'index',
  '_type':'type',
  '_id':'id',
  '_score':'score',
  '_source.tags':'tags',
  '_source.zip_code':'zip_code',
 '_source.complaint_id':'complaint_id',
 '_source.issue':'issue',
 '_source.date_received':'date_received',
 '_source.state':'state',
 '_source.consumer_disputed':'consumer_disputed',
 '_source.product':'product',
 '_source.company_response':'company_response',
 '_source.company':'company',
 '_source.submitted_via':'submitted_via',
 '_source.date_sent_to_company':'date_sent_to_company',
 '_source.company_public_response':'company_public_response',
 '_source.sub_product':'sub_product',
 '_source.timely':'timely',
 '_source.complaint_what_happened':'complaint_what_happened',
 '_source.sub_issue':'sub_issue',
 '_source.consumer_consent_provided':'consumer_consent_provided'},inplace=True)

In [19]:
# Check the number of null values in each column
df.isnull().sum()

index                            0
type                             0
id                               0
score                            0
tags                         67413
zip_code                      6757
complaint_id                     0
issue                            0
date_received                    0
state                         1991
consumer_disputed                0
product                          0
company_response                 0
company                          0
submitted_via                    0
date_sent_to_company             0
company_public_response      78309
sub_product                  10571
timely                           0
complaint_what_happened          0
sub_issue                    46297
consumer_consent_provided     1008
dtype: int64

In [25]:
# Checking rows where complaint_what_happened is blank
df[df['complaint_what_happened']==''].shape

(57241, 22)

In [27]:
# Replace blank complaints with NaN (assign the result back instead of using inplace on a column selection)
df['complaint_what_happened'] = df['complaint_what_happened'].replace(r'^\s*$', np.nan, regex=True)

In [29]:
df[df['complaint_what_happened'].isna()]

Unnamed: 0,index,type,id,score,tags,zip_code,complaint_id,issue,date_received,state,...,company_response,company,submitted_via,date_sent_to_company,company_public_response,sub_product,timely,complaint_what_happened,sub_issue,consumer_consent_provided
0,complaint-public-v2,complaint,3211475,0.0,,90301,3211475,Attempts to collect debt not owed,2019-04-13T12:00:00-05:00,CA,...,Closed with explanation,JPMORGAN CHASE & CO.,Web,2019-04-13T12:00:00-05:00,,Credit card debt,Yes,,Debt is not yours,Consent not provided
3,complaint-public-v2,complaint,2673060,0.0,,48066,2673060,Trouble during payment process,2017-09-13T12:00:00-05:00,MI,...,Closed with explanation,JPMORGAN CHASE & CO.,Web,2017-09-14T12:00:00-05:00,,Conventional home mortgage,Yes,,,Consent not provided
4,complaint-public-v2,complaint,3203545,0.0,,10473,3203545,Fees or interest,2019-04-05T12:00:00-05:00,NY,...,Closed with explanation,JPMORGAN CHASE & CO.,Referral,2019-04-05T12:00:00-05:00,,General-purpose credit card or charge card,Yes,,Charged too much interest,
5,complaint-public-v2,complaint,3275312,0.0,Older American,48227,3275312,Managing an account,2019-06-13T12:00:00-05:00,MI,...,Closed with monetary relief,JPMORGAN CHASE & CO.,Referral,2019-06-14T12:00:00-05:00,,Checking account,Yes,,Problem using a debit or ATM card,
6,complaint-public-v2,complaint,3238804,0.0,,76262,3238804,Managing an account,2019-05-10T12:00:00-05:00,TX,...,Closed with monetary relief,JPMORGAN CHASE & CO.,Phone,2019-05-10T12:00:00-05:00,,Checking account,Yes,,Problem using a debit or ATM card,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
78304,complaint-public-v2,complaint,3080086,0.0,,76107,3080086,Applying for a mortgage or refinancing an existing mortgage,2018-11-22T12:00:00-05:00,TX,...,Closed with monetary relief,JPMORGAN CHASE & CO.,Web,2018-11-22T12:00:00-05:00,,Conventional home mortgage,Yes,,,Other
78305,complaint-public-v2,complaint,3103013,0.0,Older American,863XX,3103013,Closing an account,2018-12-17T12:00:00-05:00,AZ,...,Closed with explanation,JPMORGAN CHASE & CO.,Web,2018-12-18T12:00:00-05:00,,Other banking product or service,Yes,,Funds not received from closed account,Consent not provided
78306,complaint-public-v2,complaint,3099437,0.0,,11217,3099437,Managing an account,2018-12-12T12:00:00-05:00,NY,...,Closed with explanation,JPMORGAN CHASE & CO.,Referral,2018-12-18T12:00:00-05:00,,Checking account,Yes,,Deposits and withdrawals,
78307,complaint-public-v2,complaint,3156336,0.0,,074XX,3156336,Applying for a mortgage or refinancing an existing mortgage,2019-02-19T12:00:00-05:00,NJ,...,Closed with explanation,JPMORGAN CHASE & CO.,Web,2019-02-19T12:00:00-05:00,,Conventional home mortgage,Yes,,,Other


In [31]:
df.isnull().sum()

index                            0
type                             0
id                               0
score                            0
tags                         67413
zip_code                      6757
complaint_id                     0
issue                            0
date_received                    0
state                         1991
consumer_disputed                0
product                          0
company_response                 0
company                          0
submitted_via                    0
date_sent_to_company             0
company_public_response      78309
sub_product                  10571
timely                           0
complaint_what_happened      57241
sub_issue                    46297
consumer_consent_provided     1008
dtype: int64

In [33]:
#Remove all rows where complaints column is nan
df.dropna(subset=['complaint_what_happened'],inplace=True)

In [35]:
df.isnull().sum()

index                            0
type                             0
id                               0
score                            0
tags                         17256
zip_code                      4645
complaint_id                     0
issue                            0
date_received                    0
state                          143
consumer_disputed                0
product                          0
company_response                 0
company                          0
submitted_via                    0
date_sent_to_company             0
company_public_response      21070
sub_product                   2109
timely                           0
complaint_what_happened          0
sub_issue                     8176
consumer_consent_provided        0
dtype: int64

## Prepare the text for topic modeling

Once you have removed all the blank complaints, you need to:

* Make the text lowercase
* Remove text in square brackets
* Remove punctuation
* Remove words containing numbers


Once you have done these cleaning operations you need to perform the following:
* Lemmatize the texts
* Extract the POS tags of the lemmatized text and remove all the words which have tags other than NN[tag == "NN"].


In [37]:
# Function to clean the text and remove unnecessary elements.
def clean_text(text):
    # Make the text lowercase
    text = text.lower()

    # Remove text in square brackets []
    text = re.sub(r'\[[^\]]+\]',' ',text)

    # Remove punctuation (the slash is kept, so masked dates such as xx/xx/2018 survive)
    text = re.sub(r'[^\w\s/]',' ',text)

    # Remove words that mix letters and digits; purely numeric tokens are not matched
    text = re.sub(r'\b[a-zA-Z]*\d+[a-zA-Z]+\b|\b[a-zA-Z]+\d+\w*\b',' ',text)

    return text

In [39]:
df['complaint_what_happened_cleaned'] = df['complaint_what_happened'].apply(clean_text)

In [47]:
# Function to lemmatize the text
lemmatizer = WordNetLemmatizer()
def lemmatize(text):
    # Tokenize, then lemmatize each token. WordNetLemmatizer treats every
    # token as a noun by default, which is why "was" becomes "wa" in the output.
    words = word_tokenize(text)
    lemmatized_text = ' '.join(lemmatizer.lemmatize(word) for word in words)
    return lemmatized_text

In [69]:
df['complaint_what_happened_lemma'] = df['complaint_what_happened_cleaned'].apply(lemmatize)

In [71]:
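# Create a dataframe ('df_clean') with only the cleaned complaints and the lemmatized complaints.
# .copy() makes it independent of df, so adding columns later does not trigger SettingWithCopyWarning.
df_clean = df[['complaint_what_happened_cleaned','complaint_what_happened_lemma']].copy()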

In [73]:
df_clean

Unnamed: 0,complaint_what_happened_cleaned,complaint_what_happened_lemma
1,good morning my name is xxxx xxxx and i appreciate it if you could help me put a stop to chase b...,good morning my name is xxxx xxxx and i appreciate it if you could help me put a stop to chase b...
2,i upgraded my xxxx xxxx card in xx/xx/2018 and was told by the agent who did the upgrade my anni...,i upgraded my xxxx xxxx card in xx/xx/2018 and wa told by the agent who did the upgrade my anniv...
10,chase card was reported on xx/xx/2019 however fraudulent application have been submitted my id...,chase card wa reported on xx/xx/2019 however fraudulent application have been submitted my ident...
11,on xx/xx/2018 while trying to book a xxxx xxxx ticket i came across an offer for 300 00 t...,on xx/xx/2018 while trying to book a xxxx xxxx ticket i came across an offer for 300 00 to be ap...
14,my grand son give me check for 1600 00 i deposit it into my chase account after fund clear my...,my grand son give me check for 1600 00 i deposit it into my chase account after fund clear my ch...
...,...,...
78303,after being a chase card customer for well over a decade was offered multiple solicitations for...,after being a chase card customer for well over a decade wa offered multiple solicitation for ac...
78309,on wednesday xx/xx/xxxx i called chas my xxxx xxxx visa credit card provider and asked how to...,on wednesday xx/xx/xxxx i called chas my xxxx xxxx visa credit card provider and asked how to ma...
78310,i am not familiar with xxxx pay and did not understand the great risk this provides to consumers...,i am not familiar with xxxx pay and did not understand the great risk this provides to consumer ...
78311,i have had flawless credit for 30 yrs i ve had chase credit cards chase freedom specifica...,i have had flawless credit for 30 yr i ve had chase credit card chase freedom specifically since...


In [111]:
# Function to extract the POS tags and keep only the words tagged NN
def extract_nouns_column(text):
    words = word_tokenize(text)  # Tokenize text
    tagged_words = pos_tag(words)  # Get POS tags
    # Keep only tokens tagged 'NN' (singular or mass nouns); NNS, NNP and all other tags are dropped
    lemmatized_nouns = [lemmatizer.lemmatize(word) for word, tag in tagged_words if tag == 'NN']
    return ' '.join(lemmatized_nouns)  # Return as a single string

In [125]:
'''
df['complaint_what_happened_lemma'][1]
words = word_tokenize(df['complaint_what_happened_lemma'][1])
tagged_words = pos_tag(words)
lemmatized_nouns = [lemmatizer.lemmatize(word) for word, tag in tagged_words if tag == 'NN']
' '.join(lemmatized_nouns) 
'''

'morning name stop bank cardmember service debt verification statement i bank debt mail month debt i right information consumer chase account advance help'

In [127]:
df_clean["complaint_POS_removed"] =  df_clean['complaint_what_happened_lemma'].apply(extract_nouns_column) #this column should contain lemmatized text with all the words removed which have tags other than NN[tag == "NN"].

In [129]:
# The clean dataframe should now contain the cleaned complaint, the lemmatized complaint and the complaint with only NN-tagged words kept.
df_clean

Unnamed: 0,complaint_what_happened_cleaned,complaint_what_happened_lemma,complaint_POS_removed
1,good morning my name is xxxx xxxx and i appreciate it if you could help me put a stop to chase b...,good morning my name is xxxx xxxx and i appreciate it if you could help me put a stop to chase b...,morning name stop bank cardmember service debt verification statement i bank debt mail month deb...
2,i upgraded my xxxx xxxx card in xx/xx/2018 and was told by the agent who did the upgrade my anni...,i upgraded my xxxx xxxx card in xx/xx/2018 and wa told by the agent who did the upgrade my anniv...,i card wa agent upgrade date agent wa information order account date xx/xx/xxxx consent xxxx rec...
10,chase card was reported on xx/xx/2019 however fraudulent application have been submitted my id...,chase card wa reported on xx/xx/2019 however fraudulent application have been submitted my ident...,card wa xx/xx/2019 application identity consent service credit identity applicant
11,on xx/xx/2018 while trying to book a xxxx xxxx ticket i came across an offer for 300 00 t...,on xx/xx/2018 while trying to book a xxxx xxxx ticket i came across an offer for 300 00 to be ap...,book xxxx ticket i offer ticket card i information offer minute wa screen decision xxxx wa bank ...
14,my grand son give me check for 1600 00 i deposit it into my chase account after fund clear my...,my grand son give me check for 1600 00 i deposit it into my chase account after fund clear my ch...,son deposit chase account fund chase bank account money son check money wa taking bank refuse mo...
...,...,...,...
78303,after being a chase card customer for well over a decade was offered multiple solicitations for...,after being a chase card customer for well over a decade wa offered multiple solicitation for ac...,chase card customer decade wa solicitation credit card chase airline mile hotel point wa card fe...
78309,on wednesday xx/xx/xxxx i called chas my xxxx xxxx visa credit card provider and asked how to...,on wednesday xx/xx/xxxx i called chas my xxxx xxxx visa credit card provider and asked how to ma...,xx/xx/xxxx i chas visa credit card provider claim purchase protection benefit xx/xx/xxxx i schoo...
78310,i am not familiar with xxxx pay and did not understand the great risk this provides to consumers...,i am not familiar with xxxx pay and did not understand the great risk this provides to consumer ...,i pay risk consumer i bank app chase year mobile banking i merchant merchant ha inquiry communic...
78311,i have had flawless credit for 30 yrs i ve had chase credit cards chase freedom specifica...,i have had flawless credit for 30 yr i ve had chase credit card chase freedom specifically since...,i credit yr i ve credit card chase freedom xxxx problem balance transfer life plenty experience ...


## Exploratory data analysis to get familiar with the data.

Write the code in this task to perform the following:

*   Visualise the data according to the 'Complaint' character length
*   Using a word cloud, find the top 40 words by frequency among all the complaints after processing the text
*   Find the top unigrams, bigrams and trigrams by frequency among all the complaints after processing the text




In [None]:
# Write your code here to visualise the data according to the 'Complaint' character length
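
One way to approach the character-length view is a plain histogram of string lengths. The sketch below uses a three-row toy dataframe in place of `df_clean`; the column name matches the notebook, while the bin count and figure size are arbitrary assumptions.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for df_clean; the notebook would use its own dataframe.
toy = pd.DataFrame({"complaint_what_happened_cleaned": [
    "card was charged twice",
    "mortgage payment was not applied to my account",
    "fraud",
]})

# Character length of each complaint.
char_len = toy["complaint_what_happened_cleaned"].str.len()

# Histogram of the lengths; 50 bins is an arbitrary choice.
fig, ax = plt.subplots(figsize=(8, 4))
ax.hist(char_len, bins=50)
ax.set_xlabel("Complaint length (characters)")
ax.set_ylabel("Number of complaints")
ax.set_title("Distribution of complaint character length")
```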

#### Find the top 40 words by frequency among all the complaints after processing the text.

In [None]:
# Using a word cloud, find the top 40 words by frequency among all the complaints after processing the text
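
A word cloud is a rendering of word frequencies, so the counts can be computed first with the standard library. This is a minimal sketch on a toy list of processed complaints; in the notebook the input would presumably be `df_clean['complaint_POS_removed']`.

```python
from collections import Counter

# Toy stand-in for the processed complaints.
docs = [
    "card charge card fee",
    "mortgage payment fee",
    "card payment",
]

# Count every whitespace-separated token across all complaints.
word_counts = Counter(word for doc in docs for word in doc.split())

# Keep the 40 most frequent words as a {word: count} dict.
top_40 = dict(word_counts.most_common(40))
print(top_40)
```

If the third-party `wordcloud` package is installed, a dict like `top_40` can be rendered with `WordCloud(max_words=40).generate_from_frequencies(top_40)` and displayed with `plt.imshow`.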


In [None]:
#Removing -PRON- from the text corpus
df_clean['Complaint_clean'] = df_clean['complaint_POS_removed'].str.replace('-PRON-', '')

#### Find the top unigrams, bigrams and trigrams by frequency among all the complaints after processing the text.

In [None]:
# Write your code here to find the 30 most frequent unigrams among the complaints in the cleaned dataframe (df_clean).


In [None]:
# Print the 10 most frequent unigrams


In [None]:
# Write your code here to find the 30 most frequent bigrams among the complaints in the cleaned dataframe (df_clean).


In [None]:
# Print the 10 most frequent bigrams

In [None]:
# Write your code here to find the 30 most frequent trigrams among the complaints in the cleaned dataframe (df_clean).


In [None]:
# Print the 10 most frequent trigrams

## The personal details of customers have been masked in the dataset with xxxx. Let's remove the masked text, as it is of no use for our analysis.

In [None]:
df_clean['Complaint_clean'] = df_clean['Complaint_clean'].str.replace('xxxx','')

In [None]:
# All masked text has been removed
df_clean

## Feature Extraction
Convert the raw texts to a matrix of TF-IDF features.

**max_df** removes terms that appear too frequently, also known as "corpus-specific stop words".
max_df = 0.95 means "ignore terms that appear in more than 95% of the complaints".

**min_df** removes terms that appear too infrequently.
min_df = 2 means "ignore terms that appear in fewer than 2 complaints".
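
The effect of the two thresholds is easy to see on a tiny corpus. In this sketch (toy documents, and `stop_words="english"` as an assumed extra setting), "chase" occurs in all four documents and is dropped by `max_df`, while words that occur in only one document are dropped by `min_df`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus: "chase" is in every document, four words are in two documents,
# and the remaining words are in one document each.
corpus = [
    "chase card fee dispute",
    "chase card payment late",
    "chase mortgage payment escrow",
    "chase mortgage fee closing",
]

tfidf_demo = TfidfVectorizer(max_df=0.95, min_df=2, stop_words="english")
tfidf_demo.fit(corpus)

# Only the terms that pass both thresholds remain in the vocabulary.
print(sorted(tfidf_demo.vocabulary_))
```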

In [None]:
#Write your code here to initialise the TfidfVectorizer



#### Create a document term matrix using fit_transform

The document term matrix is stored sparsely: each stored entry is a (complaint_id, token_id) pair together with its tf-idf score.
Pairs that are not stored have a tf-idf score of 0.
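
To make the sparse layout concrete, the sketch below builds a document term matrix for two toy documents and prints each stored (row, column) pair with its score. The names `vec` and `dtm_demo` are illustrative only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["card fee", "mortgage payment fee"]

vec = TfidfVectorizer()
dtm_demo = vec.fit_transform(docs)  # scipy sparse matrix, one row per document

print(dtm_demo.shape)  # (number of documents, number of distinct tokens)

# Walk over the stored entries only; everything else is implicitly 0.
coo = dtm_demo.tocoo()
for row, col, score in zip(coo.row, coo.col, coo.data):
    print((row, col), round(score, 3))
```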

In [None]:
#Write your code here to create the Document Term Matrix by transforming the complaints column present in df_clean.


## Topic Modelling using NMF

Non-Negative Matrix Factorization (NMF) is an unsupervised technique, so the model is not trained on any topic labels. NMF decomposes (factorizes) the high-dimensional document term matrix into two lower-dimensional matrices: one gives the weight of each topic in each document, and the other gives the weight of each term in each topic. All entries of both matrices are non-negative.
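
The factorization can be inspected directly through the shapes of the two factors. This is a sketch on four toy documents with two components; `V`, `W` and `H` are conventional names chosen here for illustration, and `max_iter=500` is an arbitrary setting.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "credit card fee charge",
    "credit card charge dispute",
    "mortgage loan payment escrow",
    "mortgage loan payment rate",
]
V = TfidfVectorizer().fit_transform(docs)   # documents x terms

nmf_demo = NMF(n_components=2, random_state=40, max_iter=500)
W = nmf_demo.fit_transform(V)               # documents x topics
H = nmf_demo.components_                    # topics x terms

print(V.shape, W.shape, H.shape)
# Both factors contain only non-negative entries.
print(bool((W >= 0).all() and (H >= 0).all()))
```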

In this task you have to perform the following:

* Find the best number of clusters
* Apply the best number to create word clusters
* Inspect & validate the correctness of each cluster with respect to the complaints
* Correct the labels if needed
* Map the clusters to topics/cluster names

In [None]:
from sklearn.decomposition import NMF

## Manual Topic Modeling
You need to take a trial-and-error approach to find the best number of topics for your NMF model.

The only parameter that is required is the number of components i.e. the number of topics we want. This is the most crucial step in the whole topic modeling process and will greatly affect how good your final topics are.
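
A simple way to organise the trial and error is to fit one model per candidate topic count and record its reconstruction error. The sketch below does this on a toy corpus; the candidate values are arbitrary assumptions.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "credit card fee charge",
    "credit card charge dispute",
    "mortgage loan payment escrow",
    "mortgage loan payment rate",
    "checking account deposit branch",
    "checking account branch overdraft",
]
dtm_demo = TfidfVectorizer().fit_transform(docs)

errors = {}
for k in (2, 3, 4):
    model = NMF(n_components=k, random_state=40, max_iter=500)
    model.fit(dtm_demo)
    # Frobenius norm of the difference between the data and its reconstruction.
    errors[k] = model.reconstruction_err_
    print(k, round(errors[k], 4))
```

The error usually shrinks as components are added, so it is not a selection criterion on its own; reading the top words of each topic remains the main check.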

In [None]:
# Load your nmf_model with n_components, i.e. 5
num_topics = 5  # the value to test out; the task asks for five clusters

# Keep random_state = 40
nmf_model = NMF(n_components=num_topics, random_state=40)

In [None]:
nmf_model.fit(dtm)
# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it
len(tfidf.get_feature_names_out())

In [None]:
#Print the Top15 words for each of the topics


In [None]:
#Create the best topic for each complaint in terms of integer value 0,1,2,3 & 4



In [None]:
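# Assign the best topic to each of the complaints in the Topic column

# The best topic is the one with the largest weight in the complaint's row of the document-topic matrix
df_clean['Topic'] = nmf_model.transform(dtm).argmax(axis=1)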

In [None]:
df_clean.head()

In [None]:
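# Print the first 5 complaints for each of the topics.
# Store the sample in a separate variable so that df_clean keeps all rows for the later steps.
df_sample = df_clean.groupby('Topic').head(5)
df_sample.sort_values('Topic')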

#### After evaluating the mapping, if the topics assigned are correct then assign these names to the relevant topic:
* Bank Account services
* Credit card or prepaid card
* Theft/Dispute Reporting
* Mortgage/Loan
* Others

In [None]:
#Create the dictionary of Topic names and Topics

Topic_names = {   }
#Replace Topics with Topic Names
df_clean['Topic'] = df_clean['Topic'].map(Topic_names)

In [None]:
df_clean

## Supervised model to predict any new complaints to the relevant Topics.

You have now built the model that assigns a topic to each complaint. In the section below, you will use these topics to classify any new complaints.

Since you will be using a supervised learning technique, you have to convert the topic names to numeric labels.
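
Converting names to numbers is a dictionary lookup applied to the column. In the sketch below the topic names are the five listed earlier, while the numbering itself is an arbitrary assumption; any consistent assignment works.

```python
import pandas as pd

# Hypothetical numbering of the five topic names.
topic_to_id = {
    "Bank Account services": 0,
    "Credit card or prepaid card": 1,
    "Theft/Dispute Reporting": 2,
    "Mortgage/Loan": 3,
    "Others": 4,
}

# Toy stand-in for the Topic column of df_clean.
toy = pd.DataFrame({"Topic": ["Mortgage/Loan", "Others", "Bank Account services"]})
toy["Topic"] = toy["Topic"].map(topic_to_id)
print(toy["Topic"].tolist())
```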

In [None]:
# Create the dictionary again, this time mapping Topic names to Topic numbers

Topic_names = {   }
# Replace Topic names with Topic numbers
df_clean['Topic'] = df_clean['Topic'].map(Topic_names)

In [None]:
df_clean

In [None]:
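# Keep only the complaint text and "Topic" columns in the new dataframe --> training_data
# (df_clean holds the complaint text as "complaint_what_happened_cleaned")
training_data = df_clean[['complaint_what_happened_cleaned', 'Topic']]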

In [None]:
training_data

#### Apply the supervised models on the training data created. In this process, you have to do the following:
* Create the vector counts using CountVectorizer
* Transform the word vector to tf-idf
* Create the train & test data using train_test_split on the tf-idf & topics
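
The three steps above can be sketched as follows on a toy dataset. The texts, labels, test size and `random_state` are assumptions made for illustration, not values from the assignment.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import train_test_split

# Toy complaints with numeric topic labels (1 = card, 3 = mortgage).
texts = [
    "credit card fee charge", "card charge dispute", "card annual fee", "credit card late fee",
    "mortgage loan payment", "loan escrow payment", "mortgage rate change", "home loan rate",
]
labels = [1, 1, 1, 1, 3, 3, 3, 3]

# Step 1: raw token counts.
count_vect = CountVectorizer()
X_counts = count_vect.fit_transform(texts)

# Step 2: re-weight the counts with tf-idf.
tfidf_transformer = TfidfTransformer()
X_tfidf = tfidf_transformer.fit_transform(X_counts)

# Step 3: stratified train/test split on the tf-idf features and topic labels.
X_train, X_test, y_train, y_test = train_test_split(
    X_tfidf, labels, test_size=0.25, random_state=40, stratify=labels
)
print(X_train.shape, X_test.shape)
```

This follows the order given in the list, which fits the vectorizers on all rows before splitting. A stricter setup would fit them on the training rows only, so that no test information reaches the features.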


In [None]:

#Write your code to get the Vector count


#Write your code here to transform the word vector to tf-idf

You have to try at least 3 models on the train & test data from these options:
* Logistic regression
* Decision Tree
* Random Forest
* Naive Bayes (optional)

**Using the required evaluation metrics, judge the models you tried and select the best-performing one.**

In [None]:
# Write your code here to build any 3 models and evaluate them using the required metrics
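
A compact way to compare several classifiers is to loop over them with a shared split and collect the same metrics for each. The sketch below uses toy data, and the hyperparameters and choice of metrics (accuracy and weighted F1) are assumptions; the scores it prints say nothing about performance on the real complaints.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy complaints with numeric topic labels (1 = card, 3 = mortgage).
texts = [
    "credit card fee charge", "card charge dispute", "card annual fee", "credit card late fee",
    "mortgage loan payment", "loan escrow payment", "mortgage rate change", "home loan rate",
]
labels = [1, 1, 1, 1, 3, 3, 3, 3]

X = TfidfVectorizer().fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=40, stratify=labels
)

models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Decision tree": DecisionTreeClassifier(random_state=40),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=40),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1_weighted": f1_score(y_test, pred, average="weighted"),
    }
    print(name, scores[name])
```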



