# Automatic Ticket Classification
Eeshan Gupta  
eeshangpt@gmail.com

## Introduction to Problem Statement

For a financial company, customer complaints matter because they often reveal shortcomings in its products and services. Resolving these complaints efficiently and on time keeps customer dissatisfaction to a minimum and strengthens customer loyalty. Complaints also show the company where to keep improving its services to attract more customers.

### Business goal

You need to build a model that classifies customer complaints by the product or service they concern. Doing so lets you route each ticket to its relevant category and, in turn, helps resolve the issue quickly.

## Table of contents

1. [Introduction to problem statement](#Introduction-to-Problem-Statement)
2. [Reading the data](#Reading-the-data)
3. [Cleaning the data](#)
4. [Pre-processing the data](#)
5. [Data Visualization](#)
6. [Feature Engineering](#)
7. [Model Building](#)
8. [Inferences from the model](#)

## Reading the data

### Installations and Imports

In [1]:
import json
import os
import pickle

In [2]:
import pandas as pd

In [3]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 10)

In [4]:
PRJ_DIR = os.getcwd()
DATA_DIR = os.path.join(PRJ_DIR, 'data')

In [5]:
file_name = 'complaints-2021-05-14_08_16.json'
pkl_file_name = file_name + ".pkl"

In [6]:
pkl_path = os.path.join(DATA_DIR, pkl_file_name)
if os.path.isfile(pkl_path):
    # Reuse the cached pickle to avoid re-parsing the raw JSON.
    print("Pickle found. Now loading...")
    with open(pkl_path, 'rb') as f:
        data = pickle.load(f)
else:
    # No cache yet: parse the raw JSON once, then pickle it for later runs.
    print("Serialized file not found. Now reading the raw file....")
    with open(os.path.join(DATA_DIR, file_name)) as f:
        data = json.load(f)
    print("Raw file is read. Now pickling.....")
    with open(pkl_path, 'wb') as f:
        pickle.dump(data, f)

Pickle found. Now loading...


In [7]:
df = pd.json_normalize(data)
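
`pd.json_normalize` flattens nested dictionaries into dot-separated column names, which is where the `_source.*` columns in the output further on come from. A minimal sketch on a hypothetical two-record list (the records and the `toy_*` names are made up for illustration and are not taken from the complaints file):

```python
import pandas as pd

# Hypothetical records that mimic the nested shape of the raw export.
toy_records = [
    {"_id": "1", "_source": {"product": "Mortgage", "state": "AL"}},
    {"_id": "2", "_source": {"product": "Credit card", "state": "CO"}},
]

# Keys nested under "_source" are joined to the parent key with "." by default.
toy_df = pd.json_normalize(toy_records)
print(list(toy_df.columns))
```

Each nested key becomes its own column, so a record with one nested `_source` dictionary yields one row with `_source.product` and `_source.state` alongside the top-level `_id`.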

In [8]:
df.sample(10)

Unnamed: 0,_index,_type,_id,_score,_source.tags,_source.zip_code,_source.complaint_id,_source.issue,_source.date_received,_source.state,_source.consumer_disputed,_source.product,_source.company_response,_source.company,_source.submitted_via,_source.date_sent_to_company,_source.company_public_response,_source.sub_product,_source.timely,_source.complaint_what_happened,_source.sub_issue,_source.consumer_consent_provided
50360,complaint-public-v2,complaint,1251110,0.0,,10475,1251110,False statements or representation,2015-02-22T12:00:00-05:00,NY,No,Debt collection,Closed with explanation,JPMORGAN CHASE & CO.,Web,2015-02-22T12:00:00-05:00,,Credit card,Yes,,Attempted to collect wrong amount,
50336,complaint-public-v2,complaint,1573338,0.0,,10312,1573338,"Account opening, closing, or management",2015-09-21T12:00:00-05:00,NY,No,Bank account or service,Closed with explanation,JPMORGAN CHASE & CO.,Referral,2015-09-24T12:00:00-05:00,,Checking account,Yes,,,
30186,complaint-public-v2,complaint,3320692,0.0,,750XX,3320692,Managing an account,2019-07-27T12:00:00-05:00,TX,,Checking or savings account,Closed with explanation,JPMORGAN CHASE & CO.,Web,2019-07-27T12:00:00-05:00,,Checking account,Yes,I had an account with an attached student acco...,Deposits and withdrawals,Consent provided
41782,complaint-public-v2,complaint,2258503,0.0,,,2258503,Closing/Cancelling account,2016-12-21T12:00:00-05:00,CO,Yes,Credit card,Closed with explanation,JPMORGAN CHASE & CO.,Web,2016-12-21T12:00:00-05:00,,,Yes,I have a Chase Credit Card that is co-branded ...,,Consent provided
39217,complaint-public-v2,complaint,1855144,0.0,Older American,91607,1855144,Using a debit or ATM card,2016-03-29T12:00:00-05:00,CA,Yes,Bank account or service,Closed with explanation,JPMORGAN CHASE & CO.,Phone,2016-03-29T12:00:00-05:00,,Checking account,Yes,,,
18733,complaint-public-v2,complaint,3563777,0.0,Older American,46234,3563777,Managing an account,2020-03-12T12:00:00-05:00,IN,,Checking or savings account,Closed with explanation,JPMORGAN CHASE & CO.,Fax,2020-03-12T12:00:00-05:00,,Checking account,Yes,,Deposits and withdrawals,
71906,complaint-public-v2,complaint,1313198,0.0,Servicemember,,1313198,"Loan modification,collection,foreclosure",2015-04-02T12:00:00-05:00,AL,No,Mortgage,Closed with explanation,JPMORGAN CHASE & CO.,Web,2015-04-02T12:00:00-05:00,,Conventional adjustable mortgage (ARM),Yes,I applied for a modification review with Chase...,,Consent provided
14313,complaint-public-v2,complaint,3254261,0.0,,917XX,3254261,Managing an account,2019-05-27T12:00:00-05:00,CA,,Checking or savings account,Closed with monetary relief,JPMORGAN CHASE & CO.,Web,2019-05-27T12:00:00-05:00,,Checking account,Yes,I have been a loyal Chase bank customer since ...,Banking errors,Consent provided
17139,complaint-public-v2,complaint,2555349,0.0,,981XX,2555349,Unexpected or other fees,2017-06-21T12:00:00-05:00,WA,,"Money transfer, virtual currency, or money ser...",Closed with explanation,JPMORGAN CHASE & CO.,Web,2017-06-21T12:00:00-05:00,,Check cashing service,Yes,I was given a check from a friend for {$85.00}...,,Consent provided
11397,complaint-public-v2,complaint,3788487,0.0,Older American,77380,3788487,Problem with a credit reporting company's inve...,2020-08-10T12:00:00-05:00,TX,,"Credit reporting, credit repair services, or o...",Closed with explanation,JPMORGAN CHASE & CO.,Web,2020-08-10T12:00:00-05:00,,Credit reporting,Yes,,Difficulty submitting a dispute or getting inf...,Consent not provided


In [9]:
df.columns

Index(['_index', '_type', '_id', '_score', '_source.tags', '_source.zip_code',
       '_source.complaint_id', '_source.issue', '_source.date_received',
       '_source.state', '_source.consumer_disputed', '_source.product',
       '_source.company_response', '_source.company', '_source.submitted_via',
       '_source.date_sent_to_company', '_source.company_public_response',
       '_source.sub_product', '_source.timely',
       '_source.complaint_what_happened', '_source.sub_issue',
       '_source.consumer_consent_provided'],
      dtype='object')
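
The `_source.` prefix and leading underscores add noise to every column reference. One possible clean-up, shown as a sketch on a small stand-in frame rather than as the step this notebook takes, is to strip them with `Index.str.replace`. The `toy_df` name and its values are assumptions made for illustration.

```python
import pandas as pd

# Stand-in frame with a few of the column names listed above.
toy_df = pd.DataFrame(
    {
        "_id": ["1"],
        "_source.product": ["Mortgage"],
        "_source.complaint_what_happened": ["sample text"],
    }
)

# Drop the "_source." prefix first, then any remaining leading underscore.
toy_df.columns = (
    toy_df.columns
    .str.replace(r"^_source\.", "", regex=True)
    .str.replace(r"^_", "", regex=True)
)
print(list(toy_df.columns))
```

Anchoring both patterns with `^` keeps underscores inside names such as `complaint_what_happened` untouched.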

In [11]:
df['_source.complaint_what_happened'].

0
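
Several of the sampled rows above appear to have a blank `_source.complaint_what_happened` cell, and pandas does not treat a blank string as a missing value. A hedged sketch on a made-up Series (not the real column) of how the two cases can be counted separately, and how blanks could be converted to `NaN` if they should count as missing:

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the complaint text column.
text = pd.Series(["I have a Chase credit card", "", "   ", None])

# isna() counts only None/NaN entries, not empty strings.
n_missing = int(text.isna().sum())

# Empty or whitespace-only strings are counted separately.
n_blank = int((text.str.strip() == "").sum())

# Convert blank strings to NaN so both cases are treated as missing.
cleaned = text.replace(r"^\s*$", np.nan, regex=True)
n_missing_after = int(cleaned.isna().sum())

print(n_missing, n_blank, n_missing_after)
```

Whether blank complaints should be dropped or kept depends on the later modelling steps; this sketch only makes the distinction visible.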