## Problem Description
Imagine being hungry in an unfamiliar part of town and getting restaurant recommendations served up, based on your personal preferences, at just the right moment. The recommendation comes with an attached discount from your credit card provider for a local place around the corner!

Right now, Elo, one of the largest payment brands in Brazil, has built partnerships with merchants in order to offer promotions or discounts to cardholders. But do these promotions work for either the consumer or the merchant? Do customers enjoy their experience? Do merchants see repeat business? Personalization is key.



In [None]:
from IPython.display import Image
Image(url="https://storage.googleapis.com/kaggle-competitions/kaggle/10445/logos/thumb76_76.png?t=2018-10-24-17-14-05")

Elo has built machine learning models to understand the most important aspects and preferences in their customers’ lifecycle, from food to shopping. But so far none of them is specifically tailored for an individual or profile. This is where you come in.

In this competition, Kagglers will develop algorithms to identify and serve the most relevant opportunities to individuals, by uncovering signal in customer loyalty. Your input will improve customers’ lives and help Elo reduce unwanted campaigns, to create the right experience for customers.

### What am I predicting?
You are predicting a loyalty score for each card_id represented in test.csv and sample_submission.csv.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here are several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

import os
print(os.listdir("../input"))

# Any results you write to the current directory are saved as output.

In [None]:
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (20,10)
import seaborn as sns
from scipy.stats import norm
import warnings
warnings.filterwarnings('ignore')

### Importing the Dataset

In [None]:
train = pd.read_csv('../input/train.csv')
test = pd.read_csv('../input/test.csv')
merchants = pd.read_csv('../input/merchants.csv')
hist_tran = pd.read_csv('../input/historical_transactions.csv')
new_merc_tran = pd.read_csv('../input/new_merchant_transactions.csv')

data_dict_train=pd.read_excel('../input/Data_Dictionary.xlsx',sheet_name='train')
data_dict_hist_tran=pd.read_excel('../input/Data_Dictionary.xlsx',sheet_name='history')
data_dict_new_merc_tran=pd.read_excel('../input/Data_Dictionary.xlsx',sheet_name='new_merchant_period')
data_dict_merchants=pd.read_excel('../input/Data_Dictionary.xlsx',sheet_name='merchant')

In [None]:
train.head()

### Target Analysis

In [None]:
plt.figure(figsize = (12,7))
sns.distplot(train['target'], fit = norm);
plt.xlabel('Loyalty Score',fontsize = 14);

In [None]:
print('Skewness of Target is :',train.target.skew())
print('Kurtosis of Target is :',train.target.kurt())
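To make these two numbers easier to read, here is a minimal sketch on toy data (the values below are invented for illustration, not taken from `train`). It shows how a handful of extreme negative scores pulls the skewness negative and inflates the kurtosis, and how both shrink once those values are set aside. pandas reports excess kurtosis, so a normal distribution scores about 0.

```python
import pandas as pd

# Toy stand-in for train['target']: mostly small scores plus two extreme negatives.
# These values are invented for illustration only.
toy_target = pd.Series([-1.0, -0.5, 0.0, 0.2, 0.5, 1.0, 0.3, -0.2, -33.0, -33.0])

skewness = toy_target.skew()   # negative value: long tail on the left
kurtosis = toy_target.kurt()   # excess kurtosis: 0 for a normal distribution

# Same statistics after setting the extreme values aside
trimmed = toy_target[toy_target > -20]

print('With extreme values    : skew = %.2f, kurtosis = %.2f' % (skewness, kurtosis))
print('Without extreme values : skew = %.2f, kurtosis = %.2f' % (trimmed.skew(), trimmed.kurt()))
```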

In [None]:
plt.figure(figsize = (10,7))
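# Correlate numeric columns only; string columns such as card_id cannot be correlated
sns.heatmap(train.select_dtypes(include='number').corr(),annot = True,linewidths = 0.5,cmap='cubehelix_r');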

In [None]:
train.info()

In this section, I plot some violin plots to see how each feature behaves against our **target**.
### Violin Plot:
A violin plot is a method of plotting numeric data. It is similar to a box plot with a rotated kernel density plot on each side.
A violin plot has four layers. The **outer shape** represents all possible results, with thickness indicating how common each value is (so the thickest section marks the mode). The next layer inside represents the values that occur 95% of the time. The next layer (if it exists) represents the values that occur 50% of the time. The central dot represents the median.

Violin plots are similar to box plots, except that they also show the **probability density of the data at different values** (in the simplest case this could be a histogram). Typically violin plots will include a marker for the median of the data and a box indicating the interquartile range, as in standard box plots. Overlaid on this box plot is a kernel density estimation. Like box plots, violin plots are used to represent comparison of a variable distribution (or sample distribution) across different "categories" (for example, temperature distribution compared between day and night, or distribution of car prices compared across different car makers).

A violin plot is more informative than a plain box plot. In fact **while a box plot only shows summary statistics such as mean/median and interquartile ranges, the violin plot shows the full distribution of the data**. The difference is particularly useful when the data distribution is multimodal (more than one peak). In this case a violin plot clearly shows the presence of different peaks, their position and relative amplitude. This information could not be represented with a simple box plot which only reports summary statistics. The inner part of a violin plot usually shows the mean (or median) and the interquartile range. In other cases, when the number of samples is not too high, the inner part can show all sample points (with a dot or a line for each sample).
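The difference between the two plot types can be sketched numerically. The example below is a rough illustration, not how seaborn draws its violins: it builds a synthetic bimodal sample, computes the quartiles a box plot would report, and then a hand-written Gaussian kernel density estimate, which is the kind of curve a violin plot draws. The helper `gaussian_kde_1d`, the bandwidth of 0.4, and the sample itself are all assumptions made for this sketch.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Hand-rolled Gaussian kernel density estimate evaluated on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardised distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    # Average the kernels and rescale so the curve integrates to about 1
    return kernel.mean(axis=1) / bandwidth

# Synthetic bimodal sample: two clusters centred at -3 and +3
rng = np.random.default_rng(0)
bimodal = np.concatenate([rng.normal(-3, 0.5, 500), rng.normal(3, 0.5, 500)])

# What a box plot summarises: the quartiles
q1, median, q3 = np.percentile(bimodal, [25, 50, 75])

# What a violin plot adds: the density along the whole range
grid = np.linspace(-6, 6, 241)
density = gaussian_kde_1d(bimodal, grid, bandwidth=0.4)

left_peak = density[grid < 0].max()
right_peak = density[grid > 0].max()
valley = density[np.abs(grid) < 0.5].min()

print('Quartiles:', q1, median, q3)
print('Peak densities:', left_peak, right_peak, 'density near zero:', valley)
```

The quartiles alone give no hint that almost no data sits near zero, while the density curve shows two separate peaks with a gap between them.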

In [None]:
data_dict_train

In [None]:
plt.figure(figsize = (20,7));
plt.subplot(121)
sns.violinplot(x=train['feature_1'], y=train['target']);
plt.xlabel('feature_1',fontsize = 14);
plt.ylabel('target',fontsize = 14);
plt.subplot(122)
# Bar chart of how many cards fall into each feature_1 value
train['feature_1'].value_counts().plot(kind='bar');
plt.xlabel('feature_1',fontsize = 14);
plt.ylabel('Count',fontsize = 14);

In [None]:
plt.figure(figsize = (20,7));
plt.subplot(121)
sns.violinplot(x=train['feature_2'], y=train['target']);
plt.xlabel('feature_2',fontsize = 14);
plt.ylabel('target',fontsize = 14);
plt.subplot(122)
# Bar chart of how many cards fall into each feature_2 value
train['feature_2'].value_counts().plot(kind='bar');
plt.xlabel('feature_2',fontsize = 14);
plt.ylabel('Count',fontsize = 14);

In [None]:
plt.figure(figsize = (20,7));
plt.subplot(121)
sns.violinplot(x=train['feature_3'], y=train['target']);
plt.xlabel('feature_3',fontsize = 14);
plt.ylabel('target',fontsize = 14);
plt.subplot(122)
# Bar chart of how many cards fall into each feature_3 value
train['feature_3'].value_counts().plot(kind='bar');
plt.xlabel('feature_3',fontsize = 14);
plt.ylabel('Count',fontsize = 14);

### The graphs below show how many cards made their first purchase in each month.
#### Most first purchases came in 2016 and 2017. The number of first purchases started increasing slowly from 2011 and stayed almost constant until 2014. From 2015 onwards it keeps going up, which suggests that new customers have shown trust in the service, a point worth noting for Elo.

In [None]:
plt.figure(figsize = (20,25))
plt.subplot(211)
train['first_active_month'].value_counts().sort_index().plot(kind = 'bar');
plt.xlabel('First_active_month',fontsize = 14);
plt.ylabel('Count',fontsize = 14);
plt.title('First Active Month in Training Data',fontsize = 18);
plt.subplot(212)
test['first_active_month'].value_counts().sort_index().plot(kind = 'bar');
plt.xlabel('First_active_month',fontsize = 14);
plt.ylabel('Count',fontsize = 14);
plt.title('First Active Month in Test Data',fontsize = 18);
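The yearly trend described above can also be checked with numbers rather than by reading the bars. The sketch below runs on a toy frame and assumes `first_active_month` is stored as a `'YYYY-MM'` string; both the toy values and that format are assumptions for illustration. Missing months are dropped before counting.

```python
import pandas as pd

# Toy stand-in for train[['first_active_month']]; values are invented
toy = pd.DataFrame({
    'first_active_month': ['2016-03', '2017-11', '2017-01', '2015-07', '2017-06', None]
})

# Parse the strings into timestamps (assumed 'YYYY-MM' format); None becomes NaT
first_month = pd.to_datetime(toy['first_active_month'], format='%Y-%m')

# Count cards per year of first purchase, ignoring missing months
cards_per_year = first_month.dropna().dt.year.value_counts().sort_index()
print(cards_per_year)
```

Applied to the real data, the same three lines would use `train['first_active_month']` in place of the toy column.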

#### There are some unusual values in the target variable: outliers exist. So we'll look into the outliers of the target now.

In [None]:
train_lesser_m20 = train[train['target']<-20]
train_lesser_m20['first_active_month'].value_counts().sort_index().plot(kind = 'bar');
plt.xlabel('First active month', fontsize=15);
plt.ylabel('Number of cards', fontsize=15);
plt.title("First active month count in target less than -20",fontsize=18);

#### Cards with a target less than -20.

In [None]:
train_lesser_m20
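One possible way to carry this finding forward is to mark the extreme rows with a flag instead of only filtering them out. The sketch below uses a toy frame with invented values. The column name `is_outlier` and the constant `OUTLIER_THRESHOLD` are hypothetical names introduced here; only the cut-off of -20 comes from the cell above.

```python
import pandas as pd

# Toy stand-in for train; card ids and targets are invented
toy_train = pd.DataFrame({
    'card_id': ['C_ID_a', 'C_ID_b', 'C_ID_c', 'C_ID_d', 'C_ID_e'],
    'target': [0.39, -33.2, 1.10, -0.82, -33.2],
})

OUTLIER_THRESHOLD = -20  # same cut-off as used for train_lesser_m20

# 1 for cards whose target is below the threshold, 0 otherwise
toy_train['is_outlier'] = (toy_train['target'] < OUTLIER_THRESHOLD).astype(int)

# Share of cards that are flagged
outlier_share = toy_train['is_outlier'].mean()
print(toy_train)
print('Share of outliers:', outlier_share)
```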

### Checking for Missing Values in Training Data

In [None]:
train.isna().sum()

### Missing Values in Test Data

In [None]:
test.isna().sum()
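Raw counts are easier to judge next to the share of rows they represent. Here is a small sketch of a helper that reports both; `missing_summary` is a hypothetical name, and the frame it runs on is toy data, not the real `test` set.

```python
import pandas as pd

def missing_summary(df):
    """Return count and percentage of missing values for columns that have any."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        'missing': counts,
        'percent': 100 * counts / len(df),
    })
    # Keep only the columns with at least one missing value, largest first
    return summary[summary['missing'] > 0].sort_values('missing', ascending=False)

# Toy frame with one missing month; values are invented
toy_test = pd.DataFrame({
    'first_active_month': ['2017-04', None, '2016-01', '2017-09'],
    'feature_1': [3, 2, 5, 1],
})

summary = missing_summary(toy_test)
print(summary)
```

Calling `missing_summary(train)` or `missing_summary(test)` would give the same view for the real frames.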

In [None]:
ax = sns.FacetGrid(train, hue="feature_3", col="feature_2", margin_titles=True,
                  palette={1:"red", 0:"green"} )
ax.map(plt.scatter, "first_active_month", "target",edgecolor="w").add_legend();


#### That covers our train set and the features given there. Now, we'll move on to the historical transaction data and the new merchant transaction data.

In [None]:
hist_tran.head()

In [None]:
new_merc_tran.head()

### Information about the features in Historical Transactions

In [None]:
data_dict_hist_tran

### Information about the features in New Data

In [None]:
data_dict_new_merc_tran

In [None]:
print('Authorized Flag Y if approved, N if denied \n',hist_tran['authorized_flag'].value_counts(),
     '\n Authorized Flag Y if approved, N if denied \n',new_merc_tran['authorized_flag'].value_counts())
plt.subplot(121)
hist_tran['authorized_flag'].value_counts().plot(kind = 'bar');
plt.xlabel('Authorization',fontsize = 15);
plt.ylabel('Values',fontsize = 15);
plt.title('Authorized Flag Y if approved, N if denied \n Historical Data',fontsize = 20);
plt.subplot(122)
new_merc_tran['authorized_flag'].value_counts().plot(kind = 'bar');
plt.xlabel('Authorization',fontsize = 15);
plt.ylabel('Values',fontsize = 15);
plt.title('Authorized Flag Y if approved, N if denied \n New Data',fontsize = 20);

In [None]:
print('Historical Category 3 \n',hist_tran['category_3'].value_counts(),
     '\n New Category 3 \n',new_merc_tran['category_3'].value_counts())
plt.subplot(121)
hist_tran['category_3'].value_counts().plot(kind = 'bar');
plt.xlabel('Category 3',fontsize = 15);
plt.ylabel('Values',fontsize = 15);
plt.title('Historical Data',fontsize = 20);
plt.subplot(122)
new_merc_tran['category_3'].value_counts().plot(kind = 'bar');
plt.xlabel('Category 3',fontsize = 15);
plt.ylabel('Values',fontsize = 15);
plt.title('New Data',fontsize = 20);

In [None]:
print('Historical Category 2 \n',hist_tran['category_2'].value_counts(),
     '\n New Category 2 \n',new_merc_tran['category_2'].value_counts())
plt.subplot(121)
hist_tran['category_2'].value_counts().plot(kind = 'bar');
plt.xlabel('Category 2',fontsize = 15);
plt.ylabel('Values',fontsize = 15);
plt.title('Historical Data',fontsize = 20);
plt.subplot(122)
new_merc_tran['category_2'].value_counts().plot(kind = 'bar');
plt.xlabel('Category 2',fontsize = 15);
plt.ylabel('Values',fontsize = 15);
plt.title('New Data',fontsize = 20);

In [None]:
print('Historical Category 1 \n',hist_tran['category_1'].value_counts(),
     '\n New Category 1 \n',new_merc_tran['category_1'].value_counts())
plt.subplot(121)
hist_tran['category_1'].value_counts().plot(kind = 'bar');
plt.xlabel('Category 1',fontsize = 15);
plt.ylabel('Values',fontsize = 15);
plt.title('Historical Data',fontsize = 20);
plt.subplot(122)
new_merc_tran['category_1'].value_counts().plot(kind = 'bar');
plt.xlabel('Category 1',fontsize = 15);
plt.ylabel('Values',fontsize = 15);
plt.title('New Data',fontsize = 20);

In [None]:
print('Historical Month Lag \n''\n',hist_tran['month_lag'].value_counts(),
     '\n New Month Lag \n''\n',new_merc_tran['month_lag'].value_counts())
plt.subplot(121)
hist_tran['month_lag'].value_counts().plot(kind = 'bar');
plt.xlabel('Month Lag',fontsize = 15);
plt.ylabel('Values',fontsize = 15);
plt.title('Month lag to reference date \n Historical Data',fontsize = 15);
plt.subplot(122)
new_merc_tran['month_lag'].value_counts().plot(kind = 'bar');
plt.xlabel('Month Lag',fontsize = 15);
plt.ylabel('Values',fontsize = 15);
plt.title('Month lag to reference date \n New Data',fontsize = 15);

In [None]:
print('Historical Installments \n''\n',hist_tran['installments'].value_counts(),
     '\n New Installments \n''\n',new_merc_tran['installments'].value_counts())
plt.subplot(121)
hist_tran['installments'].value_counts().plot(kind = 'bar');
plt.title('Number of Installments of Purchase \n Historical Data',fontsize = 20);
plt.subplot(122)
new_merc_tran['installments'].value_counts().plot(kind = 'bar');
plt.title('Number of Installments of Purchase \n New Data',fontsize = 20);
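The cells above repeat the same pattern for each column: count the values in the historical data, count them in the new data, and compare. As a rough sketch, the counts can be placed side by side in one table. `compare_counts` is a hypothetical helper, and the two frames below are toy data standing in for `hist_tran` and `new_merc_tran`.

```python
import pandas as pd

def compare_counts(hist_df, new_df, column):
    """Value counts of `column` in both frames, aligned in one table."""
    counts = pd.concat(
        [hist_df[column].value_counts(), new_df[column].value_counts()],
        axis=1,
        keys=['historical', 'new'],
    )
    # A value missing from one frame shows up as NaN; report it as 0
    return counts.fillna(0).astype(int)

# Toy stand-ins for the two transaction tables; values are invented
toy_hist = pd.DataFrame({'authorized_flag': ['Y', 'Y', 'N', 'Y']})
toy_new = pd.DataFrame({'authorized_flag': ['Y', 'Y']})

counts = compare_counts(toy_hist, toy_new, 'authorized_flag')
print(counts)
```

With the real frames, `compare_counts(hist_tran, new_merc_tran, 'category_3')` would give the same table for any of the columns plotted above, and `counts.plot(kind='bar')` would draw both sets of bars on one axis.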


### Now we'll explore Merchants.csv

In [None]:
data_dict_merchants

In [None]:
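# Correlate numeric columns only; string columns cannot be correlated
sns.heatmap(merchants.select_dtypes(include='number').corr(),annot = True);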

#### From the correlation heatmap above we can deduce that the quantities of active months at the different lags are correlated with each other, and that average sales are correlated with average purchases, which is to be expected.
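Reading strong pairs off a heatmap gets harder as the number of columns grows, so it can help to list them directly. The sketch below is one way to do that. `top_correlated_pairs` and the 0.9 threshold are hypothetical choices, and the frame is synthetic: two columns borrow names from `merchants` but hold random numbers, and `unrelated` is a made-up column.

```python
import numpy as np
import pandas as pd

def top_correlated_pairs(df, threshold=0.9):
    """List column pairs whose absolute correlation exceeds `threshold`."""
    corr = df.select_dtypes(include='number').corr().abs()
    # Keep the upper triangle only, so each pair appears once and
    # the diagonal (a column with itself) is skipped
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack()
    return pairs[pairs > threshold].sort_values(ascending=False)

# Synthetic data: two nearly identical columns and one independent column
rng = np.random.default_rng(0)
base = rng.normal(size=200)
toy_merchants = pd.DataFrame({
    'active_months_lag3': base,
    'active_months_lag6': base + rng.normal(scale=0.05, size=200),
    'unrelated': rng.normal(size=200),
})

pairs = top_correlated_pairs(toy_merchants)
print(pairs)
```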

In [None]:
print('Quantity of active months within Last 3 months \n',merchants['active_months_lag3'].value_counts(),
      '\n Quantity of active months within Last 6 months \n',merchants['active_months_lag6'].value_counts(),
      '\n Quantity of active months within Last 12 months \n',merchants['active_months_lag12'].value_counts())
plt.subplot(131)
merchants['active_months_lag3'].value_counts().plot(kind = 'bar');
plt.xlabel('active_months_lag3',fontsize = 14);
plt.title('Quantity of active months within Last 3 months',fontsize = 15);
plt.subplot(132)
merchants['active_months_lag6'].value_counts().plot(kind = 'bar');
plt.xlabel('active_months_lag6',fontsize = 14);
plt.title('Quantity of active months within Last 6 months',fontsize = 15);
plt.subplot(133)
merchants['active_months_lag12'].value_counts().plot(kind = 'bar');
plt.xlabel('active_months_lag12',fontsize = 14);
plt.title('Quantity of active months within Last 12 months',fontsize = 15);

In [None]:
plt.figure(figsize = (20,7))
plt.subplot(131)
plt.scatter(merchants['avg_sales_lag3'],merchants['avg_sales_lag6'],color = 'red');
plt.title('Average Sales in Lag 3 vs Lag 6',fontsize = 15);
plt.subplot(132)
plt.scatter(merchants['avg_sales_lag3'],merchants['avg_sales_lag12'],color = 'green');
plt.title('Average Sales in Lag 3 vs Lag 12',fontsize=15);
plt.subplot(133)
plt.scatter(merchants['avg_sales_lag6'],merchants['avg_sales_lag12'],color = 'green');
plt.title('Average Sales in Lag 6 vs Lag 12',fontsize=15);

In [None]:
plt.figure(figsize = (20,5))
plt.subplot(131)
sns.distplot(merchants['avg_sales_lag3'].value_counts(),fit = norm);
plt.title('avg_sales_lag3',fontsize = 15);
plt.subplot(132)
sns.distplot(merchants['avg_sales_lag6'].value_counts(),fit = norm);
plt.title('avg_sales_lag6',fontsize = 15);
plt.subplot(133)
sns.distplot(merchants['avg_sales_lag12'].value_counts(),fit = norm);
plt.title('avg_sales_lag12',fontsize = 15);

In [None]:
# Number of transactions per card in each table
hist = hist_tran.groupby("card_id").size().reset_index(name='transactions')
new = new_merc_tran.groupby("card_id").size().reset_index(name='transactions')

#### Looking at the number of transactions per card in the historical and new merchant transactions, the maximum number of transactions by a single card is 5582 in the historical data, while in the new data it is 109.
#### The mean number of transactions per card is 89 in the historical data and 7 in the new data.

In [None]:
print('Historical Transactions:  \n',hist.describe()," \n New Transactions  \n",new.describe())
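For readers who want to see where the maximum and mean come from, here is a minimal sketch of the same per-card count on a toy table. The card ids are invented, and the real figures quoted above come from `describe()` on the full data, not from this example.

```python
import pandas as pd

# Toy stand-in for hist_tran: one row per transaction
toy_hist = pd.DataFrame({
    'card_id': ['C_ID_a', 'C_ID_a', 'C_ID_a', 'C_ID_b', 'C_ID_c', 'C_ID_c']
})

# One row per card, with the number of transactions it made
per_card = toy_hist.groupby('card_id').size().reset_index(name='transactions')

max_tx = per_card['transactions'].max()    # busiest card
mean_tx = per_card['transactions'].mean()  # average over cards

print(per_card)
print('max:', max_tx, 'mean:', mean_tx)
```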

In [None]:
plt.subplot(221)
sns.violinplot(x=hist['transactions']);
plt.xlabel('Historic Transactions',fontsize = 15);
plt.subplot(222)
sns.violinplot(x=new['transactions'],color = 'red');
plt.xlabel('New Transactions',fontsize = 15);
plt.subplot(223)
sns.distplot(hist['transactions'],fit = norm);
plt.xlabel('Historic Transactions',fontsize = 15);
plt.subplot(224)
sns.distplot(new['transactions'],fit = norm, color = 'red');
plt.xlabel('New Transactions',fontsize = 15);

#### Merging historical and new transactions into a new DataFrame, total_trans, which holds the total transactions.

In [None]:
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the rows the same way
total_trans = pd.concat([hist_tran, new_merc_tran])

In [None]:
total_trans.head()
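Once the two tables are stacked, nothing in the rows says which table each one came from. One possible refinement, sketched here on toy data, is to tag every row before combining. The `source` column and its labels are hypothetical additions, and `ignore_index=True` is used so the combined frame gets a fresh index without duplicates.

```python
import pandas as pd

# Toy stand-ins for hist_tran and new_merc_tran; values are invented
toy_hist = pd.DataFrame({'card_id': ['C_ID_a', 'C_ID_b'], 'month_lag': [-3, 0]})
toy_new = pd.DataFrame({'card_id': ['C_ID_a'], 'month_lag': [1]})

# Tag each table, then stack them with a clean 0..n-1 index
toy_total = pd.concat(
    [toy_hist.assign(source='historical'), toy_new.assign(source='new')],
    ignore_index=True,
)
print(toy_total)
```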

Reference: [SRK's Kernel](https://www.kaggle.com/sudalairajkumar/simple-exploration-notebook-elo)

### Working on the Rest.