# Capstone: <font color = red> Think of witty title </font> 
> **Shaun Chua** 
<br>**(DSI-13)**

---

# Table of Contents: <a id="top"></a>
[**1. Problem Statement**](#1)
<br> [**2. Importing Libraries**](#2)
<br> [**3. Importing Datasets**](#3)
<br> [**4. Data Cleaning**](#4)
<br> &emsp; [4.1 Creating a Function to Preview Data](#4.1)
<br> &emsp; [4.2 Reading and Previewing Datasets](#4.2)
<br> &emsp; [4.3 Cleaning the Datasets](#4.3)
<br> &emsp;&emsp;&emsp; [4.3.1 Selecting Columns](#4.3.1)
<br> &emsp;&emsp;&emsp; [4.3.2 Combining Datasets](#4.3.2)
<br> &emsp;&emsp;&emsp; [4.3.3 Dropping Duplicates](#4.3.3)
<br> &emsp;&emsp;&emsp; [4.3.4 Resolving Missing Values for `title`](#4.3.4)
<br> &emsp;&emsp;&emsp; [4.3.5 Resolving Missing Values for `subreddit`](#4.3.5)
<br> &emsp;&emsp;&emsp; [4.3.6 Resolving Missing Values for `selftext`](#4.3.6)
<br> &emsp;&emsp;&emsp; [4.3.7 Mapping `subreddit`](#4.3.7)
<br> &emsp;&emsp;&emsp; [4.3.8 Cleaning with RegEx](#4.3.8)
<br> &emsp;&emsp;&emsp; [4.3.9 Cleaning with Stop Words](#4.3.9)
<br> &emsp;&emsp;&emsp; [4.3.10 Cleaning with Lemmatisation](#4.3.10)
<br> &emsp; [4.4 EDA: Visualisation](#4.4)
<br> &emsp;&emsp;&emsp; [4.4.1 Word Cloud](#4.4.1)
<br> &emsp;&emsp;&emsp; [4.4.2 Barh Plot](#4.4.2)
<br> [**5. Preprocessing and Modelling**](#5)
<br> &emsp; [5.1 Train Test Split](#5.1)
<br> &emsp; [5.2 MultinomialNB](#5.2)
<br> &emsp; [5.3 Logistic Regression](#5.3)
<br> &emsp; [5.4 Model Optimisation](#5.4)
<br> &emsp;&emsp;&emsp; [5.4.1 GridSearchCV](#5.4.1)
<br> &emsp;&emsp;&emsp; [5.4.2 Optimised MultinomialNB](#5.4.2)
<br> &emsp;&emsp;&emsp; [5.4.3 Optimised Logistic Regression](#5.4.3)
<br> &emsp; [5.5 Summary of Classification Metrics](#5.5)
<br> &emsp; [5.6 Fitting the Chosen Model](#5.6)
<br> &emsp; [5.7 Feature Words and Coefficients](#5.7)
<br> &emsp;&emsp;&emsp; [5.7.1 Feature Words](#5.7.1)
<br> &emsp;&emsp;&emsp; [5.7.2 Coefficients](#5.7.2)
<br> &emsp; [5.8 The ROC Curve](#5.8)
<br> [**6. Conclusion and Recommendations**](#6)
<br> [**7. Limitations**](#7)
<br> [**8. Future Directions**](#8)

# 1. Problem Statement <a id="1"></a>



## Formulating your Problem Statement

Your problem statement should be the guiding principle for your project. You can think about this as a "SMART" goal.

## **Context:**  

The inception of the Government Electronic Business Centre (GeBIZ), which standardises government tender and procurement, has significantly reduced <a href="https://opentextbc.ca/principlesofeconomics/chapter/16-1-the-problem-of-imperfect-information-and-asymmetric-information/">imperfect information and asymmetric information</a>.

As a result, education consultancies face the daunting challenge of balancing several tenets of business development, such as: 
* Outreach to educational institutions 
* Programme creation
* Programme pricing

## **Specific:** 
**What precisely do you plan to do?**
<br> Objective 1: Identify the business units/programmes that help or hurt sale price the most
<br> Objective 2: Create a model that may help predict sale price for a particular programme type
<br> Objective 3: Market Basket Analysis

**What type of model will you need to develop?**
<br> 1) Linear Regression with Regularisation (Ridge, Lasso, Elastic Net)
<br> 2) Decision Tree Regressor
<br> 3) Random Forest Regressor
<br> 4) Support Vector Regressor
<br> 5) Decision Tree Regressor with AdaBoost
<br> 6) Random Forest Regressor with AdaBoost
<br> 7) Gradient Boosting Regressor 
<br> 8) Extreme Gradient Boosting (XGBoost)

## **Measurable:** 
**What metrics will you be using to assess performance?** 
* Objective 1: Coefficient of Determination ($R^2$)
* Objective 2: RMSE or MSE
* Objective 3: TBC
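
As a rough illustration of the metrics listed above, the sketch below computes $R^2$, MSE and RMSE with NumPy. The helper name `regression_metrics` and the arrays `y_true` and `y_pred` are hypothetical, and the numbers are made up for illustration; they are not results from this project.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R^2, MSE and RMSE for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    mse = np.mean(residuals ** 2)       # mean squared error
    rmse = np.sqrt(mse)                 # same units as the target
    ss_res = np.sum(residuals ** 2)     # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1 - ss_res / ss_tot            # coefficient of determination
    return {"r2": r2, "mse": mse, "rmse": rmse}


# Hypothetical sale prices, for illustration only
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]

metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

RMSE is in the same units as sale price, which may make it easier to explain to a consultancy than MSE.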

## **Achievable:** 
**Is your project appropriately scoped?**
<br> Yes. 

**Is it too aggressive? Too easy?**
<br> Initially, I felt that simply running Linear Regression was too "simple", so I decided to add more dimensions to the project and run more models.

## **Relevant:**
**Does anyone care about this?**
<br> Education consultancies may find this insightful.

**Why should people be interested in your results?**
<br> Findings can assist an education consultancy in:
* Allocating resources to business development efforts based on data
* Predicting sale price for a particular programme
* Potential Market Basket Analysis (TBC)

**What value will the completion of your project be adding?**
<br> Education consultancies may make more targeted business development efforts with the aid of insights from data.

## **Time-bound:**
**What is your deadline?**
* Personal Deadline: 13 Apr 2020
* Official Deadline: 23 Apr 2020

# 2. Importing Libraries <a id="2"></a>

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

import time

# sklearn
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, GridSearchCV 
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier, RandomForestClassifier
from sklearn.svm import SVC


from sklearn.metrics import confusion_matrix, roc_auc_score

# Print styles
class color:
   PURPLE = '\033[95m'
   CYAN = '\033[96m'
   DARKCYAN = '\033[36m'
   BLUE = '\033[94m'
   GREEN = '\033[92m'
   YELLOW = '\033[93m'
   RED = '\033[91m'
   BOLD = '\033[1m'
   UNDERLINE = '\033[4m'
   END = '\033[0m'
    
%matplotlib inline

# 3. Importing Datasets <a id="3"></a>

In [2]:
sales2017_df = pd.read_csv("./datasets/GA Capstone Dataset 1.csv")

In [3]:
sales2018_df = pd.read_csv("./datasets/GA Capstone Dataset 2.csv")

# 4. Data Cleaning and EDA <a id="4"></a>

##### Defining a function to preview dataframes

In [14]:
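def preview(dataframe):
    """Print shape, columns, head, tail and a null summary for a dataframe."""
    # Look up the variable name bound to this dataframe in the notebook's globals
    dataframe_name = [x for x in globals() if globals()[x] is dataframe][0]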
    print(f"{dataframe_name} has shape: {dataframe.shape}.")
    
    print("")
    print(f"{dataframe_name} has the following columns: {dataframe.columns}")
    
    print("\n")
    print(f"These are the top 5 rows of {dataframe_name}:")
    display(dataframe.head())

    print("\n")
    print(f"These are the bottom 5 rows of {dataframe_name}:")
    display(dataframe.tail())
    
    print("\n")
    nulls = dataframe.isnull().sum()
    total_nulls = dataframe.isnull().sum().sum()
    if total_nulls > 0:
        print(f"{dataframe_name} has a total {total_nulls} of nulls.")
        print("\n")
        print(f"The columns in {dataframe_name} with nulls are: {list(nulls[nulls>0].index)}") 
      
        print("\n")
        print(f"The variables with nulls in {dataframe_name} are:")
        display(nulls[nulls > 0])

        print("\n")
        print(f"The top 5 variables in {dataframe_name} with the highest percentage of missing values are:")
        display(dataframe.isnull().mean().sort_values(ascending=False)[:5])

    else:
        print(f"{dataframe_name} does not contain nulls.")
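
The null counts and null percentages that `preview` prints separately can also be gathered into a single table. The sketch below is one possible helper; the name `null_summary` and the toy dataframe are hypothetical and not part of the original notebook.

```python
import numpy as np
import pandas as pd


def null_summary(dataframe):
    """Return null counts and null fractions per column, highest first,
    keeping only the columns that contain at least one null."""
    summary = pd.DataFrame({
        "null_count": dataframe.isnull().sum(),
        "null_fraction": dataframe.isnull().mean(),
    })
    summary = summary[summary["null_count"] > 0]
    return summary.sort_values("null_fraction", ascending=False)


# Toy data for illustration only
toy_df = pd.DataFrame({
    "QTY": [4.5, np.nan, np.nan, 48.0],
    "TOTAL": ["675", "$0.00", None, "0"],
    "Entity": ["A", "B", "C", "D"],
})

summary = null_summary(toy_df)
print(summary)
```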

In [15]:
# Previewing sales2017_df

preview(sales2017_df)

sales2017_df has shape: (3120, 36).

sales2017_df has the following columns: Index(['Project Code', 'Invoice Number', 'Invoice Date', 'QTY', 'UNIT PRICE',
       'TOTAL', 'Invoiced?\n[Y / N]', 'Invoice \nRemarks',
       'Reason for \nnot Invoicing', 'Entity', 'Actual Entity',
       'Consultant Name', 'SCHOOL', 'Zone', 'Programme \nName', 'UOM',
       'Projected \nAmount', 'January', 'February', 'March', 'March Hols',
       'April', 'May', 'June Hols', 'July', 'August', 'September', 'Sep Hols',
       'October', 'November', 'December', 'Payment \nReference',
       'Payment Date', 'Paid Amount', 'Outstanding\nAmount',
       'Service \nConsultant'],
      dtype='object')


These are the top 5 rows of sales2017_df:


Unnamed: 0,Project Code,Invoice Number,Invoice Date,QTY,UNIT PRICE,TOTAL,Invoiced?\n[Y / N],Invoice \nRemarks,Reason for \nnot Invoicing,Entity,...,September,Sep Hols,October,November,December,Payment \nReference,Payment Date,Paid Amount,Outstanding\nAmount,Service \nConsultant
0,VARI17001,INVVARI1702001,16-Feb-17,4.5,$150.00,675,,4.5X$150=$675,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003179999.0,07.03.17,$675.00,$0.00,
1,VARI17001,INVVARI1703001,6-Mar-17,18.0,$150.00,2700,,18 x $150=$2700,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003191063.0,23.03.17,"$2,700.00",$0.00,
2,VARI17001,INVVARI1705001,8-May-17,17.5,$150.00,2625,,17.5 x 150=2625,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003220326.0,29.05.17,"$2,625.00",$0.00,
3,VARI17001,INVVARI1706001,14-Jun-17,4.5,$150.00,675,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003240231.0,10.07.17,$675.00,$0.00,
4,VARI17002,BILLED UNDER PASSIONISTA,,48.0,$85.00,0,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,,,,$0.00,




These are the bottom 5 rows of sales2017_df:


Unnamed: 0,Project Code,Invoice Number,Invoice Date,QTY,UNIT PRICE,TOTAL,Invoiced?\n[Y / N],Invoice \nRemarks,Reason for \nnot Invoicing,Entity,...,September,Sep Hols,October,November,December,Payment \nReference,Payment Date,Paid Amount,Outstanding\nAmount,Service \nConsultant
3115,,,,,,$0.00,,,,,...,,,,,,,,,$0.00,
3116,,,,,,$0.00,,,,,...,,,,,,,,,$0.00,
3117,,,,,,$0.00,,,,,...,,,,,,,,,$0.00,
3118,,,,,,$0.00,,,,,...,,,,,,,,,$0.00,
3119,,,,,,$0.00,,,,,...,,,,,,,,,$0.00,




sales2017_df has a total 93370 of nulls.


The columns in sales2017_df with nulls are: ['Project Code', 'Invoice Number', 'Invoice Date', 'QTY', 'UNIT PRICE', 'TOTAL', 'Invoiced?\n[Y / N]', 'Invoice \nRemarks', 'Reason for \nnot Invoicing', 'Entity', 'Actual Entity', 'Consultant Name', 'SCHOOL', 'Zone', 'Programme \nName', 'UOM', 'Projected \nAmount', 'January', 'February', 'March', 'March Hols', 'April', 'May', 'June Hols', 'July', 'August', 'September', 'Sep Hols', 'October', 'November', 'December', 'Payment \nReference', 'Payment Date', 'Paid Amount', 'Outstanding\nAmount', 'Service \nConsultant']


The variables with nulls in sales2017_df are:


Project Code                   901
Invoice Number                2793
Invoice Date                  2818
QTY                           2796
UNIT PRICE                    2795
TOTAL                           22
Invoiced?\n[Y / N]            2872
Invoice \nRemarks             3081
Reason for \nnot Invoicing    3099
Entity                         908
Actual Entity                 2782
Consultant Name               2774
SCHOOL                        2775
Zone                          2782
Programme \nName              2775
UOM                           2789
Projected \nAmount               7
January                       3078
February                      3067
March                         3060
March Hols                    3117
April                         3073
May                           3047
June Hols                     3108
July                          3059
August                        3068
September                     3081
Sep Hols                      3118
October             



The top 5 variables in sales2017_df with the highest percentage of missing values are:


Sep Hols                      0.999359
March Hols                    0.999038
December                      0.997756
June Hols                     0.996154
Reason for \nnot Invoicing    0.993269
dtype: float64

In [16]:
# Previewing sales2018_df

preview(sales2018_df)

sales2018_df has shape: (3798, 36).

sales2018_df has the following columns: Index(['Project Code', 'Invoice Number', 'Invoice Date', 'QTY', 'UNIT PRICE',
       'TOTAL', 'Invoiced?\n[Y / N]', 'Invoice \nRemarks',
       'Reason for \nnot Invoicing', 'Entity', 'Actual Entity',
       'Consultant Name', 'SCHOOL', 'Zone', 'Programme \nName', 'UOM',
       'Projected \nAmount', 'January', 'February', 'March', 'March Hols',
       'April', 'May', 'June Hols', 'July', 'August', 'September', 'Sep Hols',
       'October', 'November', 'December', 'Payment \nReference',
       'Payment Date', 'Paid Amount', 'Outstanding\nAmount',
       'Service \nConsultant'],
      dtype='object')


These are the top 5 rows of sales2018_df:


Unnamed: 0,Project Code,Invoice Number,Invoice Date,QTY,UNIT PRICE,TOTAL,Invoiced?\n[Y / N],Invoice \nRemarks,Reason for \nnot Invoicing,Entity,...,September,Sep Hols,October,November,December,Payment \nReference,Payment Date,Paid Amount,Outstanding\nAmount,Service \nConsultant
0,VARI18001,,,,,$0.00,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,,,,$0.00,
1,VARI18002,,,,,$0.00,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,,,,$0.00,
2,VARI18003,,,,,$0.00,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,,,,$0.00,
3,VARI18004,,,,,$0.00,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,,,,$0.00,
4,VARI18005,,,,,$0.00,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,,,,$0.00,




These are the bottom 5 rows of sales2018_df:


Unnamed: 0,Project Code,Invoice Number,Invoice Date,QTY,UNIT PRICE,TOTAL,Invoiced?\n[Y / N],Invoice \nRemarks,Reason for \nnot Invoicing,Entity,...,September,Sep Hols,October,November,December,Payment \nReference,Payment Date,Paid Amount,Outstanding\nAmount,Service \nConsultant
3793,TL18198,,,,,,,,,,...,,,,,,,,,,
3794,TL18199,,,,,,,,,,...,,,,,,,,,,
3795,TL18200,,,,,,,,,,...,,,,,,,,,,
3796,VART18015,,,84.0,$108.00,"$9,072.00",,,,ARTELIER PTE LTD,...,,,,,,,,,,
3797,TL18014,,,2.0,"$1,000.00",,,,,,...,,,,,,,,,,




sales2018_df has a total 109756 of nulls.


The columns in sales2018_df with nulls are: ['Project Code', 'Invoice Number', 'Invoice Date', 'QTY', 'UNIT PRICE', 'TOTAL', 'Invoiced?\n[Y / N]', 'Invoice \nRemarks', 'Reason for \nnot Invoicing', 'Entity', 'Actual Entity', 'Consultant Name', 'SCHOOL', 'Zone', 'Programme \nName', 'UOM', 'Projected \nAmount', 'January', 'February', 'March', 'March Hols', 'April', 'May', 'June Hols', 'July', 'August', 'September', 'Sep Hols', 'October', 'November', 'December', 'Payment \nReference', 'Payment Date', 'Paid Amount', 'Outstanding\nAmount', 'Service \nConsultant']


The variables with nulls in sales2018_df are:


Project Code                     3
Invoice Number                3236
Invoice Date                  3247
QTY                           3238
UNIT PRICE                    3238
TOTAL                          111
Invoiced?\n[Y / N]            3471
Invoice \nRemarks             3741
Reason for \nnot Invoicing    3772
Entity                         190
Actual Entity                 3256
Consultant Name               3210
SCHOOL                        3222
Zone                          3239
Programme \nName              3224
UOM                           3276
Projected \nAmount             107
January                       3738
February                      3730
March                         3704
March Hols                    3795
April                         3738
May                           3708
June Hols                     3772
July                          3722
August                        3735
September                     3745
Sep Hols                      3793
October             



The top 5 variables in sales2018_df with the highest percentage of missing values are:


March Hols                    0.999210
Sep Hols                      0.998684
December                      0.998420
Reason for \nnot Invoicing    0.993154
June Hols                     0.993154
dtype: float64

In [21]:
# Combining sales2017_df and sales2018_df

combined_df = pd.concat([sales2017_df, sales2018_df], ignore_index=True)
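
After concatenation, no dedicated column records which file a row came from. One option, sketched below on toy data, is to tag each frame with a year column before combining. The column name `Year` and the toy frames are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

# Toy stand-ins for the two yearly sales files
toy_2017 = pd.DataFrame({"Project Code": ["VARI17001", "VARI17002"], "QTY": [4.5, 48.0]})
toy_2018 = pd.DataFrame({"Project Code": ["VARI18001"], "QTY": [84.0]})

# assign() returns a copy, so the original frames are left unchanged
tagged = pd.concat(
    [toy_2017.assign(Year=2017), toy_2018.assign(Year=2018)],
    ignore_index=True,
)
print(tagged)
```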

In [22]:
# Previewing combined_df

preview(combined_df)

combined_df has shape: (6918, 36).

combined_df has the following columns: Index(['Project Code', 'Invoice Number', 'Invoice Date', 'QTY', 'UNIT PRICE',
       'TOTAL', 'Invoiced?\n[Y / N]', 'Invoice \nRemarks',
       'Reason for \nnot Invoicing', 'Entity', 'Actual Entity',
       'Consultant Name', 'SCHOOL', 'Zone', 'Programme \nName', 'UOM',
       'Projected \nAmount', 'January', 'February', 'March', 'March Hols',
       'April', 'May', 'June Hols', 'July', 'August', 'September', 'Sep Hols',
       'October', 'November', 'December', 'Payment \nReference',
       'Payment Date', 'Paid Amount', 'Outstanding\nAmount',
       'Service \nConsultant'],
      dtype='object')


These are the top 5 rows of combined_df:


Unnamed: 0,Project Code,Invoice Number,Invoice Date,QTY,UNIT PRICE,TOTAL,Invoiced?\n[Y / N],Invoice \nRemarks,Reason for \nnot Invoicing,Entity,...,September,Sep Hols,October,November,December,Payment \nReference,Payment Date,Paid Amount,Outstanding\nAmount,Service \nConsultant
0,VARI17001,INVVARI1702001,16-Feb-17,4.5,$150.00,675,,4.5X$150=$675,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003179999.0,07.03.17,$675.00,$0.00,
1,VARI17001,INVVARI1703001,6-Mar-17,18.0,$150.00,2700,,18 x $150=$2700,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003191063.0,23.03.17,"$2,700.00",$0.00,
2,VARI17001,INVVARI1705001,8-May-17,17.5,$150.00,2625,,17.5 x 150=2625,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003220326.0,29.05.17,"$2,625.00",$0.00,
3,VARI17001,INVVARI1706001,14-Jun-17,4.5,$150.00,675,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003240231.0,10.07.17,$675.00,$0.00,
4,VARI17002,BILLED UNDER PASSIONISTA,,48.0,$85.00,0,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,,,,$0.00,




These are the bottom 5 rows of combined_df:


Unnamed: 0,Project Code,Invoice Number,Invoice Date,QTY,UNIT PRICE,TOTAL,Invoiced?\n[Y / N],Invoice \nRemarks,Reason for \nnot Invoicing,Entity,...,September,Sep Hols,October,November,December,Payment \nReference,Payment Date,Paid Amount,Outstanding\nAmount,Service \nConsultant
6913,TL18198,,,,,,,,,,...,,,,,,,,,,
6914,TL18199,,,,,,,,,,...,,,,,,,,,,
6915,TL18200,,,,,,,,,,...,,,,,,,,,,
6916,VART18015,,,84.0,$108.00,"$9,072.00",,,,ARTELIER PTE LTD,...,,,,,,,,,,
6917,TL18014,,,2.0,"$1,000.00",,,,,,...,,,,,,,,,,




combined_df has a total 203126 of nulls.


The columns in combined_df with nulls are: ['Project Code', 'Invoice Number', 'Invoice Date', 'QTY', 'UNIT PRICE', 'TOTAL', 'Invoiced?\n[Y / N]', 'Invoice \nRemarks', 'Reason for \nnot Invoicing', 'Entity', 'Actual Entity', 'Consultant Name', 'SCHOOL', 'Zone', 'Programme \nName', 'UOM', 'Projected \nAmount', 'January', 'February', 'March', 'March Hols', 'April', 'May', 'June Hols', 'July', 'August', 'September', 'Sep Hols', 'October', 'November', 'December', 'Payment \nReference', 'Payment Date', 'Paid Amount', 'Outstanding\nAmount', 'Service \nConsultant']


The variables with nulls in combined_df are:


Project Code                   904
Invoice Number                6029
Invoice Date                  6065
QTY                           6034
UNIT PRICE                    6033
TOTAL                          133
Invoiced?\n[Y / N]            6343
Invoice \nRemarks             6822
Reason for \nnot Invoicing    6871
Entity                        1098
Actual Entity                 6038
Consultant Name               5984
SCHOOL                        5997
Zone                          6021
Programme \nName              5999
UOM                           6065
Projected \nAmount             114
January                       6816
February                      6797
March                         6764
March Hols                    6912
April                         6811
May                           6755
June Hols                     6880
July                          6781
August                        6803
September                     6826
Sep Hols                      6911
October             



The top 5 variables in combined_df with the highest percentage of missing values are:


March Hols                    0.999133
Sep Hols                      0.998988
December                      0.998121
June Hols                     0.994507
Reason for \nnot Invoicing    0.993206
dtype: float64

##### Shaun:

There seem to be a lot of missing values.

Before dropping anything, I will investigate each feature individually, and determine which features are likely to provide insight, and which features are less likely to be helpful. 

One option is `df.na.drop(how="all")`, which removes a row only if all of its values are null. That is the Spark syntax from the thread below; the pandas equivalent is `df.dropna(how="all")`.

https://forums.databricks.com/questions/15146/how-to-remove-empty-rows-from-the-data-frame.html
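
To make the difference between the two row-dropping options concrete, the sketch below applies `dropna(how="all")` and `dropna(thresh=...)` to a toy frame. The data is made up, but it mimics the bottom rows of `sales2017_df`, where only `TOTAL` and `Outstanding\nAmount` hold a value (`$0.00`). The simplified column names and the threshold of 3 are arbitrary choices for this example.

```python
import numpy as np
import pandas as pd

# Toy data for illustration only
toy_df = pd.DataFrame({
    "Project Code": ["VARI17001", None, None],
    "QTY": [4.5, np.nan, np.nan],
    "TOTAL": ["675", "$0.00", None],
    "Outstanding Amount": ["$0.00", "$0.00", None],
})

# how="all" drops a row only when every value in it is null
dropped_all = toy_df.dropna(how="all")

# thresh=3 keeps a row only when it has at least 3 non-null values
dropped_thresh = toy_df.dropna(thresh=3)

print(len(toy_df), len(dropped_all), len(dropped_thresh))
```

Rows that hold nothing but a placeholder `$0.00` survive `how="all"`, so a threshold on the number of non-null values may suit this data better.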

In [12]:
sales2017_df.dropna(axis=0, thresh=10)

Unnamed: 0,Project Code,Invoice Number,Invoice Date,QTY,UNIT PRICE,TOTAL,Invoiced?\n[Y / N],Invoice \nRemarks,Reason for \nnot Invoicing,Entity,...,September,Sep Hols,October,November,December,Payment \nReference,Payment Date,Paid Amount,Outstanding\nAmount,Service \nConsultant
0,VARI17001,INVVARI1702001,16-Feb-17,4.5,$150.00,675,,4.5X$150=$675,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003179999,07.03.17,$675.00,$0.00,
1,VARI17001,INVVARI1703001,6-Mar-17,18,$150.00,2700,,18 x $150=$2700,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003191063,23.03.17,"$2,700.00",$0.00,
2,VARI17001,INVVARI1705001,8-May-17,17.5,$150.00,2625,,17.5 x 150=2625,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003220326,29.05.17,"$2,625.00",$0.00,
3,VARI17001,INVVARI1706001,14-Jun-17,4.5,$150.00,675,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,5003240231,10.07.17,$675.00,$0.00,
4,VARI17002,BILLED UNDER PASSIONISTA,,48,$85.00,0,,,,ARTELIER (INSTRUCTOR) PTE LTD,...,,,,,,,,,$0.00,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2188,VIVE17077,INVVIVE1711022,26TH NOV,217,$40.00,"$8,680.00",,,,VIVARCH ENRICHMENT PTE LTD,...,,,,"$8,680.00",,5003320855,19.12.17,"$8,680.00",$0.00,
2189,VIVE17078,INVVIVE1711016,24TH NOV,1,"$17,130.00","$17,130.00",,,,VIVARCH ENRICHMENT PTE LTD,...,,,"$17,130.00",,,5003323984,26.12.17,"$17,130.00",$0.00,
2190,VIVE17079,INVVIVE1711001,16th Nov,1,"$3,600.00","$3,600.00",,,,VIVARCH ENRICHMENT PTE LTD,...,,,"$3,192.00",,,5003314588,11.12.17,"$3,600.00",$0.00,
2191,VIVE17080,INVVIVE1711001,,1,$650.00,$650.00,,,,VIVARCH ENRICHMENT PTE LTD,...,,,,$650.00,,5003309500,30.11.17,$650.00,$0.00,
