## Final Project Day 4: AutoGluon TabularPrediction for a Classification Problem

In this notebook, we use __AutoGluon TabularPrediction__ to predict the __isPositive__ field of our final project dataset.

* We provide two code snippets that read the training and test datasets.
* Following the notebooks from the class, implement the model, train it on the training dataset, and evaluate it on the test dataset.
* Documentation for __AutoGluon TabularPrediction__ is available here: https://autogluon.mxnet.io/tutorials/tabular_prediction/index.html

*Note: No need to incorporate the data preprocessing from previous days - __AutoGluon TabularPrediction__ handles all that!*
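
As a rough sketch of what the task above might look like, the snippet below wraps training and prediction in one function. It assumes the legacy `autogluon` 0.0.x interface (`from autogluon import TabularPrediction as task`, with `task.Dataset`, `task.fit` and `predictor.predict`), which is the version installed later in this notebook; newer releases expose a different entry point (`TabularPredictor` in `autogluon.tabular`). The names `fit_and_predict`, `accuracy` and `ag_models` are hypothetical helpers chosen for illustration, not part of the assignment.

```python
def fit_and_predict(train_df, test_df, label="isPositive", output_dir="ag_models"):
    """Train AutoGluon on train_df and predict the label for test_df.

    Assumption: legacy autogluon 0.0.x API. The import is done lazily so that
    the module is only required when this function is actually called.
    """
    from autogluon import TabularPrediction as task

    # Wrap the pandas DataFrame in AutoGluon's dataset type.
    train_data = task.Dataset(df=train_df)

    # AutoGluon handles preprocessing and model selection internally.
    predictor = task.fit(train_data=train_data, label=label,
                         output_directory=output_dir)

    # Hide the label column before predicting on the test set.
    test_features = task.Dataset(df=test_df.drop(columns=[label]))
    y_pred = predictor.predict(test_features)
    return predictor, y_pred


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true, y_pred = list(y_true), list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty inputs")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Once the datasets are loaded, one possible usage is `predictor, y_pred = fit_and_predict(df_train, df_test)` followed by `accuracy(df_test["isPositive"], y_pred)`.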

Overall dataset schema:
* __reviewText:__ Text of the review
* __summary:__ Summary of the review
* __verified:__ Whether the purchase was verified (True or False)
* __time:__ UNIX timestamp for the review
* __log_votes:__ Log-transformed vote count, log(1 + votes)
* __isPositive:__ Label of the review: 1 if the review is positive, 0 if it is negative
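
To make the __time__ and __log_votes__ fields more concrete, here is a small sketch that decodes them using only the Python standard library. The helper names `votes_from_log` and `review_date` are hypothetical, and the sketch assumes the logarithm is the natural log and that timestamps are in seconds (UTC), which is consistent with the sample values shown in this notebook.

```python
import math
from datetime import datetime, timezone


def votes_from_log(log_votes):
    """Invert log_votes = log(1 + votes) to recover the raw vote count."""
    # expm1(x) computes exp(x) - 1; rounding absorbs the truncated decimals.
    return round(math.expm1(log_votes))


def review_date(unix_seconds):
    """Convert a UNIX timestamp (seconds) to an ISO date string in UTC."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).date().isoformat()


print(votes_from_log(1.098612))   # log_votes value from the training sample
print(review_date(1361836800))    # time value from the training sample
```

Under these assumptions, a `log_votes` of 1.098612 corresponds to 2 votes, and the timestamp 1361836800 corresponds to 2013-02-26.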


### 1. Reading the datasets

We will use the __pandas__ library to read our datasets.

In [1]:
import pandas as pd

df_train = pd.read_csv('../../DATA/NLP/EMBK-NLP-FINAL-TRAIN-CSV.csv')

In [2]:
df_train.head()

| Unnamed: 0 | reviewText | summary | verified | time | log_votes | isPositive |
|---|---|---|---|---|---|---|
| 0 | PURCHASED FOR YOUNGSTER WHO\nINHERITED MY "TOO... | IDEAL FOR BEGINNER! | True | 1361836800 | 0.0 | 1.0 |
| 1 | unable to open or use | Two Stars | True | 1452643200 | 0.0 | 0.0 |
| 2 | Waste of money!!! It wouldn't load to my system. | Dont buy it! | True | 1433289600 | 0.0 | 0.0 |
| 3 | I attempted to install this OS on two differen... | I attempted to install this OS on two differen... | True | 1518912000 | 0.0 | 0.0 |
| 4 | I've spent 14 fruitless hours over the past tw... | Do NOT Download. | True | 1441929600 | 1.098612 | 0.0 |


In [3]:
import pandas as pd

df_test = pd.read_csv('../../DATA/NLP/EMBK-NLP-FINAL-TEST-CSV.csv')

In [4]:
df_test.head()

| Unnamed: 0 | reviewText | summary | verified | time | log_votes | isPositive |
|---|---|---|---|---|---|---|
| 0 | Kaspersky offers the best security for your co... | State of the art protection | True | 1465516800 | 0.0 | 1.0 |
| 1 | This Value was extremely discounted which I ap... | Quickbooks | True | 1393632000 | 0.0 | 1.0 |
| 2 | Some dufus probably got stock options by the t... | Sad | False | 1228176000 | 2.639057 | 0.0 |
| 3 | I have reviewed the software and it is beyond ... | Excellent product | True | 1402531200 | 0.0 | 1.0 |
| 4 | Plain old simple you need Anti-Virus,I have tr... | A must have | True | 1367539200 | 0.0 | 1.0 |


### 2. Set up the AutoGluon environment

In [5]:
!pip install --upgrade pip
!pip install --upgrade mxnet autogluon

import warnings
warnings.filterwarnings('ignore')


Requirement already up-to-date: pip in /home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages (20.0.2)
Requirement already up-to-date: mxnet in /home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages (1.5.1.post0)
Collecting autogluon
  Downloading autogluon-0.0.5-py3-none-any.whl (328 kB)
Collecting scikit-optimize
  Downloading scikit_optimize-0.7.2-py2.py3-none-any.whl (80 kB)
Collecting ConfigSpace<=0.4.10
  Downloading ConfigSpace-0.4.10.tar.gz (882 kB)
Collecting boto3==1.9.187
  Downloading boto3-1.9.187-py2.py3-none-any.whl (128 kB)
Collecting distributed==2.6.0
  Downloading distributed-2.6.0-py3-none-any.whl (560 kB)

  Building wheel for ConfigSpace (setup.py) ... done
  Created wheel for ConfigSpace: filename=ConfigSpace-0.4.10-cp36-cp36m-linux_x86_64.whl size=3000917 sha256=67bfd8c697881cf2f17426065b9fd179e58e30ad6fe16340098a70a1df59dbf6
  Stored in directory: /home/ec2-user/.cache/pip/wheels/70/71/a2/00ca7cb0f71294d73e8791d6fe5cd0c7401066ec3b7e1026db
  Building wheel for gluonnlp (setup.py) ... done
  Created wheel for gluonnlp: filename=gluonnlp-0.8.1-py3-none-any.whl size=289392 sha256=f6bc79e8060b6bd2f9f92c9285e4a58b7fa66785a5a86d0a1d2825046c83ad17
  Stored in directory: /home/ec2-user/.cache/pip/wheels/70/cb/1c/e6fb5e5eefcd5fe8ee2163f27c79a63c96d9a956e8d93fb496
Successfully built ConfigSpace gluonnlp
ERROR: sagemaker 1.50.5 has requirement boto3>=1.10.44, but you'll have boto3 1.9.187 which is incompatible.
ERROR: awscli 1.17.5 has requirement botocore==1.14.5, but you'll have botocore 1.12.253 which is incompatible.
ERROR: awscli 1.17.5 has req