## Objective

The goal is to identify users likely to subscribe to `Cartola Pro`, using behavioral data.
The work is split into a series of notebooks:
- **01_eda:**
    - Exploratory data analysis of user profiles, aiming to identify propensity to subscribe to Cartola
    - Data wrangling to prepare the dataset for modeling
- **02_model:**
    - Comparing different models, with hyperparameter tuning, to select the best classifier for identifying `Cartola Pro` users
    - Analyzing feature importances to classify users
- **03_app:**
    - Creating a Flask API to predict the probability that a user subscribes to Cartola

This notebook demonstrates how to run the app and call the API.

First, open a terminal and run `python 03_app.py`. The API will then be available at `http://localhost:5000`.

The API has two endpoints:
- `predict`: returns the given features along with the probability of each class, as JSON records
- `predict_csv`: triggers a download of a CSV file with the probability of each class

In [1]:
import pandas as pd
import requests

In [36]:
df = pd.read_csv('data/ge_df_users_editorias_02.zip')

To test the API, I draw a one-row sample from the dataset, drop the target column, and send it to the API as JSON.

In [37]:
sample = df.sample(1)

x_sample = sample.drop('cartola_status', axis=1)
x_sample = x_sample.to_json(orient='records')
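
To make the payload format concrete, here is a minimal sketch of what `to_json(orient='records')` produces after the target column is dropped. The toy frame is a hypothetical stand-in for the real dataset; it only borrows the `idade`, `pviews`, and `cartola_status` column names, and its values are made up for illustration.

```python
import json

import pandas as pd

# Hypothetical toy data standing in for the real dataset.
toy = pd.DataFrame({
    "idade": [23.0, 31.0],
    "pviews": [5, 12],
    "cartola_status": ["Cartola Free", "Cartola Pro"],
})

# Same steps as above: sample one row, drop the target, serialize to JSON.
toy_sample = toy.sample(1, random_state=0)
toy_features = toy_sample.drop("cartola_status", axis=1)
payload = toy_features.to_json(orient="records")

# orient='records' yields a JSON array with one object per row,
# keyed by column name; the index is not included.
records = json.loads(payload)
print(records)
```

Because the payload is a list of row objects, the same call works unchanged for a sample of several rows.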

In [38]:
url = 'http://localhost:5000/predict'
headers = {'Content-type': 'application/json'}

# Send only the features (x_sample); the target column was dropped above
r = requests.post(url=url, data=x_sample, headers=headers)

A status code of 200 indicates that the request was processed successfully.

In [39]:
r.status_code

200
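
Checking `status_code` by hand works for a single call. For repeated calls, a small helper with a timeout and `raise_for_status()` may be more convenient. The sketch below is one possible shape, not part of the original app: `post_records` is a hypothetical name, and `EchoHandler` is a stand-in server that simply echoes the payload back so the example runs without `03_app.py`.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class EchoHandler(BaseHTTPRequestHandler):
    """Stand-in for the real API: echoes the posted JSON back unchanged."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


def post_records(url, json_payload, timeout=10):
    """POST a JSON string and return the decoded response, raising on HTTP errors."""
    headers = {"Content-Type": "application/json"}
    with requests.Session() as session:
        session.trust_env = False  # local API: ignore proxy settings from the environment
        response = session.post(url, data=json_payload, headers=headers, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return response.json()


# Start the stand-in server on a free local port and call it once.
server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

result = post_records(f"http://127.0.0.1:{port}/predict", json.dumps([{"idade": 23.0}]))
print(result)

server.shutdown()
server.server_close()
```

Against the real app, the call would be `post_records('http://localhost:5000/predict', x_sample)`.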

This is the actual label of the sample:

In [40]:
sample['cartola_status']

23912    Cartola Free
Name: cartola_status, dtype: object

Here is the data returned by the API, with the probability of each class:

In [41]:
pd.DataFrame(r.json(), columns=r.json()[0].keys())

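| Unnamed: 0 | uf | idade | dias | pviews | visitas | tempo_total | futebol | futebol_intenacional | futebol_olimpico | blog_cartola | ... | home | home_olimpiadas | sexo_F | sexo_M | device_m_only | device_pc_e_m | device_pc_only | Cartola Free | Cartola Pro | Não Cartola |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 23.0 | 1 | 5 | 1 | 2366.396 | 2366.396 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0 | 1 | 0 | 0 | 1 | 0.500032 | 0.244408 | 0.25556 |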
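
To turn the returned probabilities into a single predicted label, one option is to take the class column with the highest probability. The sketch below rebuilds a trimmed version of the response shown above (only `idade` and the three class columns, with the same values); the `predicted_class` and `predicted_proba` column names are assumptions added for illustration.

```python
import pandas as pd

# The three class columns returned by the API.
class_columns = ["Cartola Free", "Cartola Pro", "Não Cartola"]

# Trimmed copy of the response above, in the same list-of-records shape.
response_records = [{
    "idade": 23.0,
    "Cartola Free": 0.500032,
    "Cartola Pro": 0.244408,
    "Não Cartola": 0.25556,
}]

predictions = pd.DataFrame(response_records)

# idxmax(axis=1) returns, per row, the name of the column with the largest value.
predictions["predicted_class"] = predictions[class_columns].idxmax(axis=1)
predictions["predicted_proba"] = predictions[class_columns].max(axis=1)

print(predictions[["predicted_class", "predicted_proba"]])
```

For this sample the highest probability belongs to `Cartola Free`, which matches the actual label shown earlier.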


We can also request the probabilities as a CSV file with the following code:

In [33]:
url = 'http://localhost:5000/predict_csv'
headers = {'Content-type': 'application/json'}

# Send only the features (x_sample); the target column was dropped above
r = requests.post(url=url, data=x_sample, headers=headers)

500
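
The cell above sends the request but does not write anything to disk. A minimal sketch of saving the result follows, assuming the endpoint returns the CSV content directly in the response body and that any status other than 200 should be treated as a failure. `save_csv_response` is a hypothetical helper, and the `SimpleNamespace` objects are stubs standing in for real `requests` responses so the example runs without the server.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_csv_response(response, path):
    """Write the response body to `path`, but only if the request succeeded."""
    if response.status_code != 200:
        raise RuntimeError(f"predict_csv returned status {response.status_code}")
    path = Path(path)
    path.write_bytes(response.content)
    return path


# Stub for a successful response (made-up CSV content for illustration).
ok_response = SimpleNamespace(
    status_code=200,
    content=b"Cartola Free,Cartola Pro\n0.5,0.2\n",
)
out_dir = Path(tempfile.mkdtemp())
saved_path = save_csv_response(ok_response, out_dir / "predictions.csv")
print(saved_path.read_text())

# Stub for a failed response: nothing is written and an error is raised.
failed_response = SimpleNamespace(status_code=500, content=b"")
try:
    save_csv_response(failed_response, out_dir / "failed.csv")
except RuntimeError as exc:
    error_message = str(exc)
    print(error_message)
```

With the real API, the call would be `save_csv_response(r, 'predictions.csv')`, where `r` is the response from the cell above.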
