# [Estimation of Obesity Levels Based On Eating Habits and Physical Condition](https://archive.ics.uci.edu/dataset/544/estimation+of+obesity+levels+based+on+eating+habits+and+physical+condition)

This dataset includes data for estimating obesity levels in individuals from Mexico, Peru and Colombia, based on their eating habits and physical condition.

### Dataset structure

| Target | Values |
| ------------- | ------------- |
|NObeyesdad| {Insufficient Weight, Normal Weight, Overweight Level I, Overweight Level II, Obesity Type I, Obesity Type II, Obesity Type III}| 

| Feature | Type |
| ------------- | ------------- |
|Gender|Categorical|
|Age|Continuous|
|Height|Continuous|
|Weight|Continuous|
|family_history_with_overweight|Binary|
|FAVC|Binary|
|FCVC|Integer|
|NCP|Continuous|
|CAEC|Categorical|
|SMOKE|Binary|
|CH2O|Continuous|
|SCC|Binary|
|FAF|Continuous|
|TUE|Integer|
|CALC|Categorical|
|MTRANS|Categorical|

### Importing the dependencies

In [290]:
import pandas as pd

import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

from ucimlrepo import fetch_ucirepo 
pd.options.mode.copy_on_write = True

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

### Loading the dataset

In [291]:
# fetch dataset 
dataset = fetch_ucirepo(id=544) 
  
# data (as pandas dataframes) 
X = dataset.data.features 
Y = dataset.data.targets 

# Map each class label to an integer, in order of first appearance
target_class = 'NObeyesdad'

labels = Y[target_class].unique()
for i, label in enumerate(labels):
    Y.loc[Y[target_class] == label, target_class] = i
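
The loop above numbers the classes in order of first appearance in the target column. As a minimal sketch on a toy Series (the rows are made up for illustration; only the class names come from the table above), `pandas.factorize` produces the same kind of mapping and also returns the labels needed to decode predictions later:

```python
import pandas as pd

# Toy target column (illustrative rows, not taken from the dataset).
toy = pd.Series(
    ["Normal Weight", "Obesity Type I", "Normal Weight", "Insufficient Weight"]
)

# factorize returns integer codes in order of first appearance,
# plus the unique labels so that codes can be mapped back to names.
codes, uniques = pd.factorize(toy)

print(codes.tolist())
print(list(uniques))
```

Because the integer assigned to each class depends on row order, the `labels` array should be kept around to translate predicted integers back to class names.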

### One-hot encoding of the features

In [325]:
cat_features = []
num_features = []
for feature in dataset.data.features:
    feature_info = dataset.variables.loc[dataset.variables["name"] == feature]
    if feature_info.type.values[0] == "Categorical":
        cat_features.append(feature)
    else:
        if feature_info.type.values[0] == "Binary":
            values = X[feature].unique()
            for i in range(len(values)):
                X.loc[X[feature]==values[i], feature] = 2*i -1

        X[feature] = X[feature].astype(float) # Cast to float in case the values are str
        num_features.append(feature)
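
For binary features, the cell above maps the first value encountered to -1 and the second to +1. A minimal sketch of that mapping on a toy column (the `yes`/`no` values are assumed here for illustration):

```python
import pandas as pd

# Toy binary column; the values are an assumption for this example.
toy = pd.DataFrame({"SMOKE": ["no", "yes", "no"]})

# unique() preserves order of first appearance, so index 0 -> -1, index 1 -> +1.
values = toy["SMOKE"].unique()
mapping = {value: 2 * i - 1 for i, value in enumerate(values)}

toy["SMOKE"] = toy["SMOKE"].map(mapping).astype(float)
print(mapping)
print(toy["SMOKE"].tolist())
```

Which value ends up as -1 depends on which one appears first in the column, so the sign convention can differ from one feature to another.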

In [293]:
X = pd.get_dummies(X,columns=cat_features, dtype=int)
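
To make the effect of `pd.get_dummies` concrete, here is a small sketch on a toy frame (the rows are illustrative, not taken from the dataset). Each categorical column is replaced by one 0/1 column per category, while numeric columns pass through unchanged:

```python
import pandas as pd

# Toy frame with one categorical and one numeric column (illustrative values).
toy = pd.DataFrame({
    "MTRANS": ["Walking", "Bike", "Walking"],
    "Age": [21.0, 30.0, 25.0],
})

# One new integer column per category of MTRANS; Age is left as-is.
encoded = pd.get_dummies(toy, columns=["MTRANS"], dtype=int)

print(encoded.columns.tolist())
print(encoded["MTRANS_Walking"].tolist())
```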

### Splitting into training and test sets

In [294]:
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=28)
y_true = list(y_test[target_class])

### Thermometer

In [295]:
from thermometer import Thermometer

num_bits = 10

term = Thermometer(x_train,num_bits) # Thermometer for the training data
term.encode(num_features)

term = Thermometer(x_test,num_bits) # Thermometer for the test data
term.encode(num_features)
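
The `thermometer` module is local to this project and its internals are not shown here. As a rough sketch of the general technique, thermometer encoding turns a number into a run of 1s whose length grows with the value. The function `thermometer_encode` below is hypothetical and uses evenly spaced thresholds, which may differ from what `Thermometer` actually does:

```python
import numpy as np

def thermometer_encode(values, num_bits, lo=None, hi=None):
    """Hypothetical helper: encode a 1-D array as num_bits-wide thermometer codes."""
    values = np.asarray(values, dtype=float)
    lo = values.min() if lo is None else lo
    hi = values.max() if hi is None else hi
    # num_bits thresholds evenly spaced strictly between lo and hi.
    thresholds = np.linspace(lo, hi, num_bits + 2)[1:-1]
    # Bit k is 1 when the value exceeds threshold k, so the 1s form a prefix.
    return (values[:, None] > thresholds[None, :]).astype(int)

codes = thermometer_encode([0.0, 5.0, 10.0], num_bits=4)
print(codes.tolist())
```

If thresholds are derived from each frame's own minimum and maximum (something the notebook does not show), the training and test encodings would use different scales. In this sketch, passing `lo` and `hi` computed on the training data when encoding the test data keeps the two consistent.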

### Wisard

In [320]:
from wisard import Wisard

wisard = Wisard(num_classes=len(labels),n=4)
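
The `wisard` module is also local and not shown. For orientation, the following is a minimal sketch of a WiSARD-style classifier, assuming that `n` is the number of input bits addressed by each RAM node (the notebook does not say). The class `TinyWisard` and its `input_bits` and `seed` parameters are hypothetical and are not the API of the `Wisard` class used above:

```python
import numpy as np

class TinyWisard:
    """Hypothetical sketch: one discriminator per class, each a list of RAM nodes."""

    def __init__(self, num_classes, input_bits, n, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random assignment of input bits to n-bit tuples
        # (input_bits is assumed to be a multiple of n).
        self.mapping = rng.permutation(input_bits).reshape(-1, n)
        # rams[c][j] holds the addresses written to RAM j of class c.
        self.rams = [[set() for _ in self.mapping] for _ in range(num_classes)]

    def _addresses(self, bits):
        # Each group of n bits forms the address for one RAM node.
        return [tuple(bits[idx].tolist()) for idx in self.mapping]

    def train(self, X, y):
        # Write each sample's addresses into the RAMs of its own class.
        for bits, label in zip(X, y):
            for ram, addr in zip(self.rams[label], self._addresses(bits)):
                ram.add(addr)

    def classify(self, X):
        preds = []
        for bits in X:
            addrs = self._addresses(bits)
            # Score of a class = number of its RAMs that recognise the address.
            scores = [sum(a in ram for ram, a in zip(disc, addrs)) for disc in self.rams]
            preds.append(int(np.argmax(scores)))
        return preds

# Two complementary binary patterns, one per class.
X_toy = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                  [0, 0, 0, 0, 1, 1, 1, 1]])
y_toy = [0, 1]

model = TinyWisard(num_classes=2, input_bits=8, n=4)
model.train(X_toy, y_toy)
print(model.classify(X_toy))
```

Smaller tuples tend to generalise more and larger tuples tend to memorise more, which is why the tuple size is usually treated as a hyperparameter to tune.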

### Training

In [321]:
wisard.train(x_train,y_train)

### Testing

In [322]:
predictions = wisard.classify(x_test)
print(predictions)

[5, 3, 2, 5, 0, 2, 5, 6, 1, 5, 3, 6, 0, 5, 6, 0, 2, 4, 6, 0, 0, 1, 3, 0, 3, 4, 3, 5, 5, 6, 4, 4, 5, 2, 1, 2, 1, 0, 2, 3, 0, 1, 5, 4, 0, 1, 1, 3, 1, 5, 6, 0, 1, 4, 0, 6, 4, 3, 1, 0, 2, 1, 0, 6, 1, 3, 0, 0, 1, 0, 6, 0, 1, 1, 2, 5, 0, 1, 2, 4, 4, 1, 1, 3, 1, 1, 3, 1, 3, 6, 1, 4, 4, 6, 1, 3, 4, 3, 1, 5, 4, 1, 1, 1, 5, 0, 1, 6, 3, 3, 4, 3, 0, 1, 0, 0, 0, 5, 3, 5, 5, 6, 0, 1, 0, 6, 4, 2, 5, 4, 5, 6, 2, 4, 5, 1, 2, 5, 6, 3, 5, 2, 3, 4, 4, 5, 1, 6, 1, 0, 1, 1, 0, 6, 5, 2, 1, 5, 3, 1, 5, 1, 4, 0, 0, 2, 4, 0, 6, 6, 3, 1, 2, 6, 5, 0, 4, 0, 3, 4, 0, 6, 2, 1, 6, 2, 0, 0, 6, 6, 0, 0, 1, 1, 1, 5, 6, 0, 5, 1, 3, 0, 0, 3, 1, 1, 0, 0, 3, 5, 5, 6, 0, 1, 0, 5, 1, 3, 3, 6, 2, 0, 1, 3, 2, 6, 0, 6, 3, 2, 3, 3, 5, 4, 3, 6, 5, 3, 3, 6, 4, 0, 1, 0, 5, 1, 3, 6, 0, 5, 4, 6, 3, 2, 5, 4, 6, 6, 1, 0, 0, 0, 2, 0, 1, 0, 0, 0, 1, 2, 3, 4, 0, 2, 5, 0, 6, 0, 5, 4, 1, 6, 0, 4, 5, 0, 3, 2, 6, 0, 0, 6, 3, 0, 1, 0, 0, 3, 1, 2, 0, 1, 0, 2, 5, 0, 3, 5, 2, 6, 3, 1, 2, 3, 0, 5, 5, 2, 0, 6, 0, 1, 6, 2, 0, 2, 2, 3, 2, 5, 6, 2, 3, 

In [323]:
print(f'Accuracy: {accuracy_score(y_true, predictions) * 100:.2f}%')
print(f'Precision: {precision_score(y_true, predictions, average="macro", zero_division=np.nan) * 100:.2f}%')
print(f'Recall: {recall_score(y_true, predictions, average="macro") * 100:.2f}%')
print(f'F1 score: {f1_score(y_true, predictions, average="macro") * 100:.2f}%')

Accuracy: 74.13%
Precision: 77.55%
Recall: 75.25%
F1 score: 75.05%
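
Precision, recall and F1 are macro-averaged here: each metric is computed per class and the per-class values are averaged with equal weight, regardless of class size. A small sketch of macro recall on toy labels (made up for illustration), computed by hand with NumPy:

```python
import numpy as np

# Toy labels (illustrative only).
y_true_toy = np.array([0, 0, 1, 1, 2, 2])
y_pred_toy = np.array([0, 1, 1, 1, 2, 0])

# Recall for class c = fraction of samples of class c that were predicted as c.
per_class_recall = [
    float(np.mean(y_pred_toy[y_true_toy == c] == c)) for c in np.unique(y_true_toy)
]

# Macro recall = unweighted mean of the per-class recalls.
macro_recall = float(np.mean(per_class_recall))

print(per_class_recall)
print(round(macro_recall, 4))
```

This is the quantity that `recall_score(..., average="macro")` reports. With seven classes, a macro average makes poor performance on a small class visible instead of letting the larger classes hide it.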
