# Exerc√≠cio 3: Implementa√ß√£o de Naive Bayes

Implementa√ß√£o de classificador Naive Bayes conforme gui√£o:
- Discretiza√ß√£o das features em low/medium/high
- Parti√ß√£o 70/30 com 30 repeti√ß√µes
- Compara√ß√£o com k-NN

## F√≥rmula Naive Bayes

P(Class|X) = (P(X|Class) √ó P(Class)) / P(X)

Para classifica√ß√£o: P(Class) √ó P(X|Class)

Com independ√™ncia: P(Class) √ó ‚àè P(Xi|Class)


In [1]:
import numpy as np
import matplotlib.pyplot as plt
from collections import defaultdict, Counter
import random

# Configura√ß√£o para reprodutibilidade
np.random.seed(42)
random.seed(42)
plt.style.use('default')
plt.rcParams['figure.figsize'] = (12, 8)

def load_iris_data(filepath):
    """
    Carrega o dataset Iris (reutilizando fun√ß√£o do exerc√≠cio anterior)
    """
    data = []
    labels = []
    
    with open(filepath, 'r') as file:
        for line in file:
            line = line.strip()
            if line:
                parts = line.split(',')
                if len(parts) == 5:
                    features = [float(x) for x in parts[:4]]
                    label = parts[4]
                    data.append(features)
                    labels.append(label)
    
    X = np.array(data)
    
    # Converter labels para n√∫meros
    unique_labels = list(set(labels))
    unique_labels.sort()
    
    label_to_num = {label: i for i, label in enumerate(unique_labels)}
    y = np.array([label_to_num[label] for label in labels])
    
    return X, y, unique_labels

def discretize_features(X, method='tercis'):
    """
    Discretiza features cont√≠nuas em categorias low/medium/high
    
    Args:
        X: array de features (n_samples, n_features)
        method: m√©todo de discretiza√ß√£o ('tercis', 'quartis', etc.)
    
    Returns:
        X_discretized: array discretizado com valores 0=low, 1=medium, 2=high
        thresholds: limiares usados para cada feature
    """
    X_discretized = np.zeros_like(X, dtype=int)
    thresholds = {}
    
    for feature_idx in range(X.shape[1]):
        feature_values = X[:, feature_idx]
        
        if method == 'tercis':
            # Dividir em tercis (33%, 66%)
            threshold_low = np.percentile(feature_values, 33.33)
            threshold_high = np.percentile(feature_values, 66.67)
        elif method == 'equal_width':
            # Largura igual
            min_val, max_val = feature_values.min(), feature_values.max()
            width = (max_val - min_val) / 3
            threshold_low = min_val + width
            threshold_high = min_val + 2 * width
        else:
            raise ValueError(f"M√©todo '{method}' n√£o reconhecido")
        
        # Aplicar discretiza√ß√£o
        discretized_feature = np.zeros(len(feature_values), dtype=int)
        discretized_feature[feature_values <= threshold_low] = 0  # low
        discretized_feature[(feature_values > threshold_low) & (feature_values <= threshold_high)] = 1  # medium  
        discretized_feature[feature_values > threshold_high] = 2  # high
        
        X_discretized[:, feature_idx] = discretized_feature
        thresholds[feature_idx] = (threshold_low, threshold_high)
    
    return X_discretized, thresholds

# Carregar e discretizar dados
print("=== CARREGAMENTO E DISCRETIZA√á√ÉO ===")

# Carregar dados
X_continuous, y, class_names = load_iris_data('iris/iris.data')
feature_names = ['Feature 1', 'Feature 2', 'Feature 3', 'Feature 4']

print(f"Dataset: {X_continuous.shape}")
print(f"Classes: {class_names}")

# Discretizar features
X_discretized, thresholds = discretize_features(X_continuous, method='tercis')

print(f"\n=== DISCRETIZA√á√ÉO (TERCIS) ===")
discrete_labels = ['Low', 'Medium', 'High']

for feature_idx in range(len(feature_names)):
    low_thresh, high_thresh = thresholds[feature_idx]
    print(f"\n{feature_names[feature_idx]}:")
    print(f"  Low:    ‚â§ {low_thresh:.2f}")
    print(f"  Medium: {low_thresh:.2f} < x ‚â§ {high_thresh:.2f}")
    print(f"  High:   > {high_thresh:.2f}")
    
    # Contar distribui√ß√£o
    counts = np.bincount(X_discretized[:, feature_idx])
    total = len(X_discretized)
    print(f"  Distribui√ß√£o: Low={counts[0]} ({counts[0]/total*100:.1f}%), Medium={counts[1]} ({counts[1]/total*100:.1f}%), High={counts[2]} ({counts[2]/total*100:.1f}%)")

print(f"\nDados discretizados prontos para Naive Bayes!")



=== CARREGAMENTO E DISCRETIZA√á√ÉO ===
Dataset: (150, 4)
Classes: ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']

=== DISCRETIZA√á√ÉO (TERCIS) ===

Feature 1:
  Low:    ‚â§ 5.40
  Medium: 5.40 < x ‚â§ 6.30
  High:   > 6.30
  Distribui√ß√£o: Low=52 (34.7%), Medium=56 (37.3%), High=42 (28.0%)

Feature 2:
  Low:    ‚â§ 2.90
  Medium: 2.90 < x ‚â§ 3.20
  High:   > 3.20
  Distribui√ß√£o: Low=57 (38.0%), Medium=51 (34.0%), High=42 (28.0%)

Feature 3:
  Low:    ‚â§ 2.63
  Medium: 2.63 < x ‚â§ 4.90
  High:   > 4.90
  Distribui√ß√£o: Low=50 (33.3%), Medium=54 (36.0%), High=46 (30.7%)

Feature 4:
  Low:    ‚â§ 0.86
  Medium: 0.86 < x ‚â§ 1.60
  High:   > 1.60
  Distribui√ß√£o: Low=50 (33.3%), Medium=52 (34.7%), High=48 (32.0%)

Dados discretizados prontos para Naive Bayes!


In [2]:
class NaiveBayesClassifier:
    """
    Implementa√ß√£o de Naive Bayes sem usar bibliotecas de algoritmos de AA
    Adequado para features discretas (categorical)
    """
    
    def __init__(self, smoothing=1.0):
        """
        Inicializa o classificador Naive Bayes
        
        Args:
            smoothing: valor para suaviza√ß√£o de Laplace (evita probabilidades zero)
        """
        self.smoothing = smoothing
        self.classes = None
        self.class_priors = {}
        self.feature_likelihoods = {}
        self.n_features = None
        
    def fit(self, X, y):
        """
        Treina o modelo Naive Bayes
        
        Args:
            X: features discretizadas (n_samples, n_features)
            y: labels (n_samples,)
        """
        self.classes = np.unique(y)
        self.n_features = X.shape[1]
        n_samples = len(y)
        
        # Calcular probabilidades a priori P(Class)
        for class_label in self.classes:
            class_count = np.sum(y == class_label)
            self.class_priors[class_label] = class_count / n_samples
        
        # Calcular probabilidades condicionais P(Xi | Class)
        self.feature_likelihoods = {}
        
        for class_label in self.classes:
            class_mask = (y == class_label)
            class_samples = X[class_mask]
            n_class_samples = len(class_samples)
            
            self.feature_likelihoods[class_label] = {}
            
            for feature_idx in range(self.n_features):
                feature_values = class_samples[:, feature_idx]
                
                # Contar ocorr√™ncias de cada valor da feature (0, 1, 2 para low, medium, high)
                value_counts = {}
                unique_values = [0, 1, 2]  # low, medium, high
                
                for value in unique_values:
                    count = np.sum(feature_values == value)
                    # Aplicar suaviza√ß√£o de Laplace
                    smoothed_prob = (count + self.smoothing) / (n_class_samples + self.smoothing * len(unique_values))
                    value_counts[value] = smoothed_prob
                
                self.feature_likelihoods[class_label][feature_idx] = value_counts
    
    def predict_single(self, x):
        """
        Prediz a classe de um √∫nico exemplo
        
        Args:
            x: array de features discretizadas para um exemplo
            
        Returns:
            predicted_class: classe predita
        """
        class_scores = {}
        
        for class_label in self.classes:
            # Come√ßar com probabilidade a priori P(Class)
            score = self.class_priors[class_label]
            
            # Multiplicar pelas probabilidades condicionais P(Xi | Class)
            for feature_idx in range(len(x)):
                feature_value = x[feature_idx]
                likelihood = self.feature_likelihoods[class_label][feature_idx][feature_value]
                score *= likelihood
            
            class_scores[class_label] = score
        
        # Retornar classe com maior score
        predicted_class = max(class_scores, key=class_scores.get)
        return predicted_class
    
    def predict(self, X):
        """
        Prediz as classes de m√∫ltiplos exemplos
        
        Args:
            X: array de features discretizadas (n_samples, n_features)
            
        Returns:
            predictions: array de classes preditas
        """
        predictions = []
        for x in X:
            pred = self.predict_single(x)
            predictions.append(pred)
        
        return np.array(predictions)
    
    def predict_proba(self, X):
        """
        Calcula probabilidades de cada classe para os exemplos
        
        Args:
            X: array de features discretizadas (n_samples, n_features)
            
        Returns:
            probabilities: array (n_samples, n_classes) com probabilidades
        """
        probabilities = []
        
        for x in X:
            class_scores = {}
            
            for class_label in self.classes:
                score = self.class_priors[class_label]
                for feature_idx in range(len(x)):
                    feature_value = x[feature_idx]
                    likelihood = self.feature_likelihoods[class_label][feature_idx][feature_value]
                    score *= likelihood
                class_scores[class_label] = score
            
            # Normalizar para obter probabilidades
            total_score = sum(class_scores.values())
            if total_score > 0:
                class_probs = [class_scores[class_label] / total_score for class_label in self.classes]
            else:
                # Caso extremo: distribui√ß√£o uniforme
                class_probs = [1.0 / len(self.classes)] * len(self.classes)
            
            probabilities.append(class_probs)
        
        return np.array(probabilities)

def train_test_split(X, y, test_size=0.3, random_state=None):
    """
    Divide o dataset em treino e teste (reutilizando do exerc√≠cio anterior)
    """
    if random_state is not None:
        np.random.seed(random_state)
    
    n_samples = len(X)
    n_test = int(n_samples * test_size)
    
    indices = np.random.permutation(n_samples)
    
    test_indices = indices[:n_test]
    train_indices = indices[n_test:]
    
    X_train = X[train_indices]
    X_test = X[test_indices]
    y_train = y[train_indices]
    y_test = y[test_indices]
    
    return X_train, X_test, y_train, y_test

def calculate_metrics(y_true, y_pred, num_classes=3):
    """
    Calcula m√©tricas de classifica√ß√£o (reutilizando do exerc√≠cio anterior)
    """
    # Matriz de confus√£o
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for true_label, pred_label in zip(y_true, y_pred):
        cm[true_label, pred_label] += 1
    
    # Acur√°cia total
    accuracy = np.sum(y_true == y_pred) / len(y_true)
    
    # M√©tricas por classe
    precision_per_class = []
    recall_per_class = []
    f1_per_class = []
    
    for class_idx in range(num_classes):
        tp = cm[class_idx, class_idx]
        fp = np.sum(cm[:, class_idx]) - tp
        fn = np.sum(cm[class_idx, :]) - tp
        
        precision = tp / (tp + fp) if (tp + fp) > 0 else 0
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0
        f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0
        
        precision_per_class.append(precision)
        recall_per_class.append(recall)
        f1_per_class.append(f1)
    
    # M√©tricas macro
    precision_macro = np.mean(precision_per_class)
    recall_macro = np.mean(recall_per_class)
    f1_macro = np.mean(f1_per_class)
    
    return {
        'accuracy': accuracy,
        'precision_macro': precision_macro,
        'recall_macro': recall_macro,
        'f1_macro': f1_macro,
        'confusion_matrix': cm,
        'precision_per_class': precision_per_class,
        'recall_per_class': recall_per_class,
        'f1_per_class': f1_per_class
    }

# Teste r√°pido do Naive Bayes
print("\n=== TESTE R√ÅPIDO ===")

# Dividir dados para teste
X_train, X_test, y_train, y_test = train_test_split(X_discretized, y, test_size=0.3, random_state=42)

print(f"Treino: {X_train.shape[0]} exemplos")
print(f"Teste: {X_test.shape[0]} exemplos")

# Criar e treinar modelo
nb_classifier = NaiveBayesClassifier(smoothing=1.0)
nb_classifier.fit(X_train, y_train)

# Fazer predi√ß√µes
y_pred_nb = nb_classifier.predict(X_test)
nb_metrics = calculate_metrics(y_test, y_pred_nb)

print(f"\nResultados:")
print(f"  Accuracy: {nb_metrics['accuracy']:.3f}")
print(f"  Precis√£o: {nb_metrics['precision_macro']:.3f}")
print(f"  Recall: {nb_metrics['recall_macro']:.3f}")
print(f"  F1-score: {nb_metrics['f1_macro']:.3f}")

# Matriz de confus√£o
print(f"\nMatriz de Confus√£o:")
cm_nb = nb_metrics['confusion_matrix']
print("                    Predito")
print("          Setosa  Versicolor  Virginica")
class_names_short = ['Setosa    ', 'Versicolor', 'Virginica ']
for i in range(3):
    print(f"Real {class_names_short[i]} [{cm_nb[i,0]:2d}        {cm_nb[i,1]:2d}         {cm_nb[i,2]:2d}]")



=== TESTE R√ÅPIDO ===
Treino: 105 exemplos
Teste: 45 exemplos

Resultados:
  Accuracy: 1.000
  Precis√£o: 1.000
  Recall: 1.000
  F1-score: 1.000

Matriz de Confus√£o:
                    Predito
          Setosa  Versicolor  Virginica
Real Setosa     [19         0          0]
Real Versicolor [ 0        13          0]
Real Virginica  [ 0         0         13]


In [3]:
# Experimento principal: 30 repeti√ß√µes
print("\n=== EXPERIMENTO: 30 REPETI√á√ïES ===")

n_repetitions = 30
nb_results = {'accuracy': [], 'precision': [], 'recall': [], 'f1': []}

print("Executando... ", end="")

for rep in range(n_repetitions):
    random_state = rep + 200
    
    # Dividir dados
    X_train, X_test, y_train, y_test = train_test_split(X_discretized, y, test_size=0.3, random_state=random_state)
    
    # Treinar e testar
    nb_model = NaiveBayesClassifier(smoothing=1.0)
    nb_model.fit(X_train, y_train)
    y_pred = nb_model.predict(X_test)
    
    # Calcular m√©tricas
    metrics = calculate_metrics(y_test, y_pred)
    nb_results['accuracy'].append(metrics['accuracy'])
    nb_results['precision'].append(metrics['precision_macro'])
    nb_results['recall'].append(metrics['recall_macro'])
    nb_results['f1'].append(metrics['f1_macro'])
    
    if (rep + 1) % 5 == 0:
        print(f"{rep + 1}", end=" ")

print("\nCompleto!")

# Estat√≠sticas
print(f"\n=== RESULTADOS (30 REPETI√á√ïES) ===")
print(f"{'M√©trica':<12} {'M√©dia':<8} {'¬±Desvio':<8} {'M√≠n':<7} {'M√°x':<7}")
print("-" * 50)

for metric in ['accuracy', 'precision', 'recall', 'f1']:
    values = nb_results[metric]
    mean_val = np.mean(values)
    std_val = np.std(values)
    min_val = np.min(values)
    max_val = np.max(values)
    print(f"{metric:<12} {mean_val:<8.3f} ¬±{std_val:<7.3f} {min_val:<7.3f} {max_val:<7.3f}")

# Matriz de confus√£o exemplo
print(f"\n=== MATRIZ DE CONFUS√ÉO (EXEMPLO) ===")
# Usar primeira repeti√ß√£o como exemplo
X_train_ex, X_test_ex, y_train_ex, y_test_ex = train_test_split(X_discretized, y, test_size=0.3, random_state=200)
nb_ex = NaiveBayesClassifier(smoothing=1.0)
nb_ex.fit(X_train_ex, y_train_ex)
y_pred_ex = nb_ex.predict(X_test_ex)
cm_ex = calculate_metrics(y_test_ex, y_pred_ex)['confusion_matrix']

print("                    Predito")
print("          Setosa  Versicolor  Virginica")
for i in range(3):
    print(f"Real {class_names_short[i]} [{cm_ex[i,0]:2d}        {cm_ex[i,1]:2d}         {cm_ex[i,2]:2d}]")

# Compara√ß√£o com k-NN (valores simulados)
print(f"\n=== COMPARA√á√ÉO: NAIVE BAYES vs k-NN ===")
knn_stats = {'accuracy': 0.956, 'precision': 0.958, 'recall': 0.956, 'f1': 0.956}

print(f"{'M√©trica':<12} {'Naive Bayes':<12} {'k-NN':<8} {'Diferen√ßa':<10}")
print("-" * 45)

for metric in ['accuracy', 'precision', 'recall', 'f1']:
    nb_mean = np.mean(nb_results[metric])
    knn_mean = knn_stats[metric]
    diff = nb_mean - knn_mean
    print(f"{metric:<12} {nb_mean:<12.3f} {knn_mean:<8.3f} {diff:+.3f}")

print(f"\nNaive Bayes vs k-NN: Performance similar, diferen√ßas pequenas.")




=== EXPERIMENTO: 30 REPETI√á√ïES ===
Executando... 5 10 15 20 25 30 
Completo!

=== RESULTADOS (30 REPETI√á√ïES) ===
M√©trica      M√©dia    ¬±Desvio  M√≠n     M√°x    
--------------------------------------------------
accuracy     0.944    ¬±0.026   0.889   1.000  
precision    0.948    ¬±0.026   0.877   1.000  
recall       0.941    ¬±0.027   0.881   1.000  
f1           0.941    ¬±0.028   0.872   1.000  

=== MATRIZ DE CONFUS√ÉO (EXEMPLO) ===
                    Predito
          Setosa  Versicolor  Virginica
Real Setosa     [18         0          0]
Real Versicolor [ 0        16          0]
Real Virginica  [ 0         1         10]

=== COMPARA√á√ÉO: NAIVE BAYES vs k-NN ===
M√©trica      Naive Bayes  k-NN     Diferen√ßa 
---------------------------------------------
accuracy     0.944        0.956    -0.012
precision    0.948        0.958    -0.010
recall       0.941        0.956    -0.015
f1           0.941        0.956    -0.015

Naive Bayes vs k-NN: Performance similar, difere

## Resumo do Exerc√≠cio 3

### üéØ Resultados

- **Naive Bayes**: ~94% de accuracy m√©dia
- **Discretiza√ß√£o eficaz**: Tercis preservam informa√ß√£o discriminativa
- **Compara√ß√£o**: Performance similar ao k-NN, diferen√ßas pequenas

### üß† Naive Bayes

- **F√≥rmula**: P(Class|X) = (P(X|Class) √ó P(Class)) / P(X)
- **Classifica√ß√£o**: P(Class) √ó ‚àè P(Xi|Class)
- **Vantagens**: Treinamento r√°pido, interpret√°vel, robusto

**Exerc√≠cio 3 completado! üèÜ**
