# 🌸 Iris Flower Classification with Wang-Mendel

**Lesson 3 - Short Course on Fuzzy Inference Systems**

---

## 📋 Objective

In this notebook, we apply the **Wang-Mendel method** to classify Iris flowers. The fuzzy system is trained on all four features in `iris.data`; the 2D plots focus on the two most discriminative ones:
- **Petal Length**
- **Petal Width**

### Why Iris?
- Classic machine learning dataset (Fisher, 1936)
- 3 species: *Setosa*, *Versicolor*, *Virginica*
- 150 samples (50 per species)
- **Advantage of Wang-Mendel**: the learned rules are interpretable!
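
The interpretability comes from how Wang-Mendel builds its rule base: each training sample proposes one IF-THEN rule, and conflicting rules are resolved by keeping the one with the highest degree. The following is a minimal NumPy sketch of that idea for classification, not the `pyfuzzy-toolbox` implementation. The helper names (`memberships`, `wang_mendel_rules`), the evenly spaced triangular partitions, and the toy data are assumptions made for illustration.

```python
import numpy as np

def memberships(x, lo, hi, n_terms=3):
    """Membership of each value in n_terms evenly spaced triangles on [lo, hi].

    Hypothetical helper; assumes hi > lo. Returns shape (n_samples, n_terms).
    """
    centers = np.linspace(lo, hi, n_terms)
    step = centers[1] - centers[0]
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.clip(1.0 - np.abs(x[:, None] - centers[None, :]) / step, 0.0, 1.0)

def wang_mendel_rules(X, y, n_terms=3):
    """Return {antecedent: class}, where antecedent is a tuple of term indices."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    best = {}  # antecedent -> (rule degree, class label)
    for xi, yi in zip(X, y):
        # One row of memberships per input variable
        mu = np.stack([memberships(v, l, h, n_terms)[0] for v, l, h in zip(xi, lo, hi)])
        antecedent = tuple(int(k) for k in mu.argmax(axis=1))  # best term per input
        degree = float(mu.max(axis=1).prod())                  # product of memberships
        # Conflict resolution: keep the rule with the highest degree
        if antecedent not in best or degree > best[antecedent][0]:
            best[antecedent] = (degree, int(yi))
    return {ant: label for ant, (deg, label) in best.items()}

# Toy data: two petal-like features, with one conflicting sample (4.6, 1.45)
X_toy = [[1.0, 0.2], [1.2, 0.3], [4.5, 1.4], [4.6, 1.45], [6.0, 2.3]]
y_toy = [0, 0, 1, 2, 2]
rules = wang_mendel_rules(X_toy, y_toy)

names = ["low", "medium", "high"]
for antecedent, label in sorted(rules.items()):
    conditions = " AND ".join(f"x{i} is {names[k]}" for i, k in enumerate(antecedent))
    print(f"IF {conditions} THEN class {label}")
```

Each surviving rule reads as a plain sentence over linguistic terms, which is what makes the resulting classifier easy to inspect.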

---

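## 📚 References
- Wang, L. X., & Mendel, J. M. (1992). "Generating fuzzy rules by learning from examples." *IEEE Transactions on Systems, Man, and Cybernetics*, 22(6), 1414-1427.
- Fisher, R. A. (1936). "The use of multiple measurements in taxonomic problems." *Annals of Eugenics*, 7(2), 179-188.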

---

## 🔧 Installation

In [None]:
!pip install "pyfuzzy-toolbox[ml]" scikit-learn -q

print('✅ pyfuzzy-toolbox and scikit-learn installed successfully!')

## 📦 Imports

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import fuzzy_systems as fs
from fuzzy_systems.learning import WangMendelLearning
from fuzzy_systems.inference import MamdaniSystem
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.preprocessing import OneHotEncoder

%matplotlib inline

# Configuration
plt.rcParams['figure.figsize'] = (12, 6)
np.set_printoptions(precision=3, suppress=True)
np.random.seed(42)

print('✅ Libraries imported!')
print(f'   pyfuzzy-toolbox: {fs.__version__}')

## 📊 Step 1: Load and Explore the Iris Dataset

In [None]:
iris = load_iris()
X = iris.data  # Shape (150, 4) - 4 features
y = iris.target

feature_names = iris.feature_names  # sepal length, sepal width, petal length, petal width (cm)
class_names = iris.target_names     # ['setosa', 'versicolor', 'virginica']

# Per-feature value ranges
print('📈 Statistics:')
for i, name in enumerate(feature_names):
    print(f'   {name:20s}: [{X[:, i].min():.2f}, {X[:, i].max():.2f}]')
print()

# Class distribution
print('🌸 Class distribution:')
for i, name in enumerate(class_names):
    count = np.sum(y == i)
    print(f'   {name:12s}: {count} samples')

## 📈 Data Visualization

In [None]:
# Scatter plot colored by class (columns 2 and 3 are petal length and petal width)
plt.figure(figsize=(10, 7))

colors = ['red', 'green', 'blue']
markers = ['o', 's', '^']

for i, (name, color, marker) in enumerate(zip(class_names, colors, markers)):
    idx = y == i
    plt.scatter(X[idx, 2], X[idx, 3], 
                c=color, marker=marker, s=100, 
                label=name, alpha=0.7, edgecolors='black')

plt.xlabel('Petal Length (cm)', fontsize=12)
plt.ylabel('Petal Width (cm)', fontsize=12)
plt.title('Iris Dataset - 2D View', fontsize=14, fontweight='bold')
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print('✅ Setosa is clearly separated; versicolor and virginica overlap slightly.')

In [None]:
# ============================================================================
# 1. LOAD AND PREPARE DATA
# ============================================================================

print("="*70)
print("🌸 WANG-MENDEL CLASSIFICATION - IRIS DATASET")
print("="*70)


print(f"\n📊 Dataset Information:")
print(f"   • Samples: {X.shape[0]}")
print(f"   • Features: {X.shape[1]} ({', '.join(feature_names)})")
print(f"   • Classes: {len(class_names)} ({', '.join(class_names)})")

# One-hot encode labels
encoder = OneHotEncoder(sparse_output=False)
y_onehot = encoder.fit_transform(y.reshape(-1, 1))

print(f"\n   • Target shape: {y.shape} → One-hot: {y_onehot.shape}")

# Train-test split
X_train, X_test, y_train_onehot, y_test_onehot = train_test_split(
    X, y_onehot, test_size=0.3, random_state=42, stratify=y
)
y_train = np.argmax(y_train_onehot, axis=1)
y_test = np.argmax(y_test_onehot, axis=1)

print(f"\n   • Train samples: {X_train.shape[0]}")
print(f"   • Test samples: {X_test.shape[0]}")
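
The one-hot step matters because the fuzzy system built in the next cell has one output variable per class, so the learner needs one target column per class. The `np.argmax` calls then recover the integer labels used for the metrics. A minimal NumPy sketch of that round trip, using toy labels rather than the Iris targets:

```python
import numpy as np

labels = np.array([0, 2, 1, 2])        # toy integer class labels
n_classes = 3

onehot = np.eye(n_classes)[labels]     # row i is the one-hot vector for labels[i]
recovered = onehot.argmax(axis=1)      # inverse mapping, as done for y_train / y_test

print(onehot)
print(recovered)
```

Passing `stratify=y` to `train_test_split` keeps the class proportions the same in the training and test sets, which is useful here because each species has only 50 samples.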


In [None]:
# ============================================================================
# 2. CREATE FUZZY SYSTEM
# ============================================================================

print(f"\n{'='*70}")
print("🔧 CREATING FUZZY SYSTEM")
print("="*70)

# Create Mamdani system
sistema = MamdaniSystem(name='IrisClassifier')

# Add input variables with 3 fuzzy partitions each
n_partitions = 3
partition_names = ['low', 'medium', 'high']

for i, feature_name in enumerate(feature_names):
    # Get feature range
    x_min = float(X_train[:, i].min())
    x_max = float(X_train[:, i].max())
    margin = (x_max - x_min) * 0.05
    
    # Add input variable
    sistema.add_input(feature_name, (x_min - margin, x_max + margin))
    
    # Add fuzzy terms
    x_range = (x_max + margin) - (x_min - margin)
    step = x_range / (n_partitions - 1)
    
    for j, term_name in enumerate(partition_names):
        center = (x_min - margin) + j * step
        left = max(x_min - margin, center - step)
        right = min(x_max + margin, center + step)
        
        if j == 0:
            params = [x_min - margin, x_min - margin, center + step]
        elif j == n_partitions - 1:
            params = [center - step, x_max + margin, x_max + margin]
        else:
            params = [left, center, right]
        
        sistema.add_term(feature_name, term_name, 'triangular', params)
    
    print(f"   ✓ {feature_name}: {n_partitions} terms")

# Add output variables (one per class, binary: no/yes)
for i, class_name in enumerate(class_names):
    sistema.add_output(class_name, (0, 1))
    sistema.add_term(class_name, 'no', 'triangular', [0, 0, 1.0])
    sistema.add_term(class_name, 'yes', 'triangular', [0, 1, 1])
    print(f"   ✓ Output '{class_name}': binary (no/yes)")

print(f"\n   Total variables: {len(sistema.input_variables)} inputs, {len(sistema.output_variables)} outputs")
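
The loop above gives every input three overlapping triangles: two shoulders pinned to the ends of the padded range and one triangle centered in the middle. The sketch below reproduces those parameters for a single feature and checks that the three memberships sum to 1 across the range. The helpers `three_term_partition` and `triangular` are hypothetical local functions written for this illustration, not part of `pyfuzzy-toolbox`, and the range `[1.0, 7.0]` is made up.

```python
import numpy as np

def three_term_partition(x_min, x_max, margin_frac=0.05):
    """Triangle parameters [a, b, c] for 'low', 'medium', 'high' (hypothetical helper)."""
    margin = (x_max - x_min) * margin_frac
    lo, hi = x_min - margin, x_max + margin
    mid = (lo + hi) / 2.0
    return {
        "low": [lo, lo, mid],      # left shoulder: peak at the lower bound
        "medium": [lo, mid, hi],   # full triangle centered on the range
        "high": [mid, hi, hi],     # right shoulder: peak at the upper bound
    }

def triangular(x, a, b, c):
    """Triangular membership with feet a, c and peak b; a == b or b == c gives a shoulder."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.zeros_like(x)
    if b > a:
        rising = (x > a) & (x < b)
        mu[rising] = (x[rising] - a) / (b - a)
    if c > b:
        falling = (x > b) & (x < c)
        mu[falling] = (c - x[falling]) / (c - b)
    mu[x == b] = 1.0
    return mu

terms = three_term_partition(1.0, 7.0)
xs = np.linspace(1.0, 7.0, 13)
total = sum(triangular(xs, *params) for params in terms.values())

for name, params in terms.items():
    print(name, np.round(params, 3))
print("memberships sum to 1:", np.allclose(total, 1.0))
```

Because neighboring triangles cross at membership 0.5 and always sum to 1, every value in the range belongs to at most two terms, which keeps the number of rules fired per sample small.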

In [None]:
# ============================================================================
# 3. TRAIN WITH WANG-MENDEL
# ============================================================================

print(f"\n{'='*70}")
print("🤖 TRAINING WITH WANG-MENDEL ALGORITHM")
print("="*70)

# Create Wang-Mendel learner with scaling enabled
wm = WangMendelLearning(
    sistema, 
    X_train, 
    y_train_onehot,
    task='auto',  # Will auto-detect classification
    scale_classification=True,
    verbose_init=True
)

# Train
sistema_treinado = wm.fit(verbose=True)

# Get training statistics
stats = wm.get_training_stats()
print(f"\n📈 Training Statistics:")
print(f"   • Task type: {stats['task']}")
print(f"   • Candidate rules: {stats['candidate_rules']}")
print(f"   • Final rules: {stats['final_rules']}")
print(f"   • Conflicts resolved: {stats['conflicts_resolved']}")
print(f"   • Rule coverage: {stats['final_rules']}/{n_partitions**len(feature_names)} possible combinations")
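
The denominator in the rule-coverage line counts every way of picking one linguistic term per input. Wang-Mendel only creates rules for regions that contain training samples, so the learned rule base can be much smaller than this upper bound. A short sketch that enumerates the combinations for the configuration used here (three terms, four inputs):

```python
from itertools import product

partition_names = ['low', 'medium', 'high']
n_inputs = 4  # the four Iris features

# Every possible antecedent: one term chosen for each input
all_antecedents = list(product(partition_names, repeat=n_inputs))
n_possible = len(all_antecedents)

print(n_possible)          # equals len(partition_names) ** n_inputs
print(all_antecedents[0])  # first combination in enumeration order
```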



In [None]:
# ============================================================================
# 4. MAKE PREDICTIONS
# ============================================================================

print(f"\n{'='*70}")
print("🎯 MAKING PREDICTIONS")
print("="*70)

# Predict on test set
y_pred = wm.predict(X_test)
y_proba = wm.predict_proba(X_test)

# Calculate accuracy
accuracy_train = accuracy_score(y_train, wm.predict(X_train))
accuracy_test = accuracy_score(y_test, y_pred)

print(f"\n📊 Performance Metrics:")
print(f"   • Training accuracy: {accuracy_train:.4f} ({accuracy_train*100:.2f}%)")
print(f"   • Test accuracy: {accuracy_test:.4f} ({accuracy_test*100:.2f}%)")

# Classification report
print(f"\n📋 Classification Report:")
print(classification_report(y_test, y_pred, target_names=class_names))
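
To make the reported numbers easier to read, the sketch below computes accuracy, recall, and precision by hand from a confusion matrix, following the scikit-learn convention that rows are true classes and columns are predicted classes. The labels are toy values chosen for illustration, not results from this notebook.

```python
import numpy as np

y_true = np.array([0, 0, 1, 1, 2, 2])   # toy true labels
y_hat = np.array([0, 0, 1, 2, 2, 2])    # toy predictions (one class-1 sample missed)

n_classes = 3
cm = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm, (y_true, y_hat), 1)        # rows: true class, columns: predicted class

accuracy = np.trace(cm) / cm.sum()            # correct predictions / all predictions
recall = np.diag(cm) / cm.sum(axis=1)         # per class: correct / actually in class
precision = np.diag(cm) / cm.sum(axis=0)      # per class: correct / predicted as class

print(cm)
print(f"accuracy={accuracy:.3f}")
print("recall   ", np.round(recall, 3))
print("precision", np.round(precision, 3))
```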

In [None]:
# ============================================================================
# 5. VISUALIZATIONS
# ============================================================================

print(f"\n{'='*70}")
print("📊 GENERATING VISUALIZATIONS")
print("="*70)

fig = plt.figure(figsize=(18, 12))
gs = fig.add_gridspec(3, 3, hspace=0.3, wspace=0.3)

# ===== Confusion Matrix (without seaborn) =====
ax1 = fig.add_subplot(gs[0, 0])
cm = confusion_matrix(y_test, y_pred)

# Plot with imshow
im = ax1.imshow(cm, interpolation='nearest', cmap='Blues')
cbar = fig.colorbar(im, ax=ax1, fraction=0.046, pad=0.04)
cbar.set_label('Count', rotation=270, labelpad=15)

# Configure axes
ax1.set(xticks=np.arange(len(class_names)),
        yticks=np.arange(len(class_names)),
        xticklabels=class_names, 
        yticklabels=class_names,
        ylabel='True Label',
        xlabel='Predicted Label',
        title='Confusion Matrix')

# Annotate each cell with its count
for i in range(cm.shape[0]):
    for j in range(cm.shape[1]):
        text_color = 'white' if cm[i, j] > cm.max() / 2 else 'black'
        ax1.text(j, i, int(cm[i, j]),
                ha="center", va="center", 
                color=text_color, fontsize=12, fontweight='bold')

# ===== Probability Distribution =====
ax2 = fig.add_subplot(gs[0, 1])
for i, class_name in enumerate(class_names):
    ax2.hist(y_proba[:, i], bins=20, alpha=0.6, label=class_name, edgecolor='black')
ax2.set_xlabel('Probability', fontsize=11, fontweight='bold')
ax2.set_ylabel('Frequency', fontsize=11, fontweight='bold')
ax2.set_title('Predicted Probability Distribution', fontsize=13, fontweight='bold', pad=10)
ax2.legend()
ax2.grid(True, alpha=0.3)

# ===== Prediction Confidence =====
ax3 = fig.add_subplot(gs[0, 2])
max_proba = y_proba.max(axis=1)
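# Use a separate name so the per-class `colors` list from the scatter-plot cell is not overwritten
point_colors = ['green' if p == t else 'red' for p, t in zip(y_pred, y_test)]
ax3.scatter(range(len(y_test)), max_proba, c=point_colors, alpha=0.6, edgecolors='black')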
ax3.axhline(y=0.5, color='gray', linestyle='--', linewidth=1, alpha=0.5)
ax3.set_xlabel('Sample Index', fontsize=11, fontweight='bold')
ax3.set_ylabel('Max Probability', fontsize=11, fontweight='bold')
ax3.set_title('Prediction Confidence (Green=Correct, Red=Wrong)', 
             fontsize=13, fontweight='bold', pad=10)
ax3.grid(True, alpha=0.3)

# ===== Feature Distributions by True Class =====
for idx, feature_idx in enumerate([2, 3]):  # Petal length and width (most discriminative)
    ax = fig.add_subplot(gs[1, idx])
    
    for i, class_name in enumerate(class_names):
        mask_true = (y_test == i)
        ax.hist(X_test[mask_true, feature_idx], bins=15, alpha=0.5, 
               label=f'{class_name} (true)', edgecolor='black')
    
    ax.set_xlabel(feature_names[feature_idx], fontsize=11, fontweight='bold')
    ax.set_ylabel('Frequency', fontsize=11, fontweight='bold')
    ax.set_title(f'Distribution: {feature_names[feature_idx]}', 
                fontsize=13, fontweight='bold', pad=10)
    ax.legend(fontsize=9)
    ax.grid(True, alpha=0.3)

# ===== Misclassification Analysis =====
ax5 = fig.add_subplot(gs[1, 2])
errors = (y_pred != y_test)
error_indices = np.where(errors)[0]
if len(error_indices) > 0:
    error_proba = y_proba[error_indices]
    x_pos = np.arange(len(error_indices))
    width = 0.25
    
    for i in range(3):
        ax5.bar(x_pos + i*width, error_proba[:, i], width, 
               label=class_names[i], alpha=0.8)
    
    ax5.set_xlabel('Misclassified Sample', fontsize=11, fontweight='bold')
    ax5.set_ylabel('Class Probability', fontsize=11, fontweight='bold')
    ax5.set_title(f'Misclassified Samples ({len(error_indices)} total)', 
                 fontsize=13, fontweight='bold', pad=10)
    ax5.set_xticks(x_pos + width)
    ax5.set_xticklabels([f'S{i}' for i in error_indices], rotation=45)
    ax5.legend()
    ax5.grid(True, alpha=0.3, axis='y')
else:
    ax5.text(0.5, 0.5, '✅ Perfect Classification!\n No errors found', 
            ha='center', va='center', fontsize=14, fontweight='bold',
            transform=ax5.transAxes)
    ax5.axis('off')

# ===== Pairwise Feature Plot (Most Important Features) =====
ax6 = fig.add_subplot(gs[2, :2])
feature_x, feature_y = 2, 3  # Petal length vs width

for i, class_name in enumerate(class_names):
    # True labels
    mask_true = (y_test == i)
    ax6.scatter(X_test[mask_true, feature_x], X_test[mask_true, feature_y],
               s=100, alpha=0.3, label=f'{class_name} (true)', edgecolors='black')
    
    # Misclassified
    mask_error = (y_test == i) & (y_pred != i)
    if mask_error.sum() > 0:
        ax6.scatter(X_test[mask_error, feature_x], X_test[mask_error, feature_y],
                   s=200, marker='x', linewidths=3, color='red', 
                   label=f'{class_name} (error)')

ax6.set_xlabel(feature_names[feature_x], fontsize=11, fontweight='bold')
ax6.set_ylabel(feature_names[feature_y], fontsize=11, fontweight='bold')
ax6.set_title('Feature Space: True Labels and Misclassified Samples', fontsize=13, fontweight='bold', pad=10)
ax6.legend(fontsize=9, loc='upper left')
ax6.grid(True, alpha=0.3)

# ===== Rules Summary =====
ax7 = fig.add_subplot(gs[2, 2])
ax7.axis('off')

summary_text = f"""
WANG-MENDEL IRIS CLASSIFIER
{'='*35}

📊 DATASET
  • Total samples: {len(X)}
  • Training: {len(X_train)}
  • Testing: {len(X_test)}
  • Classes: {len(class_names)}

🔧 SYSTEM CONFIGURATION
  • Inputs: {len(feature_names)}
  • Partitions per input: {n_partitions}
  • Total possible rules: {n_partitions**len(feature_names)}
  • Generated rules: {stats['final_rules']}

🎯 PERFORMANCE
  • Train accuracy: {accuracy_train*100:.1f}%
  • Test accuracy: {accuracy_test*100:.1f}%
  • Errors: {(y_pred != y_test).sum()}/{len(y_test)}

⚙️ SCALING
  • Output scaling: ENABLED
  • Method: Structure-based
"""

ax7.text(0.05, 0.95, summary_text, transform=ax7.transAxes,
        fontsize=10, verticalalignment='top', fontfamily='monospace',
        bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.3))

plt.suptitle('Wang-Mendel Fuzzy Classification - Iris Dataset', 
            fontsize=16, fontweight='bold', y=0.98)

plt.show()

print("\n✅ Visualization complete!")


In [None]:
# ============================================================================
# 6. VISUALIZE RULES
# ============================================================================
wm.system.plot_rule_matrix()

In [None]:
# Get membership degrees for each output term
memberships = wm.predict_membership(X_test)

# For Iris classification: 3 classes × 2 terms each ('no'/'yes')
print(memberships['setosa'].shape)  # (45, 2): 45 test samples, 2 terms

# First sample
print(f"Setosa - no: {memberships['setosa'][0, 0]:.3f}, yes: {memberships['setosa'][0, 1]:.3f}")
print(f"Versicolor - no: {memberships['versicolor'][0, 0]:.3f}, yes: {memberships['versicolor'][0, 1]:.3f}")
print(f"Virginica - no: {memberships['virginica'][0, 0]:.3f}, yes: {memberships['virginica'][0, 1]:.3f}")