# Quality Metrics Analysis - 75QUA

This notebook analyzes the CK metrics and the bugs detected by SpotBugs across multiple releases of a Java project.

## ⚠️ IMPORTANT: Run the Analysis First!

```bash
make analyze REPO=jhy/jsoup
# OR
make analyze-limit REPO=jhy/jsoup LIMIT=5
```

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from pathlib import Path
import json
import xml.etree.ElementTree as ET

# Plot style configuration
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")
%matplotlib inline

## 1. Configuration

In [None]:
# CONFIGURATION - change this to your project's name
PROJECT_NAME = "jsoup"
RESULTS_DIR = Path(f"/workspace/results/{PROJECT_NAME}")

print(f"Analyzing project: {PROJECT_NAME}")
print(f"Results directory: {RESULTS_DIR}")
print(f"Directory exists: {RESULTS_DIR.exists()}")

# Default to an empty list so later cells still run when the directory is missing
release_dirs = []
if RESULTS_DIR.exists():
    release_dirs = sorted(d for d in RESULTS_DIR.glob('*') if d.is_dir() and not d.name.startswith('.'))
    print(f"✓ Found {len(release_dirs)} releases")
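
The configuration cell orders release directories with `sorted()`, which compares names as strings, so a release such as `1.10.1` would be placed before `1.9.2`. Every "first vs. last release" comparison later in the notebook depends on that order. The following is a minimal sketch of a version-aware sort key; the helper name `version_key` and the sample directory names are hypothetical, and it assumes release names contain dot-separated numeric components.

```python
import re

def version_key(name):
    """Sort key that compares the numeric parts of a release name as integers."""
    # "jsoup-1.10.1" -> (1, 10, 1); non-numeric text is ignored
    return tuple(int(part) for part in re.findall(r'\d+', name))

# Made-up directory names, used only to illustrate the ordering
names = ["jsoup-1.10.1", "jsoup-1.9.2", "jsoup-1.2.3"]
ordered = sorted(names, key=version_key)
print(ordered)
```

If this fits your naming scheme, it could be passed to the existing call as `key=lambda d: version_key(d.name)`. Note that `groupby('release')` also sorts its labels as strings by default, so the per-release tables may need `sort=False` to keep the order of appearance.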

## 2. Load CK Metrics

In [None]:
all_metrics = []

for release_dir in release_dirs:
    class_csv = release_dir / 'ck' / 'class.csv'
    
    if class_csv.exists():
        df = pd.read_csv(class_csv)
        df['release'] = release_dir.name
        
        metadata_file = release_dir / 'metadata.json'
        if metadata_file.exists():
            with open(metadata_file) as f:
                metadata = json.load(f)
                df['release_date'] = metadata.get('published_date', '')
        
        all_metrics.append(df)
        print(f"✓ {release_dir.name}: {len(df)} classes")

if all_metrics:
    df_all = pd.concat(all_metrics, ignore_index=True)
    print(f"\n✓ Total classes: {len(df_all)}")
    print(f"✓ Releases: {df_all['release'].nunique()}")
else:
    df_all = pd.DataFrame()

In [None]:
# Inspect the structure of the data
df_all.head()

## 3. Descriptive Statistics per Release

In [None]:
if not df_all.empty:
    metrics_by_release = df_all.groupby('release').agg({
        'wmc': ['mean', 'median', 'std', 'max'],
        'dit': ['mean', 'median', 'std', 'max'],
        'noc': ['mean', 'median', 'std', 'max'],
        'cbo': ['mean', 'median', 'std', 'max'],
        'lcom': ['mean', 'median', 'std', 'max'],
        'rfc': ['mean', 'median', 'std', 'max'],
        'loc': ['sum', 'mean', 'median', 'std']
    }).round(2)
    
    display(metrics_by_release)

## 4. Visualization - Metric Evolution

In [None]:
if not df_all.empty:
    # (metric, statistic, title, y-label, marker, color) for each panel
    panels = [
        ('wmc', 'mean', 'WMC - Complexity', 'Mean WMC', 'o', 'blue'),
        ('dit', 'mean', 'DIT - Inheritance', 'Mean DIT', 's', 'orange'),
        ('noc', 'mean', 'NOC - Children', 'Mean NOC', '^', 'brown'),
        ('cbo', 'mean', 'CBO - Coupling', 'Mean CBO', 'D', 'green'),
        ('lcom', 'mean', 'LCOM - Cohesion', 'Mean LCOM', 'v', 'red'),
        ('rfc', 'mean', 'RFC - Response', 'Mean RFC', '*', 'cyan'),
        ('loc', 'sum', 'LOC - Lines (Total)', 'Total LOC', 'p', 'purple'),
    ]

    fig = plt.figure(figsize=(18, 12))
    fig.suptitle('Evolution of the 7 CK Metrics', fontsize=16, fontweight='bold')

    # One subplot per metric, laid out on a 3x3 grid
    for i, (metric, stat, title, ylabel, marker, color) in enumerate(panels, 1):
        ax = plt.subplot(3, 3, i)
        metrics_by_release[(metric, stat)].plot(ax=ax, marker=marker, color=color, linewidth=2)
        ax.set_title(title)
        ax.set_ylabel(ylabel)
        ax.tick_params(axis='x', rotation=45)
        ax.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.savefig(RESULTS_DIR / 'metrics_evolution.png', dpi=300, bbox_inches='tight')
    plt.show()

## 5. Metric Distributions (Boxplots)

**Useful for spotting outlier classes in each release**

In [None]:
if not df_all.empty:
    fig = plt.figure(figsize=(20, 12))

    metrics = ['wmc', 'dit', 'noc', 'cbo', 'lcom', 'rfc', 'loc']
    titles = ['WMC (Complexity)', 'DIT (Inheritance)', 'NOC (Children)',
              'CBO (Coupling)', 'LCOM (Cohesion)', 'RFC (Response)', 'LOC (Lines)']

    for i, (metric, title) in enumerate(zip(metrics, titles), 1):
        ax = plt.subplot(3, 3, i)
        df_all.boxplot(column=metric, by='release', ax=ax, rot=45)
        ax.set_title(title)
        ax.set_xlabel('')
        ax.get_figure().suptitle('')  # Clear the automatic "Boxplot grouped by" title

    # Set the figure title once, after pandas has finished overwriting it
    plt.suptitle('Distribution of the 7 CK Metrics per Release - Boxplots',
                 fontsize=16, fontweight='bold', y=0.995)
    plt.tight_layout()
    plt.savefig(RESULTS_DIR / 'metrics_distribution.png', dpi=300, bbox_inches='tight')
    plt.show()
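
The boxplots show outliers visually, but it can help to list the classes behind those points. The sketch below applies the usual 1.5 x IQR rule (the default whisker length in matplotlib boxplots) within each release. The function name `iqr_outliers` and the sample data are assumptions for illustration; only the upper side is checked, since high metric values are the ones of interest here.

```python
import pandas as pd

def iqr_outliers(df, metric, group='release', k=1.5):
    """Return rows whose `metric` lies above Q3 + k*IQR within their group."""
    grouped = df.groupby(group)[metric]
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    upper = q3 + k * (q3 - q1)
    return df[df[metric] > upper]

# Tiny made-up sample, not real project data
sample = pd.DataFrame({
    'release': ['r1'] * 5,
    'class': ['A', 'B', 'C', 'D', 'E'],
    'wmc': [1, 2, 3, 4, 100],
})
outliers = iqr_outliers(sample, 'wmc')
print(outliers[['class', 'wmc']])
```

With the notebook's data, a call such as `iqr_outliers(df_all, 'wmc')` would return the WMC outliers of every release in one table.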

## 6. Correlation Between Metrics (Heatmap)

**Shows how the CK metrics relate to one another**

In [None]:
if not df_all.empty:
    correlation_metrics = ['wmc', 'dit', 'noc', 'cbo', 'lcom', 'rfc', 'loc']
    corr_matrix = df_all[correlation_metrics].corr()
    
    plt.figure(figsize=(10, 8))
    sns.heatmap(corr_matrix, annot=True, fmt='.2f', cmap='coolwarm', center=0,
                square=True, linewidths=1, cbar_kws={"shrink": 0.8})
    plt.title('Correlation Matrix of the CK Metrics', fontsize=14, fontweight='bold', pad=20)
    plt.tight_layout()
    plt.savefig(RESULTS_DIR / 'correlation_matrix.png', dpi=300, bbox_inches='tight')
    plt.show()
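
A heatmap is easy to scan, but the strongest relationships can also be read off as a ranked list. The following is a minimal sketch that extracts the most correlated metric pairs from a correlation matrix; `top_correlated_pairs` and the sample values are hypothetical and not part of the original analysis.

```python
import numpy as np
import pandas as pd

def top_correlated_pairs(corr, n=5):
    """List the n metric pairs with the largest absolute correlation."""
    # Keep only the upper triangle so each pair appears once and the diagonal is dropped
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)

# Made-up values: rfc is exactly twice wmc, dit is unrelated noise
sample = pd.DataFrame({
    'wmc': [1, 2, 3, 4],
    'rfc': [2, 4, 6, 8],
    'dit': [4, 1, 3, 2],
})
pairs = top_correlated_pairs(sample.corr(), n=1)
print(pairs)
```

Applied to the notebook's own matrix, this would be `top_correlated_pairs(corr_matrix)`.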

## 7. Top Problem Classes

In [None]:
if not df_all.empty:
    latest_release = df_all[df_all['release'] == df_all['release'].unique()[-1]]

    print("="*80)
    print(f"ANALYSIS OF THE 7 CK METRICS - LATEST RELEASE ({latest_release['release'].iloc[0]})")
    print("="*80)

    # (heading, metric to rank by, columns to display)
    rankings = [
        ("the Highest Complexity (WMC)", 'wmc', ['class', 'wmc', 'cbo', 'lcom', 'loc']),
        ("the Deepest Inheritance (DIT)", 'dit', ['class', 'dit', 'wmc', 'cbo']),
        ("the Most Children (NOC)", 'noc', ['class', 'noc', 'dit', 'wmc']),
        ("the Highest Coupling (CBO)", 'cbo', ['class', 'cbo', 'wmc', 'lcom']),
        ("the Lowest Cohesion (LCOM - high values)", 'lcom', ['class', 'lcom', 'wmc', 'cbo']),
        ("the Highest RFC (Response For Class)", 'rfc', ['class', 'rfc', 'wmc', 'cbo']),
        ("the Most Lines of Code (LOC)", 'loc', ['class', 'loc', 'wmc', 'cbo']),
    ]

    for i, (heading, metric, columns) in enumerate(rankings):
        # Blank line before the first ranking, separator line before the others
        print("\n" + "-"*80 if i else "")
        print(f"Top 10 Classes with {heading}:")
        print(latest_release.nlargest(10, metric)[columns])

## 8. Trend Analysis

In [None]:
if not df_all.empty:
    first_release = metrics_by_release.iloc[0]
    last_release = metrics_by_release.iloc[-1]

    # LOC is compared by its total; every other metric by its mean
    columns = [('wmc', 'mean'), ('dit', 'mean'), ('noc', 'mean'), ('cbo', 'mean'),
               ('lcom', 'mean'), ('rfc', 'mean'), ('loc', 'sum')]

    growth_rates = pd.DataFrame({
        'Metric': ['WMC', 'DIT', 'NOC', 'CBO', 'LCOM', 'RFC', 'LOC (total)'],
        'First Release': [first_release[c] for c in columns],
        'Last Release': [last_release[c] for c in columns],
    })

    # A zero baseline would give an infinite rate, so treat it as undefined (NaN)
    baseline = growth_rates['First Release'].replace(0, np.nan)
    growth_rates['Change (%)'] = ((growth_rates['Last Release'] - growth_rates['First Release']) / baseline * 100).round(2)

    print("Metric Growth Analysis:")
    display(growth_rates)

## 9. Export CK Metrics

In [None]:
if not df_all.empty:
    metrics_by_release.to_csv(RESULTS_DIR / 'metrics_summary.csv')
    growth_rates.to_csv(RESULTS_DIR / 'growth_rates.csv', index=False)
    print("✓ CK metrics exported:")
    print("  - metrics_summary.csv")
    print("  - growth_rates.csv")

## 10. Bug Analysis - SpotBugs + find-sec-bugs

### 10.1 Load Bugs

In [None]:
def parse_spotbugs_xml(xml_file):
    """Parse SpotBugs XML report."""
    try:
        tree = ET.parse(xml_file)
        root = tree.getroot()
        bugs = []
        
        for bug in root.findall('.//BugInstance'):
            bug_info = {
                'type': bug.get('type'),
                'priority': int(bug.get('priority', 0)),
                'rank': int(bug.get('rank', 0)),
                'category': bug.get('category'),
                'abbrev': bug.get('abbrev', ''),
            }
            
            class_elem = bug.find('.//Class')
            bug_info['class'] = class_elem.get('classname', '') if class_elem is not None else ''
            
            method_elem = bug.find('.//Method')
            bug_info['method'] = method_elem.get('name', '') if method_elem is not None else ''
            
            long_msg = bug.find('.//LongMessage')
            bug_info['description'] = long_msg.text if long_msg is not None else ''
            
            bugs.append(bug_info)
        
        return bugs
    except (ET.ParseError, OSError, ValueError) as e:
        # Report which file failed instead of hiding the source of the error
        print(f"Error parsing {xml_file}: {e}")
        return []

# Collect bugs from every release
all_bugs = []

for release_dir in release_dirs:
    spotbugs_xml = release_dir / 'spotbugs-report.xml'

    if spotbugs_xml.exists():
        bugs = parse_spotbugs_xml(spotbugs_xml)

        # Read the release metadata once, not once per bug
        release_date = None
        metadata_file = release_dir / 'metadata.json'
        if metadata_file.exists():
            with open(metadata_file) as f:
                release_date = json.load(f).get('published_date', '')

        for bug in bugs:
            bug['release'] = release_dir.name
            if release_date is not None:
                bug['release_date'] = release_date

        all_bugs.extend(bugs)
        print(f"✓ {release_dir.name}: {len(bugs)} bugs")
    else:
        print(f"✗ {release_dir.name}: no SpotBugs report")

if all_bugs:
    df_bugs = pd.DataFrame(all_bugs)
    df_bugs['priority_label'] = df_bugs['priority'].map({1: 'HIGH', 2: 'MEDIUM', 3: 'LOW'})
    print(f"\n✓ Total bugs: {len(df_bugs)}")
    print(f"✓ Releases with bugs: {df_bugs['release'].nunique()}")
else:
    df_bugs = pd.DataFrame()
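
To make the fields extracted by `parse_spotbugs_xml` concrete, here is a minimal sketch run on a hand-written report fragment. The XML below is an illustrative assumption modelled on the elements and attributes the parser reads (`BugInstance`, `Class`, `Method`, `LongMessage`); it is not output from a real SpotBugs run, and `org.example.Foo` is a made-up class.

```python
import xml.etree.ElementTree as ET

# Hand-written sample that mimics the report layout; not output from a real run
SAMPLE_XML = """
<BugCollection>
  <BugInstance type="NP_NULL_ON_SOME_PATH" priority="1" rank="5"
               abbrev="NP" category="CORRECTNESS">
    <LongMessage>Possible null pointer dereference</LongMessage>
    <Class classname="org.example.Foo"/>
    <Method classname="org.example.Foo" name="bar"/>
  </BugInstance>
</BugCollection>
"""

root = ET.fromstring(SAMPLE_XML.strip())
bug = root.find('.//BugInstance')

# Same lookups as the parser above, applied to a single bug
record = {
    'type': bug.get('type'),
    'priority': int(bug.get('priority', 0)),
    'category': bug.get('category'),
    'class': bug.find('.//Class').get('classname', ''),
    'method': bug.find('.//Method').get('name', ''),
    'description': bug.find('.//LongMessage').text,
}
print(record)
```

Each `BugInstance` becomes one dictionary of this shape, and therefore one row of `df_bugs`.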

### 10.2 Bug Statistics

In [None]:
if not df_bugs.empty:
    bugs_by_release = df_bugs.groupby('release').agg({
        'type': 'count',
        'priority': ['mean', 'min', 'max']
    }).round(2)
    bugs_by_release.columns = ['Total_Bugs', 'Priority_Mean', 'Priority_Min', 'Priority_Max']
    
    print("="*80)
    print("BUGS PER RELEASE (each release is independent)")
    print("="*80)
    display(bugs_by_release)

    # Overall statistics
    print("\n" + "="*80)
    print("OVERALL STATISTICS:")
    print("="*80)
    print(f"Mean bugs per release: {bugs_by_release['Total_Bugs'].mean():.1f}")
    print(f"Median bugs per release: {bugs_by_release['Total_Bugs'].median():.1f}")
    print(f"Minimum bugs in a release: {bugs_by_release['Total_Bugs'].min()}")
    print(f"Maximum bugs in a release: {bugs_by_release['Total_Bugs'].max()}")

    # First vs. last release
    first_release = bugs_by_release.iloc[0]
    last_release = bugs_by_release.iloc[-1]
    variation = ((last_release['Total_Bugs'] - first_release['Total_Bugs']) / first_release['Total_Bugs'] * 100)

    print(f"\nFirst release ({bugs_by_release.index[0]}): {first_release['Total_Bugs']:.0f} bugs")
    print(f"Last release ({bugs_by_release.index[-1]}): {last_release['Total_Bugs']:.0f} bugs")
    print(f"Change: {variation:+.1f}%")

### 10.3 Bug Evolution and Latest Release (Current State)

In [None]:
if not df_bugs.empty:
    fig, axes = plt.subplots(2, 2, figsize=(16, 12))
    fig.suptitle('Bug Evolution Across Releases', fontsize=16, fontweight='bold')

    # 1. Evolution of the total number of bugs
    bugs_by_release['Total_Bugs'].plot(ax=axes[0, 0], marker='o', color='red', linewidth=2)
    axes[0, 0].set_title('Total Bugs per Release')
    axes[0, 0].set_ylabel('Number of Bugs')
    axes[0, 0].tick_params(axis='x', rotation=45)
    axes[0, 0].grid(True, alpha=0.3)
    axes[0, 0].axhline(y=bugs_by_release['Total_Bugs'].mean(), color='orange',
                       linestyle='--', label=f'Mean: {bugs_by_release["Total_Bugs"].mean():.1f}')
    axes[0, 0].legend()

    # 2. Evolution by category (top 5)
    category_evolution = df_bugs.groupby(['release', 'category']).size().unstack(fill_value=0)
    top_categories = df_bugs['category'].value_counts().head(5).index
    category_evolution[top_categories].plot(ax=axes[0, 1], marker='o', linewidth=2)
    axes[0, 1].set_title('Evolution of the Top 5 Categories')
    axes[0, 1].set_ylabel('Number of Bugs')
    axes[0, 1].tick_params(axis='x', rotation=45)
    axes[0, 1].legend(title='Category', bbox_to_anchor=(1.05, 1), loc='upper left')
    axes[0, 1].grid(True, alpha=0.3)

    # 3. Evolution by priority (one fixed color per label, in severity order)
    priority_colors = {'HIGH': '#ff4444', 'MEDIUM': '#ffaa44', 'LOW': '#4488ff'}
    priority_evolution = df_bugs.groupby(['release', 'priority_label']).size().unstack(fill_value=0)
    priority_evolution = priority_evolution[[p for p in priority_colors if p in priority_evolution.columns]]
    priority_evolution.plot(ax=axes[1, 0], marker='o', linewidth=2,
                            color=[priority_colors[p] for p in priority_evolution.columns])
    axes[1, 0].set_title('Evolution by Priority')
    axes[1, 0].set_ylabel('Number of Bugs')
    axes[1, 0].tick_params(axis='x', rotation=45)
    axes[1, 0].legend(title='Priority')
    axes[1, 0].grid(True, alpha=0.3)

    # 4. Top 10 bug types in the latest release
    latest_release_name = df_bugs['release'].unique()[-1]
    latest_bugs = df_bugs[df_bugs['release'] == latest_release_name]
    latest_bugs['type'].value_counts().head(10).plot(kind='barh', ax=axes[1, 1], color='steelblue')
    axes[1, 1].set_title(f'Top 10 Bug Types ({latest_release_name})')
    axes[1, 1].set_xlabel('Count')

    plt.tight_layout()
    plt.savefig(RESULTS_DIR / 'bugs_evolution.png', dpi=300, bbox_inches='tight')
    plt.show()

### 10.5 Security Bugs (find-sec-bugs)

In [None]:
if not df_bugs.empty:
    security_bugs = df_bugs[df_bugs['category'] == 'SECURITY'].copy()

    print("="*80)
    print("SECURITY BUGS (find-sec-bugs)")
    print("="*80)

    if not security_bugs.empty:
        # Per-release analysis
        security_by_release = security_bugs.groupby('release').size()

        print(f"\nMean security bugs per release: {security_by_release.mean():.1f}")
        print(f"Median: {security_by_release.median():.1f}")

        print("\nSecurity Bugs per Release:")
        print(security_by_release)

        # Latest release
        latest_release_name = df_bugs['release'].unique()[-1]
        latest_security = security_bugs[security_bugs['release'] == latest_release_name]

        print(f"\n{'='*80}")
        print(f"LATEST RELEASE ({latest_release_name}): {len(latest_security)} security bugs")
        print("="*80)

        print("\nTop 10 Vulnerability Types (latest release):")
        print(latest_security['type'].value_counts().head(10))

        print("\nTop 10 Classes with Security Bugs (latest release):")
        print(latest_security['class'].value_counts().head(10))

        # Visualization
        fig, axes = plt.subplots(1, 2, figsize=(16, 6))
        fig.suptitle('Security Bug Analysis', fontsize=16, fontweight='bold')

        security_by_release.plot(ax=axes[0], marker='o', color='darkred', linewidth=2)
        axes[0].set_title('Evolution of Security Bugs')
        axes[0].set_ylabel('Count')
        axes[0].tick_params(axis='x', rotation=45)
        axes[0].grid(True, alpha=0.3)
        axes[0].axhline(y=security_by_release.mean(), color='orange',
                       linestyle='--', label=f'Mean: {security_by_release.mean():.1f}')
        axes[0].legend()

        latest_security['type'].value_counts().head(10).plot(kind='barh', ax=axes[1], color='crimson')
        axes[1].set_title(f'Top 10 Vulnerabilities ({latest_release_name})')
        axes[1].set_xlabel('Count')

        plt.tight_layout()
        plt.savefig(RESULTS_DIR / 'security_bugs.png', dpi=300, bbox_inches='tight')
        plt.show()
    else:
        print("\n✓ No security bugs found")

### 10.6 Critical Bugs (HIGH Priority)

In [None]:
if not df_bugs.empty:
    critical_bugs = df_bugs[df_bugs['priority'] == 1].copy()

    print("="*80)
    print("CRITICAL BUGS (HIGH Priority)")
    print("="*80)

    if not critical_bugs.empty:
        # Per-release analysis
        critical_by_release = critical_bugs.groupby('release').size()

        print(f"\nMean critical bugs per release: {critical_by_release.mean():.1f}")
        print(f"Median: {critical_by_release.median():.1f}")

        print("\nCritical Bugs per Release:")
        print(critical_by_release)

        # Latest release
        latest_release_name = df_bugs['release'].unique()[-1]
        latest_critical = critical_bugs[critical_bugs['release'] == latest_release_name]

        print(f"\n{'='*80}")
        print(f"LATEST RELEASE ({latest_release_name}): {len(latest_critical)} critical bugs")
        print("="*80)

        if not latest_critical.empty:
            print("\nCRITICAL BUG DETAILS:")
            for idx, bug in latest_critical.iterrows():
                print(f"\n🔴 {bug['type']} - {bug['category']}")
                print(f"   Class: {bug['class']}")
                if bug['method']:
                    print(f"   Method: {bug['method']}")
                if bug['description']:
                    print(f"   {bug['description'][:150]}...")
                print("   " + "-"*76)
    else:
        print("\n✓ No critical bugs found!")

### 10.7 Export Bug Data

In [None]:
if not df_bugs.empty:
    # Identify the latest release
    latest_release_name = df_bugs['release'].unique()[-1]
    latest_bugs = df_bugs[df_bugs['release'] == latest_release_name]
    latest_security = security_bugs[security_bugs['release'] == latest_release_name] if not security_bugs.empty else pd.DataFrame()
    latest_critical = critical_bugs[critical_bugs['release'] == latest_release_name] if not critical_bugs.empty else pd.DataFrame()

    # Export all bugs
    df_bugs.to_csv(RESULTS_DIR / 'bugs_all_releases.csv', index=False)
    print("✓ Bugs exported:")
    print("  - bugs_all_releases.csv (all bugs from all releases)")

    # Bugs in the latest release
    latest_bugs.to_csv(RESULTS_DIR / f'bugs_{latest_release_name}.csv', index=False)
    print(f"  - bugs_{latest_release_name}.csv (latest release)")

    if not security_bugs.empty:
        security_bugs.to_csv(RESULTS_DIR / 'security_bugs_all.csv', index=False)
        print("  - security_bugs_all.csv (all releases)")

    if not critical_bugs.empty:
        critical_bugs.to_csv(RESULTS_DIR / 'critical_bugs_all.csv', index=False)
        print("  - critical_bugs_all.csv (all releases)")

    # JSON summary
    summary_stats = {
        'latest_release': latest_release_name,
        'latest_release_bugs': len(latest_bugs),
        'latest_release_security_bugs': len(latest_security),
        'latest_release_critical_bugs': len(latest_critical),
        'average_bugs_per_release': float(bugs_by_release['Total_Bugs'].mean()),
        'median_bugs_per_release': float(bugs_by_release['Total_Bugs'].median()),
        'total_releases_analyzed': df_bugs['release'].nunique(),
        'most_common_bug_type_latest': latest_bugs['type'].value_counts().index[0] if len(latest_bugs) > 0 else 'N/A',
        'most_common_category_latest': latest_bugs['category'].value_counts().index[0] if len(latest_bugs) > 0 else 'N/A'
    }
    
    with open(RESULTS_DIR / 'bugs_summary.json', 'w') as f:
        json.dump(summary_stats, f, indent=2)
    print("  - bugs_summary.json")
    
    print("\n" + "="*80)
    print("📊 OVERALL SUMMARY:")
    print("="*80)
    for key, value in summary_stats.items():
        label = key.replace('_', ' ').title()
        if isinstance(value, float):
            print(f"  {label}: {value:.1f}")
        else:
            print(f"  {label}: {value}")
    print("="*80)

## 11. Summary and Next Steps

### ✅ What was analyzed:

1. **CK Metrics** - Complexity (WMC), Coupling (CBO), Cohesion (LCOM), etc.
2. **Distribution** - Boxplots for spotting outlier classes
3. **Correlations** - Heatmap showing relationships between metrics
4. **General Bugs** - SpotBugs (all detected bugs)
5. **Security Bugs** - find-sec-bugs (vulnerabilities)
6. **Critical Bugs** - HIGH priority

### 📁 Generated files:

**CK Metrics:**
- `metrics_summary.csv` - Statistics per release
- `growth_rates.csv` - Growth rates
- `metrics_evolution.png` - Evolution charts
- `metrics_distribution.png` - Boxplots
- `correlation_matrix.png` - Correlation heatmap

**Bugs:**
- `bugs_all_releases.csv` - All bugs from all releases
- `bugs_<latest release>.csv` - Bugs in the latest release
- `security_bugs_all.csv` - Security bugs (all releases)
- `critical_bugs_all.csv` - Critical bugs (all releases)
- `bugs_summary.json` - Statistical summary
- `bugs_evolution.png` - General bug charts
- `security_bugs.png` - Security bug charts

### 🎯 How to use this in your work:

1. **Identify problems**: Use Top Classes and the boxplots to find outliers
2. **Prioritize**: Focus on security and critical bugs first
3. **Refactor**: Target classes with high WMC/CBO/LCOM
4. **Submit PRs**: Fix bugs and refactor code
5. **Document**: Use the charts in the scientific paper
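
The "identify" and "refactor" steps above can be combined into a single shortlist. The following is a minimal sketch that flags classes sitting above a chosen quantile in at least two of WMC, CBO, and LCOM. The function name `refactoring_candidates`, the 0.9 quantile, and the "two or more metrics" rule are all assumptions chosen for illustration, not thresholds taken from the analysis.

```python
import pandas as pd

def refactoring_candidates(df, metrics=('wmc', 'cbo', 'lcom'), quantile=0.9, min_hits=2):
    """Flag classes above the given quantile in at least `min_hits` metrics."""
    cols = list(metrics)
    thresholds = df[cols].quantile(quantile)
    # Count, per class, how many metrics exceed their threshold
    exceeded = (df[cols] > thresholds).sum(axis=1)
    result = df.assign(metrics_exceeded=exceeded)
    result = result[result['metrics_exceeded'] >= min_hits]
    return result.sort_values('metrics_exceeded', ascending=False)

# Tiny made-up sample, not real project data
sample = pd.DataFrame({
    'class': ['A', 'B', 'C', 'D', 'E'],
    'wmc': [1, 2, 3, 4, 50],
    'cbo': [1, 1, 2, 2, 30],
    'lcom': [0, 0, 1, 1, 2],
})
candidates = refactoring_candidates(sample)
print(candidates[['class', 'metrics_exceeded']])
```

In the notebook, this would typically be applied to the rows of the latest release only, so that thresholds reflect the current state of the code.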

### 💡 Tips for Pull Requests:

- Security bug fixes are always welcome
- Start with simple bugs (LOW priority)
- Classes with high LCOM → Split Responsibility
- Classes with high WMC → Extract Method