# 🧬 Synthetic Data Generation with SDV

**Version:** v3_experimental  
**Role:** Data Engineer (Synthetic Data)  
**Date:** 2026-01-08

---

## Objective

Generate synthetic tabular data with SDV's **GaussianCopulaSynthesizer** to augment the training dataset.

## Method

**GaussianCopula** is the recommended method for small datasets (<1000 rows) because:
- It does not need much data to learn the distributions
- It captures correlations between variables
- It is stable and reproducible
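
To make the properties listed above concrete, here is a minimal sketch of what a Gaussian copula does for two numeric columns, using only the standard library. It is not SDV's implementation: the toy data, the helper names (`to_normal_scores`, `pearson`, `empirical_quantile`, `sample_copula`), and the use of empirical ranks and quantiles for the marginals are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()  # standard normal, used for the copula space


def to_normal_scores(values):
    """Map each value to a normal score via its empirical rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1)
        scores[idx] = STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def empirical_quantile(sorted_values, u):
    """Inverse empirical CDF: pick the observed value at probability u."""
    idx = min(int(u * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


def sample_copula(col_a, col_b, n_rows, rng):
    """Draw synthetic (a, b) pairs that keep the dependence between columns."""
    rho = pearson(to_normal_scores(col_a), to_normal_scores(col_b))
    sorted_a, sorted_b = sorted(col_a), sorted(col_b)
    rows = []
    for _ in range(n_rows):
        z1 = rng.gauss(0, 1)
        # 2x2 Cholesky step: z2 has correlation rho with z1
        z2 = rho * z1 + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        rows.append((empirical_quantile(sorted_a, u1),
                     empirical_quantile(sorted_b, u2)))
    return rows


rng = random.Random(42)
# Toy "real" data: b grows with a, plus noise
real_a = [rng.uniform(20, 40) for _ in range(300)]
real_b = [2 * a + rng.gauss(0, 5) for a in real_a]

synthetic = sample_copula(real_a, real_b, 1000, rng)
synth_a = [row[0] for row in synthetic]
synth_b = [row[1] for row in synthetic]
```

Only one correlation coefficient per column pair has to be estimated, which is part of why this family of models can work with a few hundred rows.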

## Restrictions

> ⚠️ **ZERO LEAKAGE**: Only the Train file is used. The Test set is sealed.

---

In [1]:
# ==============================================================================
# CONFIGURATION AND DEPENDENCIES
# ==============================================================================

import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import json
from pathlib import Path
from datetime import datetime

# SDV
from sdv.metadata import SingleTableMetadata
from sdv.single_table import GaussianCopulaSynthesizer
from sdv.evaluation.single_table import evaluate_quality

# Configuration
RANDOM_STATE = 42
np.random.seed(RANDOM_STATE)

# Paths
DATA_DIR = Path("../../v2/data/processed")
OUTPUT_DIR = Path(".")

# Number of synthetic rows to generate
N_SYNTHETIC = 1000

print("✅ Dependencies loaded")
print("   SDV GaussianCopula")
print(f"   Random State: {RANDOM_STATE}")
print(f"   N Synthetic: {N_SYNTHETIC}")

✅ Dependencies loaded
   SDV GaussianCopula
   Random State: 42
   N Synthetic: 1000


---
## 1. Loading the Training Data

In [2]:
# ==============================================================================
# DATA LOADING (TRAIN ONLY - ZERO LEAKAGE)
# ==============================================================================

train_df = pd.read_parquet(DATA_DIR / "train_final.parquet")

print("📊 Data loaded (TRAIN ONLY):")
print(f"   Shape: {train_df.shape}")
print(f"   Columns: {len(train_df.columns)}")

# Find zero-variance columns (at most one distinct non-null value) to exclude
zero_var_cols = [col for col in train_df.columns if train_df[col].nunique() <= 1]

print(f"\n⚠️ Zero-variance columns detected: {zero_var_cols}")
print("   (They will be excluded from the synthesizer)")

# Build a clean dataframe for synthesis
train_clean = train_df.drop(columns=zero_var_cols)
print(f"\n📋 Dataset for synthesis: {train_clean.shape}")

📊 Data loaded (TRAIN ONLY):
   Shape: (296, 63)
   Columns: 63

⚠️ Zero-variance columns detected: ['tech_python', 'tech_big_data']
   (They will be excluded from the synthesizer)

📋 Dataset for synthesis: (296, 61)


---
## 2. SDV Metadata Definition

In [3]:
# ==============================================================================
# METADATA DEFINITION
# ==============================================================================

print("📝 Defining metadata for SDV...")

# Detect the metadata automatically
metadata = SingleTableMetadata()
metadata.detect_from_dataframe(train_clean)

# Show the detected types
print("\n📋 Detected types:")
for col, info in metadata.columns.items():
    sdtype = info.get('sdtype', 'unknown')
    print(f"   {col}: {sdtype}")

# Validate the metadata
metadata.validate()
print("\n✅ Metadata validated successfully")

📝 Defining metadata for SDV...

📋 Detected types:
   edad: numerical
   genero_m: categorical
   hab_1: numerical
   hab_2: numerical
   hab_3: numerical
   hab_4: numerical
   hab_5: numerical
   hab_6: numerical
   hab_7: numerical
   tech_programacion: categorical
   tech_java: categorical
   tech_desarrollo_web: categorical
   tech_desarrollo_movil: categorical
   tech_base_datos: categorical
   tech_machine_learning: categorical
   tech_inteligencia_artificial: categorical
   tech_analisis_datos: categorical
   tech_redes: categorical
   tech_telecomunicaciones: categorical
   tech_redes_inalambricas: categorical
   tech_sistemas_celulares: categorical
   tech_ciberseguridad: categorical
   tech_cloud_computing: categorical
   tech_sistemas_operativos: categorical
   tech_iot: categorical
   tech_automatizacion: categorical
   tech_robotica: categorical
   tech_electronica: categorical
   tech_instrumentacion: categorical
   tech_control: categorical
   tech_electricid

In [4]:
# ==============================================================================
# MANUAL METADATA ADJUSTMENT (if needed)
# ==============================================================================

# Make sure 'event' is categorical/boolean
if 'event' in metadata.columns:
    metadata.update_column('event', sdtype='categorical')
    print("✏️ 'event' set to categorical")

# Make sure 'duration' is numerical
if 'duration' in metadata.columns:
    metadata.update_column('duration', sdtype='numerical')
    print("✏️ 'duration' set to numerical")

# Make sure the binary tech_* columns (and genero_m) are categorical
binary_cols = [c for c in train_clean.columns if c.startswith('tech_') or c == 'genero_m']
for col in binary_cols:
    if col in metadata.columns:
        metadata.update_column(col, sdtype='categorical')

print(f"✏️ {len(binary_cols)} binary columns set to categorical")

# Validate again
metadata.validate()
print("\n✅ Metadata updated and validated")

✏️ 'event' set to categorical
✏️ 'duration' set to numerical
✏️ 51 binary columns set to categorical

✅ Metadata updated and validated
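
The cell above selects binary columns by name (`tech_*` and `genero_m`). A complementary check is to select them by content, so that a column is only treated as a 0/1 flag when its values say so. The sketch below shows that check on a plain dict of lists; the helper name `is_binary_column` and the toy columns are hypothetical and not part of the notebook.

```python
def is_binary_column(values):
    """Return True when every non-missing value is 0 or 1."""
    observed = {v for v in values if v is not None}
    return bool(observed) and observed <= {0, 1}


# Toy columns standing in for a few fields of the training table
columns = {
    "genero_m": [1, 0, 1, 1],
    "tech_java": [0, 0, 1, None],
    "hab_1": [0.25, 0.5, 1.0, 0.75],
    "edad": [25, 21, 23, 40],
}

binary_by_value = [name for name, values in columns.items()
                   if is_binary_column(values)]
print(binary_by_value)
```

Comparing this list with the name-based one would flag any `tech_*` column that unexpectedly holds values other than 0 and 1.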


---
## 3. Training the Synthesizer

In [5]:
# ==============================================================================
# CONFIGURE AND TRAIN GAUSSIANCOPULASYNTHESIZER
# ==============================================================================

print("🔧 Configuring GaussianCopulaSynthesizer...")

synthesizer = GaussianCopulaSynthesizer(
    metadata,
    enforce_min_max_values=True,
    enforce_rounding=True,
    numerical_distributions={
        'duration': 'truncnorm',  # Truncated normal to avoid negative values
        'edad': 'truncnorm'
    },
    default_distribution='norm'
)

print("\n🏋️ Training synthesizer...")
synthesizer.fit(train_clean)

print("\n✅ Synthesizer trained successfully")

🔧 Configuring GaussianCopulaSynthesizer...

🏋️ Training synthesizer...

✅ Synthesizer trained successfully
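
The `truncnorm` choice for `duration` and `edad` restricts the fitted marginal to a bounded interval, so sampled values cannot fall outside it. As a rough illustration of the idea (not of how SDV or SciPy implement it), the sketch below draws from a normal distribution and rejects anything outside the bounds. The function name `sample_truncated_normal` is hypothetical; the mean, standard deviation, and lower bound follow the `duration` figures printed elsewhere in this notebook, while the upper bound of 60 is an assumed value.

```python
import random


def sample_truncated_normal(mu, sigma, lower, upper, rng, max_tries=10_000):
    """Rejection sampling: redraw until the value lands in [lower, upper]."""
    for _ in range(max_tries):
        value = rng.gauss(mu, sigma)
        if lower <= value <= upper:
            return value
    raise RuntimeError("bounds are too far from the bulk of the distribution")


rng = random.Random(42)
# mu, sigma, and lower follow the notebook's duration statistics;
# upper=60.0 is an assumption made for this example only.
samples = [sample_truncated_normal(15.431, 10.992, 0.58, 60.0, rng)
           for _ in range(1000)]

# A plain normal with the same parameters produces negative durations
plain = [rng.gauss(15.431, 10.992) for _ in range(1000)]
n_negative_plain = sum(v < 0 for v in plain)
```

With a mean only about 1.4 standard deviations above zero, an untruncated normal would put a visible share of its mass on negative durations, which is the case the truncation is meant to rule out.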


---
## 4. Generating Synthetic Data

In [6]:
# ==============================================================================
# GENERATE SYNTHETIC DATA
# ==============================================================================

print(f"🧬 Generating {N_SYNTHETIC} synthetic rows...")

synthetic_df = synthesizer.sample(num_rows=N_SYNTHETIC)

print("\n📊 Synthetic data generated:")
print(f"   Shape: {synthetic_df.shape}")
print("\n   First rows:")
print(synthetic_df.head())

🧬 Generating 1000 synthetic rows...

📊 Synthetic data generated:
   Shape: (1000, 61)

   First rows:
   edad  genero_m  hab_1  hab_2  hab_3  hab_4  hab_5  hab_6  hab_7  \
0    25         1   0.47   0.46   0.48   0.42   0.59   0.75   0.66   
1    21         0   0.95   0.78   1.00   1.00   1.00   1.00   0.97   
2    25         1   0.94   0.92   0.60   0.93   0.74   1.00   0.54   
3    23         1   0.87   0.56   0.87   0.76   0.63   0.52   0.92   
4    25         1   0.47   0.76   0.68   0.50   0.47   0.47   0.57   

   tech_programacion  ...  tech_agroindustria  tech_matematica_aplicada  \
0                  0  ...                   0                         0   
1                  0  ...                   0                         0   
2                  1  ...                   0                         1   
3                  0  ...                   0                         0   
4                  0  ...                   0                         0   

   tech_fisica_aplicada  tech_economia_finanzas  tec

---
## 5. Post-Processing

In [7]:
# ==============================================================================
# POST-PROCESSING: ENFORCE DOMAIN CONSTRAINTS
# ==============================================================================

print("🔧 Applying post-processing...")

# 1. duration must be > 0: clip to the minimum observed in the real data
if 'duration' in synthetic_df.columns:
    min_duration = train_clean['duration'].min()
    # Count the values that the clip will actually change
    before = (synthetic_df['duration'] < min_duration).sum()
    synthetic_df['duration'] = synthetic_df['duration'].clip(lower=min_duration)
    print(f"   ✅ duration: {before} values corrected (clipped to {min_duration:.2f})")

# 2. event must be 0 or 1
if 'event' in synthetic_df.columns:
    synthetic_df['event'] = synthetic_df['event'].round().astype(int).clip(0, 1)
    print(f"   ✅ event: converted to binary {synthetic_df['event'].unique()}")

# 3. edad must stay within the range observed in the real data
if 'edad' in synthetic_df.columns:
    min_edad = train_clean['edad'].min()
    max_edad = train_clean['edad'].max()
    synthetic_df['edad'] = synthetic_df['edad'].clip(min_edad, max_edad).round().astype(int)
    print(f"   ✅ edad: clipped to [{min_edad}, {max_edad}]")

# 4. Binary tech_* columns (and genero_m) must be 0 or 1
binary_cols = [c for c in synthetic_df.columns if c.startswith('tech_') or c == 'genero_m']
for col in binary_cols:
    synthetic_df[col] = synthetic_df[col].round().astype(int).clip(0, 1)
print(f"   ✅ {len(binary_cols)} binary columns converted to 0/1")

# 5. hab_* skill columns must lie in [0, 1]
hab_cols = [c for c in synthetic_df.columns if c.startswith('hab_')]
for col in hab_cols:
    # Snap to the valid values: 0, 0.25, 0.5, 0.75, 1.0
    synthetic_df[col] = (synthetic_df[col] * 4).round() / 4
    synthetic_df[col] = synthetic_df[col].clip(0, 1)
print(f"   ✅ {len(hab_cols)} hab_* columns snapped to [0, 0.25, 0.5, 0.75, 1.0]")

print("\n✅ Post-processing completed")

🔧 Applying post-processing...
   ✅ duration: 0 values corrected (clipped to 0.58)
   ✅ event: converted to binary [0 1]
   ✅ edad: clipped to [21, 40]
   ✅ 51 binary columns converted to 0/1
   ✅ 7 hab_* columns snapped to [0, 0.25, 0.5, 0.75, 1.0]

✅ Post-processing completed
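
Step 5 of the cell above moves each `hab_*` value to the nearest multiple of 0.25 and then clips it to [0, 1]. The same arithmetic on scalars, as a small sketch (the function name `snap_to_grid` and the sample values are hypothetical):

```python
def snap_to_grid(value, step=0.25, lower=0.0, upper=1.0):
    """Round to the nearest multiple of `step`, then clip to [lower, upper]."""
    snapped = round(value / step) * step
    return min(max(snapped, lower), upper)


raw = [0.47, 0.59, 0.66, 0.95, 1.08, -0.1]
snapped = [snap_to_grid(v) for v in raw]
print(snapped)  # [0.5, 0.5, 0.75, 1.0, 1.0, 0.0]
```

Exact midpoints such as 0.125 go to the even neighbour (here 0.0), because Python's `round` and NumPy-based rounding both use round-half-to-even.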


---
## 6. Quality Validation

In [8]:
# ==============================================================================
# SDV QUALITY VALIDATION
# ==============================================================================

print("📊 Evaluating synthetic data quality...")

try:
    quality_report = evaluate_quality(
        real_data=train_clean,
        synthetic_data=synthetic_df,
        metadata=metadata
    )

    overall_score = quality_report.get_score()
    print(f"\n🏆 Overall Quality Score: {overall_score:.4f}")

except Exception as e:
    print(f"⚠️ Could not compute the quality score: {e}")
    overall_score = None

📊 Evaluating synthetic data quality...
Generating report ...

(1/2) Evaluating Column Shapes: |██████████| 61/61 [00:00<00:00, 2188.76it/s]|

Column Shapes Score: 96.68%

(2/2) Evaluating Column Pair Trends: |██████████| 1830/1830 [00:04<00:00, 431.16it/s]|

Column Pair Trends Score: 90.59%

Overall Score (Average): 93.63%

🏆 Overall Quality Score: 0.9363
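
The numbers in this report can be cross-checked by hand. With 61 columns there are C(61, 2) = 1830 unordered column pairs, which matches the total of the second progress bar, and the overall score is the average of the two component scores. A short sketch using the rounded percentages printed above:

```python
from math import comb

n_columns = 61
n_pairs = comb(n_columns, 2)  # unordered column pairs

column_shapes = 96.68        # percent, as printed in the report
column_pair_trends = 90.59   # percent, as printed in the report
overall = (column_shapes + column_pair_trends) / 2

print(n_pairs)
print(f"{overall:.3f}")
```

The average of the rounded components is 93.635, which agrees with the reported 93.63% up to rounding of the inputs.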


In [9]:
# ==============================================================================
# STATISTICAL COMPARISON
# ==============================================================================

print("📊 Statistical comparison, Real vs Synthetic:")
print("\n" + "="*70)

# Compare key statistics
comparison_cols = ['duration', 'event', 'edad']

for col in comparison_cols:
    if col in train_clean.columns and col in synthetic_df.columns:
        real_mean = train_clean[col].mean()
        synth_mean = synthetic_df[col].mean()
        real_std = train_clean[col].std()
        synth_std = synthetic_df[col].std()

        print(f"\n{col}:")
        print(f"   Real:      μ={real_mean:.3f}, σ={real_std:.3f}")
        print(f"   Synthetic: μ={synth_mean:.3f}, σ={synth_std:.3f}")
        print(f"   Δ mean:    {abs(real_mean - synth_mean):.3f}")

# Compare event rates
if 'event' in train_clean.columns:
    real_event_rate = train_clean['event'].mean()
    synth_event_rate = synthetic_df['event'].mean()
    print("\nEvent Rate:")
    print(f"   Real:      {real_event_rate:.1%}")
    print(f"   Synthetic: {synth_event_rate:.1%}")
    print(f"   Δ:         {abs(real_event_rate - synth_event_rate):.1%}")

📊 Statistical comparison, Real vs Synthetic:


duration:
   Real:      μ=15.431, σ=10.992
   Synthetic: μ=15.508, σ=8.716
   Δ mean:    0.077

event:
   Real:      μ=0.456, σ=0.499
   Synthetic: μ=0.451, σ=0.498
   Δ mean:    0.005

edad:
   Real:      μ=25.176, σ=2.884
   Synthetic: μ=25.265, σ=2.967
   Δ mean:    0.089

Event Rate:
   Real:      45.6%
   Synthetic: 45.1%
   Δ:         0.5%
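
The means match closely, but the synthetic `duration` column is noticeably less spread out than the real one (σ = 8.716 versus 10.992). One quick way to quantify this is the relative gap in standard deviation, sketched below with the figures printed above; the helper name `relative_gap` is hypothetical.

```python
def relative_gap(real, synthetic):
    """Absolute difference expressed as a fraction of the real value."""
    return abs(real - synthetic) / abs(real)


# (real_std, synthetic_std) as printed in the comparison above
std_pairs = {
    "duration": (10.992, 8.716),
    "event": (0.499, 0.498),
    "edad": (2.884, 2.967),
}

gaps = {col: relative_gap(real, synth) for col, (real, synth) in std_pairs.items()}
for col, gap in gaps.items():
    print(f"{col}: {gap:.1%}")
```

On these figures the spread of `duration` is about 21% lower in the synthetic data, while `event` and `edad` stay within a few percent, which may matter for downstream models that are sensitive to the tails of `duration`.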


---
## 7. Saving the Results

In [10]:
# ==============================================================================
# SAVE SYNTHETIC DATA AND MODEL
# ==============================================================================

print("💾 Saving results...")

# 1. Save the synthetic data
synthetic_df.to_parquet(OUTPUT_DIR / "synthetic_data_copula.parquet", index=False)
print(f"   ✅ synthetic_data_copula.parquet ({len(synthetic_df)} rows)")

# 2. Save the synthesizer model
synthesizer.save(OUTPUT_DIR / "synthesizer_model.pkl")
print("   ✅ synthesizer_model.pkl")

# 3. Save the metadata
metadata.save_to_json(OUTPUT_DIR / "synthesizer_metadata.json")
print("   ✅ synthesizer_metadata.json")

# 4. Save the generation report
report = {
    "metadata": {
        "date": datetime.now().isoformat(),
        "method": "GaussianCopulaSynthesizer",
        "sdv_version": "1.32.0",  # hard-coded; keep in sync with the installed SDV
        "random_state": RANDOM_STATE
    },
    "input": {
        "n_real": len(train_clean),
        "n_features": len(train_clean.columns),
        "excluded_cols": zero_var_cols
    },
    "output": {
        "n_synthetic": len(synthetic_df),
        # Compare against None so that a legitimate score of 0.0 is still recorded
        "quality_score": float(overall_score) if overall_score is not None else None
    },
    "statistics_comparison": {
        "duration": {
            "real_mean": float(train_clean['duration'].mean()),
            "synth_mean": float(synthetic_df['duration'].mean()),
            "real_std": float(train_clean['duration'].std()),
            "synth_std": float(synthetic_df['duration'].std())
        },
        "event_rate": {
            "real": float(train_clean['event'].mean()),
            "synthetic": float(synthetic_df['event'].mean())
        }
    },
    "files_generated": [
        "synthetic_data_copula.parquet",
        "synthesizer_model.pkl",
        "synthesizer_metadata.json"
    ]
}

with open(OUTPUT_DIR / "generation_report.json", 'w') as f:
    json.dump(report, f, indent=2)
print("   ✅ generation_report.json")

print("\n" + "="*50)
print("🎉 SYNTHETIC GENERATION COMPLETED")
print("="*50)

💾 Saving results...
   ✅ synthetic_data_copula.parquet (1000 rows)
   ✅ synthesizer_model.pkl
   ✅ synthesizer_metadata.json
   ✅ generation_report.json

🎉 SYNTHETIC GENERATION COMPLETED
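
Before handing the files to the next step, it can be useful to reload `generation_report.json` and confirm that it has the expected sections. The sketch below writes a reduced copy of the report to a temporary directory so that it runs on its own; the helper name `check_report`, the list of required sections, and the temporary path are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_SECTIONS = ("metadata", "input", "output", "files_generated")


def check_report(path):
    """Load the report and return the names of any missing sections."""
    with open(path, encoding="utf-8") as f:
        report = json.load(f)
    return [key for key in REQUIRED_SECTIONS if key not in report]


# Reduced stand-in for the report written by the cell above
sample_report = {
    "metadata": {"method": "GaussianCopulaSynthesizer", "random_state": 42},
    "input": {"n_real": 296, "n_features": 61},
    "output": {"n_synthetic": 1000, "quality_score": 0.9363},
    "files_generated": ["synthetic_data_copula.parquet"],
}

with tempfile.TemporaryDirectory() as tmp:
    report_path = Path(tmp) / "generation_report.json"
    report_path.write_text(json.dumps(sample_report, indent=2), encoding="utf-8")
    missing = check_report(report_path)

print(missing)  # [] when every required section is present
```

In the notebook itself the same function could be pointed at `OUTPUT_DIR / "generation_report.json"` instead of the temporary file.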


---
## Summary

### Generated Files

| File | Description |
|------|-------------|
| `synthetic_data_copula.parquet` | 1000 synthetic rows |
| `synthesizer_model.pkl` | Trained GaussianCopula model |
| `synthesizer_metadata.json` | SDV metadata |
| `generation_report.json` | Generation report |

### Next Step

**Prompt 5: Training with Augmented Data** - Train models on combinations of real and synthetic data.

---