| Block | Code / Template |
| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Import libraries** | `import numpy as np`<br>`import pandas as pd`<br>`import matplotlib.pyplot as plt`<br>`import pingouin as pg`<br>`import statsmodels.formula.api as smf`<br>`from scipy import stats`<br>`from scipy.spatial import distance`<br>`from statsmodels.multivariate.manova import MANOVA`<br>`from numpy.linalg import inv`<br>`from scipy.stats import chi2` |
| **Load file** | `df = pd.read_csv("archivo.csv")` |
| **List unique categories in a column** | `df["columnaA"].unique()` |
| **Select columns** | `df_nuevo = df[["columnaA", "columnaB"]]` |
| **Cell label (columnaA × columnaB combination)** | `df["Celda"] = df["columnaA"].astype(str) + "\|" + df["columnaB"].astype(str)` |
| **Filter rows** | `df_nuevo = df[df["columna"] == "valor"]` |
| **Histograms** | `df.select_dtypes(include="number").hist(bins=XX); plt.show()` |
| **Scatterplot** | `def scatter_pairs_row(df, pairs, title=None):`<br>`    fig, axes = plt.subplots(1, len(pairs), figsize=(4*len(pairs), 3))`<br>`    axes = [axes] if len(pairs)==1 else axes`<br>`    for ax, (x, y) in zip(axes, pairs):`<br>`        ax.scatter(df[x], df[y], s=14, alpha=0.7)`<br>`        ax.set_xlabel(x); ax.set_ylabel(y)`<br>`    if title: plt.suptitle(f"Scatter plots — {title}", y=1.05)`<br>`    plt.tight_layout(); plt.show()` |
| **Q–Q plots** | `for col in df.select_dtypes(include="number").columns:`<br>`    stats.probplot(df[col].dropna(), dist="norm", plot=plt)`<br>`    plt.title(col); plt.show()` |
| **Model residuals (two-way ANOVA with interaction)** | `smf.ols('Y ~ C(columnaA)*C(columnaB)', data=df).fit().resid` |
| **Shapiro–Wilk** | `for col in df.select_dtypes(include="number").columns:`<br>`    print(col, stats.shapiro(df[col].dropna()))` |
| **Multivariate normality (HZ)** | `pg.multivariate_normality(df.select_dtypes(include="number").dropna(), alpha=.05)` |
| **Squared Mahalanobis distance (χ²)** | `def mahalanobis_distance(X):`<br>`    X = np.array(X)`<br>`    mu = X.mean(axis=0)`<br>`    S = np.cov(X, rowvar=False)`<br>`    S_inv = np.linalg.inv(S)`<br>`    dif = X - mu`<br>`    return np.einsum("ij,jk,ik->i", dif, S_inv, dif)` |
| **Mahalanobis Q–Q plot** | `def qq_mahalanobis_plot(X, celda):`<br>`    m = mahalanobis_distance(X)`<br>`    m_sorted = np.sort(m)`<br>`    chi2_q = chi2.ppf((np.arange(1, len(m)+1)-0.5)/len(m), df=X.shape[1])`<br>`    plt.figure(figsize=(5,5))`<br>`    plt.scatter(chi2_q, m_sorted, s=20, alpha=0.7)`<br>`    plt.plot([chi2_q.min(), chi2_q.max()], [chi2_q.min(), chi2_q.max()], 'r--', lw=2)`<br>`    plt.title(f"Q–Q Mahalanobis — {celda}")`<br>`    plt.xlabel("Cuantiles teóricos χ²")`<br>`    plt.ylabel("Distancias cuadradas observadas")`<br>`    plt.grid(alpha=0.3)`<br>`    plt.show()` |
| **Hotelling T² (general)**            | `pg.multivariate_ttest(X=df.select_dtypes(include="number").dropna().values, Y=...)`                                                                                                                                                                                                                                                            |
| **Levene's test** | `pg.homoscedasticity(data=df, dv='Y', group='Celda', method='levene')` |
| **Two-way ANOVA** | `pg.anova(dv='Y', between=['columnaA','columnaB'], data=df, detailed=True)` |
| **Tukey post-hoc (all cells)** | `pg.pairwise_tukey(dv='Y', between='Celda', data=df)` |
| **Critical values (rejection region)** | `stats.norm.ppf(area)` (Z)<br>`stats.t.ppf(area, df)` (t)<br>`stats.chi2.ppf(area, df)` (χ²)<br>`stats.f.ppf(area, df1, df2)` (F) |
| **Per-cell MANOVA check (HZ + plots)** | `alpha = 0.05; resultados_HZ = []`<br>`for celda, sub in df.groupby("Celda"):`<br>`    print(f"\n==============================\nCelda: {celda}\n==============================")`<br>`    scatter_pairs_row(sub, pairs, title=celda)`<br>`    qq_mahalanobis_plot(sub[DVs], celda)`<br>`    stat, p, normal = pg.multivariate_normality(sub[DVs], alpha=alpha)`<br>`    resultados_HZ.append({"Celda": celda, "HZ_stat": stat, "pval": p, "Normal": normal})`<br>`    print(f"Henze–Zirkler: HZ = {stat:.3f}, p = {p:.4f}, Normalidad: {normal}")` |
| **MANOVA** | `formula = "DV1 + DV2 + DV3 ~ C(Factor1) * C(Factor2)"`<br>`manova = MANOVA.from_formula(formula, data=df)`<br>`mv_out = manova.mv_test()`<br>`print(mv_out)` |
| **Box’s M** | `pg.box_m(data=df, dvs=DVs, group=f1)` |
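
The squared Mahalanobis distance and the χ² critical value rows can be combined into a simple multivariate outlier screen. The following is a minimal sketch under stated assumptions: the helper names `mahalanobis_sq` and `flag_outliers` are hypothetical, the data are synthetic, and `alpha=0.001` is only a commonly used cutoff, not something the table prescribes.

```python
import numpy as np
from scipy.stats import chi2


def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row from the sample mean."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    # Row-wise quadratic form: diff_i^T S^-1 diff_i
    return np.einsum("ij,jk,ik->i", diff, S_inv, diff)


def flag_outliers(X, alpha=0.001):
    """Flag rows whose squared distance exceeds the chi-square cutoff.

    alpha is an assumed significance level; degrees of freedom equal
    the number of columns (variables) in X.
    """
    X = np.asarray(X, dtype=float)
    d2 = mahalanobis_sq(X)
    cutoff = chi2.ppf(1 - alpha, df=X.shape[1])
    return d2 > cutoff, d2, cutoff


# Synthetic data (illustrative only): 200 bivariate normal rows plus one extreme row.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(200, 2)), [[8.0, -8.0]]])

mask, d2, cutoff = flag_outliers(X)
print(f"cutoff = {cutoff:.2f}, flagged rows = {np.flatnonzero(mask)}")
```

With a real data set, the same call could be applied per cell, for example `flag_outliers(sub[DVs])` inside the `df.groupby("Celda")` loop, assuming `DVs` lists the dependent-variable columns. Because the mean and covariance are estimated from the same rows being screened, extreme values can partly mask themselves, so the flags are best read alongside the Mahalanobis Q–Q plot.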




