# Gaussian mixture model for approximating the original and latent spaces
A probabilistic model is implemented in TensorFlow Probability to fit a probability distribution to the original data and to the latent-space data, using a mixture of Gaussians.
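
For reference, the density fitted in this notebook can be written as below. The notation is an assumption made here for illustration: $K = 3$ components, $\mu_k$ for a row of `locs`, $s_k$ for a row of `scales`, $\pi$ for the mixture weights obtained from `logits`, and $B$ for the batch size.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(x \mid \mu_k, \operatorname{diag}(\sigma_k^2)\right),
\qquad
\pi = \operatorname{softmax}(\text{logits}),
\qquad
\sigma_k = \operatorname{softplus}(s_k)

\mathcal{L} = -\frac{1}{B} \sum_{i=1}^{B} \log p(x_i)
```

The loss $\mathcal{L}$ is the mean negative log-likelihood over a batch, which is the quantity minimized in the training step.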




In [2]:
import warnings
warnings.filterwarnings("ignore")
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.preprocessing import StandardScaler
import tensorflow_probability as tfp


In [3]:
# Data preprocessing
data = pd.read_csv('Liver_GSE14520_U133A.csv')
data.drop(['samples','type'], axis=1, inplace=True)
data.head()

|  | 1007_s_at | 1053_at | 117_at | 121_at | 1255_g_at | 1294_at | 1316_at | 1320_at | 1405_i_at | 1431_at | ... | AFFX-r2-Ec-bioD-3_at | AFFX-r2-Ec-bioD-5_at | AFFX-r2-P1-cre-3_at | AFFX-r2-P1-cre-5_at | AFFX-ThrX-3_at | AFFX-ThrX-5_at | AFFX-ThrX-M_at | AFFX-TrpnX-3_at | AFFX-TrpnX-5_at | AFFX-TrpnX-M_at |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6.801198 | 4.553189 | 6.78779 | 5.430893 | 3.250222 | 6.272688 | 3.413405 | 3.37491 | 3.654116 | 3.804983 | ... | 10.735084 | 10.398843 | 12.298551 | 12.270505 | 3.855588 | 3.148321 | 3.366087 | 3.199008 | 3.160388 | 3.366417 |
| 1 | 7.585956 | 4.19354 | 3.763183 | 6.003593 | 3.309387 | 6.291927 | 3.754777 | 3.587603 | 5.137159 | 8.622475 | ... | 11.528447 | 11.369919 | 12.867048 | 12.560433 | 4.016561 | 3.282867 | 3.541994 | 3.54868 | 3.460083 | 3.423348 |
| 2 | 7.80337 | 4.134075 | 3.433113 | 5.395057 | 3.476944 | 5.825713 | 3.505036 | 3.687333 | 4.515175 | 12.681439 | ... | 10.89246 | 10.416151 | 12.356337 | 11.888482 | 3.839367 | 3.598851 | 3.516791 | 3.484089 | 3.282626 | 3.512024 |
| 3 | 6.92084 | 4.000651 | 3.7545 | 5.645297 | 3.38753 | 6.470458 | 3.629249 | 3.577534 | 5.192624 | 11.759412 | ... | 10.686871 | 10.524836 | 12.006596 | 11.846195 | 3.867602 | 3.180472 | 3.309547 | 3.425501 | 3.166613 | 3.377499 |
| 4 | 6.55648 | 4.59901 | 4.066155 | 6.344537 | 3.372081 | 5.43928 | 3.762213 | 3.440714 | 4.961625 | 10.318552 | ... | 11.014454 | 10.775566 | 12.657182 | 12.573076 | 4.09144 | 3.306729 | 3.493704 | 3.205771 | 3.378567 | 3.392938 |


In [4]:
# Standardize the data (zero mean, unit variance per feature)
data = data.values.astype(np.float32)
data = StandardScaler().fit_transform(data)
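
As a rough illustration of what the standardization step does, the sketch below reproduces the column-wise z-score in plain NumPy. It assumes the default `StandardScaler` settings (centering and scaling both enabled, population standard deviation); the `standardize` helper and the `toy` array are hypothetical names used only for this example.

```python
import numpy as np

def standardize(x):
    """Column-wise z-score using the population standard deviation (ddof=0)."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    # Constant columns have zero spread; divide by 1 instead to avoid NaNs.
    std = np.where(std == 0.0, 1.0, std)
    return (x - mean) / std

# Toy data: three samples, two features on very different scales.
toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]], dtype=np.float32)
z = standardize(toy)
```

After this transform every feature has mean 0 and standard deviation 1, so no single probe dominates the likelihood because of its measurement scale.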

In [5]:
dataset = tf.data.Dataset.from_tensor_slices(data).batch(51)

I0000 00:00:1765934689.262387      61 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 21768 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:0a:00.0, compute capability: 8.6
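
A small sketch of how `batch(51)` splits the rows may help here. `Dataset.batch` keeps the final partial batch by default (`drop_remainder=False`), and since the pipeline above has no shuffle step, every epoch sees the same batches in the same order. The `batch_sizes` helper and the row count of 120 are hypothetical and only serve the illustration; the real number of samples is not shown in this notebook.

```python
def batch_sizes(num_rows, batch_size):
    """Sizes of consecutive batches when the final partial batch is kept."""
    full, rest = divmod(num_rows, batch_size)
    return [batch_size] * full + ([rest] if rest else [])

# 120 is a made-up row count used only to show the split.
sizes = batch_sizes(120, 51)
```

With these made-up numbers the last batch is smaller than the others, which matters when a per-batch loss is used as a summary of a whole epoch.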


In [6]:
input_dim = data.shape[1]

In [7]:
tfd = tfp.distributions

locs = tf.Variable(tf.random.normal([3, input_dim]))  # component means, initialized from a standard normal
scales = tf.Variable(tf.ones([3, input_dim]))         # raw scale parameters, initialized to 1 (initial std is softplus(1) ≈ 1.31)
logits = tf.Variable(tf.zeros([3]))                   # mixture logits, initialized to 0 (uniform mixture weights)

def MixOfGaussians():
    return tfd.MixtureSameFamily(
        mixture_distribution=tfd.Categorical(logits=logits),
        components_distribution=tfd.MultivariateNormalDiag(
            loc=locs,
            scale_diag=tf.nn.softplus(scales)  # softplus keeps the standard deviation > 0
        )
    )

optimizer = tf.optimizers.Adam(learning_rate=0.05)


@tf.function
def train_step(batch):
    with tf.GradientTape() as tape:
        MoG = MixOfGaussians()
        loss = -tf.reduce_mean(MoG.log_prob(batch))
    gradients = tape.gradient(loss, [locs, scales, logits])
    optimizer.apply_gradients(zip(gradients, [locs, scales, logits]))
    return loss
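
To make the loss in `train_step` more concrete, here is a minimal NumPy sketch of the quantity that `MoG.log_prob(batch)` evaluates for a mixture of diagonal Gaussians: a log-sum-exp over components of the log mixture weight plus the component log-density. It is a re-derivation for illustration, not TensorFlow Probability's implementation; `softplus` and `mog_log_prob` are hypothetical helper names, and the toy inputs are made up.

```python
import numpy as np

def softplus(x):
    # log(1 + exp(x)), computed in a numerically stable way.
    return np.logaddexp(0.0, x)

def mog_log_prob(x, locs, raw_scales, logits):
    """Log-density of a diagonal Gaussian mixture.

    x: [B, D], locs and raw_scales: [K, D], logits: [K]. Returns [B].
    """
    sigma = softplus(raw_scales)                                  # [K, D]
    z = (x[:, None, :] - locs[None, :, :]) / sigma[None, :, :]    # [B, K, D]
    dim = x.shape[1]
    log_comp = (-0.5 * np.sum(z ** 2, axis=-1)
                - np.sum(np.log(sigma), axis=-1)
                - 0.5 * dim * np.log(2.0 * np.pi))                # [B, K]
    log_w = logits - np.logaddexp.reduce(logits)                  # log-softmax, [K]
    return np.logaddexp.reduce(log_comp + log_w, axis=1)          # [B]

# Toy check: three identical components centred at the data point.
x = np.zeros((4, 2))
toy_locs = np.zeros((3, 2))
toy_raw_scales = np.ones((3, 2))
toy_logits = np.zeros(3)
toy_loss = -np.mean(mog_log_prob(x, toy_locs, toy_raw_scales, toy_logits))
```

Working in log space avoids underflow: with thousands of features the per-component densities are far too small to represent directly, which is also why the losses reported below are in the tens of thousands.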

In [8]:
for epoch in range(30):  
    for batch in dataset:
        loss = train_step(batch)
    print(f"Epoch {epoch}, Loss: {loss.numpy():.4f}")

Epoch 0, Loss: 34400.2461
Epoch 1, Loss: 31281.6621
Epoch 2, Loss: 29005.7598
Epoch 3, Loss: 27335.7852
Epoch 4, Loss: 26095.5469
Epoch 5, Loss: 25192.4551
Epoch 6, Loss: 24521.0684
Epoch 7, Loss: 24021.7617
Epoch 8, Loss: 23680.4062
Epoch 9, Loss: 23459.5371
Epoch 10, Loss: 23320.1094
Epoch 11, Loss: 23228.1270
Epoch 12, Loss: 23166.4941
Epoch 13, Loss: 23118.4648
Epoch 14, Loss: 23084.0215
Epoch 15, Loss: 23056.2129
Epoch 16, Loss: 23035.6074
Epoch 17, Loss: 23018.9414
Epoch 18, Loss: 23008.0293
Epoch 19, Loss: 22999.5781
Epoch 20, Loss: 22992.3145
Epoch 21, Loss: 22986.8340
Epoch 22, Loss: 22982.1836
Epoch 23, Loss: 22978.6328
Epoch 24, Loss: 22975.8926
Epoch 25, Loss: 22973.4707
Epoch 26, Loss: 22971.7031
Epoch 27, Loss: 22970.7773
Epoch 28, Loss: 22969.5195
Epoch 29, Loss: 22968.6035
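
Note that the loop prints the loss of the last batch of each epoch, not an epoch average. If an epoch-level figure is preferred, one option is to accumulate a size-weighted mean, as in the sketch below. `run_epoch` is a hypothetical helper, and `step_fn` stands in for `train_step`; the sketch assumes each batch supports `len()` and that the returned loss can be converted with `float()`.

```python
def run_epoch(step_fn, batches):
    """Call step_fn on every batch and return the size-weighted mean loss."""
    total, count = 0.0, 0
    for batch in batches:
        loss = float(step_fn(batch))   # per-batch mean loss
        total += loss * len(batch)     # weight by number of rows in the batch
        count += len(batch)
    if count == 0:
        raise ValueError("no batches were provided")
    return total / count

# Toy usage: the "loss" of a batch is simply the mean of its values.
toy_batches = [[1.0, 2.0, 3.0], [5.0]]
epoch_loss = run_epoch(lambda b: sum(b) / len(b), toy_batches)
```

Weighting by batch size keeps a smaller final batch from counting as much as a full one.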


In [10]:
ckpt = tf.train.Checkpoint(
    optimizer=optimizer,
    locs=locs,
    scales=scales,
    logits=logits
)

ckpt_manager = tf.train.CheckpointManager(
    ckpt,
    directory="./checkpoints/mog",
    max_to_keep=5
)

ckpt_manager.save()
print("Checkpoint saved successfully")

Checkpoint saved successfully


In [42]:

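# Restore the most recent checkpoint; expect_partial() silences warnings about incomplete restores
ckpt.restore(tf.train.latest_checkpoint("./checkpoints/mog")).expect_partial()

# Lower the learning rate for fine-tuning
optimizer.learning_rate.assign(2e-4)

# Continue training; the range only sets the epoch labels that are printed
for epoch in range(90, 120):
    for batch in dataset:
        loss = train_step(batch)
    print(f"Epoch {epoch}, Loss: {loss.numpy():.4f}")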


Epoch 90, Loss: 22913.1699
Epoch 91, Loss: 22913.1562
Epoch 92, Loss: 22913.1465
Epoch 93, Loss: 22913.1367
Epoch 94, Loss: 22913.1250
Epoch 95, Loss: 22913.1133
Epoch 96, Loss: 22913.1035
Epoch 97, Loss: 22913.0898
Epoch 98, Loss: 22913.0781
Epoch 99, Loss: 22913.0684
Epoch 100, Loss: 22913.0586
Epoch 101, Loss: 22913.0469
Epoch 102, Loss: 22913.0371
Epoch 103, Loss: 22913.0273
Epoch 104, Loss: 22913.0117
Epoch 105, Loss: 22913.0020
Epoch 106, Loss: 22912.9902
Epoch 107, Loss: 22912.9805
Epoch 108, Loss: 22912.9707
Epoch 109, Loss: 22912.9609
Epoch 110, Loss: 22912.9512
Epoch 111, Loss: 22912.9414
Epoch 112, Loss: 22912.9297
Epoch 113, Loss: 22912.9219
Epoch 114, Loss: 22912.9102
Epoch 115, Loss: 22912.8965
Epoch 116, Loss: 22912.8867
Epoch 117, Loss: 22912.8770
Epoch 118, Loss: 22912.8672
Epoch 119, Loss: 22912.8574
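
Because the loss is a sum over all features, its raw magnitude is hard to read. One way to put it on an interpretable scale, sketched below, is to divide by `input_dim` and compare with the per-feature negative log-likelihood that a single standard normal would give on standardized data, which is 0.5 * (log(2π) + 1). Both helper names are hypothetical, and the numbers in the usage line are made up rather than taken from this run.

```python
import math

def standard_normal_nll_per_dim():
    """Mean NLL per feature of N(0, 1) on data with zero mean and unit variance."""
    return 0.5 * (math.log(2.0 * math.pi) + 1.0)

def nll_per_dim(total_nll, input_dim):
    """Spread a summed negative log-likelihood evenly over the features."""
    return total_nll / input_dim

baseline = standard_normal_nll_per_dim()
# Made-up values: a total loss of 30.0 over 20 features.
toy_per_dim = nll_per_dim(30.0, 20)
```

A per-feature value below the baseline suggests that the mixture captures structure beyond a single unit Gaussian per feature.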


In [43]:
ckpt_manager.save()
print("Checkpoint saved successfully")

Checkpoint saved successfully
