We begin by importing the required libraries and loading the UniPert embeddings dataset. The file path in the code is a placeholder; replace it with the location of the actual dataset.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Load the UniPert embeddings dataset (placeholder filename; point this at the real dataset file)
data = pd.read_csv('uniPert_data.csv')
# Assume that embedding columns are labeled emb_1, emb_2, ...
# The anchored pattern avoids picking up unrelated columns that merely contain "emb_"
embeddings = data.filter(regex=r'^emb_\d+$').to_numpy()

# Apply PCA for dimensionality reduction
pca = PCA(n_components=2)
pca_result = pca.fit_transform(embeddings)

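# Cluster with KMeans. Note that this fits on the 2-D PCA projection, not on the full embeddings.
# n_clusters=5 is a hand-picked value; n_init is set explicitly because its default differs across scikit-learn versions.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42).fit(pca_result)
labels = kmeans.labels_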

# Visualize the PCA results with cluster labels
plt.figure(figsize=(8, 6))
scatter = plt.scatter(pca_result[:, 0], pca_result[:, 1], c=labels, cmap='viridis')
plt.title('PCA of UniPert Embeddings')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.colorbar(scatter, label='Cluster ID')
plt.show()
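
The cluster count of 5 above is fixed by hand. One hedged way to sanity-check such a choice is to compare mean silhouette scores across several candidate values of k, and to check how much variance the two principal components retain. The sketch below runs on synthetic Gaussian blobs, not on UniPert data; `silhouette_by_k`, `demo_embeddings`, and the range of k values are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score


def silhouette_by_k(points, k_values, seed=42):
    """Return {k: mean silhouette score} for KMeans fits on `points`."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        cluster_labels = model.fit_predict(points)
        scores[k] = silhouette_score(points, cluster_labels)
    return scores


# Synthetic stand-in for the embeddings (NOT UniPert data):
# four Gaussian blobs in 8 dimensions, 50 points each.
rng = np.random.default_rng(0)
centers = 10.0 * np.eye(4, 8)
demo_embeddings = np.vstack(
    [center + rng.normal(scale=1.0, size=(50, 8)) for center in centers]
)

# Fraction of total variance kept by a 2-component PCA projection.
demo_pca = PCA(n_components=2).fit(demo_embeddings)
retained = demo_pca.explained_variance_ratio_.sum()
print(f"Variance retained by 2 components: {retained:.2f}")

# Score each candidate k on the full-dimensional points.
scores = silhouette_by_k(demo_embeddings, range(2, 8))
best_k = max(scores, key=scores.get)
print(f"Best k by silhouette: {best_k}")
```

With real data, `embeddings` (or `pca_result`, to match the clustering above) would take the place of `demo_embeddings`. A low retained-variance figure suggests that clusters found in the 2-D projection may not reflect structure in the full embedding space.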

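Next, we apply t-SNE, a non-linear projection that emphasizes local neighborhoods and may capture the intrinsic structure of the data better than PCA. Points are colored by the KMeans cluster labels computed in the previous cell.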

In [None]:
from sklearn.manifold import TSNE

tsne = TSNE(n_components=2, random_state=42)
tsne_result = tsne.fit_transform(embeddings)

plt.figure(figsize=(8, 6))
scatter_tsne = plt.scatter(tsne_result[:, 0], tsne_result[:, 1], c=labels, cmap='plasma')
plt.title('t-SNE of UniPert Embeddings')
plt.xlabel('Dimension 1')
plt.ylabel('Dimension 2')
plt.colorbar(scatter_tsne, label='Cluster ID')
plt.show()
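
t-SNE output depends on the perplexity setting, and the scikit-learn documentation recommends reducing very high-dimensional inputs (for example with PCA to about 50 dimensions) before running it. The following is a minimal sketch under those assumptions, using random data rather than UniPert embeddings; the helper name `tsne_2d` and its default values are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def tsne_2d(embeddings, perplexity=30.0, pca_dims=50, seed=42):
    """Project embeddings to 2-D with t-SNE after an optional PCA pre-reduction."""
    n_samples, n_features = embeddings.shape
    # PCA cannot return more components than min(n_samples, n_features),
    # so only pre-reduce when both exceed the target dimensionality.
    if n_features > pca_dims and n_samples > pca_dims:
        embeddings = PCA(n_components=pca_dims, random_state=seed).fit_transform(embeddings)
    # scikit-learn requires perplexity < n_samples; cap it conservatively
    # so that small datasets do not raise an error.
    perplexity = min(perplexity, (n_samples - 1) / 3)
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=seed)
    return tsne.fit_transform(embeddings)


# Random stand-in data (NOT UniPert embeddings): 120 samples, 64 features.
rng = np.random.default_rng(0)
demo_tsne = tsne_2d(rng.normal(size=(120, 64)))
print(demo_tsne.shape)
```

Because t-SNE distorts global distances, the relative placement and size of clusters in the plot should be read with caution; comparing a few perplexity values is a cheap way to see which groupings are stable.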






### [Created with BioloGPT](https://biologpt.com/?q=Paper%20Review%3A%20Unifying%20Genetic%20and%20Chemical%20Perturbagen%20Representation%20through%20a%20Hybrid%20Deep%20Learning%20Framework)
***