# Latent Space Builder


This notebook is a tutorial for the Latent Space Builder, which builds a latent space for an image dataset using one of several latent methods: Principal Component Analysis (PCA), Diffusion Map (DM), Incremental PCA, and Ensemble PCA.

## Import dependencies

We first import the Latent Space Builder.

In [None]:
import latent_space_builder

We then import additional dependencies.

In [None]:
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
import numpy as np

## Get the data

Next, we define the path to the dataset that serves as input to every latent method, along with the image type, the target type, and the number of dimensions of the latent space.

In [None]:
# path to the diffraction images
dataset_file = "../{0}/dataset/cspi_synthetic_dataset_diffraction_patterns_{0}_uniform_quat_dataset-size={1}_diffraction-pattern-shape=1024x1040-copy.hdf5".format("3iyf-10K", 10000)

# type of images
image_type = "diffraction_patterns"

# type of targets
target_type = "orientations"

# dimension of the latent space
latent_dim = 50


## Build the latent spaces

### Principal Component Analysis

We build the latent space using PCA.

In [None]:
# latent method
latent_method = "principal_component_analysis"

# build the latent space using PCA
latent_model = latent_space_builder.build_latent_space(dataset_file, image_type, target_type, latent_method, latent_dim, dataset_size=10000, training_set_size=1000, batch_size=1000)


We plot the cumulative percentage of variance explained as a function of the number of latent dimensions retained by PCA.

In [None]:
explained_variance_ratio = latent_model.explained_variance_ratio_
# accumulate first and convert to percent afterwards, so per-component rounding errors do not add up
explained_variance_ratio_cumulative_sum = np.cumsum(explained_variance_ratio) * 100

ax = plt.figure().gca()
ax.set_ylabel("% Variance Explained")
ax.set_xlabel("# of Latent Dimensions")
ax.set_title("PCA: Cumulative Explained Variance")
ax.plot(np.arange(1, latent_dim + 1), explained_variance_ratio_cumulative_sum)
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
plt.show()
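
The cumulative curve is often used to choose how many latent dimensions to keep. The following is a minimal sketch of that selection, assuming `explained_variance_ratio` is a 1-D array of per-component fractions sorted in descending order. The helper name `n_dims_for_variance` and the threshold value are illustrative and not part of `latent_space_builder`.

```python
import numpy as np


def n_dims_for_variance(explained_variance_ratio, threshold=0.95):
    """Return the smallest number of components whose cumulative explained
    variance reaches `threshold`, or None if the threshold is never reached."""
    cumulative = np.cumsum(explained_variance_ratio)
    if cumulative[-1] < threshold:
        return None
    # searchsorted returns the first index whose cumulative value is >= threshold
    return int(np.searchsorted(cumulative, threshold) + 1)


# toy ratios for illustration only (not taken from the dataset above)
toy_ratios = np.array([0.5, 0.3, 0.15, 0.05])
n_dims = n_dims_for_variance(toy_ratios, threshold=0.9)
print(n_dims)
```

In the notebook, the same helper could be called on `latent_model.explained_variance_ratio_`; a return value of `None` would indicate that `latent_dim` components do not reach the requested threshold.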


We plot the singular values of PCA.

In [None]:
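singular_values = latent_model.singular_values_

ax = plt.figure().gca()
# the x-axis holds the component index; the singular values are on the y-axis
ax.set_xlabel("Latent Dimension")
ax.set_ylabel("Singular Value")
ax.set_title("PCA: Singular Values")
ax.scatter(np.arange(1, latent_dim + 1), singular_values)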
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
plt.show()
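
The singular values and the explained variance are two views of the same quantity. The sketch below shows the relation on synthetic data, assuming the model follows the usual PCA convention in which the data are centered and the variance of component k is the squared singular value divided by (n_samples - 1). The attribute names above resemble scikit-learn's PCA, but whether `latent_space_builder` uses the same convention is an assumption.

```python
import numpy as np

# synthetic data with decreasing spread per feature (illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5, 0.1])

# center the data, then take the singular values of the centered matrix
Xc = X - X.mean(axis=0)
singular_values = np.linalg.svd(Xc, compute_uv=False)

# variance captured by each component, and its share of the total variance
explained_variance = singular_values**2 / (X.shape[0] - 1)
explained_variance_ratio = explained_variance / explained_variance.sum()
```

Note that the ratio is taken relative to the total variance over all components, so when only the first `latent_dim` components are kept, the retained ratios generally sum to less than 1.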


### Diffusion Map

We build the latent space using a Diffusion Map (DM).

In [None]:
# latent method
latent_method = "diffusion_map"

# build the latent space using DM
latent_space_builder.build_latent_space(dataset_file, image_type, target_type, latent_method, latent_dim, dataset_size=1000, training_set_size=200, batch_size=200)
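
For readers unfamiliar with the method, the following is a minimal NumPy sketch of a textbook diffusion map: build a Gaussian kernel on pairwise distances, normalize it into a Markov matrix, and use its leading non-trivial eigenvectors as coordinates. It is not the implementation used by `latent_space_builder`; the function name `diffusion_map` and the bandwidth parameter `epsilon` are illustrative assumptions.

```python
import numpy as np


def diffusion_map(X, n_components=2, epsilon=1.0):
    """Textbook diffusion map (diffusion time t = 1) of the rows of X.

    Returns the (n_samples, n_components) embedding and all eigenvalues
    of the Markov matrix, sorted in descending order.
    """
    # pairwise squared Euclidean distances
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off

    # Gaussian kernel and its row sums (degrees)
    K = np.exp(-sq_dists / epsilon)
    d = K.sum(axis=1)

    # S = D^{-1/2} K D^{-1/2} is symmetric and has the same eigenvalues
    # as the Markov matrix P = D^{-1} K
    S = K / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # right eigenvectors of P are recovered as D^{-1/2} v
    psi = eigvecs / np.sqrt(d)[:, None]

    # skip the trivial first pair (eigenvalue 1, constant eigenvector)
    embedding = psi[:, 1:n_components + 1] * eigvals[1:n_components + 1]
    return embedding, eigvals


# synthetic data for illustration only
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 5))
embedding, eigvals = diffusion_map(X_demo, n_components=3, epsilon=10.0)
```

This sketch builds a dense n_samples by n_samples kernel, so its memory and time costs grow quickly with the number of samples.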


### Incremental Principal Component Analysis

We build the latent space using Incremental Principal Component Analysis.

In [None]:
# latent method
latent_method = "incremental_principal_component_analysis"

# build the latent space using Incremental PCA
latent_space_builder.build_latent_space(dataset_file, image_type, target_type, latent_method, latent_dim, dataset_size=10000, training_set_size=2000, batch_size=100)
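
To make the roles of `training_set_size` and `batch_size` concrete, the sketch below enumerates the batches an incremental fit would see, assuming the training set is consumed in consecutive, non-overlapping batches. That consumption order is an assumption about `latent_space_builder`, and the helper `batch_slices` is hypothetical.

```python
def batch_slices(training_set_size, batch_size):
    """Yield slices that cover range(training_set_size) in consecutive batches."""
    for start in range(0, training_set_size, batch_size):
        yield slice(start, min(start + batch_size, training_set_size))


# same sizes as in the cell above
batches = list(batch_slices(training_set_size=2000, batch_size=100))
print(len(batches))  # 20
```

If the builder wraps scikit-learn's `IncrementalPCA` (also an assumption), each batch must contain at least `latent_dim` samples, which holds here since 100 >= 50.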


### Ensemble Principal Component Analysis

We build the latent space using an Ensemble of Principal Component Analysis models.

In [None]:
# latent method
latent_method = "ensemble_pca"

# build the latent space using Ensemble PCA
latent_space_builder.build_latent_space(dataset_file, image_type, target_type, latent_method, latent_dim, dataset_size=10000, training_set_size=4000, batch_size=2000, n_shuffles=30)
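
The notebook does not describe how the ensemble is formed. One plausible reading of `n_shuffles` is that PCA is fit repeatedly on randomly drawn batches and the results are aggregated. The sketch below illustrates that idea on synthetic data by averaging the explained variance ratios across shuffles. The aggregation strategy and the helper names `pca_variance_ratio` and `ensemble_variance_ratio` are assumptions, not the library's method.

```python
import numpy as np


def pca_variance_ratio(X, n_components):
    """Explained variance ratio of the leading components, computed via SVD."""
    Xc = X - X.mean(axis=0)
    variances = np.linalg.svd(Xc, compute_uv=False) ** 2
    return variances[:n_components] / variances.sum()


def ensemble_variance_ratio(X, n_components, batch_size, n_shuffles, seed=0):
    """Fit PCA on `n_shuffles` random batches and return the mean and standard
    deviation of the explained variance ratios (assumed aggregation)."""
    rng = np.random.default_rng(seed)
    ratios = []
    for _ in range(n_shuffles):
        # draw a random batch of rows without replacement
        idx = rng.permutation(len(X))[:batch_size]
        ratios.append(pca_variance_ratio(X[idx], n_components))
    ratios = np.asarray(ratios)
    return ratios.mean(axis=0), ratios.std(axis=0)


# synthetic data for illustration only
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 10))
mean_ratio, std_ratio = ensemble_variance_ratio(
    X_demo, n_components=3, batch_size=50, n_shuffles=5
)
```

The spread across shuffles (`std_ratio`) gives a rough indication of how sensitive the leading components are to the choice of training batch.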
