## Basic Examples for using each of the 6 methods of the sklearn.preprocessing.MinMaxScaler class

### Methods of the sklearn.preprocessing.MinMaxScaler class
1. `fit(X[, y])`: Compute the minimum and maximum to be used for later scaling.
2. `fit_transform(X[, y])`: Fit to data, then transform it.
3. `get_params([deep])`: Get parameters for this estimator.
4. `inverse_transform(X)`: Undo the scaling of X according to feature_range.
5. `set_params(**params)`: Set the parameters of this estimator.
6. `transform(X)`: Scale features of X according to feature_range.

#### fit(X)
Computes the minimum and maximum to be used for later scaling.

In [None]:
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Sample data
# X = np.array([[1, -1, 2],
#               [2, 0, 0],
#               [0, 1, -1]])

# X = np.array([[1, 2, 3, 4, 5],
#               [5, 4, 3, 2, 1],
#               [1, 1, 1, 2, 5]])

array_1 = [1, -1, 2]
array_2 = [2, 0, 0]
array_3 = [0, 1, -1]

X = np.array([array_1,
             array_2,
             array_3])

print('X:')
print(X)
X_min_axis_0 = X.min(axis=0)
X_max_axis_0 = X.max(axis=0)

## X_std is the min-max normalization of X: each column is rescaled to [0, 1].
## Despite the variable name, this is not z-score standardization.
X_std_num = (X - X_min_axis_0)
X_std_denom = (X_max_axis_0 - X_min_axis_0)
X_std = X_std_num / X_std_denom
print('X.min(axis=0):')
print(X_min_axis_0)
print('X.max(axis=0):')
print(X_max_axis_0)
print('X_std_num:')
print(X_std_num)
print('X_std_denom:')
print(X_std_denom)
print('X_std:')
print(X_std)


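# Standard deviation over all entries of X (np.std flattens the array by default).
# Shown for comparison only; MinMaxScaler does not use it.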
X_std_dev_np = np.std(X)
print('X_std_dev_np:')
print(X_std_dev_np)

# Initialize the scaler
scaler = MinMaxScaler()

scaler.fit(X)

X:
[[ 1 -1  2]
 [ 2  0  0]
 [ 0  1 -1]]
X.min(axis=0):
[ 0 -1 -1]
X.max(axis=0):
[2 1 2]
X_std_num:
[[1 0 3]
 [2 1 1]
 [0 2 0]]
X_std_denom:
[2 2 3]
X_std:
[[0.5        0.         1.        ]
 [1.         0.5        0.33333333]
 [0.         1.         0.        ]]
X_std_dev_np:
1.0657403385139377


MinMaxScaler()
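
The `fit` call above only prints the estimator itself. The values it learned are stored as attributes on the scaler. The sketch below rebuilds the same `X` so that it runs on its own and uses a separate name, `scaler_demo`, to avoid touching the notebook's `scaler`.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1, -1, 2],
              [2, 0, 0],
              [0, 1, -1]])

scaler_demo = MinMaxScaler().fit(X)

# Per-column statistics learned by fit()
print("data_min_:  ", scaler_demo.data_min_)    # column minimums
print("data_max_:  ", scaler_demo.data_max_)    # column maximums
print("data_range_:", scaler_demo.data_range_)  # data_max_ - data_min_
print("scale_:     ", scaler_demo.scale_)       # multiplier applied by transform()
```

These match the `X.min(axis=0)`, `X.max(axis=0)`, and `X_std_denom` values computed by hand in the cell above.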

#### transform(X)
Scales the data based on the previously computed min and max.

In [None]:
X_scaled = scaler.transform(X)
print("Transformed:\n", X_scaled)

Transformed:
 [[0.5        0.         1.        ]
 [1.         0.5        0.33333333]
 [0.         1.         0.        ]]
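
`transform` reuses the minimum and maximum learned during `fit`, so rows that were not part of the fitted data can land outside the feature range. A small sketch of that behavior, using a made-up row `X_new` (not from the notebook's data) and the `clip=True` constructor option:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[1, -1, 2],
                    [2, 0, 0],
                    [0, 1, -1]])
X_new = np.array([[3, 2, -2]])  # hypothetical unseen row, outside the training range

plain = MinMaxScaler().fit(X_train)
clipped = MinMaxScaler(clip=True).fit(X_train)

out_plain = plain.transform(X_new)      # may fall outside [0, 1]
out_clipped = clipped.transform(X_new)  # forced back into [0, 1]

print("Without clip:", out_plain)
print("With clip:   ", out_clipped)
```

Whether clipping is appropriate depends on the downstream model; it hides the fact that the new data lies outside the range seen during fitting.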


#### fit_transform(X)
Combines fit and transform in one step (common shortcut).

In [None]:
X_scaled_direct = scaler.fit_transform(X)
print("Fit and Transformed:\n", X_scaled_direct)

Fit and Transformed:
 [[0.5        0.         1.        ]
 [1.         0.5        0.33333333]
 [0.         1.         0.        ]]


#### inverse_transform(X_scaled)
Reverses the scaling back to the original data values.

In [None]:
X_original = scaler.inverse_transform(X_scaled_direct)
print("Inverse Transformed:\n", X_original)

Inverse Transformed:
 [[ 1. -1.  2.]
 [ 2.  0.  0.]
 [ 0.  1. -1.]]


#### get_params()
Returns the parameters used in the scaler instance (e.g., feature_range).

In [None]:
params = scaler.get_params()
print("Scaler Parameters:\n", params)

Scaler Parameters:
 {'clip': False, 'copy': True, 'feature_range': (0, 1)}


#### set_params(**params)
Updates the parameters of the scaler.

In [None]:
scaler.set_params(feature_range=(-1, 1))
X_scaled_custom_range = scaler.fit_transform(X)
print("Transformed with custom range (-1, 1):\n", X_scaled_custom_range)

Transformed with custom range (-1, 1):
 [[ 0.         -1.          1.        ]
 [ 1.          0.         -0.33333333]
 [-1.          1.         -1.        ]]
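
The custom range follows from the same min-max formula used earlier: the `[0, 1]` result is stretched to the width of the target range and shifted by its lower bound. A NumPy-only sketch that reproduces the output above by hand (`lo` and `hi` are just local names for the two ends of `feature_range`):

```python
import numpy as np

X = np.array([[1, -1, 2],
              [2, 0, 0],
              [0, 1, -1]])

lo, hi = -1, 1  # the feature_range passed to set_params

# Step 1: rescale each column to [0, 1]
X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Step 2: stretch to the target width and shift to the lower bound
X_scaled_manual = X_std * (hi - lo) + lo

print(X_scaled_manual)
```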


## Compare & Contrast CTGAN to basic GAN (Generative Adversarial Networks)

### Core Difference:
The fundamental difference lies in their architecture and the way they handle tabular data, especially categorical features.

- CTGAN: Is explicitly designed for tabular data. It incorporates techniques to handle categorical variables effectively and to learn the underlying distributions of the data more accurately. It often uses a conditional generator that takes both random noise and information about the discrete columns as input.   
- Basic GAN: Is a more general generative model originally designed for continuous data like images. When applied directly to tabular data, it typically treats all columns as continuous, which can lead to poor handling of categorical features and potentially unrealistic synthetic data.
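
To make the "random noise plus information about the discrete columns" idea concrete, here is a simplified NumPy sketch of what a conditional generator's input could look like. It only illustrates the concept and is not the CTGAN library's implementation; the category values and sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

categories = ["GB", "CA", "AU"]  # hypothetical values of one discrete column
batch_size, latent_dim = 4, 8    # illustrative sizes only

# Pick one category per generated row and one-hot encode it.
chosen = rng.integers(0, len(categories), size=batch_size)
cond = np.eye(len(categories))[chosen]

# A basic GAN generator receives only noise.
noise = rng.standard_normal((batch_size, latent_dim))

# A conditional generator receives the noise together with the condition vector.
generator_input = np.concatenate([noise, cond], axis=1)

print("basic GAN input shape:      ", noise.shape)
print("conditional generator input:", generator_input.shape)
```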

## FAILED ATTEMPTS

### Filter out rows that are Out-of-Bounds

In [None]:
# print('No Filters on synthetic_data')
# synthetic_data_filtered = synthetic_data.copy()
# inspect(synthetic_data, 2)

# print(len(likert_cols))
# count = 0

# for item in likert_cols:
#     synthetic_data_filtered = synthetic_data_filtered[
#         (synthetic_data_filtered[item] >= 1) & (synthetic_data_filtered[item] <= 5)
#     ]
#     count = count + 1
#     print(synthetic_data_filtered.shape)

# print("\n\n******")
# print('Likert Scale scores')
# print('Count of loop on likert_cols list: ', count)
# inspect(synthetic_data, 2)
# inspect(synthetic_data_filtered, 2)
# examinePDF(synthetic_data_filtered, 2)

# """
# Time spent on question in milliseconds.... so the min_of_mins of 414.0 (milliseconds) is 0.414 seconds.
# Let's make 0.25 seconds the floor.
# The max_of_maxes of 2924135.0 milliseconds is 2,924.135 seconds which is 48.735 minutes.
# Let's make the max time for a question 1hour = 3,600,000 milliseconds
# """
# question_time_floor = 250
# question_time_ceiling = 3600000

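# # Filter cumulatively. Re-reading synthetic_data_filtered on every pass would
# # keep only the filter for the last column in time_cols.
# synthetic_data_filtered_2 = synthetic_data_filtered.copy()
# for item in time_cols:
#     synthetic_data_filtered_2 = synthetic_data_filtered_2[
#         (synthetic_data_filtered_2[item] >= question_time_floor) & (synthetic_data_filtered_2[item] <= question_time_ceiling)
#     ]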

# print("\n\n******")
# print('Question time')
# inspect(synthetic_data, 2)
# inspect(synthetic_data_filtered, 2)
# inspect(synthetic_data_filtered_2, 2)
# #examinePDF(synthetic_data_filtered_2, 2)

# """
# Let's keep the min and max for screen width and height the same from the Sample of 63 real data values
# """
# print("\n\n******")
# print('Screen size\n')
# screenw_min = other_col_meta_data[0][0]
# screenw_max = other_col_meta_data[0][1]
# screenh_min = other_col_meta_data[1][0]
# screenh_max = other_col_meta_data[1][1]
# screen_w_h_min_max_list = [screenw_min, screenw_max, screenh_min, screenh_max]
# for item in screen_w_h_min_max_list:
#     print(item)


# synthetic_data_filtered_3 = synthetic_data_filtered_2[
#     (synthetic_data_filtered_2[screenw] >= screenw_min) & (synthetic_data_filtered_2[screenw] <= screenw_max) &
#     (synthetic_data_filtered_2[screenh] >= screenh_min) & (synthetic_data_filtered_2[screenh] <= screenh_max)
# ]

# inspect(synthetic_data, 2)
# inspect(synthetic_data_filtered, 2)
# inspect(synthetic_data_filtered_2, 2)
# inspect(synthetic_data_filtered_3, 2)
# #examinePDF(synthetic_data_filtered_3, 2)

# """
# Let's keep the min of introelapse to 2.0
# """
# introelapse_min = 2.0
# print("\n\n******")
# print('introelapse')

# synthetic_data_filtered_4 = synthetic_data_filtered_3[
#     (synthetic_data_filtered_3[introelapse] >= introelapse_min)
# ]

# inspect(synthetic_data, 2)
# inspect(synthetic_data_filtered, 2)
# inspect(synthetic_data_filtered_2, 2)
# inspect(synthetic_data_filtered_3, 2)
# inspect(synthetic_data_filtered_4, 2)
    
# examinePDF(synthetic_data_filtered_4, 3)

No Filters on synthetic_data

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(25200, 108)
50
(17654, 108)
(15805, 108)
(12609, 108)
(11372, 108)
(7082, 108)
(5992, 108)
(4074, 108)
(3595, 108)
(2149, 108)
(1653, 108)
(1208, 108)
(1069, 108)
(936, 108)
(491, 108)
(440, 108)
(252, 108)
(185, 108)
(163, 108)
(149, 108)
(93, 108)
(69, 108)
(62, 108)
(24, 108)
(17, 108)
(16, 108)
(12, 108)
(11, 108)
(11, 108)
(7, 108)
(7, 108)
(7, 108)
(7, 108)
(7, 108)
(6, 108)
(5, 108)
(2, 108)
(2, 108)
(2, 108)
(0, 108)
(0, 108)
(0, 108)
(0, 108)
(0, 108)
(0, 108)
(0, 108)
(0, 108)
(0, 108)
(0, 108)
(0, 108)
(0, 108)


******
Likert Scale scores
Count of loop on likert_cols list:  50

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(25200, 108)

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(0, 108)

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(0, 108)
       EXT1  EXT2  EXT3  EXT4  EXT5  EXT6  EXT7

Unnamed: 0,EXT1,EXT2,EXT3,EXT4,EXT5,EXT6,EXT7,EXT8,EXT9,EXT10,...,OPN9_E,OPN10_E,screenw,screenh,introelapse,testelapse,endelapse,country,lat_appx_lots_of_err,long_appx_lots_of_err




******
Question time

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(25200, 108)

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(0, 108)

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(0, 108)


******
Screen size

320.0
2048.0
533.0
1200.0

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(25200, 108)

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(0, 108)

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(0, 108)

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(0, 108)


******
introelapse

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(25200, 108)

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(0, 108)

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(0, 108)

**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.Data

Unnamed: 0,EXT1,EXT2,EXT3,EXT4,EXT5,EXT6,EXT7,EXT8,EXT9,EXT10,...,OPN9_E,OPN10_E,screenw,screenh,introelapse,testelapse,endelapse,country,lat_appx_lots_of_err,long_appx_lots_of_err
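
The shapes printed above show why this attempt failed: a row survives only if all 50 Likert columns are in range at once, so the row count shrinks with every column until nothing is left. The sketch below reproduces the effect on random stand-in data (not the notebook's `synthetic_data`) and contrasts it with rounding and clipping, which keeps every row. The normal distribution and its parameters are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_cols = 25200, 50  # same row count and Likert column count as above

# Stand-in for unconstrained generator output: continuous values around 3.
fake = rng.normal(3.0, 1.5, size=(n_rows, n_cols))

# Row-wise filtering: keep a row only if every column lies in [1, 5].
in_range = ((fake >= 1) & (fake <= 5)).all(axis=1)
kept = fake[in_range]

# Rounding and clipping: every row is kept and out-of-range values
# are moved to the nearest bound.
bounded = np.clip(np.round(fake), 1, 5).astype(int)

print(len(kept), "rows survive filtering")
print(len(bounded), "rows survive clipping")
```

Clipping has its own cost: it piles probability mass onto the boundary values 1 and 5, so the resulting distribution should still be compared against the real data.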


### Use CTGAN from 'sdv.tabular' for 'data_pdf_clean'

In [None]:
# import pandas as pd
# from sdv.tabular import CTGAN

# # Identify categorical columns
# categorical_columns = ['country']

# # Model inputs
# epochs = 100

# # Initialize the CTGAN model
# ctgan = CTGAN(epochs=epochs) # You can adjust the number of epochs

# # Fit the CTGAN model to your real data
# ctgan.fit(data_pdf_clean, discrete_columns=categorical_columns)

# # Generate synthetic data
# num_samples = len(data_pdf_clean) * 2 # Generate twice the number of real samples
# print("Number of samples:")
# print(num_samples)
# synthetic_data = ctgan.sample(num_samples)

# # Print the first few rows of the synthetic data
# # print("First 5 rows of synthetic data:")
# # print(synthetic_data.head())

# ##Display Synthetic Data Frame
# inspect(synthetic_data, 3)
# print(synthetic_data.describe())
# display(synthetic_data)

ModuleNotFoundError: No module named 'sdv.tabular'

### Bound Columns by Applying an **Inverse Transformation and Bounds After Generation**

Columns to bound:

- All Likert-scale questions: between 1 and 5
- 'screenw' and 'screenh'
- 'introelapse', 'testelapse', 'endelapse'

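### Use `from sdv.tabular import CTGANSynthesizer` instead of `from ctgan import CTGAN`

This attempt fails for the same reason as the `sdv.tabular` import above: the installed SDV release (1.20.0, printed below) has no `sdv.tabular` module. In SDV 1.x the single-table synthesizers, including `CTGANSynthesizer`, live in `sdv.single_table`.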

In [None]:
# ##Define Constraints for CTGANSynthesizer
# # import sdv
# print(sdv.__version__)

# from sdv.tabular import CTGANSynthesizer

1.20.0


ModuleNotFoundError: No module named 'sdv.tabular'

In [None]:
# from sklearn.preprocessing import MinMaxScaler
# from ctgan import CTGAN

# ##Scale Likert data columns before training CTGAN
# scaler = MinMaxScaler()
# data_pdf_clean[likert_cols] = scaler.fit_transform(data_pdf_clean[likert_cols]) ##???Something must be wrong with fit_transform???

# #Train CTGAN with Scaled data
# epochs = 100
# ctgan = CTGAN(epochs=epochs)
# ctgan.fit(data_pdf_clean, discrete_columns=categorical_columns)
# # Now, generate synthetic data of the same size as the original training data:
# num_samples=len(data_pdf_clean)
# #num_samples = len(data_pdf_clean) * 200 ## Scaling up samples here somehow distorts distribution of Values
# synthetic_scaled = ctgan.sample(num_samples)

# inspect(synthetic_scaled, 3)
# display(synthetic_scaled)
# synthetic_scaled.describe()


ImportError: cannot import name 'TypeVar' from 'typing_extensions' (/Users/maliksaunders/opt/anaconda3/lib/python3.9/site-packages/typing_extensions.py)

In [None]:
###

# # Create a copy to avoid modifying the scaled data
# synthetic_data = synthetic_scaled.copy()

# # Inverse transform the numerical columns
# synthetic_data[likert_cols] = scaler.inverse_transform(synthetic_scaled[likert_cols])

# # Apply bounds and discretize for rating columns 
# rating_cols = likert_cols
# min_rating = 1
# max_rating = 5

# for col in rating_cols:
#     synthetic_data[col] = synthetic_data[col].round().clip(min_rating, max_rating).astype(int)

# # # Apply bounds for other numerical columns if needed (e.g., time measurements)
# # time_cols = ['EXT1_E', 'AGR1_E'] # List your time-related columns
# # min_time = 0 # Assuming time cannot be negative
# # max_time = 10000 # Example maximum time (adjust based on your data)

# # for col in time_cols:
# #     synthetic_data[col] = synthetic_data[col].clip(min_time, max_time)

# # Now 'synthetic_data' contains values within your desired bounds and is discretized for rating columns
# inspect(synthetic_data, 3)
# display(synthetic_data)
# synthetic_data.describe()


**Inspection of Pandas DataFrame**
<class 'pandas.core.frame.DataFrame'>
(63, 108)
['EXT1', 'EXT2', 'EXT3', 'EXT4', 'EXT5', 'EXT6', 'EXT7', 'EXT8', 'EXT9', 'EXT10', 'EST1', 'EST2', 'EST3', 'EST4', 'EST5', 'EST6', 'EST7', 'EST8', 'EST9', 'EST10', 'AGR1', 'AGR2', 'AGR3', 'AGR4', 'AGR5', 'AGR6', 'AGR7', 'AGR8', 'AGR9', 'AGR10', 'CSN1', 'CSN2', 'CSN3', 'CSN4', 'CSN5', 'CSN6', 'CSN7', 'CSN8', 'CSN9', 'CSN10', 'OPN1', 'OPN2', 'OPN3', 'OPN4', 'OPN5', 'OPN6', 'OPN7', 'OPN8', 'OPN9', 'OPN10', 'EXT1_E', 'EXT2_E', 'EXT3_E', 'EXT4_E', 'EXT5_E', 'EXT6_E', 'EXT7_E', 'EXT8_E', 'EXT9_E', 'EXT10_E', 'EST1_E', 'EST2_E', 'EST3_E', 'EST4_E', 'EST5_E', 'EST6_E', 'EST7_E', 'EST8_E', 'EST9_E', 'EST10_E', 'AGR1_E', 'AGR2_E', 'AGR3_E', 'AGR4_E', 'AGR5_E', 'AGR6_E', 'AGR7_E', 'AGR8_E', 'AGR9_E', 'AGR10_E', 'CSN1_E', 'CSN2_E', 'CSN3_E', 'CSN4_E', 'CSN5_E', 'CSN6_E', 'CSN7_E', 'CSN8_E', 'CSN9_E', 'CSN10_E', 'OPN1_E', 'OPN2_E', 'OPN3_E', 'OPN4_E', 'OPN5_E', 'OPN6_E', 'OPN7_E', 'OPN8_E', 'OPN9_E', 'OPN10_E', 'scre

Unnamed: 0,EXT1,EXT2,EXT3,EXT4,EXT5,EXT6,EXT7,EXT8,EXT9,EXT10,...,OPN9_E,OPN10_E,screenw,screenh,introelapse,testelapse,endelapse,country,lat_appx_lots_of_err,long_appx_lots_of_err
0,1,1,1,1,1,1,1,1,1,1,...,2140,4980,903,591,19,213,10,GB,-14.357633,216.785205
1,1,1,1,1,1,1,1,1,1,1,...,7157,3554,1157,977,51,97,5,CA,15.025642,196.829831
2,1,1,1,1,1,1,1,1,1,1,...,29225,5050,624,611,31,-130,18,HR,49.798277,211.435933
3,1,1,1,1,1,1,1,1,1,1,...,4383,4456,936,630,43,61,7,AU,42.128067,96.524341
4,1,1,1,1,1,1,1,1,1,1,...,5234,3586,1400,571,30,470,13,PT,57.904745,-43.862155
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
58,1,1,1,1,1,1,1,1,1,1,...,20877,4447,-153,704,11,-31,14,CO,52.427400,75.629390
59,1,1,1,1,1,1,1,1,1,1,...,4332,4392,948,567,45,363,8,JO,58.042447,98.195827
60,1,1,1,1,1,1,1,1,1,1,...,6989,4657,1422,512,9,20,18,TR,23.748721,194.110053
61,1,1,1,1,1,1,1,1,1,1,...,6761,1381,1543,775,45,319,28,FR,31.211897,111.686421


Unnamed: 0,EXT1,EXT2,EXT3,EXT4,EXT5,EXT6,EXT7,EXT8,EXT9,EXT10,...,OPN8_E,OPN9_E,OPN10_E,screenw,screenh,introelapse,testelapse,endelapse,lat_appx_lots_of_err,long_appx_lots_of_err
count,63.0,63.0,63.0,63.0,63.0,63.0,63.0,63.0,63.0,63.0,...,63.0,63.0,63.0,63.0,63.0,63.0,63.0,63.0,63.0,63.0
mean,1.0,1.0,1.0,1.0,1.0,1.0,1.031746,1.0,1.015873,1.0,...,1862.730159,5455.888889,4272.15873,907.333333,705.698413,69.253968,170.031746,16.222222,40.412478,96.494353
std,0.0,0.0,0.0,0.0,0.0,0.0,0.176731,0.0,0.125988,0.0,...,2193.311079,6130.724419,2546.902598,618.653976,169.252973,116.905251,159.127051,7.0698,24.35566,92.465639
min,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,-2281.0,-1939.0,414.0,-384.0,424.0,-23.0,-145.0,3.0,-15.000944,-94.660984
25%,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,514.5,2289.5,2928.0,536.0,580.5,18.5,65.5,11.0,23.178995,46.324147
50%,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1458.0,4332.0,4127.0,946.0,676.0,31.0,157.0,16.0,50.289184,117.859605
75%,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,2582.5,6831.5,4910.5,1276.0,819.0,45.0,283.0,19.0,56.975809,166.502809
max,1.0,1.0,1.0,1.0,1.0,1.0,2.0,1.0,2.0,1.0,...,8569.0,29225.0,15327.0,2409.0,1189.0,575.0,550.0,40.0,81.835745,224.027779
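
In the summary above, nearly every Likert column has collapsed to 1. One possible explanation, which the notebook does not confirm, is that the scaling cell overwrote `data_pdf_clean[likert_cols]` in place and was run more than once. On the second run the scaler is fitted on data that is already in `[0, 1]`, so `inverse_transform` no longer maps back to the 1 to 5 scale, and rounding plus clipping pushes everything to 1. A tiny sketch of that pitfall with made-up values:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

likert = np.array([[1.0], [3.0], [5.0]])  # made-up Likert answers

# First run: the scaler learns min=1, max=5.
scaler_raw = MinMaxScaler()
scaled_once = scaler_raw.fit_transform(likert)

# Second run on the already-scaled data: the new scaler learns min=0, max=1.
scaler_rerun = MinMaxScaler()
scaled_twice = scaler_rerun.fit_transform(scaled_once)

# inverse_transform with the re-fitted scaler stays in [0, 1] ...
restored_wrong = scaler_rerun.inverse_transform(scaled_twice)
# ... so rounding and clipping to [1, 5] collapses every value to 1.
bounded_wrong = np.clip(np.round(restored_wrong), 1, 5)

# Using the scaler fitted on the raw data recovers the original scale.
restored_ok = scaler_raw.inverse_transform(scaled_once)

print("re-fitted scaler:", bounded_wrong.ravel())
print("original scaler: ", restored_ok.ravel())
```

Fitting the scaler once on an untouched copy of the real data, and keeping that fitted object for the inverse step, avoids the problem.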


In [None]:
# synthetic_data.describe()

## Use GAN (Basic GAN)

### General Guidelines and Strategies that can help design an effective architecture for tabular GANs

1. Input Size and Complexity of Data
The number of hidden layer nodes often depends on the complexity and dimensionality of the input data. Tabular data can vary significantly in complexity, with features that can have a wide range of distributions and relationships. In general:
- If the input data has fewer features and is relatively simple, fewer hidden nodes may suffice.
- If the data is high-dimensional and complex, a larger number of hidden nodes might be necessary to capture the underlying structure.


2. Progressive Layer Sizes
In many GAN architectures, a common practice is to progressively increase (or decrease) the number of nodes in hidden layers as you move deeper into the network. For example, you might start with a layer size similar to the input layer, then increase the number of nodes in subsequent hidden layers. A typical structure might look like:
- Generator: Start with a smaller number of nodes (e.g., 128 or 256) and progressively increase the number of nodes as you move to deeper layers to generate data that matches the complexity of the original input.
- Discriminator: Start with a large number of nodes and decrease the size as you move deeper into the network.


3. Common Ratios and Heuristics
While there's no fixed ratio, a typical approach is to:
- Set the number of hidden nodes to 2 to 4 times the number of input-layer nodes, especially in the initial hidden layers. For example, with 10 input features this gives 20 to 40 nodes, so a first hidden layer of 32 nodes (rounded to a power of two) is a reasonable starting point.
- Experiment with decreasing or increasing the size of subsequent layers (e.g., 128 → 64 → 32).
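
As a rough illustration of the heuristics above, the helper below turns a feature count into a list of hidden-layer widths. `hidden_sizes` is a hypothetical function written for this note, not part of any library, and doubling the width at each layer is only one of many reasonable choices.

```python
def hidden_sizes(n_features, factor=2, n_layers=2):
    """Return widening hidden-layer sizes, starting at factor * n_features."""
    if not 2 <= factor <= 4:
        raise ValueError("the rule of thumb above suggests a factor between 2 and 4")
    first = factor * n_features
    # Double the width at each subsequent layer.
    return [first * 2 ** i for i in range(n_layers)]


# Widening layers for a generator, the reverse for a discriminator.
gen_sizes = hidden_sizes(131)
disc_sizes = gen_sizes[::-1]

print("generator hidden layers:    ", gen_sizes)
print("discriminator hidden layers:", disc_sizes)
```

For a table with 131 encoded features this suggests widths close to the 256 and 512 used later in this notebook.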


4. Regularization
As you increase the number of hidden nodes, the risk of overfitting can also increase, especially if your tabular data is not very large. In these cases, it is important to:
- Use dropout layers or batch normalization to prevent overfitting and ensure better generalization.
- Apply early stopping based on validation performance.
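
A minimal PyTorch sketch of where such layers could sit in a small tabular GAN. The layer widths, the dropout rate of 0.3, and the feature count are illustrative assumptions, not tuned values from this notebook.

```python
import torch
import torch.nn as nn

n_features, latent_dim = 20, 16  # illustrative sizes only

# Generator with batch normalization after the hidden layer
generator_bn = nn.Sequential(
    nn.Linear(latent_dim, 64),
    nn.BatchNorm1d(64),
    nn.ReLU(),
    nn.Linear(64, n_features),
)

# Discriminator with dropout after the hidden layer
discriminator_do = nn.Sequential(
    nn.Linear(n_features, 64),
    nn.LeakyReLU(0.2),
    nn.Dropout(0.3),
    nn.Linear(64, 1),
    nn.Sigmoid(),
)

z = torch.randn(10, latent_dim)   # a batch of 10 noise vectors
fake = generator_bn(z)            # 10 synthetic rows
score = discriminator_do(fake)    # one probability per row

print(fake.shape, score.shape)
```

Note that `BatchNorm1d` needs more than one row per batch in training mode, and both layer types behave differently after calling `.eval()`.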


5. Empirical Testing and Hyperparameter Tuning
GANs can be sensitive to architecture design, and performance will vary based on the specific dataset and task. The best practice is to empirically test different architectures using:
- Grid search or random search to experiment with different hidden layer sizes.
- Tools like Hyperopt or Optuna for automated hyperparameter optimization.


### Conclusion
There is no fixed "ideal ratio" of hidden layer nodes to input layer nodes for tabular GANs. A good rule of thumb is to start with 2–4 times the input layer size for hidden nodes, but the best architecture will depend on the complexity of your data and the nature of your task. Empirical tuning, guided by validation results, will help you find the best architecture.

### Use GANs and PyTorch to create synthetic data 

Use GANs and PyTorch to create synthetic data from the sample data above (63 rows).

#### Define the Generator and Discriminator Models of GANs

In [None]:
import torch
import torch.nn as nn

# ## Original Generator and Discriminator Functions
# # Define the Generator model 
# class Generator(nn.Module):
#     def __init__(self, input_dim, output_dim):
#         super(Generator, self).__init__()
#         self.model = nn.Sequential(
#             nn.Linear(input_dim, 128),
#             nn.ReLU(True),
#             nn.Linear(128, 256),
#             nn.ReLU(True),
#             nn.Linear(256, output_dim),
#         )

#     def forward(self, z):
#         return self.model(z)

# # Define the Discriminator model
# class Discriminator(nn.Module):
#     def __init__(self, input_dim):
#         super(Discriminator, self).__init__()
#         self.model = nn.Sequential(
#             nn.Linear(input_dim, 256),
#             nn.ReLU(True),
#             nn.Linear(256, 128),
#             nn.ReLU(True),
#             nn.Linear(128, 1),
#             nn.Sigmoid(),
#         )

#     def forward(self, x):
#         return self.model(x)



## 2nd Generator and Discriminator Functions
# Define the Generator model 
class Generator(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.ReLU(True),
            nn.Linear(256, 512),
            nn.ReLU(True),
            nn.Linear(512, output_dim),
        )

    def forward(self, z):
        return self.model(z)

# Define the Discriminator model
class Discriminator(nn.Module):
    def __init__(self, input_dim):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, 512),
            nn.ReLU(True),
            nn.Linear(512, 256),
            nn.ReLU(True),
            nn.Linear(256, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.model(x)

### Initialize Models and Optimizers

In [None]:
# Dimensions
##input_dim = len(predictor_cols)  # Number of features
# input_dim = len(data_pdf_clean.columns) # Number of features
# output_dim = input_dim + 1  # Features + target

input_dim = len(data_pdf_clean_encoded.columns) # Number of features
output_dim = input_dim + 1  # Features + target
print('input_dim =', input_dim)
print('output_dim =', output_dim)

## Create Tensor
data_np_array = data_pdf_clean_encoded.to_numpy()
##data_tensor = torch.from_numpy(data_pdf_clean_encoded.values)
data_tensor = torch.tensor(data_pdf_clean_encoded.values)
# print('data_pdf_clean.shape =', data_pdf_clean.shape)
# print('data_pdf_clean.size =', data_pdf_clean.size)
print('data_pdf_clean_encoded.shape =', data_pdf_clean_encoded.shape)
print('data_pdf_clean_encoded.size =', data_pdf_clean_encoded.size)
print(
    'type(enumerate(data_pdf_clean_encoded)):', 
    type(enumerate(data_pdf_clean_encoded))
)
print(
    'len(list(enumerate(data_pdf_clean_encoded))):', 
    len(list(enumerate(data_pdf_clean_encoded)))
)

"""In PyTorch, .size(0) returns the size of the first 
dimension of a tensor."""
print('data_tensor.shape =', data_tensor.shape)
##print('data_tensor.size =', data_tensor.size)
print('data_tensor.size(0) =', data_tensor.size(0))
print(
    'type(enumerate(data_tensor)):', 
    type(enumerate(data_tensor))
)
print(
    'len(list(enumerate(data_tensor))):', 
    len(list(enumerate(data_tensor)))
)


count = 0
for i, data in enumerate(data_tensor):
    print('\ni=', i, type(data))
    print('Length of data=', len(data))
    ##print('data.size=', data.size)
    print('data.size(0)=', data.size(0))
    ##print('data:')
    ##print(data)
    count = count + 1
    if count >= 3:
        break

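# Instantiate models
# NOTE: these sizes do not match the training cell below. The generator is built for
# input_dim (131) inputs but is fed latent_dim (100) noise values, and the discriminator
# expects output_dim (132) features while each real row has 131 (see the RuntimeError).
generator = Generator(input_dim=input_dim, output_dim=output_dim)
discriminator = Discriminator(input_dim=output_dim)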
'''
Remember that for GANs the number of "Backfed Input Cell" is equal
to the number of "Match Input Output Cells" 
'''

# Optimizers
lr = 0.0002
optimizer_G = torch.optim.Adam(generator.parameters(), lr=lr)
optimizer_D = torch.optim.Adam(discriminator.parameters(), lr=lr)

# Loss function
criterion = nn.BCELoss() ##Binary Cross Entropy Loss

input_dim = 131
output_dim = 132
data_pdf_clean_encoded.shape = (63, 131)
data_pdf_clean_encoded.size = 8253
type(enumerate(data_pdf_clean_encoded)): <class 'enumerate'>
len(list(enumerate(data_pdf_clean_encoded))): 131
data_tensor.shape = torch.Size([63, 131])
data_tensor.size(0) = 63
type(enumerate(data_tensor)): <class 'enumerate'>
len(list(enumerate(data_tensor))): 63

i= 0 <class 'torch.Tensor'>
Length of data= 131
data.size(0)= 131

i= 1 <class 'torch.Tensor'>
Length of data= 131
data.size(0)= 131

i= 2 <class 'torch.Tensor'>
Length of data= 131
data.size(0)= 131


### Training the GAN

- num_epochs: The number of epochs to train for. One epoch is one complete pass through the entire training dataset by both the generator and the discriminator.
- batch_size: The number of data samples processed together in one forward and backward pass. It is a key training parameter for both the generator and the discriminator.
- latent_dim: The size of the random input vector (often called the noise or latent vector) fed into the generator, which uses it to produce synthetic data that mimics the real tabular data.
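
A short sketch of how these three settings relate, using `torch.utils.data.DataLoader` and a random stand-in for the 63 x 131 encoded table. The real data is not used here, and wrapping the tensor in a `DataLoader` is an assumption about how batching could be done rather than what the training cell below does.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

n_rows, n_features = 63, 131  # shape of the encoded table in this notebook
num_epochs = 7
batch_size = 10
latent_dim = 100

data = torch.randn(n_rows, n_features)  # random stand-in for the real rows
loader = DataLoader(TensorDataset(data), batch_size=batch_size, shuffle=True)

# One epoch = one pass over all batches produced by the loader.
batch_shapes = [tuple(batch.shape) for (batch,) in loader]
print("batches per epoch:", len(loader))
print("batch shapes:", batch_shapes)

# One noise vector of length latent_dim per row in the batch.
z = torch.randn(batch_size, latent_dim)
print("noise shape:", tuple(z.shape))
```

With 63 rows and a batch size of 10, the last batch of each epoch holds only 3 rows, so label tensors should be sized from the batch actually received.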

#### torch.randn()
* torch.randn() returns a tensor whose shape is given by the variable argument size, filled with random numbers drawn from the standard normal distribution (mean 0, variance 1). It is the function used below to sample the generator's noise. By contrast, torch.rand() draws from the uniform distribution on [0, 1).
    * size: sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
    * out: (optional) output tensor.
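
The training cell in this notebook samples its noise with `torch.randn`. The sketch below contrasts it with the similarly named `torch.rand`; the sample sizes and the seed are arbitrary.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for repeatability

uniform = torch.rand(1000, 100)   # uniform on [0, 1)
normal = torch.randn(1000, 100)   # standard normal: mean 0, variance 1

print("rand : min", uniform.min().item(), "max", uniform.max().item())
print("randn: mean", normal.mean().item(), "std", normal.std().item())
```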

In [None]:
# num_epochs = 10000
# batch_size = 64
# latent_dim = 100  # Dimension of the noise vector

num_epochs = 7
batch_size = 10
latent_dim = 100  # Dimension of the noise vector

for epoch in range(num_epochs):
    ##for i, data in enumerate(dataloader):
    ##for i, data in enumerate(data_pdf_clean_encoded):
    for i, data in enumerate(data_tensor):
        # Get real data batch
        real_data = data
        batch_size = real_data.size(0)

        # Labels for real and fake data
        real_labels = torch.ones(batch_size, 1)
        fake_labels = torch.zeros(batch_size, 1)

        # Train Discriminator
        optimizer_D.zero_grad()

        # Discriminator loss on real data
        output_real = discriminator(real_data)
        loss_real = criterion(output_real, real_labels)

        # Generate fake data
        z = torch.randn(batch_size, latent_dim)
        fake_data = generator(z)

        # Discriminator loss on fake data
        output_fake = discriminator(fake_data.detach())
        loss_fake = criterion(output_fake, fake_labels)

        # Total discriminator loss
        loss_D = loss_real + loss_fake
        loss_D.backward()
        optimizer_D.step()

        # Train Generator
        optimizer_G.zero_grad()

        # Generator loss
        output = discriminator(fake_data)
        loss_G = criterion(output, real_labels)  # Want generator to fool the discriminator
        loss_G.backward()
        optimizer_G.step()

    if epoch % 1000 == 0:
        print(f"Epoch [{epoch}/{num_epochs}] | Loss D: {loss_D.item()}, Loss G: {loss_G.item()}")
        

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x131 and 132x512)
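
The error message reads as follows: `1x131` is a single real row with 131 features, and `132x512` is the weight matrix of the discriminator's first layer, which was built with `input_dim=output_dim` (132). The generator has a related problem, since it was built for 131 inputs but receives noise of size `latent_dim` (100). The loop also iterates over single rows rather than batches. Below is a minimal sketch of one way to make the shapes agree, assuming the goal is to generate rows with the same 131 columns as the real table. It uses random stand-in data and small `nn.Sequential` models (named `gen` and `disc` here) instead of the classes above, and it is not a tuned training setup.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

n_features, latent_dim, batch_size = 131, 100, 10

# Random float32 stand-in for the encoded rows. With real data, a cast such as
# torch.tensor(df.values).float() may be needed if the DataFrame holds float64.
real = torch.rand(63, n_features)
loader = DataLoader(TensorDataset(real), batch_size=batch_size, shuffle=True)

# Generator: noise (latent_dim) -> one synthetic row (n_features)
gen = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, n_features))
# Discriminator: one row (n_features) -> probability that the row is real
disc = nn.Sequential(nn.Linear(n_features, 256), nn.ReLU(), nn.Linear(256, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
criterion = nn.BCELoss()

for epoch in range(2):
    for (real_batch,) in loader:
        n = real_batch.size(0)  # the last batch may be smaller than batch_size
        real_labels = torch.ones(n, 1)
        fake_labels = torch.zeros(n, 1)

        # Discriminator step: real rows vs. detached fake rows
        opt_d.zero_grad()
        fake_batch = gen(torch.randn(n, latent_dim))
        loss_d = (criterion(disc(real_batch), real_labels)
                  + criterion(disc(fake_batch.detach()), fake_labels))
        loss_d.backward()
        opt_d.step()

        # Generator step: try to make the discriminator label fakes as real
        opt_g.zero_grad()
        loss_g = criterion(disc(fake_batch), real_labels)
        loss_g.backward()
        opt_g.step()

    print(f"Epoch [{epoch + 1}/2] | Loss D: {loss_d.item():.4f}, Loss G: {loss_g.item():.4f}")
```

The key point is that three numbers must line up: the generator's input size equals `latent_dim`, and the generator's output size equals the discriminator's input size, which equals the number of columns in the real data.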

## Tensors vs Vectors

- A **vector** is a mathematical object with both magnitude and direction. A **tensor** is a more general concept: it can describe relationships between several vectors or quantities that act in different directions. In simpler terms, a vector is a single-directional quantity, while a tensor can represent interactions in multiple directions simultaneously.
- Example: **Force is a vector** (magnitude and direction), while **stress (which describes forces acting in multiple directions on a surface) is a tensor**.
- Every vector is a tensor, but not every tensor is a vector, so tensors are the more general concept (https://allthedifferences.com/difference-between-vectors-and-tensor/#:~:text=A%20vector%20is%20a%20one-dimensional%20array%20of%20numbers%2C,%28strictly%20speaking%2C%20though%20mathematicians%20assemble%20tensors%20through%20vectors%29.)
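
In PyTorch the word "tensor" is used in a looser, practical sense: an n-dimensional array of numbers. A vector is then simply a tensor with one dimension. A quick sketch, with shapes chosen to echo the data used in this notebook:

```python
import torch

scalar = torch.tensor(3.0)              # 0 dimensions
vector = torch.tensor([1.0, 2.0, 3.0])  # 1 dimension
table = torch.zeros(63, 131)            # 2 dimensions: rows x features
images = torch.zeros(128, 1, 28, 28)    # 4 dimensions: batch x channels x height x width

for name, t in [("scalar", scalar), ("vector", vector), ("table", table), ("images", images)]:
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```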

## Generating Synthetic Data

In [None]:
z = torch.randn(1000, latent_dim)
synthetic_data = generator(z).detach().numpy()

## Create Predictor Data Set and Target Series

In [None]:
pred_data_pdf_clean = data_pdf_clean_encoded.drop(['EST1'], axis=1)
predictor_cols = pred_data_pdf_clean.columns
inspect(predictor_cols)

target_series = data_pdf_clean_encoded['EST1']
inspect(target_series)

# GAN Example for an Image

## Understand your data - Become One With Your Data!

You should always invest time to understand your data. You should be able to answer questions like:
1. How many images do I have?
2. What's the shape of my image?
3. What do my images look like?

So let's first answer those questions!

In [None]:
batch_size = 128

# Images are usually in the [0., 1.] or [0, 255] range, Normalize transform will bring them into [-1, 1] range
# It's one of those things somebody figured out experimentally that it works (without special theoretical arguments)
# https://github.com/soumith/ganhacks <- you can find more of those hacks here
transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((.5,), (.5,))  
    ])

# MNIST is a super simple, "hello world" dataset so it's included in PyTorch.
# First time you run this it will download the MNIST dataset and store it in DATA_DIR_PATH
# the 'transform' (defined above) will be applied to every single image
mnist_dataset = datasets.MNIST(root=DATA_DIR_PATH, train=True, download=True, transform=transform)

# Nice wrapper class helps us load images in batches (suitable for GPUs)
mnist_data_loader = DataLoader(mnist_dataset, batch_size=batch_size, shuffle=True, drop_last=True)

# Let's answer our questions

# Q1: How many images do I have?
print(f'Dataset size: {len(mnist_dataset)} images.')

num_imgs_to_visualize = 25  # number of images we'll display
batch = next(iter(mnist_data_loader))  # take a single batch from the dataset
img_batch = batch[0]  # extract only images and ignore the labels (batch[1])
img_batch_subset = img_batch[:num_imgs_to_visualize]  # extract only a subset of images

# Q2: What's the shape of my image?
# format is (B,C,H,W), B - number of images in batch, C - number of channels, H - height, W - width
print(f'Image shape {img_batch_subset.shape[1:]}')  # we ignore shape[0] - number of imgs in batch.

# Q3: What do my images look like?
# Creates a 5x5 grid of images, normalize will bring images from [-1, 1] range back into [0, 1] for display
# pad_value is 1. (white) because it's 0. (black) by default but since our background is also black,
# we wouldn't see the grid pattern so I set it to 1.
grid = make_grid(img_batch_subset, nrow=int(np.sqrt(num_imgs_to_visualize)), normalize=True, pad_value=1.)
grid = np.moveaxis(grid.numpy(), 0, 2)  # from CHW -> HWC format that's what matplotlib expects! Get used to this.
plt.figure(figsize=(6, 6))
plt.title("Samples from the MNIST dataset")
plt.imshow(grid)
plt.show()
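
The `transforms.Normalize((.5,), (.5,))` step in the cell above computes `(x - mean) / std` per channel. With a mean and standard deviation of 0.5, pixel values in `[0, 1]` are mapped to `[-1, 1]`. A small sketch of the arithmetic on a few made-up pixel values, using plain tensor operations rather than torchvision:

```python
import torch

pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])  # made-up values in [0, 1], as produced by ToTensor()

mean, std = 0.5, 0.5
normalized = (pixels - mean) / std   # maps [0, 1] to [-1, 1]
restored = normalized * std + mean   # inverse mapping, useful before displaying images

print("normalized:", normalized)
print("restored:  ", restored)
```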

In [None]:
mnist_dataset = datasets.MNIST(root=DATA_DIR_PATH, train=True, download=True)

print(type(mnist_dataset))

In [None]:
batch_size = 128

# Images are usually in the [0., 1.] or [0, 255] range, Normalize transform will bring them into [-1, 1] range
# It's one of those things somebody figured out experimentally that it works (without special theoretical arguments)
# https://github.com/soumith/ganhacks <- you can find more of those hacks here
transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((.5,), (.5,))  
    ])

# MNIST is a super simple, "hello world" dataset so it's included in PyTorch.
# First time you run this it will download the MNIST dataset and store it in DATA_DIR_PATH
# the 'transform' (defined above) will be applied to every single image
mnist_dataset = datasets.MNIST(root=DATA_DIR_PATH, train=True, download=True, transform=transform)

# Nice wrapper class helps us load images in batches (suitable for GPUs)
mnist_data_loader = DataLoader(mnist_dataset, batch_size=batch_size, shuffle=True, drop_last=True)

# Let's answer our questions

# Q1: How many images do I have?
print(f'Dataset size: {len(mnist_dataset)} images.')

num_imgs_to_visualize = 25  # number of images we'll display
batch = next(iter(mnist_data_loader))  # take a single batch from the dataset
img_batch = batch[0]  # extract only images and ignore the labels (batch[1])
img_batch_subset = img_batch[:num_imgs_to_visualize]  # extract only a subset of images

# Q2: What's the shape of my image?
# format is (B,C,H,W), B - number of images in batch, C - number of channels, H - height, W - width
print(f'Image shape {img_batch_subset.shape[1:]}')  # we ignore shape[0] - number of imgs in batch.

# Q3: What do my images look like?
# Creates a 5x5 grid of images, normalize will bring images from [-1, 1] range back into [0, 1] for display
# pad_value is 1. (white) because it's 0. (black) by default but since our background is also black,
# we wouldn't see the grid pattern so I set it to 1.
grid = make_grid(img_batch_subset, nrow=int(np.sqrt(num_imgs_to_visualize)), normalize=True, pad_value=1.)
grid = np.moveaxis(grid.numpy(), 0, 2)  # from CHW -> HWC format that's what matplotlib expects! Get used to this.
plt.figure(figsize=(6, 6))
plt.title("Samples from the MNIST dataset")
plt.imshow(grid)
plt.show()
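
The `Normalize((.5,), (.5,))` transform used above computes `(pixel - mean) / std` per channel. Here is a minimal plain-Python sketch of that arithmetic and its inverse; `normalize` and `denormalize` are hypothetical helper names used only for illustration, not torchvision functions.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Per-pixel version of the Normalize arithmetic: (pixel - mean) / std."""
    return (pixel - mean) / std


def denormalize(value, mean=0.5, std=0.5):
    """Inverse mapping: brings a value from [-1, 1] back into [0, 1]."""
    return value * std + mean


pixels = [0.0, 0.25, 0.5, 1.0]  # what ToTensor would give us, in [0, 1]
normalized = [normalize(p) for p in pixels]  # now in [-1, 1]
restored = [denormalize(v) for v in normalized]  # back in [0, 1]

print(normalized)
print(restored)
```

Note that `make_grid(..., normalize=True)` rescales by the observed min and max of the tensor instead, so it matches this exact inverse only when the batch spans the full [-1, 1] range.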

## Understand your model (neural networks)!

Let's define the generator and discriminator networks!

The original paper used the maxout activation and dropout for regularization (you don't need to understand this). <br/>
I'm using `LeakyReLU` instead and `batch normalization` which came after the original paper was published.

Those design decisions are inspired by the DCGAN model which came later than the original GAN.
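
To make the activation choice concrete, here is a minimal plain-Python sketch of what `LeakyReLU` computes for a single number, assuming the 0.2 negative slope used in the code that follows. `leaky_relu` is a hypothetical helper for illustration, not PyTorch's implementation.

```python
def leaky_relu(x, negative_slope=0.2):
    """Identity for non-negative inputs, a small linear slope for negative ones."""
    return x if x >= 0 else negative_slope * x


values = [-2.0, -0.5, 0.0, 1.5]
activated = [leaky_relu(v) for v in values]
print(activated)
```

Unlike a plain ReLU, negative inputs still produce a small non-zero output (and gradient), which is one reason it is a popular choice in GANs.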

In [None]:
# Size of the generator's input vector. Generator will eventually learn how to map these into meaningful images!
LATENT_SPACE_DIM = 100


# This one will produce a batch of those vectors
def get_gaussian_latent_batch(batch_size, device):
    return torch.randn((batch_size, LATENT_SPACE_DIM), device=device)


# It's cleaner if you define the block like this - bear with me
def vanilla_block(in_feat, out_feat, normalize=True, activation=None):
    layers = [nn.Linear(in_feat, out_feat)]
    if normalize:
        layers.append(nn.BatchNorm1d(out_feat))
    # 0.2 was used in DCGAN; I experimented with other values like 0.5 and didn't notice a significant change
    layers.append(nn.LeakyReLU(0.2) if activation is None else activation)
    return layers


class GeneratorNet(torch.nn.Module):
    """Simple 4-layer MLP generative neural network.

    By default it works for MNIST size images (28x28).

    There are many ways you can construct generator to work on MNIST.
    Even without normalization layers it will work ok. Even with 5 layers it will work ok, etc.

    It's generally an open-research question on how to evaluate GANs i.e. quantify that "ok" statement.

    People tried to automate the task using IS (inception score, often used incorrectly), etc.
    but so far it always ends up with some form of visual inspection (human in the loop).
    
    Fancy way of saying you'll have to take a look at the images from your generator and say hey this looks good!

    """

    def __init__(self, img_shape=(MNIST_IMG_SIZE, MNIST_IMG_SIZE)):
        super().__init__()
        self.generated_img_shape = img_shape
        num_neurons_per_layer = [LATENT_SPACE_DIM, 256, 512, 1024, img_shape[0] * img_shape[1]]

        # Now you see why it's nice to define blocks - it's super concise!
        # These are pretty much just linear layers followed by batch normalization and LeakyReLU,
        # except for the last layer, where we skip batch normalization and use Tanh (maps images into our [-1, 1] range!)
        self.net = nn.Sequential(
            *vanilla_block(num_neurons_per_layer[0], num_neurons_per_layer[1]),
            *vanilla_block(num_neurons_per_layer[1], num_neurons_per_layer[2]),
            *vanilla_block(num_neurons_per_layer[2], num_neurons_per_layer[3]),
            *vanilla_block(num_neurons_per_layer[3], num_neurons_per_layer[4], normalize=False, activation=nn.Tanh())
        )

    def forward(self, latent_vector_batch):
        img_batch_flattened = self.net(latent_vector_batch)
        # just un-flatten using view into (N, 1, 28, 28) shape for MNIST
        return img_batch_flattened.view(img_batch_flattened.shape[0], 1, *self.generated_img_shape)


# You can interpret the output from the discriminator as a probability and the question it should
# give an answer to is "hey is this image real?". If it outputs 1. it's 100% sure it's real. 0.5 - 50% sure, etc.
class DiscriminatorNet(torch.nn.Module):
    """Simple 3-layer MLP discriminative neural network. It should output probability 1. for real images and 0. for fakes.

    By default it works for MNIST size images (28x28).

    Again there are many ways you can construct discriminator network that would work on MNIST.
    You could use more or fewer layers, etc. Using normalization as in the DCGAN paper doesn't work well though.

    """

    def __init__(self, img_shape=(MNIST_IMG_SIZE, MNIST_IMG_SIZE)):
        super().__init__()
        num_neurons_per_layer = [img_shape[0] * img_shape[1], 512, 256, 1]

        # Last layer is a Sigmoid - basically the goal of the discriminator is to output 1.
        # for real images and 0. for fake images, and sigmoid's output is bounded between 0 and 1 so it's perfect.
        self.net = nn.Sequential(
            *vanilla_block(num_neurons_per_layer[0], num_neurons_per_layer[1], normalize=False),
            *vanilla_block(num_neurons_per_layer[1], num_neurons_per_layer[2], normalize=False),
            *vanilla_block(num_neurons_per_layer[2], num_neurons_per_layer[3], normalize=False, activation=nn.Sigmoid())
        )

    def forward(self, img_batch):
        img_batch_flattened = img_batch.view(img_batch.shape[0], -1)  # flatten from (N,1,H,W) into (N, HxW)
        return self.net(img_batch_flattened)
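
To get a feel for how small these MLPs are, here is a rough sketch that counts their learnable parameters by hand. It assumes `MNIST_IMG_SIZE` is 28 (as the docstrings above say) and the latent size of 100; `count_mlp_params` is a hypothetical helper, not part of PyTorch.

```python
def count_mlp_params(layer_sizes, batch_norm_flags):
    """Count learnable parameters of Linear layers, each optionally followed by BatchNorm1d."""
    total = 0
    layer_pairs = zip(layer_sizes[:-1], layer_sizes[1:])
    for (in_feat, out_feat), has_batch_norm in zip(layer_pairs, batch_norm_flags):
        total += in_feat * out_feat + out_feat  # Linear: weight matrix + bias vector
        if has_batch_norm:
            total += 2 * out_feat  # BatchNorm1d: learnable scale + shift
    return total


img_pixels = 28 * 28  # assumed MNIST image size
generator_params = count_mlp_params([100, 256, 512, 1024, img_pixels], [True, True, True, False])
discriminator_params = count_mlp_params([img_pixels, 512, 256, 1], [False, False, False])

print(f'Generator: {generator_params} parameters')
print(f'Discriminator: {discriminator_params} parameters')
```

Under these assumptions the generator has roughly 1.5M parameters and the discriminator roughly 0.5M, which is why this model trains quickly even on modest hardware.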

## GAN Training

**Feel free to skip this entire section** if you just want to use the pre-trained model to generate some new images - which don't exist in the original MNIST dataset and that's the whole magic of GANs!

Phew, so far we got familiar with data and our models, awesome work! <br/>
But brace yourselves as this is arguably the hardest part. How to actually train your GAN?

Let's start with understanding the loss function! We'll be using `BCE` (binary cross-entropy loss), let's see why. <br/>

If we input real images into the discriminator we expect it to output 1 (I'm 100% sure that this is a real image). <br/>
The further away it is from 1 and the closer it is to 0 the more we should penalize it, as it is making a wrong prediction. <br/>
So this is what the loss should look like in that case (it's basically `-log(x)`):

<img src="data/examples/jupyter/cross_entropy_loss.png" alt="BCE loss when true label = 1." align="left"/> <br/>

BCE loss basically becomes `-log(x)` when its target (true) label is 1. <br/>

Similarly for fake images, the target (true) label is 0 (as we want the discriminator to output 0 for fake images) and we want to penalize the discriminator if it starts outputting values close to 1. So we basically want to mirror the above loss function and that's just: `-log(1-x)`. <br/>

BCE loss basically becomes `-log(1-x)` when its target (true) label is 0. That's why it perfectly fits the task! <br/>
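
The two cases can be checked numerically with a minimal plain-Python sketch. `bce` below is a hypothetical single-value helper written for illustration (it does no batching or averaging like `nn.BCELoss` does), and it assumes the prediction lies strictly between 0 and 1.

```python
import math


def bce(prediction, target):
    """Binary cross-entropy for one prediction in (0, 1) and a 0/1 target."""
    return -(target * math.log(prediction) + (1 - target) * math.log(1 - prediction))


predictions = [0.1, 0.5, 0.9]  # possible discriminator outputs
loss_real_target = [bce(p, 1) for p in predictions]  # reduces to -log(x)
loss_fake_target = [bce(p, 0) for p in predictions]  # reduces to -log(1-x)

for p, l_real, l_fake in zip(predictions, loss_real_target, loss_fake_target):
    print(f'D output={p:.1f} | target=1: {l_real:.3f} | target=0: {l_fake:.3f}')
```

With target 1 the loss shrinks as the output approaches 1, and with target 0 it shrinks as the output approaches 0, mirroring the curve in the image above.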


### Training utility functions
Let's define some useful utility functions:

In [None]:
# Tried SGD for the discriminator and had problems tweaking it - Adam simply works nicely, but the default lr of 1e-3 won't work!
# I had to train the discriminator more (a 4-to-1 schedule worked) to get it working with the default lr, and still got worse results.
# lr=0.0002 and beta1=0.5 are from the DCGAN paper (beta2=0.999 is Adam's default); they work nicely here!
def get_optimizers(d_net, g_net):
    d_opt = Adam(d_net.parameters(), lr=0.0002, betas=(0.5, 0.999))
    g_opt = Adam(g_net.parameters(), lr=0.0002, betas=(0.5, 0.999))
    return d_opt, g_opt


# It's useful to add some metadata when saving your model; it would probably make sense to also add the number of epochs
def get_training_state(generator_net, gan_type_name):
    training_state = {
        "commit_hash": git.Repo(search_parent_directories=True).head.object.hexsha,
        "state_dict": generator_net.state_dict(),
        "gan_type": gan_type_name
    }
    return training_state


# Useful when you have multiple models
class GANType(enum.Enum):
    VANILLA = 0


# Feel free to ignore this one, it's not important for GAN training.
# It just figures out a good binary name so as not to overwrite your older models.
def get_available_binary_name(gan_type_enum=GANType.VANILLA):
    def valid_binary_name(binary_name):
        # First time you see raw f-string? Don't worry the only trick is to double the brackets.
        pattern = re.compile(rf'{gan_type_enum.name}_[0-9]{{6}}\.pth')
        return re.fullmatch(pattern, binary_name) is not None

    prefix = gan_type_enum.name
    # Just list the existing binaries so that we don't overwrite them but write to a new one
    valid_binary_names = list(filter(valid_binary_name, os.listdir(BINARIES_PATH)))
    if len(valid_binary_names) > 0:
        last_binary_name = sorted(valid_binary_names)[-1]
        new_suffix = int(last_binary_name.split('.')[0][-6:]) + 1  # increment by 1
        return f'{prefix}_{str(new_suffix).zfill(6)}.pth'
    else:
        return f'{prefix}_000000.pth'
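
The naming logic above is easier to try out when it takes a plain list of file names instead of reading a directory. The sketch below does that; `next_binary_name` is a hypothetical stand-in written for illustration and the file names are made up.

```python
import re


def next_binary_name(existing_names, prefix="VANILLA"):
    """Return the next free '<prefix>_<6 digits>.pth' name given the existing file names."""
    # Doubled braces {{6}} produce a literal {6} inside the raw f-string
    pattern = re.compile(rf'{re.escape(prefix)}_[0-9]{{6}}\.pth')
    valid_names = sorted(name for name in existing_names if pattern.fullmatch(name))
    if not valid_names:
        return f'{prefix}_000000.pth'
    last_index = int(valid_names[-1].split('.')[0][-6:])  # zero-padded, so string sort == numeric sort
    return f'{prefix}_{last_index + 1:06d}.pth'


first = next_binary_name([])
following = next_binary_name(['VANILLA_000000.pth', 'VANILLA_000007.pth', 'notes.txt', 'VANILLA_12.pth'])
print(first, following)
```

Names that don't match the pattern exactly (like `notes.txt` or `VANILLA_12.pth`) are ignored, so only properly formatted binaries influence the next index.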

### Tracking your model's progress during training
You can track how your GAN training is progressing through:
1. Console output
2. Images dumped to: `data/debug_imagery`
3. Tensorboard, just type `tensorboard --logdir=runs` into your Anaconda console

Note: to use tensorboard, first navigate to the project root via `cd path_to_root`, run the command above, and then open `http://localhost:6006/` in your browser.

In [None]:
####################### constants #####################
# Hopefully you have some GPU ^^ (defined first because the reference noise batch below needs it)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# For logging purposes
ref_batch_size = 16
ref_noise_batch = get_gaussian_latent_batch(ref_batch_size, device)  # Track G's quality during training on fixed noise vectors

discriminator_loss_values = []
generator_loss_values = []

img_cnt = 0

enable_tensorboard = True
console_log_freq = 50
debug_imagery_log_freq = 50
checkpoint_freq = 2

# For training purposes
num_epochs = 10  # feel free to increase this

########################################################

writer = SummaryWriter()  # (tensorboard) writer will output to ./runs/ directory by default

# Prepare feed-forward nets (place them on GPU if present) and optimizers which will tweak their weights
discriminator_net = DiscriminatorNet().train().to(device)
generator_net = GeneratorNet().train().to(device)

discriminator_opt, generator_opt = get_optimizers(discriminator_net, generator_net)

# 1s (real_images_gt) will configure BCELoss into -log(x) (check out the loss image above that's -log(x)) 
# whereas 0s (fake_images_gt) will configure it to -log(1-x)
# So that means we can effectively use binary cross-entropy loss to achieve adversarial loss!
adversarial_loss = nn.BCELoss()
real_images_gt = torch.ones((batch_size, 1), device=device)
fake_images_gt = torch.zeros((batch_size, 1), device=device)

ts = time.time()  # start measuring time

# GAN training loop, it's always smart to first train the discriminator so as to avoid mode collapse!
# A mode collapse, for example, is when your generator learns to only generate a single digit instead of all 10 digits!
for epoch in range(num_epochs):
    for batch_idx, (real_images, _) in enumerate(mnist_data_loader):

        real_images = real_images.to(device)  # Place imagery on GPU (if present)

        #
        # Train discriminator: maximize V = log(D(x)) + log(1-D(G(z))) or equivalently minimize -V
        # Note: D = discriminator, x = real images, G = generator, z = latent Gaussian vectors, G(z) = fake images
        #

        # Zero out .grad variables in discriminator network,
        # otherwise we would have corrupt results - leftover gradients from the previous training iteration
        discriminator_opt.zero_grad()

        # -log(D(x)) <- we minimize this by making D(x)/discriminator_net(real_images) as close to 1 as possible
        real_discriminator_loss = adversarial_loss(discriminator_net(real_images), real_images_gt)

        # G(z) | G == generator_net and z == get_gaussian_latent_batch(batch_size, device)
        fake_images = generator_net(get_gaussian_latent_batch(batch_size, device))
        # D(G(z)), we call detach() so that we don't calculate gradients for the generator during backward()
        fake_images_predictions = discriminator_net(fake_images.detach())
        # -log(1 - D(G(z))) <- we minimize this by making D(G(z)) as close to 0 as possible
        fake_discriminator_loss = adversarial_loss(fake_images_predictions, fake_images_gt)

        discriminator_loss = real_discriminator_loss + fake_discriminator_loss
        discriminator_loss.backward()  # this will populate .grad vars in the discriminator net
        discriminator_opt.step()  # perform D weights update according to optimizer's strategy

        #
        # Train generator: minimize V1 = log(1-D(G(z))) or equivalently maximize V2 = log(D(G(z))) (or min of -V2)
        # The original expression (V1) had problems with diminishing gradients for G when D is too good.
        #

        # If you want to cause mode collapse, probably the easiest way to do that would be to add "for i in range(n)"
        # here (simply train G more frequently than D). n = 10 worked for me, other values will also work - experiment.

        # Zero out .grad variables in generator network (otherwise we would have corrupt results)
        generator_opt.zero_grad()

        # D(G(z)) (see above for explanations)
        generated_images_predictions = discriminator_net(generator_net(get_gaussian_latent_batch(batch_size, device)))
        # By placing real_images_gt here we minimize -log(D(G(z))) which happens when D(G(z)) approaches 1
        # i.e. we're tricking D into thinking that these generated images are real!
        generator_loss = adversarial_loss(generated_images_predictions, real_images_gt)

        generator_loss.backward()  # this will populate .grad vars in the G net (also in D but we won't use those)
        generator_opt.step()  # perform G weights update according to optimizer's strategy

        #
        # Logging and checkpoint creation
        #

        generator_loss_values.append(generator_loss.item())
        discriminator_loss_values.append(discriminator_loss.item())
        
        if enable_tensorboard:
            global_batch_idx = len(mnist_data_loader) * epoch + batch_idx + 1
            writer.add_scalars('losses/g-and-d', {'g': generator_loss.item(), 'd': discriminator_loss.item()}, global_batch_idx)
            # Save debug imagery to tensorboard also (some redundancy but it may be more beginner-friendly)
            if batch_idx % debug_imagery_log_freq == 0:
                with torch.no_grad():
                    log_generated_images = generator_net(ref_noise_batch)
                    log_generated_images = nn.Upsample(scale_factor=2, mode='nearest')(log_generated_images)
                    intermediate_imagery_grid = make_grid(log_generated_images, nrow=int(np.sqrt(ref_batch_size)), normalize=True)
                    writer.add_image('intermediate generated imagery', intermediate_imagery_grid, global_batch_idx)

        if batch_idx % console_log_freq == 0:
            prefix = 'GAN training: time elapsed'
            print(f'{prefix} = {(time.time() - ts):.2f} [s] | epoch={epoch + 1} | batch= [{batch_idx + 1}/{len(mnist_data_loader)}]')

        # Save intermediate generator images (more convenient like this than through tensorboard)
        if batch_idx % debug_imagery_log_freq == 0:
            with torch.no_grad():
                log_generated_images = generator_net(ref_noise_batch)
                log_generated_images_resized = nn.Upsample(scale_factor=2.5, mode='nearest')(log_generated_images)
                out_path = os.path.join(DEBUG_IMAGERY_PATH, f'{str(img_cnt).zfill(6)}.jpg')
                save_image(log_generated_images_resized, out_path, nrow=int(np.sqrt(ref_batch_size)), normalize=True)
                img_cnt += 1

        # Save generator checkpoint
        if (epoch + 1) % checkpoint_freq == 0 and batch_idx == 0:
            ckpt_model_name = f"vanilla_ckpt_epoch_{epoch + 1}_batch_{batch_idx + 1}.pth"
            torch.save(get_training_state(generator_net, GANType.VANILLA.name), os.path.join(CHECKPOINTS_PATH, ckpt_model_name))

# Save the latest generator in the binaries directory
torch.save(get_training_state(generator_net, GANType.VANILLA.name), os.path.join(BINARIES_PATH, get_available_binary_name()))
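
The "diminishing gradients" remark in the generator step can be illustrated with a small plain-Python sketch. It assumes, as in the discriminator above, that D ends with a sigmoid, and compares the gradient of both generator objectives with respect to D's pre-sigmoid output (the logit). The function names are hypothetical and the derivatives are worked out by hand.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def saturating_grad(logit):
    """d/da of log(1 - sigmoid(a)), the original objective V1; equals -sigmoid(a)."""
    return -sigmoid(logit)


def non_saturating_grad(logit):
    """d/da of -log(sigmoid(a)), the objective used in the loop above; equals sigmoid(a) - 1."""
    return sigmoid(logit) - 1.0


logit = -6.0  # D is very confident the generated image is fake, D(G(z)) is about 0.0025
grad_saturating = saturating_grad(logit)
grad_non_saturating = non_saturating_grad(logit)

print(f'D(G(z)) = {sigmoid(logit):.4f}')
print(f'saturating gradient     = {grad_saturating:.4f}')
print(f'non-saturating gradient = {grad_non_saturating:.4f}')
```

When D easily rejects the fakes, the original objective passes almost no signal back to G, while the `-log(D(G(z)))` version still gives a gradient close to -1.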


## Generate images with your vanilla GAN

Nice, finally we can use the generator we trained to generate some MNIST-like imagery!

Let's define a couple of utility functions which will make things cleaner!

In [None]:
def postprocess_generated_img(generated_img_tensor):
    assert isinstance(generated_img_tensor, torch.Tensor), f'Expected PyTorch tensor but got {type(generated_img_tensor)}.'

    # Move the tensor from GPU to CPU, convert to numpy array, extract 0th batch, move the image channel
    # from 0th to 2nd position (CHW -> HWC)
    generated_img = np.moveaxis(generated_img_tensor.to('cpu').numpy()[0], 0, 2)

    # Since MNIST images are grayscale (1-channel only) repeat 3 times to get RGB image
    generated_img = np.repeat(generated_img,  3, axis=2)

    # Imagery is in the range [-1, 1] (generator has tanh as the output activation) move it into [0, 1] range
    generated_img -= np.min(generated_img)
    generated_img /= np.max(generated_img)

    return generated_img


# This function will generate a random vector and pass it to the generator, which will generate a new image
# that we then just post-process and return
def generate_from_random_latent_vector(generator):
    with torch.no_grad():  # Tells PyTorch not to compute gradients which would have huge memory footprint
        
        # Generate a single random (latent) vector
        latent_vector = get_gaussian_latent_batch(1, next(generator.parameters()).device)
        
        # Post process generator output (as it's in the [-1, 1] range, remember?)
        generated_img = postprocess_generated_img(generator(latent_vector))

    return generated_img


# You don't need to get deep into this one - irrelevant for GANs - it will just figure out a good name for your generated
# images so that you don't overwrite the old ones. They'll be stored with xxxxxx.jpg naming scheme.
def get_available_file_name(input_dir): 
    def valid_frame_name(file_name):  # named file_name so we don't shadow the built-in str
        pattern = re.compile(r'[0-9]{6}\.jpg')  # regex, examples it covers: 000000.jpg or 923492.jpg, etc.
        return re.fullmatch(pattern, file_name) is not None

    # Filter out only images with xxxxxx.jpg format from the input_dir
    valid_frames = list(filter(valid_frame_name, os.listdir(input_dir)))
    if len(valid_frames) > 0:
        # Images are saved in the <xxxxxx>.jpg format; we find the biggest such <xxxxxx> number and increment by 1
        last_img_name = sorted(valid_frames)[-1]
        new_prefix = int(last_img_name.split('.')[0]) + 1  # increment by 1
        return f'{str(new_prefix).zfill(6)}.jpg'
    else:
        return '000000.jpg'


def save_and_maybe_display_image(dump_dir, dump_img, out_res=(256, 256), should_display=False):
    assert isinstance(dump_img, np.ndarray), f'Expected numpy array got {type(dump_img)}.'

    # step1: get next valid image name
    dump_img_name = get_available_file_name(dump_dir)

    # step2: convert to uint8 format <- OpenCV expects it otherwise your image will be completely black. Don't ask...
    if dump_img.dtype != np.uint8:
        dump_img = (dump_img*255).astype(np.uint8)

    # step3: write image to the file system (::-1 because opencv expects BGR (and not RGB) format...)
    cv.imwrite(os.path.join(dump_dir, dump_img_name), cv.resize(dump_img[:, :, ::-1], out_res, interpolation=cv.INTER_NEAREST))  

    # step4: maybe display part of the function
    if should_display:
        plt.imshow(dump_img)
        plt.show()

### We're now ready to generate some new digit images!

In [None]:
# VANILLA_000000.pth is the model I pretrained for you, feel free to change it if you trained your own model (last section)!
model_path = os.path.join(BINARIES_PATH, 'VANILLA_000000.pth')  
assert os.path.exists(model_path), f'Could not find the model {model_path}. You first need to train your generator.'

# Hopefully you have some GPU ^^
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Let's load the model; this is a dictionary containing model weights but also some metadata
# commit_hash - simply tells me which version of my code generated this model (hey you have to learn git!)
# gan_type - this one is "VANILLA" but I also have "DCGAN" and "cGAN" models
# state_dict - contains the actual neural network weights
# map_location lets a checkpoint that was saved on a GPU load on a CPU-only machine too
model_state = torch.load(model_path, map_location=device)
print(f'Model state contains this data: {model_state.keys()}')

gan_type = model_state["gan_type"]
print(f'Using {gan_type} GAN!')

# Let's instantiate a generator net and place it on GPU (if you have one)
generator = GeneratorNet().to(device)
# Load the weights, strict=True just makes sure that the architecture corresponds to the weights 100%
generator.load_state_dict(model_state["state_dict"], strict=True)
generator.eval()  # puts some layers like batch norm in a good state so it's ready for inference <- fancy name right?
    
generated_imgs_path = os.path.join(DATA_DIR_PATH, 'generated_imagery')  # this is where we'll dump images
os.makedirs(generated_imgs_path, exist_ok=True)

#
# This is where the magic happens!
#

print('Generating new MNIST-like images!')
generated_img = generate_from_random_latent_vector(generator)
save_and_maybe_display_image(generated_imgs_path, generated_img, should_display=True)

### I'd love to hear your feedback

If you found this notebook useful and would like me to add the same for cGAN and DCGAN please [open an issue](https://github.com/gordicaleksa/pytorch-gans/issues/new). <br/>

I'm not really aware of how useful people find this, as I usually do stuff through my IDE.

### Connect with me

Lots of useful (I hope so at least!) content on LinkedIn, Twitter, YouTube and Medium. <br/>
So feel free to connect with me there:
1. My [LinkedIn](https://www.linkedin.com/in/aleksagordic) and [Twitter](https://twitter.com/gordic_aleksa) profiles
2. My YouTube channel - [The AI Epiphany](https://www.youtube.com/c/TheAiEpiphany)
3. My [Medium](https://gordicaleksa.medium.com/) profile
