
## Concise Tutorial for VSGenerator

**VSGenerator** provides **DVSNet**, a neural network designed to dynamically generate virtual spaces for training datasets.  
This project extends the **Bgolearn** platform, focusing on applying generative algorithms to material design problems.

### Installation

Install VSGenerator via pip:

```sh
pip install VSGenerator
```

### Preparing Your Dataset

Prepare your dataset following the examples in the `./example_data` folder, which contains two files:

- **`data.csv`**: The training dataset.  
  - It consists of five columns:  
    - The first four columns represent the **design space** (features).  
    - The last column is the **regression target**, representing the **material property**.  
  - This dataset is derived after the first stage of **active learning design**.  

- **`gooddata.csv`**: The "good data" subset.  
  - VSGenerator samples promising candidates around these so-called **good data** points.  
  - It must have the **same dimensions** as `data.csv`.  
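
Before training, it can help to confirm that both files have the expected shape. The sketch below is a minimal check written with the standard library only. It assumes each CSV has a single header row and only numeric values, and it reads "same dimensions" as "same number of columns" (`input_dim + y_dim`). The helper names `read_numeric_csv` and `check_datasets` are hypothetical and are not part of VSGenerator.

```python
import csv


def read_numeric_csv(path):
    """Read a CSV with one header row; return (header, rows of floats)."""
    with open(path, newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader)  # assumption: the first row is a header
        rows = [[float(v) for v in row] for row in reader if row]
    return header, rows


def check_datasets(train_path, good_path, input_dim=4, y_dim=1):
    """Raise ValueError unless every row in both files has input_dim + y_dim columns.

    Returns the number of data rows in each file.
    """
    expected = input_dim + y_dim
    counts = []
    for path in (train_path, good_path):
        header, rows = read_numeric_csv(path)
        if len(header) != expected:
            raise ValueError(
                f"{path}: expected {expected} columns, found {len(header)}"
            )
        for i, row in enumerate(rows, start=1):
            if len(row) != expected:
                raise ValueError(
                    f"{path}: data row {i} has {len(row)} values, expected {expected}"
                )
        counts.append(len(rows))
    return tuple(counts)
```

With the example data, a call such as `check_datasets('./example_data/data.csv', './example_data/gooddata.csv')` should return the row counts of the two files, or raise a `ValueError` naming the offending file.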

### Execution Steps

The execution process consists of two steps:

1. **Train the model**  
   - Learn the distribution in the latent space.  

2. **Sample new candidates**  
   - Generate new promising candidates to define the **search space** for the next iteration.  

### Output

The results are saved in a new folder named **BgolearnData**, in the file `vs_data.csv`.  

The generated search space **respects your constraints** on the first training property, aligning with your desired specifications.  
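
To inspect the generated search space afterwards, the output file can be read back with the standard library. This is only a sketch: it assumes `BgolearnData` is created in the current working directory and that `vs_data.csv` has one header row followed by numeric rows. The function name `load_virtual_space` is hypothetical and is not part of VSGenerator.

```python
import csv
from pathlib import Path


def load_virtual_space(path="./BgolearnData/vs_data.csv", has_header=True):
    """Return the generated candidates as a list of float lists."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; run the generation step first")
    with path.open(newline="") as fh:
        rows = [row for row in csv.reader(fh) if row]
    if has_header and rows:
        rows = rows[1:]  # assumption: the first row holds column names
    return [[float(v) for v in row] for row in rows]
```

If the file turns out to have no header row, pass `has_header=False`.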


```python
# Import the VSGenerator package
from VSGenerator import DVSNet

# Path to the training dataset
data_file = './example_data/data.csv'

# Dimensionality of input features for DVSNet
input_dim = 4

# Dimensionality of the target variable.
# In this simple example, the target is one-dimensional, but DVSNet supports multi-target regression.
y_dim = 1

# Dimensionality of the latent space distribution.
# In this work, samples are represented by a 5-dimensional Gaussian distribution in the latent space.
latent_dim = 5

# Number of training epochs
epochs = 50

# Batch size for training
batch_size = 16

# Train the model using the specified parameters.
# Optimized parameters will be saved automatically.
DVSNet.train(
    data_file, input_dim, y_dim, latent_dim,
    epochs=epochs, batch_size=batch_size, patience=3
)
```

Training output:

```text
DVSNet, Bin Cao, Hong Kong University of Science and Technology (Guangzhou).
URL : https://github.com/Bin-Cao/Bgolearn
URL : https://github.com/Bgolearn/VSGenerator
Executed on : 2025-03-28 11:52:47  | Have a great day.

Epoch 1/50
Epoch 1: val_loss improved from inf to 4.00145, saving model to ./DSVNetparams\best_model.h5
Epoch 2/50
Epoch 2: val_loss improved from 4.00145 to 2.24500, saving model to ./DSVNetparams\best_model.h5
Epoch 3/50
Epoch 3: val_loss improved from 2.24500 to 1.27687, saving model to ./DSVNetparams\best_model.h5
Epoch 4/50
Epoch 4: val_loss did not improve from 1.27687
Epoch 5/50
Epoch 5: val_loss did not improve from 1.27687
Epoch 6/50
Epoch 6: val_loss did not improve from 1.27687
Epoch 6: early stopping
```
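
The log stops at epoch 6 because `patience=3` was passed to `DVSNet.train`. The sketch below shows the usual early-stopping rule in plain Python, assuming `patience` has the conventional meaning (stop once the validation loss has failed to improve for that many consecutive epochs). The function `early_stop_epoch` is a hypothetical helper, not part of VSGenerator. The first three losses are taken from the log; the last three are placeholder values, since the log does not report them.

```python
def early_stop_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Epochs 1-3 come from the log; epochs 4-6 are made-up values that are
# simply larger than the best loss of 1.27687.
losses = [4.00145, 2.24500, 1.27687, 1.30, 1.35, 1.40]
stop = early_stop_epoch(losses, patience=3)
```

Here `stop` is 6: the best loss is reached at epoch 3, and epochs 4, 5, and 6 each fail to improve on it.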


```python
# Path to the dataset containing good data
data_file = './example_data/gooddata.csv'

# Dimensionality of input features for DVSNet (must remain the same as in the training stage)
input_dim = 4

# Dimensionality of the target variable (must remain the same as in the training stage)
y_dim = 1

# Dimensionality of the latent space distribution (must remain the same as in the training stage)
latent_dim = 5

# A list with 'input_dim' elements, specifying the step size for each feature dimension
step_list = [0.1, 5, 25, 1]

# A list with 'input_dim' elements, defining the boundaries for each feature dimension
boundary = [[1.0, 2.0], [5, 30], [250, 475], [4, 8]]

# The 'step_list' and 'boundary' will discretize the generated data according to the designed grid.

# Number of generated samples per distribution.
# Note: Due to potential duplication, the actual number of generated samples may be smaller than the specified 'gen_num'.
gen_num = 50

# Generate new data based on the provided configuration.
DVSNet.generator(data_file, input_dim, y_dim, latent_dim, step_list, boundary, gen_num)
```
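
To make the role of `step_list` and `boundary` concrete, here is a rough sketch of one way a continuous sample could be snapped onto such a grid, and why duplicates can reduce the final sample count. It is an illustration only, not DVSNet's internal code: it assumes samples are clipped to the boundaries and rounded to the nearest grid point, and the helper names are hypothetical.

```python
def snap_to_grid(sample, step_list, boundary):
    """Clip each value to its bounds, then round it to the nearest grid point."""
    snapped = []
    for value, step, (low, high) in zip(sample, step_list, boundary):
        clipped = min(max(value, low), high)
        n_steps = round((clipped - low) / step)
        point = min(low + n_steps * step, high)
        snapped.append(round(point, 10))  # remove floating-point noise
    return tuple(snapped)


def unique_candidates(samples, step_list, boundary):
    """Snap all samples and drop duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for sample in samples:
        point = snap_to_grid(sample, step_list, boundary)
        if point not in seen:
            seen.add(point)
            result.append(point)
    return result


def grid_size(step_list, boundary):
    """Count the grid points spanned by the steps and boundaries."""
    size = 1
    for step, (low, high) in zip(step_list, boundary):
        size *= round((high - low) / step) + 1
    return size


step_list = [0.1, 5, 25, 1]
boundary = [[1.0, 2.0], [5, 30], [250, 475], [4, 8]]

# Two different raw samples that land on the same grid point.
raw = [[1.234, 12.4, 480.0, 3.2], [1.21, 11.0, 476.0, 4.0]]
candidates = unique_candidates(raw, step_list, boundary)
total_points = grid_size(step_list, boundary)
```

Both raw samples map to the grid point `(1.2, 10, 475, 4)`, so `candidates` holds a single entry. The full grid for these settings has 11 × 6 × 10 × 5 = 3300 points, which bounds how many distinct candidates can ever be produced.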



For more parameters, please refer to the function's docstring or the source code.
Feel free to contact me if you run into any issues.