Venseven/SyntheticData

TDGAN - Synthetic Tabular data with datetime generation model


Python 3.8, TensorFlow 2.x

Installing the tdgan package

 pip install .

or

External installation

  • Create a Personal Access Token: https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token
  • Run:
    pip install git+https://<PERSONAL ACCESS TOKEN>@github.com/saamaresearch/TGAN_tf.git@packaging

Default model hyperparameters

The defaults below are defined in train.py. Tune them there for more intensive training.

PARAMS = {
    "max_epoch": 5,
    "steps_per_epoch": 10000,
    "batch_size": 128,
    "z_dim": 200,
    "noise": 0.2,
    "l2norm": 0.00001,
    "learning_rate": 0.001,
    "num_gen_rnn": 100,
    "num_gen_feature": 100,
    "num_dis_layers": 1,
    "num_dis_hidden": 100
}
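
As a rough illustration of what tuning these defaults looks like, the sketch below copies the PARAMS values and applies overrides on top of them. The helpers `with_overrides` and `total_steps` are hypothetical and not part of tdgan, and the override values (20 epochs, batch size 256) are arbitrary examples rather than recommended settings.

```python
# Defaults copied from the PARAMS block above.
DEFAULT_PARAMS = {
    "max_epoch": 5,
    "steps_per_epoch": 10000,
    "batch_size": 128,
    "z_dim": 200,
    "noise": 0.2,
    "l2norm": 0.00001,
    "learning_rate": 0.001,
    "num_gen_rnn": 100,
    "num_gen_feature": 100,
    "num_dis_layers": 1,
    "num_dis_hidden": 100,
}


def with_overrides(defaults, **overrides):
    """Return a copy of `defaults` with `overrides` applied (hypothetical helper).

    Unknown keys raise KeyError so a typo does not silently keep a default.
    """
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise KeyError(f"unknown hyperparameter(s): {unknown}")
    return {**defaults, **overrides}


def total_steps(params):
    """Total number of training steps: max_epoch * steps_per_epoch."""
    return params["max_epoch"] * params["steps_per_epoch"]


# Arbitrary example overrides for a longer run.
longer_run = with_overrides(DEFAULT_PARAMS, max_epoch=20, batch_size=256)

print(total_steps(DEFAULT_PARAMS))  # 50000
print(total_steps(longer_run))      # 200000
```

With the defaults, training runs 5 x 10000 = 50000 steps, so raising max_epoch is the most direct way to lengthen a run.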

Default setup configuration

train.py and inference.py each contain a configuration dictionary, CONFIG, where runtime-specific information should be supplied.

  • NUMERICAL_COLS and DATE_COLS are the names of the numerical/continuous data columns and the date data columns, respectively.
  • DATE_DELIMITER is the datetime delimiter of date records.
  • MODELPATH specifies the location of the trained model.
  • GPU specifies the GPUs to use, if any are available, as a comma-separated string (for example "0,1,2,3"). Otherwise, leave it empty.

CONFIG = {
    "NUMERICAL_COLS": ["Total Costs of OSHPD Projects", "Number of OSHPD Projects"],
    "DATE_COLS": ["Data Generation Date"],
    "DATE_DELIMITER": "/",
    "MODELPATH": "/mnt/new/research/TGAN_tf/output/model/date_model.pkl",
    "GPU": "0,1,2,3",
    "final_date_columns": None
}
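
Before a long training run it can help to sanity-check CONFIG against the columns of the data file. The sketch below is one possible way to do that. `check_config` and `split_date` are hypothetical helpers, not part of tdgan. The `header` list (including the "County" column) and the sample date are made up for illustration, and how tdgan itself uses DATE_DELIMITER internally is an assumption here.

```python
# Values copied from the CONFIG block above.
CONFIG = {
    "NUMERICAL_COLS": ["Total Costs of OSHPD Projects", "Number of OSHPD Projects"],
    "DATE_COLS": ["Data Generation Date"],
    "DATE_DELIMITER": "/",
    "MODELPATH": "/mnt/new/research/TGAN_tf/output/model/date_model.pkl",
    "GPU": "0,1,2,3",
    "final_date_columns": None,
}


def check_config(config, available_columns):
    """Return a list of problems found in `config` (hypothetical helper)."""
    problems = []
    # Every configured column must exist in the data.
    for key in ("NUMERICAL_COLS", "DATE_COLS"):
        for col in config[key]:
            if col not in available_columns:
                problems.append(f"{key}: column {col!r} not found in the data")
    # A column cannot be both numerical and a date.
    overlap = set(config["NUMERICAL_COLS"]) & set(config["DATE_COLS"])
    if overlap:
        problems.append(f"columns listed as both numerical and date: {sorted(overlap)}")
    if not config["DATE_DELIMITER"]:
        problems.append("DATE_DELIMITER must be a non-empty string")
    return problems


def split_date(value, delimiter):
    """Split one date record into integer parts (assumed use of the delimiter)."""
    return [int(part) for part in value.split(delimiter)]


# Hypothetical header row of the training file.
header = [
    "Data Generation Date",
    "Total Costs of OSHPD Projects",
    "Number of OSHPD Projects",
    "County",
]

print(check_config(CONFIG, header))                        # []
print(split_date("03/15/2021", CONFIG["DATE_DELIMITER"]))  # [3, 15, 2021]
```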

Colab Notebooks

Training - Training

Inference - Inference

Training the TDGAN model:

  python train.py \
    --datapath [TRAIN DATAFILE PATH] \
    --data_format [TRAIN DATAFILE FORMAT]

Generating synthetic data samples from trained TDGAN:

  python inference.py \
    --num_samples [NUMBER OF SYNTHETIC SAMPLES] \
    --model_path [TRAINED MODEL PATH]
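
The two commands can also be assembled from Python, which avoids shell line-continuation pitfalls such as stray whitespace after a backslash. The following is a minimal sketch: it uses only the flags shown in the commands here, while `train_command`, `inference_command`, the file paths, and the "csv" format value are placeholders chosen for illustration.

```python
import shlex
import sys


def train_command(datapath, data_format):
    """Argument list for train.py, using the flags documented in this README."""
    return [
        sys.executable, "train.py",
        "--datapath", datapath,
        "--data_format", data_format,
    ]


def inference_command(num_samples, model_path):
    """Argument list for inference.py, using the flags documented in this README."""
    return [
        sys.executable, "inference.py",
        "--num_samples", str(num_samples),
        "--model_path", model_path,
    ]


# Placeholder values; substitute your own data file, format, and model path.
train_cmd = train_command("data/train.csv", "csv")
infer_cmd = inference_command(1000, "output/model/date_model.pkl")

# Print the shell-quoted commands for inspection.
print(shlex.join(train_cmd))
print(shlex.join(infer_cmd))

# To execute them in sequence from the repository root:
#   import subprocess
#   subprocess.run(train_cmd, check=True)
#   subprocess.run(infer_cmd, check=True)
```

Passing the arguments as a list means paths containing spaces need no manual quoting.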
