### TODO: Synthetic Data Generation
- [ ] Confirm `gretel-client`/`gretel-trainer` installs succeed for this kernel. Tip: rerun `%pip install -U gretel-client gretel-trainer` after kernel restart if dependencies change.
- [ ] Load the cleaned NF-CSE-CIC-IDS2018 CSV and keep a lightweight schema dict (column → dtype) for validation.
- [ ] Configure an ACTGAN model via `models.GretelACTGAN()` and capture key hyperparameters (epochs, max_rows).
- [ ] Generate ≥5k synthetic samples and persist them (e.g., `synthetic_df.to_csv(...)`) for downstream phases.
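
The schema item above can be sketched with plain pandas. This is a minimal sketch, not part of the Gretel API: `build_schema` and `validate_schema` are hypothetical helper names, and the toy column names (`IN_BYTES`, `PROTOCOL`, `Label`) are assumptions standing in for the real dataset's columns.

```python
import pandas as pd


def build_schema(df: pd.DataFrame) -> dict[str, str]:
    """Return a lightweight column -> dtype-name mapping."""
    return {col: str(dtype) for col, dtype in df.dtypes.items()}


def validate_schema(df: pd.DataFrame, schema: dict[str, str]) -> list[str]:
    """List human-readable mismatches between df and the expected schema."""
    problems = []
    for col, expected in schema.items():
        if col not in df.columns:
            problems.append(f"missing column: {col}")
        elif str(df[col].dtype) != expected:
            problems.append(f"{col}: expected {expected}, got {df[col].dtype}")
    for col in df.columns:
        if col not in schema:
            problems.append(f"unexpected column: {col}")
    return problems


# Toy frames standing in for the real and synthetic data (hypothetical columns).
real_df = pd.DataFrame({"IN_BYTES": [100, 250], "PROTOCOL": [6, 17], "Label": [0, 1]})
synthetic_df = pd.DataFrame({"IN_BYTES": [120.0, 240.0], "PROTOCOL": [6, 17]})

schema = build_schema(real_df)
issues = validate_schema(synthetic_df, schema)
print(issues)
```

In the notebook, `build_schema(df)` would run once on the cleaned CSV, and `validate_schema(synthetic_df, schema)` would run on the generated frame before it is written to disk; an empty list means the columns and dtypes line up.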

In [None]:

# REFERENCE: https://github.com/gretelai/gretel-python-client
# %pip install gretel-trainer

%pip install gretel-client

Collecting gretel-client
  Using cached gretel_client-0.30.1-py3-none-any.whl.metadata (4.2 kB)
Collecting inflection==0.5.1 (from gretel-client)
  Downloading inflection-0.5.1-py2.py3-none-any.whl.metadata (1.7 kB)
Collecting networkx==3.0 (from gretel-client)
  Downloading networkx-3.0-py3-none-any.whl.metadata (5.1 kB)
Collecting pyarrow==19.0.1 (from gretel-client)
  Downloading pyarrow-19.0.1-cp313-cp313-win_amd64.whl.metadata (3.4 kB)
Collecting pycryptodome<4,>=3.19 (from gretel-client)
  Downloading pycryptodome-3.23.0-cp37-abi3-win_amd64.whl.metadata (3.5 kB)
Collecting smart-open<8.0,>=2.1.0 (from gretel-client)
  Downloading smart_open-7.5.0-py3-none-any.whl.metadata (24 kB)
Collecting tabulate>=0.8.9 (from gretel-client)
  Downloading tabulate-0.9.0-py3-none-any.whl.metadata (34 kB)
Collecting wrapt (from smart-open<8.0,>=2.1.0->gretel-client)
  Downloading wrapt-2.0.1-cp313-cp313-win_amd64.whl.metadata (9.2 kB)
Downloading gretel_client-0.30.1-py3-none-any.whl (525 kB)
   

In [8]:
%pip install git+https://github.com/gretelai/gretel-python-client@main

Collecting git+https://github.com/gretelai/gretel-python-client@main
  Cloning https://github.com/gretelai/gretel-python-client (to revision main) to c:\users\dorot\appdata\local\temp\pip-req-build-o7us0oz_
  Resolved https://github.com/gretelai/gretel-python-client to commit 78340b7f58c83b318e37b8c6f6beb04938a39412
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
  Installing backend dependencies: started
  Installing backend dependencies: finished with status 'done'
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Building wheels for collected packages: gretel-client
  Building wheel for gretel-client (pyproject.toml): started
  Building wheel for gretel-client (pyproject.toml): finished with status 'done'
  Created wheel for gretel-client: filename=gretel

  Running command git clone --filter=blob:none --quiet https://github.com/gretelai/gretel-python-client 'C:\Users\dorot\AppData\Local\Temp\pip-req-build-o7us0oz_'


In [1]:
%pip install -U pip setuptools wheel

Note: you may need to restart the kernel to use updated packages.


In [None]:
# Install gretel-client and gretel-trainer; pip, setuptools, and wheel are upgraded first (cell above).
%pip install -U gretel-client gretel-trainer

In [None]:
from gretel_trainer import trainer, models
import pandas as pd

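# Point csv_path at your local copy of the dataset (or a mounted Google Drive path).
# NOTE: the file below is NF-UNSW-NB15-v3, while the TODO and project name refer to
# NF-CSE-CIC-IDS2018; confirm which dataset is intended before training.
csv_path = "C:\\Users\\dorot\\Downloads\\f7546561558c07c5_NFV3DATA-A11964_A11964\\f7546561558c07c5_NFV3DATA-A11964_A11964\\data\\NF-UNSW-NB15-v3.csv"
# Loaded here for local inspection and schema checks; Trainer reads the CSV itself.
df = pd.read_csv(csv_path)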

# Initialize the trainer with the ACTGAN configuration
model = trainer.Trainer(
    project_name="ids-2018-actgan",
    model_type=models.GretelACTGAN(),
    overwrite=True,
    cache_file="ids-2018-actgan-runner.json",
    session=None,
)

# Train the model using the CSV path (Trainer handles loading internally)
model.train(dataset_path=csv_path)

# Generate 5,000 new synthetic records
synthetic_df = model.generate(num_records=5000)
synthetic_df.to_csv("actgan_synthetic_traffic.csv", index=False)