Consider verbosity parameter for per-epoch losses #15

Closed · kevinykuo opened this issue Jan 8, 2020 · 8 comments
Labels: feature request (Request for a new feature)

Comments

@kevinykuo (Contributor) commented Jan 8, 2020

Either on/off or maybe a frequency (e.g. every N epochs)
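
A minimal sketch of how a bool-or-int verbose argument could gate the per-epoch print (the name and the bool-or-int convention are illustrative assumptions, not the released CTGAN API):

def _should_print(verbose, epoch):
    # True -> print every epoch; int N > 0 -> print every N epochs.
    if verbose is True:
        return True
    if isinstance(verbose, int) and verbose > 0:
        return epoch % verbose == 0
    return False

# Inside the training loop:
#     if _should_print(verbose, epoch):
#         print(f'Epoch {epoch}, Loss G: {loss_g:.4f}, Loss D: {loss_d:.4f}')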

csala added this to the 0.2.1 milestone Jan 9, 2020
csala added the internal (The issue doesn't change the API or functionality) and good first issue labels Jan 9, 2020
@kevinykuo (Contributor, Author):
@csala I think for this we should have a verbose parameter that turns the printing on/off. However, in either case I think it'd be helpful for fit() to return the training history, so users can inspect and plot it afterwards. Maybe a wrapper around a pandas DataFrame, but you'd probably have a better idea of the most Pythonic approach. Let me know your thoughts and I'd be happy to whip something together.
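
A rough sketch of what a history-returning fit() could look like (pandas-based history and a hypothetical _train_epoch helper; none of this is the current CTGAN API):

import pandas as pd

def fit(self, train_data, discrete_columns=(), epochs=300, verbose=True):
    records = []
    for epoch in range(epochs):
        # _train_epoch is a hypothetical helper returning the epoch's losses.
        loss_g, loss_d = self._train_epoch(train_data, discrete_columns)
        records.append({'epoch': epoch, 'loss_g': loss_g, 'loss_d': loss_d})
        if verbose:
            print(f'Epoch {epoch}, Loss G: {loss_g:.4f}, Loss D: {loss_d:.4f}')
    self.history = pd.DataFrame(records)
    return self.history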

csala removed this from the 0.2.1 milestone Jan 27, 2020
csala modified the milestone: 0.2.1 Jan 27, 2020
@oregonpillow (Contributor):
Any updates on this issue?
Playing around in Colab, I put this together: https://colab.research.google.com/drive/1JA_Ap1bQDmlhm_tC1k8RL0MNYKBluJNa
and added some new arguments to the fit() method in synthesizer.py. However, I'm certain my implementation is probably completely off. Any feedback is greatly appreciated.

Args:
            train_data (numpy.ndarray or pandas.DataFrame):
                Training data. It must be a 2-dimensional numpy array or a
                pandas.DataFrame.
            discrete_columns (list-like):
                List of discrete columns to be used to generate the Conditional
                Vector. If ``train_data`` is a numpy array, this list should
                contain the integer indices of the columns. Otherwise, if it is
                a ``pandas.DataFrame``, this list should contain the column names.
            verbosity (boolean):
                Whether to display per-epoch losses during the run.
                Defaults to ``True``.
            epochs (int):
                Number of training epochs. Defaults to 300.
            log_frequency (boolean):
                Whether to use log frequency of categorical levels in conditional
                sampling. Defaults to ``True``.
            gpu_stats (boolean):
                Whether to display GPU stats for each epoch. Fitting may be
                slowed down with this option turned on. Only NVIDIA GPUs are
                supported at this time. Defaults to ``False``.
            early_stopping (boolean):
                Whether to stop fitting early if the loss has not improved for
                ``patience`` epochs. Defaults to ``False``.
            patience (int):
                Number of epochs without improvement to wait before stopping.
                Defaults to ``10`` when ``early_stopping`` is turned on.
            logging (boolean):
                Whether to store the generator loss and discriminator loss in a
                timestamped CSV log file. Defaults to ``False``.
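
For context, a call under this proposed signature might look like the following (verbosity, gpu_stats, early_stopping, patience and logging are all proposed arguments from the Colab experiment above, not the released API):

from ctgan import CTGANSynthesizer

ctgan = CTGANSynthesizer()
ctgan.fit(
    train_data,
    discrete_columns,
    verbosity=True,       # proposed: print per-epoch losses
    epochs=300,
    early_stopping=True,  # proposed: stop when the loss stalls
    patience=10,          # proposed: epochs to wait for improvement
    logging=True,         # proposed: write losses to a timestamped CSV
)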

@elisim commented Nov 4, 2020

@csala it would be very helpful.
IMHO, something similar to the Keras model.fit output could be considered:

ctgan = CTGANSynthesizer()
hist = ctgan.fit(data, discrete_columns)

where hist is a dictionary containing the generator and discriminator loss per epoch, and may be extended to other metrics in the future.
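
As a sketch of how such a hist object could be consumed afterwards (the keys 'loss_g' and 'loss_d' are assumptions for illustration, not an existing API):

import matplotlib.pyplot as plt

# Assumes hist maps hypothetical keys to per-epoch loss lists.
epochs = range(1, len(hist['loss_g']) + 1)
plt.plot(epochs, hist['loss_g'], label='Generator loss')
plt.plot(epochs, hist['loss_d'], label='Discriminator loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()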

@Baukebrenninkmeijer (Contributor) commented Dec 3, 2020

In my own implementation I wrapped both the epoch and step loops with tqdm (progress bars). You can add logging information like the loss there as well.

Regarding how this information should be logged, and also @oregonpillow's proposal, I think the following:

  1. The information you're logging is really good and I like it a lot! The GPU stats are also a nice added bonus.
  2. The history should not be returned by fit. To me at least, this does not feel intuitive. I think this information can be exposed as an attribute, like ctgan.hist or ctgan.logs or something.
  3. Writing directly to files seems a bit much for an implementation in CTGAN.
  4. I think an option to facilitate many of these things is a callback system, similar to FastAI: we call on_epoch_end, on_epoch_start and other methods on the objects in ctgan.callbacks. These callbacks can be anything, ranging from logging objects to early stopping (see the sketch after this list).
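
A minimal sketch of such a callback system (all names here are illustrative; CTGAN does not ship a callbacks API):

class Callback:
    """Base class: subclasses override only the hooks they need."""

    def on_epoch_start(self, epoch, logs):
        pass

    def on_epoch_end(self, epoch, logs):
        pass


class EarlyStopping(Callback):
    """Stop training when the monitored loss stops improving."""

    def __init__(self, monitor='loss_g', patience=10):
        self.monitor = monitor
        self.patience = patience
        self.best = float('inf')
        self.wait = 0
        self.stop_training = False

    def on_epoch_end(self, epoch, logs):
        # logs is assumed to be a dict of this epoch's metrics.
        if logs[self.monitor] < self.best:
            self.best = logs[self.monitor]
            self.wait = 0
        else:
            self.wait += 1
            self.stop_training = self.wait >= self.patience


# The training loop would then call, once per epoch:
#     for cb in self.callbacks: cb.on_epoch_start(epoch, logs)
#     ... training steps ...
#     for cb in self.callbacks: cb.on_epoch_end(epoch, logs)
#     if any(getattr(cb, 'stop_training', False) for cb in self.callbacks):
#         break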

@oregonpillow (Contributor):
I'll be honest, the only reason I added GPU status was because I liked watching the temperature go up with more epochs 😏

csala added the feature request (Request for a new feature) and needs discussion labels, and removed the good first issue and internal (The issue doesn't change the API or functionality) labels Sep 6, 2021
@NadeemNicoR:
Can I please know what metric is used in the loss calculation here?
Epoch 105, Loss G: -7.7396, Loss D: -0.3223 is what I get when I try to fit the model on the training data.

@Baukebrenninkmeijer (Contributor):
@NadeemNicoR The metric is the raw logit output, IIRC. The loss of G is just the average error of the samples produced by G, and the loss of D is the loss of G minus the loss of the real samples. I'm doing this from memory, so let me know if this is incorrect.
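
For reference, CTGAN trains with a WGAN-GP-style objective, so the printed values are raw, unbounded critic scores rather than probabilities, which is why they can be negative. A rough sketch of the two loss terms (omitting the gradient penalty and CTGAN's conditional-vector loss term):

import torch

def generator_loss(critic_fake_scores):
    # G tries to maximize the critic's score on generated samples.
    return -torch.mean(critic_fake_scores)

def critic_loss(critic_real_scores, critic_fake_scores):
    # The critic pushes fake scores down and real scores up; the raw
    # outputs are unbounded, so the printed losses can be negative.
    return torch.mean(critic_fake_scores) - torch.mean(critic_real_scores)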

@npatki (Contributor) commented Jul 11, 2022

#147 addressed this issue, so I'm closing it off. For further discussion about the verbosity parameter, let's use the overall SDV GitHub.

npatki closed this as completed Jul 11, 2022