Problem Description
Most distributions used by GaussianCopula and CopulaGAN fit continuous variables better. To optimize the fitting process, we can add noise to the output of the LabelEncoder to make the transformed categorical variables continuous.
Expected behavior
Add a new boolean parameter, add_noise, to __init__ for LabelEncoder.
If the value is false, no changes should be made.
If the value is true, then:
On the forward transform: Perform the label encoding as usual, and then add uniform noise so that each value stays within the half-open interval [label, label + 1). For example:
Label=1 is noised to anything in the interval [1, 2), e.g. 1.002, 1.3, 1.9999, ...
Label=2 is noised to anything in the interval [2, 3)
Label=3 is noised to anything in the interval [3, 4)
On the reverse transform: Floor the values, i.e. round down to an integer (e.g. 3.9 becomes 3, 4.321 becomes 4), and then continue the normal reverse transformation.
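The requested behavior could be sketched roughly as follows. This is a minimal illustration, not the library's actual LabelEncoder: the class name NoisyLabelEncoder, the seed parameter, and the fit/transform/reverse_transform method names are assumptions made for the example, and labels here start at 0 rather than 1. Only the add_noise parameter comes from the proposal itself.

```python
import numpy as np


class NoisyLabelEncoder:
    """Hypothetical stand-in for LabelEncoder, used only to illustrate add_noise."""

    def __init__(self, add_noise=False, seed=None):
        self.add_noise = add_noise
        self._rng = np.random.default_rng(seed)  # seed is an assumption, for reproducibility
        self._categories = None

    def fit(self, values):
        # Sorted unique categories; a category's integer label is its position here.
        self._categories = np.unique(np.asarray(values))
        return self

    def transform(self, values):
        # Plain label encoding (assumes every value was seen during fit).
        labels = np.searchsorted(self._categories, np.asarray(values)).astype(float)
        if not self.add_noise:
            return labels
        # Uniform noise in [0, 1) keeps each value inside [label, label + 1).
        noisy = labels + self._rng.uniform(0.0, 1.0, size=labels.shape)
        # Guard against floating-point rounding pushing a value up to label + 1.
        return np.minimum(noisy, np.nextafter(labels + 1.0, labels))

    def reverse_transform(self, values):
        # Floor back to the integer label, then map labels to categories.
        labels = np.floor(np.asarray(values, dtype=float)).astype(int)
        # Clipping out-of-range labels is an assumption of this sketch.
        labels = np.clip(labels, 0, len(self._categories) - 1)
        return self._categories[labels]


data = np.array(["b", "a", "c", "b"])
encoder = NoisyLabelEncoder(add_noise=True, seed=0).fit(data)
transformed = encoder.transform(data)            # continuous values, one per row
recovered = encoder.reverse_transform(transformed)
print((recovered == data).all())                 # True: flooring undoes the noise
```

Because the noise never leaves the half-open interval, flooring recovers the original label exactly, so the round trip is lossless whether add_noise is true or false.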