When I create synthetic rowsets using sample_rows, I noticed that it assumes my data follows a normal distribution, which skews the values. For example, the secs elapsed field has no negative values in my original dataset, but in the synthetic ones its values range from -4000 to +6000. Is there any way I can specify the distribution for each column?
There is a way to indicate which distribution to use, but currently this setting affects all the columns, so you cannot choose a custom distribution per column. This is a feature that will be added in the future.
As for the negative values, these would be better handled by the data constraints functionality that is also being developed. See #70
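Until per-column distributions or constraints are available, one possible workaround is to handle the non-negative column yourself. The sketch below is not this library's API. It uses only the Python standard library, and the helper names (`fit_normal`, `sample_naive`, `sample_log_space`) are hypothetical. It shows why a normal fit on a skewed, non-negative column produces negative samples, and how fitting in log space and transforming back avoids them.

```python
import math
import random
import statistics


def fit_normal(values):
    """Return (mean, stdev) of a normal distribution fitted to values."""
    return statistics.mean(values), statistics.stdev(values)


def sample_naive(values, n, rng):
    """Sample from a normal fitted directly to the raw values.

    For skewed, non-negative data the fitted normal puts real
    probability mass below zero, so negatives will appear.
    """
    mu, sigma = fit_normal(values)
    return [rng.gauss(mu, sigma) for _ in range(n)]


def sample_log_space(values, n, rng):
    """Fit the normal to log1p(values) and map samples back with expm1.

    The final max() clamps the small negative results that expm1
    returns when a sample in log space falls below zero.
    """
    mu, sigma = fit_normal([math.log1p(v) for v in values])
    return [max(0.0, math.expm1(rng.gauss(mu, sigma))) for _ in range(n)]


rng = random.Random(0)
# Stand-in for a non-negative column such as elapsed seconds (assumed data).
original = [rng.expovariate(1 / 1000) for _ in range(500)]

naive = sample_naive(original, 1000, rng)
safe = sample_log_space(original, 1000, rng)

print("naive min:", min(naive))
print("log-space min:", min(safe))
```

The log-space version changes the shape of the sampled distribution, so it is only a stopgap. It also handles each column independently, so it does not preserve correlations between columns.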