The IndependentSynthesizer should follow the sdtypes in the metadata (not the data's dtypes)
#249
Labels: bug
Environment Details
What is expected
The `IndependentSynthesizer` is expected to model each column independently:

- For `numerical` or `datetime` sdtypes, it should learn a univariate GMM during fit. Then, during sample, it can create data from it.
- For `categorical` or `boolean` sdtypes, it should learn the frequencies of each category. Then, during sample, it can create data using those frequencies as weights.
- For other sdtypes (`id`, `pii`, etc.), it can simply use the `RegexGenerator` or `AnonymizedFaker` to generate values from scratch (no learning is expected).

How does this synthesizer know which type is which? It should use the provided metadata as the source of truth.
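The expected behavior can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the synthesizer's actual implementation: the `metadata` dict, the function names, and the `id_0000`-style generator are all hypothetical, a single Gaussian stands in for the univariate GMM, and sequential strings stand in for `RegexGenerator` / `AnonymizedFaker`. The point is only that the branch taken for each column depends on the declared sdtype.

```python
import random
import statistics
from collections import Counter


def fit_column(values, sdtype):
    """Pick a per-column model from the metadata sdtype, not the data's dtype."""
    if sdtype in ("numerical", "datetime"):
        # Assumption: datetimes were already converted to numeric timestamps.
        # A single Gaussian stands in for the univariate GMM.
        return ("gaussian", statistics.mean(values), statistics.pstdev(values))
    if sdtype in ("categorical", "boolean"):
        counts = Counter(values)
        return ("frequencies", list(counts.keys()), list(counts.values()))
    # id, pii, etc.: nothing is learned; values are generated from scratch.
    return ("generated",)


def sample_column(model, n, rng):
    """Draw n synthetic values from a fitted per-column model."""
    kind = model[0]
    if kind == "gaussian":
        _, mean, stdev = model
        return [rng.gauss(mean, stdev) for _ in range(n)]
    if kind == "frequencies":
        _, categories, weights = model
        return rng.choices(categories, weights=weights, k=n)
    # Hypothetical stand-in for RegexGenerator / AnonymizedFaker.
    return [f"id_{i:04d}" for i in range(n)]


def fit(data, metadata):
    """Fit every column independently, driven only by the metadata."""
    return {col: fit_column(data[col], sdtype) for col, sdtype in metadata.items()}


def sample(models, n, seed=0):
    rng = random.Random(seed)
    return {col: sample_column(model, n, rng) for col, model in models.items()}


# Hypothetical toy table: zip_code holds integers but is declared categorical.
data = {
    "age": [23, 35, 41, 29],
    "zip_code": [10001, 10001, 94105, 10001],
    "user_id": ["a1", "b2", "c3", "d4"],
}
metadata = {"age": "numerical", "zip_code": "categorical", "user_id": "id"}

models = fit(data, metadata)
synthetic = sample(models, n=5)
```

Because `zip_code` is declared `categorical`, the sketch samples only from the observed zip codes rather than drawing arbitrary numbers from a fitted distribution.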
What is actually observed
Similar to the `UniformSynthesizer` (see #248), this synthesizer just lets the RDT HyperTransformer decide which column is which sdtype (based on the data). It should reference the metadata instead, since the metadata is the source of truth.
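To illustrate why a data-based guess can disagree with the metadata, here is a small sketch. The `guess_sdtype_from_values` helper is a deliberately naive, hypothetical stand-in and does not reproduce RDT's actual detection logic; the column name and values are made up as well.

```python
def guess_sdtype_from_values(values):
    """Naive data-based guess (hypothetical; not RDT's real detection logic)."""
    # bool is a subclass of int in Python, so check for booleans first.
    if all(isinstance(v, bool) for v in values):
        return "boolean"
    if all(isinstance(v, (int, float)) for v in values):
        return "numerical"
    return "categorical"


# Integer-coded product codes: numeric in the data, categorical by intent.
product_codes = [101, 205, 101, 333]
declared_metadata = {"product_code": "categorical"}

guessed = guess_sdtype_from_values(product_codes)
declared = declared_metadata["product_code"]

# The two disagree; the declared sdtype is the one that should be used.
mismatch = guessed != declared
```

Here the data alone suggests `numerical`, while the metadata says `categorical`. A synthesizer that follows the guess would model the codes as a continuous quantity instead of as discrete categories.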