Add option to derive column value from src dataset (marginals) #24

@Iain-S

The user may want to use the distribution of, for example, src.patient.sex to derive the dst.patient.sex value. We could add noise to existing data (anonymisation) or sample from a distribution (synthetic data). Either way, the amount of noise should be customisable.

Longer term, it would be nice to specify an epsilon or privacy budget in some form. For now, we shall simply:

  • Add a convenient way to specify that the source data should be used to populate the target column.
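
As a rough illustration of the idea, here is a minimal sketch of sampling a target column from the marginal distribution of a source column, with a customisable amount of noise. The names `marginal_weights` and `make_sampler` and the `noise` parameter are hypothetical and are not part of any existing API in this project. The noise here simply blends the observed frequencies with a uniform distribution; it is not a formal privacy guarantee such as an epsilon budget.

```python
import random
from collections import Counter


def marginal_weights(src_values, noise=0.0):
    """Return {category: probability} for the observed source values.

    noise=0.0 keeps the observed frequencies unchanged; noise=1.0 gives a
    uniform distribution over the observed categories. Values in between
    blend the two linearly. (Hypothetical helper, for illustration only.)
    """
    if not 0.0 <= noise <= 1.0:
        raise ValueError("noise must be between 0 and 1")
    counts = Counter(src_values)
    if not counts:
        raise ValueError("src_values must not be empty")
    total = sum(counts.values())
    uniform = 1.0 / len(counts)
    return {
        category: (1.0 - noise) * count / total + noise * uniform
        for category, count in counts.items()
    }


def make_sampler(src_values, noise=0.0, seed=None):
    """Return a zero-argument function that draws one synthetic value."""
    weights = marginal_weights(src_values, noise)
    categories = list(weights)
    probabilities = [weights[c] for c in categories]
    rng = random.Random(seed)

    def sample():
        # random.choices draws with replacement according to the weights.
        return rng.choices(categories, weights=probabilities, k=1)[0]

    return sample


if __name__ == "__main__":
    # Stand-in for the values read from src.patient.sex.
    src_sex = ["F", "F", "F", "M"]
    sampler = make_sampler(src_sex, noise=0.2, seed=42)
    # Stand-in for the values written to dst.patient.sex.
    print([sampler() for _ in range(10)])
```

With this shape, a column configuration would only need to name the source column and a noise level, and the generator for the target column could call the returned sampler once per row.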
