Let's make sampling more user friendly. We can create multiple methods for different user needs.
A new sample_conditions method can address conditional sampling needs.
Expected behavior
Parameters:
conditions (required): A list of Condition objects (see Create Condition object #689). Each Condition specifies its own num_rows.
max_tries: Renamed from the existing max_retries param (default: 100)
batch_size_per_try: Number of rows to sample per try (default: 10x the requested number of rows)
randomize_samples: Whether to randomize the samples. If False, a fixed seed is used (default: True)
# works with any tabular model
from sdv.tabular import CTGAN
model = CTGAN()
model.fit(data)
# see Issue #689 for Condition object details
from sdv.tabular.sampling import Condition
female_users = Condition(column_values={'sex': 'F', 'active_user': True}, num_rows=50)
inactive_users = Condition(column_values={'active_user': False}, num_rows=100)
conditions = [female_users, inactive_users]
# pass in list of conditions
model.sample_conditions(conditions, max_tries=200, randomize_samples=False)
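To make the interplay of the parameters concrete, here is a minimal, standard-library-only sketch of how the proposed method could behave if it were built on rejection sampling. This is an illustration under stated assumptions, not SDV's implementation: the `Condition` dataclass is a stand-in for the object proposed in #689, `toy_sampler` is a hypothetical replacement for a fitted model, and the fixed seed value `0` is arbitrary.

```python
import random
from dataclasses import dataclass


@dataclass
class Condition:
    # Hypothetical stand-in for the Condition object proposed in #689.
    column_values: dict
    num_rows: int = 1


def toy_sampler(rng, n):
    # Hypothetical stand-in for a fitted model's unconditional sampler.
    return [
        {'sex': rng.choice(['F', 'M']), 'active_user': rng.choice([True, False])}
        for _ in range(n)
    ]


def sample_conditions(conditions, max_tries=100, batch_size_per_try=None,
                      randomize_samples=True, sampler=toy_sampler):
    # randomize_samples=False uses a fixed seed (assumed: 0), so repeated
    # calls return the same rows.
    rng = random.Random() if randomize_samples else random.Random(0)
    sampled = []
    for condition in conditions:
        # Default batch size: 10x the number of rows requested by the condition.
        batch_size = batch_size_per_try or 10 * condition.num_rows
        matches = []
        for _ in range(max_tries):
            batch = sampler(rng, batch_size)
            # Keep only the rows that satisfy every column value.
            matches.extend(
                row for row in batch
                if all(row[col] == val
                       for col, val in condition.column_values.items())
            )
            if len(matches) >= condition.num_rows:
                break
        # May hold fewer than num_rows if the tries ran out.
        sampled.extend(matches[:condition.num_rows])
    return sampled


conditions = [
    Condition(column_values={'sex': 'F', 'active_user': True}, num_rows=50),
    Condition(column_values={'active_user': False}, num_rows=100),
]
rows = sample_conditions(conditions, max_tries=200, randomize_samples=False)
```

In this sketch, a rare condition is the case where raising `max_tries` or `batch_size_per_try` helps, at the cost of longer sampling time.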
Error Handling
Running out of tries
# Always gracefully reject sample (i.e. return any rows that are sampled)
>>> synthetic_data = model.sample_conditions(conditions)
Warning: Only able to sample 75 rows for the given conditions. To sample more rows, try increasing max_tries
(currently: 100) or increasing batch_size_per_try (currently: 10000). Note that increasing these values will also increase the sampling time.
# Error if we weren't able to sample any rows
>>> synthetic_data = model.sample_conditions(conditions)
Error: Unable to sample any rows for the given conditions. Try increasing max_tries
(currently: 100) or increasing batch_size_per_try (currently: 10000). Note that increasing these values will also increase the sampling time.
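The two outcomes above could be produced by a small check that runs once sampling finishes. The following is a rough sketch only: the helper name `check_sampled_rows` is hypothetical, and the use of `warnings.warn` and `ValueError` is an assumption, since the issue does not say which warning or exception types should be used.

```python
import warnings


def check_sampled_rows(num_sampled, num_requested,
                       max_tries=100, batch_size_per_try=10000):
    # Hypothetical helper: reports how sampling went once the tries ran out.
    tuning = (
        f'increasing max_tries (currently: {max_tries}) or increasing '
        f'batch_size_per_try (currently: {batch_size_per_try}). Note that '
        'increasing these values will also increase the sampling time.'
    )
    if num_sampled == 0:
        # Nothing matched: raise (exception type is an assumption).
        raise ValueError(
            'Unable to sample any rows for the given conditions. Try ' + tuning
        )
    if num_sampled < num_requested:
        # Partial result: warn, and let the caller return the sampled rows.
        warnings.warn(
            f'Only able to sample {num_sampled} rows for the given conditions. '
            'To sample more rows, try ' + tuning
        )
    return num_sampled
```

Calling `check_sampled_rows(75, 150)` emits the warning and returns 75, while `check_sampled_rows(0, 150)` raises.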
Checking for invalid input