As I understand it, cleanlab.benchmarking.noise_generation.generate_noise_matrix_from_trace is meant to generate a set percentage of incorrect labels. In the code below I used a noise amount of 0.2 (20% label noise), setting trace = K * (1 - 0.2), but only 776 of 20000 labels were changed (3.88%), far below the intended 20%. Could you clarify what "noise amount" means in relation to the trace? Also, is there a way to generate an exact percentage of incorrect labels, e.g. 20% of 20000 = 4000?
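import random
from cleanlab.benchmarking.noise_generation import (
    generate_noise_matrix_from_trace,
    generate_noisy_labels,
)

random.seed(100)
# 20000 labels drawn uniformly from 120 classes (0..119)
random_numbers = [random.randint(0, 119) for _ in range(20000)]
# Intended noise amount of 0.2: trace = K * (1 - noise amount) = 96
trace = 120 * (1 - 0.2)
noisy_matrix = generate_noise_matrix_from_trace(K=120, trace=trace, valid_noise_matrix=False, seed=100)
noisy_numbers = generate_noisy_labels(random_numbers, noisy_matrix)
# Count the labels that differ from the originals
sum(noisy_numbers != random_numbers) # prints 776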
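For reference, here is a minimal sketch of what "a set percentage of incorrect labels" could look like outside of cleanlab. It uses plain NumPy only; the helper name `flip_exact_fraction` and its parameters are hypothetical and not part of cleanlab's API. It also assumes uniform noise (each flipped label moves to a different class chosen uniformly at random), which is not the same as sampling from a noise matrix.

```python
import numpy as np

def flip_exact_fraction(labels, num_classes, noise_fraction, seed=0):
    """Hypothetical helper (not a cleanlab function): flip exactly
    round(noise_fraction * n) labels to a different class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    n_flip = int(round(noise_fraction * len(labels)))
    # Pick distinct positions to corrupt.
    flip_idx = rng.choice(len(labels), size=n_flip, replace=False)
    # An offset in [1, num_classes - 1], applied modulo num_classes,
    # always lands on a class different from the original one.
    offsets = rng.integers(1, num_classes, size=n_flip)
    noisy = labels.copy()
    noisy[flip_idx] = (labels[flip_idx] + offsets) % num_classes
    return noisy

rng = np.random.default_rng(100)
true_labels = rng.integers(0, 120, size=20000)
noisy_labels = flip_exact_fraction(true_labels, num_classes=120, noise_fraction=0.2, seed=100)
num_errors = int((noisy_labels != true_labels).sum())
print(num_errors)  # 4000
```

Because every selected position is guaranteed to change class, the error count equals the requested number exactly, rather than only in expectation.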