Prints inside the worker processes mess up the progress bar #76

carmocca · 2024-03-24T20:51:54Z

🐛 Bug

In my code, I am enabling a tqdm bar per worker with:

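    import os
    from tqdm import tqdm

    # Inside the generator function that each worker runs:
    global_rank = int(os.environ["DATA_OPTIMIZER_GLOBAL_RANK"])
    num_workers = int(os.environ["DATA_OPTIMIZER_NUM_WORKERS"])
    local_rank = global_rank % num_workers
    # One bar per worker, each drawn on its own terminal row
    for example in tqdm(data, position=local_rank):
        tokens = tokenizer.encode(example)
        yield tokens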

However, litdata prints the following message on every rank:

    Rank 3 inferred the following `['no_header_tensor:16']` data format.

This breaks the layout of the tqdm bars at the start of the run.

Since this message doesn't seem very useful to users, I suggest removing it or gating it behind fast_dev_run or a similar verbosity flag.
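To illustrate the second option, here is a minimal sketch of gating the message behind a flag. The helper name `announce_data_format` and its `verbose` and `stream` parameters are hypothetical, chosen only for this example; they are not litdata's API, and the actual fix may look different.

```python
import sys


def announce_data_format(rank, data_format, verbose=False, stream=None):
    """Write the inferred-format message only when verbose is True.

    Hypothetical helper for illustration; not part of litdata.
    Returns True if a message was written, False otherwise.
    """
    if not verbose:
        # Default: stay silent so per-worker progress bars are not disturbed.
        return False
    # Assumption: stderr is used by default; pass `stream` to redirect.
    out = stream if stream is not None else sys.stderr
    out.write(f"Rank {rank} inferred the following `{data_format}` data format.\n")
    return True
```

With `verbose` left at its default, workers emit nothing and the bars stay intact. If a message does need to appear while bars are active, `tqdm.write` is designed to print without overlapping them.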


github-actions · 2024-03-24T20:52:17Z

Hi! Thanks for your contribution! Great first issue!

carmocca added the bug and help wanted labels · Mar 24, 2024

carmocca mentioned this issue Apr 2, 2024

Mention how the progress bar updates #75

Merged
