Prints inside the worker processes mess up the progress bar #76

Open
carmocca opened this issue Mar 24, 2024 · 1 comment
Labels
bug Something isn't working help wanted Extra attention is needed

Comments


carmocca commented Mar 24, 2024

🐛 Bug

In my code, I am enabling a tqdm bar per worker with:

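    import os
    from tqdm import tqdm

    # Excerpt from the body of a generator function (note the yield).
    global_rank = int(os.environ["DATA_OPTIMIZER_GLOBAL_RANK"])
    num_workers = int(os.environ["DATA_OPTIMIZER_NUM_WORKERS"])
    local_rank = global_rank % num_workers
    # position=local_rank gives each worker's bar its own terminal line.
    for example in tqdm(data, position=local_rank):
        tokens = tokenizer.encode(example)
        yield tokens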

But litdata prints this in each rank:

Rank 3 inferred the following `['no_header_tensor:16']` data format.

This breaks the tqdm bars right at the start.

Since this message doesn't seem very useful to users, I suggest removing it or gating it behind fast_dev_run or a similar verbosity flag.
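To illustrate the kind of gating suggested here, below is a minimal sketch that routes the message through Python's standard `logging` module at DEBUG level, so it stays silent unless the user opts in. This is not litdata's actual code: the logger name `format_report_sketch` and the functions `report_data_format` and `capture_output` are hypothetical names made up for this example.

```python
import io
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("format_report_sketch")


def report_data_format(rank: int, data_format: list) -> str:
    """Log the inferred data format at DEBUG level instead of printing it."""
    message = f"Rank {rank} inferred the following `{data_format}` data format."
    logger.debug(message)
    return message


def capture_output(level: int) -> str:
    """Run the report at the given log level and return what was written."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    logger.setLevel(level)
    try:
        report_data_format(3, ["no_header_tensor:16"])
    finally:
        # Detach the handler so repeated calls do not duplicate output.
        logger.removeHandler(handler)
    return stream.getvalue()
```

With the logger at `logging.INFO`, nothing is written and the progress bars are left alone; setting it to `logging.DEBUG` brings the message back for anyone who wants it.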

carmocca added the bug and help wanted labels Mar 24, 2024

Hi! Thanks for your contribution! Great first issue!
