Config gets overwritten on resume by id #912

Closed
stathius opened this issue Mar 13, 2020 · 9 comments

@stathius

wandb --version && python --version && uname

  • wandb, version 0.8.29
  • Python 3.7.5
  • Linux

Description

I am trying to resume a run with wandb.init(resume='someid'), but doing so overwrites the run's config params.

@vanpelt
Contributor

vanpelt commented Mar 13, 2020

Hey @stathius, are you calling wandb.config.update(...newconfig)? We should restore the previous values of config, and only update values that you manually set.
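
The intended behavior described here amounts to a dictionary merge: start from the values stored with the original run, then apply only the keys that were set explicitly. A minimal sketch with plain dicts, not wandb's actual implementation; `merge_config` and the `lr` key are hypothetical names used only for illustration:

```python
def merge_config(restored, overrides=None):
    """Return the config a resumed run is expected to see."""
    merged = dict(restored)          # start from the values saved with the original run
    merged.update(overrides or {})   # explicitly passed keys take precedence
    return merged


restored = {"batch_size": 16, "lr": 0.001}  # "lr" is a made-up example key

print(merge_config(restored))                      # nothing overridden
print(merge_config(restored, {"batch_size": 32}))  # only batch_size changes
```

Under this model, resuming without passing any config should leave every previously stored key intact, which is what the report says does not happen.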

@stathius
Author

No, I'm not calling anything other than wandb.init(resume='id'). Isn't the expected behavior that the run keeps its previous config values?

@vanpelt
Contributor

vanpelt commented Mar 14, 2020

Yep, that should be the case. Can you share a code snippet so we can reproduce?

@stathius
Author

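import wandb

# First run: store a config value.
run = wandb.init(project='lightning', config={'batch_size': 16})
id = run.id
wandb.join()  # wait for the first run to finish

# Resume from the same process; the config is expected to still hold batch_size.
run = wandb.init(resume=id)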

@vanpelt
Contributor

vanpelt commented Mar 17, 2020

This should work if the run is launched in a new process, but there may be a bug when re-launched from the same process. We'll try to reproduce.
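
One way to test the new-process case is to launch the resume step in a fresh interpreter. The following is a rough sketch, not a confirmed workaround: `run_in_new_process` and `RESUME_SNIPPET` are hypothetical names, `'someid'` stands in for a real run id, and the snippet only does something useful on a machine where wandb is installed and logged in.

```python
import subprocess
import sys

# Hypothetical resume script; replace 'someid' with the id of the run to resume.
RESUME_SNIPPET = """
import wandb
run = wandb.init(project='lightning', resume='someid')
print(run.config)
"""


def run_in_new_process(code):
    """Run Python source in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Calling `print(run_in_new_process(RESUME_SNIPPET))` would then show the config as seen by a process that never created the original run.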

@dHonerkamp

I'm seeing something similar since I've updated to version 10.0:

    import os
    import wandb

    common_args = {'entity': args.pop('wandb_account'),
                   'project': args.pop('project_name'),
                   'dir': args['logpath'],
                   'tags': [f'v{version:.1f}'] + tags + args.pop('tags'),
                   'sync_tensorboard': sync_tensorboard,
                   'name': run_name,
                   'group': group}

    os.environ["WANDB_RESUME"] = "must"
    os.environ["WANDB_RUN_ID"] = "1jms2gwz"

    run = wandb.init(id=args['resume_id'],
                     resume=args['resume_id'],
                     **common_args)

This leads to run.config == {} and wandb.config == {} instead of the restored config.

It seemed to work fine on version 9.8. Any idea what changed? (I experimented with the id and resume arguments, but can't seem to find any combination that works)
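
For comparison, wandb.init also accepts the run id through `id` together with one of the resume modes "allow" or "must", instead of passing the id itself as `resume`. A minimal sketch of that combination follows; `resume_kwargs` and `resume_run` are hypothetical helpers, and nothing in this thread confirms that this form avoids the empty-config behavior described above.

```python
def resume_kwargs(run_id, strict=True):
    """Build the id/resume keyword arguments for wandb.init."""
    if not run_id:
        raise ValueError("run_id must be a non-empty string")
    # "must" fails if the run does not exist; "allow" falls back to a new run.
    return {"id": run_id, "resume": "must" if strict else "allow"}


def resume_run(project, run_id, **extra):
    """Resume a run and return it (requires wandb to be installed and logged in)."""
    import wandb  # imported lazily so resume_kwargs stays usable without wandb

    return wandb.init(project=project, **resume_kwargs(run_id), **extra)
```

With these helpers, `run = resume_run('lightning', '1jms2gwz')` followed by `print(run.config)` would show whether the stored values come back.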

@vanpelt
Contributor

vanpelt commented Sep 14, 2020

We're tracking this and will cut a new release asap with a fix. It should go out in 10.0.1 within the next couple days.

@github-actions

This issue is stale because it has been open 60 days with no activity.

@github-actions github-actions bot added the stale label Dec 20, 2020
@ariG23498
Contributor

Hey folks, I tried reproducing this ticket with wandb 0.10.20.

Code

Creating a run

import numpy as np
import tensorflow as tf
import wandb
from wandb.keras import WandbCallback

# Generate the run id up front so the same run can be resumed later.
id = wandb.util.generate_id()
print(f"[INFO] Generated ID: {id}")

config = {
    "epochs": 50,
    "loss": "mse",
    "opt": "adam",
}

run = wandb.init(entity="xxx", project="xxx", id=id, config=config)
w_c = run.config
print(w_c)

# Tiny model, just enough to log a few training steps.
a = tf.keras.layers.Input(shape=(32,))
b = tf.keras.layers.Dense(10)(a)
model = tf.keras.models.Model(inputs=a, outputs=b)

model.compile(w_c["opt"], loss=w_c["loss"])
model.fit(np.random.rand(100, 32), np.random.rand(100, 10),
    initial_epoch=wandb.run.step, epochs=w_c["epochs"],
    callbacks=[WandbCallback(save_model=True, monitor="loss")])
run.finish()

This gets me the config dictionary {'epochs': 50, 'loss': 'mse', 'opt': 'adam'}

Resuming the previous run with a different config

config = {
    "epochs": 100,  # changed the epochs from 50 to 100
}

run = wandb.init(entity="xxx", project="xxx", resume=id, config=config)
w_c = run.config
print(w_c)

a = tf.keras.layers.Input(shape=(32,))
b = tf.keras.layers.Dense(10)(a)
model = tf.keras.models.Model(inputs=a, outputs=b)

# "opt" and "loss" are not passed on resume, so they must come from the restored config.
model.compile(w_c["opt"], loss=w_c["loss"])
model.fit(np.random.rand(100, 32), np.random.rand(100, 10),
    initial_epoch=wandb.run.step, epochs=w_c["epochs"],
    callbacks=[WandbCallback(save_model=True, monitor="loss")])
run.finish()

This runs and prints the following config dictionary {'epochs': 100, 'opt': 'adam', 'loss': 'mse'}

Closing this ticket. Please feel free to comment in the thread for further assistance.
