Initial pass at wandb Ludwig integration #514
Just tried to sign the CLA and it's blank...
@briankhsieh can you please look into the CLA issue?
Thanks for this contribution, will look into it shortly.
It looks like cla-assistant.io is having intermittent issues. It worked for me this morning. Can you try again later?
@briankhsieh got it this time, just took a little while to load.
@w4nderlust I'm using VSCode, if you have any ideas on settings I'm all ears. I'll do some googling as well. Other than formatting, any thoughts on the pull?
@vanpelt sorry for the delay on this, was out for conferences. Will look into it by end of day Friday.
ludwig/predict.py
Outdated
save_csv(csv_filename.format(
    output_field, output_type), values)
Please format this on 4 lines:
save_csv(
    csv_filename.format(output_field, output_type),
    values
)
ludwig/contribs/wandb.py
Outdated
        del config["output_features"]
        wandb.config.update(config)

    def train_init(self, experiment_directory, experiment_name, model_name, resume, output_directory):
This looks longer than 80 columns, please split accordingly
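For illustration, one way to split the signature so that every line stays under 80 columns is sketched below. The `Wandb` class wrapper and the `pass` body are placeholders, assumed only so the snippet runs on its own; the parameter names are taken from the diff above.

```python
class Wandb:
    """Placeholder class; only the signature layout matters here."""

    def train_init(
            self,
            experiment_directory,
            experiment_name,
            model_name,
            resume,
            output_directory
    ):
        # Body omitted: the point is that no line exceeds 80 columns.
        pass
```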
I'm really sorry for the long delay @vanpelt, I had some personal issues that kept me away from working on Ludwig, now I'm back. The PR looks good to me as a starting point, just added a couple of formatting suggestions. Regarding VSCode, I don't use it, but likely you can set the linter to 80 columns somehow (not sure this is still valid but https://stackoverflow.com/questions/29968499/vertical-rulers-in-visual-studio-code ). Do you prefer if I merge it now or do you want to keep on working on it adding the missing features?
Hey @w4nderlust no worries. Let me take a pass at getting formatting set up properly and address your comments. I'll have something shortly.
Ok, @w4nderlust addressed your feedback. Let me know if I missed anything, I'm still seeing DeepSource issues. Would love to get my linter / formatter set up correctly in VSCode but not sure how best to do it.
I looked at the DeepSource complaints. They are mostly related to global and import.
try:
    import wandb
    wandb_obj = Wandb()
    setattr(wandb_obj, 'wandb', wandb)
    return wandb_obj
What do you think?
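A minimal sketch of the idea in the comment above: do the import inside a function and attach the module to the instance, so that no `global` statement is needed. The factory name `load`, the `module_name` parameter, and the `ImportError` fallback are assumptions added for illustration and do not come from the PR.

```python
import importlib


class Wandb:
    """Hypothetical stand-in for the contrib class discussed above."""

    @staticmethod
    def load(module_name="wandb"):
        """Return an instance holding the imported module, or None."""
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Assumed behavior: treat a missing package as "integration off".
            return None
        wandb_obj = Wandb()
        # Store the module on the instance instead of in a global.
        setattr(wandb_obj, "wandb", module)
        return wandb_obj
```

Methods can then reach the library through `self.wandb`, and callers can check for `None` when the package is not installed.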
Sorry for the delay on this, I'll hopefully get to it this week.
Ok @w4nderlust finally got to this. Let me know if you need any other changes!
@w4nderlust Ping
Hey @w4nderlust any updates here?
Checked it out, lgtm, will test it later this week and if everything works as expected will merge. Please give me some instructions for testing, like an example command or something like that ;)
Awesome! To test it, create a wandb account here: https://app.wandb.ai then run
@w4nderlust any updates here?
@w4nderlust I'm sure the holidays are hectic. Would love to get this in, do you have an ETA?
Thank you, sounds good! Keep in mind that the TF2 integration is not there yet; the test would be there to avoid problems when we do it.
I missed an injection point added in ludwig-ai#514
The comments should now be addressed. Quick questions:
|
Thank you!
Yes. Currently the tensorboard contains only the batch-level measures (which are only about the training set) and none of the epoch-level measures we display in the CLI. There is an open PR for that, #571, that will be merged soon.
That's fine. If other integrations need it we will add it, but if you don't need it we can do without for now.
It's fine to ignore them, in particular if they are warnings. I would say try to address as many as you can, but only do it if it makes sense. The static method complaint can be ignored for instance.
logging done entirely by syncing tensorboard data
Except for formatting issues everything should be working. Here is an example run: https://app.wandb.ai/borisd13/ludwig_mnist/runs/qzu4b5qq?workspace=user-borisd13 Whenever the other metrics are added to tensorboard they should automatically sync to W&B as well. Let me know if I need to change anything!
                   resume, output_directory):
        import wandb
        logger.info("wandb.train_init() called...")
        wandb.init(project=os.getenv("WANDB_PROJECT", experiment_name),
I would add name=model_name to the init parameters.
Good idea, I was not aware of what model_name was used for initially but now it makes sense to use it as the W&B run name
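As a rough sketch of the suggestion above, the arguments for `wandb.init` can be assembled so that the project falls back to the experiment name and the run is named after the model. The helper `wandb_init_kwargs` is hypothetical and exists only to keep the example runnable without a W&B account.

```python
import os


def wandb_init_kwargs(experiment_name, model_name):
    """Build keyword arguments for wandb.init (hypothetical helper)."""
    return {
        # WANDB_PROJECT, if set, overrides the Ludwig experiment name.
        "project": os.getenv("WANDB_PROJECT", experiment_name),
        # Use the model name as the run name instead of an auto-generated one.
        "name": model_name,
    }
```

Assuming `wandb` is installed, the call would then be `wandb.init(**wandb_init_kwargs(experiment_name, model_name))`.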
@borisdayma tested it and everything looks fine. Added just one little comment about the model name, as at the moment one gets names that are auto-generated by wandb, I believe, like "dandy-donkey-2". Once that is done, I'll merge.
I added it and the runs are now named per model_name, see example: Once PR #571 is merged, the visualization will be more useful as validation metrics should also be logged.
@borisdayma thank you, will merge that one first, test again, and then merge this. I don't expect any problem to happen, but just want to be sure.
Thanks a lot for your work and your patience. Sorry if it took me longer than I expected to get to this, but in the end we merged it :)
Thanks, I'll be playing with it!
Thanks for that great integration guys! Is it now fully working?
Yes!
This is an initial pass at a basic integration of wandb with Ludwig. It currently mirrors tensorboard events, stores hyperparameters, syncs artifacts, and stores evaluation results.
I'm using the python "black" code formatter which seems to be clashing with some of the existing code. Is there a preferred formatter to use?