
Conversation

@jakubczakon
Contributor

Description:

  • Added NeptuneLogger that lets you log experiment metadata to neptune.ai
  • Logging is implemented via handlers for Output, Grads (scalars), Optimizer Params, and Model Weights (scalars), plus logging of the best model checkpoints to the server (see the usage sketch below).
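
For illustration, here is a minimal sketch of how such a logger would be attached to an ignite trainer. The handler names mirror ignite's other contrib loggers, and the constructor arguments (api_token, project_name) are assumptions about the final API rather than its definitive form:

```python
# Minimal sketch -- handler names and constructor arguments are assumptions
# based on ignite's other contrib loggers, not the final API of this PR.
import torch
from ignite.engine import Events, create_supervised_trainer
from ignite.contrib.handlers.neptune_logger import (
    NeptuneLogger,
    OutputHandler,
    OptimizerParamsHandler,
    WeightsScalarHandler,
    GradsScalarHandler,
)

# Tiny stand-in model/optimizer/trainer so the sketch is self-contained.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
trainer = create_supervised_trainer(model, optimizer, torch.nn.MSELoss())

npt_logger = NeptuneLogger(
    api_token="ANONYMOUS",                         # assumption: anonymous/sandbox mode
    project_name="shared/pytorch-ignite-integration",
)

# Log the training loss after every iteration.
npt_logger.attach(
    trainer,
    log_handler=OutputHandler(tag="training", output_transform=lambda loss: {"loss": loss}),
    event_name=Events.ITERATION_COMPLETED,
)

# Log learning rate, weight norms and gradient norms after every epoch.
npt_logger.attach(trainer, log_handler=OptimizerParamsHandler(optimizer), event_name=Events.EPOCH_COMPLETED)
npt_logger.attach(trainer, log_handler=WeightsScalarHandler(model), event_name=Events.EPOCH_COMPLETED)
npt_logger.attach(trainer, log_handler=GradsScalarHandler(model), event_name=Events.EPOCH_COMPLETED)
```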

Check list:

  • New tests are added (if a new feature is added)
  • New doc strings: description and/or example code are in RST format
  • Documentation is updated (if required)

@vfdev-5
Collaborator

vfdev-5 commented Jan 26, 2020

@jakubczakon thanks a lot for the PR. Looks good!

A question on the link provided in the docs:
https://ui.neptune.ai/o/neptune-ai/org/pytorch-ignite-integration/e/PYTOR-20/charts
Should it work as is, or do we need to do something first? When I open it I get

Something went wrong :(
We're fixing the problem so please try again in a few minutes.

on your site.

About ModelCheckpointHandler: I agree with this implementation, we haven't settled on that yet; however, I thought about reusing the ignite.handlers.Checkpoint class with a specific save_handler like DiskSaver.

@jakubczakon
Contributor Author

I fixed that, @vfdev-5.
I had made the project private by accident.

@jakubczakon
Contributor Author

jakubczakon commented Jan 27, 2020

Thanks for pointing that out @vfdev-5 .

I initially started with ModelCheckpointHandler and wrote my own NeptuneServerSaver handler.
The problem was that ModelCheckpointHandler would save all the checkpoints to the server.
As of now, we don't have an option to remove things from the artifacts section (only to overwrite them).

On second thought, I can talk to the team and see whether adding this is actually a problem for them.
Wouldn't I need to override the ModelCheckpoint class anyway to change the os.listdir logic and such?
I could extend the Checkpoint class instead, though.

@vfdev-5
Collaborator

vfdev-5 commented Jan 27, 2020

@jakubczakon I see. Yes, a similar situation exists with MLflow, for example. A solution could be to log to a temp folder during training and, after training, send the best n models to the server (a rough sketch follows)...
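
For concreteness, a rough sketch of that temp-folder idea, assuming a standard Checkpoint + DiskSaver setup; `send_to_server` is a hypothetical placeholder for the actual Neptune upload call, and `trainer`/`model` are assumed to be defined as usual:

```python
# Rough sketch of "checkpoint locally, upload after training".
# `send_to_server` is a hypothetical placeholder for the Neptune upload call;
# `trainer` and `model` are assumed to exist (an ignite Engine and a torch module).
import glob
import os
import tempfile

from ignite.engine import Events
from ignite.handlers import Checkpoint, DiskSaver

tmp_dir = tempfile.mkdtemp()

# Keep only the n best checkpoints on local disk while training runs.
checkpoint_handler = Checkpoint(
    {"model": model},
    DiskSaver(tmp_dir),
    n_saved=2,
    score_function=lambda engine: -engine.state.output,  # assumption: scalar loss, lower is better
    score_name="neg_loss",
)
trainer.add_event_handler(Events.EPOCH_COMPLETED, checkpoint_handler)


@trainer.on(Events.COMPLETED)
def upload_best_checkpoints(engine):
    # After training, push whatever survived the n_saved pruning to the server.
    for path in glob.glob(os.path.join(tmp_dir, "*.pt*")):
        send_to_server(path)
```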

Wouldn't I need to override the ModelCheckpoint class anyway to change the os.listdir logic and such? I could extend the Checkpoint class instead, though.

I would suggest using Checkpoint directly with a specific save_handler which sends (and optionally removes) the best models to the server.
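
A rough sketch of what such a save_handler could look like. Only the (checkpoint, filename) calling convention and the optional remove() hook are taken from Checkpoint; the NeptuneServerSaver name and the upload/remove calls on the experiment object are hypothetical placeholders, not real Neptune client methods:

```python
# Sketch of a server-backed save_handler for ignite.handlers.Checkpoint.
# `experiment.upload_artifact` / `experiment.remove_artifact` are placeholders.
import os
import tempfile

import torch
from ignite.engine import Events
from ignite.handlers import Checkpoint


class NeptuneServerSaver:
    """Sends serialized checkpoints to the server instead of keeping them on disk."""

    def __init__(self, experiment):
        self.experiment = experiment  # experiment-like object with upload/remove (assumed)

    def __call__(self, checkpoint, filename):
        # Serialize the checkpoint dict to a temporary file, upload it, clean up.
        fd, tmp_path = tempfile.mkstemp(suffix=".pt")
        os.close(fd)
        torch.save(checkpoint, tmp_path)
        self.experiment.upload_artifact(tmp_path, filename)  # placeholder upload call
        os.remove(tmp_path)

    def remove(self, filename):
        # Intended for Checkpoint's n_saved pruning; needs server-side removal support.
        self.experiment.remove_artifact(filename)  # placeholder remove call


# `model`, `evaluator` and `npt_logger` are assumed to be defined elsewhere.
best_model_saver = Checkpoint(
    {"model": model},
    NeptuneServerSaver(npt_logger.experiment),   # assumption: the logger exposes its experiment
    n_saved=2,
    score_function=lambda engine: engine.state.metrics["accuracy"],
    score_name="accuracy",
)
evaluator.add_event_handler(Events.COMPLETED, best_model_saver)
```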

@jakubczakon
Contributor Author

I would suggest using Checkpoint directly with a specific save_handler which sends (and optionally removes) the best models to the server.

I will talk to our folks and come back on that, as I think it is the cleaner solution.

Side note: I really like the handlers-and-events system.
It gives you a lot of freedom to choose when to do what.
It takes a moment to see how it differs from callbacks, but it is clear to me now :)

@vfdev-5
Collaborator

vfdev-5 commented Jan 27, 2020

To merge this PR, I propose removing ModelCheckpointHandler for now. We can add something back later once we settle on the send/remove or MLflow-like solution...

@jakubczakon
Contributor Author

jakubczakon commented Jan 27, 2020

It is possible that the neptune.remove_artifact() functionality can be implemented quite quickly on our side, so I will come back today with info on that.
If it's not going to be done quickly, I will remove ModelCheckpointHandler.
Is that OK?

@vfdev-5
Collaborator

vfdev-5 commented Jan 27, 2020

@jakubczakon perfect! We can wait for the feedback from your side and do as you proposed.

@jakubczakon
Contributor Author

jakubczakon commented Jan 27, 2020

OK, I spoke to my team and we should have that in the next few days, but I am not sure about the ETA. I think it is a better idea to drop ModelCheckpointHandler from this PR and open another PR once we have that feature implemented.
Fair, @vfdev-5?

@vfdev-5
Collaborator

vfdev-5 commented Jan 27, 2020

@jakubczakon sounds good, let's do it like that!

@vfdev-5
Collaborator

vfdev-5 commented Jan 27, 2020

@jakubczakon yes, I see. Could you please just drop `from __future__ import print_function`? Then we are OK and I can merge it.

@jakubczakon
Contributor Author

Sorry @vfdev-5, I forgot that. It's on the way.

@vfdev-5
Collaborator

LGTM! @jakubczakon thanks!

vfdev-5 merged commit feff57f into pytorch:master on Jan 27, 2020
@jakubczakon
Contributor Author

Perfect, thank you @vfdev-5!
