
Add standard vision models #1

Open · 13 of 16 tasks · darsnack opened this issue Jun 9, 2020 · 17 comments
@darsnack (Owner) commented Jun 9, 2020

Add standard vision models that replicate those in torchvision: https://pytorch.org/docs/stable/torchvision/models.html

We want Flux implementations of the architectures. These can be untrained models for now; we can open another issue for providing pre-trained weights. The models should be input-size agnostic so that they work with both CIFAR and ImageNet. PyTorch achieves this with an adaptive pooling layer (`AdaptiveAvgPool2d`) between the feature extractor and the fully-connected layers. Supporting similar functionality might require a PR to Flux.jl.
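For illustration, a minimal sketch of that pattern in Flux, assuming an adaptive pooling layer with PyTorch-like semantics (Flux later gained `AdaptiveMeanPool` in FluxML/Flux.jl#1239, referenced later in this thread); the layer sizes here are arbitrary placeholders, not a proposed architecture:

```julia
using Flux

# Toy feature extractor; the real model definitions would go here.
features = Chain(
    Conv((3, 3), 3 => 64, relu; pad = 1),
    MaxPool((2, 2)),
    Conv((3, 3), 64 => 128, relu; pad = 1),
    MaxPool((2, 2)))

model = Chain(
    features,
    AdaptiveMeanPool((7, 7)),        # always emits a 7×7 spatial grid
    x -> reshape(x, :, size(x, 4)),  # flatten to (features, batch)
    Dense(128 * 7 * 7, 10))

# The same model accepts CIFAR-sized and ImageNet-sized inputs:
model(rand(Float32, 32, 32, 3, 1));
model(rand(Float32, 224, 224, 3, 1));
```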

List of models:

  • VGG
    • VGG11 (w/ and w/o batch norm)
    • VGG13 (w/ and w/o batch norm)
    • VGG16 (w/ and w/o batch norm)
    • VGG19 (w/ and w/o batch norm)
  • ResNet
    • ResNet18
    • ResNet34
    • ResNet50
    • ResNet101
    • ResNet152
  • AlexNet
  • InceptionNet
    • GoogLeNet
    • Inception v3
    • Inception v4
@darsnack (Owner, Author) commented Jun 9, 2020

@dnabanita7

@dnabanita7 (Contributor)

I will be starting with AlexNet.

@avik-pal

FYI, I made an old PR in FluxML/Metalhead.jl#17 which added most of these models (there have been some changes in Flux since, but it should still work with minor changes). We never merged the PR due to some CI test failures and a lack of trained weights. So it might be a helpful reference.

@DhairyaLGandhi

Agreed, we should not replicate work. The ResNet in Metalhead is also generic, but it only exposes the ResNet50 model. A PR that adds a kwarg and constructs these variants appropriately would get us most of the way there, not to mention it would be easy to integrate with the rest of the CI.
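As a sketch of that kwarg idea (`RESNET_CONFIGS` and the builder `resnet_layers` are hypothetical names standing in for Metalhead's generic ResNet internals):

```julia
# Map each standard depth to its per-stage block counts and block type.
const RESNET_CONFIGS = Dict(
    18  => ([2, 2, 2, 2], :basic),
    34  => ([3, 4, 6, 3], :basic),
    50  => ([3, 4, 6, 3], :bottleneck),
    101 => ([3, 4, 23, 3], :bottleneck),
    152 => ([3, 8, 36, 3], :bottleneck))

function ResNet(; depth = 50, nclasses = 1000)
    haskey(RESNET_CONFIGS, depth) ||
        throw(ArgumentError("no standard ResNet with depth $depth"))
    blocks, block_type = RESNET_CONFIGS[depth]
    return resnet_layers(blocks, block_type, nclasses)  # hypothetical builder
end
```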

@darsnack (Owner, Author)

Though can I suggest not wrapping the models in structs like Metalhead does? That adds the complexity of accessing the "actual" model through m.layers, and I don't think the type wrapper is useful for dispatch (if someone wanted to dispatch on ::AlexNet, it would be trivial to wrap the model in a type themselves). We should still copy the source code from Metalhead to avoid rewriting from scratch.
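For example, a minimal sketch of such a user-side wrapper (`alexnet()` is a hypothetical constructor returning a plain `Chain`):

```julia
using Flux

struct AlexNet{T}
    layers::T
end
(m::AlexNet)(x) = m.layers(x)  # forward calls to the wrapped Chain
Flux.@functor AlexNet          # make the parameters visible to Flux

model = AlexNet(alexnet())     # alexnet() returns a plain Chain
```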

I think FluxModels.jl should be really terse: minimal constructions of standard models, to allow reuse across codebases. I recall instances on Slack etc. where folks were told to copy-paste the model definition from Metalhead because they didn't want all the extra functionality that Metalhead provides. So I think there is value in having a standard package of just functions that construct standard models. Later down the line, we can have a PR to Metalhead that uses FluxModels.jl to construct the inner m.layers.

@DhairyaLGandhi commented Jun 13, 2020 via email

@darsnack (Owner, Author)

> Can we actually first work on a list of available models in Flux?

Yeah, I agree this is a good idea.

> I would prefer to have instances of training these models be in the zoo.

We can do that if that's preferred. We haven't started training any models anywhere yet.

I think this would be good to discuss during the next call. I'm not suggesting creating a lot of packages. I would say there should be one package (e.g. FluxModels.jl, though it could be Metalhead.jl) that is the source of standard models. Based on the current design, I would not suggest Metalhead.jl as the public-facing package. At least I, as a Flux user, would prefer an alternative.

Metalhead does a lot: datasets, data loading, the classify function, and image display, all in addition to providing the models. That's not really an extensible design. All those other aspects should be provided elsewhere, with more than standard vision models in mind. Personally, I think it makes more sense to have a single package that acts as the one place for loading models (not just vision models, but other domains as well).

@AriMKatz commented Jun 14, 2020

Agree totally with @darsnack. Further, I think there should be a general framework for incorporating and fine-tuning trained models as just regular layers, not simply for inference.

That can be composed with domain specific variants of dataloaders and other components. Again, taking inspiration from fast.ai here.
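A minimal sketch of that "trained model as a regular layer" idea in Flux, assuming `backbone` stands in for a pretrained feature extractor (a toy one here) and using the standard Zygote-era training calls:

```julia
using Flux

backbone = Chain(Conv((3, 3), 3 => 16, relu; pad = 1),
                 AdaptiveMeanPool((1, 1)),
                 x -> reshape(x, :, size(x, 4)))
head = Dense(16, 5)              # fresh task-specific classifier
model = Chain(backbone, head)

ps = Flux.params(head)           # fine-tune: only the head's parameters
opt = Descent(0.01)
x = rand(Float32, 32, 32, 3, 4)
y = Flux.onehotbatch(rand(1:5, 4), 1:5)
gs = gradient(() -> Flux.logitcrossentropy(model(x), y), ps)
Flux.Optimise.update!(opt, ps, gs)
```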

@DhairyaLGandhi

If the issue is with having the models wrapped in a custom type, I can understand. FWIW, since moving to Zygote we can basically treat the model as a layer already and should be able to train it. If there are errors in that, we should consider it a bug.

Not against the core idea, just want to avoid redoing the work, especially if just moving the data providers et al. outside would get us most of the way there.

@darsnack (Owner, Author)

Yes, maybe this is related to FluxML/FluxML-Community-Call-Minutes#2 (comment).

Metalhead could be the vision model package referenced in that comment. We'd have to extract the other functionality out of the package. Though I think @dnabanita7's implementations are good, since they use adaptive pooling to allow the models to work on multiple datasets.

@dnabanita7 (Contributor)

Shall I continue working on other models?

@darsnack (Owner, Author)

Yes, let's not complicate the fellowship work for now. You can keep to the same plan, committing to this repo.

@a-r-n-o-l-d (Contributor)

Hello,

I have forked your package to add my VGG implementations and to share my code with you. Docs and unit tests are still missing, but a quick test on CIFAR10 looks OK (82% accuracy after 30 epochs for a VGG11 without batch norm).
I will likely write a small VGG.jl package in the near future, but in the meantime I hope it might be helpful for you.
There is a modified Conv layer that drops the bias when followed by a BatchNorm (the bias is redundant there; this workaround is not needed with Flux#master).
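For reference, a sketch of that pattern, assuming a Flux version whose `Conv` accepts `bias = false` (the BatchNorm shift β makes a convolution bias redundant):

```julia
using Flux

convbn(k, ch, σ = relu) = Chain(
    Conv(k, ch; pad = 1, bias = false),  # no bias; BatchNorm's β replaces it
    BatchNorm(last(ch), σ))

block = convbn((3, 3), 3 => 64)
block(rand(Float32, 32, 32, 3, 1))
```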

https://github.com/arnold-dev/FluxModels.jl

Arnold

@darsnack (Owner, Author)

Hi @arnold-dev, thanks for your work! If you make a PR to this repo, I'll be happy to merge it. There are some higher-level design issues that @dnabanita7 can address on top of your PR.

Just so you know, we currently have an MLH student fellow (@dnabanita7) working on this issue. She has already made significant progress on the other models. I normally wouldn't ask something like this, but we'd appreciate your hard work on other issues. I'm not mentioning this to discourage you from working on the ML/Flux ecosystem, but only to allow Nabanita to learn as much as she can from this experience.

@a-r-n-o-l-d (Contributor)

Hi @darsnack, I'm happy to contribute, and this kind of package could be very helpful for my research. I will make a PR, but I'm not very comfortable with GitHub.
I am currently working on some models from the U-Net family and on ResNet (in particular ResNet9, a small ResNet that may be interesting for benchmarking: https://lambdalabs.com/blog/resnet9-train-to-94-cifar10-accuracy-in-100-seconds/).
I plan to work this summer on transferring parameters from torchvision to trained Flux models.
By the way, I think this repo can clearly contribute to the Flux ecosystem, as the source code could be integrated into Metalhead, split into small packages (VGG.jl, ResNet.jl, MobileNet.jl, etc.), or kept as-is (as an equivalent of torchvision).

@darsnack (Owner, Author)

> By the way, I think this repo can clearly contribute to the Flux ecosystem, as the source code could be integrated into Metalhead, split into small packages (VGG.jl, ResNet.jl, MobileNet.jl, etc.), or kept as-is (as an equivalent of torchvision).

Yup, that’s the plan!

bors bot added a commit to FluxML/Flux.jl that referenced this issue Jun 30, 2020
1239: add adaptive pool r=CarloLucibello a=dnabanita7

I have added ``AdaptiveMaxPool`` and ``AdaptiveMeanPool`` so that we can mirror the [PyTorch implementation](darsnack/FluxModels.jl#1 (comment)). cc @darsnack

### PR 

- [x] Tests are added
- [x] Entry in NEWS.md
- [x] Documentation, if applicable
- [ ] Final review from `@MikeInnes` or `@dhairyagandhi96` (for API changes).

### Flux issue linking
[Flux#1224](#1224)
### MLH issue linking
[0.3.x-projects#26](https://github.com/MLH-Fellowship/0.3.x-projects/issues/26)

Co-authored-by: Nabanita Dash <dashnabanita@gmail.com>
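For reference, a minimal usage sketch of the layers added in that PR: the spatial output size is fixed regardless of the input size.

```julia
using Flux

pool = AdaptiveMeanPool((7, 7))
size(pool(rand(Float32, 32, 32, 3, 1)))    # (7, 7, 3, 1)
size(pool(rand(Float32, 224, 224, 3, 1)))  # (7, 7, 3, 1)

mpool = AdaptiveMaxPool((7, 7))
size(mpool(rand(Float32, 64, 48, 3, 1)))   # (7, 7, 3, 1)
```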
@DrChainsaw

@arnold-dev Sorry for hijacking, but did you have any luck replicating those ResNet9 DAWNBench experiments?

I wanted to use it for another experiment, but I couldn't get the same performance despite quite some effort to cross-check the implementations. I didn't need to match the absolute performance, so I gave up after about half a day of trying, but it would be interesting to know what I did wrong. I put the experiments in a notebook here.

I might have a Chain version of the ResNet somewhere in the revision history as well, although it is probably less effort to recreate it from scratch than to find it.
