
Implementation of Inceptionv4, InceptionResNetv2 and Xception #170

Merged: 12 commits from the inception-plus branch merged into FluxML:master on Jun 19, 2022

Conversation

@theabhirath (Member) commented on Jun 17, 2022:

This PR is my first official contribution as a GSoC student 😁. It does a couple of things:

  1. It adds new models (Inceptionv4 and InceptionResNetv2). Internal docs for the layers of these models are left out because they are repetitive and don't really explain much - but if that's a concern I can always fill them in.
  2. It deprecates Inception3 in favour of Inceptionv3, since the vX notation is already used for models in other libraries (as well as in Metalhead itself for MobileNetv3), so it makes sense to be explicit about the version. A sketch of such a deprecation is shown after this list.
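
For illustration, here is a minimal sketch of how a rename like this could be kept backwards-compatible using Base's @deprecate macro. This is an assumed form, not necessarily the exact shim used in the PR:

# Hedged sketch: forward the old constructor name to the new one and emit a
# deprecation warning whenever the old name is called.
@deprecate Inception3(args...; kwargs...) Inceptionv3(args...; kwargs...)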

Other things I happened to fix along the way:

  1. cat_channels is now type-stable, since it uses Val(3) for dims (thanks to @ToucheSir for pointing that out in some conversations); see the sketch after this list.
  2. Catches some formatting issues missed in Hotfix for ViT on GPU #169.
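
As a rough sketch of the type-stability point in item 1 (assumed definition; the actual code lives in Metalhead): passing the concatenation dimension as Val(3) makes it a compile-time constant, whereas a plain runtime integer does not.

# Concatenate feature maps along the channel dimension (dim 3 of a WHCN array).
# With Val(3) the dimension is known to the compiler, so `cat` infers a concrete
# return type; a plain `dims = 3` would leave it as a runtime value.
cat_channels(xs...) = cat(xs...; dims = Val(3))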

Notes:

  1. TTFG (time to first gradient) for the Inception models is off the charts. In particular, InceptionResNetv2 is absolutely insane, sometimes taking minutes. Subsequent gradients are quite slow as well (in comparison, ViTs, which are heavier models, seem to be faster) and also take a lot of memory. This might be helped by using Chain(::Vector), but I haven't tried that yet because I'm not sure whether it causes stability issues while training; a sketch of the idea follows this list.
  2. Closes (supersedes) Implementation of Inceptionv4 #131. It should also close Implementation of Inception v4 and Inception resnet v2 #129, since that one hasn't seen activity for quite some time.
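
For reference, a minimal sketch of the Chain(::Vector) idea from note 1 (illustrative layers only, not this PR's models): building a Chain from a Vector stores the layers without specializing on each one, which trades some runtime speed for much less compilation work than the usual Tuple-backed Chain.

using Flux

# Illustrative layers only; any sequence of Flux layers would do.
layers = [Conv((3, 3), 3 => 16, relu; pad = 1),
          MaxPool((2, 2)),
          Conv((3, 3), 16 => 32, relu; pad = 1)]

m_tuple  = Chain(layers...)  # layers stored as a Tuple: fully specialized, long compile times
m_vector = Chain(layers)     # layers stored as a Vector: less specialization, faster TTFG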

Edit: added the Xception model as well

Commits added:
- Transition to `Inceptionv3` instead of `Inception3` for standardisation of names
- Formatting
- Loosen type constraints on `inputscale`
- Updated README
@darsnack (Member) left a review comment:


Looks nearly ready right off the bat, nice job! Just a couple initial comments.

@theabhirath (Member, Author):

CI times are through the roof... we might have to disable gradtests again (or maybe enable them only for select models).

@theabhirath (Member, Author):


...how on earth are the CI times for the second run so much lower than the first?

@ToucheSir (Member):

As I understand it, GitHub Actions runners are containerized but not otherwise isolated, so times can vary drastically depending on what else is running on the host.

@theabhirath theabhirath changed the title Implementation of Inceptionv4 and InceptionResNetv2 Implementation of Inceptionv4, InceptionResNetv2 and Xception Jun 18, 2022
@theabhirath (Member, Author) commented on Jun 18, 2022:

Xception seems to have an even worse gradient benchmark:

julia> using Metalhead, Zygote, BenchmarkTools

julia> x = rand(Float32, 299, 299, 3, 2);

julia> model = Xception();

julia> @benchmark Zygote.gradient(p -> sum($model(p)), $x)
BenchmarkTools.Trial: 1 sample with 1 evaluation.
 Single result which took 9.255 s (78.29% GC) to evaluate,
 with a memory estimate of 10.65 GiB, over 1553316 allocations.

Commits added:
- Support `pretrain` for the Inception model APIs
- Group deprecations in a single source file to keep things more organised
- Random formatting nitpicks
- Use a plain broadcast instead of `applyactivation`
@theabhirath (Member, Author):

This should be ready now, although I'm not sure what exactly to make of the gradient times. I suppose they shouldn't block this PR, but is there anything that might be contributing to them that I could change?

@theabhirath (Member, Author):

The memory problems return 😬

@darsnack (Member):

Not sure there's much we can do beyond what's already been tried in Metalhead. We just need to do something about applychain in Flux/Zygote.

@darsnack (Member) left a review comment:


Only some minor doc stuff

push!(layers, x -> relu.(x))
append!(layers,
        depthwise_sep_conv_bn((3, 3), inc, outc; pad = 1, bias = false,
                              use_bn = (false, false)))
Review comment (Member):

Is there a use case for (true, false) or (false, true)? If not, I was thinking it is kinda silly to use conv_bn with the keyword use_bn = false, because that's just a Conv. It makes sense for consistency with depthwise_sep_conv_bn, but it might be cleaner to just introduce a depthwise_sep_conv instead of adding more keywords in multiple places?

Reply (Member Author):

I can't say I see one yet, but I think this can be kept for some time... with all the churn over the next two months it might come in handy. If not, I'll simplify it later.
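
For context on the suggestion above, a depthwise-separable convolution without batchnorm reduces to two plain Conv layers. A minimal sketch of a hypothetical depthwise_sep_conv helper (not Metalhead's actual API) could look like:

using Flux

# Hypothetical helper: a depthwise Conv (groups = number of input channels)
# followed by a 1x1 pointwise Conv, with no batchnorm in between.
depthwise_sep_conv(kernel, inch, outch; pad = 0, stride = 1) =
    Chain(Conv(kernel, inch => inch; groups = inch, pad, stride, bias = false),
          Conv((1, 1), inch => outch; bias = false))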

@darsnack darsnack mentioned this pull request Jun 18, 2022
theabhirath and others added 2 commits June 18, 2022 22:56
Co-authored-by: Kyle Daruwalla <daruwalla.k.public@icloud.com>
Co-authored-by: Kyle Daruwalla <daruwalla.k.public@icloud.com>
@theabhirath (Member, Author):

Ah sorry, I keep missing these minor doc tweaks 🤦🏽 I'll be a bit more thorough next time

@darsnack (Member) left a review comment:


No problem! Thanks and nice job!

@darsnack darsnack merged commit 93ce7e5 into FluxML:master Jun 19, 2022
@theabhirath theabhirath deleted the inception-plus branch June 19, 2022 03:27
@darsnack darsnack mentioned this pull request Jun 19, 2022
@ToucheSir ToucheSir mentioned this pull request Jun 19, 2022
@theabhirath theabhirath added the new-model (Request or implementation of a new model) label on Aug 29, 2022