
Miscellaneous fixes for MobileNet #179

Merged: 3 commits into FluxML:master from mobilenet-fixes on Jun 26, 2022

Conversation

@theabhirath (Member)

Closes #178. Also removes an extra fully connected layer in MobileNetv1, making it conform to the configuration described in the paper. Some minor docs and formatting fixes I caught along the way are also included.
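
For context, here is a minimal Flux sketch of the classifier-head change (the sizes are illustrative, and `relu` stands in for whatever activation the original code used):

```julia
using Flux

inchannels, nclasses = 1024, 1000  # illustrative MobileNetv1 head sizes

# Before: an extra fully connected layer not described in the paper
head_old = Chain(Dense(inchannels, 1024, relu),
                 Dense(1024, nclasses))

# After: a single classifier layer, matching the paper's configuration
head_new = Dense(inchannels, nclasses)
```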

@theabhirath (Member, Author) commented Jun 24, 2022

This is also another step towards #176, as all the MobileNet models now expose `inchannels` and `nclasses`.
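
As a usage sketch (the keyword names follow this PR; positional arguments and defaults may differ between Metalhead versions):

```julia
using Metalhead

# Construct a MobileNet variant for 1-channel inputs and 10 output classes
model = MobileNetv1(; inchannels = 1, nclasses = 10)
```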

```diff
@@ -9,7 +9,7 @@ Creates a ConvMixer model.

   - `planes`: number of planes in the output of each block
   - `depth`: number of layers
- - `inchannels`: number of channels in the input
+ - `inchannels`: The number of channels in the input. The default value is 3.
```
Member

Is it necessary to specify this? It's already in the docstring above, and it seems inconsistent with the other bullets (either re-specify all + capitalize all or leave it as is).

Member Author

I'm working on making this happen for all the docstrings in the new PRs (that is, having the default values in there). I need to push a larger documentation change sometime in the next few weeks, and that should probably cover everything.
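
For illustration, a hypothetical docstring fragment showing that convention (the wording and defaults here are placeholders, not the final format):

```julia
"""
    ConvMixer(...)

# Arguments

  - `inchannels`: number of channels in the input. Default: 3.
  - `nclasses`: number of output classes. Default: 1000.
"""
function convmixer end
```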

Comment on lines 47 to 49:

```diff
- Dense(inchannels, fcsize, activation),
- Dense(fcsize, nclasses)))
+ Dense(inchannels, nclasses)))
```
Member

This is technically a bug fix, but it's also slightly breaking. Maybe we should make a breaking release just to err on the side of caution? Deciding what to do about structural changes to the models here is tricky. cc @ToucheSir for opinions

@theabhirath (Member, Author) commented Jun 24, 2022

I feel like we are going to be making a huge breaking release anyway sometime after my GSoC work lands, so this just gets added to the list.

Member

Do/should we provide any guarantees about model structure across releases? I believe the only people affected would be those who've already trained models, and thus far I've not heard of any for MobileNet.

Member

Or (user) code that relies on the structure but doesn't use Functors.jl. I think we previously decided to consider structural changes breaking. But the more nuanced question here is whether a structural change that fixes an error counts as a bug fix.

I think no one is likely to complain here, given how recent the model definitions are.
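
For example (a hypothetical sketch of the kind of user code in question), anything that indexes into a `Chain` by position silently changes meaning when a layer is removed:

```julia
using Flux

model = Chain(Dense(4, 8, relu), Dense(8, 8, relu), Dense(8, 2))

# Hard-coded structural assumptions like these break if a layer is
# added or removed, even though the model itself still runs:
penultimate = model[end - 1]       # assumes a fixed number of layers
features(x) = model[1:end - 1](x)  # feature extractor tied to the structure
```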

@maxfreu commented Jun 24, 2022

If the slightly breaking changes hold this up too much, it would be cool to factor the padding change out into a separate commit so that I can use it.
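
For reference, a minimal sketch of the padding fix the linked issue asks for, assuming the first layer is a stride-2 3×3 convolution (actual Metalhead internals may differ):

```julia
using Flux

# Without padding, a stride-2 3x3 conv on a 224x224 input yields 111x111:
conv_nopad = Conv((3, 3), 3 => 32; stride = 2)
size(conv_nopad(rand(Float32, 224, 224, 3, 1)))  # (111, 111, 32, 1)

# With pad = 1 (equivalent to pad = SamePad() here), it halves to 112x112:
conv_pad = Conv((3, 3), 3 => 32; stride = 2, pad = 1)
size(conv_pad(rand(Float32, 224, 224, 3, 1)))  # (112, 112, 32, 1)
```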

@darsnack (Member)

Yes, this is the main reason I brought it up. I want to make a patch release with the recent changes before merging the breaking overhaul.

@theabhirath (Member, Author)

The breaking change has been undone for now. The rest, I think, should fit into the patch release without a problem.

@darsnack (Member)

Can you rebase with the test changes from #171?

theabhirath and others added 2 commits on June 25, 2022 (co-authored by Kyle Daruwalla <daruwalla.k.public@icloud.com>).
@theabhirath (Member, Author)

....wait. How did the CI go from one check passing to all checks passing?

@ToucheSir (Member)

Something to do with #171 disabling tests for some of the largest models?

@theabhirath (Member, Author) commented Jun 26, 2022

Oh, possibly 🤦🏽 I feel like it's 50-50 at this point whether the tests pass or not (the funny thing is, one test in #171 actually failed 😂). Although I do hope they keep passing....

@ToucheSir (Member)

Looks like the failure was a timeout instead of an OOM, so hopefully it was an anomaly!

@theabhirath (Member, Author)

Should be ready for merge and release now 😄

@ToucheSir merged commit b37bee7 into FluxML:master on Jun 26, 2022.
@theabhirath deleted the mobilenet-fixes branch on June 26, 2022 at 15:56.
@maxfreu commented Jun 27, 2022

Thanks!

Merging this pull request closed: MobileNetv2/3 need padding in first layer (#178)