Add densenet169, densenet201 models #288
Conversation
Also, @BradLarson, could you add a few image classification model checkpoints to the bucket you made for models? Most of them are tar files containing the …
This looks great!
public struct Conv: Layer {
    public var batchNorm: BatchNorm<Float>
Consider renaming this to BatchNormConv2DLayer. It is a little confusing right now.
Rick has a good point that Conv probably isn't descriptive enough a name for a convolution + batchnorm layer. In the ResNet implementation, we refer to this as ConvBN. Would it even be possible to reuse the ResNet implementation of this and/or extract it as a common implementation of this subunit?
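For reference, here is a minimal sketch of what such a shared subunit could look like, assuming the ConvBN name suggested above and the pre-activation (batchnorm -> relu -> conv) ordering used in this diff; the initializer parameters and defaults are illustrative, not the actual ResNet code:

import TensorFlow

// Sketch of a reusable batchnorm + relu + conv subunit; names and defaults are assumptions.
public struct ConvBN: Layer {
    public var batchNorm: BatchNorm<Float>
    public var conv: Conv2D<Float>

    public init(
        filterShape: (Int, Int, Int, Int),
        strides: (Int, Int) = (1, 1),
        padding: Padding = .same
    ) {
        // filterShape.2 is the input channel count, which is what the
        // pre-activation batchnorm normalizes.
        self.batchNorm = BatchNorm(featureCount: filterShape.2)
        self.conv = Conv2D(filterShape: filterShape, strides: strides, padding: padding)
    }

    @differentiable
    public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        conv(relu(batchNorm(input)))
    }
}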
Seconded. Users would get a sense of consistency.
@BradLarson I'm working on a general reduction of the implementations and was hoping to add that in a follow-up PR. This PR was mainly to support the rest of the DenseNet models. I will rename it to ConvBN.
public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
    conv(relu(batchNorm(input)))
}
Is there a reason you use chaining over sequenced(through:) here?
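For comparison, here is a hypothetical block showing both styles; the type, field names, and shapes below are illustrative only, not taken from this PR:

import TensorFlow

// Hypothetical layer used only to contrast the two call styles discussed above.
struct ExampleBlock: Layer {
    var norm = BatchNorm<Float>(featureCount: 64)
    var conv = Conv2D<Float>(filterShape: (3, 3, 64, 32), padding: .same)

    // Style used in this diff: explicit nesting, which makes it easy to slip the
    // free function relu(_:) between the two layers.
    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        conv(relu(norm(input)))
    }

    // sequenced(through:) reads left to right, but each stage must itself be a
    // Layer, so the relu would have to move into one of the layers (for example
    // as the Conv2D's activation) or into a wrapper layer.
    @differentiable
    func sequencedForward(_ input: Tensor<Float>) -> Tensor<Float> {
        input.sequenced(through: norm, conv)
    }
}

One plausible reason for the chaining in the diff is exactly that relu is a free function rather than a Layer, which sequenced(through:) does not accept directly.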
    strides: (2, 2),
    padding: .same
)
public var denseBlock1 = DenseBlock(repetitionCount: 6, inputFilterCount: 64)
Rather than having three discrete implementations of DenseNets, would it be possible to have one general model and three sets of block sizes that could be used to generate the three variants, in the same way as is done with the current ResNet implementation?
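To make that suggestion concrete, here is a rough sketch of that shape. The DenseBlock stub below only mirrors the initializer visible in this diff so the example stands on its own; the GeneralDenseNet name, its parameters, and the omission of transition layers and the stem are all simplifications:

import TensorFlow

// Stand-in for the PR's DenseBlock: the real block concatenates repetitionCount
// dense layers, each adding growthRate feature maps, so the output grows to
// inputFilterCount + repetitionCount * growthRate channels.
public struct DenseBlock: Layer {
    public var conv: Conv2D<Float>

    public init(repetitionCount: Int, inputFilterCount: Int, growthRate: Int = 32) {
        conv = Conv2D(
            filterShape: (3, 3, inputFilterCount,
                          inputFilterCount + repetitionCount * growthRate),
            padding: .same)
    }

    @differentiable
    public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        conv(input)
    }
}

// One model parameterized by per-block repetition counts, instead of three structs.
public struct GeneralDenseNet: Layer {
    public var denseBlocks: [DenseBlock] = []

    public init(repetitionCounts: [Int], growthRate: Int = 32) {
        var filterCount = 64
        for repetitions in repetitionCounts {
            denseBlocks.append(DenseBlock(
                repetitionCount: repetitions,
                inputFilterCount: filterCount,
                growthRate: growthRate))
            // Track how many feature maps the next block will receive.
            filterCount += repetitions * growthRate
        }
    }

    @differentiable
    public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        denseBlocks.differentiableReduce(input) { $1($0) }
    }
}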
That would be a nice simplification, and an enum would probably work well for selecting the variant.
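Sketched out, that enum might look something like the following; the case names and the 121/169/201 repetition counts follow the DenseNet paper, and this is only an illustration of the suggested shape, not the final API:

// Each case maps a DenseNet variant to its per-block repetition counts.
public enum DenseNetVariant {
    case denseNet121
    case denseNet169
    case denseNet201

    public var repetitionCounts: [Int] {
        switch self {
        case .denseNet121: return [6, 12, 24, 16]
        case .denseNet169: return [6, 12, 32, 32]
        case .denseNet201: return [6, 12, 48, 32]
        }
    }
}

// Hypothetical usage with a generalized model such as the sketch above:
// let model = GeneralDenseNet(repetitionCounts: DenseNetVariant.denseNet169.repetitionCounts)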
Just checking in, because it's been a little while. Are you still interested in bringing these models in? I still like the idea of generalizing the three variants into a single model, like we did with the ResNets. I think that would organize this model nicely. Additionally, the minor renaming that was suggested still seems to make sense. If those couple of items were addressed, we'd still be glad to bring this in.
@BradLarson alright, I'll push the generalized versions soon.
Reminder to add generalized versions.
Hi, there hasn't been activity on this PR for a while. We're going to close it to keep our list of open PRs small. Of course, feel free to reopen any time if you have time to come back and work on this!
Initial addition of DenseNet models. If anyone wants to take it up, the DenseNet models can be abstracted into neater code; I'll do it in a follow-up PR later this week.