[WIP] Resnet Module #61
Conversation
Commit history:
* test path change
* configured tests
* trying test fix
* trying to fix windows build
* removed unit testing from cmake
* Update windows-steps.yaml
* trying windows fix
* try fix, P.S. copied from mlpack
* anotehr try
* dir files display windows
* namespace except tests dir
* namespace added
* Catch2 (#6)
* Update windows-steps.yaml
* Update windows-steps.yaml
* copying dll and lib files to bin
* copy dll and lib to build/test
* copy dll and lib to build/test
* cleanup
* added exclusion for catch
* style check solve
* fix style check
* applied some suggestions
* added new line in main.cpp
* style check warnings
* updating with models/master (#7)
* trigger build

Co-authored-by: kartikdutt18 <39593019+kartikdutt18@users.noreply.github.com>

* removed mlpack::ann::models in favour of mlpack::models
* style checks
* style checks
* ctest tests add
* ctest parsing
* added catch.cmake
* build fix
* test name fix
* syntax error fix
* removed main test from ctest as that would run tests 2 times
* specifying CMAKE_INSTALL_PREFIX
* reverting from --list-test-name-only to --list-tests
* update cmake_install_prefix
* turn off mlpack debugging in models repo

Co-authored-by: kartikdutt18 <39593019+kartikdutt18@users.noreply.github.com>
The idea of the `Residual<>` layer is that the input of the block is added to the output of the layers you add to it:

```cpp
Residual<>* residual = new Residual<>(true);
Linear<>* linearA = new Linear<>(10, 10);
Linear<>* linearB = new Linear<>(10, 10);
residual->Add(linearA);
residual->Add(linearB);
```

There is also a test case: https://github.com/mlpack/mlpack/blob/83e70110595eaf3cf3758f270433801e673615b2/src/mlpack/tests/ann_layer_test.cpp#L3325
Hi, thanks @zoq, but the part that confused me is that the code checks whether the dimensions of the first layer are equal to those of the last layer. For ResNet there is a case where the input dimension of the first layer differs from that of the last one, so I need a 1x1 convolution block just for the first layer's input, which is not run like the other layers but separately, before adding its output to the last layer's output. How do you suggest I accomplish that? cc: @kartikdutt18
In this case you can use a combination of `AddMerge<>` and `Sequential<>`:

```cpp
AddMerge<> resblock(false, false);

Sequential<>* sequential = new Sequential<>(true);
Linear<>* linearA = new Linear<>(10, 10);
Linear<>* linearB = new Linear<>(10, 10);
sequential->Add(linearA);
sequential->Add(linearB);

Convolution<>* conv = new Convolution<>(...);

resblock.Add(sequential);
resblock.Add(conv);
```

Let me know if this is what you were looking for. Maybe it makes sense to implement that structure as an independent layer in mlpack.
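A minimal sketch of that structure for the downsampling case, assuming the mlpack 3.x layer API; every size below is a made-up placeholder, not a value from this PR:

```cpp
#include <mlpack/methods/ann/layer/layer.hpp>

using namespace mlpack::ann;

int main()
{
  // Residual block whose main path changes the channel count (16 -> 32),
  // so the shortcut needs a 1x1 convolution instead of the identity.
  // run = true: the block runs each child on the same input itself.
  AddMerge<> resblock(false, true);

  // Main path: two 3x3 convolutions on a hypothetical 28x28 input.
  Sequential<>* mainPath = new Sequential<>(true);
  mainPath->Add(new Convolution<>(16, 32, 3, 3, 1, 1, 1, 1, 28, 28));
  mainPath->Add(new Convolution<>(32, 32, 3, 3, 1, 1, 1, 1, 28, 28));

  // Shortcut path: a 1x1 convolution so both branches emit 32 channels.
  Convolution<>* shortcut =
      new Convolution<>(16, 32, 1, 1, 1, 1, 0, 0, 28, 28);

  // AddMerge sums the branch outputs, which is the residual connection.
  resblock.Add(mainPath);
  resblock.Add(shortcut);

  return 0;
}
```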
I am still stuck at this and don't exactly know what to do. It would be easy if we somehow had a way to define the flow of the network, but that is not how it is designed, and the main problem here is the downsampling block. I can put everything inside a residual block so that it saves the input of the first layer into a temporary variable and then tries to add it to the output of the last layer in the block, but when it does that it finds that the shapes don't match, and I don't see how I can use AddMerge to achieve the same flow. Do let me know if you see some other way around it; I have been thinking about it for way too long.
I will try to think of a solution for this and get back to you. |
Yes, I believe this is exactly what I am trying to do.
Okay nice, if not let me know and I'll check if I can find a solution as well. |
I guess that worked, but we have another issue: anything passed to the downsample should be reduced now, but it isn't doing that.
Hey @kartikdutt18, @zoq, I have pushed the code; can you take another look?
I'm out rn, will take a look around 2030 IST. |
Thanks. |
Ignore this; it was an error, as GitHub was forcing me to leave a review.
models/resnet/resnet.hpp (outdated)

```cpp
    downSampleInputHeight, strideWidth, strideHeight, kernelWidth,
    kernelHeight, padW, padH, true);

downSample->Add(new ann::BatchNorm<>(outSize));
```
Should the batch norm be added as a third connection or as part of the downsample layer? Wrap it in a Sequential and insert that into the downsample layer. As written, this will add BaseLayer, Downsample, and BatchNorm as three different paths, causing an incorrect size. I can elaborate if needed.
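A sketch of the suggested fix, with hypothetical sizes (nothing below is the actual PR code):

```cpp
#include <mlpack/methods/ann/layer/layer.hpp>

using namespace mlpack::ann;

int main()
{
  const size_t inSize = 16, outSize = 32;  // Placeholder channel counts.

  // The residual block sums the output of each of its child paths.
  AddMerge<>* block = new AddMerge<>(false, true);

  // Wrong: block->Add(new BatchNorm<>(outSize)); would register the batch
  // norm as a *third* path whose output gets summed in, breaking the sizes.

  // Right: keep the strided 1x1 convolution and the batch norm together in
  // one Sequential, so the downsample branch stays a single path.
  Sequential<>* downSample = new Sequential<>(true);
  downSample->Add(new Convolution<>(inSize, outSize, 1, 1, 2, 2, 0, 0,
      28, 28));
  downSample->Add(new BatchNorm<>(outSize));
  block->Add(downSample);

  return 0;
}
```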
I just pushed it; can you verify that this is what you meant?
It worked!!! 🚀
Yes, cool.
Thank you so much for spotting it.
Hey, so all the architectures are ready, but the thing that pulls me back a bit is a resnet152 with
Also @kartikdutt18, it would be great if you could walk me through the weight converter, because that part is a bit tougher than I thought. I assumed that I would just need to supply the model object and it would save the weights to a file which I could further edit.
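As an aside, the trained parameters of an mlpack `FFN<>` live in a single Armadillo matrix, so one possible starting point for dumping weights to an editable file looks like this (the `Linear` stand-in model is hypothetical; the real ResNet module would go there):

```cpp
#include <mlpack/core.hpp>
#include <mlpack/methods/ann/ffn.hpp>
#include <mlpack/methods/ann/layer/layer.hpp>

using namespace mlpack::ann;

int main()
{
  // Stand-in network; a real ResNet would be built here instead.
  FFN<> model;
  model.Add<Linear<>>(10, 10);
  model.ResetParameters();

  // All weights live in one arma::mat; write it out for inspection/editing.
  model.Parameters().save("weights.csv", arma::csv_ascii);

  // Load the (possibly edited) weights back into the model.
  arma::mat edited;
  edited.load("weights.csv", arma::csv_ascii);
  model.Parameters() = edited;

  return 0;
}
```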
Let's first clean up this PR so it can be reviewed. This includes adding comments, making the style check pass, and squashing commits.
By the way, should I keep the output that is printed now? Maybe as an mlpack log, or something that can be enabled on demand?
You can use logs similar to Darknet. I think they are clearer and easier to understand.
Great, shall do that. |
By the way, how do we see the output of `mlpack::Log::Info`? I haven't used it before.
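For reference, `mlpack::Log::Info` is silenced by default; a minimal way to make it visible, assuming current mlpack behaviour:

```cpp
#include <mlpack/core.hpp>

int main()
{
  // Info messages are suppressed unless ignoreInput is set to false;
  // mlpack's command-line programs toggle this via --verbose.
  mlpack::Log::Info.ignoreInput = false;
  mlpack::Log::Info << "Now visible on stdout." << std::endl;

  return 0;
}
```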
I haven't added the pretrained part of the code because I didn't really have weights, but that can easily be added once we have the code, so I don't think there's much to worry about there.
Hey @kartikdutt18, @zoq, I have created #63; it would be great if we can review the PR over there.
This PR aims to implement a ResNet module that can create all the ResNet variants from the paper, and it intentionally follows the same architecture as PyTorch.
Things I have some doubts about:
Resources: