
[WIP] Resnet Module #61

Closed, wants to merge 44 commits

Conversation

@Aakash-kaushik (Contributor)

This PR aims to implement a ResNet module that can create all the ResNet variants from the paper, and it follows the same architecture as PyTorch for a few reasons:

  1. We can't train so many models on ImageNet right now.
  2. We don't know if they would converge.
  3. Keeping the same architecture allows us to load weights from PyTorch.

Things I have some doubts about:

  1. How should the residual block be implemented using the sequential layer?
  2. Can I take the output of a layer at an arbitrary stage and add it to another layer (skip connections)?

Resources:

  1. PyTorch's ResNet implementation.
  2. ResNet paper.

Aakash-kaushik and others added 10 commits April 21, 2021 09:59
* test path change

* configured tests

* trying test fix

* trying to fix windows build

* removed unit testing from cmake

* Update windows-steps.yaml

* trying windows fix

* try fix, P.S. copied from mlpack

* anotehr try

* dir files display windows

* namespace except tests dir

* namespace added

* Catch2 (#6)

* Update windows-steps.yaml

* Update windows-steps.yaml

* copying dll and lib files to bin

* copy dll and lib to build/test

* copy dll and lib to build/test

* cleanup

* added exclusion for catch

* style check solve

* fix style check

* applied some suggestions

* added new line in main.cpp

* style check warnings

* updating with models/master (#7)

* Update windows-steps.yaml

* Update windows-steps.yaml

* copying dll and lib files to bin

* copy dll and lib to build/test

* copy dll and lib to build/test

* cleanup

* added exclusion for catch

* style check solve

* fix style check

* trigger build

Co-authored-by: kartikdutt18 <39593019+kartikdutt18@users.noreply.github.com>

* removed mlpack::ann::models in favour of mlpack::models

* style checks

* style checks

* ctest tests add

* ctest parsing

* added catch.cmake

* build fix

* test name fix

* syntax error fix

* removed main test from ctest as that would run tests 2 times

* specifying CMAKE_INSTALL_PREFIX

* reverting from --list-test-name-only to --list-tests

* update cmake_install_prefix

* turn off mlpack debugging in models repo

Co-authored-by: kartikdutt18 <39593019+kartikdutt18@users.noreply.github.com>
@zoq (Member) commented May 30, 2021

The idea of the sequential layer is that it wraps arbitrary layers and exposes them as if they were a single layer. The sequential layer has a template parameter (https://github.com/mlpack/mlpack/blob/83e70110595eaf3cf3758f270433801e673615b2/src/mlpack/methods/ann/layer/sequential.hpp#L70), which tells the layer to add the input to the output of the last layer. There is also a convenient typedef (https://github.com/mlpack/mlpack/blob/83e70110595eaf3cf3758f270433801e673615b2/src/mlpack/methods/ann/layer/sequential.hpp#L260-L261) that already sets the template parameter for you. Below is an elementary example:

Residual<>* residual = new Residual<>(true);

Linear<>* linearA = new Linear<>(10, 10);
Linear<>* linearB = new Linear<>(10, 10);

residual->Add(linearA);
residual->Add(linearB);

In this case linearA and linearB are run, and the input is also added to the output of the last layer, which here is linearB.

There is also a test case - https://github.com/mlpack/mlpack/blob/83e70110595eaf3cf3758f270433801e673615b2/src/mlpack/tests/ann_layer_test.cpp#L3325
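
For reference, the convenient typedef mentioned above is roughly the following (paraphrased; treat the exact template parameters as an assumption):

// A Sequential with its Residual template parameter set to true, so the
// input is added to the output of the last wrapped layer.
template<typename InputDataType = arma::mat,
         typename OutputDataType = arma::mat>
using Residual = Sequential<InputDataType, OutputDataType, true>;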

@Aakash-kaushik (Contributor, Author) commented May 30, 2021

> The idea of the sequential layer is that it wraps arbitrary layers and exposes them as if they were a single layer. [...]

Hi @zoq, thanks for this, but the part that confused me is that the code checks whether the dimensions of the first layer match those of the last layer. For ResNet there are cases where the input dimensions of the first layer differ from those of the last one, so I need a 1×1 convolution block just for the first layer's input, which is not run like the other layers but separately, before its output is added to the output of the last layer.

How do you suggest I accomplish that?

cc: @kartikdutt18

@zoq (Member) commented May 30, 2021

In this case you can use a combination of AddMerge and Sequential. The AddMerge layer takes arbitrary layers, runs each of them, and adds the outputs together at the end.

AddMerge<> resblock(false, false);

Sequential<>* sequential = new Sequential<>(true);

Linear<>* linearA = new Linear<>(10, 10);
Linear<>* linearB = new Linear<>(10, 10);

sequential->Add(linearA);
sequential->Add(linearB);

Convolution<>* conv = new Convolution<>(...);

resblock.Add(sequential);
resblock.Add(conv);

Let me know if this is what you were looking for. Maybe it makes sense to implement that structure as an independent layer in mlpack.
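
For concreteness, here is a minimal sketch of how this pattern could map onto a ResNet downsample block. It is only an illustration: the Convolution<> arguments assume mlpack's (inSize, outSize, kernelWidth, kernelHeight, strideWidth, strideHeight, padW, padH, inputWidth, inputHeight) constructor, and the sizes (64 -> 128 channels, stride 2, 56 x 56 input) are example values.

// Sketch only; the (model, run) flags mirror the snippet above.
AddMerge<> resBlock(false, false);

// Main path: 3x3 conv (stride 2) -> BatchNorm -> ReLU -> 3x3 conv -> BatchNorm.
Sequential<>* mainPath = new Sequential<>(true);
mainPath->Add(new Convolution<>(64, 128, 3, 3, 2, 2, 1, 1, 56, 56));
mainPath->Add(new BatchNorm<>(128));
mainPath->Add(new ReLULayer<>());
mainPath->Add(new Convolution<>(128, 128, 3, 3, 1, 1, 1, 1, 28, 28));
mainPath->Add(new BatchNorm<>(128));

// Shortcut path: 1x1 conv (stride 2) plus BatchNorm so the shapes match.
Sequential<>* shortcut = new Sequential<>(true);
shortcut->Add(new Convolution<>(64, 128, 1, 1, 2, 2, 0, 0, 56, 56));
shortcut->Add(new BatchNorm<>(128));

resBlock.Add(mainPath);
resBlock.Add(shortcut);

AddMerge then adds the outputs of the two paths, which corresponds to the projection shortcut from the ResNet paper.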

@Aakash-kaushik (Contributor, Author) commented Jun 5, 2021

> In this case you can use a combination of AddMerge and Sequential. The AddMerge layer takes arbitrary layers, runs each of them, and adds the outputs together at the end. [...]

I am still stuck on this and don't know exactly what to do. It would be easy if we had a way to define the flow of the network, but that is not how it is designed, and the main problem here is the downsampling block. I can put everything inside a residual block so that it saves the input of the first layer into a temporary variable and then tries to add it to the output of the last layer in the block, but when it does that it finds that the shapes don't match, and I don't see how I can use AddMerge to achieve the same flow. Do let me know if you see some other way around it; I have been thinking about it for far too long.

@kartikdutt18 (Member)

I will try to think of a solution for this and get back to you.

@zoq (Member) commented Jun 5, 2021

Just to make sure I get what you are trying to do: in some cases the output of the sequential part doesn't match the input, so if you add the skip connection you have to add another layer to convert the input?

Something like:

[Diagram: "Unbenannte Zeichnung" (untitled drawing)]

@Aakash-kaushik (Contributor, Author)

> Just to make sure I get what you are trying to do: in some cases the output of the sequential part doesn't match the input, so if you add the skip connection you have to add another layer to convert the input? [...]

Yes, I believe this is exactly what I am trying to do.
That was a great diagram for it, thank you so much.

@zoq (Member) commented Jun 18, 2021

Okay, nice. If not, let me know and I'll check if I can find a solution as well.

@Aakash-kaushik (Contributor, Author)

> Okay, nice. If not, let me know and I'll check if I can find a solution as well.

I guess that worked, but we have another issue:

Convolution: 3 64 7 7 2 2 3 3 112 112
BatchNorm: 64
Relu
Padding: 1,1,1,1 114 114
MaxPool: 3,3,2,2 56 56
Convolution: 64 64 3 3 1 1 1 1 56 56
BatchNorm: 64
Relu
Convolution: 64 64 3 3 1 1 1 1 56 56
BatchNorm: 64
IdentityLayer
Relu
Convolution: 64 64 3 3 1 1 1 1 56 56
BatchNorm: 64
Relu
Convolution: 64 64 3 3 1 1 1 1 56 56
BatchNorm: 64
IdentityLayer
Relu
new layer
Convolution: 64 128 3 3 2 2 1 1 28 28
BatchNorm: 128
Relu
Convolution: 128 128 3 3 1 1 1 1 28 28
BatchNorm: 128
DownSample below
Convolution: 64 128 1 1 2 2 0 0 56 56
BatchNorm: 128
Relu

error: addition: incompatible matrix dimensions: 100352x1 and 200704x1
terminate called after throwing an instance of 'std::logic_error'
  what():  addition: incompatible matrix dimensions: 100352x1 and 200704x1
Aborted (core dumped)

Anything passed to the downsample should be reduced now, but it isn't doing that (100352 = 128 × 28 × 28 is the main path's output, while 200704 = 64 × 56 × 56 is the un-reduced input).

@Aakash-kaushik (Contributor, Author)

Hey @kartikdutt18, @zoq, I have pushed the code too; can you take another look?

@kartikdutt18 (Member)

I'm out rn, will take a look around 2030 IST.

@Aakash-kaushik (Contributor, Author) commented Jun 18, 2021

> I'm out rn, will take a look around 2030 IST.

Thanks.

@Aakash-kaushik (Contributor, Author) left a comment:

Ignore this; GitHub was forcing me to leave a review.

downSampleInputHeight, strideWidth, strideHeight, kernelWidth,
kernelHeight, padW, padH, true);

downSample->Add(new ann::BatchNorm<>(outSize));
Member:

Should BatchNorm be added as a third connection, or as part of the downsample layer? Wrap it in a Sequential and insert that into the downsample layer. As it is, this adds BaseLayer, Downsample, and BatchNorm as three different paths, causing an incorrect size. I can elaborate if needed.
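
For reference, a minimal sketch of that suggestion, with the sizes taken from the 64 -> 128, stride-2 stage in the log above (the constructor signatures and the resBlock name are assumptions):

// Sketch only: wrap the 1x1 downsample Convolution and its BatchNorm in a
// single Sequential and add that Sequential to the residual block, so the
// shortcut stays one path instead of becoming three parallel ones.
Sequential<>* downSampleBlock = new Sequential<>(true);
downSampleBlock->Add(new ann::Convolution<>(64, 128, 1, 1, 2, 2, 0, 0, 56, 56));
downSampleBlock->Add(new ann::BatchNorm<>(128));

// resBlock here stands for whatever AddMerge already holds the main path.
resBlock->Add(downSampleBlock);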

Contributor Author:

I just pushed it; can you verify if this is what you meant?

Contributor Author:

It worked!!! 🚀

Member:

Yes, cool.

Contributor Author:

Thank you so much for spotting it.

@Aakash-kaushik (Contributor, Author)

Hey, so all the architectures are ready, but the thing that holds me back a bit is that ResNet152 with a 224 × 224 × 3 input takes up 7.1 GB of RAM. Isn't that too much? I am glad I had swap space, otherwise it wouldn't even run. By the way, this is runtime memory, not compile time.

@Aakash-kaushik (Contributor, Author)

Also, @kartikdutt18, it would be great if you could walk me through the weight converter, because that part is a bit tougher than I thought; I assumed that I would just need to supply the model object and it would save the weights to a file which I could then edit further.
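
Not the converter itself, but for reference a minimal sketch of the save step that was assumed here, using mlpack's data::Save serialization (the model object and file name are just placeholders):

#include <mlpack/core.hpp>
#include <mlpack/methods/ann/ffn.hpp>

int main()
{
  // Sketch only: mlpack can serialize a model object to disk; loading works
  // the same way with data::Load.  A real converter would first fill
  // model.Parameters() with values exported from PyTorch.
  mlpack::ann::FFN<> model;
  mlpack::data::Save("resnet18.bin", "resnet", model, false);
  return 0;
}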

@kartikdutt18 (Member)

Let's first clean up this PR so it can be reviewed. This includes adding comments, making the style check pass, and squashing commits.

@Aakash-kaushik (Contributor, Author)

> Let's first clean up this PR so it can be reviewed. This includes adding comments, making the style check pass, and squashing commits.

By the way, should I keep the output that is printed now? Maybe as an mlpack log or something that can be enabled?

@kartikdutt18 (Member)

You can use logs similar to Darknet's. I think they are clearer and easier to understand.

@Aakash-kaushik (Contributor, Author)

> You can use logs similar to Darknet's. I think they are clearer and easier to understand.

Great, shall do that.

@Aakash-kaushik (Contributor, Author)

By the way, how do we see the output of mlpack::Log::Info? I haven't used it before.
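
For reference, a minimal sketch, assuming mlpack's Log API: Log::Info is silent unless ignoreInput is set to false (or mlpack runs in verbose mode).

#include <mlpack/core.hpp>

int main()
{
  // Log::Info drops its input by default; enable it to see the messages.
  mlpack::Log::Info.ignoreInput = false;
  mlpack::Log::Info << "Convolution: 3 64 7 7 2 2 3 3 112 112" << std::endl;
  return 0;
}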

@Aakash-kaushik (Contributor, Author)

I haven't added the pretrained part of the code because I didn't really have weights, but that can easily be added once we have the code, so I don't think there is much to worry about there.

@Aakash-kaushik (Contributor, Author)

Hey @kartikdutt18, @zoq, I have created #63; it would be great if we can review the PR over there.
