
Improve TransposedConvolution layer. #1493

Merged: 6 commits, Oct 12, 2019
2 changes: 1 addition & 1 deletion src/mlpack/methods/ann/layer/layer_types.hpp
@@ -213,7 +213,7 @@ using LayerTypes = boost::variant<
NaiveConvolution<FullConvolution>,
NaiveConvolution<ValidConvolution>, arma::mat, arma::mat>*,
TransposedConvolution<NaiveConvolution<ValidConvolution>,
NaiveConvolution<FullConvolution>,
NaiveConvolution<ValidConvolution>,
NaiveConvolution<ValidConvolution>, arma::mat, arma::mat>*,
DropConnect<arma::mat, arma::mat>*,
Dropout<arma::mat, arma::mat>*,
158 changes: 136 additions & 22 deletions src/mlpack/methods/ann/layer/transposed_convolution.hpp
@@ -40,7 +40,7 @@ namespace ann /** Artificial Neural Network. */ {
*/
template <
typename ForwardConvolutionRule = NaiveConvolution<ValidConvolution>,
typename BackwardConvolutionRule = NaiveConvolution<FullConvolution>,
typename BackwardConvolutionRule = NaiveConvolution<ValidConvolution>,
typename GradientConvolutionRule = NaiveConvolution<ValidConvolution>,
typename InputDataType = arma::mat,
typename OutputDataType = arma::mat
@@ -52,8 +52,14 @@ class TransposedConvolution
TransposedConvolution();

/**
* Create the Transposed Convolution object using the specified number of
* input maps, output maps, filter size, stride and padding parameter.
*
* Note: The equivalent stride of a transposed convolution operation is
* always 1. In this implementation, the stride of the filter represents
* the stride of the associated convolution operation.
* Note: The padding of the input represents the padding of the associated
* convolution operation.
*
* @param inSize The number of input maps.
* @param outSize The number of output maps.
@@ -65,6 +71,8 @@ class TransposedConvolution
* @param padH Padding height of the input.
Contributor: I think to avoid confusion we should write "Padding height of the input of the equivalent convolution layer", and the same for padW. I also think the same applies to dW and dH, since a transposed convolution always has a stride of one. Let me know if that's incorrect.

Member (author): I added two notes above the @param listing to clarify this. Do you think that's enough, or should we mention it for the individual parameters?

Contributor: I'd say let's write it. The @param listings of other layers are quite similar to this one's, so people (like me) may skip the note above. I don't know whether my view is correct, so it's totally up to you.

* @param inputWidth The width of the input data.
* @param inputHeight The height of the input data.
* @param outputWidth The width of the output data.
* @param outputHeight The height of the output data.
*/
TransposedConvolution(const size_t inSize,
const size_t outSize,
@@ -75,7 +83,9 @@ class TransposedConvolution
const size_t padW = 0,
const size_t padH = 0,
const size_t inputWidth = 0,
const size_t inputHeight = 0);
const size_t inputHeight = 0,
const size_t outputWidth = 0,
const size_t outputHeight = 0);

/*
* Set the weight and bias term.
@@ -173,24 +183,6 @@ class TransposedConvolution
void serialize(Archive& ar, const unsigned int /* version */);

private:
/*
* Return the transposed convolution output size.
*
* @param size The size of the input (row or column).
* @param k The size of the filter (width or height).
* @param s The stride size (x or y direction).
* @param p The size of the padding (width or height).
* @return The transposed convolution output size.
*/
size_t TransposedConvOutSize(const size_t size,
const size_t k,
const size_t s,
const size_t p)
{
size_t out = std::floor(size - k + 2 * p) / s;
return out * s + 2 * (k - p) - 1 + ((((size + 2 * p - k) % s) + s) % s);
}

/*
* Rotates a 3rd-order tensor counterclockwise by 180 degrees.
*
@@ -220,6 +212,119 @@ class TransposedConvolution
output = arma::fliplr(arma::flipud(input));
}

/*
* Pad the given input data.
*
* @param input The input to be padded.
* @param wPad Padding width of the input.
* @param hPad Padding height of the input.
* @param wExtra The number of extra zeros to the right.
* @param hExtra The number of extra zeros to the bottom.
* @param output The padded output data.
*/
template<typename eT>
void Pad(const arma::Mat<eT>& input,
@saksham189 (Member), Jul 1, 2019: Maybe we could introduce a Padding layer and use it inside the convolution layers to avoid redundant code. PyTorch already supports quite a lot of padding layers. This would also make the implementation clearer.

Member (author): @saksham189 Has there been any discussion of this on IRC since you posted? I think this might be a good idea and deserves some thought.

Member: Sorry for the slow response; this makes total sense to me.

Member: @akhandait I will work on adding the Padding layer and removing the redundant code in a new PR, so that it is easy to review.

Member: Looks like the Padding layer is done and merged; should we incorporate those changes here, or maybe in another PR after this is merged? It may be worth opening an issue so we don't forget. :)

const size_t wPad,
const size_t hPad,
const size_t wExtra,
const size_t hExtra,
arma::Mat<eT>& output)
{
if (output.n_rows != input.n_rows + wPad * 2 + wExtra ||
output.n_cols != input.n_cols + hPad * 2 + hExtra)
{
output = arma::zeros(input.n_rows + wPad * 2 + wExtra,
input.n_cols + hPad * 2 + hExtra);
}

output.submat(wPad, hPad, wPad + input.n_rows - 1,
hPad + input.n_cols - 1) = input;
}

/*
* Pad the given input data.
*
* @param input The input to be padded.
* @param wPad Padding width of the input.
* @param hPad Padding height of the input.
* @param wExtra The number of extra zeros to the right.
* @param hExtra The number of extra zeros to the bottom.
* @param output The padded output data.
*/
template<typename eT>
void Pad(const arma::Cube<eT>& input,
const size_t wPad,
const size_t hPad,
const size_t wExtra,
const size_t hExtra,
arma::Cube<eT>& output)
{
output = arma::zeros(input.n_rows + wPad * 2 + wExtra,
input.n_cols + hPad * 2 + hExtra, input.n_slices);

for (size_t i = 0; i < input.n_slices; ++i)
{
Pad<eT>(input.slice(i), wPad, hPad, wExtra, hExtra, output.slice(i));
}
}

/*
* Insert zeros between the units of the given input data.
* Note: This function should be used before the Pad() function.
*
* @param input The input to be padded.
* @param dW Stride of filter application in the x direction.
* @param dH Stride of filter application in the y direction.
* @param output The padded output data.
*/
template<typename eT>
void InsertZeros(const arma::Mat<eT>& input,
const size_t dW,
const size_t dH,
arma::Mat<eT>& output)
{
if (output.n_rows != input.n_rows * dW - dW + 1 ||
output.n_cols != input.n_cols * dH - dH + 1)
{
output = arma::zeros(input.n_rows * dW - dW + 1,
input.n_cols * dH - dH + 1);
}

for (size_t i = 0; i < output.n_rows; i += dH)
{
for (size_t j = 0; j < output.n_cols; j += dW)
{
// TODO: Use [] instead of () for speedup after this is completely
// debugged and approved.
output(i, j) = input(i / dH, j / dW);
Member: Not sure using [] will bring us any speedup here, but maybe I missed something.

Member (author): Isn't [] an optimized version of () without bounds checking? It's mentioned in the Armadillo docs.

Member: Right, this should only matter in debug mode. From the Armadillo documentation:

ARMA_NO_DEBUG: Disable all run-time checks, such as bounds checking. This will result in faster code, but you first need to make sure that your code runs correctly! We strongly recommend having the run-time checks enabled during development, as this greatly aids in finding mistakes in your code, and hence speeds up development. We recommend that run-time checks be disabled only for the shipped version of your program (i.e. the final release build).

Member (author): Oh, I didn't know that. I guess I'll leave it with (); it might actually be useful in debug mode if something goes wrong.

}
}
}

/*
* Insert zeros between the units of the given input data.
* Note: This function should be used before the Pad() function.
*
* @param input The input to be padded.
* @param dW Stride of filter application in the x direction.
* @param dH Stride of filter application in the y direction.
* @param output The padded output data.
*/
template<typename eT>
void InsertZeros(const arma::Cube<eT>& input,
const size_t dW,
const size_t dH,
arma::Cube<eT>& output)
{
output = arma::zeros(input.n_rows * dW - dW + 1,
input.n_cols * dH - dH + 1, input.n_slices);

for (size_t i = 0; i < input.n_slices; ++i)
{
InsertZeros<eT>(input.slice(i), dW, dH, output.slice(i));
}
}

//! Locally-stored number of input channels.
size_t inSize;

@@ -247,6 +352,12 @@ class TransposedConvolution
//! Locally-stored padding height.
size_t padH;

//! Locally-stored number of zeros added to the right of input.
size_t aW;

//! Locally-stored number of zeros added to the bottom of input.
size_t aH;

//! Locally-stored weight object.
OutputDataType weights;

@@ -277,6 +388,9 @@ class TransposedConvolution
//! Locally-stored transformed padded input parameter.
arma::cube inputPaddedTemp;

//! Locally-stored transformed expanded input parameter.
arma::cube inputExpandedTemp;

//! Locally-stored transformed error parameter.
arma::cube gTemp;
