A doubt about scale layer's backward (may be bug) #6604

Open
5 tasks
huchhong opened this issue Nov 9, 2018 · 1 comment

Comments


huchhong commented Nov 9, 2018

Issue summary

I read the scale layer code recently and found something suspicious. Here it is:

template <typename Dtype>
void ScaleLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  ...
      } else {
        const Dtype* sum_mult = sum_multiplier_.cpu_data();
        // If outer_dim_ == 1, the gemv result is written directly into the
        // scale blob's diff; otherwise it goes to a temporary buffer.
        sum_result = (outer_dim_ == 1) ?
            scale->mutable_cpu_diff() : sum_result_.mutable_cpu_data();
        // BETA is Dtype(0), so sum_result is overwritten, not accumulated into.
        caffe_cpu_gemv(CblasNoTrans, sum_result_.count(), inner_dim_,
                       Dtype(1), product, sum_mult, Dtype(0), sum_result);
      }
      if (outer_dim_ != 1) {
      ...
}

In the above code, if outer_dim_ == 1, then scale_diff is overwritten by caffe_cpu_gemv's result instead of being accumulated into, because the BETA parameter of the gemv is zero. This seems wrong. The same thing happens in the GPU version.
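
To make the overwrite-vs-accumulate distinction concrete, here is a minimal, self-contained C++ sketch of a gemv-style routine with the usual y = alpha * A * x + beta * y semantics. It is not Caffe code; the gemv function and all values are made up for illustration. With beta = 0 whatever gradient the output buffer already held is discarded; with beta = 1 the new contribution is added to it:

#include <cstdio>
#include <vector>

// y = alpha * A * x + beta * y, with A stored row-major (rows x cols).
// Same semantics as a BLAS gemv; this is only an illustration, not
// Caffe's caffe_cpu_gemv.
void gemv(int rows, int cols, float alpha, const std::vector<float>& A,
          const std::vector<float>& x, float beta, std::vector<float>& y) {
  for (int r = 0; r < rows; ++r) {
    float dot = 0.f;
    for (int c = 0; c < cols; ++c) {
      dot += A[r * cols + c] * x[c];
    }
    y[r] = alpha * dot + beta * y[r];
  }
}

int main() {
  // Pretend scale_diff already holds a gradient from a previous backward
  // pass (e.g. an earlier iteration when iter_size > 1).
  std::vector<float> scale_diff_overwrite = {10.f, 10.f};
  std::vector<float> scale_diff_accumulate = {10.f, 10.f};

  // "product" plays the role of top_diff * bottom_data for this pass.
  std::vector<float> product = {1.f, 2.f, 3.f, 4.f};  // 2 x 2, row-major
  std::vector<float> sum_mult = {1.f, 1.f};            // all-ones multiplier

  // beta = 0: the previous gradient is thrown away.
  gemv(2, 2, 1.f, product, sum_mult, 0.f, scale_diff_overwrite);
  // beta = 1: the new contribution is added to the previous gradient.
  gemv(2, 2, 1.f, product, sum_mult, 1.f, scale_diff_accumulate);

  printf("beta=0: %.1f %.1f\n", scale_diff_overwrite[0], scale_diff_overwrite[1]);
  printf("beta=1: %.1f %.1f\n", scale_diff_accumulate[0], scale_diff_accumulate[1]);
  return 0;
}

If parameter diffs are supposed to be accumulated across backward passes (as with iter_size > 1), the beta = 0 call would silently drop everything but the last pass's contribution.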

Steps to reproduce

Just code review.

Tried solutions

System configuration

  • Operating system:
  • Compiler:
  • CUDA version (if applicable):
  • CUDNN version (if applicable):
  • BLAS:
  • Python version (if using pycaffe):
  • MATLAB version (if using matcaffe):

Issue checklist

  • read the guidelines and removed the first paragraph
  • written a short summary and detailed steps to reproduce
  • explained how solutions to related problems failed (tick if found none)
  • filled system configuration
  • attached relevant logs/config files (tick if not applicable)

huchhong commented Nov 9, 2018

I have tried this comparison:

  1. set the train batch size to 1 and iter_size to 2
  2. set the train batch size to 2 and iter_size to 1

The input data is a single image, so theoretically these two tests should give the same scale diff, but they don't.
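
As a rough sketch of why the two runs ought to match, assume (as the iter_size mechanism implies) that each backward pass is expected to add its contribution into the parameter diff, and ignore iter_size normalization. The numbers below are hypothetical, not taken from an actual run, and this is not Caffe code:

#include <cstdio>

int main() {
  // Hypothetical per-image gradient of the scale parameter; the same image
  // is used in every pass, so each pass contributes the same value.
  const float grad_per_image = 3.0f;

  // batch = 2, iter_size = 1: a single backward pass sees both images.
  const float diff_batch2 = 2.0f * grad_per_image;

  // batch = 1, iter_size = 2, accumulating (beta = 1 behaviour).
  float diff_accum = 0.0f;
  for (int pass = 0; pass < 2; ++pass) diff_accum += grad_per_image;

  // batch = 1, iter_size = 2, overwriting (beta = 0 behaviour):
  // only the last pass survives.
  float diff_overwrite = 0.0f;
  for (int pass = 0; pass < 2; ++pass) diff_overwrite = grad_per_image;

  printf("batch=2, iter_size=1      : %.1f\n", diff_batch2);
  printf("batch=1, iter_size=2, add : %.1f\n", diff_accum);
  printf("batch=1, iter_size=2, set : %.1f\n", diff_overwrite);
  return 0;
}

With an overwrite, the iter_size = 2 run keeps only the last pass's gradient, which would explain the mismatch observed above.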
