
bug report #1

Closed
peghoty opened this issue Jun 6, 2013 · 11 comments

@peghoty

peghoty commented Jun 6, 2013

Dear Yusuke Sugomori,

In the C version of your deep learning program, I found a minor bug in DBN_predict (lines 193-200 of the DBN.c file). The initialization of "linear_output" should be moved inside the k-loop as follows:

for(k=0; k<this->sigmoid_layers[i].n_out; k++) {
  linear_output = 0.0;  /* reset the accumulator for each output unit k */
  for(j=0; j<this->sigmoid_layers[i].n_in; j++) {
    linear_output += this->sigmoid_layers[i].W[k][j] * prev_layer_input[j];
  }
  linear_output += this->sigmoid_layers[i].b[k];
  layer_input[k] = sigmoid(linear_output);
}

I guess the same problem may appear in the other versions.

Actually, I would like to discuss with you more details about the deep learning algorithm for specific applications. May I know your email address?

peghoty

@Talentor

Talentor commented Jun 6, 2013

Dear Peghoty,
I'm so sorry to tell you that I'm not Yusuke Sugomori; you have the wrong e-mail address.
Stranger


@bayesrule

Hi Peghoty,

I think you should go to the author's website (http://yusugomori.com/) to contact him. BTW, as a user, I thank you for the bug report. It also occurs in the C++ and Java versions, as I checked. Though I couldn't find the same bug in the Python version, the performance of the Python version is the same as the un-debugged C++ and Java versions, so I suspect there's a bug in the Python version as well; I just didn't find it. I don't know Scala, so I didn't touch it.

best

yusugomori added a commit that referenced this issue Jun 23, 2013
@yusugomori
Owner

Dear Peghoty,

Thank you for the report, and sorry for the delay in replying to you.
I have just fixed the bug in C, C++, and Java.

@peghoty
Author

peghoty commented Jun 26, 2013

Dear Yusuke Sugomori,

Thanks for your reply. I'm a novice in deep learning and read your C code very carefully as a guide for my study.

May I ask you some more questions?

  1. In your demonstration example, the training samples are all binary, like (1,0,1,...). Does the code deal with real-valued input (like (0.2,0.5,1.4,...))? If not, how should I adjust the code?
  2. During the fine-tuning phase, you only update the parameters of the Logistic Regression classifier, keeping the parameters of the previous layers fixed. Is this a common approach, or did you implement it this way just for simplicity? If I want to update all the parameters together in the fine-tuning phase, do you have any good suggestions?

Thanks again, hope to get your reply soon.

Peghoty


@bayesrule

Same questions here. Thanks!

@samkuok

samkuok commented Jul 12, 2013

I have the same questions as above, and another one: did you guys test the code? It looks like the result of the predict function is not correct; all the test samples have the same predicted value...

@peghoty
Author

peghoty commented Jul 12, 2013

I also faced this problem on my own testing data; that is, all the test samples had the same predicted value.

Here is a clue for you, though I'm not sure the above problem is caused by this.

As we all know, during the pre-training phase, the output of the previous RBM is taken as the input of the current RBM.
Let x and y be the input and output of the previous layer, respectively, let W be the weight matrix which reflects the connection from the current layer to the previous layer, and let b be the bias vector of the current layer. Then y can be computed in two ways as follows:

  1. y = sigmoid(Wx + b)
  2. z = sigmoid(Wx + b), y = binomial(z)

In the code of yusugomori/DeepLearning, the second one is used. However, is that really reasonable? Why not use the first one?
Maybe you can give it a try; I wish you good luck.

peghoty
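
(For illustration, here is a minimal sketch in C contrasting the two propagation options above. The function names, signatures, and the plain rand()-based sampler are hypothetical and are not taken from the repository's code.)

#include <math.h>
#include <stdlib.h>

static double sigmoid_fn(double x) { return 1.0 / (1.0 + exp(-x)); }

/* Option 1: propagate the deterministic mean activation, y = sigmoid(Wx + b). */
static double propagate_mean(const double *W_row, const double *x, double b, int n_in) {
  double pre = b;
  for (int j = 0; j < n_in; j++) pre += W_row[j] * x[j];
  return sigmoid_fn(pre);
}

/* Option 2: sample a binary value from that mean, y ~ Binomial(1, z),
 * which is what the thread says the repository does. */
static double propagate_sample(const double *W_row, const double *x, double b, int n_in) {
  double z = propagate_mean(W_row, x, b, n_in);
  return ((double)rand() / RAND_MAX) < z ? 1.0 : 0.0;
}

(The question raised above is essentially whether the sampling noise of option 2 is reasonable when the result is fed to the next layer, or whether the mean of option 1 should be passed forward instead.)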


@samkuok

samkuok commented Jul 12, 2013

Thanks, peghoty.
It is really helpful; I will try to check that.

@samkuok

samkuok commented Jul 17, 2013

I found that the result of the expression "ph_sample[i] * input[j] - nh_means[i] * nv_samples[j]" in contrastive_divergence is always 0, so the weights won't be updated at all.

@yusugomori
Owner

Thanks for the bug report. It should be

ph_mean[i] * input[j] - nh_means[i] * nv_samples[j]

I'll fix that.
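
(To show where the fix lands, here is a minimal sketch of a CD-1 weight update using the corrected positive-phase term. The function signature, the learning rate lr, and the divisor N are illustrative assumptions, not the repository's exact code.)

/* Corrected CD-1 weight update: the positive phase uses the hidden mean
 * ph_mean, not the hidden sample ph_sample. */
void cd1_weight_update(double **W, const double *ph_mean, const double *input,
                       const double *nh_means, const double *nv_samples,
                       int n_hidden, int n_visible, double lr, int N) {
  for (int i = 0; i < n_hidden; i++) {
    for (int j = 0; j < n_visible; j++) {
      W[i][j] += lr * (ph_mean[i] * input[j] - nh_means[i] * nv_samples[j]) / N;
    }
  }
}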

@yusugomori
Owner

Dear Peghoty,

Sorry for the delay.

  1. In your demonstration example, the training samples are all binary, like (1,0,1,...). Does the code deal with real-valued input (like (0.2,0.5,1.4,...))? If not, how should I adjust the code?

The sample_v_given_h function has to be changed when dealing with real-valued data.
I wrote CRBM.py, but I'm not sure if it's correct...
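
(One common adaptation, sketched below, is to use Gaussian/linear visible units, so the reconstruction on the visible side is the linear mean W^T h + b rather than a sigmoid followed by binary sampling. The function name and signature here are hypothetical, and it is not confirmed that this matches what CRBM.py does.)

/* Hypothetical sketch: visible reconstruction for real-valued (Gaussian,
 * unit-variance) visible units; the mean is used directly instead of a
 * sigmoid activation followed by binomial sampling. */
void sample_v_given_h_gaussian(double **W, const double *h, const double *vbias,
                               int n_hidden, int n_visible, double *v_mean) {
  for (int j = 0; j < n_visible; j++) {
    double mean = vbias[j];          /* visible bias b_j */
    for (int i = 0; i < n_hidden; i++)
      mean += W[i][j] * h[i];        /* (W^T h)_j */
    v_mean[j] = mean;                /* use the mean directly */
  }
}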

  2. During the fine-tuning phase, you only update the parameters of the Logistic Regression classifier, keeping the parameters of the previous layers fixed. Is this a common approach, or did you implement it this way just for simplicity? If I want to update all the parameters together in the fine-tuning phase, do you have any good suggestions?

Yes, it's for simplicity, and I'm also seeking the best solution...

yusugomori added a commit that referenced this issue Jul 18, 2013