Generative Adversarial Network #1066

Closed
wants to merge 57 commits into base: master

Conversation

4 participants
@kris-singh
Contributor

kris-singh commented Jul 20, 2017

I still have to make the test work and add the evaluate function.
Also, can somebody help me decide which loss function to use for the 0/1 loss?
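
For reference: the usual choice for the 0/1 real-vs-fake labels is the binary cross-entropy (log) loss rather than a literal 0/1 loss, since the latter has no useful gradient. A minimal Armadillo sketch of that objective (the function name and signature here are only illustrative, not mlpack API):

    #include <armadillo>

    // dReal, dFake: discriminator outputs in (0, 1) for real and generated
    // samples.  Minimizing this pushes dReal towards 1 and dFake towards 0.
    double DiscriminatorLoss(const arma::vec& dReal, const arma::vec& dFake)
    {
      return -arma::mean(arma::log(dReal)) - arma::mean(arma::log(1.0 - dFake));
    }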

kris-singh added some commits Jul 26, 2017

@lozhnikov

Contributor

lozhnikov commented Jul 26, 2017

@kris-singh Remember I asked you in #mlpack whether it is possible to refactor the Gradient() function in such a way that the Train() function contains only optimizer.Optimize()? I mean, is it possible to call the optimizer just once? I think it is. If so, you are doing unnecessary work right now, since the code requires a complete refactoring in order to do that.
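
A minimal sketch of the suggested shape, assuming an optimizer interface like optimizer.Optimize(function, parameters) and member names that may differ from the actual code: Train() calls the optimizer exactly once, and the alternation between discriminator and generator updates moves into Gradient(), which the optimizer already calls on every iteration.

    template<typename Model, typename InitializationRuleType, typename Noise>
    template<typename OptimizerType>
    void GAN<Model, InitializationRuleType, Noise>::Train(OptimizerType& optimizer)
    {
      // One call; the optimizer's own loop drives every iteration, and
      // Gradient(parameters, i, gradient) decides internally whether the
      // current step updates the discriminator or the generator.
      optimizer.Optimize(*this, parameter);
    }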

@kris-singh

Contributor

kris-singh commented Jul 28, 2017

When you say to have the optimizer.Optimize() call happen just once: I do understand that I am currently calling the function 2 × (number of iterations) times, but I think with your fix I would still have to call it (number of iterations) times, right?

@kris-singh

Contributor

kris-singh commented Aug 1, 2017

@lozhnikov can you review this? The generated samples are still random. Could you look for any obvious mistakes that I might be overlooking?

@lozhnikov

Contributor

lozhnikov commented Aug 1, 2017

@kris-singh I'll look through the PR later. I think the ssRBM is a higher priority than this PR right now. There are still a lot of questions regarding the ssRBM implementation: it is not well tested yet and shows poor performance, and we haven't tried to reproduce the test from the paper yet.
So, if you don't mind, I'll return to this PR as soon as we have tested the ssRBM implementation in detail.

@kris-singh

Contributor

kris-singh commented Aug 3, 2017

The main problem here is that the gradient values are too small, and when they are multiplied by a small learning rate the parameters end up barely changing. Any ideas on how to counter this? I was using SGD as the optimizer with a learning rate of 0.001; there is very little learning with 0.1 as well.
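
A quick way to confirm that diagnosis is to compare the parameter scale with the actual update scale; a minimal Armadillo sketch (the function and variable names are illustrative, not from the PR):

    #include <armadillo>
    #include <iostream>

    // If stepSize * max|gradient| is orders of magnitude smaller than the
    // typical parameter magnitude, the parameters barely move per step.
    void CheckUpdateScale(const arma::mat& parameters,
                          const arma::mat& gradient,
                          const double stepSize)
    {
      std::cout << "max |parameter| = " << arma::abs(parameters).max()
                << ", max |update| = " << stepSize * arma::abs(gradient).max()
                << std::endl;
    }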

Outdated review threads: src/mlpack/methods/ann/gan_impl.hpp, src/mlpack/tests/gan_test.cpp, src/mlpack/methods/ann/gan_impl.hpp

kris-singh added some commits Aug 25, 2017

@lozhnikov

I pointed out some minor issues.

@@ -92,6 +92,10 @@ double MiniBatchSGDType<
Log::Info << "Mini-batch SGD: iteration " << i << ", objective "
<< overallObjective << "." << std::endl;
Log::Info << "Last objective" << lastObjective << std::endl;
Log::Info << "Gradient = " << arma::max(gradient) << std::endl;

@lozhnikov

lozhnikov Aug 26, 2017

Contributor

I understand you need that in order to debug the implementation, but we should remove that in the future.

Outdated review threads: src/mlpack/methods/ann/gan.hpp (four threads), src/mlpack/methods/ann/gan_impl.hpp
generator.Backward();
generator.ResetGradients(gradientGenerator);
generator.Gradient();
double multiplier = 1.5;

@lozhnikov

lozhnikov Aug 26, 2017

Contributor

I think this variable could be useful if we want to set different step sizes. In that case, it would be better to add an argument to the constructor. What do you think?
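
A minimal sketch of that suggestion, with hypothetical member and argument names rather than the actual mlpack API: the multiplier becomes a constructor argument (defaulting to the current hard-coded value) with an accessor, and Gradient() reads the member instead of a local variable.

    template<typename Model, typename InitializationRuleType, typename Noise>
    class GAN
    {
     public:
      // Default keeps the current behaviour; the real constructor would take
      // the other GAN arguments as well.
      explicit GAN(const double multiplier = 1.5) : multiplier(multiplier) { }

      // Accessors, so the step-size multiplier can also be changed later.
      double Multiplier() const { return multiplier; }
      double& Multiplier() { return multiplier; }

     private:
      // Used by Gradient() in place of the local `double multiplier = 1.5;`.
      double multiplier;
    };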

preTrainSize--;
}
if (preTrainSize)
Log::Info << "Pre Training batch " << preTrainSize << std::endl;

@lozhnikov

lozhnikov Aug 26, 2017

Contributor

I'm leaving this comment as a reminder. I guess we should remove that in the future.

}
template<typename Model, typename InitializationRuleType, typename Noise>
void GAN<Model, InitializationRuleType, Noise>::Forward(arma::mat&& input)

@lozhnikov

lozhnikov Aug 26, 2017

Contributor

I am not sure of the functions below. Do we need them?

@kris-singh

kris-singh Aug 27, 2017

Contributor

They could be required in some cases when extending GANs; not required now, though.

coeff3 = (1 - deltaR) * deltaC;
coeff4 = deltaR * deltaC;
double depth = k * inRowSize * inColSize;

@lozhnikov

lozhnikov Aug 26, 2017

Contributor

That's not an error, but I suggest renaming the variable. I think you could save some space if you define ptr = k * inRowSize * inColSize + cOrigin * inColSize + rOrigin. In that case:

output(k * outRowSize * outColSize + j * outColSize + i) =
              input(ptr) * coeff1 +
              input(ptr + 1) * coeff2 +
              input(ptr + inColSize) * coeff3 +
              input(ptr + inColSize + 1) * coeff4;

Could you make these changes in a separate commit, in order to cherry-pick them to the ResizeLayer branch?
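
For context, a sketch of how that definition would sit inside the bilinear-interpolation loop of the resize layer; the loop structure and the computation of rOrigin, cOrigin, and the coefficients are assumed from the diff above, not taken from the actual code:

    for (size_t i = 0; i < outRowSize; ++i)
    {
      for (size_t j = 0; j < outColSize; ++j)
      {
        // rOrigin, cOrigin and coeff1..coeff4 are computed for (i, j) as in
        // the existing layer code.
        const size_t ptr = k * inRowSize * inColSize +
            cOrigin * inColSize + rOrigin;

        output(k * outRowSize * outColSize + j * outColSize + i) =
            input(ptr) * coeff1 +
            input(ptr + 1) * coeff2 +
            input(ptr + inColSize) * coeff3 +
            input(ptr + inColSize + 1) * coeff4;
      }
    }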

@kris-singh

Contributor

kris-singh commented Aug 27, 2017

Fixed your comments. I have cherry-picked the commit to the ResizeLayer branch and will make the changes there.

@ShikharJ referenced this pull request Jan 19, 2018

Merged

Generative Adversarial Network #1204

3 of 3 tasks complete
@ShikharJ

Member

ShikharJ commented Jun 10, 2018

Thanks @kris-singh and @lozhnikov for your contributions; this has been incorporated in #1204! @zoq This can be closed now.

@zoq

Member

zoq commented Jun 10, 2018

I agree with @ShikharJ, thanks for the awesome work.

@zoq closed this Jun 10, 2018
