Cannot use scipy.optimize.minimize method='CG' to run gradient descent #1

Closed
oeyh opened this issue Jul 28, 2018 · 3 comments

oeyh commented Jul 28, 2018

This occurs in assignment3, section 1.4, in the function oneVsAll():

result = minimize(lrCostFucntion, theta0, args=(X, ylabel, lmd), method='TNC', jac=True, options={'disp': True, 'maxiter':1000})

method='CG' should also work, and might even be faster, but it raises an error.

I suspect a bug inside the minimize function: it seems to change the shape of theta during the optimization, which causes a dimension mismatch in the matrix multiplication.

@oeyh oeyh self-assigned this Sep 2, 2018

oeyh commented Sep 3, 2018

Further observation: method='CG' works for all cases i in range(9) except i=8, which fails with the error: shapes (5000,401) and (1,1,401) not aligned: 401 (dim 1) != 1 (dim 1)
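The error text above has the form NumPy produces when `dot` is called with operands whose inner dimensions disagree. Here is a minimal sketch that triggers the same kind of failure outside of `minimize`, assuming the cost function multiplies with `X.dot(theta)`. The arrays are zero-filled placeholders, not the assignment's data.

```python
import numpy as np

# Placeholder arrays with the shapes quoted in the error message (assumed, not real data).
X = np.zeros((5000, 401))
theta_1d = np.zeros(401)                         # the shape the cost function expects
theta_3d = theta_1d[np.newaxis, np.newaxis, :]   # shape (1, 1, 401), as in the error

# A 1D theta multiplies cleanly.
ok_shape = X.dot(theta_1d).shape

# A (1, 1, 401) theta does not: dot pairs the last axis of X (401)
# with the second-to-last axis of theta_3d (1).
raised = False
try:
    X.dot(theta_3d)
except ValueError as err:
    raised = True
    print(err)
```

This only confirms that a theta with two extra leading axes reaches the matrix product. It does not show where those axes come from.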

oeyh added a commit that referenced this issue Sep 3, 2018
@oeyh oeyh added the wontfix and on hold labels and removed the wontfix label Sep 8, 2018

oeyh commented Sep 8, 2018

This seems to be a bug in scipy.

oeyh added a commit that referenced this issue Sep 9, 2018

oeyh commented Sep 9, 2018

Observation: in certain circumstances, scipy.optimize.minimize(..., method='CG', ...) adds dimensions to x0 (here x0=theta0), which makes the matrix multiplication in the cost function fail.

Temporary workaround: in my cost function, ravel the theta argument first to make sure it is 1D, then reshape it into a 2D array (column vector).
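A minimal sketch of that workaround follows. The function `lr_cost_function`, the synthetic data, and the regularization value are hypothetical stand-ins, not the assignment's actual code. The sketch also returns the cost as a plain float and the gradient as a 1D array, which is the return shape scipy documents for `fun` and `jac`. Whether that detail matters for the error in this issue is an assumption and was not verified here.

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lr_cost_function(theta, X, y, lmd):
    """Regularized logistic regression cost and gradient (hypothetical stand-in)."""
    m = X.shape[0]
    # Workaround: flatten whatever shape arrives, then make a column vector.
    theta = np.ravel(theta).reshape(-1, 1)
    y = np.ravel(y).reshape(-1, 1)
    h = sigmoid(X.dot(theta))
    eps = 1e-12  # guards against log(0)
    reg = lmd / (2.0 * m) * np.sum(theta[1:] ** 2)  # bias term is not regularized
    cost = -np.sum(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps)) / m + reg
    grad = X.T.dot(h - y) / m
    grad[1:] += lmd / m * theta[1:]
    # Return a scalar cost and a 1D gradient.
    return float(cost), grad.ravel()

# Small synthetic problem, made up purely for illustration.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((200, 1)), rng.normal(size=(200, 3))])
true_theta = np.array([0.5, 1.0, -2.0, 0.0])
ylabel = (sigmoid(X.dot(true_theta)) > rng.uniform(size=200)).astype(float)
theta0 = np.zeros(X.shape[1])

initial_cost, _ = lr_cost_function(theta0, X, ylabel, 1.0)
result = minimize(lr_cost_function, theta0, args=(X, ylabel, 1.0),
                  method='CG', jac=True, options={'maxiter': 1000})
print(result.x.shape, result.fun)
```

Because the reshaping happens inside the cost function, the same function can be passed unchanged to method='TNC' or method='CG'.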

More comments:

  1. I dug deep into scipy's source code hoping to find evidence that it adds a dimension to x0 by mistake, but couldn't find any. The source code is still too hard for me to read. In the future, if possible, I'd like to find evidence, submit a bug report, and maybe even try to fix it and submit a pull request.
  2. CG refers to the conjugate gradient method; for more info, see the Wikipedia page: https://en.wikipedia.org/wiki/Conjugate_gradient_method
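One way to gather the evidence mentioned in item 1 without reading scipy's source is to record the shape of every theta that `minimize` passes to the cost function. The sketch below is a hypothetical diagnostic: `traced`, `seen_shapes`, and the toy quadratic objective are illustrative names and not part of the assignment or of scipy.

```python
import numpy as np
from scipy.optimize import minimize

seen_shapes = set()

def traced(fun):
    """Wrap a cost function and record the shape of each theta it receives."""
    def wrapper(theta, *args):
        seen_shapes.add(np.shape(theta))
        return fun(theta, *args)
    return wrapper

def quadratic(theta, target):
    """Toy objective: squared distance to target, with its gradient."""
    diff = np.ravel(theta) - target
    return float(diff.dot(diff)), 2.0 * diff

target = np.array([1.0, -2.0, 3.0])
result = minimize(traced(quadratic), np.zeros(3), args=(target,),
                  method='CG', jac=True)
print(seen_shapes)
```

Wrapping the real cost function the same way and inspecting `seen_shapes` after the failing case (i=8) would show whether a theta with extra dimensions is ever passed in, and could form the core of a bug report.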

@oeyh oeyh closed this as completed Sep 9, 2018
@oeyh oeyh removed the on hold label Sep 9, 2018