Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document LocalCoordinateCoding #3711

Merged
merged 13 commits into from
May 24, 2024
Merged

Document LocalCoordinateCoding #3711

merged 13 commits into from
May 24, 2024

Conversation

rcurtin
Copy link
Member

@rcurtin rcurtin commented May 19, 2024

LocalCoordinateCoding is very similar to SparseCoding, so I went ahead and documented it using the sparse coding documentation as a starting point. Essentially the documentation is the same, since the algorithms are just variations of each other and present identical APIs.

Here is a rendered version:
https://www.ratml.org/misc/mlpack-markdown-doc/user/methods/local_coordinate_coding.html

Some notes about code changes that had to happen and other changes too:

  • Added MatType template parameter to LocalCoordinateCoding and adapted tests.
  • Found some bugs for when too much regularization is applied (and some atoms are unused); these were relatively simple bugfixes.
  • Realized some things could be improved in the sparse coding documentation too, so I made those fixes.
  • Found some bad HTML links in core.md that came from a different PR, but that PR had passed the documentation build... so I am going to look carefully at the output of the CI job this time and see if it is just ignoring failures or something.

@shrit
Copy link
Member

shrit commented May 19, 2024

@rcurtin is this ready for review ?

Copy link
Member

@shrit shrit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, one thing I was wondering when I was reading the docs is that technically we should be able to serialize the model with arma::fmat with no problem, or is there anything else that we need to add somewhere?

@@ -127,7 +127,7 @@ class SparseCoding
*
* If you want to initialize the dictionary to a custom matrix, consider
* either writing your own DictionaryInitializer class (with void
* Initialize(const arma::mat& data, arma::mat& dictionary) function), or call
* Initialize(const MatType& data, MatType& dictionary) function), or call
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch !

mat X;
X.load("mnist_first250_training_4s_and_9s.arm");
mat inX; // The .arm file is saved as an arma::mat.
inX.load("mnist_first250_training_4s_and_9s.arm");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never heard of this extension before, 😄
@conradsnicta should patent it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data saved in Armadillo's native arma_binary format is platform dependent. It's a "quick-n-dirty" binary format that's meant to be more efficient than CSV or ascii, and less finicky than HDF5. This comes at the cost of being platform and/or OS dependent. For example, stuff saved on an ARM machine may not load on an x86-64 machine and vice-versa.

This is not a problem in many cases, but it's probably not a good idea to have it as part of official mlpack tests, which are meant to work on many platforms. For better portability, suggest to use the CSV format instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. In 2c035b5, I converted the test data to be stored in .csv instead of the Armadillo binary format. It's very slightly larger in the .tar.bz2 (like 20kb), but I don't think that's a big deal.

Copy link

@mlpack-bot mlpack-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second approval provided automatically after 24 hours. 👍

@rcurtin
Copy link
Member Author

rcurtin commented May 21, 2024

@mlpack-jenkins test this please

@rcurtin rcurtin merged commit 65d70a0 into mlpack:master May 24, 2024
1 of 6 checks passed
@rcurtin rcurtin deleted the lcc-doc branch May 24, 2024 13:19
This was referenced May 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants