Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow input and output parameters to be specified with PARAM_* macros #723

Merged
merged 22 commits into from Jul 20, 2016

Conversation

rcurtin
Copy link
Member

@rcurtin rcurtin commented Jul 15, 2016

This is part one of the automatic bindings project. I've taken all of the macros like PARAM_STRING and refactored them into PARAM_STRING_IN and PARAM_STRING_OUT, to denote which are input parameters and which are output parameters. This information will be necessary when I am generating bindings to other languages.

I then had to go through all of the command-line programs and adapt the options. In some cases, I had to change the name of parameters, but I preserved the old versions for reverse-compatibility (so there is some code we will be able to remove for mlpack 3.0.0). Any input string parameters that are filenames must have names that end in _file.

Note that while there is, e.g., PARAM_STRING_IN_REQ() (a required input parameter), there is no PARAM_STRING_OUT_REQ() since I don't think it's reasonable to require that users handle their output. That means that some programs, like mlpack_preprocess_split and others, have been changed to issue warnings instead of fatal errors when output parameters are not specified.

When there is an output parameter that does not end in _file, its value will be printed to stdout at the termination of the program. Right now I think only mlpack_hmm_loglik uses a non-filename output parameter.

I've tested each of the command-line programs to ensure that they still function, and in some cases I've fixed a few bugs with them.

@KeonKim: my work affects mlpack_preprocess_split and mlpack_preprocess_binarize more than trivially, so let me know if any of my changes are problematic.
@mentekid: when I merged lsh_main.cpp with multiprobe and recall computation support, the merge was imperfect. I think I fixed everything, but can you take a quick look and make sure everything still works properly?
@zoq: I could not get the mlpack_rmva program to run correctly ("Target class out of range"), but I couldn't get it to work before, either, so I don't know if my changes have broken anything.

The only reason all these checks are necessary is that we don't have a good infrastructure for testing command-line executables. But, this change here is the first in a long set of steps to getting that infrastructure built.

(The next part of the plan, not included in this PR, is to change PARAM_STRING_IN("input_file", "...", "i"); to PARAM_MATRIX_IN("input", "...", "i"); and then the input matrix will be available directly with CLI::GetParam<arma::mat>("input");, and the help dialog will automatically display an entry for --input_file, so that the changes are transparent to the user.)

@mentekid
Copy link
Contributor

Regarding Multiprobe: I don't see anything broken, everything seems to work well (parameters seem to get passed fine to the objects etc).

I am getting some relatively low recall rates for a few datasets but once I change the default parameters (K/L) things get better.
So I think from my PoV this is fine :)

@zoq
Copy link
Member

zoq commented Jul 16, 2016

@rcurtin The rmva model expects the responses to be in range [1, 2, .., #classes], I think you tested it with [0, 1, .., #classes]?

* @param required If required, the program will refuse to run unless the
* parameter is specified.
* @param input If true, the parameter is an input parameter (not an output
* parameter).
*/
template<typename T>
void CLI::Add(const std::string& identifier,
const std::string& description,
const std::string& alias,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's some inconsistency from the old code, but I think we should use parent here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look in 7a34fe3; I think that the parent term is totally outdated and should actually be alias, since what is actually being passed there is the short option name (i.e. -i instead of --input_file). Originally the idea was that the command-line interface would support arbitrarily hierarchical parameters, so each parameter needed to have some parent name associated with it. But this idea ended up being trashed, so we only have a single level of parameters for the sake of simplicity. Only these few lines of comments remained.

@rcurtin
Copy link
Member Author

rcurtin commented Jul 18, 2016

@zoq: thanks for the clarification; I got the rmva program to work right once I made the changes in c852746... good thing I checked. This should be ready to merge, if there are no more comments.

Every floating-point literal in C++ is assumed to be double unless otherwise
specified, so there is no reason to specify anything here.

This should fix the Windows build.
(this should trigger the AppVeyor build again)
@zoq
Copy link
Member

zoq commented Jul 18, 2016

Looks good, no further comments from me.

@rcurtin rcurtin merged commit 0d9a0e2 into mlpack:master Jul 20, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants