Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

R bindings with documentation #11

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

Yashwants19
Copy link
Owner

No description provided.

@eddelbuettel
Copy link
Collaborator

Oh that will be tricky to review:

image

@Yashwants19
Copy link
Owner Author

I am really sorry, I didn't mention, but I open this PR for checking the package in all the condition. But you can help by reviewing the auto-generated files(R/*.R), and correct me with the auto-generated documentation part.

Thank You :).

@Yashwants19 Yashwants19 reopened this Jul 15, 2020
@Yashwants19 Yashwants19 force-pushed the R-bindings-with-documentation branch from 4e59b66 to 0379e5f Compare July 17, 2020 08:03
src/Makevars Outdated

# strip debug symbols for smaller Linux binaries
strippedLib: $(SHLIB)
if test -e "/usr/bin/strip" & test -e "/bin/uname" & [[ `uname` == "Linux" ]] ; then /usr/bin/strip --strip-debug $(SHLIB); fi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have forgotten what the story was but I think "they" decided to not allow this. Unless you see it used in currently-published packages (RcppArmadillo does not, RcppEnsmallen does not, ...) I would recommend against.

I keep strip instruction in my ~/.R/Makevars instead.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed "strip" in the last commit.

@Yashwants19 Yashwants19 reopened this Jul 18, 2020
@eddelbuettel
Copy link
Collaborator

@Yashwants19 I don't think CRAN will support of the stripper routine by Dan either -- or did you find a package on CRAN using it? He is a friend and former colleague, and if memory serves we later found that the invocation from src/Makevars is equivalent, but as I also recall, it is also not allowed at CRAN.

CRAN submission can be complicated. Hence, I to remove all points of friction. If you added this because R CMD check notes the large file of shared libraries: don't worry. That is tolerated. But CRAN does not like altering the libraries.

@Yashwants19
Copy link
Owner Author

@Yashwants19 I don't think CRAN will support of the stripper routine by Dan either -- or did you find a package on CRAN using it? He is a friend and former colleague, and if memory serves we later found that the invocation from src/Makevars is equivalent, but as I also recall, it is also not allowed at CRAN.

Yes, I found one package(RxODE) that uses rstripper, or I might overlooked something, if so I am really sorry about that.

CRAN submission can be complicated. Hence, I to remove all points of friction. If you added this because R CMD check notes the large file of shared libraries: don't worry. That is tolerated. But CRAN does not like altering the libraries.

Previously, I thought libs size might be a problem for cran submission, but now after your confirmation, I can skip this part. :)

@eddelbuettel
Copy link
Collaborator

Good (thorough) find but it seems like src/Makevars.in in RxODE has strip use commented out in the last CRAN version (viat the mirrored-at-GitHub-CRAN-version).

For questions like the shared library size, just ask us because that is what we're here for! Happy to help, and still lovely to see all the progress.

@Yashwants19
Copy link
Owner Author

I have revert all the changes regarding rstripper. Thanks for guidance @eddelbuettel, you are the best :).

@Yashwants19
Copy link
Owner Author

For questions like the shared library size, just ask us because that is what we're here for! Happy to help, and

My PR(#8) is pending for review, hence I was left with some bandwidth and I thought I can explore some of ways for reducing the size of libs. I did ask here.

still lovely to see all the progress.

Thank You :)

@@ -1,4 +1,4 @@
#PKG_CPPFLAGS = -DARMA_NO_DEBUG -DBOOST_DISABLE_ASSERTS
PKG_CXXFLAGS = -I. $(SHLIB_OPENMP_CXXFLAGS) -ftrack-macro-expansion=0
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing I want to ask here is that -- will cran allow us to use -ftrack-macro-expansion=0 for windows build ? Else we may face some memory issue for windows build, may be due to : https://sourceforge.net/p/mingw-w64/mailman/message/33182613/ or http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56746.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Yashwants19 we'll need to probably communicate with CRAN in-advance. I'm not sure that it will be feasible. That said, the CRAN build servers are stronger than the GitHub Actions/Appveyor build system.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only found one package(ChannelAttribution) here that use this flag, or I am again making some blunder here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blunder, it looks like the check matrix is reporting everything is fine.

https://cran.r-project.org/web/checks/check_results_ChannelAttribution.html

It may be possible to use this flag, but we should only opt-in to flags as a last resort. We're stressing this because any additional compiler flags may impact that package's portability.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so Cf bindings is costing us some heavy memory usage, as by removing cf bindings, windows build is working smoothly on GH too. Hence cran's stronger server might not face this issue even with cf bindings, as I have checked this it is working fine locally with cf bindings too.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I remove this flag in the main repository(Yashwants19/mlpack)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's aim to remove it for now. We can bring it back later if need be.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (aff22c1).

Copy link
Collaborator

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Yashwants19, I took a look through the generated bindings. They all look good to me. The only major question I have is whether or not we include range search as a binding, but that's always something we can remove later, so, up to you what you want to do at this moment. 👍 I'm sure @eddelbuettel and @coatless will have more specific R comments---my R knowledge is such that I can mostly just look at it and say "ok, that should work!" but I don't know what the preferable way to write something would be. :)

IO_RestoreSettings("Hidden Markov Model (HMM) Training")

# Process each input argument before calling mlpackMain().
IO_SetParamString("input_file", input_file)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of our "weird" bindings: because we don't have an automatic binding type for what the HMM training algorithm takes in, we instead just pass a filename to either a dataset or a file containing names of files that contain datasets. I don't think there's anything to do here right now, but I at least wanted to point it out to @eddelbuettel and @coatless. In the long term, this is one of a few bindings that we have to figure out a better representation for.

#' image dataset into individual files, for instance after mlpack methods have
#' been used.
#'
#' @param input Image filenames which have to be loaded/saved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we print in the documentation here what the type of each parameter is? This one is a vector of strings, but I can't see that just from reading this documentation. I'm not sure if R provides facilities for automatically printing the expected type of each input parameter (probably not?).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcurtin unfortunately, that isn't possible as R's type system is weak. There is no int x available. The best that can be done is probably inferring it from the C++ bindings and then saying,

  • If int, then a single integer value,
  • If arma::vec, then a vector of numeric values,
  • ...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah---we can definitely do this from C++ then. All we need to do is associate some documentation string with each C++ type (just like you wrote above in the short list :)) and then we just need to print the associated documentation for each type. @Yashwants19 do you think you could add something like that? Let me know if I can clarify anything. Or if you have a better idea for how to accomplish the same goal go ahead. 👍

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure. Do we want this?

#' @param input Character Vector - Image filenames which have to be loaded/saved.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be sufficient to do this?

#' @param input Image filenames which have to be loaded/saved (character vector).

But in any case I think something like this is probably the best approach.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I will made the respective changes 👍.

Comment on lines +33 to +34
#' \item{distances_file}{File to output distances into. Default value "".}
#' \item{neighbors_file}{File to output neighbors into. Default value "".}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another "weird" binding. In C++, range search doesn't return an arma::mat; it can return a different number of results for each point in the input matrix, so actually it has to return a std::vector<std::vector<size_t>>. For other languages, I've simply not added the range search binding for this reason. Maybe that is the right thing to do for R too?

@@ -10,7 +8,33 @@ Serialize <- function(model, filename) {
model_serialization_function <-
switch(attributes(model)$type,
"GaussianKernel" = SerializeGaussianKernelPtr,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, the GaussianKernel is only needed for the test binding. It doesn't hurt to leave it here, but if you aren't planning on including the test binding in the release, it might be nice to remove the model type from here too (hopefully automatically). Up to you. 👍

IO_ClearSettings()

return(out)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whew! There are a lot of bindings! :)

@Yashwants19 Yashwants19 mentioned this pull request Aug 27, 2020
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants