Update binding to support optional arguments with `getNNsBy*` methods, with code #12
Comments
|
Can you point at a more succinct set of changes? What you show is the exposed R interface implemented via Rcpp Modules. I'd rather not change that. |
|
Fair enough. What do you think about adding two methods? |
|
Off the top of my head -- looks good! And I love that RcppAnnoy is used. Let me take another look, maybe this evening. In general, adding extra methods is simple and does not take away from existing interfaces, so that would be doable. Does the interface you suggest adding exist on the Python side too, or is it a C++-only change? |
|
The Python method uses the |
|
Any chance these changes might be merged in and available in CRAN by next March (2017)? I would like to use these extra arguments in an R package being hosted on Bioconductor. They require any dependencies be available in CRAN or Bioconductor. |
|
Yes, we should be able to accommodate this here. I have to reacquaint myself with what Mike wrote above to catch up on where/how/why the API would change. One option would be to just add a new function with the new interface if the old function cannot be retrofitted. |
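To make the "add a new function" option concrete, here is a minimal C++ sketch. It is not the RcppAnnoy or Annoy code: `ToyIndex` is a brute-force stand-in, the method name `getNNsByVectorWithOptions` is hypothetical, and treating `search_k = -1` as "use the library default" is an assumption. The only point it illustrates is that the existing method keeps its signature while a second method carries the `search_k` and `include_distances` arguments.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for an Annoy index: exact brute-force search, no trees.
// All names here are hypothetical and are not the RcppAnnoy API.
class ToyIndex {
public:
    // Neighbour ids plus (optionally) their distances to the query.
    struct Result {
        std::vector<int> ids;
        std::vector<double> distances;
    };

    explicit ToyIndex(std::size_t dim) : dim_(dim) {}

    void addItem(const std::vector<double>& v) {
        assert(v.size() == dim_);
        items_.push_back(v);
    }

    // Existing interface: signature unchanged, returns item ids only.
    std::vector<int> getNNsByVector(const std::vector<double>& v,
                                    std::size_t n) const {
        return search(v, n).ids;
    }

    // New, additive method with the optional arguments.
    // search_k is accepted only to mirror the discussed signature;
    // this brute-force sketch ignores it (assumption: -1 means "default").
    Result getNNsByVectorWithOptions(const std::vector<double>& v,
                                     std::size_t n,
                                     int search_k = -1,
                                     bool include_distances = false) const {
        (void)search_k;
        Result r = search(v, n);
        if (!include_distances) r.distances.clear();
        return r;
    }

private:
    // Score every item by Euclidean distance and keep the n closest.
    Result search(const std::vector<double>& v, std::size_t n) const {
        assert(v.size() == dim_);
        std::vector<std::pair<double, int>> scored;
        for (std::size_t i = 0; i < items_.size(); ++i) {
            double sum = 0.0;
            for (std::size_t j = 0; j < dim_; ++j) {
                const double d = items_[i][j] - v[j];
                sum += d * d;
            }
            scored.emplace_back(std::sqrt(sum), static_cast<int>(i));
        }
        std::sort(scored.begin(), scored.end());
        if (scored.size() > n) scored.resize(n);

        Result r;
        for (const auto& s : scored) {
            r.ids.push_back(s.second);
            r.distances.push_back(s.first);
        }
        return r;
    }

    std::size_t dim_;
    std::vector<std::vector<double>> items_;
};
```

Callers of `getNNsByVector(v, n)` compile and behave as before, and only code that wants distances or a custom `search_k` has to opt in to the second method.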
|
Great, thanks! |
|
Can you help and lay out what you need so that we can work towards including it or a pull request? |
|
Looking at this again ... I must have gotten too busy in the spring. This should really be doable. Thanks for the reminder, and thanks to Mike for writing it. |
|
I've been using Mike's fork locally to develop. I need both the search_k option and the option to include distances in the result. Not sure if that's what you were looking for or not. It seems like there were some syntax differences discussed previously. I'm happy to use whatever function/syntax works best for the package. |
|
Ok -- I'll create a branch and will insert his code there. |
|
I just added Mike's functions in a new branch: https://github.com/eddelbuettel/rcppannoy/tree/feature/optional_getNNsBy_args |
|
Now, your changes would indeed be a non-backwards-compatible change. Those are generally frowned upon. I also like having this templated here ... as it is a simple example of template use with modules. Can you possibly work with the RcppAnnoy mainline? |
|
Are you referring to my package having the non-backwards-compatible change? If so, that's no problem. I haven't yet deployed any code using this library; that's scheduled for the March release of Bioconductor. So I can use the syntax in the branch you just set up. I think those are compatible with the main version of RcppAnnoy, correct? |
|
Great. Yes, so far I just added two new functions ending in They would not alter existing interfaces so we could have this up on CRAN in days. |
|
However you want to name it is fine with me. |
|
Concise enough, and different enough. I am just thrilled that you are finding use for it in BioC. Did you try some other (approximate) NN methods? Erik has some nice benchmark charts... |
|
The package I'm using it in is eiR, which allows you to create a searchable database of millions of chemical compounds. I was using LSH-KIT previously, but that is very old now. I looked at Erik's page and several of the aNN methods mentioned there. I was looking for something with reasonable performance that would not be too hard for me (or the BioC guys) to make use of in R. This package certainly made the R integration part easy! My tests so far show it to be slightly better than LSH-KIT. |
|
Could you take what is now version 0.0.7.1 in master for a spin? Unless it needs extra work it could probably go to CRAN now. |
|
Looks good to me. |
|
|
Great. I have to update one internal file (for Travis), look it over and then ship. Likely this evening. |
|
This was added in this commit which became part of PR #16. Thanks for suggesting it. |
I managed to update the C++ code to use a template and changed `getNNsBy*` to accept `search_k` and `include_distances` with defaults: https://github.com/mikepb/rcppannoy
However, the changes break the API:
I've only been using R for months and I'm not very familiar with the R class system. Do you think there could be a way to merge the changes and maintain backwards compatibility?