Update binding to support optional arguments with `getNNsBy*` methods, with code #12

Closed
mikepb opened this Issue Feb 1, 2016 · 22 comments

Projects

None yet

3 participants

@mikepb
Contributor
mikepb commented Feb 1, 2016

I managed to update the C++ code to use a template and changed getNNsBy* to accept search_k and include_distances with defaults:
https://github.com/mikepb/rcppannoy

However, the changes break the API:

# Before:
# a <- new(AnnoyEuclidean, f)
# After:
a <- AnnoyIndex(f, "euclidean")

I've only been using R for months and I'm not very familiar with the R class system. Do you think there could be a way to merge the changes and maintain backwards compatibility?

@eddelbuettel
Owner

Can you point at a more succinct set of changes?

What you show is the exposed R interface implemented via Rcpp Modules. I'd rather not change that.

@mikepb
Contributor
mikepb commented Feb 1, 2016

Fair enough. What do you think about adding two methods?

master...mikepb:patch-include-distances

@eddelbuettel
Owner

From the top of my head -- looks good! And I love that RcppAnnoy is used. Let me take another look, maybe this evening. In general, adding extra methods is simple and does not take away from existing interfaces so that would be doable.

Does the interface you suggest adding exist at the Python side too or is it a C++-only change?

@mikepb
Contributor
mikepb commented Feb 1, 2016

The Python method uses the get_nns_by* methods with optional arguments, I think. The original equivalents for RcppAnnoy fixed the values of search_k=-1 and include_distances=NULL. If we could support formal arguments on the C++ side, that would probably be ideal!

@khoran
khoran commented Sep 28, 2016

Any chance these changes might be merged in and available in CRAN by next March (2017)? I would like to use these extra arguments in an R package being hosted on Bioconductor. They require any dependencies be available in CRAN or Bioconductor.
I had also asked mikepb about this a few months ago (mikepb@847017c) and he suggested submitting my own package to CRAN. I could certainly do that, but it just seems like it would not be very nice to have two packages in CRAN that are nearly identical. My other option is to just embed this package inside my own package, which I'm happy to do if that's best.
I'd also be happy to help do the merge and submission to CRAN if its just a matter of you having the time. Thanks.

@eddelbuettel
Owner

Yes, we should be able to accomodate this here.

I have to reacquaint myself with that Mike wrote above to catch up on where/how/why the API would change.

One thing would be to just add a new function with the new interface if the old function cannot be retrofitted.

@khoran
khoran commented Sep 28, 2016

Great, thanks!

@eddelbuettel
Owner

Can you help and lay out what you need so that we can work towards including it or a pull request?

@eddelbuettel
Owner

Looking at this again ... I must have gotten too busy in the spring. This should really be doable. Thanks for the reminder, and for Mike to write it.

@khoran
khoran commented Sep 28, 2016

I've been using Mike's fork locally to develop. I need both the search_k option and the option to include distances in the result.
Here is an example call:
neighbors = index$getNNsByVector(queries[,i],numNeighbors,searchK, TRUE)
I also use the getNNsByItem function:
neighbors = index$getNNsByItem(i-1,numNeighbors,searchK, TRUE)

Not sure if that's what you were looking for or not. It seems like there were some syntax differences discussed previously. I happy to use whatever function/syntax works best for the package.

@eddelbuettel
Owner

Ok -- I'll create a branch and will insert his code there.

@eddelbuettel
Owner
eddelbuettel commented Sep 28, 2016 edited
@eddelbuettel
Owner

Now, your changes would indeed be a non-backwards compatible change. Those are generally frowned upon.

I also like have this templated here ... as it is a simple example of template use with modules. Can you possibly work with the RcppAnnoy mainline ?

@khoran
khoran commented Sep 28, 2016

Are you referring to my package having the non-backwards compatible change? If so, that's no problem. I haven't yet deployed any code using this library, that's scheduled for the March release of bioconductor. So I can use the syntax in the branch you just setup. I think those are compatible with the main version of RcppAnnoy, correct?
I've tested that branch also and it seems to work. Thanks for the quick response.

@eddelbuettel
Owner

Great. Yes, so far I just added two new functions ending in L -- may make that end in List instead.

They would not alter existing interfaces so we could have this up on CRAN in days.

@khoran
khoran commented Sep 28, 2016

However you want to name it is fine with me.

@eddelbuettel
Owner

Concise enough, and different enough. I am just thrilled that you are finding use for in in BioC. Did you try some other (approximate)NN methods? Erik has some nice benchmark charts...

@khoran
khoran commented Sep 28, 2016

The package I'm using it in is eiR, which allows you to create a searchable database of millions of chemical compounds. I was using LSH-KIT previously, but that is very old now. I looked on Eric's page and several of the aNN methods mentioned there. I was looking for something with reasonable performance and that would not be too hard for me (or the bioC guys) to make use of in R. This package certainly made the R integration part easy! My tests so far show it to be slightly better than LSH-KIT.

@eddelbuettel
Owner

Could you take what is now version 0.0.7.1 in master for a spin? Unless it needs extra work it could probably go to CRAN now.

@khoran
khoran commented Sep 30, 2016

Looks good to me.

On 09/29/2016 06:53 PM, Dirk Eddelbuettel wrote:

Could you take what is now version 0.0.7.1 in master for a spin?
Unless it needs extra work it could probably go to CRAN now.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#12 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABRzuCxPYMLUyR0MNnxMwVKWLSqd31Lpks5qvGuZgaJpZM4HQwUn.

@eddelbuettel
Owner

Great. I have to update one internal file (for Travis), look it over and then ship. Likely this evening.

@eddelbuettel
Owner

This was added in this commit which became part of PR #16. Thanks for suggesting it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment