Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Demo program crashes on windows #21

Closed
khoran opened this issue May 22, 2017 · 17 comments
Closed

Demo program crashes on windows #21

khoran opened this issue May 22, 2017 · 17 comments

Comments

@khoran
Copy link

@khoran khoran commented May 22, 2017

I'm using Windows 8, R 3.4.0, and RcppAnnoy 0.0.8. I copied the code from demo/SimpleExample.R and modified the paths for "test.tree" to be just a filename (to avoid any path separator issues). The code runs fine until the very last line: "print(b$getNNsByItem(0, 40))". That line never returns, instead windows displays a crash report for R:

Problem signature:
Problem Event Name: APPCRASH
Application Name: rsession.exe
Application Version: 1.0.143.0
Application Timestamp: 58efc68f
Fault Module Name: RcppAnnoy.dll
Fault Module Version: 0.0.0.0
Fault Module Timestamp: 5921a5a5
Exception Code: c0000005
Exception Offset: 0000000000023e88
OS Version: 6.2.9200.2.0.0.256.48
Locale ID: 1033
Additional Information 1: 95c0
Additional Information 2: 95c0dadcc2e0790a3d6a0db9476ff7f7
Additional Information 3: c127
Additional Information 4: c127ededbb4a08cca6b24a2b1617bfb1

I get the same behavior whether run from the command line or Rstudio. The problem also occurred with R 3.3.0.

@eddelbuettel
Copy link
Owner

@eddelbuettel eddelbuettel commented May 23, 2017

Hm. Works for me on Linux.

@khoran
Copy link
Author

@khoran khoran commented May 23, 2017

@eddelbuettel
Copy link
Owner

@eddelbuettel eddelbuettel commented May 23, 2017

Can you debug? I don't really have access to Windows, and I don't think Erik uses it much either.

@khoran
Copy link
Author

@khoran khoran commented May 24, 2017

@bnbwn
Copy link

@bnbwn bnbwn commented Jun 15, 2017

Same crash as you Khoran on windows. (for others Win10 64, Rstudio Version 1.0.143, Rx64 3.4, RcppAnnoy 0.0.8)

If I had the chops to fix it I would. This would be great to have on windows, however, thank you Dirk for your work.

Any update khoran?

@eddelbuettel
Copy link
Owner

@eddelbuettel eddelbuettel commented Jun 15, 2017

Well if one of you guys could get out the debugger ...

@bnbwn
Copy link

@bnbwn bnbwn commented Jun 15, 2017

Just to be sure, and as I am learning, which is correct? A or B?

A) One must have previously installed and have a currently functioning Annoy library installed and loaded into one's python environment. Then, RcppAnnoy is how to leverage that existing functionality from within R.

B) Simply installing and loading the RccpAnnoy package into R should duplicated the functionality of installing annoy into python?

If it is A, I believe windows users are stuck as annoy's annoylib.h references #include <unistd.h> which isn't available for windows, and for places I have looked, one or two seemed to have ported it over to windows, but I was unable to get it working.

@eddelbuettel
Copy link
Owner

@eddelbuettel eddelbuettel commented Jun 15, 2017

No, it definitely B). Look at a recursive directory tree -- we have a complete and distinct copy of Annoy in RcppAnnoy.

Often these things go away via a simple recompliation. Sometimes it is something else. Hard to tell from here (on mostly Linux).

@khoran
Copy link
Author

@khoran khoran commented Jun 15, 2017

@khoran
Copy link
Author

@khoran khoran commented Jun 23, 2017

I've narrowed the problem to the mmap function in the load function in inst/include/annoylib.h. On windows it seems to load garbage. To test, I print out the n_descendants variable on each node in the tree that is stored in the mmap'd file. On Linux I get:

loaded 262 nodes
node[261] descendants: 50
node[260] descendants: 50
node[259] descendants: 50
...

But on windows the output is:

loaded 262 nodes
node[261] descendants: 651083354
node[260] descendants: -1011663796
node[259] descendants: 1254342341
node[258] descendants: -1255948749
node[257] descendants: -1103723859
node[256] descendants: -855229048
node[255] descendants: 224771609
...

I looked at the latest annoy code from spotify but didn't see any changes in the load function. I also looked for the most recent changes in the mman-win32 github repo, which defines the mmap function. There was one patch that seemed relevant, but after applying it, it didn't help.

I've created a fork with lots of debugging output and the above changes, if anyone wants to do more digging. https://github.com/khoran/rcppannoy. The "test.R" script in the root directory will run the test.

I'm not sure what else to do at this point. I'm not that familiar with mmap and it seems like I'd have to analyze the memory structure of their tree to see where things are going wrong.

@eddelbuettel
Copy link
Owner

@eddelbuettel eddelbuettel commented Jun 23, 2017

Could well be a 32 bit versus 64 bit issue. This was said to work, and the package installs (and tests) on CRAN for windows.

@khoran
Copy link
Author

@khoran khoran commented Jun 30, 2017

Found the fix! In the save function, the fwrite call is not writing in binary mode. As a result, on windows it translates all 0x0a bytes to 0x0d 0x0a in the output ( NL to CR NL). So if you had an input which did not produce that byte, it would work, but other inputs would fail.

The fix is the change "w" to "wb" on line 322 of annoylib.h. Checking the original source in spotfy's code shows they also are using "wb".

@eddelbuettel
Copy link
Owner

@eddelbuettel eddelbuettel commented Jul 1, 2017

Very nice catch, thanks so much -- will make that fix.

@khoran
Copy link
Author

@khoran khoran commented Aug 28, 2017

Thanks for making this change, and I see you've been do a bit other work on this package too. Would it be possible to get this new version deployed in CRAN before the next BioConductor release (usually around Sept/Oct)? That way I can just put in a dependency on the new version. Thanks.

@eddelbuettel
Copy link
Owner

@eddelbuettel eddelbuettel commented Aug 28, 2017

Thanks for poking / reminding.

I should be able to release what we have "soon". I should probably check with @erikbern / upstream to see which changes he made there which I may want to carry over.

@eddelbuettel
Copy link
Owner

@eddelbuettel eddelbuettel commented Aug 31, 2017

Hi again -- you tested this again too, right? I think the package is all clean, tests file and all. I diff-ed against upstream again and don't need changes. But I was thinking to maybe extend the unit tests some.
However, I could also do that once uploaded. Any strong feelings either way from your end?

@eddelbuettel
Copy link
Owner

@eddelbuettel eddelbuettel commented Aug 31, 2017

I shipped it. Should hopefully be on CRAN soon as 0.0.9.

Thanks again for your help with this issue, and particularly for patiently finding the cause. Much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.