Performance on Windows JDK 8 2-3x slower #19

Closed
joinr opened this issue Sep 3, 2020 · 7 comments

Comments

@joinr

joinr commented Sep 3, 2020

Posting here for posterity, as referenced in the Reddit thread about datalevinbench, the benchmark fork with Datahike added.

The slowdown was not reproduced on Ubuntu with the same JDK, so it appears to be an lmdb-java Windows problem.
I tried the new lmdb-java native libs per lmdb-java #148 to see whether they would have an impact, to no avail. Datalevin ran fine with lmdb-java 0.9.24-1, though (benchmarks completed).

@huahaiy
Contributor

huahaiy commented Sep 3, 2020

Thank you very much for including Datahike in the benchmark and for reporting this performance issue on Windows. I would appreciate it if you could update your Reddit post to include your Ubuntu run numbers as well. More data points help everyone. I will investigate the problem when I get hold of a Windows machine.

@joinr
Author

joinr commented Sep 3, 2020

No problem. Updated in the Reddit thread and in the datalevinbench repo README.

@huahaiy
Contributor

huahaiy commented Sep 3, 2020

Thank you. @joinr I appreciate it.

@huahaiy
Contributor

huahaiy commented Sep 3, 2020

It may not be a performance issue. Please take a look at joinr/datalevinbench#2

@joinr
Author

joinr commented Sep 4, 2020

That was it indeed. The assumption about the temp file directory, /tmp, doesn't hold on Windows. This led to an ever-growing database. The interesting news is that a db about 13x the size was still reading about 3x slower than DataScript. With a fresh db, the results conform to your reported baseline. No problem with LMDB on Windows, just tempfile expectations :)

I added new-db to the benchmark to delete the old db (similar to what Datahike is doing), which technically adds a little to the measurements, but the results are apparently identical (i.e. deletion isn't a problem). The "real" fix would be to implement a portable temp file pathway for the db. I think the storage/open function called with a nil directory creates a tmp/uuid file, so that would probably be the only place to define a wrapper that leverages (System/getProperty "java.io.tmpdir") to construct a portable temp directory.
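
For anyone landing here later, a minimal sketch of what such a wrapper might look like. The name tmp-db-dir and the "datalevin-bench-" prefix are made up for illustration; this is not Datalevin's actual code, and how storage/open would consume the path is not shown.

```clojure
(require '[clojure.java.io :as io])
(import 'java.util.UUID)

;; Hypothetical helper (not part of Datalevin): build a unique directory
;; path under the JVM's platform temp dir instead of a hard-coded /tmp.
(defn tmp-db-dir
  []
  (let [base (System/getProperty "java.io.tmpdir")   ; platform-specific temp dir
        dir  (io/file base (str "datalevin-bench-" (UUID/randomUUID)))]
    ;; Only the path is built here; the directory itself is not created.
    (.getPath dir)))
```

Each call returns a different path, so a benchmark run that asks for a fresh directory would not silently reuse (and keep growing) a db left over from an earlier run.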

@joinr
Author

joinr commented Sep 4, 2020

Updated the Reddit thread and README for posterity.

@huahaiy
Contributor

huahaiy commented Sep 4, 2020

Great. Thanks!

2 participants