Very Slow Insertion on Windows #789

Open
gdaniel opened this issue Dec 9, 2016 · 2 comments


gdaniel commented Dec 9, 2016

Hi,

I am using MapDB and am seeing strange behavior on Windows. Here is my experimental setup:

I run the same program (see below) on Linux (Fedora 20) and on Windows 7. On Fedora the execution is fast and everything behaves as expected. On Windows, however, it becomes very slow: I stopped it after around 10 minutes, when only 25% of the elements had been inserted.

Note that I ran both experiments on the same computer so that the hardware is identical. I also asked other people to run the program, to check whether the problem is specific to my computer or a general issue, and their results were the same.

More surprisingly, if we execute this code on Windows but inside a Docker container running Linux, the performance is good, as expected.
Similarly, if I run the experiment from Linux but write the database to the Windows partition, everything is fine too (this shows that the file system type is not the problem).

I am wondering whether there is some Windows system setting to take into account, or anything else Windows-specific. For now I am investigating the Java implementation and the way each system manages memory-mapped files, but I haven't found an answer.

I am not trying to get the best possible performance for this toy application; I just want to understand why there is such a difference between Linux-based and Windows-based platforms.

I am running MapDB 3.0.2, with JDK8. JVM parameters used: -Xmx8g -XX:+UseConcMarkSweepGC -XX:MaxDirectMemorySize=3G

Thank you!

Gwendal

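import java.io.File;
import java.io.IOException;
import java.util.UUID;
import org.mapdb.DB;
import org.mapdb.DBMaker;
import org.mapdb.HTreeMap;

public class SlowInsertTest {
	public static void main(String[] args) throws IOException {
		DB db = DBMaker.fileDB(new File("test")).make();
		HTreeMap<String, String> map = (HTreeMap<String, String>) db.hashMap("test").createOrOpen();

		// Insert 4,000,000 random key/value pairs, committing every 100,000 insertions.
		for (int i = 0; i < 4000000; i++) {
			map.put(UUID.randomUUID().toString(), UUID.randomUUID().toString());
			if (i % 100000 == 0) {
				db.commit();
				System.out.println("commit " + i);
			}
		}
		db.commit();
		db.close();
	}
}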
@scottcarey

File sync, and the OS / file system settings related to it, can make a massive difference in performance (and in durability / corruption if the process or OS crashes).

The first thing to do is compare the I/O operations per second and the disk I/O utilization in your two cases. I suspect that in the Windows case it is forcing a flush to disk on each commit, and in the Linux case it is not.
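To get a feel for how much a forced flush can cost on a given machine, independent of MapDB, here is a minimal JDK-only sketch. It does not reproduce what MapDB does on commit; it only writes the same records twice, once without and once with `FileChannel.force(true)` after each batch. The class name, `RECORD_SIZE`, and the batch sizes are made up for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SyncCostSketch {
    // Hypothetical record size: roughly two 36-character UUID strings.
    public static final int RECORD_SIZE = 72;

    /**
     * Writes batches * recordsPerBatch zero-filled records to the file.
     * If forceEachBatch is true, the channel is forced to disk after every batch.
     * Returns the elapsed time in nanoseconds.
     */
    public static long writeBatches(Path file, int batches, int recordsPerBatch,
                                    boolean forceEachBatch) throws IOException {
        ByteBuffer record = ByteBuffer.allocate(RECORD_SIZE);
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (int b = 0; b < batches; b++) {
                for (int r = 0; r < recordsPerBatch; r++) {
                    record.clear(); // reset position so the full record is written again
                    while (record.hasRemaining()) {
                        ch.write(record);
                    }
                }
                if (forceEachBatch) {
                    ch.force(true); // ask the OS to flush data and metadata to the device
                }
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sync-cost", ".bin");
        file.toFile().deleteOnExit();
        long unsynced = writeBatches(file, 200, 100, false);
        long synced = writeBatches(file, 200, 100, true);
        System.out.printf("no force: %d ms, force per batch: %d ms%n",
                unsynced / 1_000_000, synced / 1_000_000);
    }
}
```

Running this on both operating systems, on the same disk, gives a rough baseline for how expensive a forced flush is on each before looking at MapDB itself.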

Also, try disabling memory mapping. Compare memory mapped (fileMmapEnable) to file channel (fileChannelEnable). In my experience RandomAccessFile is horribly slow (and it is single threaded, synchronizing all access).
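The three access paths mentioned here can also be compared with the plain JDK, outside MapDB. The sketch below is only an illustration under assumptions: it performs the same pseudo-random 8-byte writes through `RandomAccessFile`, a positional `FileChannel` write, and a `MappedByteBuffer`, and times each. The class name, `FILE_SIZE`, and the write count are hypothetical, and the numbers say nothing about MapDB's own store implementations.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

public class IoModeSketch {
    public static final int FILE_SIZE = 4 * 1024 * 1024; // hypothetical 4 MiB test file

    /** Random 8-byte writes using RandomAccessFile.seek + writeLong. */
    public static long viaRandomAccessFile(Path file, int writes) throws IOException {
        Random rnd = new Random(42);
        long start = System.nanoTime();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(FILE_SIZE);
            for (int i = 0; i < writes; i++) {
                raf.seek(rnd.nextInt(FILE_SIZE - 8));
                raf.writeLong(i);
            }
        }
        return System.nanoTime() - start;
    }

    /** The same writes using positional FileChannel.write. */
    public static long viaFileChannel(Path file, int writes) throws IOException {
        Random rnd = new Random(42);
        ByteBuffer buf = ByteBuffer.allocate(8);
        long start = System.nanoTime();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
             FileChannel ch = raf.getChannel()) {
            raf.setLength(FILE_SIZE);
            for (int i = 0; i < writes; i++) {
                buf.clear();
                buf.putLong(i);
                buf.flip();
                long pos = rnd.nextInt(FILE_SIZE - 8);
                while (buf.hasRemaining()) {
                    pos += ch.write(buf, pos);
                }
            }
        }
        return System.nanoTime() - start;
    }

    /** The same writes through a memory-mapped buffer. */
    public static long viaMappedBuffer(Path file, int writes) throws IOException {
        Random rnd = new Random(42);
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
             FileChannel ch = raf.getChannel()) {
            raf.setLength(FILE_SIZE);
            long start = System.nanoTime();
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, FILE_SIZE);
            for (int i = 0; i < writes; i++) {
                map.putLong(rnd.nextInt(FILE_SIZE - 8), i);
            }
            long elapsed = System.nanoTime() - start;
            map.force(); // flush outside the timed region so later reads see the data
            return elapsed;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("io-mode", ".bin");
        file.toFile().deleteOnExit(); // a mapped file may not be deletable immediately on Windows
        int writes = 100_000;
        System.out.println("RandomAccessFile: " + viaRandomAccessFile(file, writes) / 1_000_000 + " ms");
        System.out.println("FileChannel:      " + viaFileChannel(file, writes) / 1_000_000 + " ms");
        System.out.println("MappedByteBuffer: " + viaMappedBuffer(file, writes) / 1_000_000 + " ms");
    }
}
```

All three methods use the same seed and byte order, so they produce identical file contents; only the access path differs.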

One can mount a file system in Linux such that it won't properly sync files to disk on sync, which makes it rather fast but can also corrupt or lose data on a crash.

@derreisende77

It doesn't matter whether you are using the memory-mapped store or the file channel. The performance on Windows 10 / Windows 7 with jdk8u152 is about 1/3 of the speed on Linux and macOS. I had to use a memoryDirectDB store on Windows just to achieve usable performance. The file-based store is almost unusable on Windows :(
