Support running without DIRECT IO #47

coffeemug · 2012-11-13T03:30:17Z

Running with direct io makes sense in production, but when developers try the product, they often have encrypted/journaled file systems that don't support direct io. We should implement an alternative code path that opens the files without direct io and warns devs that it's a compatibility mode.

frank-trampe · 2013-01-04T23:23:24Z

What was the original reason for having the software fail on lack of O_DIRECT? Was it merely to save time implementing the fallback, or are there things that fail when the file is opened without it?

frank-trampe · 2013-01-05T00:19:21Z

Let me rephrase that so that it doesn't sound like I don't know the importance of O_DIRECT for a database. What was the original reason for not allowing a bypass? Was it effort required to implement the fallback in disk.cc, things that broke elsewhere, or fear of cache crowding and other serious system performance problems?

coffeemug · 2013-01-05T14:18:25Z

Basically, we just didn't think about it :)

frank-trampe · 2013-01-07T17:58:26Z

So the low-level support in disk.cc seems to be present after all, just not commented. The argument is_really_direct to linux_file_t::linux_file_t (which is not Linux-specific anymore) controls whether one requests O_DIRECT or F_NOCACHE. All that we need to do is to implement a choice between the typedefs direct_file_t and nondirect_file_t in log_serializer.cc. We could pass the information to that point (from command_line stuff) via extra function arguments (either a session object or a simple flag for this option) or via a global variable. What would be preferable here?

coffeemug · 2013-01-07T18:30:50Z

I haven't looked at this stuff in a while, but here are two problems that I can think of off the top of my head:

1. F_NOCACHE may not be supported on all platforms.
2. O_DIRECT implies that when one writes a block, the function does not return until the disk driver responds that the block has been committed to disk. This isn't the case without O_DIRECT, and may or may not be the case with F_NOCACHE. We need this behavior for sane operation, so we'll have to make sure it's replicated when O_DIRECT is off (one way to do that is to throw in fsyncs).

frank-trampe · 2013-01-07T18:45:32Z

F_NOCACHE is just for the Macintosh (or whatever you call a sharp-edged aluminum computer that does not support O_DIRECT).
I thought that we were willing to allow the user to bypass the disk commit safeguards in this case since it was just for testing. I can examine options for verifying a commit to disk without O_DIRECT if you like.

frank-trampe · 2013-01-07T20:16:26Z

As it turns out, O_DIRECT does not provide guarantees about data commits like O_SYNC. Would we also want to provide an O_SYNC option?

srh · 2013-01-07T21:41:25Z

We want O_DSYNC (which is allegedly how Linux treats O_SYNC anyway).

We warn the user when O_DIRECT doesn't work and pass O_DSYNC in any case.

This is currently in code review 134.
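
For readers following along, the behavior described above (try O_DIRECT, warn and fall back if it is rejected, pass O_DSYNC either way) might look roughly like the sketch below. This is not the code from the review; `open_db_file`, `open_result_t`, and the warning text are hypothetical names chosen for illustration, and the sketch assumes that an unsupported O_DIRECT shows up as `EINVAL` from `open()`, as the Linux man page documents.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical result type for this sketch; not taken from the RethinkDB source.
struct open_result_t {
    int fd;       // -1 if the file could not be opened at all
    bool direct;  // true only if the O_DIRECT open succeeded
};

// Try O_DIRECT first; if the filesystem rejects it, warn and reopen without it.
// O_DSYNC is requested in both cases so that a write returns only after the
// data has been handed to stable storage.
open_result_t open_db_file(const char *path) {
    assert(path != nullptr);
    int flags = O_RDWR | O_CREAT;
#ifdef O_DSYNC
    flags |= O_DSYNC;
#endif
#ifdef O_DIRECT
    int direct_fd = open(path, flags | O_DIRECT, 0644);
    if (direct_fd != -1) {
        return {direct_fd, true};
    }
    if (errno != EINVAL) {
        // Some other failure (permissions, missing directory, ...): give up.
        return {-1, false};
    }
    // Hypothetical warning text; the real message is not quoted in this thread.
    std::fprintf(stderr,
                 "warning: O_DIRECT is not supported for %s, "
                 "falling back to buffered I/O (compatibility mode)\n",
                 path);
#endif
    // Platforms without O_DIRECT (e.g. OS X) go straight to this open.
    int fd = open(path, flags, 0644);
    return {fd, false};
}
```

A caller would check `fd` for failure and could use `direct` to decide whether buffers need the alignment that O_DIRECT requires.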

coffeemug · 2013-01-07T22:21:34Z

What do we do in OSX for sync?

srh · 2013-01-08T19:47:25Z

Nothing yet. If we want proper syncing in OS X we can follow up each write() call with fcntl(fd, F_FULLFSYNC).

coffeemug · 2013-01-08T19:48:50Z

@srh -- presumably we'd only need to do it twice -- once before writing the metablock, and once after.

srh · 2013-01-08T20:00:48Z

@coffeemug - I'm going to do F_FULLFSYNC in the I/O layer after every write call. This allegedly replicates O_DSYNC behavior (except that it presumably also syncs file metadata). Also, there's no reason it would perform worse (except for the relatively negligible cost of an extra syscall on a thread in the i/o pool) than some specific fsync calls made in the serializer.
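
A minimal sketch of the "sync after every write" idea, assuming a helper in the I/O layer that owns the file descriptor. `write_and_sync` is a hypothetical name, not a function from the codebase. Where `F_FULLFSYNC` is not defined (anywhere other than OS X), the sketch substitutes `fsync()` purely so that it compiles; per the discussion above, Linux relies on O_DSYNC instead.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: write `count` bytes at `offset`, then force them to
// stable storage before returning. Returns false on any error.
bool write_and_sync(int fd, const void *buf, size_t count, off_t offset) {
    assert(buf != nullptr || count == 0);
    const char *p = static_cast<const char *>(buf);
    while (count > 0) {
        ssize_t n = pwrite(fd, p, count, offset);
        if (n == -1) {
            if (errno == EINTR) {
                continue;  // interrupted before writing anything; retry
            }
            return false;
        }
        // pwrite may write fewer bytes than requested; advance and loop.
        p += n;
        offset += n;
        count -= static_cast<size_t>(n);
    }
#ifdef F_FULLFSYNC
    // OS X: ask the drive to flush its own cache as well.
    return fcntl(fd, F_FULLFSYNC) != -1;
#else
    // Stand-in for other platforms (an assumption of this sketch).
    return fsync(fd) == 0;
#endif
}
```

Because the sync sits right next to the write, the serializer does not need to know which platform it is on; the cost is one extra syscall per write on the I/O pool thread.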

srh · 2013-01-18T03:53:08Z

This is fixed, with F_FULLFSYNC and O_DSYNC options enabled.

coffeemug · 2013-01-18T09:34:31Z

@srh -- could you please specify review number and commit number?

coffeemug · 2013-02-04T06:06:41Z

@srh -- ping -- could you specify review number and commit number? What is the warning message the users get if DIRECT_IO isn't available?

Tryneus mentioned this issue Nov 20, 2012

Crash while sending queries + corrupted data and crash after restart #84

Closed

ghost assigned srh Dec 22, 2012

srh closed this as completed Jan 18, 2013

enterstudio mentioned this issue Nov 29, 2023

[Snyk] Fix for 17 vulnerabilities enterstudio/rethinkdb#26

Open
