fdatasync not working as expected on fscq? #14

Closed
meng-xu opened this issue Apr 22, 2019 · 4 comments

meng-xu commented Apr 22, 2019

Hi, our team at SSLab, Georgia Tech is testing the crash-safety properties of fscq, and we found that fdatasync may not be working as expected. The test case is as follows:

  1. Mount an empty image
mkdir -p mptr1
<path-to-fscq-src>/fscq empty.img -f -o big_writes,atomic_o_trunc,use_ino mptr1
  2. Compile and run the test case
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
        chdir("mptr1");
        system("ls");
        system("ls");
        system("ls");
        system("ls");
        system("ls");
        system("tree");
 
        int rv;
 
        unsigned char buf[8192] = { 0 };
        memset(buf, 'a', 8192);
 
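        // Create foo inside the mount point.
        int fd_foo = syscall(SYS_open, "foo", O_CREAT | O_RDWR, 0777);
        printf("fd: %d\n",  fd_foo);

        // Open the mount point itself so it can be fsync'ed.
        int fd_mntpnt = syscall(SYS_open, ".", O_DIRECTORY, 0);
        printf("root fd: %d\n", fd_mntpnt);

        // fsync the directory so that the creation of foo is persisted.
        rv = syscall(SYS_fsync, fd_mntpnt);
        printf("fsync mntpnt: %d\n", rv);

        // Write 4000 bytes of 'a' to foo.
        rv = syscall(SYS_write, fd_foo, buf, 4000);
        printf("write: %d\n", rv);

        // fdatasync foo; we expect this to persist the written data.
        rv = syscall(SYS_fdatasync, fd_foo);
        printf("fdatasync: %d\n", rv);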
 
        system("killall -9 fscq"); // crash fscq
        return 0;
}

Output of the test case should be:

.

0 directories, 0 files
fd: 3
root fd: 4
fsync mntpnt: 0
write: 4000
fdatasync: 0

Based on the return codes of fsync, write, and fdatasync, file foo should exist and contain 4000 'a' bytes.

  3. Re-mount the image on another directory
<path-to-fscq-src>/src/fscq empty.img -f -o big_writes,atomic_o_trunc,use_ino mptr2
  4. Check for inconsistency: file foo is empty
tree mptr2
mptr2
└── foo

0 directories, 1 file

cat mptr2/foo
<no output>

wc -c mptr2/foo
0 mptr2/foo
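
The manual check in step 4 could also be automated. Below is a minimal sketch in C; the helper name `check_fill` is hypothetical and not part of fscq or of the test case above. It reports whether a file contains exactly the expected number of bytes, all equal to a given fill byte.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of fscq): checks that the file at `path`
 * contains exactly `expected_len` bytes, all equal to `fill`.
 * Returns 0 on a match, 1 on a mismatch, and -1 if the file cannot be opened.
 */
int check_fill(const char *path, long expected_len, int fill)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    long n = 0;   /* number of bytes read so far */
    int ok = 1;   /* stays 1 while every byte equals `fill` */
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (c != fill)
            ok = 0;
        n++;
    }
    fclose(f);

    return (ok && n == expected_len) ? 0 : 1;
}
```

For the scenario above, one would call `check_fill("mptr2/foo", 4000, 'a')` after re-mounting. With the empty foo shown in step 4 it returns 1.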

[System Setup]
I see this issue on the master branch of the repo and on the sosp17 branch as well. I am using Ubuntu 16.04 with the 4.4.0 kernel, and I compiled fscq with coqc 8.8.1 and ghc 8.0.1.

Let me know if you need further information. Thank you!

Member

tchajed commented Apr 22, 2019

This is within the FSCQ fdatasync specification as described in the OSDI 2018 paper. There are (at least) two trees in the tree sequence: one where the file exists and has length 0, and one where the 4000 bytes have been written. The fdatasync ensures that if the system crashes to the second tree, the data will match buf rather than be zeros (in this test buf is filled with 'a', so the two outcomes are distinguishable). However, without an fsync, the system simply crashes to the old tree.

This doesn't match one reading of the fdatasync(2) man page in Linux (which says metadata is flushed if "needed in order to allow a subsequent data retrieval to be correctly handled" - you could interpret "correctly handled" as meaning all previously written data to the file should be readable), but that isn't the specification FSCQ formalizes.
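
The behavior described in this comment can be illustrated with a toy model. The sketch below is an assumption-laden simplification based only on the description above. It is not FSCQ's actual specification or code, and every name in it (`tree_seq`, `seq_write`, and so on) is hypothetical. Each "tree" is reduced to the length of foo, and a crash is modeled as recovering to the last tree made durable by an fsync.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TREES 8

/* Toy model: a tree is represented only by the length of foo in that tree. */
struct tree_seq {
    size_t len[MAX_TREES]; /* length of foo in each tree of the sequence */
    int latest;            /* index of the newest tree */
    int durable;           /* index of the tree a crash recovers to (worst case) */
};

/* Start with a single tree in which foo exists and has length 0. */
void seq_init(struct tree_seq *s)
{
    s->len[0] = 0;
    s->latest = 0;
    s->durable = 0;
}

/* A write appends a new tree in which foo has grown by nbytes. */
void seq_write(struct tree_seq *s, size_t nbytes)
{
    assert(s->latest + 1 < MAX_TREES);
    s->len[s->latest + 1] = s->len[s->latest] + nbytes;
    s->latest++;
}

/* Assumption: fdatasync does not change which tree a crash recovers to. */
void seq_fdatasync(struct tree_seq *s)
{
    (void)s;
}

/* Assumption: fsync flushes the log, so a crash recovers to the newest tree. */
void seq_fsync(struct tree_seq *s)
{
    s->durable = s->latest;
}

/* Length of foo observed after a crash, in this worst-case model. */
size_t seq_crash(const struct tree_seq *s)
{
    return s->len[s->durable];
}
```

Replaying the test case in this model (`seq_init`, `seq_fsync`, `seq_write` with 4000, `seq_fdatasync`) gives a post-crash length of 0, matching the empty foo in the report. An extra `seq_fsync` before the crash gives 4000.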

Author

meng-xu commented Apr 22, 2019

Thank you @tchajed for the quick answer. A follow-up question: since killall -9 fscq crashes to the old tree, how can we make the system crash to the second tree? Or is this not possible?

Member

tchajed commented Apr 22, 2019

You'll need an fsync call (for example, use fsync instead of fdatasync to both sync the direct writes to that file and flush the log).
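
As a minimal sketch of this suggestion, the write step of the test case could be wrapped as below. The helper name `write_and_fsync` and the file mode 0644 are illustrative assumptions, not taken from the original test. It uses the standard POSIX `open`, `write`, `fsync`, and `close` calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes `len` bytes from `buf` to `path`, then calls
 * fsync (rather than fdatasync) on the file descriptor.
 * Returns 0 on success and -1 on any failure.
 */
int write_and_fsync(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return -1;

    /* write() may write fewer bytes than requested, so loop until done. */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* fsync instead of fdatasync, as suggested in the comment above. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

In the original test, this would correspond to replacing the `SYS_fdatasync` call on `fd_foo` with `SYS_fsync` before running `killall -9 fscq`.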

Member

tchajed commented Jul 29, 2019

Closing. This isn't a bug in FSCQ, just a difference between our specification and what other filesystems do.
