getLock unexpectedly can clear an existing lock belonging to a process #44

Open
joeyh opened this Issue May 12, 2015 · 6 comments

joeyh commented May 12, 2015

I have a program that opens two different Fds to the same file, uses setLock on the first Fd, and then getLock on the second Fd. Once the getLock is done, the program no longer has a lock set on the file. This is with unix 2.7.0.1 and ghc 7.10.1.

This is very surprising (the test case below is distilled from a problem I stumbled over in the wild), and I don't see this behavior when I write a C program that does the same thing using fcntl F_SETLK and F_GETLK.

I'll paste my test program at the end of this bug report. To use it, open two terminals. In the first, run "./test setLock getLock". Then in the second, run "./test setLock" and see that it successfully sets the lock; the first process has lost its lock on the file. If you instead run "./test setLock" in both terminals, it behaves as expected: the first process takes the lock, so the second fails to take it.

import System.Posix.IO
import System.IO
import System.Environment

main = do
     mapM_ go =<< getArgs
     getLine

go "setLock" = do
    writeFile lck ""
    fd <- openFd lck ReadWrite Nothing defaultFileFlags
    print ("opened lock fd", fd)
    setLock fd (WriteLock, AbsoluteSeek, 0, 0)
    print ("setLock")
go "getLock" = do
    checkfd <- openFd lck ReadOnly Nothing defaultFileFlags
    print ("opened lock check fd", checkfd)
    ret <- getLock checkfd (ReadLock, AbsoluteSeek, 0, 0)
    case ret of
        Nothing -> print "getLock indicates the file is not locked"
        Just (pid, fl) -> print ("getLock detected lock by pid", pid)
    closeFd checkfd
lck = "locktest"

joeyh commented May 12, 2015

I'm not 100% sure if my C program is correct, but here it is for completeness. It seems to show that F_GETLK shouldn't clear a lock set by F_SETLK on a different Fd.

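#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static void getlock(void);

int main(void) {
    char buf[10];
    int fd = open("locktest", O_RDWR);
    struct flock fl;

    /* Request a write lock covering the whole file. */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    fl.l_pid = 0;

    printf("opened lock fd %d\n", fd);
    fcntl(fd, F_SETLK, &fl);
    printf("set lock\n");
    read(0, buf, 1);  /* wait for input before querying the lock */

    getlock();
    return 0;
}

/* Query the lock through a second fd; note that this fd is never closed. */
static void getlock(void) {
    char buf[10];
    int fd = open("locktest", O_RDWR);
    struct flock fl;

    fl.l_type = F_RDLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    fl.l_pid = 0;

    printf("opened lock check fd %d\n", fd);
    fcntl(fd, F_GETLK, &fl);
    printf("lock info: %i\n", fl.l_pid);
    read(0, buf, 1);  /* keep the process alive so another terminal can test */
}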

Member

argiopetech commented May 12, 2015

Thanks for the heads up, and particularly for the thorough test cases. I probably won't have time to play with this before tomorrow, but I'll be on it ASAP.

For completeness, have you tried this test on 7.8 or under any other OS? If no to the latter, could you share which OS you're on (in the off case that it isn't reproducible on my machine)?


joeyh commented May 13, 2015

I've been testing this on Linux. ghc 7.6.3/unix-2.6.0.1 also behaves as
described in this bug report, so it's not a new regression.

Also just did a quick test on OSX, and I see the same behavior there
too. That was with ghc 7.8.3/unix-2.7.0.1

see shy jo


Member

argiopetech commented May 16, 2015

Having taken a look at this, I'm going to bounce it back to you for review. It seems to be performing to spec from my perspective. My rationale is below. Let me know if I've missed the problem you're seeing.

I see appropriate locking behavior when the closeFd is removed from the "getLock" case. This is what I would expect per the following clause from the description of POSIX fcntl locks in the Linux fcntl documentation:

If a process closes any file descriptor referring to a file, then
all of the process's locks on that file are released, regardless
of the file descriptor(s) on which the locks were obtained.

Linux provides open file description locks since 3.15 which don't have this behaviour, but they are Linux-specific and we don't (currently) support them.
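
For anyone who wants to check the quoted rule outside Haskell, here is a minimal C sketch, assuming Linux (or another POSIX system) and a local filesystem. The helper names `whole_file`, `still_locked` and `lock_survives` are made up for illustration and are not part of any library. It takes a write lock on one descriptor, runs F_GETLK on a second descriptor, optionally closes that second descriptor, and then forks a child to see whether another process can grab the lock.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Build a lock request of the given type covering the whole file. */
static struct flock whole_file(short type) {
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;   /* 0 means "to end of file" */
    return fl;
}

/* Fork a child that tries to take a write lock on path.
 * Returns 1 if the child was refused (the file is still locked),
 * 0 if the child got the lock, -1 on error. */
static int still_locked(const char *path) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        int fd = open(path, O_RDWR);
        if (fd < 0) _exit(2);
        struct flock fl = whole_file(F_WRLCK);
        _exit(fcntl(fd, F_SETLK, &fl) == 0 ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;
    if (WEXITSTATUS(status) == 1) return 1;
    if (WEXITSTATUS(status) == 0) return 0;
    return -1;
}

/* Lock via one fd, query via a second fd, optionally close the second fd,
 * then report whether this process still holds the lock (1), has lost it (0),
 * or hit an error (-1). */
int lock_survives(const char *path, int close_second_fd) {
    int fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0) return -1;
    struct flock fl = whole_file(F_WRLCK);
    if (fcntl(fd1, F_SETLK, &fl) < 0) { close(fd1); return -1; }

    int fd2 = open(path, O_RDONLY);
    if (fd2 < 0) { close(fd1); return -1; }
    struct flock probe = whole_file(F_RDLCK);
    (void)fcntl(fd2, F_GETLK, &probe);   /* query only; sets no lock */
    if (close_second_fd) close(fd2);     /* the step under suspicion */

    int held = still_locked(path);

    if (!close_second_fd) close(fd2);
    close(fd1);
    return held;
}
```

On Linux I would expect `lock_survives("locktest", 0)` to return 1 and `lock_survives("locktest", 1)` to return 0, which would point at the close of the second descriptor rather than F_GETLK itself as the step that drops the lock.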


joeyh commented May 16, 2015

Elliot Robinson wrote:

Having taken a look at this, I'm going to bounce it back to you for review. It
seems to be performing to spec from my perspective. My rationale is below. Let
me know if I've missed the problem you're seeing.

I see appropriate locking behavior when the closeFd is removed from the
"getLock" case. This is what I would expect per the following clause from the
fcntl documentation.

If a process closes any file descriptor referring to a file, then
all of the process's locks on that file are released, regardless
of the file descriptor(s) on which the locks were obtained.

Wow, ok. That's in spec, but certainly surprising behavior if you're not
familiar with fcntl locks.

I don't think that flock locks behave that way, do they? A user reading
the documentation of System.Posix.IO is left guessing about the
underlying locking technology used (the data in FileLock is a good
hint). I feel this documentation could at least be improved.

see shy jo


Member

argiopetech commented May 18, 2015

Wow, ok. That's in spec, but certainly surprising behavior if you're not
familiar with fcntl locks.

Yep, POSIX is rife with such things. I've learned to read the spec first and then shut up and sit down. Mine is not to wonder why.

I don't think that flock locks behave that way, do they?

No, they don't. Unfortunately, flock() is a BSD extension (which has also been implemented in Linux). Can't put it in the POSIX hierarchy though.
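
To make the contrast concrete, here is a small C sketch of the flock() case, assuming Linux or a BSD and a local filesystem (NFS may behave differently). `flock_survives_close` is a hypothetical helper name, not an existing API. It takes an exclusive flock() lock on one descriptor, opens and closes an unrelated second descriptor on the same file, and then forks a child to check whether the lock is still in place.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the flock() lock is still held after closing a second,
 * separately opened descriptor; 0 if it was lost; -1 on error. */
int flock_survives_close(const char *path) {
    int fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0) return -1;
    if (flock(fd1, LOCK_EX | LOCK_NB) < 0) { close(fd1); return -1; }

    /* Open and close another descriptor for the same file. With fcntl
     * record locks this would release the lock; flock() locks belong to
     * the open file description behind fd1 instead. */
    int fd2 = open(path, O_RDONLY);
    if (fd2 < 0) { close(fd1); return -1; }
    close(fd2);

    pid_t pid = fork();
    if (pid < 0) { close(fd1); return -1; }
    if (pid == 0) {
        /* A fresh open() gives the child its own open file description. */
        int fd = open(path, O_RDWR);
        if (fd < 0) _exit(2);
        _exit(flock(fd, LOCK_EX | LOCK_NB) == 0 ? 0 : 1);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
        if (WEXITSTATUS(status) == 1) result = 1;       /* child refused */
        else if (WEXITSTATUS(status) == 0) result = 0;  /* child got it */
    }
    close(fd1);
    return result;
}
```

On Linux I would expect this to return 1: the lock taken through `fd1` stays in place even though another descriptor for the same file was closed.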

A user reading the documentation of System.Posix.IO is left guessing about the
underlying locking technology used (the data in FileLock is a good
hint). I feel this documentation could at least be improved.

The assumption for me is that anything in System.Posix implements the standard, though I'll agree that the standard isn't always intuitive. I'll work on making this a bit more clear.

@argiopetech argiopetech self-assigned this May 18, 2015
