getLock unexpectedly can clear an existing lock belonging to a process #44

Closed
joeyh opened this issue May 12, 2015 · 7 comments
@joeyh

joeyh commented May 12, 2015

I have a program that opens two different Fds to the same file, uses setLock on the first Fd, and then getLock on the second Fd. Once the getLock is done, the program no longer has a lock set on the file. This is with unix 2.7.0.1 and ghc 7.10.1.

This is very surprising (the test case comes from stumbling over this problem in the wild), and I don't see behavior like this when I write a C program that does the same thing using fcntl F_SETLK and F_GETLK.

I'll paste my test program at the end of this bug report. To use it, open 2 terminals. In the first, run "./test setLock getLock". Then in the second, run "./test setLock" and see that it successfully sets the lock, because the first process has lost its lock on the file. If you instead run "./test setLock" in both terminals, it behaves as expected: the first process takes the lock, so the second fails to take it.

import System.Posix.IO
import System.Environment

-- Run each action named on the command line in order, then wait for
-- Enter so the process stays alive while another terminal tests the lock.
main :: IO ()
main = do
     mapM_ go =<< getArgs
     _ <- getLine
     return ()

go :: String -> IO ()
go "setLock" = do
    writeFile lck ""
    fd <- openFd lck ReadWrite Nothing defaultFileFlags
    print ("opened lock fd", fd)
    -- Take a write lock on the whole file (start 0, length 0).
    setLock fd (WriteLock, AbsoluteSeek, 0, 0)
    print ("setLock")
go "getLock" = do
    checkfd <- openFd lck ReadOnly Nothing defaultFileFlags
    print ("opened lock check fd", checkfd)
    ret <- getLock checkfd (ReadLock, AbsoluteSeek, 0, 0)
    case ret of
        Nothing -> print "getLock indicates the file is not locked"
        Just (pid, _) -> print ("getLock detected lock by pid", pid)
    closeFd checkfd
go other = putStrLn ("unknown action: " ++ other)

lck :: FilePath
lck = "locktest"
@joeyh
Author

joeyh commented May 12, 2015

I'm not 100% sure if my C program is correct, but here it is for completeness. It seems to show that F_GETLK shouldn't clear a lock set by F_SETLK on a different Fd.

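#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

void getlock(void);

int main(void) {
    char buf[10];
    int fd = open("locktest", O_RDWR);
    struct flock fl;

    /* Write lock covering the whole file (start 0, length 0). */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    fl.l_pid = 0;

    printf("opened lock fd %d\n", fd);
    fcntl(fd, F_SETLK, &fl);
    printf("set lock\n");
    read(0, buf, 1);  /* wait for input before querying the lock */

    getlock();
    return 0;
}

/* Query the lock through a second fd. Note that this fd is never closed. */
void getlock(void) {
    char buf[10];
    int fd = open("locktest", O_RDWR);
    struct flock fl;

    fl.l_type = F_RDLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    fl.l_pid = 0;

    printf("opened lock fd %d\n", fd);
    fcntl(fd, F_GETLK, &fl);
    printf("lock info: %ld\n", (long) fl.l_pid);
    read(0, buf, 1);  /* keep the process (and its lock) alive */
}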

@argiopetech
Contributor

Thanks for the heads up, and particularly for the thorough test cases. I probably won't have time to play with this before tomorrow, but I'll be on it ASAP.

For completeness, have you tried this test on 7.8 or under any other OS? If no to the latter, could you share which OS you're on (on the off chance that it isn't reproducible on my machine)?

@joeyh
Author

joeyh commented May 13, 2015

I've been testing this on Linux. ghc 7.6.3/unix-2.6.0.1 also behaves as
described in this bug report, so it's not a new regression.

Also just did a quick test on OSX, and I see the same behavior there
too. That was with ghc 7.8.3/unix-2.7.0.1

see shy jo

@argiopetech
Contributor

Having taken a look at this, I'm going to bounce it back to you for review. It seems to be performing to spec from my perspective. My rationale is below. Let me know if I've missed the problem you're seeing.

I see appropriate locking behavior when the closeFd is removed from the "getLock" case. This is what I would expect per the following clause from the description of POSIX fcntl locks in the Linux fcntl documentation:

If a process closes any file descriptor referring to a file, then
all of the process's locks on that file are released, regardless
of the file descriptor(s) on which the locks were obtained.
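
A minimal C sketch of the clause quoted above, assuming a Linux-like system with POSIX record locks. The helper names and the file path passed in are made up for illustration and are not part of the unix package or of the test programs in this thread. The lock is checked from a forked child, since F_GETLK does not report locks held by the calling process.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if another process holds a conflicting
   fcntl lock on `path`, 0 if not, -1 on error. The F_GETLK query runs in a
   forked child because F_GETLK never reports the caller's own locks. */
static int locked_by_other_process(const char *path) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        struct flock fl = {0};
        fl.l_type = F_WRLCK;     /* ask: could a write lock be placed... */
        fl.l_whence = SEEK_SET;  /* ...on the whole file (start 0, len 0)? */
        int fd = open(path, O_RDWR);
        if (fd < 0 || fcntl(fd, F_GETLK, &fl) < 0) _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}

/* Takes a write lock through one fd, then opens and closes a second fd on
   the same file. Stores whether the lock was visible to another process
   before and after that close. Returns 0 on success, -1 on error. */
int close_releases_lock(const char *path, int *held_before, int *held_after) {
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return -1;

    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    if (fcntl(fd, F_SETLK, &fl) < 0) { close(fd); return -1; }
    *held_before = locked_by_other_process(path);

    int fd2 = open(path, O_RDONLY);
    if (fd2 < 0) { close(fd); return -1; }
    close(fd2);  /* per the clause above, this drops the lock taken via fd */
    *held_after = locked_by_other_process(path);

    close(fd);
    return 0;
}
```

Calling close_releases_lock on a scratch file should report the lock as held before the second descriptor is closed and as gone afterwards, which matches what the Haskell test program shows when "getLock" calls closeFd.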

Linux has provided open file description locks since 3.15, and those don't have this behaviour, but they are Linux-specific and we don't (currently) support them.
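
For comparison, a hedged sketch of the same experiment with a Linux open file description lock (F_OFD_SETLK). It assumes Linux 3.15 or later and glibc; the function names and path are hypothetical, and the fallback #define is only there in case the headers hide the constant. The check from the child uses plain F_GETLK, which treats an OFD lock held elsewhere as a conflict.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* glibc only exposes F_OFD_SETLK with this defined */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37  /* Linux value, used only if the headers hide it */
#endif

/* Hypothetical helper: 1 if a conflicting lock is held elsewhere, 0 if
   not, -1 on error. Runs F_GETLK in a forked child on a fresh open(). */
static int ofd_lock_visible(const char *path) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        struct flock fl = {0};
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        int fd = open(path, O_RDWR);
        if (fd < 0 || fcntl(fd, F_GETLK, &fl) < 0) _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}

/* Takes an OFD write lock through one fd, then opens and closes a second
   fd on the same file, recording lock visibility before and after. */
int ofd_lock_survives_close(const char *path, int *held_before, int *held_after) {
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return -1;

    struct flock fl = {0};   /* l_pid must be 0 for OFD lock commands */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    if (fcntl(fd, F_OFD_SETLK, &fl) < 0) { close(fd); return -1; }
    *held_before = ofd_lock_visible(path);

    int fd2 = open(path, O_RDONLY);
    if (fd2 < 0) { close(fd); return -1; }
    close(fd2);  /* a different open file description: the lock stays */
    *held_after = ofd_lock_visible(path);

    close(fd);   /* closing the locking description releases the lock */
    return 0;
}
```

Here the lock should still be reported as held after the second descriptor is closed, because the lock belongs to the open file description behind fd rather than to the process.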

@joeyh
Author

joeyh commented May 16, 2015

Elliot Robinson wrote:

Having taken a look at this, I'm going to bounce it back to you for review. It
seems to be performing to spec from my perspective. My rationale is below. Let
me know if I've missed the problem you're seeing.

I see appropriate locking behavior when the closeFd is removed from the
"getLock" case. This is what I would expect per the following clause from the
fcntl documentation.

If a process closes any file descriptor referring to a file, then
all of the process's locks on that file are released, regardless
of the file descriptor(s) on which the locks were obtained.

Wow, ok. That's in spec, but certainly surprising behavior if you're not
familiar with fcntl locks.

I don't think that flock locks behave that way, do they? A user reading
the documentation of System.Posix.IO is left guessing about the
underlying locking technology used (the data in FileLock is a good
hint). I feel this documentation could at least be improved.

see shy jo

@argiopetech
Contributor

Wow, ok. That's in spec, but certainly surprising behavior if you're not
familiar with fcntl locks.

Yep, POSIX is rife with such things. I've learned to read the spec first and then shut up and sit down. Mine is not to wonder why.

I don't think that flock locks behave that way, do they?

No, they don't. Unfortunately, flock() is a BSD extension (which has also been implemented in Linux). Can't put it in the POSIX hierarchy though.
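
A small C sketch of the flock() side of that comparison, assuming Linux or a BSD where flock() is available; the function names and path are invented for the example. A forked child opens the file itself and tries a non-blocking flock() to see whether the parent's lock is still in place.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: 1 if an exclusive flock() on `path` is refused
   because someone else holds a lock, 0 if it is granted, -1 on error. */
static int flock_held_elsewhere(const char *path) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        int fd = open(path, O_RDWR);  /* new open file description */
        if (fd < 0) _exit(2);
        if (flock(fd, LOCK_EX | LOCK_NB) == 0) _exit(0);
        _exit(errno == EWOULDBLOCK ? 1 : 2);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}

/* Takes an exclusive flock() through one fd, then opens and closes a
   second fd on the same file, recording lock state before and after. */
int flock_survives_close(const char *path, int *held_before, int *held_after) {
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) < 0) { close(fd); return -1; }
    *held_before = flock_held_elsewhere(path);

    int fd2 = open(path, O_RDONLY);
    if (fd2 < 0) { close(fd); return -1; }
    close(fd2);  /* unrelated descriptor: the flock() lock is unaffected */
    *held_after = flock_held_elsewhere(path);

    close(fd);
    return 0;
}
```

With flock() the lock should still be held after the second descriptor is closed, which is the intuition the original report was relying on.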

A user reading the documentation of System.Posix.IO is left guessing about the
underlying locking technology used (the data in FileLock is a good
hint). I feel this documentation could at least be improved.

The assumption for me is that anything in System.Posix implements the standard, though I'll agree that the standard isn't always intuitive. I'll work on making this a bit more clear.

argiopetech self-assigned this May 18, 2015
@hasufell
Member

So it seems this is invalid? Please re-open with further information if you disagree.
