Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding support for rewinddir by restarting readdir if offset is zero. #825

Merged
merged 8 commits into from
Mar 28, 2024

Conversation

arsh
Copy link
Contributor

@arsh arsh commented Mar 23, 2024

Description of change

If we get a readdir call with offset 0, we restart the directory stream. In #581, we added support for consecutive calls with same offset by storing the last response. However, we're not using that feature when offset is 0 because we want any files that were added or removed between the opendir and the rewinddir to reflect in subsequent calls to readdir.

If all the use cases for which #581 was added are for offset 0, we could remove it. However, since #581 allows repeated offsets other than 0, it could break code that is relying on this behavior.

It is worth calling out that we cannot distinguish from a rewinddir or seekdir. We know that rewinddir will always call with offset 0, while seekdir could be with any offset. Regardless, seekdir is still not supported.

Relevant issues: #819

Does this change impact existing behavior?

The behavior of readdir is changing: now, calling it with an offset of 0 will restart the directory stream, ensuring that new or removed files are accurately reflected. Previously, it was possible to miss new files or still observe removed files if readdir returned the last response due to consecutive calls with the same offset.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

@arsh arsh requested a review from jamesbornholt March 23, 2024 16:16
@arsh arsh had a problem deploying to PR integration tests March 23, 2024 16:17 — with GitHub Actions Failure
@arsh arsh had a problem deploying to PR integration tests March 23, 2024 16:17 — with GitHub Actions Failure
@arsh arsh had a problem deploying to PR integration tests March 23, 2024 16:17 — with GitHub Actions Failure
@arsh arsh had a problem deploying to PR integration tests March 23, 2024 16:17 — with GitHub Actions Failure
@arsh arsh had a problem deploying to PR integration tests March 23, 2024 16:17 — with GitHub Actions Failure
@arsh arsh had a problem deploying to PR integration tests March 23, 2024 16:17 — with GitHub Actions Failure
@arsh arsh had a problem deploying to PR integration tests March 23, 2024 16:17 — with GitHub Actions Failure
mountpoint-s3/src/inode/readdir.rs Outdated Show resolved Hide resolved
mountpoint-s3/src/inode/readdir.rs Outdated Show resolved Hide resolved
mountpoint-s3/src/fs.rs Show resolved Hide resolved
mountpoint-s3/src/fs.rs Outdated Show resolved Hide resolved
Signed-off-by: Andres Santana <hernaa@amazon.com>
Signed-off-by: Andres Santana <hernaa@amazon.com>
Signed-off-by: Andres Santana <hernaa@amazon.com>
Signed-off-by: Andres Santana <hernaa@amazon.com>
Signed-off-by: Andres Santana <hernaa@amazon.com>
Signed-off-by: Andres Santana <hernaa@amazon.com>
Signed-off-by: Andres Santana <hernaa@amazon.com>
@arsh arsh had a problem deploying to PR integration tests March 26, 2024 07:19 — with GitHub Actions Failure
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 07:19 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 07:19 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 07:19 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 07:19 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 07:19 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 09:31 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 09:31 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 09:31 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 09:31 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 09:31 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 09:31 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 26, 2024 09:31 — with GitHub Actions Inactive
@arsh arsh requested a review from jamesbornholt March 27, 2024 07:49
mountpoint-s3/src/fs.rs Outdated Show resolved Hide resolved
mountpoint-s3/src/fs.rs Outdated Show resolved Hide resolved
mountpoint-s3/src/fs.rs Outdated Show resolved Hide resolved
@@ -1070,11 +1088,11 @@ where
if dir_handle.offset() < 2 {
let lookup = self
.superblock
.getattr(&self.client, dir_handle.handle.parent(), false)
.getattr(&self.client, dir_handle.handle.lock().await.parent(), false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably just lock the dir_handle.handle once for this entire call (or at least only lock it once after the rewind logic is done) rather than dropping/reacquiring it for every entry.

Actually, I don't know what the rules are for a dir handle: can there be two concurrent calls to readdir on the same handle? I'm pretty sure the answer is no.

Copy link
Contributor Author

@arsh arsh Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the lock to be at the top after the rewind logic.

I'm not sure what prevents (if anything) concurrent calls to readdir. I wrote C code that calls it on the same fd and it worked fine. Reading readdir man page it says this is not expected to be thread-safe as per the POSIX standard. Open to hear your ideas on how to reason about this if this is not exactly what you meant.

For completeness, here is the C code and output. mnt is mounted via Mountpoint.

Output (as expected)

~/workspace/rewind » ./readdir_parallel ~/mnt
[139900944557824] .
[139900944557824] bench10GB.bin
[139900944557824] bench5MB.bin
[139900923578112] A
[139900923578112] new-file
[139900934067968] ..
[139900944557824] bench_dir_100

Code

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>
#include <pthread.h>

#define NUM_THREADS 3

void *readdir_parallel(void *dir_ptr) {
    DIR *dir = (DIR *)dir_ptr;
    struct dirent *entry;

    // Read directory entries until end is reached
    while ((entry = readdir(dir)) != NULL) {
        printf("[%ld] %s\n", pthread_self(), entry->d_name);
    }

    pthread_exit(NULL);
}

int main(int argc, char *argv[]) {
    if (argc != 2) {
        printf("Usage: %s <directory>\n", argv[0]);
        return EXIT_FAILURE;
    }

    DIR *dir = opendir(argv[1]);
    if (dir == NULL) {
        perror("opendir");
        return EXIT_FAILURE;
    }

    // Create threads to read directory entries in parallel
    pthread_t threads[NUM_THREADS];
    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, readdir_parallel, dir) != 0) {
            fprintf(stderr, "Error creating thread %d\n", i);
            break;
        }
    }

    // Wait for all threads to finish
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }

    closedir(dir);
    return EXIT_SUCCESS;
}

Signed-off-by: Andres Santana <hernaa@amazon.com>
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 12:51 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 12:51 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 12:51 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 12:51 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 12:51 — with GitHub Actions Inactive
@arsh arsh had a problem deploying to PR integration tests March 28, 2024 12:51 — with GitHub Actions Failure
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 12:51 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 14:58 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 14:58 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 14:58 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 14:58 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 14:58 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 14:58 — with GitHub Actions Inactive
@arsh arsh temporarily deployed to PR integration tests March 28, 2024 14:58 — with GitHub Actions Inactive
@jamesbornholt jamesbornholt added this pull request to the merge queue Mar 28, 2024
Merged via the queue into awslabs:main with commit 127fb71 Mar 28, 2024
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants