Usage of lfs_dir_seek #978

Open
NLLK opened this issue May 7, 2024 · 2 comments

NLLK commented May 7, 2024

Hello. I want to speed up getting a file from a folder by the file's index. I assume I need functions like "lfs_dir_seek" and "lfs_dir_tell". According to the lfs header file, the result of lfs_dir_tell is only meant to be passed to lfs_dir_seek. The main question is: how do I actually get the file with a given index from a folder?

I tried using only lfs_dir_seek, but it behaves strangely. I wrote a test function that prints the folder contents.

void Database::test()
{
    int err = lfs_dir_open(&fs, &dir_t, "/db/c_l");

    int index = 0;
    while (true)
    {
        if (index == 26) //the number of files
            break;

        err = lfs_dir_seek(&fs, &dir_t, index + 2); //2 - '.' and '..'
        int res = lfs_dir_read(&fs, &dir_t, &lfs_inf);

        qDebug() << index++ << " file: " << QString::fromUtf8(lfs_inf.name);
    }

    lfs_dir_close(&fs, &dir_t);
}

The output from this function and the real folder contents do not match. I got the real contents by simply reading entries with "lfs_dir_read", which I think could be slow for large folders. The second listing comes from the function above. I added comments to show what I expected and what I got with the seek function.

Output

reading normally, without seek

0 file: "."
1 file: ".."
...
7 file: "c_00000a_651163245724"
8 file: "c_00000c_634615096992"
9 file: "c_00000f_734167697693"
10 file: "c_000010_683642954602"
11 file: "c_000013_264626523310"
12 file: "c_000014_957561453774"
13 file: "c_000015_022028852276"
14 file: "c_000017_212400520476"
15 file: "c_000019_673260401413"
16 file: "c_00001a_034241034244"
17 file: "c_00001b_241251262616"
18 file: "c_00001d_327007361318"
...

reading with seek

5 file: "c_00000a_651163245724"
6 file: "c_00000c_634615096992"
7 file: "c_00001a_034241034244" (expected "c_00000f_734167697693", as in the normal read)
8 file: "c_000010_683642954602"
9 file: "c_000013_264626523310"
10 file: "c_000014_957561453774"
11 file: "c_000015_022028852276"
12 file: "c_000017_212400520476"
13 file: "c_000019_673260401413"
14 file: "c_00001a_034241034244"
15 file: "c_00001b_241251262616"
16 file: "" (res = 0 => read no data)
17 file: "c_00001e_215365648216"
18 file: "c_00001f_662121174234"
...

@geky geky added the question label May 7, 2024
Member

geky commented May 7, 2024

Curious, I wouldn't have expected this behavior. Running something similar locally doesn't show random files. Is it possible an error from one of the previous functions went unnoticed? Are there other file operations?

But I would discourage this approach. Seeking to arbitrary indices in a directory isn't supported, and is uncommon in other filesystems.

Directory seek functions like lfs_dir_seek are usually rather limited in filesystems, and only really allow you to return to a file you've seen before, assuming no filesystem modifications. The reason is that directories are optimized solely for filename lookup, and supporting indexing can get in the way of that.

The best solution for finding the nth file with current filesystem APIs is the naive solution:

lfs_dir_t dir;
struct lfs_info info;
int err = lfs_dir_open(&lfs, &dir, "db/c_l");
if (err) {
    return err;
}

// read entries up to the target, +2 for . and .. (index is 1-based here)
for (lfs_off_t i = 0; i < index + 2; i++) {
    int res = lfs_dir_read(&lfs, &dir, &info);
    if (res != 1) {
        assert(res != 0);
        return res;
    }
}

err = lfs_dir_close(&lfs, &dir);
if (err) {
    return err;
}

// info now contains the nth file's info

It should be noted both the naive solution and a theoretical lfs_dir_seek solution run in $O(n)$.

Alternatively, you could consider making the index a part of the file name. Directories are optimized for name lookup after all. Though it looks like your file names already have quite a bit of info in them...
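
A minimal sketch of the index-in-the-name idea, assuming a hypothetical naming scheme where each file is called c_ followed by its index as six zero-padded hex digits. This is not the scheme used in the listing above, and make_indexed_path is a made-up helper, not part of littlefs. The resulting path could then be passed to lfs_stat or lfs_file_open for a direct name lookup instead of iterating over the directory:

```c
#include <stdio.h>
#include <stddef.h>

// Hypothetical helper (not a littlefs function): writes
// "<dir>/c_<index as six zero-padded hex digits>" into buf.
// Returns 0 on success, -1 if the index needs more than six hex
// digits or the path does not fit in buf.
static int make_indexed_path(char *buf, size_t size,
        const char *dir, unsigned long index) {
    if (index > 0xfffffful) {
        return -1;
    }

    int n = snprintf(buf, size, "%s/c_%06lx", dir, index);
    if (n < 0 || (size_t)n >= size) {
        // encoding error or truncated output
        return -1;
    }

    return 0;
}
```

With this assumed scheme, index 10 in /db/c_l maps to /db/c_l/c_00000a.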

At the risk of over-engineering, you could keep an auxiliary directory containing a number of small files that map index -> file name.

Though the added complexity of this auxiliary directory may outweigh the cost of the naive iteration...

Author

NLLK commented May 8, 2024

Thanks a lot, I'll stick with the naive solution then. I consider the issue solved.
