[FATFS] calling stat() on every file in a folder is too slow (IDFGH-8788) #10220

Closed
chipweinberger opened this issue Nov 22, 2022 · 6 comments
Labels: Resolution: Done, Status: Done, Type: Feature Request

chipweinberger commented Nov 22, 2022

Is your feature request related to a problem?

Goal: get all paths in a folder that are directories (and ideally their filesize too)

stat() is a very slow function. It takes 10+ seconds to call stat() on 300+ paths in the same directory.

But! It takes less than a second to call readdir on every path in the same directory.

This is unfortunate because, in the code, vfs_fat_readdir_r already has all the stat() information and just throws it away!

Is there anything I can do to speed this up?
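For reference, the slow pattern described above looks roughly like the following. This is a minimal sketch using only standard POSIX calls; the function name `sum_sizes_with_stat` and the fixed 1024-byte path buffer are illustrative assumptions, not code from the project. The cost presumably comes from each stat() call looking the path up again from scratch.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: sums the sizes of regular files in dir_path by calling
 * stat() on every entry. Returns the number of entries that were stat()ed,
 * or -1 if the directory cannot be opened. */
int sum_sizes_with_stat(const char *dir_path, long long *total_bytes)
{
    assert(dir_path && total_bytes);
    DIR *dir = opendir(dir_path);
    if (dir == NULL) {
        return -1;
    }
    int count = 0;
    *total_bytes = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;
        }
        char path[1024];
        int n = snprintf(path, sizeof(path), "%s/%s", dir_path, entry->d_name);
        if (n < 0 || (size_t) n >= sizeof(path)) {
            continue; /* path too long for this sketch */
        }
        struct stat st;
        if (stat(path, &st) != 0) { /* one full path lookup per entry */
            continue;
        }
        count++;
        if (S_ISREG(st.st_mode)) {
            *total_bytes += (long long) st.st_size;
        }
    }
    closedir(dir);
    return count;
}
```

With 300+ entries, this loop issues 300+ separate stat() calls on top of the single directory walk.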

Describe the solution you'd like.

Not sure.

Perhaps I can call f_readdir myself directly? Some guidance here would be helpful.

FRESULT f_readdir (
    FF_DIR* dp,    /* Pointer to the open directory object */
    FILINFO* fno   /* Pointer to file information to return */
)

chipweinberger commented Nov 22, 2022

So I may be pretty dumb! It looks like dirent has a d_type field (entry->d_type == DT_DIR) that tells me exactly the information I need!

But, if I want to get filesize quickly as well, I would need to do something like this (reimplement readdir):

Edit: just btw, for anyone who finds this thread, this code does work! It is so much faster!

// Same layout as the private vfs_fat_dir_t in vfs_fat.c; the two must stay in sync.
// TAG and fresult_to_errno() are likewise taken from vfs_fat.c.
typedef struct {
    DIR dir;
    long offset;
    FF_DIR ffdir;
    FILINFO filinfo;
    struct dirent cur_dirent;
} vfs_fat_dir_t;

// returns the FILINFO as well!
int better_vfs_fat_readdir_r(void* ctx, DIR* pdir,
        struct dirent* entry, struct dirent** out_dirent, FILINFO* out_fileinfo)
{
    assert(pdir);
    vfs_fat_dir_t* fat_dir = (vfs_fat_dir_t*) pdir;
    FRESULT res = f_readdir(&fat_dir->ffdir, &fat_dir->filinfo);
    if (res != FR_OK) {
        *out_dirent = NULL;
        ESP_LOGD(TAG, "%s: fresult=%d", __func__, res);
        return fresult_to_errno(res);
    }

    // copy fileinfo to output
    *out_fileinfo = fat_dir->filinfo;

    if (fat_dir->filinfo.fname[0] == 0) {
        // end of directory
        *out_dirent = NULL;
        return 0;
    }
    entry->d_ino = 0;
    if (fat_dir->filinfo.fattrib & AM_DIR) {
        entry->d_type = DT_DIR;
    } else {
        entry->d_type = DT_REG;
    }
    strlcpy(entry->d_name, fat_dir->filinfo.fname,
            sizeof(entry->d_name));
    fat_dir->offset++;
    *out_dirent = entry;
    return 0;
}

// returns the FILINFO as well!
static struct dirent* better_vfs_fat_readdir(void* ctx, DIR* pdir, FILINFO* out_fileinfo)
{
    vfs_fat_dir_t* fat_dir = (vfs_fat_dir_t*) pdir;
    struct dirent* out_dirent;
    int err = better_vfs_fat_readdir_r(ctx, pdir, &fat_dir->cur_dirent, &out_dirent, out_fileinfo);
    if (err != 0) {
        errno = err;
        return NULL;
    }
    return out_dirent;
}


int main(void) {
  DIR *dir;
  struct dirent *entry;
  if ((dir = opendir("/")) != NULL) {
    FILINFO info;
    while ((entry = better_vfs_fat_readdir(NULL, dir, &info)) != NULL) {
      // info is a struct, not a pointer; fsize is cast for a portable format
      printf("  %lu\n", (unsigned long) info.fsize);
    }
    closedir(dir);
  }
  return 0;
}

igrr commented Nov 22, 2022

While it is unlikely we would add a non-standard API like better_vfs_fat_readdir, we can do an optimization for this specific case. readdir can cache the FILINFO structure for the current file and stat can peek into that cached structure. This will speed up the operation in the common case when you are calling stat for every directory entry, while iterating over the directory.
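The caching idea can be illustrated with a small self-contained simulation. This is only a sketch under stated assumptions: `demo_info_t`, `demo_dir_t`, `demo_readdir`, and `demo_stat` are hypothetical stand-ins for FILINFO, the directory handle, readdir, and stat, and this is not how ESP-IDF implements it. The `slow_lookups` counter stands in for the expensive path lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for per-file info (the role FILINFO plays). */
typedef struct {
    const char *name;
    unsigned long size;
    bool is_dir;
} demo_info_t;

/* Hypothetical directory handle with a one-entry cache. */
typedef struct {
    const demo_info_t *entries; /* backing "directory" contents */
    size_t count;
    size_t pos;                 /* next entry demo_readdir will return */
    const demo_info_t *cached;  /* info for the entry last returned, or NULL */
    unsigned slow_lookups;      /* times demo_stat had to search */
} demo_dir_t;

/* Returns the next entry name, or NULL at end of directory,
 * and remembers that entry's info for a following demo_stat. */
const char *demo_readdir(demo_dir_t *d)
{
    assert(d);
    if (d->pos >= d->count) {
        d->cached = NULL;
        return NULL;
    }
    d->cached = &d->entries[d->pos++];
    return d->cached->name;
}

/* Fills *out for name. Peeks into the cache first and only searches the
 * directory on a miss. Returns 0 on success, -1 if the name is not found. */
int demo_stat(demo_dir_t *d, const char *name, demo_info_t *out)
{
    assert(d && name && out);
    if (d->cached != NULL && strcmp(d->cached->name, name) == 0) {
        *out = *d->cached; /* cache hit: no lookup needed */
        return 0;
    }
    d->slow_lookups++;
    for (size_t i = 0; i < d->count; i++) {
        if (strcmp(d->entries[i].name, name) == 0) {
            *out = d->entries[i];
            return 0;
        }
    }
    return -1;
}
```

When the caller stats each name right after reading it, every call hits the cache. A stat for any other name still takes the slow path, so existing behavior is unchanged.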

If implemented, this can close #9570 which I think was asking for a similar feature.

chipweinberger commented Nov 22, 2022

That's a clever solution.

Yes, that would be a great idea.

chipweinberger commented

When / if you implement caching, please consider adding it to the v4.4 branch as well. It'll be a massive improvement.

svofski commented Nov 25, 2023

In my project it can take up to 3-4 seconds per file just to query its size. This can't be right.

Thanks @chipweinberger for your workaround!

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: Reviewing Issue is being reviewed labels Apr 23, 2024
chipweinberger commented

wow very impressive work @RathiSonika
