Support directories with millions of files. #95

davies · 2021-01-20T12:19:58Z

What would you like to be added:

Currently, we fetch the attributes of all files in a directory with a single batch request to Redis. That request can be slow or fail, and it blocks other requests.

We can split those into small batches, for example, 1000 per batch.
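As a rough illustration of the batching idea, here is a minimal Go sketch. It does not use a real Redis client: `fetchFunc` and `batchedFetch` are hypothetical names, and `fetchFunc` only stands in for whatever call issues the MGET. The batch size of 1000 comes from the suggestion above.

```go
package main

import "errors"

// fetchFunc is a hypothetical stand-in for a Redis MGET call: it takes a
// slice of keys and returns one value per key, in the same order.
type fetchFunc func(keys []string) ([]string, error)

// batchedFetch calls fetch with at most batchSize keys at a time and
// concatenates the results in key order. It stops at the first error.
func batchedFetch(keys []string, batchSize int, fetch fetchFunc) ([]string, error) {
	if batchSize <= 0 {
		return nil, errors.New("batchSize must be positive")
	}
	out := make([]string, 0, len(keys))
	for start := 0; start < len(keys); start += batchSize {
		end := start + batchSize
		if end > len(keys) {
			end = len(keys) // last batch may be shorter
		}
		vals, err := fetch(keys[start:end])
		if err != nil {
			return nil, err
		}
		out = append(out, vals...)
	}
	return out, nil
}
```

With 2500 keys and a batch size of 1000, this issues three requests of 1000, 1000 and 500 keys, so no single request grows with the size of the directory.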

Why is this needed:

A directory could hold millions of files, and we don't want people to be bitten by that.

Backlog

Call MGET with small batches: Avoid calling mget with massive number of keys in Readdir #110
Use HSCAN instead of HGETALL: List large directories as small batches #128


suzaku · 2021-01-21T08:47:15Z

Are there any more details on this issue? Which operation fetches the file attributes?

davies · 2021-01-21T09:13:13Z

When you run ls, Readdir() is called. When plus is true, it fetches all the attributes in one batch.

suzaku · 2021-01-21T09:28:45Z

Got it, I can give this one a try.

suzaku · 2021-01-22T11:43:18Z

If there are millions of files, we might run into trouble when the first HGetAll call is reached. Any idea how to work around it?

davies · 2021-01-22T12:02:12Z

Yes. Right now we don't have a workaround; that could be the next challenge.

davies · 2021-01-28T03:25:42Z

The second part is done by #128
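For readers unfamiliar with the approach named in the backlog (HSCAN instead of HGETALL), here is a minimal Go sketch of cursor-based iteration. It is not the code from #128: `hashScanner` and `scanAll` are hypothetical names, and `hashScanner` only stands in for a client call that issues HSCAN and returns a page of alternating field/value strings plus the next cursor.

```go
package main

// hashScanner is a hypothetical abstraction over Redis HSCAN: given a cursor
// and a COUNT hint, it returns a page of alternating field/value strings and
// the next cursor. A returned cursor of 0 means the iteration is complete.
type hashScanner func(cursor uint64, count int64) (page []string, next uint64, err error)

// scanAll walks the hash page by page and passes each field/value pair to
// visit, so the whole hash is never held in a single reply.
func scanAll(scan hashScanner, count int64, visit func(field, value string)) error {
	var cursor uint64 // HSCAN iteration starts at cursor 0
	for {
		page, next, err := scan(cursor, count)
		if err != nil {
			return err
		}
		for i := 0; i+1 < len(page); i += 2 {
			visit(page[i], page[i+1])
		}
		if next == 0 {
			return nil // server reported the end of the iteration
		}
		cursor = next
	}
}
```

Two caveats apply to any HSCAN-based listing: COUNT is only a hint, so page sizes vary, and an entry may be returned more than once during a full iteration, so the visitor should tolerate duplicates.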

davies added the kind/feature New feature or request label Jan 20, 2021

davies added this to the Release 1.0 milestone Jan 21, 2021

xiaogaozi added the area/metadata Issues or PRs related to metadata label Jan 22, 2021

davies assigned suzaku Jan 22, 2021

suzaku mentioned this issue Jan 22, 2021

Avoid calling mget with massive number of keys in Readdir #110

Merged

xiaogaozi added the area/performance Issues or PRs related to performance label Jan 28, 2021

davies closed this as completed Jan 28, 2021

davies pushed a commit that referenced this issue Jan 28, 2021

add scs as a storage type (#95)

6ea9bc4

davies pushed a commit that referenced this issue Jan 28, 2021

add scs as a storage type (#95)

65271c0

davies pushed a commit that referenced this issue Jan 28, 2021

add scs as a storage type (#95)

0d14afd
