Reduce cost for repeated accesses #255

plgounod · 2023-05-23T11:59:25Z

Tell us more about this new feature.

After reading a file (or a portion of it) using Mountpoint for Amazon S3, customers want to keep the data on their compute instance for a configurable amount of time. With this enhancement, Mountpoint will make fewer GET requests to Amazon S3 when customers repeatedly access the same file data.

dannycjones · 2023-05-25T08:32:16Z

Hey, thank you for the feedback!

We've heard this ask from a few customers and we're looking into it, but we have nothing to share right now.

stevew3344 · 2023-08-25T20:50:19Z

Local file caching would give a huge performance gain in our workflows, where we access the same file dozens or, in some cases, hundreds of times. Currently I switch to s3fs when I need to run these workflows.

tchaton · 2023-09-10T12:49:48Z

Yes, I second this issue.

tchaton · 2023-10-06T07:20:13Z

Additionally, I wonder whether it would be possible to provide a parameter to avoid re-listing the bucket after it has been listed once. This is a costly operation and can lead to slower training.

dannycjones · 2023-10-06T09:23:26Z

> Additionally, I wonder whether it would be possible to provide a parameter to avoid re-listing the bucket after it has been listed once. This is a costly operation and can lead to slower training.

Hey Thomas, would you be able to give a little more detail here on the access pattern? Maybe a short example code?

I'm wondering whether the cost you're referring to is caused by an explicit call like os.listdir(path), or by the ListObjectsV2 and HeadObject requests performed when opening individual files, i.e. during f = open(file_path).

Either way, I see why listing the bucket/prefix up front and storing the results would help both of these cases.
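To make the two access patterns in this comment concrete, here is a minimal sketch that times an explicit listing separately from opening individual files. The function name `time_access_patterns` and the `sample_size` parameter are hypothetical, chosen only for illustration; the code uses plain Python file APIs and makes no assumptions about Mountpoint internals.

```python
import os
import time


def time_access_patterns(directory, sample_size=100):
    """Time an explicit listing separately from opening individual files."""
    # Pattern 1: an explicit directory listing.
    start = time.perf_counter()
    names = os.listdir(directory)
    list_seconds = time.perf_counter() - start

    # Pattern 2: opening files one by one, without reading their contents.
    opened = 0
    start = time.perf_counter()
    for name in names[:sample_size]:
        try:
            with open(os.path.join(directory, name), "rb"):
                opened += 1
        except IsADirectoryError:
            continue  # skip sub-directories
    open_seconds = time.perf_counter() - start

    return {
        "entries": len(names),
        "files_opened": opened,
        "list_seconds": list_seconds,
        "open_seconds": open_seconds,
    }
```

Running this against a directory inside the mount and comparing `list_seconds` with `open_seconds` should indicate which of the two patterns dominates for a given workload.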

tchaton · 2023-10-09T09:48:49Z

Hey @dannycjones,

Yes, let's assume the dataset doesn't change over time and is used in read-only mode. Right now, every time I list ImageNet from Python (~1.2M files), it takes 300 seconds.

Here are 3 additional things I wish to have on top of the possibility to cache the files locally:

1. Keep the list of files in RAM or in a file once it has been listed, e.g. so that the second time I list from Python, it is as fast as listing locally.
2. Some Python utilities to quickly list a bucket folder by recursively listing sub-folders in parallel, and to pre-load some files ahead of time with fine-grained control from Python.
3. Add support for dumping / restoring a bucket index to avoid listing the bucket over and over.
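As a rough illustration of what the three requests above could look like on the user side, here is a minimal sketch that lists a directory tree level by level with a thread pool and dumps the result to a JSON index that later runs restore instead of listing again. This is not a Mountpoint feature or API: the function names, the JSON index format, and the default worker count are all assumptions made for this example, and it is only safe when the dataset does not change.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor


def _scan(directory):
    """List one directory, returning (subdirectories, files) as full paths."""
    subdirs, files = [], []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
            else:
                files.append(entry.path)
    return subdirs, files


def list_files_parallel(root, max_workers=16):
    """Walk `root` level by level, scanning each level's directories in parallel."""
    all_files = []
    pending = [root]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            next_level = []
            for subdirs, files in pool.map(_scan, pending):
                next_level.extend(subdirs)
                all_files.extend(files)
            pending = next_level
    return sorted(all_files)


def load_or_build_index(root, index_path, max_workers=16):
    """Restore the file list from `index_path` if it exists, else list and dump it."""
    if os.path.exists(index_path):
        with open(index_path) as f:
            return json.load(f)
    files = list_files_parallel(root, max_workers)
    with open(index_path, "w") as f:
        json.dump(files, f)
    return files
```

The walk proceeds one directory level at a time so that worker threads never wait on tasks they submitted themselves. Deleting the index file forces a fresh listing.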

dannycjones · 2023-10-26T08:43:07Z

Ongoing work on this issue is being integrated behind the build-time feature flag "caching".

mountpoint-s3/mountpoint-s3/Cargo.toml, lines 64 to 66 in fa0d516:

    [features]
    # Experimental features
    caching = []

tchaton · 2023-10-29T00:03:40Z

@dannycjones What are the steps to compile it with the feature flag?

goldstar611 · 2023-10-29T15:07:48Z

> @dannycjones What are the steps to compile it with the feature flag?

https://github.com/awslabs/mountpoint-s3/blob/main/doc/INSTALL.md#building-mountpoint-for-amazon-s3-from-source

and I think step 4 should be changed from
cargo build --release
to
cargo build --release --features "caching"
if I'm reading the cargo features page correctly.

dannycjones · 2023-10-30T08:29:24Z

> > @dannycjones What are the steps to compile it with the feature flag?
>
> https://github.com/awslabs/mountpoint-s3/blob/main/doc/INSTALL.md#building-mountpoint-for-amazon-s3-from-source
>
> and I think step 4 should be changed from cargo build --release to cargo build --release --features "caching" if I'm reading the cargo features page correctly.

Exactly this. When running cargo commands (cargo is Rust's build system), you can add --features "caching", as in the build command above. We're explicitly trying to limit the differences when building with this flag at the moment, so the main difference is that the CLI arguments for enabling the cache are hidden when the caching feature is not enabled.

tchaton · 2023-11-04T00:24:47Z

Cool @dannycjones. I will give it a try next week.

tchaton · 2023-11-15T15:21:52Z

Hey @dannycjones. Going to bump it today and run a heavy stress test on it over the next week. I will ping you if we find anything.

dannycjones · 2023-11-15T15:29:09Z

> Hey @dannycjones. Going to bump it today and run a heavy stress test on it over the next week. I will ping you if we find anything.

I'm planning to update the default metadata cache TTL to 1 second or a similar short value, so that it suits general purpose workloads.

I'd recommend pinning the metadata cache TTL to a value that suits your workload (using --metadata-cache-ttl <SECONDS>). Since it's ML training and we don't expect objects in the prefix to change during training, a value that exceeds the expected duration of training sounds about right.

goldstar611 · 2023-11-18T02:56:02Z

> I'd recommend pinning the metadata cache TTL to a value that suits your workload (using --metadata-cache-ttl <SECONDS>).

For those following this thread and compiling from the main branch (like me), it looks like this parameter has changed to --metadata-ttl in 7d38be7#diff-dc57c703340f88ddb4eab99dd8d870135117972f0f323c0a57b22a3f803b99ffL241

I'm excited to give this another try with a bucket that has 1000s of files in it.

tchaton · 2023-11-18T04:07:06Z

Cool, I will try that.

passaro · 2023-11-18T09:11:29Z

You will also need to specify --cache <DIR> to enable caching (both object metadata and content), in addition to --metadata-ttl <SECONDS>.

As noted above, both options are currently only available when building with the --features "caching" flag. You can find an early draft of the documentation in #587.
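For readers scripting their mounts, the following is a minimal sketch of how the two options from this thread might be combined into one command line. Only --cache <DIR> and --metadata-ttl <SECONDS> come from the comments above. The binary name `mount-s3`, the positional order of bucket and mount directory, the helper name `build_mount_command`, and the example paths are assumptions for illustration. The code only builds and prints the command, it does not run it.

```python
import shlex


def build_mount_command(bucket, mount_dir, cache_dir, metadata_ttl_seconds):
    """Build an argument list that enables both data and metadata caching."""
    if metadata_ttl_seconds < 0:
        raise ValueError("metadata TTL must be non-negative")
    return [
        "mount-s3",  # assumed binary name, not stated in this thread
        bucket,
        mount_dir,
        "--cache", cache_dir,  # enables caching of metadata and content
        "--metadata-ttl", str(metadata_ttl_seconds),
    ]


# For read-only training data, pick a TTL longer than the expected run,
# following the recommendation earlier in the thread (here: 24 hours).
command = build_mount_command(
    "my-training-bucket",      # hypothetical bucket name
    "/mnt/data",               # hypothetical mount directory
    "/tmp/mountpoint-cache",   # hypothetical cache directory
    24 * 60 * 60,
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` once the names and paths have been checked against the installed version's help output.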

passaro · 2023-11-22T13:40:33Z

Support for caching is now available in Mountpoint 1.2.0

passaro · 2023-11-22T13:42:47Z

See updated docs for help configuring the cache.

tchaton · 2023-11-24T11:12:49Z

Hey @dannycjones @passaro. After a week of heavy testing, it appears that Mountpoint for Amazon S3 is flakier than other open source solutions. We are seeing a failure ratio of 7/10 in our heavy benchmarks, with transport errors. I will update you with more details once we have validated that this isn't coming from our side.

plgounod added the "enhancement" (New feature or request) label May 23, 2023

jamesbornholt mentioned this issue May 31, 2023

Support for local cache #251

Closed

jamesbornholt mentioned this issue Jul 25, 2023

Add ML benchmarks #369

Open

jamesbornholt mentioned this issue Aug 18, 2023

Add support for caching #463

Closed

jamesbornholt mentioned this issue Sep 6, 2023

Make fewer lookup requests when inode type might be known #12

Open

dannycjones mentioned this issue Oct 9, 2023

Implement CacheConfig field permitting some operations to return non-expired cache data #547

Merged

passaro mentioned this issue Oct 11, 2023

Introduce new abstraction between the prefetcher and GetObject calls #552

Merged

dannycjones mentioned this issue Oct 12, 2023

Add support for caching, exporting, importing bucket list #549

Open

passaro mentioned this issue Oct 13, 2023

Introduce a python client to communicate with the mount #554

Open

This was referenced Oct 16, 2023

Add new DataCache trait and InMemoryDataCache implementation #557

Merged

Add metadata cache configuration flags behind build-time feature #559

Merged

Add request count tests for FS operations with metadata caching enabled #567

Merged

passaro mentioned this issue Oct 23, 2023

Use new ChecksummedBlock in DataCache #572

Closed

sauraank mentioned this issue Oct 26, 2023

"http_status=503" When downloading with high concurrency #574

Closed

dannycjones mentioned this issue Oct 30, 2023

Add documentation for object metadata and data caching #587

Merged

This was referenced Oct 31, 2023

Add caching ObjectStore implementation #590

Closed

Introduce ObjectStore trait to replace ObjectClient in mountpoint-s3 #592

Closed

This was referenced Nov 1, 2023

Implement disk-based DataCache with no eviction #593

Merged

Add ETag into DiskDataCache hashed block path #594

Merged

passaro mentioned this issue Nov 3, 2023

Introduce Prefetch trait #595

Merged

This was referenced Nov 6, 2023

Expose DataCache module and CacheKey fields #596

Merged

Add caching Prefetch implementation #598

Merged

This was referenced Nov 15, 2023

Split cache hashed directory keys to avoid any FS-specific limits #606

Merged

Remove unused 'cached_block_indices' method in DataCache trait #607

Merged

This was referenced Nov 16, 2023

Implement cache eviction #610

Merged

Simplify and rename cache configuration flags #612

Merged

dannycjones mentioned this issue Nov 20, 2023

Implement O_DIRECT for open to bypass metadata cache #614

Merged

This was referenced Nov 20, 2023

Disable data cache when setting --max_cache_size=0 #616

Merged

Improve cache metrics and logging #619

Merged

dannycjones mentioned this issue Nov 21, 2023

Cleanup cache dir at mount and exit #620

Merged

passaro mentioned this issue Nov 21, 2023

Remove the temporary caching feature flag #622

Merged

This was referenced Nov 21, 2023

Add zero padding, remove suffix for block file names #623

Merged

Release v1.2.0 #624

Merged

passaro closed this as completed Nov 22, 2023

dannycjones mentioned this issue Nov 23, 2023

Add docs clarifications about which FS operations can be served from cache and when #627

Merged

dannycjones mentioned this issue Nov 28, 2023

Update cache directory to create content with MP owner access only #637

Merged
