RaptorX: Unifying Hive-Connector and Raptor-Connector with Low Latency #13205
Comments
(out-of-date; check #13205 (comment) for the actual benchmark result) Some benchmark results on prod queries
Nice work! Would be even better to align the x-axis among graphs for easier visualization.
(out-of-date; check #13205 (comment) for the actual benchmark result) Reducing the default read size from 8MB to 0.2MB greatly increases the benefit of using SSD
(out-of-date; check #13205 (comment) for the actual benchmark result)
(out-of-date; check #13205 (comment) for the actual benchmark result) Results with a 0.5MB max read size. We will mainly focus on optimizing IO rather than using the data cache.
(out-of-date; check #13205 (comment) for the actual benchmark result) Given that HDFS can provide much higher throughput, we increase the IO fanout to work around latency:
(out-of-date; check #13205 (comment) for the actual benchmark result)
HDD results (out-of-date; check #13205 (comment) for the actual benchmark result)
Can you send out the specific test models and use cases?
@fengyun2066 we use prod workloads to test the performance. In case you are interested in the performance, check the benchmark results posted earlier in this thread.
@hustnn, the file handle/descriptor is an FS-specific concept; it varies from FS to FS. Distributed file systems like Google's Colossus or Facebook's Warm Storage are similar in design: they chop files into chunks and save them on different chunk servers. File format, on the other hand, is orthogonal to the file handle, so whichever file format you use should be fine. For Facebook, Warm Storage opens a file by retrieving chunk information from chunk allocator servers. This information is (mostly) immutable; it describes where each chunk of a file comes from. By caching this information, a split can avoid the overhead of the chunk lookup (which is around 100ms for Warm Storage or Colossus). Therefore, it depends on what FS you use and how much implementation detail it exposes through its interface. Check the
@highker I see, Thanks. Is the chunk information of a file in Facebook Warm Storage similar to the blocks of a file in HDFS? If yes, does the cached information indicate the block locations in HDFS? |
@hustnn, that is correct. It's the block metadata for a file.
@highker Thanks! |
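To make the block-metadata caching idea discussed above concrete, here is a minimal, hypothetical sketch using Guava's LoadingCache. BlockMetadataLoader, FileBlockMetadata, and lookupBlocks are illustrative placeholders rather than Presto or Warm Storage APIs, and the size/TTL values are arbitrary.

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

import java.util.concurrent.TimeUnit;

// Hypothetical sketch: cache the (mostly) immutable block/chunk metadata per file path
// so that later splits over the same file skip the ~100ms chunk-lookup round trip.
public class BlockMetadataCache
{
    // Illustrative placeholder types, not actual Presto or Warm Storage classes.
    public interface FileBlockMetadata {}

    public interface BlockMetadataLoader
    {
        FileBlockMetadata lookupBlocks(String filePath);
    }

    private final LoadingCache<String, FileBlockMetadata> cache;

    public BlockMetadataCache(BlockMetadataLoader loader)
    {
        this.cache = CacheBuilder.newBuilder()
                .maximumSize(1_000_000)             // bound memory usage
                .expireAfterWrite(1, TimeUnit.DAYS) // metadata is mostly immutable
                .build(CacheLoader.from(loader::lookupBlocks));
    }

    public FileBlockMetadata getBlocks(String filePath)
    {
        // The first access pays the lookup cost; subsequent splits on the same file hit the cache.
        return cache.getUnchecked(filePath);
    }
}
```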
Just pinging to see if there are any updates on RaptorX? Curious to try it out and see how the disaggregated compute/storage will perform on some datasets. |
@teejteej, yes, the feature is fully available and battle tested. Here are the config groups to enable for the Hive connector; illustrative snippets follow below.
- Scheduling (/catalog/hive.properties)
- Metastore versioned cache (/catalog/hive.properties)
- List files cache (/catalog/hive.properties)
- Data cache (/catalog/hive.properties)
- Fragment result cache (/config.properties)
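To make the list above concrete, here is an illustrative set of snippets. The values are a sketch based on publicly documented RaptorX settings, not a tested recommendation; property names, defaults, cache sizes, and directories should be checked against your Presto version and hardware.

```properties
# Scheduling -- /catalog/hive.properties
# Soft affinity schedules splits of the same file onto the same worker,
# which keeps that worker's local caches warm.
hive.node-selection-strategy=SOFT_AFFINITY

# Metastore versioned cache -- /catalog/hive.properties
hive.partition-versioning-enabled=true
hive.metastore-cache-scope=PARTITION
hive.metastore-cache-ttl=2d
hive.metastore-refresh-interval=3d
hive.metastore-cache-maximum-size=10000000

# List files cache -- /catalog/hive.properties
hive.file-status-cache-expire-time=24h
hive.file-status-cache-size=100000000
hive.file-status-cache-tables=*

# Data cache (local SSD cache of remote data) -- /catalog/hive.properties
cache.enabled=true
cache.type=ALLUXIO
cache.base-directory=file:///mnt/flash/data
cache.alluxio.max-cache-size=1600GB

# Fragment result cache -- /config.properties
fragment-result-cache.enabled=true
fragment-result-cache.max-cached-entries=1000000
fragment-result-cache.base-directory=file:///mnt/flash/fragment
fragment-result-cache.cache-ttl=24h
```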
Adding to what @highker said, there are also knobs for the file/stripe footer cache for ORC and Parquet files, as shown below. Note that in Facebook's deployment (we use ORC) we found it increased GC pressure, so it is disabled; I still think it is worth pointing the feature out for visibility, as it might work for different workloads. @ClarenceThreepwood might have a better idea of how the Parquet metadata cache works at Uber.
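As with the snippets above, the following values are an illustrative sketch of the footer/metadata cache knobs based on publicly documented RaptorX settings; verify the property names against your Presto version before relying on them.

```properties
# ORC file tail / stripe footer cache -- /catalog/hive.properties
hive.orc.file-tail-cache-enabled=true
hive.orc.file-tail-cache-size=100MB
hive.orc.file-tail-cache-ttl-since-last-access=6h
hive.orc.stripe-metadata-cache-enabled=true
hive.orc.stripe-footer-cache-size=100MB
hive.orc.stripe-footer-cache-ttl-since-last-access=6h
hive.orc.stripe-stream-cache-size=300MB
hive.orc.stripe-stream-cache-ttl-since-last-access=6h

# Parquet metadata cache -- /catalog/hive.properties
hive.parquet.metadata-cache-enabled=true
hive.parquet.metadata-cache-size=100MB
hive.parquet.metadata-cache-ttl-since-last-access=6h
```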
All merged; fully available in 0.248 |
@highker, hi, does 0.247 include all the features of RaptorX, or do we need 0.248?
The feature is generally available and fully battle tested in the Hive connector. The Raptor connector is no longer maintained; please use this feature instead.
Current Presto Design
Current Raptor Design
Read/Write Path
Background Jobs
Metadata
Pros and Cons with Current Raptor/Presto
RaptorX Design
"RaptorX" is a project code name. It aims to evolve Presto (presto-hive) in a way to unify presto-raptor and presto-hive. To make sure taking the pros from both Presto and Raptor, we made the following design decisions:
Use hierarchical cache to achieve low latency (sub-second)
Read/Write Path
Cache
Background Jobs
Metastore
Coordinator
Interactive query flow
RaptorX returns the schema, file locations, etc. back to the Presto coordinator.
Production Benchmark Result
Milestones and Tasks
How to Use
Enable the following configs in the Hive connector (with the exception of the fragment result cache, which is configured on the main engine); an at-a-glance skeleton follows this list.
- Scheduling (/catalog/hive.properties)
- Metastore versioned cache (/catalog/hive.properties)
- List files cache (/catalog/hive.properties)
- Data cache (/catalog/hive.properties)
- Fragment result cache (/config.properties and /catalog/hive.properties)
- File and stripe footer cache (/catalog/hive.properties)
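As an at-a-glance illustration of which file each knob lives in, here is a minimal two-file skeleton. The full, tunable value sets appear in the comment snippets earlier in this issue; the property names follow the publicly documented RaptorX settings, and both names and defaults may vary across Presto versions.

```properties
# /catalog/hive.properties
# scheduling: keep splits for the same file on the same worker
hive.node-selection-strategy=SOFT_AFFINITY
# metastore versioned cache
hive.partition-versioning-enabled=true
# list files cache
hive.file-status-cache-tables=*
# local data cache on SSD
cache.enabled=true
cache.type=ALLUXIO
# file/stripe footer cache (ORC) and metadata cache (Parquet)
hive.orc.file-tail-cache-enabled=true
hive.parquet.metadata-cache-enabled=true
# connector-side helper for the fragment result cache
hive.partition-statistics-based-optimization-enabled=true
```

```properties
# /config.properties
# fragment result cache (engine side)
fragment-result-cache.enabled=true
fragment-result-cache.base-directory=file:///mnt/flash/fragment
```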