
KYLIN-5187 Support Alluxio Local Cache + Soft Affinity to speed up the query performance on the cloud #1880

Merged
hit-lacus merged 2 commits into apache:main from zzcclp:kylin-soft-affinity-local-cache on Jun 21, 2022

zzcclp commented May 24, 2022

Proposed changes

Add support for Alluxio Local Cache + Soft Affinity to speed up query performance on the cloud.
Only Spark 3.1 is supported.
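
As a rough illustration of the soft affinity idea (not the code added by this PR), a scheduler can derive a stable list of preferred executor hosts from each file path, so repeated reads of the same file tend to land where its data is already in the local cache. The affinity is "soft" because these hosts are only a preference; the scheduler may still run the task elsewhere. The class name, method name, and the hash-then-wrap strategy below are all assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and strategy are assumptions, not Kylin's implementation.
final class SoftAffinitySketch {

    private SoftAffinitySketch() {}

    /**
     * Returns up to `replicas` preferred hosts for a file. The choice depends
     * only on the file path and the host list, so the same file maps to the
     * same hosts as long as the host list is unchanged.
     */
    static List<String> preferredHosts(String filePath, List<String> hosts, int replicas) {
        List<String> preferred = new ArrayList<>();
        if (hosts.isEmpty() || replicas <= 0) {
            return preferred; // no preference: the scheduler is free to pick any host
        }
        // floorMod keeps the index non-negative even for negative hash codes.
        int start = Math.floorMod(filePath.hashCode(), hosts.size());
        int count = Math.min(replicas, hosts.size());
        for (int i = 0; i < count; i++) {
            preferred.add(hosts.get((start + i) % hosts.size()));
        }
        return preferred;
    }
}
```

In a Spark setting, a list like this would typically be supplied as the preferred locations of a file-scan partition; how this PR wires that up is not shown here.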

GitHub Branch

Because most development work now happens on Kylin 4, it has become the main branch. The Apache Kylin community changed the branch settings on GitHub as of 2021-08-04:

  1. The default branch main is for Kylin 4.x (Parquet storage);
  2. The original branch master for Kylin 3.x (HBase storage) has been renamed to kylin3.

Please check Intro to Kylin 4 architecture and INFRA-22166 if you are interested.

Types of changes

What types of changes does your code introduce to Kylin?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have created an issue on Kylin's JIRA, and have described the bug/feature there in detail
  • Commit messages in my PR start with the related JIRA ID, like "KYLIN-0000 Make Kylin project open-source"
  • Compiling and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged

Further comments

If this is a relatively large or complex change, kick off the discussion at user@kylin.apache.org or dev@kylin.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

zzcclp requested a review from hit-lacus May 24, 2022 07:25
zzcclp force-pushed the kylin-soft-affinity-local-cache branch from b9b5b82 to f1d75bb May 26, 2022 02:18
zzcclp force-pushed the kylin-soft-affinity-local-cache branch from f1d75bb to 85201fc June 8, 2022 13:16
    1. Implement LocalDataCacheManager
    2. Build on xiaoxiang's PR
    3. Implement CacheFileScanRDD
    4. Implement AbstractCacheFileSystem
    5. Optimize performance
    6. Support soft affinity for HDFS
    7. Support reading data through a ByteBuffer, to avoid reading one byte at a time
    8. Support caching small files in memory: ByteBufferPageStore extends PageStore to cache data in memory
    9. Pre-initialize KylinCacheFileSystem to fix an s3a issue
   10. Upgrade the Alluxio client version to 2.7.4
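
Item 7 in the list above can be pictured with a small, generic sketch: instead of calling `InputStream.read()` once per byte, the reader fills a `ByteBuffer` in bulk through a channel. This is only an illustration using standard `java.nio` calls; `BulkReadSketch` and `readFully` are hypothetical names, and the PR's actual read path through the cache file system is not reproduced here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical sketch of bulk reads into a ByteBuffer; not the PR's code.
final class BulkReadSketch {

    private BulkReadSketch() {}

    /**
     * Reads from `in` until `target` is full or the stream ends.
     * Returns the number of bytes transferred. The stream is left open
     * so the caller stays in charge of closing it.
     */
    static int readFully(InputStream in, ByteBuffer target) {
        ReadableByteChannel channel = Channels.newChannel(in);
        int total = 0;
        try {
            while (target.hasRemaining()) {
                int n = channel.read(target); // transfers a chunk, not a single byte
                if (n < 0) {
                    break; // end of stream
                }
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

After the call, `target.flip()` makes the bytes that were read available to the consumer.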
zzcclp force-pushed the kylin-soft-affinity-local-cache branch from 85201fc to f22abfa June 11, 2022 00:25
WANGHui2022 force-pushed the kylin-soft-affinity-local-cache branch from 36262ea to b464928 June 21, 2022 01:42

lgtm-com bot commented Jun 21, 2022

This pull request introduces 4 alerts when merging f7479fc into fd4a472 - view on LGTM.com

new alerts:

  • 2 for Non-synchronized override of synchronized method
  • 2 for Dereferenced variable may be null
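
For readers unfamiliar with the first alert type, here is a generic sketch of what "Non-synchronized override of synchronized method" guards against and how it is usually resolved. It does not show the code flagged in this PR; `Counter`, `LoggingCounter`, and `SyncOverrideSketch` are hypothetical classes. If the override below dropped the `synchronized` keyword, the update to `calls` would no longer be protected by the lock that callers of the base class rely on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes for illustration; unrelated to the flagged Kylin code.
class Counter {
    private int value;

    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }
}

class LoggingCounter extends Counter {
    private int calls;

    // Keeping `synchronized` on the override preserves the base class's
    // locking contract, so `calls` is updated under the same monitor.
    @Override
    synchronized void increment() {
        calls++;
        super.increment();
    }

    synchronized int calls() {
        return calls;
    }
}

final class SyncOverrideSketch {

    private SyncOverrideSketch() {}

    /** Increments the counter from several threads and waits for them to finish. */
    static void incrementFromThreads(Counter counter, int threads, int perThread) {
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }
}
```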


codecov-commenter commented Jun 21, 2022

Codecov Report

❌ Patch coverage is 16.66667% with 15 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@8cde6dd). Learn more about missing BASE report.

Files with missing lines | Patch % | Lines
...he/spark/sql/execution/datasource/FilePruner.scala | 0.00% | 9 Missing ⚠️
...cala/org/apache/spark/utils/SparkHadoopUtils.scala | 0.00% | 3 Missing ⚠️
.../spark/common/logging/AbstractHdfsLogAppender.java | 0.00% | 1 Missing ⚠️
...ark/common/logging/SparkDriverHdfsLogAppender.java | 0.00% | 1 Missing ⚠️
...park/common/logging/SparkExecutorHdfsAppender.java | 0.00% | 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1880   +/-   ##
=======================================
  Coverage        ?   23.51%           
  Complexity      ?     4468           
=======================================
  Files           ?     1145           
  Lines           ?    65372           
  Branches        ?     9341           
=======================================
  Hits            ?    15374           
  Misses          ?    48379           
  Partials        ?     1619           


    1. Fix the exception to give a more detailed message for other FileSystem implementations
    2. Fix de.thetaphi:forbiddenapis:2.3:check failures for forbidden APIs by specifying the charset UTF-8 explicitly
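
The second item concerns a common class of forbiddenapis findings: calls such as `new String(bytes)` or `text.getBytes()` that silently depend on the platform default charset. A minimal sketch of the usual remedy, passing the charset explicitly, is shown below. `CharsetSketch` is a hypothetical helper and not a class from this PR.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; illustrates explicit charsets, not the PR's actual change.
final class CharsetSketch {

    private CharsetSketch() {}

    /** Encodes text as UTF-8 regardless of the JVM's default charset. */
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes UTF-8 bytes regardless of the JVM's default charset. */
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```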
WANGHui2022 force-pushed the kylin-soft-affinity-local-cache branch from 5ba66ec to 09ae132 June 21, 2022 06:23

lgtm-com bot commented Jun 21, 2022

This pull request introduces 4 alerts when merging 09ae132 into fd4a472 - view on LGTM.com

new alerts:

  • 2 for Non-synchronized override of synchronized method
  • 2 for Dereferenced variable may be null

hit-lacus merged commit f2cdedb into apache:main Jun 21, 2022