-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Support user's app quota level limit #311
Changes from 12 commits
134de9d
4a01ea2
884fb18
85e2764
f61ce1a
e652a6d
ebdf28a
5ee05ed
d2c1156
3bdd834
7fa69e8
36fb720
25b4b78
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -88,6 +88,8 @@ public class RssShuffleManager implements ShuffleManager { | |
private final int dataCommitPoolSize; | ||
private boolean heartbeatStarted = false; | ||
private boolean dynamicConfEnabled = false; | ||
private final String user; | ||
private final String uuid; | ||
private ThreadPoolExecutor threadPoolExecutor; | ||
private EventLoop eventLoop = new EventLoop<AddBlockEvent>("ShuffleDataQueue") { | ||
|
||
|
@@ -142,7 +144,8 @@ public RssShuffleManager(SparkConf sparkConf, boolean isDriver) { | |
throw new IllegalArgumentException("Spark2 doesn't support AQE, spark.sql.adaptive.enabled should be false."); | ||
} | ||
this.sparkConf = sparkConf; | ||
|
||
this.user = sparkConf.get("spark.rss.quota.user", "user"); | ||
this.uuid = sparkConf.get("spark.rss.quota.uuid", Long.toString(System.currentTimeMillis())); | ||
// set & check replica config | ||
this.dataReplica = sparkConf.get(RssSparkConfig.RSS_DATA_REPLICA); | ||
this.dataReplicaWrite = sparkConf.get(RssSparkConfig.RSS_DATA_REPLICA_WRITE); | ||
|
@@ -204,12 +207,12 @@ public RssShuffleManager(SparkConf sparkConf, boolean isDriver) { | |
@Override | ||
public <K, V, C> ShuffleHandle registerShuffle(int shuffleId, int numMaps, ShuffleDependency<K, V, C> dependency) { | ||
// If yarn enable retry ApplicationMaster, appId will be not unique and shuffle data will be incorrect, | ||
// appId + timestamp can avoid such problem, | ||
// appId + uuid can avoid such problem, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do we need uuid? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can't get the appId when we try Access, because the appId is generated after the RssManager is created. In order to support push down, we maintain the uuid as a substitute for the appId, and replace the uuid with the appId after the app heartbeat is reported to the coordinator. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can put the appId to the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mean to generate the uuid on the driver? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My fault. We ignore that I can't get the appId in the construct method. Let me think twice. |
||
// can't get appId in construct because SparkEnv is not created yet, | ||
// appId will be initialized only once in this method which | ||
// will be called many times depend on how many shuffle stage | ||
if ("".equals(appId)) { | ||
appId = SparkEnv.get().conf().getAppId() + "_" + System.currentTimeMillis(); | ||
appId = SparkEnv.get().conf().getAppId() + "_" + uuid; | ||
LOG.info("Generate application id used in rss: " + appId); | ||
} | ||
|
||
|
@@ -260,6 +263,7 @@ public <K, V, C> ShuffleHandle registerShuffle(int shuffleId, int numMaps, Shuff | |
} | ||
|
||
private void startHeartbeat() { | ||
shuffleWriteClient.registerApplicationInfo(appId, heartbeatTimeout, user); | ||
if (!sparkConf.getBoolean(RssSparkConfig.RSS_TEST_FLAG.key(), false) && !heartbeatStarted) { | ||
heartBeatScheduledExecutorService.scheduleAtFixedRate( | ||
() -> { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we remove this if mr don't need to register application info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this because
applicationManager
remove expired app needuser
. When our Spark does not use the AppQuota checker, it also needs register application info, so it will not have much impact here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that this isn't compatible feature. If we use old client to connect new service, something wrong will happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because the old client does not have user information or method of
registerApplicationInfo
, app can only be updated atrefreshApp
, which I am compatible with.