Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-38909][CORE][YARN] Encapsulate LevelDB used to store remote/external shuffle state as DB #36200

Closed
wants to merge 58 commits into from

Conversation

LuciferYang
Copy link
Contributor

@LuciferYang LuciferYang commented Apr 14, 2022

What changes were proposed in this pull request?

ExternalShuffleBlockResolver, YarnShuffleService and RemoteBlockPushResolver use LevelDB directly, this is not conducive to extending the use of RocksDB in this scenario. This pr is encapsulated for expansibility. It will be the pre-work of SPARK-38888, the main change as follows:

  • Introduce DB interface and implementation of corresponding LevelDB now, RocksDB implementation will be involved in the future
  • Introduce DBBackend enum and new config SHUFFLE_SERVICE_DB_BACKEND to specify a disk-based store used in shuffle service local db, only LEVELDB is supported now, RocksDB implementation will be involved in the future
  • Introduce DBIterator to traverse the items in the DB, only LevelDBIterator is supported now, RocksDBIterator implementation will be involved in the future
  • Introduce DBProvider to initialization DB, it is only used to generate LevelDB now, RocksDB implementation will be involved in the future
  • Change StoreVersion to top-level class, it will be reused by RocksDB implementation
  • Replace the directly use of LevelDB in ExternalShuffleBlockResolver, YarnShuffleService and RemoteBlockPushResolver with DB
  • fix corresponding UTs and test tools

Why are the changes needed?

This is pre work of SPARK-38888

Does this PR introduce any user-facing change?

  • Add new config SHUFFLE_SERVICE_DB_BACKEND(spark.shuffle.service.db.backend), the new config use to use to specify a disk-based store used in shuffle service local db. Only LEVELDB is supported after this pr.

How was this patch tested?

Pass GA

@LuciferYang LuciferYang marked this pull request as draft April 14, 2022 13:11
@LuciferYang LuciferYang changed the title [DON'T MERGE] Encapsulate LevelDB used by ExternalShuffleBlockResolver and YarnShuffleService as LocalDB [DON'T MERGE][CORE][YARN] Encapsulate LevelDB used by ExternalShuffleBlockResolver and YarnShuffleService as LocalDB Apr 15, 2022
@LuciferYang LuciferYang changed the title [DON'T MERGE][CORE][YARN] Encapsulate LevelDB used by ExternalShuffleBlockResolver and YarnShuffleService as LocalDB [SPARK-38909][CORE][YARN] Encapsulate LevelDB used by ExternalShuffleBlockResolver and YarnShuffleService as LocalDB Apr 15, 2022
@LuciferYang LuciferYang changed the title [SPARK-38909][CORE][YARN] Encapsulate LevelDB used by ExternalShuffleBlockResolver and YarnShuffleService as LocalDB [SPARK-38909][CORE][YARN] Encapsulate LevelDB used by ExternalShuffleBlockResolver and YarnShuffleService as DB Apr 24, 2022
@LuciferYang LuciferYang marked this pull request as ready for review April 24, 2022 09:55
@mridulm
Copy link
Contributor

mridulm commented Aug 9, 2022

@dongjoon-hyun

Actually, instead of providing wrapper layer, the final goal is the removal of LevelDB in favor of RocksDB.

I am sure in future, we might have another SPIP to remove RocksDB in favor of something else :-)
If we can abstract it away, the future evolution might be easier !

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a pass through it, looks mostly fine to me - will go over it again later this week.
But will let @dongjoon-hyun drive this review - since he is more close to the overall change.

@mridulm
Copy link
Contributor

mridulm commented Aug 9, 2022

nit: Btw, from a completeness point of view, DB.get makes sense to include :-)

@LuciferYang
Copy link
Contributor Author

nit: Btw, from a completeness point of view, DB.get makes sense to include :-)

OK ~

@LuciferYang
Copy link
Contributor Author

Also cc @tgravescs do you have time to help review this pr? Thanks ~

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but would be good to have more eyes on this.

@tgravescs
Copy link
Contributor

changes look fine to me

@mridulm mridulm closed this in 892a45f Aug 18, 2022
@mridulm
Copy link
Contributor

mridulm commented Aug 18, 2022

Merged to master.
Thanks for working on this @LuciferYang !
Thanks for reviews @dongjoon-hyun, @zhouyejoe and @tgravescs :-)

@LuciferYang
Copy link
Contributor Author

Thanks @mridulm @tgravescs @zhouyejoe @dongjoon-hyun !!!
I will do the RocksDB related work next week due to some other work needs to be completed this week ~

* The local KV storage used to persist the shuffle state,
* the implementations may include LevelDB, RocksDB, etc.
*/
public interface DB extends Closeable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a developer API or user API? I didn't see the custom DB could be plugged in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

developer API. The implementation of RocksDB is under review: #37610,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So could you add annotation @DeveloperApi for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Contributor

@mridulm mridulm Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not support users plugging in a different DB instance - it comes from within spark code.
Supported DB's come from DBBackend and DBProvider.initDB has an explicit switch for the supported types.
Wont be @DeveloperAPI

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mridulm It's a public interface. I think it should either be a user API or developer API, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... should we use @Private as kvstore.KVStore?

@Private
public interface KVStore extends Closeable {

Copy link
Contributor

@mridulm mridulm Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The configuration is public @Ngone51, not the implementation itself.
To put it differently, there is no way for users to leverage any of this - unless they modify spark code.

I am fine with marking it as @Private if we need to make the intent clearer.

It is the same for most submodules in common/* - except probably for common/kvstore, where Marcello marked the interfaces as @Private :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger.info("Configured {} as {} and actually used value {}",
Constants.SHUFFLE_SERVICE_DB_BACKEND, dbBackendName, dbBackend);
}
db = DBProvider.initDB(dbBackend, this.registeredExecutorFile, CURRENT_VERSION, mapper);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, this should be put inside the if block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. I'll give a follow-up to fix this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need change as

if (registeredExecutorFile != null) {
      ....
      db = DBProvider.initDB(dbBackend, this.registeredExecutorFile, CURRENT_VERSION, mapper);
    } else {
      db = null;
    }

due to db is defined as final, and when 'registeredExecutorFile' is null, 'DBProvider.initDB' will return null directly . So does DBProvider.initDB still need to be put inside the if block ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, I think if (registeredExecutorFile != null) is actually useless here..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

@mridulm mridulm Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DBProvider.initDB handles null registeredExecutorFile as input.
Given db is a final variable, it is better to update it at a single place, and not across if/else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to delete if (registeredExecutorFile != null) condition

Copy link
Contributor

@mridulm mridulm Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current code in #37648 looks ok to me.

val shuffleDBName = sparkConf.get(config.SHUFFLE_SERVICE_DB_BACKEND)
val dBBackend = DBBackend.byName(shuffleDBName)
logInfo(s"Configured ${config.SHUFFLE_SERVICE_DB_BACKEND.key} as $shuffleDBName " +
s"and actually used value ${dBBackend.name()} ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like we may use a different db backend regardless of the user configured value in some cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, because the DBBackend.byName method has changed during cr, this log is outdated. I will fix it in the follow-up also.

mridulm pushed a commit that referenced this pull request Sep 8, 2022
…ated to shuffle state db

### What changes were proposed in this pull request?
This is a followup of #36200, the main change of this pr as follows:

- Make `DB` and `DBProvider` as Private to make the API intent clearer
- Delete `remove` method from `DBProvider` witch is unnecessary `Override`
- Remove the useless null check condition from `ExternalShuffleBlockResolver` and `RemoteBlockPushResolver`, fix related sutes
- Correction log print content

### Why are the changes needed?
Fix new comments after #36200 merged.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass All GitHub Actions

Closes #37648 from LuciferYang/SPARK-38909-followup.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants