SPARK-2582. Make Block Manager Master pluggable. #1506
harishreedharan wants to merge 4 commits into apache:master
Conversation
This patch makes BlockManagerMaster a trait, renames the current implementation to StandaloneBlockManagerMaster, and makes it one of the possible implementations. An additional (as yet undocumented) configuration parameter selects which BlockManagerMaster type to use. Later, when we add BlockManagerMasters that write metadata to HDFS or replicate it, we can add other values mapping to those implementations. There is no change in current behavior. We must also require other implementations to use the current Akka actor itself, so the code in the BlockManager does not need to care which implementation is used on the BMM side. I am not sure how to enforce this. It is not too much of a concern, though: since we don't have to make it pluggable with arbitrary external classes, the only options would be part of Spark, so this should be fairly easy to enforce.
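The shape of the change described above could look roughly like the following sketch. All names here (the trait's methods, the factory, and the config value) are illustrative assumptions, not the actual code from this patch:

```scala
// Hypothetical sketch of the trait-plus-factory approach described in the
// PR description. Method names and the "standalone" config value are
// assumptions for illustration, not Spark's actual API.
trait BlockManagerMaster {
  def registerBlockManager(executorId: String): Unit
  def stop(): Unit
}

// The existing implementation, renamed; its behavior is unchanged.
class StandaloneBlockManagerMaster extends BlockManagerMaster {
  private var registered = List.empty[String]
  def registerBlockManager(executorId: String): Unit =
    registered ::= executorId
  def stop(): Unit = registered = Nil
}

object BlockManagerMaster {
  // A small factory keyed on a (hypothetical) configuration value; a future
  // HDFS-backed or replicating master would add more cases here.
  def create(masterType: String): BlockManagerMaster = masterType match {
    case "standalone" => new StandaloneBlockManagerMaster
    case other =>
      throw new IllegalArgumentException(s"Unknown BlockManagerMaster type: $other")
  }
}
```

Keying on a closed set of known values (rather than an arbitrary class name) is what makes the "only options would be part of Spark" enforcement straightforward, at the cost of requiring a code change for each new implementation.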
Can one of the admins verify this patch?
…ance in BlockManagerMaster
So if I want to use a different BlockManagerMaster implementation, do I have to modify this code to support the new type?
It is preferable to use a fully qualified class name to allow pluggability: as long as the class is on the classpath, it can be used as the BlockManagerMaster implementation.
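The FQCN-based alternative suggested here would instantiate the master reflectively from a configured class name. A minimal sketch, assuming a no-arg constructor (the config key and helper name are hypothetical, not from this patch):

```scala
// Illustrative only: load and instantiate an implementation from a fully
// qualified class name, as the reviewer suggests. Any class on the
// classpath with a no-arg constructor could be plugged in this way.
def instantiate[T](className: String): T =
  Class.forName(className) // resolves the class if it is on the classpath
    .getDeclaredConstructor()
    .newInstance()
    .asInstanceOf[T]
```

For example, `instantiate[java.util.List[String]]("java.util.ArrayList")` returns a fresh list. This is exactly the flexibility the reply below objects to: with arbitrary class names, Spark cannot easily guarantee the implementation speaks the expected Akka actor protocol.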
I thought about doing that, but it does not give us a way to force the implementation to use the Akka BlockManagerMasterActor, since the Block Managers would continue to use that. If we could somehow force that, then just using the FQCN would be a good idea.
Conflicts: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala
Hey @harishreedharan, thanks for submitting, but I'd like to close this PR for now pending a more complete design proposal for how external implementations of the block storage service would work. There are a bunch of other challenges in decoupling the block storage service from the SparkContext... it's definitely an interesting idea longer term, but one that would need a thorough design and consensus. If we make this pluggable, it will signal to the community that we want to head in this direction, so I'd propose getting consensus on that before proceeding. For streaming specifically, I know you and @tdas are working on some more specific mechanisms to provide H/A in that case.