Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20647][core] Port StorageTab to the new UI backend. #19679

Closed
wants to merge 4 commits into from

Conversation

vanzin
Copy link
Contributor

@vanzin vanzin commented Nov 6, 2017

This required adding information about StreamBlockId to the store,
which is not available yet via the API. So an internal type was added
until there's a need to expose that information in the API.

The UI only lists RDDs that have cached partitions, and that information
wasn't being correctly captured in the listener, so that's also fixed,
along with some minor (internal) API adjustments so that the UI can
get the correct data.

Because of the way partitions are cached, some optimizations w.r.t. how
often the data is flushed to the store could not be applied to this code;
because of that, some different ways to make the code more performant
were added to the data structures tracking RDD blocks, with the goal of
avoiding expensive copies when lots of blocks are being updated.

Tested with existing and updated unit tests.

This required adding information about StreamBlockId to the store,
which is not available yet via the API. So an internal type was added
until there's a need to expose that information in the API.

The UI only lists RDDs that have cached partitions, and that information
wasn't being correctly captured in the listener, so that's also fixed,
along with some minor (internal) API adjustments so that the UI can
get the correct data.

Because of the way partitions are cached, some optimizations w.r.t. how
often the data is flushed to the store could not be applied to this code;
because of that, some different ways to make the code more performant
were added to the data structures tracking RDD blocks, with the goal of
avoiding expensive copies when lots of blocks are being updated.

Tested with existing and updated unit tests.
@vanzin
Copy link
Contributor Author

vanzin commented Nov 6, 2017

For context:

@vanzin
Copy link
Contributor Author

vanzin commented Nov 6, 2017

Note that unit tests will fail; they depend on changes being added in #19678. I'm posting this to speed up the review process.

@SparkQA
Copy link

SparkQA commented Nov 6, 2017

Test build #83509 has finished for PR 19679 at commit 7147bd2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@squito squito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm (at least, as much I can say now till the other parts are merged in and tests pass etc.)

I still need to make sure I understand StreamBlocks better before I would merge, but I do think you've got the same behavior as before

@@ -685,12 +700,55 @@ class AppStatusListenerSuite extends SparkFunSuite with BeforeAndAfter {
assert(exec.info.diskUsed === 0L)
}

// Add a block from a different RDD. Verify the executor is updated correctly and also that
// the distribution data for rdd1 is up to date.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're only checking the distribution data for rdd2, right?

@squito
Copy link
Contributor

squito commented Nov 7, 2017

I figured out what is going on with StreamBlocks now, both before and after this change, so I'm OK with it

@SparkQA
Copy link

SparkQA commented Nov 7, 2017

Test build #83560 has finished for PR 19679 at commit 7744c6c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 8, 2017

Test build #83604 has finished for PR 19679 at commit e257a1a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 9, 2017

Test build #83619 has finished for PR 19679 at commit fd59a24.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Copy link
Contributor

squito commented Nov 9, 2017

merged to master

@asfgit asfgit closed this in 6ae1271 Nov 9, 2017
@vanzin vanzin deleted the SPARK-20647 branch November 9, 2017 23:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants