@phaniarnab
Contributor

This patch enables reuse of RDDs produced by redundant Spark operations. A subset of operations is also persisted in the executors, while the rest are only cached locally. Reusing even unpersisted RDDs allows Spark to apply optimizations and skip stages. In addition, this patch removes the compile-time flag that indicated reuse and instead reuses all RDDs. Local RDD caching is now disabled due to bugs.

The statistics line

LinCache Spark (Col/Loc/Dist): 16/2/2.

indicates the number of reused collects/prefetches (16), local RDDs (2), and persisted RDDs (2).
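
To make the counting concrete, here is a minimal, self-contained sketch of a lineage-keyed reuse cache that tracks hits per category and prints a line in the same layout as the statistic quoted above. It is a toy model, not the code from this patch: `ReuseCache`, `Kind`, `get_or_compute`, `stats_line`, and the lineage key strings are all hypothetical names chosen for illustration, and no Spark API is involved.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Dict, Tuple


class Kind(Enum):
    # Hypothetical categories mirroring the Col/Loc/Dist counters.
    COLLECT = "Col"  # results collected/prefetched to the driver
    LOCAL = "Loc"    # RDDs cached locally, not persisted
    DIST = "Dist"    # RDDs persisted in the executors


@dataclass
class ReuseCache:
    """Toy cache keyed by a lineage string (assumption: equal lineage
    strings denote redundant operations with identical results)."""
    entries: Dict[str, Tuple[Kind, Any]] = field(default_factory=dict)
    hits: Dict[Kind, int] = field(
        default_factory=lambda: {k: 0 for k in Kind})

    def get_or_compute(self, lineage: str, kind: Kind,
                       compute: Callable[[], Any]) -> Any:
        # On a hit, count the reuse under the category the entry was
        # stored with and skip the computation entirely.
        if lineage in self.entries:
            cached_kind, value = self.entries[lineage]
            self.hits[cached_kind] += 1
            return value
        # On a miss, compute once and remember the result.
        value = compute()
        self.entries[lineage] = (kind, value)
        return value

    def stats_line(self) -> str:
        order = (Kind.COLLECT, Kind.LOCAL, Kind.DIST)
        counts = "/".join(str(self.hits[k]) for k in order)
        return f"LinCache Spark (Col/Loc/Dist): {counts}."


if __name__ == "__main__":
    cache = ReuseCache()
    # "op(X)" is a made-up lineage key; the lambda stands in for a Spark job.
    cache.get_or_compute("op(X)", Kind.DIST, lambda: [1, 2, 3])
    cache.get_or_compute("op(X)", Kind.DIST, lambda: [1, 2, 3])  # reused
    print(cache.stats_line())  # LinCache Spark (Col/Loc/Dist): 0/0/1.
```

In this sketch only the second call counts as a reuse, so the counters report hits rather than cache entries, which matches how the line above describes its numbers ("reused" collects, local RDDs, and persisted RDDs).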

@phaniarnab closed this in c93f9f9 on Feb 1, 2023