@andrewpalumbo

Currently, the optimizer generates checkpoints and attaches them to actual logical elements of the DAG via CheckpointAction$cp.

i.e.:

val drmC = drmA + drmB

val cp1 = drmC.checkpoint() // checkpoint
val cp2 = drmC.checkpoint() // cp2 == cp1

val drmD = cp1 + drmE // reuses cp1

but in:
val drmD = drmC + drmE // recomputes drmA + drmB all over

drmC already has cp1 attached to it, so we should assume that reusing the common computational path is the intent here, and use it instead of building plans that recompute it. That is,

drmD = drmC + drmE should imply cp1 + drmE as well, even if the checkpoint is not used explicitly.
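
The intended behaviour can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node`, `Leaf`, `Plus` and the `evaluations` counter are hypothetical names invented for illustration and are not Mahout's classes, and small vectors stand in for DRMs. It shows a checkpoint attached to a logical node being picked up by any later plan that references that node.

```scala
// Toy model only: these types are hypothetical and are NOT Mahout's classes.
object CheckpointSketch {
  // Counts how many times a Plus node is actually computed.
  var evaluations = 0

  sealed trait Node {
    // Result attached to this logical node by checkpoint(), if any.
    private var cached: Option[Vector[Double]] = None

    // Computes the node without consulting the attached checkpoint.
    protected def compute(): Vector[Double]

    // Reuses the attached checkpoint when present, otherwise computes.
    def eval(): Vector[Double] = cached.getOrElse(compute())

    // Materializes once and attaches the result to this node (idempotent).
    def checkpoint(): Node = {
      if (cached.isEmpty) cached = Some(compute())
      this
    }

    def +(that: Node): Node = Plus(this, that)
  }

  final case class Leaf(values: Vector[Double]) extends Node {
    protected def compute(): Vector[Double] = values
  }

  final case class Plus(a: Node, b: Node) extends Node {
    protected def compute(): Vector[Double] = {
      evaluations += 1
      a.eval().zip(b.eval()).map { case (x, y) => x + y }
    }
  }

  def main(args: Array[String]): Unit = {
    val drmA = Leaf(Vector(1.0, 2.0))
    val drmB = Leaf(Vector(10.0, 20.0))
    val drmE = Leaf(Vector(100.0, 200.0))

    val drmC = drmA + drmB
    drmC.checkpoint()          // computes drmA + drmB once
    val drmD = drmC + drmE     // no explicit use of the checkpoint
    println(drmD.eval())
    println(s"Plus nodes computed: $evaluations")
  }
}
```

In this sketch, evaluating `drmD` computes only its own `Plus` node, because `drmC.eval()` finds the result that `checkpoint()` attached earlier. Without the `checkpoint()` call, the same expression would compute `drmA + drmB` again.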

This PR allows us to avoid excessive declarations like

val drmAcp = drmA.checkpoint()

val drmB = drmAcp %*% ...

and instead just use

drmA.checkpoint()

val drmB = drmA %*% ...

…piointAction in physical translation and use its caching policy for the physical checkpoint
@smarthi

smarthi commented Mar 8, 2016

LGTM
