New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[compiler] don't force TableRead to produce a TableValue #13229
Conversation
424b5f3
to
ce9ec06
Compare
Put WIP back on, noticed something I still want to look into. |
Alright, should be ready now. |
final def lower(ctx: ExecuteContext, requestedType: TableType, dropRows: Boolean): TableStage = | ||
if (dropRows) { | ||
val globals = lowerGlobals(ctx, requestedType.globalType) | ||
|
||
TableStage( | ||
globals, | ||
RVDPartitioner.empty(ctx, requestedType.keyType), | ||
TableStageDependency.none, | ||
MakeStream(FastIndexedSeq(), TStream(TStruct.empty)), | ||
(_: Ref) => MakeStream(FastIndexedSeq(), TStream(requestedType.rowType))) | ||
} else | ||
_lower(ctx, requestedType) | ||
|
||
def _lower(ctx: ExecuteContext, requestedType: TableType): TableStage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what this change adds other than a template pattern which I generally avoid whenever possible. I preferred keeping this logic in LowerTableIr
.
How is _lower
different to lower
? Who else uses lower
with dropRows
other than TableRead
? _lower
seems like something that shouldn't be public and calling it directly seems like a mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pattern we use occasionally. _lower
should really be protected, I'll make that change, and it is only called directly in lower
, which is final. So _lower
is the customization point for subclasses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I was trying to understand is why do we need _lower
at all when TableRead
seems to be the only place that uses dropRows
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I understand now. That's fair. Addressed.
def toExecuteIntermediate(ctx: ExecuteContext, requestedType: TableType, dropRows: Boolean): TableExecuteIntermediate = { | ||
assert(!dropRows) | ||
TableExecuteIntermediate(_lower(ctx, requestedType)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this interface change. One thing I'd suggest is to remove the TableExecuteIntermediate
companion object and construct TableStageIntermediate
and TableValueIntermediate
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not a bad change, but given that it will all be going away when everything is lowered, I don't think there's much benefit.
a702b5b
to
3b817ba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- this is a great change
#13313) On the current release (0.2.119), `is.hail.rvd.RVD.localSort` attempts to serialize an entire `is.hail.rvd.RVD` object when called, which causes some operations, like reading a set of VCF files, to fail. That particular operation no longer fails due to some combination of the changes made to reading VCF files in #13206 and #13229, but the serialization could still cause issues if `is.hail.rvd.RVD.localSort` is called in another context. See https://hail.zulipchat.com/#narrow/stream/123011-Hail-Query-Dev/topic/Possible.20bug.20in.20Hail.200.2E2.2E119/near/378610151 for more information.
Currently
TableRead.execute
always produces aTableValueIntermediate
, even though almost allTableReader
s are lowerable, so could produce aTableStageIntermediate
.This pr refactors
TableReader
to allow producing aTableStageIntermediate
in most cases, and to make it clearer which readers still need to be lowered (onlyTableFromBlockMatrixNativeReader
,MatrixVCFReader
, andMatrixPLINKReader
). It also deletes some now dead code.