Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[query] Use EncodedLiteral instead of Literal from python to scala #13814

Merged
merged 40 commits into from Oct 25, 2023

Conversation

daniel-goldstein
Copy link
Contributor

@daniel-goldstein daniel-goldstein commented Oct 13, 2023

CHANGELOG: Pipelines that are memory-bound by copious use of hl.literal, such as vds.filter_intervals, require substantially less memory.

Closes #13757

Copy link
Collaborator

@danking danking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v. nice

hail/python/hail/expr/types.py Show resolved Hide resolved
hail/python/hail/expr/types.py Show resolved Hide resolved
@@ -1754,6 +1851,9 @@ def call_allele_pair(i):

return genetics.Call(alleles, phased)

def _convert_to_encoding(self, byte_writer, value):
pass # TODO
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh yeah this one is a bit complex

)
def test_literal_encoding(value):
lit = hl.literal(value)
round_trip = lit.dtype._from_encoding(lit.dtype._to_encoding(value))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love me a round-trip test

@daniel-goldstein daniel-goldstein dismissed danking’s stale review October 16, 2023 15:44

Should be mostly if not entirely passing now.

byte_writer.write_int64(dim)

if self.element_type in _numeric_types:
byte_writer.write_bytes(value.data)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting the element ordering right here makes me a little nervous. Should maybe have some more rigorous tests.

interval = hl.Interval(start=start, end=end, includes_start=True, includes_end=False)
_assert_encoding_roundtrip(start)
_assert_encoding_roundtrip(end)
_assert_encoding_roundtrip(interval)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each of these three tests is slightly different, but they all ultimately call the same method. What's the argument against just parameterizing _assert_encoding_roundtrip itself?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that at first, and looks like I can do this with Call, but I can't instantiate a Locus before init (because it tries to load a reference genome) so I can't include a Locus in the parametrize.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahah right

{"foo", "bar", "baz"},
{"a": 1, "b": 2},
np.array([[1, 2], [3, 4], [5, 6]]),
np.array([[1, 2], [3, 4], [5, 6]]).T,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pursuant to your comment above can we add:

  1. Test for 0d np.array
  2. Test for 1d np.array
  3. Test for 2d np.array
  4. Test for 4d np.array
  5. Transposition of all of the above

danking
danking previously approved these changes Oct 18, 2023
byte_writer.write_bytes(value.data)
else:
for elem in np.nditer(value, order='F'):
self.element_type._convert_to_encoding(byte_writer, elem)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel good about this now. You're using ENDArrayColumnMajor and the column ordering is Fortran, so ti all checks out.

Copy link
Collaborator

@danking danking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bugs about getting non integral values is because pandas is giving you a pandas NA.

while i < length:
missing_byte = 0
for j in range(min(8, length - i)):
if value[i + j] is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if value[i + j] is None:
if value[i + j] in (None, pd.NA):

while i < length:
missing_byte = 0
for j in range(min(8, length - i)):
if value[keys[i + j]] is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here for missing values from pandas.

@daniel-goldstein daniel-goldstein force-pushed the encode-literals-python-to-scala branch 2 times, most recently from 6756fff to b5719f9 Compare October 19, 2023 16:05
@daniel-goldstein daniel-goldstein dismissed danking’s stale review October 19, 2023 16:09

dismissing as theres been a non-trivial change

@danking
Copy link
Collaborator

danking commented Oct 20, 2023

Hmm. Seems like the ArraySorter still isn't quite right?

I also saw some issues with sockets timing out but I don't know what to make of those yet.

_______________________________ test_union_rows1 _______________________________

    @test_timeout(local=3 * 60)
    def test_union_rows1():
        vds = hl.vds.read_vds(os.path.join(resource('vds'), '1kg_chr22_5_samples.vds'))
    
        vds1 = hl.vds.filter_intervals(vds,
                                       [hl.parse_locus_interval('chr22:start-10754094', reference_genome='GRCh38')],
                                       split_reference_blocks=True)
        vds2 = hl.vds.filter_intervals(vds,
                                       [hl.parse_locus_interval('chr22:10754094-end', reference_genome='GRCh38')],
                                       split_reference_blocks=True)
    
    
        vds_union = vds1.union_rows(vds2)
>       assert hl.vds.to_dense_mt(vds)._same(hl.vds.to_dense_mt(vds_union))

test/hail/vds/test_vds.py:597: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
<decorator-gen-1386>:2: in _same
    ???
/usr/local/lib/python3.9/dist-packages/hail/typecheck/check.py:587: in wrapper
    return __original_func(*args_, **kwargs_)
/usr/local/lib/python3.9/dist-packages/hail/matrixtable.py:3762: in _same
    return self._localize_entries(entries_name, cols_name)._same(
<decorator-gen-1276>:2: in _same
    ???
/usr/local/lib/python3.9/dist-packages/hail/typecheck/check.py:587: in wrapper
    return __original_func(*args_, **kwargs_)
/usr/local/lib/python3.9/dist-packages/hail/table.py:3658: in _same
    mismatched_globals, mismatched_rows = t.aggregate(hl.tuple((
<decorator-gen-1216>:2: in aggregate
    ???
/usr/local/lib/python3.9/dist-packages/hail/typecheck/check.py:587: in wrapper
    return __original_func(*args_, **kwargs_)
/usr/local/lib/python3.9/dist-packages/hail/table.py:1285: in aggregate
    return Env.backend().execute(hl.ir.MakeTuple([agg_ir]))[0]
/usr/local/lib/python3.9/dist-packages/hail/backend/py4j_backend.py:86: in execute
    raise e.maybe_user_error(ir) from None
/usr/local/lib/python3.9/dist-packages/hail/backend/py4j_backend.py:76: in execute
    result_tuple = self._jbackend.executeEncode(jir, stream_codec, timed)
/usr/local/lib/python3.9/dist-packages/py4j/java_gateway.py:1321: in __call__
    return_value = get_return_value(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = ('xro552', <py4j.java_gateway.GatewayClient object at 0x7f01e1182160>, 'o1', 'executeEncode')
kwargs = {}
pyspark = <module 'pyspark' from '/usr/local/lib/python3.9/dist-packages/pyspark/__init__.py'>
s = 'java.lang.RuntimeException: invalid memory access: 120001305f726574/00000004: not in 7f15e9ffb1f8/00010000'
tpl = JavaObject id=o553
deepest = 'RuntimeException: invalid memory access: 120001305f726574/00000004: not in 7f15e9ffb1f8/00010000'
full = 'java.lang.RuntimeException: invalid memory access: 120001305f726574/00000004: not in 7f15e9ffb1f8/00010000\n\tat is.h...ava:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n\tat java.lang.Thread.run(Thread.java:750)\n\n\n'
error_id = -1

    def deco(*args, **kwargs):
        import pyspark
        try:
            return f(*args, **kwargs)
        except py4j.protocol.Py4JJavaError as e:
            s = e.java_exception.toString()
    
            # py4j catches NoSuchElementExceptions to stop array iteration
            if s.startswith('java.util.NoSuchElementException'):
                raise
    
            tpl = Env.jutils().handleForPython(e.java_exception)
            deepest, full, error_id = tpl._1(), tpl._2(), tpl._3()
>           raise fatal_error_from_java_error_triplet(deepest, full, error_id) from None
E           hail.utils.java.FatalError: RuntimeException: invalid memory access: 120001305f726574/00000004: not in 7f15e9ffb1f8/00010000
E           
E           Java stack trace:
E           java.lang.RuntimeException: invalid memory access: 120001305f726574/00000004: not in 7f15e9ffb1f8/00010000
E           	at is.hail.annotations.Memory.checkAddress(Memory.java:226)
E           	at is.hail.annotations.Memory.loadInt(Memory.java:140)
E           	at is.hail.annotations.Region$.loadInt(Region.scala:20)
E           	at __C92844etypeDecode.__m92863ord_compareNonnull(Unknown Source)
E           	at __C92844etypeDecode.__m92862ord_compareNonnull(Unknown Source)
E           	at __C92844etypeDecode.__m92861ord_ltNonnull(Unknown Source)
E           	at __C92844etypeDecode.__m92860ord_lt(Unknown Source)
E           	at __C92844etypeDecode.__m92857arraySorter_merge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92864arraySorter_splitMerge(Unknown Source)
E           	at __C92844etypeDecode.__m92856arraySorter_outer(Unknown Source)
E           	at __C92844etypeDecode.__m92846DECODE_o_dict_of_o_struct_of_o_binaryANDo_int32END_TO_SIndexablePointer(Unknown Source)
E           	at __C92844etypeDecode.apply(Unknown Source)
E           	at is.hail.io.CompiledDecoder.readRegionValue(Decoder.scala:31)
E           	at is.hail.io.AbstractTypedCodecSpec.decodeArrays(CodecSpec.scala:57)
E           	at is.hail.io.AbstractTypedCodecSpec.decodeArrays$(CodecSpec.scala:54)
E           	at is.hail.io.TypedCodecSpec.decodeArrays(TypedCodecSpec.scala:19)
E           	at is.hail.expr.ir.Interpret$.$anonfun$run$1(Interpret.scala:80)
E           	at is.hail.utils.package$.using(package.scala:657)
E           	at is.hail.annotations.RegionPool.scopedRegion(RegionPool.scala:162)
E           	at is.hail.expr.ir.Interpret$.run(Interpret.scala:79)
E           	at is.hail.expr.ir.Interpret$.interpret$1(Interpret.scala:67)
E           	at is.hail.expr.ir.Interpret$.run(Interpret.scala:110)
E           	at is.hail.expr.ir.Interpret$.alreadyLowered(Interpret.scala:58)
E           	at is.hail.expr.ir.FoldConstants$.$anonfun$foldConstants$1(FoldConstants.scala:47)
E           	at is.hail.expr.ir.RewriteBottomUp$.$anonfun$apply$2(RewriteBottomUp.scala:11)
E           	at is.hail.utils.StackSafe$More.advance(StackSafe.scala:60)
E           	at is.hail.utils.StackSafe$.run(StackSafe.scala:16)
E           	at is.hail.utils.StackSafe$StackFrame.run(StackSafe.scala:32)
E           	at is.hail.expr.ir.RewriteBottomUp$.apply(RewriteBottomUp.scala:21)
E           	at is.hail.expr.ir.FoldConstants$.foldConstants(FoldConstants.scala:13)
E           	at is.hail.expr.ir.FoldConstants$.$anonfun$apply$1(FoldConstants.scala:10)
E           	at is.hail.backend.ExecuteContext$.$anonfun$scopedNewRegion$1(ExecuteContext.scala:86)
E           	at is.hail.utils.package$.using(package.scala:657)
E           	at is.hail.annotations.RegionPool.scopedRegion(RegionPool.scala:162)
E           	at is.hail.backend.ExecuteContext$.scopedNewRegion(ExecuteContext.scala:83)
E           	at is.hail.expr.ir.FoldConstants$.apply(FoldConstants.scala:9)
E           	at is.hail.expr.ir.Optimize$.$anonfun$apply$4(Optimize.scala:22)
E           	at is.hail.expr.ir.Optimize$.$anonfun$apply$1(Optimize.scala:15)
E           	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
E           	at is.hail.expr.ir.Optimize$.runOpt$1(Optimize.scala:15)
E           	at is.hail.expr.ir.Optimize$.$anonfun$apply$2(Optimize.scala:22)
E           	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
E           	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
E           	at is.hail.expr.ir.Optimize$.apply(Optimize.scala:18)
E           	at is.hail.expr.ir.lowering.OptimizePass.transform(LoweringPass.scala:40)
E           	at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$3(LoweringPass.scala:26)
E           	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
E           	at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$1(LoweringPass.scala:26)
E           	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
E           	at is.hail.expr.ir.lowering.LoweringPass.apply(LoweringPass.scala:24)
E           	at is.hail.expr.ir.lowering.LoweringPass.apply$(LoweringPass.scala:23)
E           	at is.hail.expr.ir.lowering.OptimizePass.apply(LoweringPass.scala:36)
E           	at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1(LoweringPipeline.scala:22)
E           	at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1$adapted(LoweringPipeline.scala:20)
E           	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
E           	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
E           	at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
E           	at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:20)
E           	at is.hail.expr.ir.Compile$.$anonfun$apply$4(Compile.scala:45)
E           	at is.hail.backend.BackendWithCodeCache.lookupOrCompileCachedFunction(Backend.scala:126)
E           	at is.hail.backend.BackendWithCodeCache.lookupOrCompileCachedFunction$(Backend.scala:122)
E           	at is.hail.backend.local.LocalBackend.lookupOrCompileCachedFunction(LocalBackend.scala:73)
E           	at is.hail.expr.ir.Compile$.apply(Compile.scala:39)
E           	at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$_apply$5(CompileAndEvaluate.scala:66)
E           	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
E           	at is.hail.expr.ir.CompileAndEvaluate$._apply(CompileAndEvaluate.scala:66)
E           	at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$apply$1(CompileAndEvaluate.scala:19)
E           	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
E           	at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:19)
E           	at is.hail.expr.ir.lowering.LowerDistributedSort$.distributedSort(LowerDistributedSort.scala:158)
E           	at is.hail.backend.local.LocalBackend.lowerDistributedSort(LocalBackend.scala:331)
E           	at is.hail.expr.ir.lowering.LowerAndExecuteShuffles$.$anonfun$apply$1(LowerAndExecuteShuffles.scala:67)
E           	at is.hail.expr.ir.RewriteBottomUp$.$anonfun$apply$2(RewriteBottomUp.scala:11)
E           	at is.hail.utils.StackSafe$More.advance(StackSafe.scala:60)
E           	at is.hail.utils.StackSafe$.run(StackSafe.scala:16)
E           	at is.hail.utils.StackSafe$StackFrame.run(StackSafe.scala:32)
E           	at is.hail.expr.ir.RewriteBottomUp$.apply(RewriteBottomUp.scala:21)
E           	at is.hail.expr.ir.lowering.LowerAndExecuteShuffles$.apply(LowerAndExecuteShuffles.scala:14)
E           	at is.hail.expr.ir.lowering.LowerAndExecuteShufflesPass.transform(LoweringPass.scala:167)
E           	at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$3(LoweringPass.scala:26)
E           	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
E           	at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$1(LoweringPass.scala:26)
E           	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
E           	at is.hail.expr.ir.lowering.LoweringPass.apply(LoweringPass.scala:24)
E           	at is.hail.expr.ir.lowering.LoweringPass.apply$(LoweringPass.scala:23)
E           	at is.hail.expr.ir.lowering.LowerAndExecuteShufflesPass.apply(LoweringPass.scala:161)
E           	at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1(LoweringPipeline.scala:22)
E           	at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1$adapted(LoweringPipeline.scala:20)
E           	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
E           	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
E           	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
E           	at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:20)
E           	at is.hail.expr.ir.lowering.EvalRelationalLets$.execute$1(EvalRelationalLets.scala:10)
E           	at is.hail.expr.ir.lowering.EvalRelationalLets$.lower$1(EvalRelationalLets.scala:18)
E           	at is.hail.expr.ir.lowering.EvalRelationalLets$.apply(EvalRelationalLets.scala:32)
E           	at is.hail.expr.ir.lowering.EvalRelationalLetsPass.transform(LoweringPass.scala:157)
E           	at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$3(LoweringPass.scala:26)
E           	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
E           	at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$1(LoweringPass.scala:26)
E           	at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
E           	at is.hail.expr.ir.lowering.LoweringPass.apply(LoweringPass.scala:24)
E           	at is.hail.expr.ir.lowering.LoweringPass.apply$(LoweringPass.scala:23)
E           	at is.hail.expr.ir.lowering.EvalRelationalLetsPass.apply(LoweringPass.scala:151)
E           	at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1(LoweringPipeline.scala:22)
E           	at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1$adapted(LoweringPipeline.scala:20)
E           	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
E           	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
E           	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
E           	at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:20)
E           	at is.hail.backend.local.LocalBackend._jvmLowerAndExecute(LocalBackend.scala:150)
E           	at is.hail.backend.local.LocalBackend._execute(LocalBackend.scala:189)
E           	at is.hail.backend.local.LocalBackend.$anonfun$executeToEncoded$1(LocalBackend.scala:209)
E           	at is.hail.backend.ExecuteContext$.$anonfun$scoped$3(ExecuteContext.scala:76)
E           	at is.hail.utils.package$.using(package.scala:657)
E           	at is.hail.backend.ExecuteContext$.$anonfun$scoped$2(ExecuteContext.scala:76)
E           	at is.hail.utils.package$.using(package.scala:657)
E           	at is.hail.annotations.RegionPool$.scoped(RegionPool.scala:17)
E           	at is.hail.backend.ExecuteContext$.scoped(ExecuteContext.scala:62)
E           	at is.hail.backend.local.LocalBackend.$anonfun$withExecuteContext$1(LocalBackend.scala:109)
E           	at is.hail.backend.local.LocalBackend.executeToEncoded(LocalBackend.scala:208)
E           	at is.hail.backend.local.LocalBackend.$anonfun$executeEncode$2(LocalBackend.scala:237)
E           	at is.hail.backend.ExecuteContext$.$anonfun$scoped$3(ExecuteContext.scala:76)
E           	at is.hail.utils.package$.using(package.scala:657)
E           	at is.hail.backend.ExecuteContext$.$anonfun$scoped$2(ExecuteContext.scala:76)
E           	at is.hail.utils.package$.using(package.scala:657)
E           	at is.hail.annotations.RegionPool$.scoped(RegionPool.scala:17)
E           	at is.hail.backend.ExecuteContext$.scoped(ExecuteContext.scala:62)
E           	at is.hail.backend.local.LocalBackend.$anonfun$withExecuteContext$1(LocalBackend.scala:109)
E           	at is.hail.backend.local.LocalBackend.$anonfun$executeEncode$1(LocalBackend.scala:236)
E           	at is.hail.utils.ExecutionTimer$.time(ExecutionTimer.scala:52)
E           	at is.hail.backend.local.LocalBackend.executeEncode(LocalBackend.scala:234)
E           	at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
E           	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
E           	at java.lang.reflect.Method.invoke(Method.java:498)
E           	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
E           	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
E           	at py4j.Gateway.invoke(Gateway.java:282)
E           	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
E           	at py4j.commands.CallCommand.execute(CallCommand.java:79)
E           	at py4j.GatewayConnection.run(GatewayConnection.java:238)
E           	at java.lang.Thread.run(Thread.java:750)
E           
E           
E           
E           Hail version: 0.2.124-b31f1dcea5d2
E           Error summary: RuntimeException: invalid memory access: 120001305f726574/00000004: not in 7f15e9ffb1f8/00010000

Copy link
Collaborator

@danking danking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something still seems off here b/c of the test failures


val sorter = new ArraySorter(EmitRegion(cb.emb, region), ab)
def lessThan(cb: EmitCodeBuilder, region: Value[Region], l: Value[_], r: Value[_]): Value[Boolean] = {
val lk = cb.memoize(sct.loadToSValue(cb, l).asBaseStruct.loadField(cb, 0))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@patrick-schultz is a missing key value handled correctly by this code?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given how we're generating these from Python, how could a key value ever be missing?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything obviously wrong. I can try checking out the branch and debugging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Patrick!

@danking
Copy link
Collaborator

danking commented Oct 25, 2023

Hmm. What's special about hl.concordance...

The other issue seems like it might be simpler, we're just setting a bad type on a field:

E           java.lang.ClassFormatError: Invalid signature for field in class __C8802Compiled referenced from constant pool index 1605 in method __C8802Compiled.addAndDecodeLiterals_region0_0(L__C9346addAndDecodeLiteralsSpills;)V

@daniel-goldstein
Copy link
Contributor Author

daniel-goldstein commented Oct 25, 2023

Ah we have the same ordering problem with sets as we have with dicts

@danking
Copy link
Collaborator

danking commented Oct 25, 2023

We sure do

@danking danking merged commit eabff3d into hail-is:main Oct 25, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants