MatrixKeyRowsBy with isSorted=true does not change RVD type after casting to a Table #5396

Closed
chrisvittal opened this issue Feb 20, 2019 · 2 comments · Fixed by #5693

chrisvittal (Collaborator) commented Feb 20, 2019

Code:

import hail as hl
from hail import ir

hl.init()
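mt = hl.import_vcf('path/to/vcf')
# import_vcf keys rows by (locus, alleles); rekey the rows by 'locus' only.
mt = hl.MatrixTable(ir.MatrixKeyRowsBy(mt._mir, ['locus'], is_sorted=True))
# Cast to a Table: entries go into row field '_e', columns into global field '_c'.
ht = mt._localize_entries('_e', '_c')
# Zip-join the table with itself; the error surfaces when write() executes the join.
j = hl.Table._multi_way_zip_join([ht, ht], 'd', 'g')
j.write('tst.ht')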

Java Stack Trace:

py4j.protocol.Py4JJavaError: An error occurred while calling z:is.hail.expr.ir.Interpret.interpretJSON.
: java.util.NoSuchElementException: key not found: alleles
	at scala.collection.MapLike$class.default(MapLike.scala:228)
	at scala.collection.AbstractMap.default(Map.scala:59)
	at scala.collection.MapLike$class.apply(MapLike.scala:141)
	at scala.collection.AbstractMap.apply(Map.scala:59)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at is.hail.rvd.RVDType.<init>(RVDType.scala:24)
	at is.hail.expr.ir.TableMultiWayZipJoin.execute(TableIR.scala:669)
	at is.hail.expr.ir.Interpret$.apply(Interpret.scala:775)
	at is.hail.expr.ir.Interpret$.apply(Interpret.scala:93)
	at is.hail.expr.ir.Interpret$.apply(Interpret.scala:63)
	at is.hail.expr.ir.Interpret$.interpretJSON(Interpret.scala:22)
	at is.hail.expr.ir.Interpret.interpretJSON(Interpret.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:280)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:214)
	at java.lang.Thread.run(Thread.java:748)

Some debugging showed that the RVD key type for the children of MultiWayZipJoin was still [locus, alleles]; I would expect the RVD key to be [locus].

tpoterba (Collaborator) commented Feb 22, 2019

The RVD key type is the physical key, so it is totally allowed for it to still be [locus, alleles] here. I believe the bug is in the multi-way zip join, which needs to truncate the key to the join key first.
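
To illustrate the point about truncating the key, here is a minimal pure-Python sketch. It is not Hail's implementation. `key_field_indices` and `truncate_key` are hypothetical helpers, and the joined row layout (`locus` plus the array field `d`) is an assumption based on the repro above. It only shows why resolving the physical key [locus, alleles] against a row that carries just the join key fails with a missing `alleles` field, and how truncating to the join key avoids that.

```python
from typing import List, Sequence


def key_field_indices(row_fields: Sequence[str], key: Sequence[str]) -> List[int]:
    """Toy stand-in for resolving key fields against a row type.

    Raises KeyError when a key field is missing from the row.
    """
    index = {name: i for i, name in enumerate(row_fields)}
    return [index[name] for name in key]


def truncate_key(physical_key: Sequence[str], join_key: Sequence[str]) -> List[str]:
    """Keep only the leading part of the physical key that the join uses."""
    n = len(join_key)
    if list(physical_key[:n]) != list(join_key):
        raise ValueError("join key must be a prefix of the physical key")
    return list(physical_key[:n])


physical_key = ["locus", "alleles"]   # key reported for the children in the issue
join_key = ["locus"]                  # key set by MatrixKeyRowsBy in the repro
joined_row_fields = ["locus", "d"]    # assumed layout: join key plus array field 'd'

# Using the untruncated physical key fails, analogous to "key not found: alleles".
try:
    key_field_indices(joined_row_fields, physical_key)
except KeyError as err:
    print("lookup failed for key field:", err)

# Truncating to the join key first makes the lookup succeed.
print(key_field_indices(joined_row_fields, truncate_key(physical_key, join_key)))
```

The prefix check in `truncate_key` reflects the assumption that the join key is always a leading part of the physical key; if that did not hold, truncation alone would not be enough.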

tpoterba (Collaborator) commented Feb 22, 2019

I also think this would replicate with is_sorted=False.

tpoterba added a commit to tpoterba/hail that referenced this issue Mar 26, 2019
danking closed this in #5693 Apr 1, 2019
danking added a commit that referenced this issue Apr 1, 2019
* Fix TableMultiWayZipJoin key behavior

fixes #5396

* fix