(vague) sort out python and R integration story #21
Comments
We have a plan for both of these now.
danking pushed a commit to danking/hail that referenced this issue on Apr 20, 2018
# This is the 1st commit message: apply resettable context forgot to fix one use of AutoCloseable fix add setup iterator more sensible method ordering make TrivialContext Resettable a few more missing resettablecontexts address comments apply resettable context forgot to fix one use of AutoCloseable fix add setup iterator more sensible method ordering remove rogue element type type make TrivialContext Resettable wip wip wip wip use safe row in join suite pull over hailcontext remove Region.clear(newEnd) add selectRegionValue
# This is the commit message #2: convert relational.scala ;
# This is the commit message #3: scope the extract aggregators constfb call
# This is the commit message #4: scope interpret
# This is the commit message #5: typeAfterSelect used by selectRegionValue
# This is the commit message #6: load matrix
# This is the commit message #7: imports
# This is the commit message #8: loadbgen converted
# This is the commit message #9: convert loadplink
# This is the commit message #10: convert loadgdb
# This is the commit message #11: convert loadvcf
# This is the commit message #12: convert blockmatrix
# This is the commit message #13: convert filterintervals
# This is the commit message #14: convert ibd
# This is the commit message #15: convert a few methods
# This is the commit message #16: convert split multi
# This is the commit message #17: convert VEP
# This is the commit message #18: formatting fix
# This is the commit message #19: add partitionBy and values
# This is the commit message #20: fix bug in localkeysort
# This is the commit message #21: fixup HailContext.readRowsPartition use
# This is the commit message #22: port balding nichols model
danking added a commit that referenced this issue on May 4, 2018
* # This is a combination of 22 commits.
  # This is the 1st commit message: apply resettable context forgot to fix one use of AutoCloseable fix add setup iterator more sensible method ordering make TrivialContext Resettable a few more missing resettablecontexts address comments apply resettable context forgot to fix one use of AutoCloseable fix add setup iterator more sensible method ordering remove rogue element type type make TrivialContext Resettable wip wip wip wip use safe row in join suite pull over hailcontext remove Region.clear(newEnd) add selectRegionValue
  # This is the commit message #2: convert relational.scala ;
  # This is the commit message #3: scope the extract aggregators constfb call
  # This is the commit message #4: scope interpret
  # This is the commit message #5: typeAfterSelect used by selectRegionValue
  # This is the commit message #6: load matrix
  # This is the commit message #7: imports
  # This is the commit message #8: loadbgen converted
  # This is the commit message #9: convert loadplink
  # This is the commit message #10: convert loadgdb
  # This is the commit message #11: convert loadvcf
  # This is the commit message #12: convert blockmatrix
  # This is the commit message #13: convert filterintervals
  # This is the commit message #14: convert ibd
  # This is the commit message #15: convert a few methods
  # This is the commit message #16: convert split multi
  # This is the commit message #17: convert VEP
  # This is the commit message #18: formatting fix
  # This is the commit message #19: add partitionBy and values
  # This is the commit message #20: fix bug in localkeysort
  # This is the commit message #21: fixup HailContext.readRowsPartition use
  # This is the commit message #22: port balding nichols model
* apply resettable context forgot to fix one use of AutoCloseable fix add setup iterator more sensible method ordering make TrivialContext Resettable a few more missing resettablecontexts address comments apply resettable context forgot to fix one use of AutoCloseable fix add setup iterator more sensible method ordering remove rogue element type type make TrivialContext Resettable wip wip wip wip use safe row in join suite pull over hailcontext remove Region.clear(newEnd) add selectRegionValue convert relational.scala ; scope the extract aggregators constfb call scope interpret typeAfterSelect used by selectRegionValue load matrix imports loadbgen converted convert loadplink convert loadgdb convert loadvcf convert blockmatrix convert filterintervals convert ibd convert a few methods convert split multi convert VEP formatting fix add partitionBy and values fix bug in localkeysort fixup HailContext.readRowsPartition use port balding nichols model port over table.scala couple fixes convert matrix table remove necessary use of rdd variety of fixups wip add a clear
* Remove direct Region allocation from FilterColsIR
  When regions are off-heap, we can allow the globals to live in a separate, longer-lived Region that is not cleared until the whole partition is finished. For now, we pay the memory cost.
* Use RVDContext in MatrixRead zip
  This Region will get cleared by consumers. I introduced the zip primitive which is a safer way to zip two RVDs because it does not rely on the user correctly clearing the regions used by the left and right hand sides of the zip.
* Control the Regions in LoadGDB
  I do not fully understand how LoadGDB is working, but a simple solution to the use-case is to serialize to arrays of bytes and parallelize those. I realize there is a proliferation of `coerce` methods. I plan to trim this down once we do not have RDD and ContextRDD coexisting
* wip
* unify RVD.run
* reset in write
* fixes
* use context region when allocating
* also read RVDs using RVDContext
* formatting
* address comments
* remove unused val
* abstract over boundary
* little fixes
* whoops forgot to clear before persisting
  This fixes the LDPrune if you dont clear the region things go wrong. Not sure what causes that bug. Maybe its something about encoders?
* serialize for shuffles, region.scoped in matrixmapglobals, fix joins
* clear more!
* wip
* wip
* rework GeneralRDD to ease ContextRDD transition
* formatting
* final fixes
* formatting
* merge failures
* more bad merge stuff
* formatting
* remove unnecessary stuff
* remove fixme
* boom!
* variety of merge mistakes
* fix destabilize bug
* add missing newline
* remember to clear the producer region in localkeysort
* switch def to val
* cleanup filteralleles and exporbidbimfam
* fix clearing and serialization issue
* fix BitPackedVectorView
  Previously it always assumed the variant struct started at offset zero, which is not true
* address comments, remove a comment
* remove direct use of Region
* oops
* werrrks, mebbe
* needs cleanup
* fix filter intervals
* fixes
* fixes
* fix filterintervals
* remove unnecessary copy in TableJoin
* and finally fix the last test
* re-use existing CodecSpec definition
* remove unnecessary boundaries
* use RVD abstraction when possible
* formatting
* bugfix: RegionValue must know its region
* remove unnecessary val and comment
* remove unused methods
* eliminate unused constructors
* undo debug change
* formatting
* remove unused imports
* fix bug in tablejoin
* fix RichRDDSuite test
  If you have no data, then you have no partitions, not 1 partition
konradjk pushed a commit to konradjk/hail that referenced this issue on Jun 12, 2018
(Commit message identical to that of the May 4, 2018 commit above.)
jackgoldsmith4 pushed a commit to jackgoldsmith4/hail that referenced this issue on Jun 25, 2018
(Commit message identical to that of the May 4, 2018 commit above.)
tpoterba referenced this issue in tpoterba/hail on Feb 12, 2019
Refactor, package scripts for inclusion in Python Package Index
daniel-goldstein referenced this issue in daniel-goldstein/hail on Feb 3, 2022
Run GH workflow (to build conda package) on tag pushes only
danking pushed a commit to danking/hail that referenced this issue on Oct 11, 2023
Consider this:

```scala
class Foo {
  def bar(): (Long, Long) = (3, 4)

  def destructure(): Unit = {
    val (x, y) = bar()
  }

  def accessors(): Unit = {
    val zz = bar()
    val x = zz._1
    val y = zz._2
  }
}
```

These should be exactly equivalent, right? There's no way Scala would compile the match into something horrible. Right? Right?

```
public void destructure();
  Code:
     0: aload_0
     1: invokevirtual #27 // Method bar:()Lscala/Tuple2;
     4: astore_3
     5: aload_3
     6: ifnull 35
     9: aload_3
    10: invokevirtual #33 // Method scala/Tuple2._1$mcJ$sp:()J
    13: lstore 4
    15: aload_3
    16: invokevirtual #36 // Method scala/Tuple2._2$mcJ$sp:()J
    19: lstore 6
    21: new #13 // class scala/Tuple2$mcJJ$sp
    24: dup
    25: lload 4
    27: lload 6
    29: invokespecial #21 // Method scala/Tuple2$mcJJ$sp."<init>":(JJ)V
    32: goto 47
    35: goto 38
    38: new #38 // class scala/MatchError
    41: dup
    42: aload_3
    43: invokespecial #41 // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
    46: athrow
    47: astore_2
    48: aload_2
    49: invokevirtual #33 // Method scala/Tuple2._1$mcJ$sp:()J
    52: lstore 8
    54: aload_2
    55: invokevirtual #36 // Method scala/Tuple2._2$mcJ$sp:()J
    58: lstore 10
    60: return

public void accessors();
  Code:
     0: aload_0
     1: invokevirtual #27 // Method bar:()Lscala/Tuple2;
     4: astore_1
     5: aload_1
     6: invokevirtual #33 // Method scala/Tuple2._1$mcJ$sp:()J
     9: lstore_2
    10: aload_1
    11: invokevirtual #36 // Method scala/Tuple2._2$mcJ$sp:()J
    14: lstore 4
    16: return
```

Yeah, so, it extracts the first and second elements of the primitive-specialized tuple, constructs a `(java.lang.Long, java.lang.Long)` Tuple, then does the match on that. sigh.
danking added a commit that referenced this issue on Oct 17, 2023
…13794)

Consider this:

```scala
class Foo {
  def bar(): (Long, Long) = (3, 4)

  def destructure(): Unit = {
    val (x, y) = bar()
  }

  def accessors(): Unit = {
    val zz = bar()
    val x = zz._1
    val y = zz._2
  }
}
```

![image](https://github.com/hail-is/hail/assets/106194/532dc7ea-8027-461d-8e12-3217f5451713)

These should be exactly equivalent, right? There's no way Scala would compile the match into something horrible. Right? Right?

```
public void destructure();
  Code:
     0: aload_0
     1: invokevirtual #27 // Method bar:()Lscala/Tuple2;
     4: astore_3
     5: aload_3
     6: ifnull 35
     9: aload_3
    10: invokevirtual #33 // Method scala/Tuple2._1$mcJ$sp:()J
    13: lstore 4
    15: aload_3
    16: invokevirtual #36 // Method scala/Tuple2._2$mcJ$sp:()J
    19: lstore 6
    21: new #13 // class scala/Tuple2$mcJJ$sp
    24: dup
    25: lload 4
    27: lload 6
    29: invokespecial #21 // Method scala/Tuple2$mcJJ$sp."<init>":(JJ)V
    32: goto 47
    35: goto 38
    38: new #38 // class scala/MatchError
    41: dup
    42: aload_3
    43: invokespecial #41 // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
    46: athrow
    47: astore_2
    48: aload_2
    49: invokevirtual #33 // Method scala/Tuple2._1$mcJ$sp:()J
    52: lstore 8
    54: aload_2
    55: invokevirtual #36 // Method scala/Tuple2._2$mcJ$sp:()J
    58: lstore 10
    60: return

public void accessors();
  Code:
     0: aload_0
     1: invokevirtual #27 // Method bar:()Lscala/Tuple2;
     4: astore_1
     5: aload_1
     6: invokevirtual #33 // Method scala/Tuple2._1$mcJ$sp:()J
     9: lstore_2
    10: aload_1
    11: invokevirtual #36 // Method scala/Tuple2._2$mcJ$sp:()J
    14: lstore 4
    16: return
```

Yeah, so, it extracts the first and second elements of the primitive-specialized tuple, ~~constructs a `(java.lang.Long, java.lang.Long)` Tuple~~ constructs another primitive-specialized tuple (for no reason???), then does the match on that. sigh.
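For anyone who wants to reproduce the comparison, here is a minimal sketch that makes the two forms return a value so they can be checked against each other. The names `TupleAccess`, `viaDestructure`, and `viaAccessors` are made up for illustration and are not part of Hail; whether the extra tuple allocation shows up may well depend on the compiler version, so treat the bytecode above as the observation for this commit only.

```scala
// Illustrative sketch only: TupleAccess, viaDestructure and viaAccessors
// are hypothetical names, not Hail code.
object TupleAccess {
  def bar(): (Long, Long) = (3L, 4L)

  // Pattern-binding form. In the bytecode quoted above, this shape produced
  // a null check, a second tuple allocation, and a MatchError branch.
  def viaDestructure(): Long = {
    val (x, y) = bar()
    x + y
  }

  // Accessor form. In the bytecode quoted above, this shape produced only
  // the two specialized getter calls.
  def viaAccessors(): Long = {
    val t = bar()
    t._1 + t._2
  }

  def main(args: Array[String]): Unit = {
    // Both forms compute the same result; only the generated code differs.
    assert(viaDestructure() == viaAccessors())
    println(viaDestructure()) // prints 7
  }
}
```

Compiling this with `scalac` and running `javap -c` on the generated class files should show whether your compiler emits the same extra allocation for the pattern-binding form.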
From @cseed on August 26, 2015 16:4
The goal is to let analysts use our methods and data representations coded in Spark within python and R.
Tachyon might be part of the story:
http://tachyon-project.org/
Copied from original issue: cseed/hail#20