import VCF header #17

Closed
cseed opened this issue Oct 29, 2015 · 2 comments

Comments

cseed commented Oct 29, 2015

From @cseed on August 26, 2015 14:43

Store entire header. Parse contig at least. Expand sampleIds: Array[String] to store metadata.

Copied from original issue: cseed/hail#16
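
As a rough illustration of the request above (keep the whole header, and parse at least the `##contig` lines), here is a minimal Python sketch. It is not Hail's implementation: the function name `parse_vcf_header` and the returned dictionary layout are hypothetical, and the parser assumes contig attributes are simple `key=value` pairs with no quoted commas.

```python
import re

# Matches structured meta lines such as ##contig=<ID=1,length=249250621>
_CONTIG_RE = re.compile(r"^##contig=<(.*)>$")


def parse_vcf_header(lines):
    """Return the raw header, parsed contigs, and sample IDs.

    Hypothetical helper for illustration only. It assumes contig
    attributes are plain key=value pairs with no quoted commas.
    """
    header_lines = []  # the entire header, stored verbatim
    contigs = []       # one dict per ##contig line
    sample_ids = []    # column names after FORMAT on the #CHROM line

    for raw in lines:
        line = raw.rstrip("\n")
        if not line.startswith("#"):
            break  # first record line: the header is over
        header_lines.append(line)

        match = _CONTIG_RE.match(line)
        if match:
            fields = dict(
                pair.split("=", 1) for pair in match.group(1).split(",")
            )
            if "length" in fields:
                fields["length"] = int(fields["length"])
            contigs.append(fields)
        elif line.startswith("#CHROM"):
            columns = line.split("\t")
            # The first 9 columns are fixed (CHROM through FORMAT).
            sample_ids = columns[9:]

    return {"header": header_lines, "contigs": contigs, "samples": sample_ids}
```

Returning the samples as a separate list leaves room to attach per-sample metadata later, which is the direction the issue suggests for `sampleIds`.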

cseed commented Oct 29, 2015

cseed self-assigned this Oct 29, 2015
cseed added the feature label Oct 29, 2015
cseed removed their assignment Nov 19, 2015

cseed commented Feb 3, 2016

Now irrelevant.

cseed closed this as completed Feb 3, 2016
cseed pushed a commit to cseed/hail that referenced this issue Feb 6, 2018
danking added a commit to danking/hail that referenced this issue Apr 20, 2018
# This is the 1st commit message:

apply resettable context

forgot to fix one use of AutoCloseable

fix

add setup iterator

more sensible method ordering

make TrivialContext Resettable

a few more missing resettablecontexts

address comments

apply resettable context

forgot to fix one use of AutoCloseable

fix

add setup iterator

more sensible method ordering

remove rogue element type type

make TrivialContext Resettable

wip

wip

wip

wip

use safe row in join suite

pull over hailcontext

remove Region.clear(newEnd)

add selectRegionValue

# This is the commit message #2:

convert relational.scala
;

# This is the commit message #3:

scope the extract aggregators constfb call

# This is the commit message #4:

scope interpret

# This is the commit message #5:

typeAfterSelect used by selectRegionValue

# This is the commit message #6:

load matrix

# This is the commit message #7:

imports

# This is the commit message #8:

loadbgen converted

# This is the commit message #9:

convert loadplink

# This is the commit message #10:

convert loadgdb

# This is the commit message #11:

convert loadvcf

# This is the commit message #12:

convert blockmatrix

# This is the commit message #13:

convert filterintervals

# This is the commit message hail-is#14:

convert ibd

# This is the commit message hail-is#15:

convert a few methods

# This is the commit message hail-is#16:

convert split multi

# This is the commit message hail-is#17:

convert VEP

# This is the commit message hail-is#18:

formatting fix

# This is the commit message hail-is#19:

add partitionBy and values

# This is the commit message hail-is#20:

fix bug in localkeysort

# This is the commit message hail-is#21:

fixup HailContext.readRowsPartition use

# This is the commit message hail-is#22:

port balding nichols model
danking added a commit that referenced this issue May 4, 2018
* apply resettable context

forgot to fix one use of AutoCloseable

fix

add setup iterator

more sensible method ordering

make TrivialContext Resettable

a few more missing resettablecontexts

address comments

apply resettable context

forgot to fix one use of AutoCloseable

fix

add setup iterator

more sensible method ordering

remove rogue element type type

make TrivialContext Resettable

wip

wip

wip

wip

use safe row in join suite

pull over hailcontext

remove Region.clear(newEnd)

add selectRegionValue

convert relational.scala
;

scope the extract aggregators constfb call

scope interpret

typeAfterSelect used by selectRegionValue

load matrix

imports

loadbgen converted

convert loadplink

convert loadgdb

convert loadvcf

convert blockmatrix

convert filterintervals

convert ibd

convert a few methods

convert split multi

convert VEP

formatting fix

add partitionBy and values

fix bug in localkeysort

fixup HailContext.readRowsPartition use

port balding nichols model

port over table.scala

couple fixes

convert matrix table

remove necessary use of rdd

variety of fixups

wip

add a clear

* Remove direct Region allocation from FilterColsIR

When regions are off-heap, we can allow the globals to live
in a separate, longer-lived Region that is not cleared until
the whole partition is finished. For now, we pay the
memory cost.

* Use RVDContext in MatrixRead zip

This Region will get cleared by consumers.

I introduced the zip primitive which is a safer way to
zip two RVDs because it does not rely on the user correctly
clearing the regions used by the left and right hand sides
of the zip.

* Control the Regions in LoadGDB

I do not fully understand how LoadGDB is working, but a simple
solution to the use-case is to serialize to arrays of bytes
and parallelize those.

I realize there is a proliferation of `coerce` methods. I plan
to trim this down once RDD and ContextRDD no longer coexist.

* wip

* unify RVD.run

* reset in write

* fixes

* use context region when allocating

* also read RVDs using RVDContext

* formatting

* address comments

* remove unused val

* abstract over boundary

* little fixes

* whoops forgot to clear before persisting

This fixes LDPrune: if you don't clear the region, things go wrong.
Not sure what causes that bug. Maybe it's something about encoders?

* serialize for shuffles, region.scoped in matrixmapglobals, fix joins

* clear more!

* wip

* wip

* rework GeneralRDD to ease ContextRDD transition

* formatting

* final fixes

* formatting

* merge failures

* more bad merge stuff

* formatting

* remove unnecessary stuff

* remove fixme

* boom!

* variety of merge mistakes

* fix destabilize bug

* add missing newline

* remember to clear the producer region in localkeysort

* switch def to val

* cleanup filteralleles and exporbidbimfam

* fix clearing and serialization issue

* fix BitPackedVectorView

Previously it always assumed the variant struct started at offset
zero, which is not true

* address comments, remove a comment

* remove direct use of Region

* oops

* werrrks, mebbe

* needs cleanup

* fix filter intervals

* fixes

* fixes

* fix filterintervals

* remove unnecessary copy in TableJoin

* and finally fix the last test

* re-use existing CodecSpec definition

* remove unnecessary boundaries

* use RVD abstraction when possible

* formatting

* bugfix: RegionValue must know its region

* remove unnecessary val and comment

* remove unused methods

* eliminate unused constructors

* undo debug change

* formatting

* remove unused imports

* fix bug in tablejoin

* fix RichRDDSuite test

If you have no data, then you have no partitions, not 1 partition
konradjk pushed a commit to konradjk/hail that referenced this issue Jun 12, 2018
jackgoldsmith4 pushed a commit to jackgoldsmith4/hail that referenced this issue Jun 25, 2018
cseed pushed a commit to cseed/hail that referenced this issue Sep 22, 2018
Removed headers on tables in the user page
danking pushed a commit that referenced this issue Sep 24, 2018
* initial commit

* changes to run on cluster

* support multiple repos

put token in Kubernetes secrets

* fixed deployment

* fix ci fq repo name

* updated Makefile

use delete/create for redeploy until we stop using latest image

* added custom user pages

* fixed indentation

* fix treating pulls as issues

* poll Github

* fix import

* fix changes requested reporting

* Small improvements

 - declare language="en" so I don't get translate notifications ?????
 - use defaultdict for great profit
 - use daemon thread so ctrl-C actually kills the server

* fix

* Small improvement 2

* add logging, restart on poll thread

* fix users head/header

* autoformat htmls (#12)

* BANISH THE SERIFS WHENCE THEY CAME (#13)

* add author column to "needs review" table (#14)

* Add a list of failing builds to the user page (#16)

* Slightly better formatting. (#17)

Removed headers on tables in the user page

* import sys so the retry stuff works (#18)

* assertion is going off, log instead (#19)

* reverse reviews (come in chronological order) (#20)

We want the most recent.  From https://developer.github.com/v3/pulls/reviews/:

 > The list of reviews returns in chronological order.

scorecard was categorizing #4328 incorrectly.

* move to project dir to merge into monorepo

* updated .gitignore

* backed off spacing

* make targets phony
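
The "reverse reviews" entry above relies on the GitHub reviews list arriving in chronological order, so the most recent review from each reviewer is the last one seen. A minimal Python sketch of that idea follows. It is not the scorecard code: the function name `latest_review_states` and the sample data are hypothetical, and only the `user.login` and `state` fields of GitHub's review objects are used.

```python
def latest_review_states(reviews):
    """Map each reviewer login to the state of their most recent review.

    Assumes `reviews` is in chronological order (oldest first), so a
    later entry for the same reviewer overwrites the earlier one.
    """
    latest = {}
    for review in reviews:
        latest[review["user"]["login"]] = review["state"]
    return latest


# Hypothetical sample data: alice first requests changes, then approves.
reviews = [
    {"user": {"login": "alice"}, "state": "CHANGES_REQUESTED"},
    {"user": {"login": "bob"}, "state": "COMMENTED"},
    {"user": {"login": "alice"}, "state": "APPROVED"},
]
print(latest_review_states(reviews))
```

Iterating oldest-first and overwriting is equivalent to reversing the list and keeping the first review seen per reviewer, which is the approach the commit title describes.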
cseed pushed a commit to cseed/hail that referenced this issue May 1, 2019
Exposed --metadata and fixed problem with creating directory
danking added a commit to danking/hail that referenced this issue Jul 8, 2019
Add annotation_db.json to hail-datasets
daniel-goldstein referenced this issue in daniel-goldstein/hail Feb 3, 2022
Conda pkg: prepare meta.yaml in a separate ubuntu job