Build info/version info inside ADAM-generated files #188

Closed
nealsid opened this issue Mar 24, 2014 · 8 comments

Comments

nealsid (Contributor) commented Mar 24, 2014

We should build off of Sebastian's work in #138 to output ADAM version info inside files generated by ADAM, so that we can version files containing ADAMRecords, ADAMNucleotideFragments, ADAMVariants, etc.

nealsid (Contributor, Author) commented Mar 25, 2014

Also, I may have missed some previous discussions on how we do this, but I recently converted hg19 to a Parquet file of ADAMNucleotideContigFragments. It seems there's no way to recover the reference version information, or am I missing something? The Avro record's contig fields don't store this. Can we shove it in the Parquet metadata somewhere?

tdanford (Contributor) commented Jul 24, 2014

Calling out @massie here (when you get back from vacation, Matt) -- he's had some thoughts on embedding information into the Parquet metadata.

fnothaft (Member) commented Sep 20, 2014

Ping @massie

massie (Member) commented Sep 22, 2014

Once we upgrade to Parquet 1.6.0, we'll be able to read/write arbitrary metadata much more easily. We can easily drop the version info (introduced in #138) into the metadata to help with debugging.

The upgrade to 1.6.0 is going well but three tests are failing because of issues with predicates (UnboundRecordFilter).
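
To illustrate the idea discussed here (this is not code from the thread or from ADAM), a Parquet footer can carry arbitrary string key/value pairs, so version and reference information would first need to be flattened into that shape. The sketch below uses only the Python standard library and stops short of writing a file. The key prefix `adam.provenance.` and the function names are hypothetical, chosen for illustration only.

```python
from typing import Dict, Mapping, Optional

# Hypothetical key prefix; the thread does not specify any key names.
KEY_PREFIX = "adam.provenance."


def provenance_metadata(version: str, commit: str,
                        reference: Optional[str] = None) -> Dict[str, str]:
    """Flatten build info into string key/value pairs, the shape a Parquet footer stores."""
    meta = {
        KEY_PREFIX + "version": version,
        KEY_PREFIX + "commit": commit,
    }
    if reference is not None:
        # For example, the name of the reference build the file was derived from.
        meta[KEY_PREFIX + "reference"] = reference
    return meta


def merge_metadata(existing: Mapping[str, str],
                   extra: Mapping[str, str]) -> Dict[str, str]:
    """Add extra entries without overwriting keys that are already present."""
    merged = dict(extra)
    merged.update(existing)  # existing entries win on key collisions
    return merged


def read_provenance(metadata: Mapping[str, str]) -> Dict[str, str]:
    """Recover the provenance fields from footer metadata, dropping the prefix."""
    return {
        key[len(KEY_PREFIX):]: value
        for key, value in metadata.items()
        if key.startswith(KEY_PREFIX)
    }
```

A writer would merge the result of `provenance_metadata` into whatever key/value metadata it already attaches to the file, and a reader would call `read_provenance` on the footer metadata to find out which build and reference produced the file.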

heuermh (Member) commented Oct 7, 2015

Is this worth another look? Parquet dependency is now at version 1.8.x.

fnothaft (Member) commented Jul 6, 2016

Perhaps we can write this with our various metadata?

fnothaft (Member) commented Mar 3, 2017

We should resolve this as part of #1257.

heuermh added this to Triage in Release 1.0.0 on Mar 8, 2017
fnothaft added the duplicate label on May 12, 2017

fnothaft (Member) commented May 12, 2017

This will be resolved as part of #1257. Closing as dupe.

fnothaft closed this on May 12, 2017
heuermh modified the milestones: 1.0.0, 0.23.0 on Dec 7, 2017
heuermh added this to Completed in Release 0.23.0 on Jan 4, 2018
heuermh moved this from Triage to Completed in Release 1.0.0 on Jan 4, 2018