Adding printAttribute methods for alignment records, features, and samples. #1982

Merged
merged 1 commit into from Jul 4, 2018

4 participants

heuermh commented Apr 12, 2018

Follow-on to #1958.

coveralls commented Apr 12, 2018

Coverage Status

Coverage decreased (-0.4%) to 78.649% when pulling 084c5e0 on heuermh:moar-print-methods into 75b51e7 on bigdatagenomics:master.

AmplabJenkins commented Apr 13, 2018

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2750/

heuermh commented Apr 18, 2018

@fnothaft Milestone as 0.24.1?

fnothaft left a comment

Out of curiosity, what's the benefit of this relative to using Dataset.show?

heuermh commented Apr 20, 2018

What's the benefit of this relative to using Dataset.show?

It is opinionated about which columns to show and in what order, and it creates a separate column for each requested attribute.

scala> val features = sc.loadFeatures("adam-core/src/test/resources/dvl1.200.gff3")
features: org.bdgenomics.adam.rdd.feature.FeatureRDD = RDDBoundFeatureRDD with 0 reference sequences

scala> printFeatureAttributes(features, Seq("biotype"))

Feature Attributes
+-------------+---------+---------+---------+--------------------+--------------------+------+-------+----------------+
| Contig Name |  Start  |   End   | Strand  |        Name        |     Identifier     | Type | Score |    biotype     |
+-------------+---------+---------+---------+--------------------+--------------------+------+-------+----------------+
|           1 | 1331313 | 1335306 | FORWARD |    ENSG00000169962 |    ENSG00000169962 | gene |       | protein_coding |
|           1 | 1335275 | 1349350 | REVERSE |    ENSG00000107404 |    ENSG00000107404 | gene |       | protein_coding |
|           1 | 1339649 | 1339708 | REVERSE |    ENSG00000275884 |    ENSG00000275884 | gene |       |          miRNA |
|           1 | 1352688 | 1361777 | REVERSE |    ENSG00000162576 |    ENSG00000162576 | gene |       | protein_coding |
|           1 | 1331313 | 1335306 | FORWARD | OTTHUMG00000003071 | OTTHUMG00000003071 | gene |       | protein_coding |
|           1 | 1335275 | 1349350 | REVERSE | OTTHUMG00000003069 | OTTHUMG00000003069 | gene |       | protein_coding |
|           1 | 1352688 | 1361777 | REVERSE | OTTHUMG00000002973 | OTTHUMG00000002973 | gene |       | protein_coding |
|           1 | 1328998 | 1335320 | FORWARD |              83756 |              83756 | gene |       | protein_coding |
|           1 | 1328998 | 1335320 | FORWARD |              83756 |              83756 | gene |       | protein_coding |
|           1 | 1331345 | 1334464 | FORWARD |        CCDS30556.1 |        CCDS30556.1 | gene |       | protein_coding |
+-------------+---------+---------+---------+--------------------+--------------------+------+-------+----------------+
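
The layout above (a fixed set of columns in a fixed order, plus one column per requested attribute) can be approximated in plain Scala. The following is only a minimal sketch and not the code from this pull request: the `AttributeTable` object, the `Row` case class, and the `render` function are hypothetical names, the record is trimmed to three fixed fields, and every cell is right-aligned for simplicity.

```scala
object AttributeTable {
  // Hypothetical, minimal record: a few fixed fields plus a free-form attribute map.
  case class Row(contigName: String, start: Long, end: Long, attributes: Map[String, String])

  // Render the fixed columns first, then one extra column per requested attribute key.
  def render(rows: Seq[Row], keys: Seq[String]): String = {
    val header = Seq("Contig Name", "Start", "End") ++ keys
    val body = rows.map { r =>
      Seq(r.contigName, r.start.toString, r.end.toString) ++
        keys.map(k => r.attributes.getOrElse(k, "")) // a missing attribute becomes an empty cell
    }
    // Each column is as wide as its longest cell, header included.
    val widths = header.indices.map(i => (header(i) +: body.map(_(i))).map(_.length).max)
    val rule = widths.map(w => "-" * (w + 2)).mkString("+", "+", "+")
    def line(cells: Seq[String]): String =
      cells.zip(widths)
        .map { case (c, w) => " " + (" " * (w - c.length)) + c + " " }
        .mkString("|", "|", "|")
    (Seq(rule, line(header), rule) ++ body.map(line) ++ Seq(rule)).mkString("\n")
  }
}
```

Calling `println(AttributeTable.render(rows, Seq("biotype")))` on a small sequence of rows prints a bordered table with a trailing `biotype` column.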


scala> features.dataset.show
2018-04-20 11:11:11 WARN  ObjectStore:568 - Failed to get database global_temp, returning NoSuchObjectException
+------------------+------------------+-------+-----------+----------+-------+-------+-------+-----+-----+-----+------+------------+------+-------+---------+------+----+-----------+-----+-------+-------------+--------+--------------------+
|         featureId|              name| source|featureType|contigName|  start|    end| strand|phase|frame|score|geneId|transcriptId|exonId|aliases|parentIds|target| gap|derivesFrom|notes|dbxrefs|ontologyTerms|circular|          attributes|
+------------------+------------------+-------+-----------+----------+-------+-------+-------+-----+-----+-----+------+------------+------+-------+---------+------+----+-----------+-----+-------+-------------+--------+--------------------+
|   ENSG00000169962|   ENSG00000169962|Ensembl|       gene|         1|1331313|1335306|FORWARD| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|   ENSG00000107404|   ENSG00000107404|Ensembl|       gene|         1|1335275|1349350|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|   ENSG00000275884|   ENSG00000275884|Ensembl|       gene|         1|1339649|1339708|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|  [biotype -> miRNA]|
|   ENSG00000162576|   ENSG00000162576|Ensembl|       gene|         1|1352688|1361777|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|OTTHUMG00000003071|OTTHUMG00000003071|   Vega|       gene|         1|1331313|1335306|FORWARD| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|OTTHUMG00000003069|OTTHUMG00000003069|   Vega|       gene|         1|1335275|1349350|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|OTTHUMG00000002973|OTTHUMG00000002973|   Vega|       gene|         1|1352688|1361777|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|             83756|             83756|Ensembl|       gene|         1|1328998|1335320|FORWARD| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|             83756|             83756|Ensembl|       gene|         1|1328998|1335320|FORWARD| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|       CCDS30556.1|       CCDS30556.1|Ensembl|       gene|         1|1331345|1334464|FORWARD| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|              1855|              1855|Ensembl|       gene|         1|1335277|1349158|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|              1855|              1855|Ensembl|       gene|         1|1335277|1349158|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|          CCDS22.1|          CCDS22.1|Ensembl|       gene|         1|1336141|1349065|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|         102466740|         102466740|Ensembl|       gene|         1|1339649|1339708|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> misc_...|
|         102466740|         102466740|Ensembl|       gene|         1|1339649|1339708|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|  [biotype -> miRNA]|
|ENSESTG00000024621|ENSESTG00000024621|Ensembl|       gene|         1|1340042|1349108|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|             54587|             54587|Ensembl|       gene|         1|1352688|1363541|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|             54587|             54587|Ensembl|       gene|         1|1352688|1363541|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|       CCDS59951.1|       CCDS59951.1|Ensembl|       gene|         1|1353300|1358504|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
|       CCDS59950.1|       CCDS59950.1|Ensembl|       gene|         1|1353603|1358504|REVERSE| null| null| null|  null|        null|  null|     []|       []|  null|null|       null|   []|     []|           []|    null|[biotype -> prote...|
+------------------+------------------+-------+-----------+----------+-------+-------+-------+-----+-----+-----+------+------------+------+-------+---------+------+----+-----------+-----+-------+-------------+--------+--------------------+
only showing top 20 rows

That said, I don't know much about the Dataset.show API; is it extensible or customizable?
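
As a partial answer to the question above, and only as a sketch: `Dataset.show` prints whatever columns the Dataset has, so a `select` beforehand can fix the column set and order, and `Column.getItem` can pull individual keys out of a map column into their own columns. The example below assumes Spark SQL is on the classpath and uses a hypothetical, trimmed-down `FeatureRow` case class in place of ADAM's real feature schema; `withAttributeColumns` is likewise an illustrative name.

```scala
import org.apache.spark.sql.{ DataFrame, Dataset, SparkSession }
import org.apache.spark.sql.functions.col

object ShowWithAttributeColumns {
  // Hypothetical stand-in for the much wider feature schema shown above.
  case class FeatureRow(
    contigName: String,
    start: Long,
    end: Long,
    strand: String,
    featureId: String,
    attributes: Map[String, String])

  // Keep a fixed set of columns in a fixed order, then add one column per
  // requested attribute key, looked up in the attributes map column.
  def withAttributeColumns(ds: Dataset[FeatureRow], keys: Seq[String]): DataFrame = {
    val fixed = Seq("contigName", "start", "end", "strand", "featureId").map(col)
    val extra = keys.map(k => col("attributes").getItem(k).as(k))
    ds.select(fixed ++ extra: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("show-sketch")
      .getOrCreate()
    import spark.implicits._

    // Two rows copied from the output above.
    val ds = Seq(
      FeatureRow("1", 1331313L, 1335306L, "FORWARD", "ENSG00000169962", Map("biotype" -> "protein_coding")),
      FeatureRow("1", 1339649L, 1339708L, "REVERSE", "ENSG00000275884", Map("biotype" -> "miRNA"))
    ).toDS()

    // Second argument false: print full cell values instead of truncating long ones.
    withAttributeColumns(ds, Seq("biotype")).show(10, false)

    spark.stop()
  }
}
```

This controls which columns appear and in what order, but the table borders, alignment, and header text remain whatever `show` produces.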

AmplabJenkins commented Apr 20, 2018

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2758/

Build result: FAILURE

[...truncated 7 lines...]
 > /home/jenkins/git2/bin/git init /home/jenkins/workspace/ADAM-prb # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git --version # timeout=10
 > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/heads/:refs/remotes/origin/ # timeout=15
 > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
 > /home/jenkins/git2/bin/git config --add remote.origin.fetch +refs/heads/:refs/remotes/origin/ # timeout=10
 > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15
 > /home/jenkins/git2/bin/git rev-parse origin/pr/1982/merge^{commit} # timeout=10
 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains cceb31c # timeout=10
Checking out Revision cceb31c (origin/pr/1982/merge)
 > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10
 > /home/jenkins/git2/bin/git checkout -f cceb31c87b46c61c71b8fcb131de46a428f945fe
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.2,2.10,2.2.1,centos
Triggering ADAM-prb ? 2.6.2,2.11,2.2.1,centos
Triggering ADAM-prb ? 2.7.3,2.10,2.2.1,centos
Triggering ADAM-prb ? 2.7.3,2.11,2.2.1,centos
ADAM-prb ? 2.6.2,2.10,2.2.1,centos completed with result FAILURE
ADAM-prb ? 2.6.2,2.11,2.2.1,centos completed with result SUCCESS
ADAM-prb ? 2.7.3,2.10,2.2.1,centos completed with result SUCCESS
ADAM-prb ? 2.7.3,2.11,2.2.1,centos completed with result SUCCESS
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'

heuermh force-pushed the heuermh:moar-print-methods branch from 37a57b7 to 084c5e0 on Apr 21, 2018

AmplabJenkins commented Apr 21, 2018

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2759/

heuermh added this to the 0.24.1 milestone on Jun 6, 2018

fnothaft merged commit bf5d033 into bigdatagenomics:master on Jul 4, 2018

2 checks passed

Codacy/PR Quality Review Up to standards. A positive pull request.
default Merged build finished.

fnothaft commented Jul 4, 2018

Merged! Thanks @heuermh!

heuermh deleted the heuermh:moar-print-methods branch on Jul 4, 2018