Skip to content

Add as_dataframe to EffectCollection#150

Merged
iskandr merged 3 commits intomasterfrom
effects-df
May 25, 2016
Merged

Add as_dataframe to EffectCollection#150
iskandr merged 3 commits intomasterfrom
effects-df

Conversation

@arahuja
Copy link
Copy Markdown
Contributor

@arahuja arahuja commented May 12, 2016

This adds an as_dataframe to EffectCollection - happy to add or rename any columns.

Also, a few helper properties to Variant that I have been using:

  • is_snv
  • is_transition
  • is_transversion

This change is Reviewable

@iskandr
Copy link
Copy Markdown
Contributor

iskandr commented May 12, 2016

Very slight nitpick: what do you think of to_dataframe instead of as_dataframe? It feels slightly more "pythonic" and seems to be the preferred method name for other projects: BigQuery, django-pandas, xarray.

@arahuja
Copy link
Copy Markdown
Contributor Author

arahuja commented May 12, 2016

Ah yea, I like to_dataframe as well, but wanted to match the issue #128

@tavinathanson
Copy link
Copy Markdown
Contributor

Issue #128?

@arahuja
Copy link
Copy Markdown
Contributor Author

arahuja commented May 12, 2016

Yes, updated, I meant #128

Comment thread varcode/effect_collection.py Outdated
row['is_indel'] = effect.variant.is_indel
row['is_transversion'] = effect.variant.is_transversion
row['is_transition'] = effect.variant.is_transition
row['is_transversion'] = effect.variant.is_transversion
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Twice?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woops, fixed

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.3%) to 87.065% when pulling 922901d on effects-df into d46b2c5 on master.

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.3%) to 87.065% when pulling 922901d on effects-df into d46b2c5 on master.

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.2%) to 86.989% when pulling 922901d on effects-df into d46b2c5 on master.

@coveralls
Copy link
Copy Markdown

coveralls commented May 12, 2016

Coverage Status

Coverage increased (+0.3%) to 87.065% when pulling 922901d on effects-df into d46b2c5 on master.

@arahuja
Copy link
Copy Markdown
Contributor Author

arahuja commented May 13, 2016

@iskandr seem to be running into the problem you have on pyensembl, where the py 2.7 tests on Travis on failing with The command "nosetests test --with-coverage --cover-package=varcode && ./lint.sh" exited with 137.

@iskandr
Copy link
Copy Markdown
Contributor

iskandr commented May 13, 2016

@arahuja Should be fixed now but have to wait for Travis to run. The problem was that I was decoding byte strings into ASCII, which does the sane thing under Python 3 but under Python 2 causes us to leave the byte-per-character str type to a fatter unicode representation which uses 2 (or 4!) times as much space per character. The dictionaries of transcript sequences are about ~200-300mb per release, so with 4x larger characters it's easy to imagine how we blow through Travis's 3gb limit.

@arahuja
Copy link
Copy Markdown
Contributor Author

arahuja commented May 17, 2016

@iskandr Looks to be passing now

@arahuja
Copy link
Copy Markdown
Contributor Author

arahuja commented May 25, 2016

@iskandr Good to merge?

@iskandr
Copy link
Copy Markdown
Contributor

iskandr commented May 25, 2016

@arahuja Definitely

Comment thread varcode/effect_collection.py Outdated
def row_from_effect(effect):
row = {}
row['chr'] = effect.variant.contig
row['position'] = effect.variant.start
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might make sense to be clear here about the coordinate system, e.g. call it inclusive_start. Just a suggestion, feel free to ignore

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, I'll make it match the varcode field names

@timodonnell
Copy link
Copy Markdown
Contributor

these are the column names i've been using. Not necessary to match them but just fwiw https://github.com/hammerlab/varlens/blob/master/varlens/variants_util.py#L26

@coveralls
Copy link
Copy Markdown

coveralls commented May 25, 2016

Coverage Status

Coverage increased (+0.2%) to 86.989% when pulling 1421b59 on effects-df into d46b2c5 on master.

@iskandr iskandr merged commit 6803355 into master May 25, 2016
@iskandr iskandr deleted the effects-df branch May 25, 2016 20:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants