ENH: add Alignment.heatmap method #816

Closed
jairideout wants to merge 25 commits into biocore:master from jairideout:aln-heatmap

jairideout
Contributor

This pull request includes @Kleptobismol's commits in #779. I added a few commits with documentation updates, a few bug fixes, some refactoring, and more unit tests.

Fixes #765.

@gregcaporaso @ebolyen can you please review? @gregcaporaso I'll follow up with a comment on a specific part of the code that I'd like your feedback on.

kestrelgorlick and others added 18 commits November 24, 2014 17:10
Also fig_size defaults to None, and so does cmap.
The docstring entries about the X and Y axis labels are likely in the wrong place.
…additions

Took a pass through the code and unit tests, cleaning up the implementation a
bit and improving error messages. Also refactored some of the code for easier
testing of individual pieces, which uncovered a few bugs:

- every value in `value_map` was being used to compute min/median/max for the
  legend, and default values (if `value_map` was a defaultdict) were ignored
- plotting an empty alignment resulted in a cryptic error
- plotting an alignment with a single sequence produced the wrong y-axis labels

Fixes scikit-bio#765.
``KeyErrors`` are not caught, so all possible values should be in
`value_map`, or it should be a ``collections.defaultdict`` which
can, for example, default to ``nan``.
legend_labels : iterable, optional
Contributor Author

@gregcaporaso does the behavior described for legend_labels make sense?

In your original cookbook implementation, all of the values in value_map were used to compute the minimum, median, and maximum. This could produce strange behavior if a value in value_map wasn't used in the heatmap. For example, if value_map had an unused mapping to a maximum value, the "maximum" label wouldn't show up in the legend because a different (smaller) maximum value was used in the heatmap/colorbar.

Also, if a defaultdict was supplied, mappings using the default value weren't being considered when computing min/median/max. For example, if the defaultdict had a default value that happened to be the true maximum, the "maximum" label would be placed in the wrong spot.
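
A minimal sketch of the behavior described above, not the scikit-bio implementation. It assumes a hypothetical helper named `used_value_stats`, plain strings standing in for aligned sequences, and that the statistics are taken over the distinct values that appear in the heatmap. Unused `value_map` entries are ignored, and `defaultdict` defaults are counted:

```python
from collections import defaultdict
from statistics import median


def used_value_stats(sequences, value_map):
    """Return (min, median, max) of the distinct values actually plotted.

    Hypothetical helper for illustration only.
    """
    # Look up every character that occurs in the alignment. Lookups go
    # through value_map[char], so a defaultdict supplies its default value
    # and a plain dict raises KeyError for unmapped characters.
    used = {value_map[char] for seq in sequences for char in seq}
    if not used:
        raise ValueError("cannot compute legend values for an empty alignment")
    ordered = sorted(used)
    return ordered[0], median(ordered), ordered[-1]


# 'W' maps to 9.0 but never occurs in the alignment, so it is ignored.
value_map = defaultdict(lambda: 5.0, {'A': 1.0, 'C': 2.0, 'W': 9.0})
# 'G' and '-' fall back to the default 5.0, which becomes the maximum.
stats = used_value_stats(['ACG', 'AC-'], value_map)
print(stats)  # (1.0, 2.0, 5.0)
```

With this approach the "maximum" label lands on 5.0 (the default value that is actually drawn), not on the unused 9.0 entry.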

Contributor

@jairideout, that makes sense, and now that you bring it up I see that it has to be that way. But I wonder if we should include the values in the legend labels as well in this case. So, next to the label for the minimum, you'd always include the min value in parentheses, and the same for the median and max. Otherwise it could be misleading if your labels were "hydrophilic", "medium", and "hydrophobic", and there were no "hydrophobic" residues in the alignment (I know, very unlikely, but just working from the cookbook example).

Contributor Author

That's a good idea, I'll add that in. Since these labels are pretty specialized, I think a better default is to not include these labels on the legend at all (i.e., legend_labels=None). If someone wants to mark the min/median/max, then they have the option to, but then we're not forcing users to label the legend this way. Does that make sense?

Contributor

Yep, good idea.
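
One way the agreed change could look, as a rough sketch only: the helper name `format_legend_labels`, the `"label (value)"` format, and the numeric values are assumptions for illustration, not the code in this pull request. Passing `legend_labels=None` leaves the legend unlabeled; otherwise each label is shown with the value it marks:

```python
def format_legend_labels(legend_labels, stats):
    """Pair each label with its value, e.g. 'hydrophobic (1.8)'.

    Hypothetical helper; `stats` is a (min, median, max) tuple.
    Returns None when labels are disabled.
    """
    if legend_labels is None:
        return None
    labels = list(legend_labels)
    if len(labels) != 3:
        raise ValueError("legend_labels must contain exactly three labels "
                         "(minimum, median, maximum); got %d" % len(labels))
    return ["%s (%g)" % (label, value) for label, value in zip(labels, stats)]


# Illustrative values only.
formatted = format_legend_labels(["hydrophilic", "medium", "hydrophobic"],
                                 (-4.5, -0.4, 1.8))
print(formatted)
# ['hydrophilic (-4.5)', 'medium (-0.4)', 'hydrophobic (1.8)']
```

Showing the value next to each label means a reader can tell when the "hydrophobic" end of the legend sits at a value that is not actually hydrophobic.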

@coveralls

Coverage decreased (-0.15%) to 98.72% when pulling 89f45fc on jairideout:aln-heatmap into 1f3bb5a on biocore:master.

@jairideout
Contributor Author

@gregcaporaso @ebolyen this is ready for review -- tests are passing now and I manually verified that coverage is at 100% for the code I added (coveralls isn't rerunning for some reason).

@coveralls

Coverage increased (+0.01%) to 98.88% when pulling a7b9094 on jairideout:aln-heatmap into 1f3bb5a on biocore:master.

@coveralls

Coverage increased (+0.01%) to 98.88% when pulling a97665a on jairideout:aln-heatmap into d9371eb on biocore:master.

@ebolyen
Contributor

ebolyen commented Feb 20, 2015

@jairideout Merge conflict.

Conflicts:
	CHANGELOG.md
	skbio/alignment/_alignment.py
	skbio/alignment/tests/test_alignment.py
@coveralls

Coverage increased (+0.01%) to 98.99% when pulling 246b15f on jairideout:aln-heatmap into bf0d8c6 on biocore:master.

@jairideout
Contributor Author

Fixed, thanks!

@gregcaporaso
Contributor

Looks good, thanks @jairideout! I just added one suggestion based on your question - if that doesn't make sense we can discuss tomorrow.

See here for choices:
http://matplotlib.org/examples/color/colormaps_reference.html
If ``None``, defaults to the colormap specified in the user's
matplotlibrc file.
Contributor

What's the default if it's not specified in the user's matplotlibrc?

Contributor

With any luck, we'll have a better default in the future as mpl is discussing a change to the default color map (matplotlib/matplotlib#875). Anyone want to weigh in on that discussion?

Contributor Author

This will fall back to whatever matplotlib's default is (there is always a "base" config file included in a matplotlib install, similar to how QIIME config files work). I don't think listing this here is appropriate because matplotlib's defaults may change and we won't want to update these docs when that happens.

Contributor

Yep, agree.

@jairideout
Contributor Author

Thanks for reviewing @gregcaporaso! I had another question (see inline). In the meantime I'll work on the change you suggested.

@jairideout
Contributor Author

@gregcaporaso I made the changes we discussed. Can you and @ebolyen review?

@jairideout jairideout modified the milestones: 0.3.0: Sequences, Collections, and Alignments, oh my!, 0.3.4: Easy as ABC Mar 9, 2015
def test_heatmap_invalid_sequence_order(self):
    # duplicate ids
    with self.assertRaises(ValueError):
        self.a1.heatmap({}, sequence_order=['d1', 'd2', 'd1'])
Contributor

Can you assert that some basic verbs exist in the error message, to ensure that the correct error is being raised?

Contributor Author

Excellent idea!
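
A sketch of what such a check could look like, using the standard library's `assertRaisesRegex` (Python 3 spelling). The validator `check_sequence_order` and its error message are hypothetical stand-ins so that the example is self-contained; they are not the code under review:

```python
import unittest


def check_sequence_order(sequence_order):
    """Stand-in validator (hypothetical): reject duplicate sequence IDs."""
    if len(set(sequence_order)) != len(sequence_order):
        raise ValueError("sequence_order contains duplicate IDs")


class SequenceOrderTests(unittest.TestCase):
    def test_duplicate_ids(self):
        # Matching on part of the message ensures that this ValueError,
        # and not some unrelated one, is what was raised.
        with self.assertRaisesRegex(ValueError, "duplicate"):
            check_sequence_order(['d1', 'd2', 'd1'])
```

Matching on a short, stable fragment of the message keeps the test from breaking whenever the wording is polished.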

@jairideout
Contributor Author

Thanks for reviewing @ebolyen! I'm going to hold off on addressing your comments until after the alignment object is refactored (#823); then I'll update the changes here to work with the new API. I'm marking this as "do not merge" for now.

@jairideout jairideout removed this from the 0.3.0: Sequences, Collections, and Alignments, oh my! milestone Jun 15, 2015
@jairideout jairideout closed this Jun 4, 2020
Development

Successfully merging this pull request may close these issues.

add TabularMSA.heatmap method
5 participants