make Alignment plottable as a sequence logo #805

Open
jairideout opened this issue Jan 13, 2015 · 29 comments

@jairideout
Contributor

@gregcaporaso, @Kleptobismol, and I chatted a while ago about adding a plotting method to Alignment to create a sequence logo. This could be a really cool visualization and might be a good default plot type for Alignment.

@jairideout added this to the 0.2.3: Easy as ABC milestone Jan 13, 2015

@kestrelgorlick
Contributor

We first looked at WebLogo (http://weblogo.berkeley.edu/), an existing tool that does exactly this. However, looking through its source code, I found that it writes an image file directly rather than producing a matplotlib.figure.Figure. Plotting methods in scikit-bio need to return a matplotlib.figure.Figure object, so we could not use it.

The second possibility I found was a kwarg of matplotlib.text.Text called stretch. It is meant to stretch the text bbox horizontally, so combined with changing the font size it should have worked. However, I learned that this feature is not yet implemented in matplotlib (https://github.com/matplotlib/matplotlib/blob/478b181748c14610fc76fe7dbf6a8fbfcbe48b88/matplotlibrc.template#L133), so I dropped that idea.

The final option, suggested by @gregcaporaso, was to use PIL to create an image that displays a single letter and then stretch that image using matplotlib.image. I was able to create the text-only image with PIL, but in the documentation for matplotlib's image module I could not find anything that would let me stretch the image in either direction; it seems to only allow plotting the image at a fixed point at a best-fit size.

I have exhausted every possibility I could think of. If anyone has an idea of how to accomplish this with matplotlib, that would be much appreciated.

cc @gregcaporaso

@rob-knight

Jeremy Widmann implemented this at one point using pycogent so it might be in that codebase somewhere.

@jairideout
Contributor Author

Thanks @rob-knight! @Kleptobismol the PyCogent repo is here:

https://github.com/pycogent/pycogent

@anderspitman
Contributor

Did the WebLogo source not yield any obvious hints as to how they accomplish this?

@kestrelgorlick
Contributor

@rob-knight, thank you for the suggestion, I really needed some help on this. I searched the PyCogent code for the following terms:

sequence_logo: 0 results
sequence logo: 0 results
sequence: 355 results (too many to comb through, so I tried logo instead)
logo: 1 result, a mention in the config file unrelated to sequence logos
plot: 38 results, none related to sequence logos

Based on this, I am fairly certain that a sequence logo function was never implemented in PyCogent.

@rob-knight

If anyone reading this has access to our old CVS repository or a backup of it (I can’t look into it now), it’s apparently in Projects/WebLogo. Is anyone available to assist?

@kestrelgorlick
Contributor

Again thanks @rob-knight for your help with this.

@gregcaporaso
Contributor

Here (https://www.dropbox.com/s/to33862c8z0ux8k/Weblogo.zip?dl=0) it is, last edit: 8/22/2007!

Note that this code was written by Jeremy Widmann. If we're going to port any of this code to scikit-bio, we need to get his permission first. If it ends up being useful as a reference, but we don't use any code, we should acknowledge Jeremy in the commit message.

@wasade
Collaborator

wasade commented Feb 5, 2015

@gregcaporaso, I can inquire easily if you'd like

@gregcaporaso
Contributor

gregcaporaso commented Feb 5, 2015 via email

@widmannj

widmannj commented Feb 5, 2015

Hey, Greg/Daniel.

Feel free to port this code to scikit-bio. I haven't touched the code in a while, so I'm glad to see it will live on. As I recall, it will need some scaling tweaks for larger alignments, as I used it primarily for shorter non-coding RNAs. Also note that the code only supports the "ACTGU" character set, because the letters are not fonts but matplotlib.patches.Polygon objects whose vertices I had to generate manually. You'll need to do the same for any other IUPAC characters you want to support.

Cheers,
Jeremy
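
For anyone following the thread, here is a minimal sketch of the Polygon-based idea described above. It is not Jeremy's actual code: the `T_GLYPH` vertices and the helper names `place_glyph` and `draw_glyph` are made up for illustration. Each letter is assumed to be stored as an outline in the unit square, so stretching it to any width and height is plain arithmetic on the vertices.

```python
# Hypothetical unit-square outline for a blocky "T". These vertices are
# illustrative only and are not taken from the original WebLogo code.
T_GLYPH = [(0.0, 1.0), (1.0, 1.0), (1.0, 0.8), (0.6, 0.8),
           (0.6, 0.0), (0.4, 0.0), (0.4, 0.8), (0.0, 0.8)]


def place_glyph(vertices, x, y, width, height):
    """Map unit-square vertices into the box whose lower-left corner is (x, y)."""
    return [(x + vx * width, y + vy * height) for vx, vy in vertices]


def draw_glyph(ax, vertices, x, y, width, height, color):
    """Add the stretched glyph to a matplotlib Axes as a Polygon patch."""
    from matplotlib.patches import Polygon  # imported lazily; needs matplotlib

    patch = Polygon(place_glyph(vertices, x, y, width, height),
                    closed=True, facecolor=color, edgecolor="none")
    ax.add_patch(patch)
    return patch
```

With something like this, the plotting method could build a Figure, call `draw_glyph` once per letter per column, set the axis limits explicitly, and return the Figure. Supporting more characters would mean adding more vertex lists, which matches the caveat above.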

@gregcaporaso
Contributor

Thanks @widmannj, we really appreciate it! Note that this would imply that the code would then be under the BSD license. Is that ok? We just need to explicitly confirm that for our records.

@widmannj

widmannj commented Feb 5, 2015

BSD is fine...glad that I could help.

@gregcaporaso
Contributor

Thanks again!

@rob-knight

Thanks!

Rob

@gregcaporaso
Contributor

This is interesting, about alternatives to sequence logos:
http://www.isa-tools.org/towards-a-new-sequence-logo-visualization/

@ebolyen
Contributor

ebolyen commented Sep 28, 2015

👍 That looks really cool!

@nvictus

nvictus commented Apr 24, 2016

Hello!

In case this is useful at all, I recently wrote a module to parse SVG path shapes into matplotlib paths and use it to make sequence logo plots from font characters by plotting them as patches as in @widmannj's code.

On a less serious note, one could just as easily throw in arbitrary SVG glyphs (sequence emoji, anyone?).
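
As a rough illustration of plotting font characters as patches, here is a sketch that swaps in matplotlib's built-in `TextPath` in place of the SVG parsing module mentioned above. The names `stack_layout` and `draw_column`, the monospace bold font, and the 0.9 column width are assumptions for the example, not part of any existing code in this thread.

```python
def stack_layout(heights):
    """Return (letter, y_bottom, height) tuples, shortest letter at the bottom."""
    layout, y = [], 0.0
    for letter, h in sorted(heights.items(), key=lambda kv: kv[1]):
        if h > 0:  # letters with zero height are not drawn
            layout.append((letter, y, h))
            y += h
    return layout


def draw_column(ax, x, heights, colors, width=0.9):
    """Draw one logo column at position x using font outlines as patches."""
    # Imported lazily so the layout helper works without matplotlib.
    from matplotlib.font_manager import FontProperties
    from matplotlib.patches import PathPatch
    from matplotlib.textpath import TextPath
    from matplotlib.transforms import Affine2D

    font = FontProperties(family="monospace", weight="bold")
    for letter, y, h in stack_layout(heights):
        path = TextPath((0, 0), letter, size=1, prop=font)
        box = path.get_extents()
        # Move the outline to the origin, stretch it to width x h, then
        # shift it to its place in the column.
        fit = (Affine2D()
               .translate(-box.x0, -box.y0)
               .scale(width / box.width, h / box.height)
               .translate(x, y))
        ax.add_patch(PathPatch(path.transformed(fit),
                               facecolor=colors.get(letter, "black"),
                               linewidth=0))
```

Here `heights` is assumed to map each letter in a column to its height (for example, frequency times information content), and `colors` maps letters to matplotlib colors. The caller would still need to set the axis limits before returning the Figure.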

@jairideout
Contributor Author

This looks really useful, thanks for pointing us to your module @nvictus!

@Kleptobismol, pinging you so you're aware of this when wrapping up sequence logo.

@saketkc

saketkc commented Jun 8, 2016

Hi everyone,

I have another version (more of a proof of concept) that follows tricks from the seqLogo package.

https://github.com/saketkc/motif-logos-matplotlib/blob/master/Motif%20Logos%20using%20matplotlib.ipynb

My understanding is that the BSD 3-clause license and the LGPL are compatible. I am unaware of the scikit-core license policy, though.

@jairideout
Contributor Author

Thanks for sharing your code @saketkc! I'm unsure about licensing but it seems BSD-3 clause and LGPL are compatible, and scikits just need to have an OSI-approved license (which includes BSD-3 clause and LGPL).

@stephenshank

In the event that this will be of use to someone searching for matplotlib sequence logos, I've built upon the excellent work of @nvictus to do amino acid sequence logos.

I'm not sure what the status of this issue is. But if any of this code would be of use to scikit-bio, I would be glad to contribute.

@jairideout
Contributor Author

@stephenshank that looks awesome! It'd be great to support plotting nucleotide and amino acid sequence logos from an skbio.TabularMSA object. @Kleptobismol is working on other tasks/projects now so it'd be great to have you contribute that code as a TabularMSA method if you're still interested. Let us know if you are and if you have any questions about contributing to scikit-bio (I recommend checking out CONTRIBUTING.md).

@wasade
Collaborator

wasade commented Jun 22, 2016

Just a small comment: would it be possible to use the same color palette that MEME uses? I believe the colors correspond to Sanger sequencing, so there is some historical precedent.

@stephenshank

stephenshank commented Jun 23, 2016

@jairideout it would be my pleasure! At this point my only concern is whether there are any licensing issues, since I am dependent on the code of @nvictus and am quite certain that it will have to go into scikit-bio along with the TabularMSA method. I suspect this is not a problem. I will submit a PR in the near future and we can go from there.

@jairideout
Contributor Author

@stephenshank, that sounds great, thanks for working on this! There shouldn't be any issues with including the code linked by @nvictus in scikit-bio because that code is BSD-3 clause (same as scikit-bio). If you end up including that code, just add its license as a new file under the top-level licenses directory. I usually also add a comment in the ported code stating where it came from (e.g., see these lines).

@nvictus

nvictus commented Jun 24, 2016

Let me know if you need any help with this @stephenshank! I'd be happy to help review the PR.

@saketkc

saketkc commented Mar 14, 2017

Any updates on this?
I created a newer version that scales pretty well and supports multiple fonts:

http://nbviewer.jupyter.org/github/saketkc/notebooks/blob/master/python/Sequence%20Logo%20Python%20%20--%20Any%20font.ipynb?flush=true

@wasade
Collaborator

wasade commented Mar 15, 2017 via email
