visualization of the environmental parameters gradients #13

Closed
gregcaporaso opened this issue Jul 19, 2012 · 35 comments

@gregcaporaso
Contributor

From Jack:

I would also like to have a visual representation of the environmental gradients we have for each ecosystem. I can imagine a figure like the attached (sorry, I'm in my hotel room), where we show, on a scale of 0-100, how much of each gradient we have already surveyed. 0 would be the lowest possible (sensible) limit for that variable and 100 the highest. So for temperature we would go from -56C to +120C, and for pH from 1 to 14, or something like that. I could have someone start creating this if everyone agrees it is a good idea.
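
A minimal sketch of the 0-100 rescaling described above, assuming simple linear scaling with clamping. The function names and the `LIMITS` table are hypothetical; the limits are the illustrative ones from Jack's message, not agreed project values.

```python
def scale_to_percent(value, low, high):
    """Map value from [low, high] onto a 0-100 scale, clamping out-of-range input."""
    if high <= low:
        raise ValueError("high must be greater than low")
    clamped = min(max(value, low), high)
    return 100.0 * (clamped - low) / (high - low)


# Illustrative limits taken from the message above (assumption, not final values).
LIMITS = {"temperature_c": (-56.0, 120.0), "ph": (1.0, 14.0)}


def coverage(values, low, high):
    """Return the (min, max) span of the surveyed values on the 0-100 scale."""
    scaled = [scale_to_percent(v, low, high) for v in values]
    return min(scaled), max(scaled)
```

For example, `coverage(ph_values, *LIMITS["ph"])` would give the stretch of the pH gradient that the surveyed samples span, which is the part of the 0-100 bar to shade.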

@dansmith01

How's something like this? The different colors represent different environments.
Red = Animal-Associated
Green = Sea Water
Blue = Fresh Water
Orange = Soil
Black = other

Env. Gradients V1

@gregcaporaso
Contributor Author

This is cool Dan.

Would it be possible to add something like a mouseover to see what
projects/samples contribute to what bars in the bar chart?

Also, it's a bit of a problem that some of the bars are obscuring
other bars. Would having the different environments side-by-side work
better?

@dansmith01

Mouseovers would be cool - I'd need to rework my script quite a bit though to track projects/samples.

The bars are stacked, so there's no need to worry about visual obstruction :)

@dansmith01

I should also mention that although the combined height of each bar is plotted on a log scale, the color-by-color breakdown is simple percentage.
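
One way to read the scheme just described, sketched under assumptions: the bar's total height follows a log scale, and each color takes a share of that height equal to its plain percentage of the samples in the bin. The function name is hypothetical, and the `+ 1` offset (so that an empty bin has height 0) is an assumption rather than something stated in this thread.

```python
import math


def stacked_segments(counts):
    """Split a log-scaled bar into segments proportional to each category's share.

    counts maps an environment name to its sample count in one histogram bin.
    Total height is log10(total + 1); the +1 offset is an assumption.
    """
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    height = math.log10(total + 1)
    # Each segment gets a linear (percentage) share of the log-scaled height.
    return {name: height * n / total for name, n in counts.items()}
```

A consequence of this design is that only the combined bar height can be read against the log axis; the height of an individual colored segment shows its share of the bin, not its sample count.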

@gregcaporaso
Contributor Author

OK, that makes sense, thanks! What do others think? Is mouseover to
show project/sample ids in Dan's plot worth the development effort on
this part?

Also, Dan, could you modify so the figure titles are not abbreviated
(e.g., 'tot org carb' becomes 'Total Organic Carbon')?

@dansmith01

I've updated the figure to display full title descriptions as you suggested - you may need to click on the above image and hit refresh to see the changes. Do you happen to know what the units of measurement are for the values in EMP data, e.g. um vs nm?

I've also put together this graphic showing the geographical distribution of EMP samples:
Geographic Distribution

@gregcaporaso
Contributor Author

Cool, this looks better. If @gilbertjack agrees that this is what he's looking for I think we can close the issue.

As for the geographic distribution, this overlaps with issue #2 so @dansmith01 and @douginator2000 should connect about this.

@gilbertjack

I am very happy with this.

@gilbertjack

Hey Dan,

Can you remove the gradient bars between the major divisions - i.e. just keep the gradient bars for 10, 100, 1000, 10,000, etc.

Cheers

Jack

@dansmith01

Sure thing! I've updated the figure. You can also view it at this link:
http://img.dnasmith.com/histograms_static1.png

I left the tick marks though to help key in the viewer that it's on a log scale. If you'd like them removed as well, just let me know.
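
As a rough sketch of keeping gradient bars only at the major divisions (10, 100, 1000, and so on), the positions can be computed as the powers of ten inside the axis range. The function name is hypothetical and this is not necessarily how the figure script does it.

```python
import math


def decade_ticks(low, high):
    """Return the powers of ten lying within [low, high], for positive bounds."""
    if low <= 0 or high < low:
        raise ValueError("need 0 < low <= high")
    first = math.ceil(math.log10(low))
    last = math.floor(math.log10(high))
    return [10 ** k for k in range(first, last + 1)]
```

The minor tick marks between decades would be drawn separately, which matches leaving them in place while dropping the in-between bars.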

@dansmith01

I'm thinking it'd be cool to add a fourth color for human-associated samples. What do you think?

@gilbertjack

sure sounds good, and yes i am fine with the marks

@dansmith01

Ok, updated. And some of them now have a log scale on the x-axis.

@gregcaporaso
Contributor Author

@dansmith01, this looks much better! Thanks!

Two last issues:

  • Can this be generated as a PDF for better scaling?
  • We need a legend for the colors. Can you generate this as a
    separate PDF? That will be most convenient for use when building
    slides, etc.

Once this is done I think we're ready to commit these files to the
data repository and close this issue.

@dansmith01

Here's a PDF of the histograms: http://img.dnasmith.com/histograms.pdf
I'll put together a legend shortly.

@gregcaporaso
Contributor Author

@dansmith01 - what was the source of the data in this plot? I'm realizing that we still don't have the full mapping file together, so I'm just wondering if this is comprehensive.

@gregcaporaso
Contributor Author

@dansmith01 - just wanted to check on the source of this data. We'll need the plot generated from the latest mapping file (see issue #24) which we're hoping will be ready tonight. Sorry, I hope that's not too much extra effort!

@dansmith01

@gregcaporaso - I'm using the metadata files downloaded from the EMP GESD.

@gregcaporaso
Contributor Author

OK, there is going to be a new "official" metadata file coming soon (issue
#24). Will you be able to easily regenerate the plot with that one? It will
be the same format as the ones you're downloading.

Greg


@dansmith01

Yep - I think that'll be simple enough.

@dansmith01

I've got the legend for this figure ready to go:
http://img.dnasmith.com/histograms-legend.pdf

Legend

@gregcaporaso
Contributor Author

Perfect, thanks!

@dansmith01

@gregcaporaso - The official metadata file has 6,541 samples, whereas the one I compiled from GESD has 14,176 samples. For example, the GESD dataset named "sample_template_2012-06-14 13_54_16.486850" is missing. Do you know why so many samples were excluded from the official compilation, and would you still like me to regenerate the above histograms using the reduced dataset?

@jistombaugh
Contributor

The official metadata file contains only the samples that were sequenced and then subsequently processed and loaded into the QIIME-DB. The EMP portal contains all samples including those samples which haven't been sequenced yet, which is why there is a large discrepancy in the numbers.
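
To make the discrepancy concrete, here is a small sketch that partitions portal sample IDs by whether they appear in the official file. The function name and the idea of matching on sample ID alone are assumptions for illustration.

```python
def split_by_sequenced(portal_ids, official_ids):
    """Partition portal sample IDs into those present in the official file and those absent."""
    official = set(official_ids)
    sequenced = [s for s in portal_ids if s in official]
    pending = [s for s in portal_ids if s not in official]
    return sequenced, pending
```

Restricting the histograms to the first list would limit them to samples that have been sequenced and loaded, while the second list would hold the portal samples still awaiting sequencing.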

@gregcaporaso
Contributor Author

For this analysis I think we want to go with what has been sequenced already as that's what we're including for the other analyses. @gilbertjack and @rob-knight, do you agree?

@gilbertjack

Yes

@rob-knight

yes, definitely


@gregcaporaso
Contributor Author

@dansmith01, let me know if you need anything else to get this done.

@ghost ghost assigned dansmith01 Aug 15, 2012
@dansmith01

@gregcaporaso, could you take a quick look at the master mapping file (issue #24)? The last 104 lines don't seem to mesh with the columns in the lines above.
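
A quick check along these lines can locate rows that don't mesh with the header. This is a sketch that assumes the mapping file is tab-delimited with a single header row; the function name is hypothetical.

```python
import csv


def ragged_rows(handle, delimiter="\t"):
    """Yield (line_number, field_count) for rows whose width differs from the header's."""
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    header = next(reader, None)
    if header is None:
        return  # empty file: nothing to compare against
    expected = len(header)
    for row in reader:
        if len(row) != expected:
            yield reader.line_num, len(row)
```

For a gzipped file, the handle could come from `gzip.open(path, "rt", newline="")`, where `path` is wherever the mapping file was saved.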

@gregcaporaso
Contributor Author

@dansmith01 are you sure your Dropbox is syncing? That was an issue with an older version, but I fixed it a couple of days ago. The version here:

https://github.com/EarthMicrobiomeProject/isme14/blob/master/master_mapping_file.txt.gz?raw=true

also has the fix.

@dansmith01

Not sure about dropbox, but that link works great! Thanks

@jairideout
Member

It might be an issue with Dropbox. Greg and I ran into the same problem a couple
of days ago, when my shared Dropbox folder wasn't updating.


@dansmith01

@alexdthomas

Hello all,

I made a mock-up demonstration along similar lines to this thread for my first meeting with Dr. Jansson. You can see it here:
https://www.dropbox.com/s/wd68fpdlakw83c2/AThomas_EMP_DemoAnalysis.pptx

Jack Gilbert liked the maps and wanted to use them. However, I am unsure of the quality of the data I used. I threw this together by copying the lat/long coordinates out of the map here: http://www.microbio.me/emp/

I'd like to make myself useful, so let me know what you think. I have some questions and concerns about the metadata I added to #17 (comment)

Thanks,
Alex

@cuttlefishh
Collaborator

@rob-knight suggested redoing graphs of environmental parameters for EMP 20k analysis.

@cuttlefishh cuttlefishh self-assigned this Dec 23, 2015