Metaphlan and Humann tutorials #63

Merged
merged 31 commits into from
Oct 8, 2021

Conversation

kescobo (Member) commented Oct 5, 2021

No description provided.

annelle-abatoni (Collaborator) commented:

I am wondering why conda.md was deleted? Is it not needed to run functions anymore?

annelle-abatoni (Collaborator) commented Oct 6, 2021

> I am wondering why conda.md was deleted? Is it not needed to run functions anymore?

Oh, I can see that it's mentioned in docs/src/gettingstarted.md which I think answers my question...

annelle-abatoni (Collaborator) commented:

It seems that tutorial.md contains metaphlan functions, and I was wondering if we're going to be keeping it like that or shifting them to metaphlan.md?

(Outdated, resolved review thread on docs/src/humann.md)

annelle-abatoni (Collaborator) commented Oct 6, 2021

I am not sure what the rules on this are, but do you think it would be helpful to link every method mentioned to its implementation in a src file? For example, a linked `CommunityProfile` instead of a plain `CommunityProfile`.

I think this would be even more helpful when pointing to files.

anikaluo (Collaborator) commented Oct 6, 2021

Not sure why but I get this error when I try to run metaphlan (Line 91):

julia> metaphlan("SRS014476-Supragingival_plaque.fasta.gz", "SRS014476-Supragingival_plaque_profile.tsv"; input_type="fasta")
[ Info: Running command: `metaphlan SRS014476-Supragingival_plaque.fasta.gz SRS014476-Supragingival_plaque_profile.tsv --input_type fasta`
ERROR: IOError: could not spawn `metaphlan SRS014476-Supragingival_plaque.fasta.gz SRS014476-Supragingival_plaque_profile.tsv --input_type fasta`: no such file or directory (ENOENT)
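
One possible cause, assuming the ENOENT refers to the `metaphlan` executable itself rather than the input file, is that the program is not on the `PATH` seen by the Julia session. A minimal sketch of how to check this with `Sys.which` from Julia's Base (the helper name `check_executable` is hypothetical and not part of the package):

```julia
# Hypothetical helper: report whether a command-line tool is visible to Julia.
function check_executable(cmd::AbstractString)
    path = Sys.which(cmd)  # returns `nothing` when `cmd` is not found on PATH
    if path === nothing
        @warn "`$cmd` was not found on PATH; spawning it would fail"
        return false
    end
    @info "Found `$cmd` at $path"
    return true
end

check_executable("metaphlan")
```

If this returns `false`, the tool is probably installed in an environment that Julia cannot see, which would match the error above.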

anikaluo (Collaborator) commented Oct 6, 2021

> It seems that tutorial.md contains metaphlan functions, and I was wondering if we're going to be keeping it like that or shifting them to metaphlan.md?

Yes, I wrote that tutorial and named it as such because I thought the tutorials were all going to be in one file. We can absolutely delete it if there's no new info in it.

annelle-abatoni (Collaborator) commented:

> > It seems that tutorial.md contains metaphlan functions, and I was wondering if we're going to be keeping it like that or shifting them to metaphlan.md?
>
> Yes, I wrote that tutorial and named it as such because I thought the tutorials were all going to be in one file. We can absolutely delete it if there's no new info in it.

Oh, I see. Since tutorial.md currently has all the functions and examples for metaphlan.jl, which is exactly what's missing from metaphlan.md, we can just move those to metaphlan.md and then delete tutorial.md.

kescobo (Member, Author) commented Oct 8, 2021

Alright, finally back to at least some tests passing. I'm gonna merge this now. @anikaluo @annelle-abatoni can you check out main and run the tests on your Macs to see if they pass for you? Make sure to `]add Microbiome#main` or `]dev Microbiome` first if you haven't already.

kescobo merged commit e789299 into main on Oct 8, 2021
kescobo deleted the tutorials branch on October 8, 2021 01:31

anikaluo (Collaborator) commented Oct 8, 2021

all tests pass locally for me!
should we now branch and comment on main? how should we review the tutorials now?

kescobo (Member, Author) commented Oct 8, 2021

yay!

The humann tutorial should be good to review now. Metaphlan needs work

anikaluo (Collaborator) commented Oct 8, 2021

oh can we still review this PR even after it's merged?

kescobo (Member, Author) commented Oct 8, 2021

Yep! But you can also make a review PR

If you just want a single value, you can use a `GeneFunction` directly:

```julia-repl
julia> gfs_strat[GeneFunction("UniRef90_D0TRR5", "Bacteroides_dorei"), 1]
```

Review comment (Collaborator):

Should this actually be something like this:

julia> gfs_strat[GeneFunction("UniRef90_D0TRR5", Taxon("Bacteroides_dorei", :species)), 1]
0.8271298594

When I run the line as is, I get a huge error that starts with ERROR: BoundsError: attempt to access Axis(GeneFunction[GeneFunction("UNMAPPED", missing), ...


In the following example, `gf ->` indicates a function that takes a single argument
(in this case, our `GeneFunction`),
then askes if it's [`name`](@ref) is "UNMAPPED" with `name(gf) == "UNMAPPED"`,

Review comment (Collaborator):

  • "askes" --> "asks" typo
  • broken link for name?
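
For readers new to the `gf -> ...` syntax described in the quoted passage, here is a self-contained sketch of the same pattern. It uses a hypothetical `ToyFeature` type and `featurename` accessor as stand-ins so that it runs without Microbiome.jl; the tutorial itself uses `GeneFunction` and `name`.

```julia
# Hypothetical stand-in for a feature type; not part of Microbiome.jl.
struct ToyFeature
    name::String
end

# Hypothetical accessor playing the role of `name` in the tutorial.
featurename(f::ToyFeature) = f.name

features = [ToyFeature("UNMAPPED"),
            ToyFeature("UniRef90_D0TRR5"),
            ToyFeature("METSYN-PWY")]

# `gf -> ...` is an anonymous function that takes a single argument.
is_unmapped = gf -> featurename(gf) == "UNMAPPED"

# Keep only the features whose name is not "UNMAPPED".
mapped = filter(gf -> !is_unmapped(gf), features)
```

The predicate is an ordinary value, so it can be bound to a name as above or written inline in the call to `filter`.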


Note - to get other feature types, you may have to download the requisite databases
using `humann_databases` at the command line.
See [Using Conda.jl](@ref)

Review comment by anikaluo (Collaborator), Oct 11, 2021:

broken link
I think it should be [Using Conda.jl](@ref using-conda)?

(because Getting Started line 136: ### [Using Conda.jl](@id using-conda))

```julia-repl
Process(`humann_join_tables -i hmp_subset -o hmp_subset_genefamilies.tsv --file_name genefamilies`, ProcessExited(0))
```

This will write a new file that you can then load with [`humann_profiles`](@ref)

Review comment (Collaborator):

Should humann_profiles have a docstring?

This will write a new file that you can then load with [`humann_profiles`](@ref)

```julia-repl
julia> humann_profiles("hmp_subset_genefamilies.tsv"; stratified=true)
```

Review comment by anikaluo (Collaborator), Oct 11, 2021:

add output?

julia> humann_profiles("hmp_subset_genefamilies.tsv"; stratified=true)
CommunityProfile{Float64, GeneFunction, MicrobiomeSample} with 1 features in 6 samples

Feature names:
UNMAPPED

Sample names:
SRS014459-Stool_Abundance-RPKs, SRS014464-Anterior_nares_Abundance-RPKs, SRS014470-Tongue_dorsum_Abundance-RPKs...SRS014476-Supragingival_plaque_Abundance-RPKs, SRS014494-Posterior_fornix_Abundance-RPKs

I guess it's the same output as below, but it could be helpful to have it here too

sort="braycurtis",
scaling="logstack",
as_genera=true,
remove_zeros=true)

Review comment (Collaborator):

Should the humann_barplot calls all show full outputs? (1 and 2 have partial outputs, 3 and 4 have full outputs, and 5 has no output at all.)

Here are my outputs:

julia> humann_barplot(gfs, "plot1.png"; focal_metadata="STSite", focal_feature="METSYN-PWY")
Process(`humann_barplot --i /var/folders/n6/jnm06kv91t7gl5jy43j3smcr0000gn/T/jl_QMapSq -o plot1.png --last-metadata STSite --focal-metadata STSite --focal-feature METSYN-PWY`, ProcessExited(0))

julia> humann_barplot(gfs, "plot2.png"; focal_metadata="STSite", focal_feature="METSYN-PWY", 
                             sort="sum")
Process(`humann_barplot --i /var/folders/n6/jnm06kv91t7gl5jy43j3smcr0000gn/T/jl_cLL6XK -o plot2.png --last-metadata STSite --focal-metadata STSite --focal-feature METSYN-PWY --sort sum`, ProcessExited(0))

julia> humann_barplot(gfs, "plot3.png"; focal_metadata="STSite", focal_feature="METSYN-PWY",
                             sort=["sum", "metadata"],
                             scaling="logstack")
Process(`humann_barplot --i /var/folders/n6/jnm06kv91t7gl5jy43j3smcr0000gn/T/jl_dQCVFF -o plot3.png --last-metadata STSite --focal-metadata STSite --focal-feature METSYN-PWY --sort sum metadata --scaling logstack`, ProcessExited(0))

julia> humann_barplot(gfs, "plot4.png"; focal_metadata="STSite", focal_feature="COA-PWY",
                             sort="sum")
Process(`humann_barplot --i /var/folders/n6/jnm06kv91t7gl5jy43j3smcr0000gn/T/jl_nXUvHy -o plot4.png --last-metadata STSite --focal-metadata STSite --focal-feature COA-PWY --sort sum`, ProcessExited(0))

julia> humann_barplot(gfs, "plot5.png"; focal_metadata="STSite", focal_feature="COA-PWY",
                             sort="braycurtis",
                             scaling="logstack",
                             as_genera=true,
                             remove_zeros=true)
Process(`humann_barplot --i /var/folders/n6/jnm06kv91t7gl5jy43j3smcr0000gn/T/jl_qYK7Ou -o plot5.png --last-metadata STSite --focal-metadata STSite --focal-feature COA-PWY --sort braycurtis --scaling logstack --as-genera --remove-zeros`, ProcessExited(0))

kescobo (Member, Author) commented Oct 12, 2021

Pretty sure a bunch of these things have been fixed in recent PRs. Let's put a pause on reviewing here and make sure everyone is on the most up-to-date version tomorrow


3 participants