Extract information from PHI-Base matches #26

Closed
darcyabjones opened this issue May 28, 2020 · 1 comment
Labels
enhancement (new feature or request), help wanted (extra attention is needed)

Comments

@darcyabjones
Member

What kind of feature are you proposing?

We need a way to associate PHI-base MMseqs2 matches with their phenotypic effects.
This may not be trivial, since the "reduced virulence" phenotype covers several different meanings.

Describe the solution you'd like

I think a simple frequency of matched phenotypes would be enough to reduce the data when we take it together with Pfam matches, signal peptide prediction, etc.
So if we have a match to a protein involved in the secretion pathway, which might have the phenotype "reduced virulence", we would count that as +1 for reduced virulence.
Essentially, we reduce the PHI-base matches down to counts of matches for each phenotypic effect.
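
A minimal sketch of the counting idea, assuming the matches have already been parsed into `(query_id, phenotype)` pairs. The function name `count_phenotypes` and the example identifiers are hypothetical, not part of the pipeline:

```python
from collections import Counter, defaultdict


def count_phenotypes(matches):
    """Collapse (query_id, phenotype) pairs into per-query phenotype counts."""
    counts = defaultdict(Counter)
    for query_id, phenotype in matches:
        # Normalise case and whitespace so variants of one label are pooled.
        counts[query_id][phenotype.strip().lower()] += 1
    return dict(counts)


# Hypothetical matches: prot1 hits three PHI-base entries, prot2 hits one.
matches = [
    ("prot1", "reduced virulence"),
    ("prot1", "Reduced virulence"),
    ("prot1", "effector"),
    ("prot2", "lethal"),
]
phenotype_counts = count_phenotypes(matches)
```

Each query protein then carries a small table of phenotype counts rather than a list of individual matches.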

The other suggestion is to classify those phenotypic effects by the experimental details, e.g. "knock out" + "reduced virulence". This would let us distinguish NEs from Avrs in some cases.
The experimental details column is less consistent than the phenotype column, though.
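
As a rough illustration of that second option, the sketch below counts (experimental technique, phenotype) pairs after a crude normalisation of the technique label. The record layout, the helper names, and the normalisation rule are all assumptions; real PHI-base values would likely need a curated mapping:

```python
from collections import Counter


def normalise_technique(label):
    """Keep only lowercase letters and digits, so 'Knock-out' == 'knock out'."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


def count_pairs(records):
    """Count (technique, phenotype) combinations across matches."""
    return Counter(
        (normalise_technique(technique), phenotype.strip().lower())
        for technique, phenotype in records
    )


# Hypothetical records: the first two differ only in spelling.
records = [
    ("knock out", "reduced virulence"),
    ("Knock-out", "Reduced virulence"),
    ("gene complementation", "effector"),
]
pair_counts = count_pairs(records)
```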

It's a bit tricky to be honest.
I'd love some suggestions.

darcyabjones added the enhancement and help wanted labels on May 28, 2020
darcyabjones added this to the Version 1 release milestone on May 28, 2020
@darcyabjones
Member Author

James and I came up with a solution for this.

We treat the PHI-base phenotypes "effector", "loss of pathogenicity", and "hypervirulence" as a high positive score.
We treat the phenotype "reduced virulence" as a weak positive score, but only if the protein did not already receive the high score.
We treat the phenotype "lethal" as a strong negative score.
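
The rules above can be sketched as a small scoring function. The numeric weights (2, 1, -2) are placeholders, and adding the "lethal" penalty on top of a positive score is an assumption, since the comment does not say how the two combine:

```python
HIGH_POSITIVE = {"effector", "loss of pathogenicity", "hypervirulence"}
WEAK_POSITIVE = {"reduced virulence"}
STRONG_NEGATIVE = {"lethal"}


def phibase_score(phenotypes, high=2.0, weak=1.0, negative=-2.0):
    """Score one protein from the phenotypes of its PHI-base matches."""
    seen = {p.strip().lower() for p in phenotypes}
    score = 0.0
    if seen & HIGH_POSITIVE:
        score += high
    elif seen & WEAK_POSITIVE:
        # Only applied when no high-scoring phenotype was matched.
        score += weak
    if seen & STRONG_NEGATIVE:
        # Assumption: the penalty is additive rather than overriding.
        score += negative
    return score
```

With these placeholder weights, a protein matching both "effector" and "reduced virulence" scores 2.0, not 3.0, because the weak score is skipped once the high score applies.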

We can extract these phenotypes easily from the FASTA headers.
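
A hedged sketch of that extraction, assuming a header layout in which fields are separated by `#`, the phenotype is the last field, multiple phenotypes are joined by `__`, and spaces are written as `_`. That layout and the example header are assumptions to check against the actual PHI-base FASTA file:

```python
def phenotypes_from_header(header):
    """Return the phenotype labels from one '#'-delimited FASTA header."""
    fields = header.lstrip(">").strip().split("#")
    raw = fields[-1]  # assumed: phenotype is the final field
    # Split on the double underscore first, then restore spaces in each label.
    return [p.replace("_", " ").lower() for p in raw.split("__") if p]


# Hypothetical header; the IDs are placeholders, not real PHI-base entries.
example = ">P00000#PHI:0000#geneX#0000#Some_organism#reduced_virulence__lethal"
labels = phenotypes_from_header(example)
```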
