Extract information from PHI-Base matches #26

darcyabjones · 2020-05-28T11:05:45Z

What kind of feature are you proposing?

We need a way to associate PHI-base MMseqs2 matches with their phenotypic effects.
This may not be trivial, since the "reduced virulence" phenotype covers several different meanings.

Describe the solution you'd like

I think a simple count of how often each phenotype appears among the matches would be enough to reduce the data, when taken together with Pfam matches, signal peptide predictions, etc.
So if we have a match to a protein involved in the secretion pathway, which might have the phenotype "reduced virulence", we could count that as +1 for reduced virulence.
Essentially, we reduce the PHI-base matches to counts of matches per phenotypic effect.
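
A minimal sketch of that counting step, assuming the matches have already been reduced to `(query_id, phenotype)` pairs. The function name, the input shape, and the example IDs are all hypothetical, and this is not how the pipeline actually reads MMseqs2 output.

```python
from collections import Counter, defaultdict


def count_phenotypes(matches):
    """Collapse (query_id, phenotype) pairs into per-query phenotype counts."""
    counts = defaultdict(Counter)
    for query_id, phenotype in matches:
        # Normalise case and spacing so "Reduced virulence" and
        # "reduced_virulence" land in the same bucket (an assumption about
        # how inconsistent the labels might be).
        key = phenotype.strip().lower().replace(" ", "_")
        counts[query_id][key] += 1
    return dict(counts)


# Made-up example matches, for illustration only.
matches = [
    ("prot1", "reduced_virulence"),
    ("prot1", "Reduced virulence"),
    ("prot1", "effector"),
    ("prot2", "lethal"),
]
summary = count_phenotypes(matches)
```

Each query then carries only a small table of phenotype counts, which can sit alongside the Pfam and signal peptide columns.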

The other option is to classify those phenotypic effects by the experimental details, e.g. "knock out" + "reduced virulence". This would let us distinguish NEs from Avrs in some cases.
The experimental details column is less consistent than the phenotype column, though.

It's a bit tricky to be honest.
I'd love some suggestions.

darcyabjones · 2020-06-17T02:59:24Z

James and I came up with a solution for this.

We treat the PHI-base phenotypes "effector", "loss of pathogenicity", and "hypervirulence" as a high positive score.
The phenotype "reduced virulence" gets a weak positive score, but only if the high score was not already given.
The phenotype "lethal" gets a strong negative score.
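
A rough sketch of that rule in Python. The numeric weights, the normalised phenotype spellings, and the choice to add the lethal penalty on top of any positive score are all assumptions made for illustration. Only the ordering (high, weak, strong negative) comes from the scheme described here.

```python
# Phenotype groups as named in this issue; the exact strings in the
# PHI-base data may be spelled differently.
HIGH_POSITIVE = {"effector", "loss_of_pathogenicity", "hypervirulence"}
WEAK_POSITIVE = {"reduced_virulence"}
STRONG_NEGATIVE = {"lethal"}

# Hypothetical weights: only their relative order reflects the proposal.
HIGH_SCORE = 2.0
WEAK_SCORE = 1.0
LETHAL_SCORE = -2.0


def phibase_score(phenotypes):
    """Score the set of phenotypes seen among one protein's matches."""
    seen = {p.strip().lower().replace(" ", "_") for p in phenotypes}
    score = 0.0
    if seen & HIGH_POSITIVE:
        score += HIGH_SCORE
    elif seen & WEAK_POSITIVE:
        # The weak score applies only when no high score was given.
        score += WEAK_SCORE
    if seen & STRONG_NEGATIVE:
        # Assumption: the penalty is additive rather than overriding.
        score += LETHAL_SCORE
    return score
```

For example, `phibase_score(["effector", "reduced virulence"])` gives the high score only, since the weak score is skipped once the high score has been applied.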

We can extract these phenotypes easily from the FASTA headers.
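
As an illustration only, here is a sketch of pulling phenotypes out of a header. The layout it assumes (fields separated by `#`, phenotypes in the last field joined by `__`) is a guess for the sake of the example and should be checked against the actual PHI-base FASTA file. The function name and the example header are made up.

```python
def phenotypes_from_header(header, field_sep="#", phenotype_sep="__"):
    """Return lower-cased phenotypes from the last field of a FASTA header.

    Assumes a hypothetical layout like
    ">ID#PHI:NNNN#gene#organism#phenotype1__phenotype2".
    """
    fields = header.lstrip(">").strip().split(field_sep)
    last_field = fields[-1]
    # Drop empty pieces in case of stray or trailing separators.
    return [p.lower() for p in last_field.split(phenotype_sep) if p]


# Made-up header, not a real PHI-base record.
example = ">P12345#PHI:0000#GeneX#Some_species#Reduced_virulence__Lethal"
parsed = phenotypes_from_header(example)
```

Both separators are parameters so the sketch can be adjusted once the real header layout is confirmed.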

darcyabjones added the enhancement and help wanted labels May 28, 2020

darcyabjones added this to the Version 1 release milestone May 28, 2020

darcyabjones closed this as completed Jun 17, 2020
