Name elements that indicate taxon is a virus #7

Closed
KatjaSchulz opened this issue May 13, 2015 · 4 comments

@KatjaSchulz KatjaSchulz commented May 13, 2015

It looks like names with these elements are not yet recognized as viruses, so capitalized words are stripped from the canonical form:

NPV, e.g., Papilio polyxenes NPV: http://eol.org/pages/41592578
RNA, e.g., Alternaria zinniae dsRNA element: http://eol.org/pages/11611917
virophage, e.g., Organic Lake virophage: http://eol.org/pages/20868817
satellites, e.g., Double-stranded RNA satellites: http://eol.org/pages/11603787
satellite, e.g., Whitefly VEM satellite: http://eol.org/pages/20858522
betasatellite, e.g., Tomato leaf curl China betasatellite: http://eol.org/pages/11603870
alphasatellite, e.g., Ageratum yellow vein Singapore alphasatellite: http://eol.org/pages/39738381
particle, e.g., Mouse Intracisternal A-particle: http://eol.org/pages/11609198
subgroup, e.g., Subgroup B: http://eol.org/pages/11623168 -- This is probably not limited to viruses, but it's very unlikely that any name that has this string in it will have author information associated with it.
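As a rough illustration of the request (not the parser's actual implementation), the elements above could be checked with one case-sensitive pattern. The name `looks_like_virus`, the word-boundary rule, and the optional `ds` prefix on RNA are assumptions made for this sketch; `subgroup` is left out because it is probably not limited to viruses.

```python
import re

# Elements taken from the list in this issue. The matching rules
# (case-sensitive, whole words, optional "ds" prefix on RNA) are
# assumptions for this sketch, not the parser's real grammar.
VIRUS_ELEMENTS = re.compile(
    r"\b(?:NPV|(?:ds)?RNA|virophage|(?:alpha|beta)?satellites?|particle)\b"
)


def looks_like_virus(name: str) -> bool:
    """Return True if the name string contains one of the listed elements."""
    return VIRUS_ELEMENTS.search(name) is not None


if __name__ == "__main__":
    for example in (
        "Papilio polyxenes NPV",
        "Alternaria zinniae dsRNA element",
        "Mouse Intracisternal A-particle",
        "Subgroup B",
    ):
        print(example, "->", looks_like_virus(example))
```

A name flagged this way would be kept whole instead of being reduced to a canonical form.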

@dimus dimus commented May 13, 2015

Thanks, Katja. I will check these words against the GN names to see whether they cause any unexpected consequences (unlikely); everything that is safe will go into the next version of the parser.

@dimus dimus commented May 13, 2015

RNA occurs in surrogate names, so it is a bit dangerous to say that every name containing the word RNA is a virus. It is probably OK, though, to simply refuse to parse such a name when there is no other indication that it is a virus:

Candida albicans RNA_12C-1
Candida albicans RNA_12C-2
Candida albicans RNA_12C-3
Candida albicans RNA_CTR0-1
Candida albicans RNA_CTR0-2
Candida albicans RNA_CTR0-3
Candida albicans RNA_GC75-1
Candida albicans RNA_GC75-2
Candida albicans RNA_GC75-3
Candida albicans RNA_SC5314-1
Candida albicans RNA_SC5314-2
Candida albicans RNA_SC5314-3
Candida tropicalis RNA_Ct1-1
Candida tropicalis RNA_Ct1-2
Candida tropicalis RNA_Ct2-1
Candida tropicalis RNA_Ct2-2
Candida tropicalis RNA_Ct3-1
Candida tropicalis RNA_Ct3-2
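
The rule described here can be sketched roughly as follows. This is only an illustration of the decision, not the parser's code: the function name `decide`, the three return labels, and the exact list of "other indications" (taken from the elements named in this issue) are all assumptions.

```python
import re

# Other virus indications, limited to the elements named in this issue.
# Treating exactly these words as "other indications" is an assumption.
OTHER_VIRUS_HINTS = re.compile(
    r"\b(?:NPV|virophage|(?:alpha|beta)?satellites?|particle)\b"
)


def decide(name: str) -> str:
    """Classify a name string as 'virus', 'not parsed', or 'parse'."""
    if OTHER_VIRUS_HINTS.search(name):
        return "virus"
    if "RNA" in name:
        # RNA alone may be a surrogate name, so refuse to parse
        # instead of asserting that the name is a virus.
        return "not parsed"
    return "parse"


if __name__ == "__main__":
    for example in (
        "Candida albicans RNA_12C-1",
        "Double-stranded RNA satellites",
        "Candida albicans",
    ):
        print(example, "->", decide(example))
```

With this rule the Candida surrogate names are left unparsed, so nothing is stripped from them, while a name that also contains a stronger hint is still treated as a virus.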

@dimus dimus commented May 14, 2015

I am not sure what to do with subgroup. Here are some examples:

xLevivirus subgroup I
plant rhabdoviruses subgroup B
Zelotes mayanus subgroup
Teuchophorus notabilis subgroup Meuffels & Grootaert 2004
Subgroup I Geminivirus
Sericania mimica subgroup Kobayashi & Fujioka 2008
Pipistrellus (Hypsugo) imbricatus subgroup

I will leave subgroup as is until I understand how to deal with it.
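
To make the difficulty concrete, here is a small sketch that looks at what follows the word in the examples above. It is a heuristic for illustration only: the helper names and the "ends with a four-digit year" test for authorship are assumptions, not anything the parser does.

```python
import re

SUBGROUP = re.compile(r"\bsubgroup\b", flags=re.IGNORECASE)
# Assumed heuristic: a trailing year between 1700 and 2099 suggests authorship.
YEAR_AT_END = re.compile(r"\b(?:1[7-9]\d\d|20\d\d)$")


def subgroup_tail(name: str):
    """Return the text after the first 'subgroup', or None if it is absent."""
    match = SUBGROUP.search(name)
    if match is None:
        return None
    return name[match.end():].strip()


def has_authorship_like_tail(name: str) -> bool:
    """True when the text after 'subgroup' ends with a year."""
    tail = subgroup_tail(name)
    return bool(tail) and YEAR_AT_END.search(tail) is not None


if __name__ == "__main__":
    for example in (
        "xLevivirus subgroup I",
        "Teuchophorus notabilis subgroup Meuffels & Grootaert 2004",
        "Zelotes mayanus subgroup",
    ):
        print(example, "->", has_authorship_like_tail(example))
```

Some of the examples carry what looks like authorship after the word and others carry a label such as I or B, so subgroup by itself does not say whether a name is a virus.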

@dimus dimus commented May 14, 2015

Everything except subgroups is covered in v3.1.10

@dimus dimus closed this May 14, 2015