Author strings with y in their name are not parsed #93

Closed
dimus opened this issue Dec 18, 2020 · 2 comments

dimus commented Dec 18, 2020

created by @gdower at https://gitlab.com/gogna/gnparser/-/issues/93

In the scientific name Struthiopteris fallax (Lange) S.Molino, Gabriel y Galán & Wasowicz, the "y Galán & Wasowicz" component becomes an unparsed tail. I realize this might not be easy to fix because it could produce a lot of false matches in BHL, and we were able to resolve the issue on our end, so no worries if it can't be fixed.

https://parser.globalnames.org/?q=Struthiopteris+fallax+%28Lange%29+S.Molino%2C+Gabriel+y+Gal%C3%A1n+%26+Wasowicz

{
  "parsed": true,
  "quality": 3,
  "qualityWarnings": [
    [3,"Unparsed tail"]
  ],
  "verbatim": "Struthiopteris fallax (Lange) S.Molino, Gabriel y Galán \u0026 Wasowicz",
  "normalized": "Struthiopteris fallax (Lange) S. Molino \u0026 Gabriel",
  "cardinality": 2,
  "canonicalName": {
    "full": "Struthiopteris fallax",
    "simple": "Struthiopteris fallax",
    "stem": "Struthiopteris fallax"
  },
  "authorship": "(Lange) S. Molino \u0026 Gabriel",
  "details": [
    {
      "genus": {
        "value": "Struthiopteris"
      },
      "specificEpithet": {
        "value": "fallax",
        "authorship": {
          "value": "(Lange) S. Molino \u0026 Gabriel",
          "basionymAuthorship": {
            "authors": [
              "Lange"
            ]
          },
          "combinationAuthorship": {
            "authors": [
              "S. Molino",
              "Gabriel"
            ]
          }
        }
      }
    }
  ],
  "positions": [
    ["genus",0,14],
    ["specificEpithet",15,21],
    ["authorWord",23,28],
    ["authorWord",30,32],
    ["authorWord",32,38],
    ["authorWord",40,47]
  ],
  "surrogate": false,
  "virus": false,
  "hybrid": false,
  "bacteria": false,
  "unparsedTail": " y Galán \u0026 Wasowicz",
  "nameStringId": "ac36333c-ad8f-5389-abe4-fe1bea5c7a92",
  "parserVersion": "v0.14.1"
}
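
A minimal sketch of how a consumer of this output might flag such records on their own side. It relies only on the `unparsedTail` field as it appears in the v0.14.1 output above; the function name `find_unparsed_tail` is hypothetical and is not part of gnparser.

```python
from typing import Optional


def find_unparsed_tail(record: dict) -> Optional[str]:
    """Return the trimmed unparsed tail of a parsed-name record, or None.

    Hypothetical helper: it assumes the record is a dict decoded from
    JSON shaped like the output above (an optional "unparsedTail" key).
    """
    tail = (record.get("unparsedTail") or "").strip()
    return tail or None


# Abbreviated copy of the record above, already decoded from JSON.
record = {
    "parsed": True,
    "quality": 3,
    "qualityWarnings": [[3, "Unparsed tail"]],
    "authorship": "(Lange) S. Molino & Gabriel",
    "unparsedTail": " y Galán & Wasowicz",
}

tail = find_unparsed_tail(record)
if tail is not None:
    print("authorship was truncated; leftover text:", tail)
```

A record with no `unparsedTail` key, or with an empty one, yields `None`, so the same check can be run over a whole batch of results.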

Re: CatalogueOfLife/testing#2

dimus self-assigned this Dec 18, 2020

dimus commented Dec 18, 2020

created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/43

It looks like this pattern, if not common, is at least not unique:

Caloptenopsis crassiusculus (Martínez y Fernández-Castillo, 1896)
Caluromytrema martindelcampoi Lamothe y Pineda, 1989
Capillaria xochimilcensis Caballero y Zerecero, 1943
Carabus (Tanaocarabus) hendrichsi Bolvar y Pieltain, Rotger & Coronado-G 1967
Didymosella acutirostris Faura y Sans & Canu 1917
Dufourea fuenti Dusmet y Alonso, 1935

So it has to be solved in a general way.

dimus commented Dec 18, 2020

created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/44

I think it is fixable. I suspect that "y" is a rare prefix, so maybe I need to start a list of verbatim authors.

dimus closed this as completed Dec 18, 2020
dimus added a commit that referenced this issue Dec 18, 2020
Sometimes 'y' is used instead of '&' as an author separator.
We parse it with a warning of level 2.
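
The behaviour described in the commit note can be illustrated with a rough sketch. This is not gnparser's implementation; it is a simplified regular-expression version written under stated assumptions: the input is a bare author team (no year, no parentheses), "y" counts as a separator only when it stands alone between whitespace, and the function name `split_author_team` and the warning text are hypothetical. Only the warning level 2 comes from the commit note.

```python
import re
from typing import List, Tuple

# Author separators: a comma, an ampersand, or a standalone "y".
# Requiring whitespace on both sides of "y" keeps names such as
# "Murray" or "Gray" intact.
_SEPARATORS = re.compile(r"\s*[,&]\s*|\s+y\s+")
_Y_SEPARATOR = re.compile(r"\s+y\s+")

# Level 2 is taken from the commit note; the message is made up.
_Y_WARNING = (2, "'y' used as an author separator")


def split_author_team(team: str) -> Tuple[List[str], List[Tuple[int, str]]]:
    """Split a bare author team into authors plus any warnings."""
    warnings = [_Y_WARNING] if _Y_SEPARATOR.search(team) else []
    authors = [part for part in _SEPARATORS.split(team.strip()) if part]
    return authors, warnings


authors, warnings = split_author_team("S.Molino, Gabriel y Galán & Wasowicz")
print(authors)   # all four author words are kept; nothing is left as a tail
print(warnings)  # one level-2 warning, because "y" acted as a separator
```

Attaching a warning rather than silently accepting "y" keeps the concern raised in the issue visible: downstream users can still filter out matches that came from this looser rule if it produces false positives in their data.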