Systema Dipterorum (id 1101): test report #127

Open
2 tasks done
yroskov opened this issue May 14, 2021 · 62 comments

Comments


yroskov commented May 14, 2021

Version 3.1 received 2021-05-12.
Imported to prod: https://data.catalogueoflife.org/dataset/1101/about

(previous reports are in #6)

  • Metadata: updated

  • Sector: order Diptera minus 4 families

  1. delete CIPA sectors
  2. re-establish and sync CCW (superfamily Tipuloidea)
  3. establish SD sector as order Diptera
  4. block 4 families:
    Cylindrotomidae
    Limoniidae
    Pediciidae
    Tipulidae
  5. sync

As a result, the assembly tree looks like this:

image


yroskov commented May 14, 2021

TASKS

image

ACC-ACC species (different authors)

image

image

image

image

image

image

image

image

  • ACC-ACC species (same authors)

image

Resolved 2021-05-17

image

  • Identical genus (437 names); Identical subgenus (743 names) - cases of "split" taxa in the classification remain unresolved. Need attention from the authors.


yroskov commented May 17, 2021

Synced with 4 blocked families in the assembly tree, 2021-05-17

4 blocked families need to be confirmed after the sync: confirmed, OK.


yroskov commented Jun 14, 2021

See CatalogueOfLife/data#273

Both the species and its parent genus are blocked in CoL.
Systema Dipterorum re-synced 2021-06-14.


yroskov commented Jun 17, 2021

Reported by @olafbanki 2021-06-16:

On Diptera I have a question. I see quite some family names that closely resemble each other (Archisargidaae & Archisargidae) and where duplicate genera exist (e.g. Archirhagio). I attach a screen shot from catalogueoflife.org as example. Looks like there are some data quality issues. What is your take on this?

  • Misspelled family name? Archisargidaae (1 sp Archirhagio obscurus Rohdendorf, 1938) vs Archisargidae (59 spp) - for attention of Neal.

In CoL: Archisargidaae "taxon blocked" in assembly tree 2021-06-17. Synced.


yroskov commented Jun 17, 2021

Reported by @olafbanki 2021-06-16:

In addition the family Archisargidae is extinct, but it does not have a flag.

  • CoL has no "extinct" flag for SD species. - NOW FIXED ON SPECIES LEVEL IN SD. EXTINCT FLAG FOR PARENT TAXA SHOULD BE CALCULATED BY THE SOFTWARE
    @gdower, could we use data from the SD field "Epoch" (Amber, Baltic Amber, Cretaceous, etc.)? Would it be correct to apply the "extinct" flag to every species that has a non-empty Epoch field?

We agreed on the following action plan: (1) modify the script to take the available values from the Epoch field of Systema Dipterorum V3.1_2021-05-12 and fill in the start & end periods in CoL (where necessary) for 4K accepted species; (2) apply the “extinct” flag to species with non-empty values in the Epoch field.
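
The flagging rule in this plan can be sketched as follows. This is a minimal illustration in Python, not the actual conversion script: the `SpeciesRecord` class, its field names and the sample rows are assumptions made for the example; only the rule itself (a non-empty Epoch value means extinct) comes from the plan.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeciesRecord:
    # Hypothetical record layout; the real SD export has many more fields.
    name: str
    epoch: Optional[str]  # raw value of the "Epoch" field, blank for extant species
    extinct: bool = False


def flag_extinct(records: List[SpeciesRecord]) -> List[SpeciesRecord]:
    """Mark a species as extinct when its Epoch value is non-empty."""
    for rec in records:
        # Whitespace-only values are treated as empty.
        rec.extinct = bool(rec.epoch and rec.epoch.strip())
    return records


# Made-up sample rows; "Baltic Amber" is one of the Epoch values named in this thread.
sample = flag_extinct([
    SpeciesRecord("Examplegenus fossilis", "Baltic Amber"),
    SpeciesRecord("Examplegenus vivus", ""),
    SpeciesRecord("Examplegenus blankus", "   "),
])
print([(r.name, r.extinct) for r in sample])
```

Treating whitespace-only values as empty is a design choice of this sketch; the source spreadsheet may need different cleaning.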


yroskov commented Jun 18, 2021

  • In the Tree "Chironomus mixtus Holmgren, 1869: 45. TL: Norway." as a family. BLOCKED.
    image

  • ISSUES

  • TASKS

V3.1_2021-05-12 synced 2021-06-18

Preview 2021-06-18 https://preview.catalogueoflife.org/ looks good.
New spp stats: 145933 extant & 3759 extinct spp.

  • Parent taxa with all extinct children species are not marked as extinct (i.e. no dagger with genus, family, etc.)
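
One way the software could calculate the flag for parent taxa is sketched below. It is an assumption about the desired behaviour, not ChecklistBank code: a higher taxon is treated as extinct only when it has children and every one of them is extinct. The taxon names are placeholders.

```python
from typing import Dict, List


def derive_extinct(children: Dict[str, List[str]],
                   species_extinct: Dict[str, bool]) -> Dict[str, bool]:
    """Return an extinct flag for every taxon.

    children        maps a parent taxon to its direct children.
    species_extinct holds the flags already known at species level.
    """
    flags = dict(species_extinct)

    def visit(node: str) -> bool:
        if node not in flags:
            kids = children.get(node, [])
            # Build the list first so every child is visited (no short-circuit).
            flags[node] = bool(kids) and all([visit(k) for k in kids])
        return flags[node]

    for parent in children:
        visit(parent)
    return flags


# Placeholder tree: FamilyA contains only extinct species, FamilyB is mixed.
tree = {
    "FamilyA": ["GenusA1", "GenusA2"],
    "GenusA1": ["sp1", "sp2"],
    "GenusA2": ["sp3"],
    "FamilyB": ["GenusB1"],
    "GenusB1": ["sp4", "sp5"],
}
known = {"sp1": True, "sp2": True, "sp3": True, "sp4": True, "sp5": False}
flags = derive_extinct(tree, known)
print({t: flags[t] for t in tree})
```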


yroskov commented Jul 1, 2021

ITIS offers global checklist for Culicidae family:
#8 (comment)

Response from SD: keep Culicidae from SD.


yroskov commented Sep 7, 2021

A set of correctly assigned species binomials is placed under an incorrect genus (a false parent) in the classification: gbif/checklistbank#187
see also CatalogueOfLife/backend#1052

Example:
image

A simple re-sync did not fix the problem.

Two hypotheses (@gdower):

(1) The "Union" option, which was used for the sector Diptera, may cause the problem. = No
(2) The name Trigonometopus (Culex) canus (as in the dataset) may cause the problem. = Yes
Generalised problem: the same subgenus name may appear in different genera in SD:

  • single accepted species name Trigonometopus (Culex) canus and rest of species names are in Culex (Culex)
  • single accepted species name Bactrocera (Callantra) axanthina and rest of species names are in Dacus (Callantra)
    etc.
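
Such cases can be listed before import with a small check like the one below. It is a sketch only: it assumes names follow the "Genus (Subgenus) epithet" pattern, and the epithets marked as illustrative are not taken from the SD dataset.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches "Genus (Subgenus) epithet..."; names without a subgenus are skipped.
NAME_RE = re.compile(r"^([A-Z][a-z]+) \(([A-Z][a-z]+)\) [a-z]")


def homonymic_subgenera(names: Iterable[str]) -> Dict[str, List[str]]:
    """Return subgenus names that occur under more than one genus."""
    genera_by_subgenus = defaultdict(set)
    for name in names:
        match = NAME_RE.match(name)
        if match:
            genus, subgenus = match.groups()
            genera_by_subgenus[subgenus].add(genus)
    return {sg: sorted(g) for sg, g in genera_by_subgenus.items() if len(g) > 1}


names = [
    "Trigonometopus (Culex) canus",
    "Culex (Culex) pipiens",            # illustrative member of Culex (Culex)
    "Bactrocera (Callantra) axanthina",
    "Dacus (Callantra) hypothetica",    # hypothetical epithet, for illustration
    "Trigonometopus canus",             # no subgenus: ignored by the check
]
print(homonymic_subgenera(names))
```

The output of such a check could serve as the report for GSD authors, without changing any placement.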

Experiment 1: do not use "Union" in assembly.
Steps to repair via re-assembly of sectors:
(1) delete all sectors in Diptera
(2) delete entire Diptera subtree
(3) establish order Diptera from SD
(4) block 4 families (which we'll take from CCW):
Cylindrotomidae
Limoniidae
Pediciidae
Tipulidae
(5) add 4 CCW families as sectors in Diptera (skip superfamily Tipuloidea as a rank between family & order)
(6) sync SD and CCW

Bad news: the above steps did not fix the problem:

image

Experiment 2: fix "incorrect" name.
Steps to repair via complex decision over species binomial:
(1) complex decision Trigonometopus (Culex) canus --> Trigonometopus canus
(2) re-sync SD

Result successful: subgenus Culex is correctly placed in genus Culex
image
image

So the sync process misinterprets the original placement of "homonymic" subgenera.
The source dataset may contain a mistake (i.e. an incorrect subgenus in a set of species from another genus), but it may also use homonymic subgenera on purpose. I would expect the software to generate a report on detected problems for the GSD authors, but also to keep the placement of a subgenus within its genus as it occurs in the original dataset.


yroskov commented Sep 7, 2021

Report for Identical Subgenus contains 743 names:
image

CoL cannot resolve all cases on our end.


yroskov commented Sep 8, 2021

Another strange case of subgenus interpretation in the hierarchy: four species with assigned subspecies in the name have been placed in NotAssignedSubgenus node:
image

Can it be addressed in the ChecklistBank code, @mdoering? (@gdower)


yroskov commented Sep 14, 2021

Subgenus issues fixed in the code.

SD synced 2021-09-14.

2021-09-16: it looks like the technical problem is resolved and species are placed in the correct genera.
Waiting for a new version of the checklist from the SD team without "homonymic" subgenera in different genera.


yroskov commented Feb 18, 2022

Version 3.6 received 2022-02-14

Imported to DEV https://data.dev.catalogueoflife.org/dataset/1101/classification


yroskov commented Feb 18, 2022

Checks of the view 2022-02-18
(few notes)

image

  • A single case where species is placed as a family: = BLOCKED
    image
    image

In the source (SPECIES table):

Family Full Name Line Full Name Line Range Full Species Line
Chironomidae Chironomus mixtus Holmgren, 1869: 45. TL: Norway. Bear I. (HT F NRS). Chironomus mixtus Holmgren, 1869: 45. TL: Norway. Bear I. (HT F NRS). Orthocladius (Orthocladius) mixtus. (PA: PA)

*)
image

  • 18 subfamilies, 69 genera (without square brackets) & 25 genera in brackets are outside families:
    image

CONCLUSION: set of species in these fam & gen have no parent families in the source file (blank values). CLB INTERPRETATION IS CORRECT

Checked few against the source (SPECIES):

Ceratopogoninae have 3 parent families, plus blank family with Serromyia errata:
image

Chironominae have 2 parent families, plus blank family with 3 spp Nandeva pudens, Parachironomus inageheus, Polypedilum (Polypedilum) xianjuensis :
image

Chironomiinae (is it different from the above? - check with Neal) has 1 parent family Chironomidae, plus blank family with species Yaeprimus balteatus:
image

Clitellariinae have 2 parent families, plus blank family with species Adoxomyia hasbenlii
image

Member

mdoering commented Feb 19, 2022

Species with uncertain placement in the genus: the original genus goes in square brackets, as agreed with Neal. (Previously, such names had a question mark in the genus field, e.g. ? stercoreus.)
How does CLD deal with such names?

I don't think it is a good idea to feed names with genera enclosed in square brackets into CLB to indicate uncertain placement. This is a very specific convention used only by SD; it is known neither to other users nor to the system itself.

We should try to change those names and rather follow the guidelines of ColDP, where we have discussed this problem and how to deal with it in a consistent way so both CLB and other users understand the data correctly.

Looking at the verbatim Taxon record of that example I see various problems:

  • col:genus is given as the classification (this is a Taxon, not Name record). If it is uncertain don't do that and better remove the field.
  • col:species is given as an epithet only, the classification expects a binomial. Remove the field.
  • col:provisional should be used to indicate the uncertain placement

In the linked Name record I would suggest to simply remove the square brackets.


yroskov commented Feb 22, 2022

In the linked Name record I would suggest to simply remove the square brackets.

I am comfortable with presentation of an original genus in square brackets where a new placement in a genus is not resolved yet. If CLB allows search for names with square bracket, I'll be happy to mark these accepted names as Provisionally Accepted in the CoL. See: CatalogueOfLife/backend#1112

Member

mdoering commented Feb 23, 2022

Square brackets around genera are not something we support at this stage. They will be considered bad data and are likely to have impacts down the line when we assemble COL, e.g. when we make sure to have a genus record for every accepted species. Don't be surprised if you find new genera with brackets in COL.


yroskov commented May 17, 2022

Version 3.6 (2022-02-14) imported to the PROD 2022-05-17

  • Imported 177,088 spp
    image

  • Metadata: OK (ver. 3.6 Feb 2022, 2022-02-14)

  • Sectors: OK
    Blocked families Cylindrotomidae, Limoniidae, Pediciidae, Tipulidae in Systema Dipterorum (taken from CCW)
    pre-synced 2022-05-17


yroskov commented May 17, 2022

ISSUES assessed 2022-05-17

image


yroskov commented May 17, 2022

TASKS as 2022-05-17
image

  • Broken decisions, 3506; rematch all = 3501 remain broken; deleted all.

!Remember! ACC=ACC sp (diff auth):
all names with genus in square brackets = Prov Acc
? what to do with names with authorstrings without year (they may have different synonyms - keep?)
image


yroskov commented May 19, 2022

Version 3.6 (2022-02-14), new crawl iteration imported to the PROD 2022-05-19

  • Imported 169,487 spp (vs 177,088 in previously crawled version)
    image

  • Metadata: OK

  • Sectors: families Cylindrotomidae, Limoniidae, Pediciidae, Tipulidae in Systema Dipterorum (taken from CCW)
    should be blocked again = FIXED


yroskov commented May 19, 2022

ISSUES assessed 2022-05-19 (many previous decisions remain in place)
image


yroskov commented May 19, 2022

Investigating bare names, 8,455
https://www.checklistbank.org/catalogue/3/dataset/1101/workbench?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=100&offset=0&status=bare%20name

? ambigua Pankratova, 1950 = ok
? arcudae Botnariuc, 1956 = ok
? delicatula Botnariuc & Cure, 1956 = ok

However, many names become "bare" for unclear (yet) reasons:
Mesembrinella dorsimacula Aldrich, 1922, it is (Available, Valid) Current Status
Komisca nanensis (Chaiwong, Sukontason & Sukontason, 2009), it is (Available, Valid) Changed Combination / Rank
Abago rohdendorfi Grunin, 1966 (Available, Invalid) Junior SECONDARY Homonym


yroskov commented May 19, 2022

TASKS as 2022-05-19
image

  • Broken decisions, 1082; deleted all.
  • Split genera, tribes, subfamilies only partially marked as "prov Acc"

Resolved 2022-05-19:
image


yroskov commented May 19, 2022

  • All accepted names with a square bracket [ need to be flagged as provisionally accepted. Examples: [Aricia] coronata (Holmgren, 1883); [Cordylura] marginipennis Gimmerthal, 1847
    I cannot do it in CLB: there are no such names in the ISSUE reports, and the Workbench search does not return them either.

NEW SEARCH OPTION IN Workbench @clb: RegEx Search (Regular Expression Search)
image

#197


yroskov commented May 20, 2022

Crawl iteration with pre-flagged "prov acc" names imported 2022-05-19 & 20.
(The main problem: the Tasks page failed to display in CLB (spinning progress forever), and the Classification page either also failed or was too slow - multiple imports. = now resolved)

  • The crawler now automatically flags all species with genera in square brackets as "Prov Acc" (3,275 prov acc spp in this version).

  • ~700 genera with square brackets were flagged as "Prov Acc" via decisions. (Steps: workbench - filter for acc genera: all are at the end of the list, re-sort - set 700 lines per page - apply bulk decision)


yroskov commented May 20, 2022

  • Imported: 169,492 spp
  • Sectors: families Cylindrotomidae (blocked), Limoniidae (blocked), Pediciidae (blocked), Tipulidae (not blocked) in Systema Dipterorum (taken from CCW) = FIXED

TASKS as 2022-05-20
(all previous decisions re-applied successfully, new decisions added)

image

Synced 2022-05-20


yroskov commented Oct 12, 2022

ISSUES assessed 2022-10-12
image

  • Unparsable authorship, 12: fixed in CLB & reported to Neal.


yroskov commented Oct 12, 2022

TASKS
image

  • Broken decisions, 3699; rematch all, Request failed with status code 503 = broken 4054; rematch all, Request failed with status code 503 = broken 5178; ... ; deleted all
    @mdoering, it seems to me that something is wrong with the "rematch all" tool for broken decisions.

image

  • Identical genus, 109 pairs: remain unresolved

Synced 2022-10-25

@mdoering
Member

ISSUES assessed

Quite a lot of serious invalid ids and duplicate ids. Please investigate the cause; this has the potential to cause lots of problems.


yroskov commented Oct 12, 2022

@gdower, could you please have a look at the "technical" issues among those highlighted by @mdoering:
Id Not Unique, 135
Accepted Id Invalid, 725
Name Id Invalid, 8934


yroskov commented Oct 25, 2022

  • Failed to give ProvAcc status to genera in square brackets (e.g. [Archalia]) - RegEx Search mode.
    Formula that works (see it in the source):
    image
    ^\[[A-Z][a-z]+\]$
    Workbench cannot handle the request to apply ProvAcc status even with the 50 names per page setting. = See below: it works, but it is not shown in the interface.
    Requested to add this to the conversion code for SD = GO: it cannot be done via the conversion code; CLD generates genera.
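
For checking name lists offline, the bracketed-genus pattern can be written in Python's `re` syntax, where literal square brackets must be backslash-escaped. This is a sketch under that assumption; the regular-expression dialect used by the Workbench RegEx Search may differ, and the species-level pattern is an addition for illustration.

```python
import re

# A uninomial fully enclosed in square brackets, e.g. "[Archalia]".
BRACKETED_GENUS = re.compile(r"^\[[A-Z][a-z]+\]$")
# A species name whose genus is in square brackets, e.g. "[Aricia] coronata ...".
BRACKETED_SPECIES = re.compile(r"^\[[A-Z][a-z]+\] [a-z]")


def is_bracketed_genus(name: str) -> bool:
    return BRACKETED_GENUS.match(name) is not None


def has_bracketed_genus(name: str) -> bool:
    return BRACKETED_SPECIES.match(name) is not None


for candidate in ["[Archalia]", "Archalia", "[Aricia] coronata (Holmgren, 1883)"]:
    print(candidate, is_bracketed_genus(candidate), has_bracketed_genus(candidate))
```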

GO & YR 2022-11-14: decisions in RegEx Search can be set up, but they are not displayed in the interface.

Experiment of 2022-11-14:
Seems, there are 715 genera in square brackets in total:
1st page of 500 per page: [Ablabesmyia] - [Phoraea]
2nd page [Phorbia] - [Zygoneura]
Prov Acc status applied to all 715 genera in square brackets. No progress shown. No decisions appear in the report. However, decisions shown in Project-CoL-Decisions (see mode = update):
image

Synced 2 hours later, 2022-11-14

Bracketed genera checked in PREVIEW 2022-11-15:

Archalia - 1 in the RegEx report, 0 in the PREVIEW
Actia - 1 in the report, 1 from SD in the PREVIEW, accepted
Bibio - 4 in the report, 2 in the PREVIEW, both marked as prov acc
Ceroxys - 2 in the report, 1 in the PREVIEW, accepted
Dinera - 1 in the report, 1 in the PREVIEW, accepted
Lydella - 1 in the report, 1 in the PREVIEW, accepted
Voriella - 1 in the report, 1 in the PREVIEW, accepted

There are only 19 ProvAcc genera in Diptera in the PREVIEW.

No bracketed genera in the PREVIEW: either all brackets were removed or, more probably, the bracketed genera did not pass into the final product. @gdower, could this be related to the issue of broken parent-child relationships and invalid and duplicated ids?

Anyway, the experiment of 2022-11-14 did not change the number of accepted species in SD@CoL.


yroskov commented Nov 15, 2022

Experiment of 2022-11-16:

Plan: remove brackets from species names, give them ProvAcc status before the import in CLB. Imported 2022-11-16

  • Imported: 171,235 spp (vs 171,235 spp)
  • Metadata: OK
  • Classification: the same problem with subfamilies and many genera outside in the Tree root
  • Sectors: CCW families need to be blocked in SD again: Cylindrotomidae, Limoniidae, Pediciidae, Tipoulidae, Tipulidae, plus subfamily Tipulinae & genus Tipula in the root of Diptera = FIXED

TASKS remain unchanged (i.e. resolved)

Synced 2022-11-16

Results in the PREVIEW 2022-11-18:

Previously bracketed genera checked:

| Genus | RegEx report 2022-11-14 (as bracketed genus) | PREVIEW 2022-11-15 | PREVIEW 2022-11-18 |
|---|---|---|---|
| Archalia | 1 | 0 | 1, accepted |
| Actia | 1 | 1 from SD, accepted | 1 from SD, accepted |
| Bibio | 4 | 2, prov acc | 5, of them 4 accepted, 1 prov acc |
| Ceroxys | 2 | 1, accepted | 3, of them 3 accepted |
| Dinera | 1 | 1, accepted | 2, of them 1 accepted, 1 prov acc |
| Lydella | 1 | 1, accepted | 2, of them 2 accepted |
| Voriella | 1 | 1, accepted | 2, of them 1 accepted, 1 prov acc |


yroskov commented Dec 2, 2022

Misspellings reported to Neal, 2022-12-02:

Diptera> Ceratopogonidaae vs Ceratopogonidae
Diptera> Cedratopogonidae
Diptera> ceratopogonidae
Diptera> Phoridase vs Phoridae
Diptera> Sphaerroceridae vs Sphaeroceridae
Diptera> Mycxetophilidae vs Mycetophilidae
Diptera> Dolichoposidae vs Dolichopodidae
Diptera> Limoniidae> Limoninae vs Limoniinae
Diptera> Liomniidae vs Limoniidae
Diptera> Tephritidae> Tryptetinae vs Trypetinae
Diptera> Cecidomyiidae> Porrocondylinae vs Porricondylinae
Diptera> Cecidomyiidae> Porriconylinae vs Porricondylinae
Diptera> Mycetophilidae> Leeinae vs Leiinae
Diptera> Cecidomyiidae> Cercidomyiinae vs Cecidomyiinae
Diptera> Chironomidsae vs Chironomidae
Diptera> Stratiomyidae> Stratiomyiinae vs Stratiomyinae
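
Candidate misspellings of this kind can be found automatically by comparing all names of one rank pairwise. The sketch below uses Python's standard `difflib`; the 0.85 threshold and the short sample list are assumptions, and every hit still needs a manual check because valid names can also be very similar.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Iterable, List, Tuple


def near_duplicates(names: Iterable[str],
                    threshold: float = 0.85) -> List[Tuple[str, str]]:
    """Return pairs of distinct names whose similarity ratio reaches the threshold."""
    unique = sorted(set(names))
    pairs = []
    # Pairwise comparison is quadratic; fine for family or subfamily lists.
    for a, b in combinations(unique, 2):
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((a, b))
    return pairs


families = ["Phoridae", "Phoridase", "Chironomidae", "Chironomidsae", "Tipulidae"]
for pair in near_duplicates(families):
    print(pair)
```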


yroskov commented Dec 2, 2022

Systema Dipterorum 3.10, Sep 2022

TASKS 2022-12-02

image

Re-synced 2022-12-02


yroskov commented May 16, 2023

2023-05-16: SD is in update4.2/revert3.10 process. (checklist reverted back to 3.10, metadata 4.2). Must not be synced until repair!


yroskov commented May 17, 2023

Systema Dipterorum 4.2, May 2023, received 2023-05-13; imported to prod 2023-05-17

  • Imported: 172,050 spp (vs 171,235 spp in 3.10, Sep 2022)
  • Metadata: OK
  • Classification: 7 subfamilies and many genera outside families in the Tree root
  • Sectors: CCW families Cylindrotomidae, Limoniidae, Pediciidae, Tipulidae should be blocked in Systema Dipterorum; plus subfamily Tipulinae & genus Tipula in the root of Diptera.

image

ISSUES

image


yroskov commented May 17, 2023

SD 4.2, May 2023

TASKS

image

Not synced


yroskov commented Jun 12, 2023

Systema Dipterorum 4.2.2, May 2023, received 2023-05-27; imported to prod 2023-05-30

  • Imported: 171,937 spp (vs 172,050 spp in 4.2, May 2023)
  • Metadata: OK, ver. 4.2.2 -> 4.2.2, May 2023
  • Classification: 7 subfamilies and many genera outside families in the Tree root
  • Sectors: CCW families Cylindrotomidae, Limoniidae, Pediciidae, Tipulidae should be blocked in Systema Dipterorum; plus subfamily Tipulinae & genus Tipula in the root of Diptera. = all families have no block = FIXED 2023-06-12

image

TASKS

image

  • Broken decisions, 369: deleted all

  • Genera with square brackets blocked

  • Split subgenera - failed to resolve = the sector synced without rank subgenus

Resolved 2023-06-12:

image

Synced 2023-06-12 (without rank subgenus)


yroskov commented Jun 15, 2023

2023-06-15: temporary names such as *FChironominae (start as *F) deleted as a node (“taxon”) in Assembly - Draft. All children attached to the next parent. Sync is not involved (i.e. such names will be back with next sync).


yroskov commented Nov 13, 2023

Both names blocked in CoL. Reported to Neal.

Systema Dipterorum re-synced 2023-11-13.

@yroskov yroskov mentioned this issue Nov 13, 2023
17 tasks

yroskov commented Nov 20, 2023

https://www.checklistbank.org/catalogue/3/dataset/1101/workbench?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=50&offset=0&q=%5Cn

image

Re-synced 2023-11-20


yroskov commented Dec 5, 2023

  • Uninomials with prefixes *F, *T (e.g. *FChironominae, *FChloropinae, *FTephritini, *TLestremiinae) = blocked 2023-12-05

image

Systema Dipterorum 4.2.2, May 2023 re-synced 2023-12-05

After the check of PREVIEW 2023-12-06:
*F & *T names were blocked as taxa, not names.

Test names:
*FChironominae (next parent subfamily Chironominae)
Kribiopelma albidum Kieffer, 1923
Kribiobius modestus Kieffer, 1923

Decision "Ignore" applied instead of "Block". Synced 2023-12-07. That works: only the names were blocked and the child taxa were synced into CoL.
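
The difference observed between the two decisions can be modelled on a toy tree. This is a sketch of the behaviour described in this comment, not of the ChecklistBank implementation: "block" is assumed to drop a taxon together with its descendants, while "ignore" drops only the name and attaches the children to the next parent. Placing the two test species directly under *FChironominae is also an assumption.

```python
from typing import Dict, Optional


def apply_decisions(parent_of: Dict[str, Optional[str]],
                    decisions: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return the child -> parent map left after applying the decisions."""

    def is_blocked(node: Optional[str]) -> bool:
        # A node is blocked if it or any ancestor carries a "block" decision.
        while node is not None:
            if decisions.get(node) == "block":
                return True
            node = parent_of.get(node)
        return False

    def effective_parent(node: str) -> Optional[str]:
        # Skip over ignored ancestors to find the next surviving parent.
        parent = parent_of.get(node)
        while parent is not None and decisions.get(parent) == "ignore":
            parent = parent_of.get(parent)
        return parent

    return {
        node: effective_parent(node)
        for node in parent_of
        if not is_blocked(node) and decisions.get(node) != "ignore"
    }


tree = {
    "Chironominae": None,
    "*FChironominae": "Chironominae",
    "Kribiopelma albidum": "*FChironominae",
    "Kribiobius modestus": "*FChironominae",
}
print(apply_decisions(tree, {"*FChironominae": "block"}))
print(apply_decisions(tree, {"*FChironominae": "ignore"}))
```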


yroskov commented Dec 5, 2023

Remains unresolved. An attempt to block subgenus as a rank (1) removed all subgenera from the tree & species names and (2) created "self-synonymy" (identical ACC-SYN). The blocking decision for subgenus was reversed 2023-12-05. Duplicated subgenera are back (sic! PREVIEW 2023-12-07).

The list was sent to Neal 2023-12-05.


yroskov commented Feb 8, 2024

Tests of Systema Dipterorum ver. 4.5, 2023-11-16 processed via TW by DD vs data by GO: #244


yroskov commented Feb 8, 2024

Systema Dipterorum ver. 5.0, 2024-01-08 processed via TW by DD; imported 2024-02-07

  • Imported: 177193 spp (vs 171,937 spp in 4.2.2, May 2023)
  • Metadata: Corrected Version / Issued 0.38.1 / 2024-02-07 --> 5.0, Jan 2024 / 2024-01-08
    Added paragraph in Description:
    This version of the Systema Dipterorum data has been imported into TaxonWorks (the author of the import script is D. Dmitriev) and passed soft validation there before being exported to the CoLDP format (the author of the export script is G. Ower).
  • Classification: 17 subfamilies and many genera & species are outside families in the Tree root (inside order Diptera). Species are flagged as Prov. Acc. 7 species with the portion [GENUS NOT SPECIFIED] in the root.
  • Sectors: sector "Diptera" broken = FIXED 2024-02-12
    CCW families Cylindrotomidae, Limoniidae, Pediciidae, Tipulidae should be blocked in Systema Dipterorum; plus subfamily Tipulinae (also, check genus Tipula in the root of Diptera) = all families re-blocked 2024-02-12
  • Extinct taxa: missing extinct flag with species (0 spp now vs 3,792 in May 2023). (Field Epoch in the source spreadsheet should be used). There are extinct flags in all other ranks.
  • Genera with square brackets: 37 names with the portion [GENUS NOT SPECIFIED] (workbench, reverse ordering, 40 n/p) = blocked
  • Names start with *F, *T, \n, ? = none
  • Split subgenera, 7 = resolved

image

ISSUES assessed 2024-02-12

image

TASKS

image

Hoplacephala nigriventris (Villeneuve, 1913)
vs
Hoplacephala nigriventris Villeneuve, 1913

Hoplacephala retroseta (Villeneuve, 1913)
vs
Hoplacephala retroseta Villeneuve, 1913

Huttonobesseria verecunda (Hutton, 1901)
vs
Huttonobesseria verecunda Hutton, 1901

Hystricia cuestae (Engel, 1920)
vs
Hystricia cuestae Engel, 1920

Isomyia pseudolucilia (Malloch, 1928)
vs
Isomyia pseudolucilia Malloch, 1928

plus, few cases of two identical accepted species (full list):
Empis (Polyblepharis) fedtschenkoi Shasmshev, 2023
Empis (Polyblepharis) hirsutitarsis Shamshev, 2023
Empis (Polyblepharis) sogdiensis Shamshev, 2023
Empis (Polyblepharis) sogdiensis Shasmshev, 2023
Holops anarayae Barahona-Segovia, 2021
Holops grezi Barahona-Segovia, 2021
Holops pullomen Baharona-Segovia, 2021
Paraclius brooksi Soares, Capellari & Ale-Rocha, 2023
Physoconops tentenvilu Baharona-Segovia, 2020
Polleniopsis bomdilaensis Bharti & Verves, 2016
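
Pairs that differ only by parentheses around the author string, like those listed above, can be picked out with a short script. It is a sketch that assumes plain "Genus epithet Authorship" strings without a subgenus; it does not catch misspelled author names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_name(full: str) -> Tuple[str, str]:
    """Split 'Genus epithet Authorship' into (binomial, authorship)."""
    parts = full.split(" ", 2)
    authorship = parts[2] if len(parts) > 2 else ""
    return " ".join(parts[:2]), authorship


def parenthesis_only_pairs(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group names whose authorship is identical once parentheses are stripped."""
    variants = defaultdict(set)
    for name in names:
        binomial, authorship = split_name(name)
        variants[(binomial, authorship.strip("()"))].add(authorship)
    return {key[0]: sorted(auths) for key, auths in variants.items() if len(auths) > 1}


names = [
    "Hoplacephala nigriventris (Villeneuve, 1913)",
    "Hoplacephala nigriventris Villeneuve, 1913",
    "Hystricia cuestae Engel, 1920",
]
print(parenthesis_only_pairs(names))
```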

ACC-ACC species (same authors) 0 of 342: https://www.checklistbank.org/dataset/1101/duplicates?authorshipDifferent=false&category=binomial&limit=50&minSize=2&mode=STRICT&offset=0&status=accepted
Two identical accepted species:
Agadasys hexablepharis Whittington, 2000
Amblypsilopus qinlingensis Yang & Saigusa, 2005
Amplisegmentum venezuelensis Winterton, 2021
etc.

Resolved 2024-02-12:

image

Synced 2024-02-12


yroskov commented Mar 4, 2024

TASKS does not detect such cases.


yroskov commented Mar 8, 2024

Systema Dipterorum ver. 5.0, 2024-01-08 processed via TW by DD; second iteration (bring back extinct spp); imported 2024-03-07

  • Imported: 177,193 spp (vs 177,193 spp)
  • Metadata: Corrected Version / Issued 0.39.0 / 2024-03-07 --> 5.0, Jan 2024 / 2024-01-08
  • Classification: as above.
  • Sectors: sector "Diptera" broken (subject is missing) = REPAIRED 2024-03-08.
    No blocked CCW families Cylindrotomidae, Limoniidae, Pediciidae, Tipulidae in Systema Dipterorum; plus subfamily Tipulinae (also, check genus Tipula in the root of Diptera) = all families re-blocked 2024-03-08. Species with the portion [GENUS NOT SPECIFIED] in the root = BLOCKED.
  • Extinct taxa: 4742 spp = FIXED!
  • Genera with square brackets: 28 names with the portion [GENUS NOT SPECIFIED] (workbench, reverse ordering, 40 n/p) = blocked.
  • Names start with *F, *T, \n, ? = none
  • Split taxa (see in TASKS)

METRICS

image

TASKS

image

  • Broken decisions, 6765 = no actions (because I have no assurance that decisions/CLB operates correctly)

  • Identical subfamilies approx. 45. Split subfamilies. = RESOLVED: (1) alternatives without authorstring ignored; (2) if both alternatives lack authorstrings, the item with fewer species ignored.

  • Identical tribe 0 of 37. Split tribes. = RESOLVED: (1) alternatives without authorstring ignored; (2) if both tribes lack authorstrings, the item with fewer species ignored. However, decisions are not shown in the interface (see screenshot below) - for attention of @thomasstjerne (https://www.checklistbank.org/catalogue/3/dataset/1101/duplicates?catalogueKey=3&category=uninomial&limit=50&minSize=2&mode=STRICT&offset=0&rank=tribe&status=accepted&withDecision=false)

  • Identical genus 0 of 62. Split genera. = RESOLVED: alternatives without authorstring ignored. However, decisions are not shown in the interface (see screenshot below) - for attention of @thomasstjerne

  • Identical subgenus 0 of 7. Split subgenera, 7 = RESOLVED: blocked alternative without authorstring - they are empty.
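
The rule used above for split taxa can be written down as a small function. This is only a sketch of the manual rule (ignore the alternative without an authorstring; if neither has one, ignore the one with fewer species); the `Candidate` class and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    # Hypothetical summary of one of two identical uninomials.
    taxon_id: str
    authorship: str
    species_count: int


def choose_ignored(a: Candidate, b: Candidate) -> Optional[Candidate]:
    """Return the candidate to ignore, or None when the rule does not decide."""
    a_has = bool(a.authorship.strip())
    b_has = bool(b.authorship.strip())
    if a_has != b_has:
        # Rule 1: the alternative without an authorstring is ignored.
        return b if a_has else a
    if not a_has and a.species_count != b.species_count:
        # Rule 2: neither has an authorstring, so the smaller one is ignored.
        return a if a.species_count < b.species_count else b
    return None  # both have authorstrings, or a tie: needs a manual decision


with_author = Candidate("t1", "Author, 1900", 3)   # made-up values
without_author = Candidate("t2", "", 40)
small = Candidate("t3", "", 2)
print(choose_ignored(with_author, without_author).taxon_id)
print(choose_ignored(without_author, small).taxon_id)
```

Returning `None` for undecided pairs keeps such cases visible for manual review instead of resolving them silently.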

image

image

It seems the bug is resolved. On 2024-03-11, ACC-ACC species (different authors) 512 of 512:
image

The same problem: the interface does not show the results of applying decisions, neither in the report nor in the panel.
https://www.checklistbank.org/catalogue/3/dataset/1101/duplicates?authorshipDifferent=false&catalogueKey=3&category=binomial&limit=500&minSize=2&mode=STRICT&offset=0&status=accepted&withDecision=false

image

image

See comments on bugs - stopper:
CatalogueOfLife/backend#1300 (comment)

CatalogueOfLife/backend#1300 (comment)

CatalogueOfLife/backend#1300 (comment)

Synced 2024-03-18, probably with sets of unresolved duplicates:

image


yroskov commented Mar 22, 2024

Re-do TASKS after software fixes, 2024-03-22

image

  • Broken decisions: 6765 = deleted
  • ACC-ACC species (same authors), 98 of 342: names with higher IDs blocked
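
The "higher IDs blocked" rule for identical accepted names can be sketched as below. It assumes numeric record IDs and exact string equality of the full name; the IDs in the example are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def ids_to_block(records: Iterable[Tuple[int, str]]) -> List[int]:
    """Keep the lowest ID for each identical name and return the others."""
    ids_by_name: Dict[str, List[int]] = defaultdict(list)
    for record_id, name in records:
        ids_by_name[name].append(record_id)
    blocked: List[int] = []
    for ids in ids_by_name.values():
        blocked.extend(sorted(ids)[1:])  # everything except the lowest ID
    return sorted(blocked)


# Names from the duplicates report; the IDs are hypothetical.
records = [
    (101, "Agadasys hexablepharis Whittington, 2000"),
    (205, "Agadasys hexablepharis Whittington, 2000"),
    (310, "Amblypsilopus qinlingensis Yang & Saigusa, 2005"),
    (302, "Amblypsilopus qinlingensis Yang & Saigusa, 2005"),
    (400, "Amplisegmentum venezuelensis Winterton, 2021"),
]
print(ids_to_block(records))
```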

Resolved 2024-03-22:

image

Re-synced 2024-03-22


yroskov commented May 23, 2024

Systema Dipterorum ver. 5.2 of 2024-05-15 (as 0.41.1 / 2024-05-23) processed via TW by GO; 1st iteration ; imported 2024-05-23

  • Imported: 216,630 spp (vs 177,193 spp); Synonym Count: 1 = something wrong
  • Metadata: ver. as "0.41.1 / 2024-05-23" needs to be corrected
  • Classification:
    Species with the portion [GENUS NOT SPECIFIED] in the root
  • Extinct taxa: no flag
  • Genera with square brackets: species with the portion [GENUS NOT SPECIFIED] (workbench, reverse ordering, 40 n/p)
  • Names start with *F, *T, \n, ? = none
  • Split taxa (see in TASKS)

METRICS

image

  • Incorrect authorstring with family Acartophthalmidae Enderlein, Lopes, Hardy, Czerny, Hennig, McAlpine, Munroe, Brauer, Bergenstamm, Fleming, Townsend, Zimin, Leach, Samouelle, Zetterstedt, Agassiz, Marschall, Griffini, Acloque, Arias, Brues, Melander, Lehrer, Hull, Alexander, Lahille, Rohdendorf, Verves, Aldrich, Griffiths, Guimarães, Dugdale, Rübsaamen, Hedicke, McAlpine, Mamaev, Robineau-Desvoidy, Macquart, Swainson, Griffith, Pidgeon, Harris, Rondani, Desmarest, Lioy, Rye, Barrett, Hendel, Coe, Latreille, Burmeister, Carpenter, Bowden, Becker, Brundin, Bigot, Mesnil, Ussatchov, Sack, Bezzi, Stein, Williston, Pleske, Newman, Grimshaw, Verrall, Young, Nowicki, Stuckenberg, Schnabl, Bernardi, Duda, Kloet, Hincks, Shannon, Crozy, Nartshuk, Bickel, Byers, Philip, Westwood, Baranov, Venturi, Schiner, Walker, Roback, Shewell, Winnertz, Malloch, Billberg, Hutton, Brodie, White, Stenhammar, Saether, Zimina, Herting, Glumac, Goffe, Egger, Pascoe, Townes, Frey, Beschovski, Jones, Albuquerque, Cogan, Wesché, Handlirsch, Aczél, Séguy, Wahlgren, Peck, Crampton, Cook, Hong, Cockerell, Wirth, Stone, Vujić, Curran, Kalugina, Zumpt, Blanchard, Drensky, Sturtevant, Bromley, Mani, Shiraki, Bertrand, Loew, Pandellé, Okada, Harris, Alcock, Yabar, Hall, Evenhuis, Edwards, Bhatia, Keilin, Morrison, Kertész, Shatalkin, Anonymous, Sharp, White, Wiedemann, Vimmer, Mathis, Adisoemarto, Wood, Wilcox, Papavero, Costa, Bode, Sabrosky, Arnaud, Smirnov, Nagatomi, Iwata, Daniels, Springer, Dziedzicki, Mallo, Roonwal, Tonnoir, Belkin, Fallén, Haliday, Coquillett, Slosson, Knowlton, Cutler, Hesse, Zaitzev, Theobald, Krivosheina, Ashe, Maa, Gressitt, Steyskal, Wheeler, Erichson, Presl, Bellardi, Berthold, Martin, Dahl, Bibby, Zilahi-Sebess, Grünberg, Brèthes, Curtis, Wesenberg-Lund, Miyatake, Anduze, Sedman, Weems, Vossbrinck, Friedman, Thompson, Nagler, Stanescu, Verb, Eysell, Austen, Rossi, Bau, Wing, Fluke, Holloway, Theodor, Papp, Weidner, Kieffer, Wu, Wenzel, Tokunaga, Grunin, Thomson, 
Kessel, Maggioncalda, Oken, Bequaert, Wingate, Carrera, Andretta, Woodley, Osten Sacken, Wulp, Stackelberg, Oldenberg, Colless, Riedel, Verbeke, Knutson, Lyneborg, Lameere, Lundström, Camras, Korneyev, Barnes, Heer, Jeannel, Elouard, Andersson, Blanchard, Boyes, Illingworth, Vaillant, Wandolleck, Kröber, Prado, Rafinesque, Perty, Doleschall, Philippi, Jaennicke, Hippa, Ass, Needham, Wasmann, Breddin, Börner, Wheeler, Seebold, Hardy, Meijere, Surcouf, Comstock, Hinton, Stephens, Meigen, Thon, Telford, Vockeroth, Roháček, Yang, Ren, Yang, Grichanov, Mazzarolo, Amorim, Liu, Chandler, Jaschhof, Ansorge, Norris, Richter, Lehr, Gaimari, Irwin, Labandeira, Schmitz, Gerstaecker, Hackman, Ovchinnikova, Barraclough, Freidberg, Tozoni, Andersen, Gagné, Kirby, Spence, Nitzsch, Neuhaus, Dyar, Borkent, Kano, Shinonaga, Kurahashi, Wang, Sasakawa, Marshall, Mostovski, Munari, Yeates, Lukashevich, Shcherbakov, Mason, Naglis, Brake, Kuznetzov, Szadziewski, Grootaert, Meuffels, Meunier, Shaw, Shaw, Greathead, Jaschhof, Didham, Pinto, Han, Schinz, Artigas, Wiegmann, Hancock, Krzemiński, Krzemińska, Saigusa, Nagatomi, Peris, González-Mora, Zloty, Sinclair, Pritchard, Séguy, Vilkamaa, Ševčík, Huang, Lin, Rindal, Guo, Wang, Michelsen, Zhang, Shih, Zhang, Krzeminska, Papier, Ebejer, Cherian, Shinimol, Zhu, Wang, Zhang, Fedotova, Sidorenko, Grimaldi, Cumming, Pape, Pimentel, Azar, Jaschhof, Li, Riccardi, Barták, Skartveit, Brown, Kung, Skibińska, Kaddumi, Pepinelli, Currie, Perkovsky, Lessard, Sidorenko, Lukashevich, Przhiboro, Plakidas, Tanasijtshuk, Oliveira, Ježek, Tang, Cranston, Rasnitsyn, Astakhov, Winterton, Ware, Hoffeins, Oldenburg, Mik, Wang, Blagoderov & Šifner, 2024 [1914]

Should be: Acartophthalmidae Czerny, 1928 = FIXED in 2nd iteration

  • Acartophthalmus coxatus Zetterstedt, 1848 = missing original combination as Agromyza coxata Zetterstedt, 1848 and missing brackets in the authorstring = FIXED in 2nd iteration

Both Acartophthalmidae & Acartophthalmus coxatus were correct in previous version.


yroskov commented Jun 3, 2024

Systema Dipterorum ver. 5.2 of 2024-05-15 (as 0.41.1 / 2024-06-01) processed via TW by GO; 2nd iteration ; imported 2024-06-01

  • Imported: 176,894 (vs 216,630 spp in 1st iteration, vs 177,193 spp); "Synonym Count: 133,377" = OK (vs 135,855)

  • Metadata: ver. "0.41.1 / 2024-06-01" corrected as "5.2 of 2024-05-15" 2024-06-03

  • Classification: OK.
    Species with the portion [GENUS NOT SPECIFIED] in the root

  • Sector: order Diptera is broken (missing subject); RESTORED 2024-06-03. 4 CCW families Cylindrotomidae, Limoniidae, Pediciidae & Tipulidae = BLOCKED

  • Extinct taxa: OK (4737 spp vs 4742 spp)

  • Genera with square brackets: species with the portion [GENUS NOT SPECIFIED] (workbench, reverse ordering, 40 n/p) = 19 blocked

  • Names start with *F, *T, \n, ? = none

  • Split taxa (see in TASKS)

  • Family Acartophthalmidae Czerny, 1928 = OK

METRICS

image

ISSUES

image

TASKS

image

  • Broken decisions, 5885 = all deleted
  • Outdated decisions, 57 = no actions (ref: the problem of decisions between GSDs) = 0 after deletion of broken decisions

Resolved 2024-06-03:

image

Synced 2024-06-03
