CoL of May 2024 #262

Open
32 of 34 tasks
yroskov opened this issue Apr 24, 2024 · 43 comments

Comments

@yroskov

yroskov commented Apr 24, 2024

  • WoRMS of 2024-05-01; imported 2024-05-02; synced 2024-05-03; 18 re-checked & re-synced 2024-05-07, see "All 65 WoRMS checklists in 2024" #254 (comment). Check editors in WoRMS Nemys & MolluscaBase.

  • ITIS of 2024-04-26; imported 2024-04-30; synced 2024-05-01

  • Scarabs of 2024-05-06; imported 2024-05-06; synced 2024-05-07

FROM TaxonWorks:

  • 3i Auchenorrhyncha 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-09
  • Entiminae 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-09
  • WCO (Opiliones) 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-09
  • WOL (Odonata) 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-09
  • SF Isoptera 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-09
  • SF Orthoptera 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-09

monthly started from January 2024:

  • SF Dermaptera 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-10
  • SF Embioptera (Heidi, 2024-05-01: I am going to move Embioptera over to the annual update list) 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-10
  • SF Plecoptera 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-10
  • SF Psocodea 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-10

monthly started from March 2024:

  • SF Aphid 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-13
  • SF Chrysididae 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-13
  • SF Coreoidea 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-13
  • SF Lygaeoidea 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-13
  • SF Mantodea 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-13
  • SF Phasmida 0.41.0 / 2024-05-06; imported 2024-05-06; synced 2024-05-13

single update in a year (January):

SF Coleorrhyncha
SF Embioptera
SF Grylloblattodea
SF Mantophasmatodea
SF Zoraptera

LEPIDOPTERA:
See #260 (comment)

  • GLI imported 2024-05-14; synced 2024-05-14
  • Alucitoidea (Lepidoptera) imported 2024-05-06; synced 2024-05-14
  • Gelechiidae (Lepidoptera) 1.1.24.115 (24 Apr 2024); imported 2024-04-24, 2024-05-09; synced 2024-05-14, 2024-05-15
  • Hepialidae (Lepidoptera) 1.0.1 / 2024-05-19; imported 2024-05-19; synced 2024-05-20
  • Pterophoroidea (Lepidoptera) imported 2024-04-27; synced 2024-05-14
  • Sesiidae (Lepidoptera) = no new imports
  • Geometridae (Lepidoptera) imported 2024-04-23; synced 2024-05-15
  • Global Gracillariidae of 2024-03-15; synced 2024-05-14

OTHER:

  • UCD proof; completed & synced 2024-04-30, 2024-05-06, 2024-05-09, 2024-05-10; submitted to authors 2024-05-01, 2024-05-07, 2024-05-09, 2024-05-10; permitted for publishing (logo deadline 20 May)

  • CunaxidBase (as an update for one family in BdelloideaBase) received 2024-02-28; imported 2024-05-17; synced 2024-05-17

  • The Scorpion Files, imported 2024-04-22; synced 2024-05-01

  • TITAN, imported 2024-04-23; synced 2024-05-01

  • Species Fungorum Plus; received 2024-04-28; imported 2024-04-28; synced 2024-05-06

  • Bryonames of 7 May 2024; synced 2024-05-10

  • Systema Dipterorum ver. 5.2 of 2024-05-15; received 2024-05-16

=============================

  • SF Permopsocida is not yet in CoL; I have submitted its adoption to the CoL Taxon Group. (see email of 2023-11-10, Heidi)

  • ScaleNet

  • UCD in Nov 2023?

  • WCVP ver 11.0 / 2023-04-20

  • LPSN: see "Genome Taxonomy Database (GTDB) for prokaryotes" data#202 (comment).
    @mdoering via Slack 2024-05-06: heard back from LPSN - they have good progress, but the API with classification will be public and usable by us at the end of May - so we might be able to get LPSN into the June release
    YR: LPSN - good! But we have no June release. It will be AC24 in June. It's risky to publish AC24 with a new dataset, especially without the Taxonomy Group's assessment of LPSN and approval of the ITIS replacement. Let's do it in July-August.

  • Rotifers

  • FishBase - ?

  • Paris/DBTNT - ?

  • LDL Neuropterida

=============================

Filling gaps:

Suborder Symphyta (Hymenoptera) CatalogueOfLife/data#579
Class/order Diplura CatalogueOfLife/data#577
Order †Permopsocida (Insecta) CatalogueOfLife/data#578
Family Promecheilidae (Tenebrionoidea, Coleoptera) CatalogueOfLife/data#580

@dhobern

dhobern commented Apr 27, 2024

@yroskov More updates to Pterophoridae and GLI - also everything I've listed under "Towards annual checklist"

@yroskov
Author

yroskov commented Apr 30, 2024

PREVIEW release started 2024-04-30, 7:07 pm (server time)
Finished as COL24.4, id 295185, 2024-04-30, 8:43 pm
Deployed to the preview website 2024-04-30

@yroskov
Author

yroskov commented Apr 30, 2024

PREVIEW release started 2024-04-30, 9:17 pm (server time)
Finished as COL24.4, id 295188, 2024-04-30, 10:47 pm
Deployed to the preview website 2024-05-01

  • Check the proof of UCD Draft ver. 0.40.3 / 2024-04-18 after additional cleanings

@yroskov
Author

yroskov commented May 7, 2024

PREVIEW release started 2024-05-07, 12:43 pm (server time)
Finished as COL24.5, id 295753, 2024-05-07, 2:12 pm
Deployed to the preview website 2024-05-07

  • Species Fungorum Plus; proof submitted
  • UCD; second proof submitted. A set of families is flagged in CLB as "bare names". This left subfamilies and tribes unassigned.

@mdoering
Member

mdoering commented May 7, 2024

can we sync all sources in need of an update as given here?
https://www.checklistbank.org/catalogue/3/sources

.... with the exception of IRMNG maybe

@yroskov
Author

yroskov commented May 7, 2024

No.

All necessary syncs are already completed. Those sectors which are not synced may have unresolved issues and broken decisions.

@mdoering
Member

mdoering commented May 7, 2024

All SpeciesFiles sources have issues?

@mdoering
Member

mdoering commented May 7, 2024

Pterophoroidea has added 6 species. Is this problematic?
https://www.checklistbank.org/dataset/1199/diff?attempts=162..163

@yroskov
Author

yroskov commented May 7, 2024

Checks & syncs of SFs are scheduled for next week

@yroskov
Author

yroskov commented May 7, 2024

Please check problems with GSDs in Github reports

@mdoering
Member

mdoering commented May 7, 2024

I did look at 2 just now, Bryonames and Gracillariidae.
I don't see any recent comments on those. They just seem to not have been synced for quite some time.

Both do update automatically whenever sources change. Hence it is crucial to look at the attempts in sources to see what is in need of a sync. At least as long as we do manual syncs.
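A minimal sketch of that check, assuming each source can be summarised by two attempt numbers: the latest import attempt and the attempt that was current at the last sync. The `SourceState` class, its field names, and the sample values are hypothetical and are not the ChecklistBank API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceState:
    """Hypothetical summary of one source dataset in the project."""
    name: str
    latest_import_attempt: int   # most recent import attempt of the source
    synced_import_attempt: int   # import attempt that was current at the last sync


def needs_sync(sources):
    """Return the names of sources imported again since their last sync."""
    return [s.name for s in sources
            if s.latest_import_attempt > s.synced_import_attempt]


# Illustrative values only.
sources = [
    SourceState("Source A", latest_import_attempt=163, synced_import_attempt=162),
    SourceState("Source B", latest_import_attempt=12, synced_import_attempt=12),
]
print(needs_sync(sources))
```

With manual syncs, a list like this would have to be reviewed before each release, since auto-updating sources can move ahead of the project without anyone commenting on them.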

@mdoering
Member

mdoering commented May 7, 2024

Pterophoroidea also seems fine? There is just a software issue mentioned, which I think was fixed a long time ago: #179

@dhobern

dhobern commented May 7, 2024

Five of the six extra Pterophoridae are because the new North American checklist has split some species. What's less clear to me is why Oidaematophorus poulini appears in this list. It is clearly marked as a synonym for Hellinsia poulini (along with the 2005 versions of the same names). I thought the diff view only showed the accepted names. Otherwise, all good.

@mdoering
Member

mdoering commented May 7, 2024

The diff tool simply shows alphabetically sorted names. You can decide whether you want synonyms included, authorships and direct parents being shown. The default in that view does include synonyms.
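As a rough illustration of why a synonym can show up in such a diff, here is a sketch that compares two alphabetically sorted name lists with Python's `difflib`. The function names, the `include_synonyms` switch, and the sample data are assumptions for illustration. This is not the actual diff implementation in ChecklistBank.

```python
import difflib


def name_lines(usages, include_synonyms=True):
    """Build the alphabetically sorted name list for one import attempt."""
    names = [name for name, status in usages
             if include_synonyms or status == "accepted"]
    return sorted(names)


def names_diff(old, new, include_synonyms=True):
    """Return only the added ('+') and removed ('-') names between two attempts."""
    a = name_lines(old, include_synonyms)
    b = name_lines(new, include_synonyms)
    return [line for line in difflib.unified_diff(a, b, lineterm="", n=0)
            if line.startswith(("+", "-"))
            and not line.startswith(("+++", "---"))]


# Illustrative data: the second attempt adds a synonym of an accepted name.
old = [("Hellinsia poulini", "accepted")]
new = [("Hellinsia poulini", "accepted"),
       ("Oidaematophorus poulini", "synonym")]

print(names_diff(old, new))                          # synonyms included
print(names_diff(old, new, include_synonyms=False))  # accepted names only
```

With synonyms included, the added synonym is reported as a new line. With accepted names only, the two lists are identical and the diff is empty.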

@dhobern

dhobern commented May 8, 2024

Gelechiidae updated again today.

@yroskov
Author

yroskov commented May 8, 2024

PREVIEW release started 2024-05-08, 3:41 pm (server time)
Finished as COL24.5, id 295802, 2024-05-08, 5:10 pm
Deployed to the preview website 2024-05-08

  • UCD, 3rd proof with applied OTUs: families FIXED; unassigned subfamilies & tribe as expected

@yroskov
Author

yroskov commented May 9, 2024

PREVIEW release started 2024-05-09, 7:26 pm (server time)
Finished as COL24.5, id 295965, 2024-05-09, 8:59 pm
Deployed to the preview website 2024-05-09

@yroskov
Author

yroskov commented May 14, 2024

@dhobern, I'm going to sync all the updated Lepidoptera checklists for the May release today or tomorrow. It does not require any action on your part. I'll check the import dates in CLB. Further updates will be added to the Annual Checklist in June (June 7 is the deadline for the latest updates).

@yroskov
Author

yroskov commented May 15, 2024

PREVIEW release started 2024-05-15, 5:11 pm (server time)
Finished as COL24.5, id 296093, 2024-05-15, 6:53 pm
Deployed to the preview website 2024-05-15

CHECKS
For @mdoering's attention (how can we fix these?):

  • ITIS is "overloaded" from estimated approx. 171,278 spp to 219,033 spp due to sync of merge-sectors. = FIXED in the preview 2024-05-16
  • Unexpected entry "Catalogue of the type specimens of Miridae" in the list of Source datasets at https://preview.catalogueoflife.org/data/source-datasets
    image
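The size of the ITIS jump can be checked with simple arithmetic. The sketch below uses the two figures quoted above. The 5% tolerance is an assumed threshold for illustration, not a CoL rule.

```python
def relative_change(expected, observed):
    """Relative difference of the observed count from the expected one."""
    return (observed - expected) / expected


expected_itis = 171_278   # estimated species count quoted above
observed_itis = 219_033   # species count after the sync of merge sectors

extra = observed_itis - expected_itis
change = relative_change(expected_itis, observed_itis)
print(f"{extra} extra species, {change:.1%} over the estimate")

TOLERANCE = 0.05  # assumed threshold; pick whatever suits the release checks
if abs(change) > TOLERANCE:
    print("ITIS species count is outside the tolerance, check before release")
```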

@mdoering
Member

@yroskov there is no point in doing (preview) releases while we still have the ITIS issue

@yroskov
Author

yroskov commented May 15, 2024

there is no point...

You said in Slack, "You can continue, please just dont use any sync all methods or manually sync merge sectors". I continued with the updates and need to see the results in the preview. What's the point of continuing if we still have an ITIS problem?

I have now suspended work on the May edition until I get the green light from you.

@mdoering
Member

It doesn't hurt to do a release if you need it. But obviously ITIS is wrong now, so no release at this stage can be a real one.
Otherwise please continue your work as long as you do not sync all.

@yroskov
Author

yroskov commented May 16, 2024

I have deleted the 5 ITIS merge sectors which should have removed all linked data in the project. @yroskov could you do a brief check if you spot sth unusual? #8 (comment)

COL project 3: checks of 2024-05-16:

Plantae, Multiple providers: 6 (7) unexpected contributors (e.g. GREEN, IUCN, etc.):
image

Animalia, Multiple providers: many unexpected contributors (e.g. Afromoths, ArthropodsPT, etc.)
image

Archaea & Bacteria: "2214"
image

@mdoering
Member

That only shows in the project, right? Not in previews or releases in CLB?
@thomasstjerne sth we might have to look into

@yroskov
Author

yroskov commented May 16, 2024

I checked it in the project only for now.

I launched a new preview, to see what is there.

@yroskov
Author

yroskov commented May 16, 2024

PREVIEW release started 2024-05-16, 2:10 pm (server time)
Finished as COL24.5, id 296266, 2024-05-16, 3:40 pm
Deployed to the preview website 2024-05-16

CHECKS 2024-05-16

@mdoering
Member

mdoering commented May 17, 2024

Weird. For some reason a single merge sector Miridae made it into the release: https://www.checklistbank.org/dataset/296266/sector?limit=100&mode=merge&offset=0

The sector was changed 12h after it was created. I suspect its mode was changed and that it was originally created, accidentally, as an attach sector. When an attach sector is created, the root name is immediately copied to the project so that it shows up in the assembly tree. This caused the sector to stay in the release.
The family Miridae is still there, I will delete the sector now. @camiplata @DianRHR please recreate the sector as a merge one as needed!

I made sector modes immutable from now on. Trying to change them will raise an error.
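For readers unfamiliar with the idea, an immutable mode could look roughly like the sketch below. The `Sector` class, the `SectorMode` enum (limited to the two modes mentioned here), and the error message are hypothetical and do not reflect the actual ChecklistBank code.

```python
from enum import Enum


class SectorMode(Enum):
    ATTACH = "attach"
    MERGE = "merge"


class Sector:
    """Hypothetical sector whose mode is fixed at creation time."""

    def __init__(self, subject, mode):
        self.subject = subject
        self._mode = mode

    @property
    def mode(self):
        return self._mode

    @mode.setter
    def mode(self, new_mode):
        # Re-assigning the same mode is harmless; any real change is rejected.
        if new_mode is not self._mode:
            raise ValueError(
                "sector mode is immutable; delete the sector and recreate it"
            )


sector = Sector("Miridae", SectorMode.ATTACH)
try:
    sector.mode = SectorMode.MERGE
except ValueError as err:
    print(err)
```

Under this design, switching from attach to merge means deleting the sector and creating a new one, as requested above for Miridae.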

@mdoering
Member

Works for me:

image image

@yroskov
Author

yroskov commented May 17, 2024

Hmm. Strange. None of the 3 browsers displays logos in the PREVIEW on my machine (but www.catalogueoflife.org is OK).

Firefox:

image

image

Chrome:

image

MS Edge:

image

(I'll ask @gdower to have a look at what might be wrong with my machine)

@mdoering
Member

The images are on a private URL that needs authentication. Are you logged into CLB right now? It's nothing to worry about for the public release, but still worth looking at

@yroskov
Author

yroskov commented May 17, 2024

PREVIEW release started 2024-05-17, 7:54 pm (server time)
Finished as COL24.5, id 296445, 2024-05-17
Deployed to the preview website 2024-05-20

CHECKS 2024-05-20:

@yroskov
Author

yroskov commented May 20, 2024

PREVIEW release started 2024-05-20, 2:02 pm (server time)
Finished as COL24.5, id 296489, 2024-05-20, 3:32 pm
Deployed to the preview website 2024-05-20

Checked 2024-05-20

@yroskov
Author

yroskov commented May 20, 2024

PREVIEW release started 2024-05-20, 4:32 pm (server time)
Finished as COL24.5, id 296511, 2024-05-20, 5:59 pm
Deployed to the preview website 2024-05-20

Checked 2024-05-20

  • Unexpected taxa in the Tree, which were absent from the CoL of April. These taxa are not credited to any Source Dataset and have no child species.

for example:

kingdom Animalia > family Katharellidae Czaker, 1994

image

kingdom Chromista > family Chloromonadaceae

kingdom Chromista > phylum Bigyra > class Labyrinthulea > families Acanthoniidae, Aulacanthidae, Stethopiliidae

image

kingdom Fungi > families Amoebidiaceae, Lagenidiaceae

image

etc.
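A check for such taxa could be sketched as follows, assuming a flat list of taxa where a missing sector means "no source dataset credit". The `Taxon` class, its fields, and the second family in the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Taxon:
    id: str
    name: str
    rank: str
    parent_id: Optional[str]
    sector_id: Optional[int]  # None = not credited to any source dataset


def empty_unsourced_taxa(taxa):
    """Names of higher taxa with no sector and no species anywhere below them."""
    children = {}
    for t in taxa:
        children.setdefault(t.parent_id, []).append(t)

    def has_species(taxon):
        # Iterative walk over the subtree, stopping at the first species found.
        stack = [taxon]
        while stack:
            current = stack.pop()
            if current.rank == "species":
                return True
            stack.extend(children.get(current.id, []))
        return False

    return [t.name for t in taxa
            if t.rank != "species" and t.sector_id is None and not has_species(t)]


# Illustrative data: one empty unsourced family next to a sourced one.
taxa = [
    Taxon("k1", "Animalia", "kingdom", None, None),
    Taxon("f1", "Katharellidae", "family", "k1", None),
    Taxon("f2", "Examplidae", "family", "k1", 1),
    Taxon("s1", "Examplus unus", "species", "f2", 1),
]
print(empty_unsourced_taxa(taxa))
```

Animalia is not reported because it still has species below it. Only the empty family without a sector is listed.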

@yroskov
Author

yroskov commented May 20, 2024

For attention of @olafbanki & @mdoering: COL24.5, id 296511, 2024-05-20 is completed now as CoL of May.

It looks like some unexpected taxa have appeared in the Tree (they were absent from the CoL of April). These taxa are not credited to any Source Dataset and have no child species. Perhaps this is a result of the sync of ITIS "merged sectors" or other Extended Catalogue activities. The good thing is that I don't see unexpected changes in the species statistics.

@mdoering
Member

That is likely the cause, yes. I wasn't able to spot all of them. So please feel free to manage these unsourced higher names and delete them if desired

@mdoering
Member

mdoering commented May 22, 2024

I have looked into the dates when the taxa of the management classification, i.e. all taxa without a sector, were created. Aggregated by day, it looks like this:

   created  | count 
------------+-------
 2019-11-20 |  1707
 2020-01-06 |     1
 2020-01-07 |     1
 2020-04-24 |   308
 2020-07-18 |    16
 2020-07-19 |     8
 2020-07-27 |     2
 2020-08-10 |   224
 2020-08-12 |   474
 2020-08-14 |     6
 2020-09-04 |     1
 2020-09-15 |     1
 2020-10-02 |     1
 2021-01-19 |    12
 2021-03-10 |     1
 2021-03-19 |     2
 2021-07-01 |     1
 2021-10-11 |    57
 2022-02-10 |     1
 2022-03-09 |     9
 2022-03-30 |     1
 2022-08-16 |     2
 2022-10-27 |   567
 2023-01-09 |     4
 2023-03-17 |     1
 2023-04-12 |    18
 2023-06-03 |     1
 2024-02-06 |    24
 2024-02-15 |   288
 2024-03-25 |     1
 2024-03-28 |     1
 2024-04-01 |     2
 2024-05-14 |  4446

You can nicely see that more than half of the taxa were created on May 14th, when all of ITIS was synced.
Of these, 579 were families and 3867 genera.
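The two figures can be re-derived from the table. The sketch below copies the per-day counts from the table above into a dictionary (the variable names are mine) and computes the May 14th share and the rank breakdown.

```python
# Per-day counts of sector-less taxa, copied from the table above.
created_counts = {
    "2019-11-20": 1707, "2020-01-06": 1, "2020-01-07": 1, "2020-04-24": 308,
    "2020-07-18": 16, "2020-07-19": 8, "2020-07-27": 2, "2020-08-10": 224,
    "2020-08-12": 474, "2020-08-14": 6, "2020-09-04": 1, "2020-09-15": 1,
    "2020-10-02": 1, "2021-01-19": 12, "2021-03-10": 1, "2021-03-19": 2,
    "2021-07-01": 1, "2021-10-11": 57, "2022-02-10": 1, "2022-03-09": 9,
    "2022-03-30": 1, "2022-08-16": 2, "2022-10-27": 567, "2023-01-09": 4,
    "2023-03-17": 1, "2023-04-12": 18, "2023-06-03": 1, "2024-02-06": 24,
    "2024-02-15": 288, "2024-03-25": 1, "2024-03-28": 1, "2024-04-01": 2,
    "2024-05-14": 4446,
}

total = sum(created_counts.values())
may_14 = created_counts["2024-05-14"]
share = may_14 / total

print(f"{may_14} of {total} taxa ({share:.1%}) were created on 2024-05-14")

# Rank breakdown quoted in the comment: families plus genera.
families, genera = 579, 3867
print(families + genera == may_14)
```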

Compared with the last April release, the numbers are exactly the same, except that the 4446 taxa from May 14th are absent there and Feb 15th had 293 taxa instead of the current 288. So 5 names have been deleted since then:

 C5SLY | FAMILY      | Naibiidae        | ACCEPTED
 C8BYS | FAMILY      | Sinojuraphididae | ACCEPTED
 C72ZM | FAMILY      | Dracaphididae    | ACCEPTED
 C7RWG | SUPERFAMILY | Naibioidea       | ACCEPTED
 C2KG6 | INFRAORDER  | Naibiomorpha     | ACCEPTED
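The comparison itself is a per-day difference of two count tables. A minimal sketch, using only an excerpt of the days (the function name and the dictionaries are illustrative):

```python
def count_changes(before, after):
    """Map each day whose count differs to (before, after); a missing day counts as 0."""
    days = sorted(set(before) | set(after))
    return {d: (before.get(d, 0), after.get(d, 0))
            for d in days
            if before.get(d, 0) != after.get(d, 0)}


# Excerpt of the per-day counts: April release vs. the project after the ITIS sync.
april_release = {"2024-02-06": 24, "2024-02-15": 293, "2024-04-01": 2}
project_now = {"2024-02-06": 24, "2024-02-15": 288, "2024-04-01": 2,
               "2024-05-14": 4446}

changes = count_changes(april_release, project_now)
print(changes)

# Names deleted since April among those created on 2024-02-15.
before, after = changes["2024-02-15"]
print(before - after)
```

The difference for Feb 15th matches the five names listed above.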

I am going to remove all May 14th taxa again as they are all empty with no species.

@mdoering
Member

mdoering commented May 22, 2024

All removed. Reindexing the project to be safe

@yroskov
Author

yroskov commented May 22, 2024

@mdoering, thank you!

All of it will go into the June release. May should go out as id 296511 of 2024-05-20, because I already started the assembly of June yesterday, and the process is far from finished.

@olafbanki

@yroskov if I understand correctly, the release of 2024-05-20 could be published as the May edition?

@yroskov
Author

yroskov commented May 28, 2024

@olafbanki, yes

@olafbanki

@yroskov thanks; @mdoering I have run the checks and committed a blog post. Can you publish the May edition?

@mdoering
Member

published

@yroskov
Author

yroskov commented May 30, 2024

Thank you!
