# 1. Gathering taxonomic journals

We gathered taxonomic journals through three sources:
 - We used Wikidata to find all academic or scientific journals with a main subject or field of work related to taxonomy, phylogeny, nomenclature, and similar topics.
 - We used Wikidata to find all journals that had an IPNI or ZooBank publication ID.
 - We used the OpenAlex API to retrieve all journals related to "taxonomy", i.e. journals that have the concept "taxonomy" linked to them.
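
One way three such result sets could be combined into a single table with a `source` label is sketched below. The three small frames and their variable names are hypothetical stand-ins, not the real query results; only the column names and the `source` labels are taken from the table loaded further down. This is an assumption about the general approach, not a record of how `journals.csv` was actually built.

```python
import pandas as pd

# Hypothetical stand-ins for the three result sets (not the real query output).
wikidata_subject = pd.DataFrame({"title": ["Journal A", "Journal B"],
                                 "ISSN-L": ["0000-0001", "0000-0002"]})
wikidata_ids = pd.DataFrame({"title": ["Journal A"], "ISSN-L": ["0000-0001"]})
openalex_concept = pd.DataFrame({"title": ["Journal C"], "ISSN-L": ["0000-0003"]})

# Label each frame with the source it came from, then stack them.
# A journal found via several sources keeps one row per source.
labelled = [
    wikidata_subject.assign(source="Wikidata taxonomic subject"),
    wikidata_ids.assign(source="IPNI or ZooBank ID"),
    openalex_concept.assign(source="OpenAlex taxonomy concept"),
]
combined = pd.concat(labelled, ignore_index=True)
print(combined)
```

Keeping one row per source is what makes per-source comparisons possible later on.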

Here, we take a quick look at the results.

In [1]:
import pandas as pd

In [6]:
journals = pd.read_csv("../data/processed/journals.csv")
journals

Unnamed: 0,title,wikidataURL,ISSN-L,IPNIpubID,ZooBankPubID,openAlexID,dissolvedYear,dissolved,source
0,Ornithology,http://www.wikidata.org/entity/Q2300649,0004-8038,,3F3F951F-B494-44B0-B286-AF9BCB097966,S152904045,,,Wikidata taxonomic subject
1,Ornithology,http://www.wikidata.org/entity/Q2300649,0004-8038,,3F3F951F-B494-44B0-B286-AF9BCB097966,S152904045,,,Wikidata taxonomic subject
2,ZooKeys,http://www.wikidata.org/entity/Q219980,1313-2970,,91BD42D4-90F1-4B45-9350-EEF175B1727A,S199213172,,,Wikidata taxonomic subject
3,ZooKeys,http://www.wikidata.org/entity/Q219980,1313-2970,,91BD42D4-90F1-4B45-9350-EEF175B1727A,S199213172,,,Wikidata taxonomic subject
4,Zootaxa,http://www.wikidata.org/entity/Q220370,1175-5326,,78F99150-21C2-4639-B359-F3E2302DF0B7,S171471881,,,Wikidata taxonomic subject
...,...,...,...,...,...,...,...,...,...
3061,Progress in molecular and subcellular biology,https://www.wikidata.org/entity/Q27710179,2363-7684,,,S4210207345,before 2012,True,OpenAlex taxonomy concept
3062,Advances in Evolutionary Biology,http://www.wikidata.org/entity/Q27726196,2314-7660,,,S4210228732,2015,False,OpenAlex taxonomy concept
3063,Sternbergiana,,2695-1118,,,S4210236945,2021,False,OpenAlex taxonomy concept
3064,Agricultural Gazette of New South Wales,https://www.wikidata.org/entity/Q31845337,0002-1474,,,S4210172420,,,OpenAlex taxonomy concept
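
The first rows of the table appear twice (for example `Ornithology` and `ZooKeys`), so it may be worth checking how many rows are exact repeats before counting. Below is a minimal sketch on a toy frame that copies three of the rows shown above. On the real data, the same calls would be applied to `journals` with the saved index column `Unnamed: 0` left out, since that column differs between otherwise identical rows.

```python
import pandas as pd

# Toy frame mimicking the repeated rows seen above (values copied from the output).
toy = pd.DataFrame({
    "title": ["Ornithology", "Ornithology", "ZooKeys"],
    "ISSN-L": ["0004-8038", "0004-8038", "1313-2970"],
    "source": ["Wikidata taxonomic subject"] * 3,
})

# duplicated() flags every repeat of an earlier, fully identical row.
n_repeats = int(toy.duplicated().sum())

# drop_duplicates() keeps the first occurrence of each row.
deduplicated = toy.drop_duplicates().reset_index(drop=True)
print(n_repeats, len(deduplicated))
```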


In [8]:
# number of journals per source
journals["source"].value_counts()

IPNI or ZooBank ID            2830
OpenAlex taxonomy concept      155
Wikidata taxonomic subject      81
Name: source, dtype: int64

In [9]:
# sets of journal titles per source, used for the set differences below
ipnizoo = set(journals[journals["source"]=="IPNI or ZooBank ID"]["title"])
openalex = set(journals[journals["source"]=="OpenAlex taxonomy concept"]["title"])
wikisubjects = set(journals[journals["source"]=="Wikidata taxonomic subject"]["title"])

In [23]:
print("Number of journals found via IPNI or ZooBank ID, not found via OpenAlex: " +
      str(len(ipnizoo - openalex)))
print("Number of journals found via IPNI or ZooBank ID, not found via Wikidata subjects: " +
      str(len(ipnizoo - wikisubjects)))

Number of journals found via IPNI or ZooBank ID, not found via OpenAlex: 1992
Number of journals found via IPNI or ZooBank ID, not found via Wikidata subjects: 2024


In [25]:
print("Number of journals found via Wikidata subjects, not found via OpenAlex: " +
      str(len(wikisubjects - openalex)))
print("Number of journals found via Wikidata subjects, not found via IPNI or ZooBank ID: " +
      str(len(wikisubjects - ipnizoo)))

Number of journals found via Wikidata subjects, not found via OpenAlex: 24
Number of journals found via Wikidata subjects, not found via IPNI or ZooBank ID: 8


In [26]:
print("Number of journals found via OpenAlex, not found via Wikidata subjects: " +
      str(len(openalex - wikisubjects)))
print("Number of journals found via OpenAlex, not found via IPNI or ZooBank ID: " +
      str(len(openalex - ipnizoo)))

Number of journals found via OpenAlex, not found via Wikidata subjects: 141
Number of journals found via OpenAlex, not found via IPNI or ZooBank ID: 93
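
The six pairwise comparisons above can also be generated in one loop. Below is a minimal sketch that uses hypothetical toy title sets in place of `ipnizoo`, `openalex` and `wikisubjects`. Note that comparing by title treats two different journals with the same title as one.

```python
from itertools import permutations

# Hypothetical toy sets standing in for the per-source title sets.
title_sets = {
    "IPNI or ZooBank ID": {"Journal A", "Journal B", "Journal C"},
    "OpenAlex taxonomy concept": {"Journal B", "Journal D"},
    "Wikidata taxonomic subject": {"Journal A", "Journal B"},
}

# For every ordered pair (a, b), count the titles found via a but not via b.
only_in = {
    (a, b): len(title_sets[a] - title_sets[b])
    for a, b in permutations(title_sets, 2)
}

for (a, b), count in only_in.items():
    print(f"Found via {a}, not found via {b}: {count}")
```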


In [27]:
# number of journals with an OpenAlex ID per source
journals[journals["openAlexID"].notna()]["source"].value_counts()

IPNI or ZooBank ID            1207
OpenAlex taxonomy concept      155
Wikidata taxonomic subject      68
Name: source, dtype: int64
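
Since the sources differ greatly in size, the share of journals with an OpenAlex ID may be easier to compare than the raw counts. Below is a minimal sketch on a hypothetical toy frame; on the real table, the same expression would use `journals` instead of `toy`.

```python
import pandas as pd

# Hypothetical toy frame with the two relevant columns.
toy = pd.DataFrame({
    "source": ["IPNI or ZooBank ID", "IPNI or ZooBank ID",
               "OpenAlex taxonomy concept", "Wikidata taxonomic subject"],
    "openAlexID": ["S1", None, "S2", "S3"],
})

# notna() is True where an ID is present; the mean of booleans is the share.
share_with_id = toy["openAlexID"].notna().groupby(toy["source"]).mean()
print(share_with_id)
```

With the counts shown above, this comes to 1207 of 2830 (about 43%) for IPNI or ZooBank ID, 68 of 81 (about 84%) for Wikidata taxonomic subject, and 155 of 155 for OpenAlex taxonomy concept.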

In [33]:
# number of journals per source that are not flagged as dissolved (dissolved is False or missing)
journals[(journals["dissolved"] == False) | journals["dissolved"].isna()]["source"].value_counts()

IPNI or ZooBank ID            2299
OpenAlex taxonomy concept      122
Wikidata taxonomic subject      78
Name: source, dtype: int64
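
The `dissolved` column has three states (`True`, `False`, missing), and the filter above merges the last two. A cross-tabulation shows them separately per source. Below is a minimal sketch on a hypothetical toy frame; the source names `A` and `B` and the labels are illustrative only.

```python
import pandas as pd

# Hypothetical toy frame: True, False and missing values in "dissolved".
toy = pd.DataFrame({
    "source": ["A", "A", "A", "B"],
    "dissolved": [True, False, None, None],
})

# Give missing values an explicit label so they get their own column.
status = toy["dissolved"].map({True: "True", False: "False"}).fillna("missing")

# Rows are sources, columns are the three states, cells are row counts.
table = pd.crosstab(toy["source"], status)
print(table)
```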