Check consistency between SKBL and Wikidata for ISNI

salgo60/SKBLWikidata

KARP - Wikidata

KARP is a database that contains the profiles from SKBL (The Biographical Dictionary of Swedish Women). Wikidata is already connected to all records in SKBL through property P4963 (see blog).

This task checks the consistency of ISNI data between SKBL and Wikidata (WD). For comparison, see how much simpler the same kind of check is with SPARQL federated search: WD <-> Nobelprize using Federated search.

KARP

KARP <-> Wikidata

KARP <-> Wikidata sources e.g. Riksarkivet
  • The next step is to also define sources with unique identifiers so that they are machine-readable and can be compared
    • a small test with more than 300 SKBL profiles with sources in Riksarkivet Wikidata as Wikidata reference for facts

Using what Wikidata knows about

Wikidata is connected to more than 3000 external sources; see list of SKBL profiles. Below are some examples of how that can be used to find candidates for new profiles.

KARP <-> Wikidata <-> Digital Museums

As we move in a more digital direction, some museums have also started making collections available online on sites like Digitalmuseum.org. In Wikidata we store an identifier called Kulturnav, and if the museums that upload data to Digitalmuseum do it in the "correct" way, we can easily find collections for a person.

  • Example
    • SKBL LeaAhlborn = kulturnav Ahlborn, Lea (1826 - 1897) = Digital museum ahlborn-lea-1826-1897
    • List ordered by how many articles Wikipedia has about the person in different languages. My feeling is that it is somewhat chaotic whether museums link a person or not. I guess some education is needed, along with better user interfaces and a better understanding of the possibilities of the digital landscape.

KARP <-> Wikidata <-> Signaturer.se P5316

KARP <-> Wikidata <-> Nationalmuseum P2538

KARP <-> Wikidata <-> Spotify P1902

KARP <-> Wikidata <-> Svenskt Översättarlexikon P5147

KARP <-> Wikidata <-> SOK P2323

KARP <-> Wikidata <-> Litteraturbanken P5101

Status in Wikidata: Litteraturbanken P5101 needs some cleaning.

  • Example
    • SKBL SelmaLagerlof = Litteraturbanken LagerlofS
    • List
      • List of deceased women who are in Litteraturbanken but NOT in SKBL, ordered by number of articles in Wikipedia

KARP <-> Wikidata <-> Nobel Prize Nomination P3360

  • Example
    • SKBL SelmaLagerlof = Nobel Prize Nomination 5152
    • List
      • List of deceased women with a Nobel Prize Nomination in Wikidata but NOT in SKBL, ordered by number of articles in Wikipedia
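The "in Wikidata but NOT in SKBL" lists above all follow the same pattern, so one query template can serve every external identifier. The sketch below is an illustration, not the query behind the linked lists: the function names are hypothetical, and it uses `wikibase:sitelinks` (all sitelinks, not only Wikipedia articles) as an approximation for "number of articles in Wikipedia".

```python
import re
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def candidates_query(property_id, limit=100):
    """Build a SPARQL query for deceased women who have `property_id`
    but no SKBL id (P4963), ordered by number of sitelinks.

    Hypothetical helper for illustration only.
    """
    if not re.fullmatch(r"P\d+", property_id):
        raise ValueError(f"not a Wikidata property id: {property_id!r}")
    return f"""SELECT DISTINCT ?item ?itemLabel ?sitelinks WHERE {{
  ?item wdt:{property_id} ?externalId ;
        wdt:P21 wd:Q6581072 ;          # sex or gender: female
        wdt:P570 ?dateOfDeath ;        # has a date of death
        wikibase:sitelinks ?sitelinks .
  FILTER NOT EXISTS {{ ?item wdt:P4963 ?skbl }}   # not yet in SKBL
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "sv,en" . }}
}}
ORDER BY DESC(?sitelinks)
LIMIT {int(limit)}
"""


def query_url(query):
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    # P3360 = Nobel Prize Nomination, P5101 = Litteraturbanken (from this README).
    print(candidates_query("P3360", limit=20))
```

Swapping `"P3360"` for `"P5101"`, `"P2538"` or any of the other properties listed above gives the corresponding candidate list.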

The program

To use it, add your Wikidata login to user-config.py

The program

  1. Gets all records from Karp SKBL
    1. tries to decode the ISNI (get_isni)
  2. Gets the Wikidata record and the ISNI in Wikidata (get_wikidata)
  3. Logs potential mismatches to the /log file
  4. Writes a CSV file, skbl.csv, with the following columns
    1. Karp.url = Wikidata P4963 e.g. MargitAbenius
    2. Karp.id
    3. SKBL ISNI (format 9999 9999 9999 9999)
    4. Wikidata Q number e.g. Q4933592
    5. Wikidata ISNI

Wikidata actions

  1. Updated wrong ISNI values and added missing ones
  2. Oddities found in SKBL
    1. AVmr017w7hONWjeN9oCA: the ISNI number also contains the text "ISNI"
    2. See the section below, Action SKBL; see also task T219706

Lesson learned

  1. Writing software that has to parse JSON and handle odd implementations, such as KARP ISNI where 00000000 means no ISNI was found, takes much more time than having a SPARQL endpoint and using federated search; see Task T200668 and Nobelprize.org
  2. Starting to compare and share data will improve quality for everyone and make life easier for the next person who uses this data
  3. Starting to use linked data for people and places instead of strings will be a game changer; see Europeana Linked Open Data - What is it? and Task T218782
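To illustrate the first lesson, the Wikidata half of the comparison can be fetched with a single SPARQL query instead of one lookup per profile. The sketch below only covers the Wikidata side (SKBL does not offer a SPARQL endpoint according to this README), the helper name is hypothetical, and the sample response is hand-written in the standard SPARQL JSON results shape rather than real query output.

```python
# One query returns every item with an SKBL id (P4963) and, if present,
# its ISNI (P213).
ISNI_QUERY = """
SELECT ?item ?skbl ?isni WHERE {
  ?item wdt:P4963 ?skbl .
  OPTIONAL { ?item wdt:P213 ?isni }
}
"""


def isni_by_skbl_id(sparql_json):
    """Index SPARQL JSON results by SKBL id.

    Returns {skbl_id: {"qid": "Q...", "isni": [values...]}}; the list is
    empty when the item has no ISNI and has several entries when it has
    more than one.
    """
    index = {}
    for binding in sparql_json["results"]["bindings"]:
        skbl_id = binding["skbl"]["value"]
        # Entity URIs end in the Q number.
        qid = binding["item"]["value"].rsplit("/", 1)[-1]
        entry = index.setdefault(skbl_id, {"qid": qid, "isni": []})
        if "isni" in binding:
            entry["isni"].append(binding["isni"]["value"])
    return index


if __name__ == "__main__":
    # Hand-written sample; "ExampleProfile" is a placeholder SKBL id.
    sample = {
        "results": {
            "bindings": [
                {
                    "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q4976841"},
                    "skbl": {"type": "literal", "value": "ExampleProfile"},
                    "isni": {"type": "literal", "value": "0000 0000 5196 8258"},
                }
            ]
        }
    }
    print(isni_by_skbl_id(sample))
```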

Next step

  1. Compare more data between SKBL and Wikidata
  2. Start a dialogue about what is and is not a good implementation of LOD
  3. Find better patterns. I feel a SPARQL endpoint is easy, and in Wikidata we have out-of-the-box support for Listeria lists (see video in Swedish) and for how we compare Wikidata and Nobelprize.org every night - Task T200668
  4. Also build tighter collaboration between different projects
    1. SKBL writes about Lilli Zickerman and Nordiska has new material. My feeling is that we should help each other be more visible and not be afraid of linking (see tweet), and also start thinking about what can attract Wikipedia, as it has a lot of visitors (in 2019, profiles related to SKBL had 13 481 viewers per day)
    2. In Wikidata we document literary signs in Stockholm (see link). Maybe this could be input to SKBL on which people should be documented next, and planning of new signs could also look at how they relate to SKBL/Litteraturbanken
    3. See also the blog post on how Wikidata can be used to get some interesting statistics
      1. example: a search in Wikidata for deceased Swedish women who have the most articles and are not part of SKBL - a video in Swedish

Action SKBL see also task T219706

  1. Mismatches that may need action at SKBL, Task T219706
    1. http://www.wikidata.org/wiki/Q4954113
      1. SKBL: https://skbl.se/sv/artikel/AV44Ec_tDqWJ2eBq92nx.json
      2. WD ISNI: 0000 0000 5186 9858 SKBL ISNI:
    2. http://www.wikidata.org/wiki/Q16595323
      1. SKBL: https://skbl.se/sv/artikel/AWBPP4z-WxymAE2pjnvP.json
      2. WD ISNI: 0000 0000 1701 7221 SKBL ISNI:
    3. http://www.wikidata.org/wiki/Q4956130
      1. SKBL: https://skbl.se/sv/artikel/AVvxOGQx6HF5HgpJZ2w9.json
      2. WD ISNI: 0000 0003 6825 4210 SKBL ISNI:
    4. http://www.wikidata.org/wiki/Q50804907
      1. SKBL: https://skbl.se/sv/artikel/AV_9nDubjP4rR07fm2xo.json
      2. WD ISNI: 0000 0003 8182 3780 SKBL ISNI:
    5. http://www.wikidata.org/wiki/Q50803502
      1. SKBL: https://skbl.se/sv/artikel/AVuf7Lxgi-B3PHNCX9VD.json
      2. WD ISNI: 0000 0000 3262 2657 SKBL ISNI:
    6. http://www.wikidata.org/wiki/Q4970779
      1. SKBL: https://skbl.se/sv/artikel/AWEDRVoDWxymAE2pjnyo.json
      2. WD ISNI: 0000 0001 0463 2783 SKBL ISNI:
    7. http://www.wikidata.org/wiki/Q50846445
      1. SKBL: https://skbl.se/sv/artikel/AVyMLsea6HF5HgpJZ2y9.json
      2. WD ISNI: 0000 0003 7794 8716 SKBL ISNI:
    8. http://www.wikidata.org/wiki/Q16596123
      1. SKBL: https://skbl.se/sv/artikel/AWFv7UtfWxymAE2pjn2w.json
      2. WD ISNI: 0000 0004 3982 9828 SKBL ISNI:
    9. http://www.wikidata.org/wiki/Q4974163
      1. SKBL: https://skbl.se/sv/artikel/AVmr017w7hONWjeN9oCA.json
      2. WD ISNI: 0000 0000 5303 1792 SKBL ISNI:
    10. http://www.wikidata.org/wiki/Q13563726
      1. SKBL: https://skbl.se/sv/artikel/AWFrNzKKWxymAE2pjn2Q.json
      2. WD ISNI: 0000 0000 5184 7851 SKBL ISNI:
    11. http://www.wikidata.org/wiki/Q4976815
      1. SKBL: https://skbl.se/sv/artikel/AWEDkKK1jP4rR07fm24c.json
      2. WD ISNI: 0000 0000 1218 5656 SKBL ISNI:
    12. http://www.wikidata.org/wiki/Q4976841
      1. SKBL: https://skbl.se/sv/artikel/AV9sGgqZWxymAE2pjnoD.json
      2. WD ISNI: 0000 0000 5196 8258 SKBL ISNI: 0000000137251661
      3. It looks like she has two ISNIs
    13. http://www.wikidata.org/wiki/Q4979129
      1. SKBL: https://skbl.se/sv/artikel/AWED1FSwWxymAE2pjnyw.json
      2. WD ISNI: 0000 0001 0915 0425 SKBL ISNI:
    14. http://www.wikidata.org/wiki/Q4990100
      1. SKBL: https://skbl.se/sv/artikel/AV3p525ZDqWJ2eBq92mK.json
      2. WD ISNI: 0000 0001 0966 715X SKBL ISNI:

Binder
