Workshop OpenRefine & Wikimedia Commons

OpenRefine is a well-known tool for editing, enriching and manipulating data. It is widely used within the Wikimedia community to add data to Wikidata. As of version 3.7, you can also use it to upload images to Wikimedia Commons, enriched with structured data. In this workshop you will learn step by step how to do that.

Learning objectives

In this workshop you will learn

  • how to use OpenRefine to upload new images with regular file descriptions (Wikitext) and structured data to Wikimedia Commons, and
  • how to add structured data to existing Commons files, using OpenRefine.

Target audience

This workshop is suitable for people who

  • have used OpenRefine before to add data to Wikidata,
  • know their way around OpenRefine v 3.4, 3.5 or 3.6, and
  • know what "reconciling against Wikidata" means,

but who do not yet know how to use OpenRefine to add images and structured data to Wikimedia Commons.

This workshop is therefore not suitable for people who have never worked with OpenRefine and/or Wikidata.

Required preparation

  • Please bring your own laptop to this workshop.
  • Make sure you know your way around OpenRefine v 3.4, 3.5 or 3.6, and are comfortable with reconciling and uploading data to Wikidata.
  • Install OpenRefine 3.7 SNAPSHOT on your machine. It can be downloaded at https://github.com/OpenRefine/OpenRefine#snapshot-releases. Please note:
    • This is an unstable release. Functionality can change from one snapshot to the next and may or may not work, depending on the exact snapshot release you have installed. Don't be surprised if something that worked yesterday or today stops working tomorrow. Depending on the maturity of the snapshot releases, you might not be able to upload your images to Wikimedia Commons at all. This workshop is more about learning the logic and the required steps than about actually uploading the images.
    • Version 3.6.2 (or older) is not suitable, because you cannot upload files to Wikimedia Commons with it.
  • Download the zipped OpenRefine-WikimediaCommons-Workshop repo and unzip it to a folder on your machine. This folder contains both your pre-built OpenRefine project archive (as a tar.gz file) and your raw working materials: 18 local images, an Excel file with data about the images, an OpenRefine schema and this explanation (README.md).
  • If you have time: check out OpenRefine 3.7+ – How to upload new files to Wikimedia Commons

Working materials

1) OpenRefine

2) Raw materials

If you want to build up the OpenRefine project from scratch, you can use these raw source materials

  • Online images: We are going to upload the 18 images from Nederlandsche havengezichten enz. to Commons. These images can be requested directly via http://resolver.kb.nl/resolve?urn=urn:gvn:KONB16:533939704&role=page&count=4&size=large, where the count parameter selects the image (count=1, count=2, count=3 ... count=18).

    Please note that the domain *.kb.nl has been whitelisted, so Wikimedia Commons accepts uploads from resolver.kb.nl.

  • Local images: This page holds 18 individual images, which have been downloaded into the images folder in this repo. These are only relevant if you want to upload the local images rather than upload from the URLs.

  • Excel file: All necessary data for our uploads to Commons is contained in this Excel file. It will be used as input for creating our OpenRefine project during the workshop.

    This Excel file lets you choose whether to upload the files to Commons from the local images folder or from the URLs above.
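
As a rough illustration of the URL pattern described under "Online images", the 18 image URLs can be generated by varying only the count parameter. This is a minimal sketch and not part of the workshop materials; the names BASE_URL, URN, IMAGE_COUNT and image_url are made up for this example, and the URL template is copied from the example URL above.

```python
# Sketch: build the 18 resolver URLs by varying only the `count` parameter.
# BASE_URL, URN, IMAGE_COUNT and image_url are illustrative names (assumptions),
# not identifiers used anywhere in the workshop repo.
BASE_URL = "http://resolver.kb.nl/resolve"
URN = "urn:gvn:KONB16:533939704"
IMAGE_COUNT = 18


def image_url(count: int) -> str:
    """Return the resolver URL for image number `count` (1-based)."""
    return f"{BASE_URL}?urn={URN}&role=page&count={count}&size=large"


# count runs from 1 up to and including 18
image_urls = [image_url(n) for n in range(1, IMAGE_COUNT + 1)]

for url in image_urls:
    print(url)
```

A list like this could, for example, be pasted into a spreadsheet column when building the OpenRefine project from scratch.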

3) Example outputs

4) Workshop guidance & outline

  • PDF slides: The outline, explanations, and tips and tricks that will be demonstrated during the workshop are available in this PDF presentation (in Dutch). You can also use it as guidance if you want to do this workshop by yourself.

    The PDF is also available on Wikimedia Commons and Zenodo.

Workshop leader

This workshop is given by Olaf Janssen, the Wikimedia coordinator of the Koninklijke Bibliotheek, the national library of the Netherlands. In this role he stimulates and facilitates collaboration between the collections, knowledge, open data and staff of the KB on the one hand, and the projects of the Wikimedia movement, such as Wikipedia, Wikimedia Commons, Wikidata and Wikibase, on the other. He is also active as a volunteer within the community.

Feel free to contact Olaf via olaf.janssen(at)kb.nl

Workshop instances

This workshop was given during

Licensing

All workshop materials are released into the public domain under the Creative Commons Zero v1.0 Universal license and can therefore be reused freely and openly. Attribution is not required, but still appreciated.

See also

Latest updates

This page was last updated on 14 December 2022.
