Dataset Classification: Orgs Types Groups Tags

Ian Ward edited this page Dec 9, 2013 · 8 revisions

Classification Methods

CKAN has an abundance of methods to classify datasets. This wiki page attempts to highlight the differences between each method. Maybe one day this page will grow into a proper part of the CKAN docs.


organization

Organizations are for people who are publishing data. They control who can view, edit, create and publish datasets.

Organizations are one of the default facets in the dataset search pages. Also the organization dataset search at /organization/{name} shows all dataset types for the given organization.

dataset type

Dataset types are for when you want to have different types of datasets that have different schemas.

URLs are automatically added for searching the different dataset types. /dataset is the default type and search page, showing only "dataset"-type datasets. A new dataset type, e.g. "application", would automatically be given the search page /application, showing only that type of dataset.
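The URL rule above can be sketched as a tiny helper. This is only an illustration of the mapping from type name to search page; `search_path` is a hypothetical name, not a CKAN function.

```python
def search_path(dataset_type="dataset"):
    """Return the search page path for a dataset type.

    Hypothetical helper illustrating the rule described above;
    it is not part of CKAN itself.
    """
    return "/" + dataset_type


print(search_path())               # /dataset
print(search_path("application"))  # /application
```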


group

Groups are for people who are consuming datasets. Use groups when you want to collect datasets together under a theme, e.g. climate data. You don't want just anyone to be able to add datasets to your carefully curated climate group, so only users who are members of the group are allowed to add datasets to it or remove them from it.

Unlike with organizations, being a member of a group doesn't give you permission to create or edit the datasets in the group. Groups are about collecting existing datasets together; they're not about publishing datasets.

Groups are one of the default facets in the dataset search pages. Also the group dataset search at /group/{name} shows all dataset types for the given group.
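The same group and organization filters can probably also be expressed through the action API. The sketch below only builds a `package_search` URL and does not send it. It assumes the search index exposes the fields `organization`, `groups` and `tags`, and that `facet.field` takes a JSON-encoded list; the site URL and the helper name `facet_search_url` are made up for illustration.

```python
import json
from urllib.parse import urlencode


def facet_search_url(site_url, field, value):
    """Build a package_search URL filtered on one facet field.

    Assumption: 'organization', 'groups' and 'tags' are valid index
    fields for the fq filter on the target CKAN site.
    """
    params = {
        # Restrict results to datasets with this facet value.
        "fq": '{}:"{}"'.format(field, value),
        # Ask for counts on the three facets discussed on this page.
        "facet.field": json.dumps(["organization", "groups", "tags"]),
        "rows": 10,
    }
    base = site_url.rstrip("/") + "/api/3/action/package_search"
    return base + "?" + urlencode(params)


# Hypothetical site and group name.
print(facet_search_url("https://ckan.example.org", "groups", "climate"))
```

Passing `"organization"` as the field instead of `"groups"` would give the organization equivalent.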

Different group types are supported by writing a CKAN extension. This extension will need to provide a controller or some other way of actually creating non-default group types.

tag vocabulary

Tag vocabularies are just for when you want to add a custom field to the dataset schema, e.g. "Genre", and you want that field to be a drop-down list with a fixed number of possible values. There's an API for adding and removing from the list of possible values.
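As a rough sketch of that API, the snippet below prepares (but does not send) requests for the `vocabulary_create`, `tag_create` and `tag_delete` actions. The site URL, API key, vocabulary name "genres" and the helper `action_request` are all placeholders; the vocabulary id would normally come from the `vocabulary_create` response.

```python
import json
from urllib.request import Request


def action_request(site_url, action, payload, api_key):
    """Prepare a POST request for a CKAN action API call (not sent here)."""
    url = "{}/api/3/action/{}".format(site_url.rstrip("/"), action)
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key,
                 "Content-Type": "application/json"},
    )


SITE = "https://ckan.example.org"   # hypothetical site
KEY = "my-api-key"                  # placeholder API key
VOCAB_ID = "id-returned-by-vocabulary_create"  # placeholder, not a real id

# Create the vocabulary with its initial list of allowed values.
create_vocab = action_request(
    SITE, "vocabulary_create",
    {"name": "genres", "tags": [{"name": "jazz"}, {"name": "blues"}]}, KEY)

# Add one more allowed value to the vocabulary.
add_value = action_request(
    SITE, "tag_create", {"name": "folk", "vocabulary_id": VOCAB_ID}, KEY)

# Remove an allowed value from the vocabulary.
remove_value = action_request(
    SITE, "tag_delete", {"id": "blues", "vocabulary_id": VOCAB_ID}, KEY)

for req in (create_vocab, add_value, remove_value):
    print(req.get_method(), req.full_url)
```

Sending any of these would be a matter of passing the request to `urllib.request.urlopen` against a real site with a valid key.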

Tag vocabularies must be added to the facets by a CKAN extension. There is no default interface for showing only datasets with a tag from a tag vocabulary.
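A minimal sketch of what such an extension might do, assuming it implements the `IFacets` plugin interface and that vocabulary tags are indexed under a field named `vocab_<vocabulary name>`. The function below is the kind of logic a `dataset_facets(facets_dict, package_type)` method could delegate to; the helper name, the "genres" vocabulary and the sample facet list are illustrative only.

```python
from collections import OrderedDict


def add_vocabulary_facet(facets_dict, vocabulary_name, label):
    """Add a facet for one tag vocabulary and return the facet mapping.

    Assumption: vocabulary tags are indexed under
    'vocab_<vocabulary name>' in the search index.
    """
    facets_dict["vocab_" + vocabulary_name] = label
    return facets_dict


# Illustrative stand-in for the facets passed to the plugin
# (not necessarily the complete default list).
facets = OrderedDict([
    ("organization", "Organizations"),
    ("groups", "Groups"),
    ("tags", "Tags"),
])

add_vocabulary_facet(facets, "genres", "Genre")
print(list(facets))  # ['organization', 'groups', 'tags', 'vocab_genres']
```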


tag

Tags are completely free-form.

Tags are one of the default facets in the dataset search pages.

Method Comparison

Dataset classification method comparison
| method | effect | number/dataset | types/dataset | can change | extra info | custom plugin req'd |
|---|---|---|---|---|---|---|
| organization | dataset editing permissions | 0 or 1 | 1 | sysadmin only? | yes | no |
| dataset type | dataset schema | 1 | 1 | sysadmin only? | no | yes |
| group | none | 0+ | 1+* | group editors | yes | *yes for >1 type |
| tag vocabulary | none | 0+ | 1+ | dataset editors | no | yes |
| tag | none | 0+ | 1 | dataset editors | no | no |

  • method: the dataset classification method chosen
  • effect: effects of the options selected
  • number/dataset: the number of options allowed per dataset
  • types/dataset: the number of groups of options available
  • can change: who may modify options for a dataset
  • extra info: extra information may be stored with each option
  • custom plugin req'd: custom CKAN extension code required