🕸 YALC: Yet Another LOD Cloud (registry of Linked Open Datasets).

YALC: Yet Another LOD Cloud

This repository contains configuration files for Linked Open Datasets that are published on the web. These datasets can be freely used at https://triplydb.com.

Get started

Go to https://triplydb.com and use the search bar to search for datasets.

Contribute

If your favorite Linked Dataset is not yet available at https://triplydb.com, you can add its configuration in a pull request or you can open a ‘Dataset request’ issue.

Repository structure

This repository contains the following directories:

/datasets
Contains one configuration file for each dataset.
/datasets/errors
Contains one configuration file for each dataset that cannot yet be uploaded because it contains errors.
/datasets/todo
Contains one configuration file for each dataset that cannot yet be uploaded because some functionality is still missing.
/datasets/too-little-info
Contains partial configuration files for datasets for which too little information is available at the moment.
/img
Contains images that are used in dataset and organization configurations.
/organizations
Contains one configuration file for each organization.
/rdf
Contains the RDF definitions that are used by the configuration files in this repository, and includes small RDF datasets that are part of the LOD Cloud but for which we could not find an online publication elsewhere.

Configuration format

The configuration files in YALC all follow a JSON configuration format. The following subsections document the format for dataset configuration files and the format for organization configuration files.

Dataset configuration format

The dataset configuration format is used for the files in the /datasets subdirectory. Each file contains a JSON object with the following keys:

"about"

Zero or more topics that characterize what the dataset is about.

Topic values must also appear in the topic hierarchy. Topic values are specified with their IRI local name. For example, "eGov" is used to denote the topic with IRI https://triplydb.com/Triply/yalc/id/topic/eGov.
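
The mapping from a topic local name to its IRI can be sketched as follows. This is a minimal illustration, not part of YALC itself: the function names are hypothetical, and accepting either a single string or a list for "about" is an assumption based on the example configuration further down.

```python
# Namespace taken from the example IRI in the text above.
TOPIC_NAMESPACE = "https://triplydb.com/Triply/yalc/id/topic/"


def topic_iri(local_name: str) -> str:
    """Expand a topic local name (e.g. "eGov") to its full IRI."""
    return TOPIC_NAMESPACE + local_name


def topic_iris(config: dict) -> list:
    """Return the topic IRIs of a dataset configuration.

    Assumption: "about" holds either one string or a list of strings.
    A missing key means the dataset has zero topics.
    """
    value = config.get("about", [])
    names = [value] if isinstance(value, str) else list(value)
    return [topic_iri(name) for name in names]
```

For example, topic_iri("eGov") yields https://triplydb.com/Triply/yalc/id/topic/eGov. Checking that a topic also appears in the topic hierarchy is not covered by this sketch.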

For vocabularies, this key must include the value "vocabulary".

This key is optional. If it is omitted, the dataset has zero topics.

"asset"

Links to binary files that are part of the dataset.

One example is documentation files (DOCX, PDF, ODT) that either occur in the dataset or that describe it. Another example is media files (images, sounds, videos) that occur in the dataset.

This key is optional.

"description"

The description of the dataset.

This must be at least 50 characters and at most 1,000 characters long.

This key is required.
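
A check for the length constraint could look like the sketch below. The function name and the error messages are hypothetical, and counting characters with Python's len() (Unicode code points) is an assumption, since the text does not say how characters are counted.

```python
MIN_LENGTH = 50
MAX_LENGTH = 1000


def description_errors(config: dict) -> list:
    """Return a list of problems with the "description" key (empty if fine)."""
    description = config.get("description")
    if not isinstance(description, str):
        return ['key "description" is required and must be a string']
    length = len(description)  # assumption: length in Unicode code points
    if length < MIN_LENGTH:
        return [f"description has {length} characters; at least {MIN_LENGTH} are required"]
    if length > MAX_LENGTH:
        return [f"description has {length} characters; at most {MAX_LENGTH} are allowed"]
    return []
```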

"diagram"

An image showing a diagrammatic overview of the dataset. This value must be the local name for a file in the /img subdirectory.

This key is optional.

"graph"

Specifies the named graph in which the content of the default graph is stored.

This key is optional.

If this key is not specified and exactly one prefix is specified, then the prefix IRI is used as the default graph name.
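
The fallback rule for the graph name can be sketched as below. The function name is hypothetical. The text does not say what happens when "graph" is omitted and zero or several prefixes are declared, so the sketch returns None in that case as a placeholder.

```python
from typing import Optional


def graph_name(config: dict) -> Optional[str]:
    """Return the graph name for a dataset configuration, if one can be determined."""
    if "graph" in config:
        return config["graph"]
    prefixes = config.get("prefix", {})
    if len(prefixes) == 1:
        # Exactly one prefix declaration: its IRI is used as the graph name.
        return next(iter(prefixes.values()))
    # Assumption: this case is not documented, so no graph name is derived.
    return None
```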

"homepage"

The URL of the web page that is the authoritative location on the Internet for human-readable information about the dataset.

Sometimes a dataset does not have its own web page. In such cases it is possible to specify a web page that describes or mentions the dataset.

This key is optional. If it is omitted, the dataset has no homepage.

"id"

Identifies the dataset and (optionally) its organization and version. The format for this value is "ORGANIZATION/DATASET@VERSION".

The values for ORGANIZATION and DATASET must be at least 2 and at most 40 characters long. They must consist of digits ([0-9]), letters ([A-Za-z]), and hyphens (-).

The value for ORGANIZATION must correspond with a file named ORGANIZATION.json in the /organizations subdirectory. See the organization configuration format section for more information.

The value of VERSION must follow one of the following formats:

VERSION format | VERSION_OBJECT format
MAJOR[.MINOR[.PATCH]] | {"@type": "SemanticVersion", "major": "MAJOR", "minor": "MINOR", "patch": "PATCH"}
YYYY | {"@type": "TemporalVersion", "year": "YYYY"}
YYYY-MM | {"@type": "TemporalVersion", "yearMonth": "YYYY-MM"}
YYYY-MM-DD | {"@type": "TemporalVersion", "date": "YYYY-MM-DD"}
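
The correspondence between the two formats can be sketched as a small conversion function. This is an illustration under two assumptions that the text does not settle: a bare four-digit value such as "2019" is treated as a year rather than as a major version, and omitted MINOR or PATCH parts are simply left out of the object. The function name is hypothetical.

```python
import re

DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")
YEAR_MONTH = re.compile(r"[0-9]{4}-[0-9]{2}")
YEAR = re.compile(r"[0-9]{4}")
SEMANTIC = re.compile(r"([0-9]+)(?:\.([0-9]+)(?:\.([0-9]+))?)?")


def version_object(version: str) -> dict:
    """Convert a VERSION string to the corresponding VERSION_OBJECT."""
    if DATE.fullmatch(version):
        return {"@type": "TemporalVersion", "date": version}
    if YEAR_MONTH.fullmatch(version):
        return {"@type": "TemporalVersion", "yearMonth": version}
    if YEAR.fullmatch(version):
        # Assumption: four digits denote a year, not a major version.
        return {"@type": "TemporalVersion", "year": version}
    match = SEMANTIC.fullmatch(version)
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    result = {"@type": "SemanticVersion"}
    for key, part in zip(("major", "minor", "patch"), match.groups()):
        if part is not None:  # assumption: omitted parts are not emitted
            result[key] = part
    return result
```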

If the organization that issues the dataset is not known, the ORGANIZATION/ prefix can be omitted. If omitted, the ‘none’ organization will be used.

If the version of the dataset is not known, the @VERSION suffix can be omitted. If omitted, "1.0.0" is used as the dataset version.

The DATASET part of this value is required.
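
The rules for the "id" value can be sketched as a small parser. The names DatasetId and parse_dataset_id are hypothetical, and the sketch only splits the value and checks the name constraints. It does not check the VERSION format or whether the organization file exists.

```python
import re
from typing import NamedTuple

# 2 to 40 characters: digits, letters and hyphens.
NAME = re.compile(r"[0-9A-Za-z-]{2,40}")


class DatasetId(NamedTuple):
    organization: str
    dataset: str
    version: str


def parse_dataset_id(value: str) -> DatasetId:
    """Split "ORGANIZATION/DATASET@VERSION" and apply the documented defaults."""
    rest, at, version = value.partition("@")
    if not at:
        version = "1.0.0"  # default when the @VERSION suffix is omitted
    elif not version:
        raise ValueError("empty version after '@'")
    organization, slash, dataset = rest.rpartition("/")
    if not slash:
        organization = "none"  # default when the ORGANIZATION/ prefix is omitted
    for label, name in (("organization", organization), ("dataset", dataset)):
        if not NAME.fullmatch(name):
            raise ValueError(f"invalid {label} name: {name!r}")
    return DatasetId(organization, dataset, version)
```

With this sketch, parse_dataset_id("w3c/owl@2.0") gives organization "w3c", dataset "owl" and version "2.0", while parse_dataset_id("owl") falls back to organization "none" and version "1.0.0".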

"image"

The logo or image for this dataset. This value must be the local name for a file in the /img subdirectory.

This key is optional. If it is omitted, the image of the dataset's organization (if any) is used.

"license"

The license of this dataset. The value must be one of the following:

  • https://creativecommons.org/licenses/by-nc/4.0/
  • https://creativecommons.org/licenses/by-sa/3.0/
  • https://creativecommons.org/licenses/by/1.0/
  • https://creativecommons.org/licenses/by/2.0/
  • https://creativecommons.org/licenses/by/2.5/
  • https://creativecommons.org/licenses/by/3.0/
  • https://creativecommons.org/licenses/by/4.0/
  • https://creativecommons.org/publicdomain/zero/1.0/
  • https://opendatacommons.org/licenses/by/1-0/
  • https://opendatacommons.org/licenses/odbl/1.0/
  • https://opendatacommons.org/licenses/pddl/1-0/
  • https://opensource.org/licenses/BSD-3-Clause

This key is optional. Since datasets in YALC must have a license, the value https://creativecommons.org/licenses/by/4.0/ is used in case this key is omitted.
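
The license rule amounts to a lookup with a default, as in the following sketch. The function name is hypothetical, and the set of allowed values is copied from the list above.

```python
ALLOWED_LICENSES = frozenset({
    "https://creativecommons.org/licenses/by-nc/4.0/",
    "https://creativecommons.org/licenses/by-sa/3.0/",
    "https://creativecommons.org/licenses/by/1.0/",
    "https://creativecommons.org/licenses/by/2.0/",
    "https://creativecommons.org/licenses/by/2.5/",
    "https://creativecommons.org/licenses/by/3.0/",
    "https://creativecommons.org/licenses/by/4.0/",
    "https://creativecommons.org/publicdomain/zero/1.0/",
    "https://opendatacommons.org/licenses/by/1-0/",
    "https://opendatacommons.org/licenses/odbl/1.0/",
    "https://opendatacommons.org/licenses/pddl/1-0/",
    "https://opensource.org/licenses/BSD-3-Clause",
})
DEFAULT_LICENSE = "https://creativecommons.org/licenses/by/4.0/"


def resolve_license(config: dict) -> str:
    """Return the license IRI of a dataset, applying the documented default."""
    license_iri = config.get("license", DEFAULT_LICENSE)
    if license_iri not in ALLOWED_LICENSES:
        raise ValueError(f"unsupported license: {license_iri!r}")
    return license_iri
```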

"name"

The display name of the dataset.

This key is optional.

"namespace"

Zero or more IRIs that denote namespaces for this dataset.

IRIs within these specified namespaces will have this dataset as their authority. Ideally, every IRI has an authoritative dataset to which it belongs. An IRI can have at most one authoritative dataset to which it belongs.

This key is optional. If it is omitted and the "prefix" key is present, then the IRIs that appear in the value of the "prefix" key are used as the namespaces.

"prefix"

A JSON object containing RDF prefix declarations. The keys of this JSON object are aliases that can be used to denote their corresponding IRI values.

This key is optional.

"successor"

Allows an outdated version of a dataset to point to its successor version.

The value notation is identical to the notation that is used for the "id" key.

"url"

Zero or more URLs from which an RDF serialization of the dataset can be downloaded.

This key may be omitted if one or more namespaces are specified (see the documentation of the "namespace" key for more details).

The following example shows the full dataset configuration for file datasets/owl@2.0.json. Notice the following details:

  • The value for key "id" is "w3c/owl@2.0", whose prefix corresponds to the organization configuration file organizations/w3c.json.
  • The value for key "image" is "owl.png", which corresponds to file img/owl.png.
  • While the "namespace" key is not specified, its value is implicitly set to "http://www.w3.org/2002/07/owl#", which is specified in the "prefix" key.
  • While the "url" key is not specified, its value is implicitly set to "http://www.w3.org/2002/07/owl#", because this is the (implicit) value of the "namespace" key.
{
  "about": "vocabulary",
  "description": "This ontology partially describes the built-in classes and properties that together form the basis of the RDF/XML syntax of OWL 2.  The content of this ontology is based on Tables 6.1 and 6.2 in Section 6.4 of the OWL 2 RDF-Based Semantics specification, available at <http://www.w3.org/TR/owl2-rdf-based-semantics/>.\n\nPlease note that those tables do not include the different annotations (labels, comments and `rdfs:isDefinedBy` links) used in this file.  Also note that the descriptions provided in this ontology do not provide a complete and correct formal description of either the syntax or the semantics of the introduced terms (please see the OWL 2 recommendations for the complete and normative specifications).\n\nFurthermore, the information provided by this ontology may be misleading if not used with care. This ontology SHOULD NOT be imported into OWL ontologies. Importing this file into an OWL 2 DL ontology will cause it to become an OWL 2 Full ontology and may have other, unexpected, consequences.",
  "homepage": "https://www.w3.org/TR/owl2-overview",
  "id": "w3c/owl@2.0",
  "image": "owl.png",
  "name": "Web Ontology Language (OWL)",
  "prefix": {
    "owl": "http://www.w3.org/2002/07/owl#"
  }
}
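
The implicit values noted for the example above can be derived with a sketch like the following. The function names are hypothetical, and accepting either a single string or a list for "namespace" and "url" is an assumption. The dictionary at the end repeats only the relevant keys of the OWL example.

```python
def as_list(value) -> list:
    """Accept either a single string or a list of strings (an assumption)."""
    return [value] if isinstance(value, str) else list(value)


def resolve_namespaces(config: dict) -> list:
    """Use "namespace" if present, else the IRIs declared under "prefix"."""
    if "namespace" in config:
        return as_list(config["namespace"])
    return list(config.get("prefix", {}).values())


def resolve_urls(config: dict) -> list:
    """Use "url" if present, else fall back to the (possibly implicit) namespaces."""
    if "url" in config:
        return as_list(config["url"])
    return resolve_namespaces(config)


# Relevant keys of the OWL example configuration shown above.
owl = {"id": "w3c/owl@2.0", "prefix": {"owl": "http://www.w3.org/2002/07/owl#"}}
```

For the owl dictionary, both resolve_namespaces(owl) and resolve_urls(owl) return a list holding http://www.w3.org/2002/07/owl#.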

Organization configuration format

The organization configuration format is used for files in the /organizations subdirectory. Each file contains a JSON object with the following keys:

"description"

The description of the organization.

This must be at least 50 characters and at most 1,000 characters long.

This key is required.

"homepage"

The URL of the main web page for human-readable information about the organization.

This key is optional. If it is omitted, the organization has no homepage.

"id"

Identifies the organization.

The value must be at least 2 and at most 40 characters long. It must consist of digits ([0-9]), letters ([A-Za-z]), and hyphens (-).

"image"

The logo or image for the organization. This value must be the local name for a file in the /img subdirectory.

This key is optional. If it is omitted, the image img/rdf.png is used.
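
Together with the "image" key of the dataset format, this gives a chain of fallbacks that can be sketched as below. The function names are hypothetical. It is an assumption that the rdf.png default also applies when a dataset falls back to its organization, and that a dataset without an organization configuration gets no image.

```python
from typing import Optional

# Local name in the /img subdirectory, used when an organization has no "image".
DEFAULT_ORGANIZATION_IMAGE = "rdf.png"


def organization_image(organization: dict) -> str:
    """Return the image local name for an organization configuration."""
    return organization.get("image", DEFAULT_ORGANIZATION_IMAGE)


def dataset_image(dataset: dict, organization: Optional[dict]) -> Optional[str]:
    """Return the image local name for a dataset, falling back to its organization."""
    if "image" in dataset:
        return dataset["image"]
    if organization is not None:
        return organization_image(organization)
    return None  # assumption: no organization configuration, so no image
```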

"name"

The display name of the organization.

This key is optional.

The following example shows the full organization configuration file organizations/w3c.json:

{
  "description": "The World Wide Web Consortium (W3C) is an international community where Member organizations, a full-time staff, and the public work together to develop Web standards.  Led by Web inventor and Director Tim Berners-Lee and CEO Jeffrey Jaffe, W3C's mission is to lead the Web to its full potential.  Contact W3C for more information.",
  "homepage": "https://www.w3.org",
  "id": "w3c",
  "image": "w3c.png",
  "name": "World Wide Web Consortium (W3C)"
}

Semantic definitions

The configuration files in this repository can themselves be processed as Linked Data. This is achieved by the following definition files:

/rdf/yalc.jsonld
Configuration files can be processed as RDF by including the context stored in this file.
/rdf/yalc.trig
Definitions for the classes and properties that are used in the configuration files.
/rdf/topics.jsonld
Topic hierarchy that is used to tag datasets.
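
Including the context can be sketched with the standard library alone. This sketch assumes that /rdf/yalc.jsonld is a JSON document with a top-level "@context" key, as is usual for JSON-LD context documents. The function names are hypothetical, and turning the result into RDF triples would still require a JSON-LD processor, which is not shown here.

```python
import json


def with_context(config: dict, context_document: dict) -> dict:
    """Return a copy of a configuration with the JSON-LD context embedded."""
    # Assumption: the context document has a top-level "@context" key.
    return {"@context": context_document["@context"], **config}


def load_as_jsonld(config_path: str, context_path: str) -> dict:
    """Read a configuration file and a context file, and combine them."""
    with open(config_path, encoding="utf-8") as config_file:
        config = json.load(config_file)
    with open(context_path, encoding="utf-8") as context_file:
        context_document = json.load(context_file)
    return with_context(config, context_document)
```

A call such as load_as_jsonld("datasets/owl@2.0.json", "rdf/yalc.jsonld") would then produce a JSON-LD document that a JSON-LD processor can interpret.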

The Linked Data version of the configuration files is itself published over here.

Pull request details

Pull requests can be created to add new datasets, or to improve existing datasets.

Pull request for a new dataset

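To add a new Linked Open Dataset to this repository, create a pull request that includes at least the following:

  • A dataset configuration file in the /datasets directory.

If the organization that is specified in the dataset file does not yet have an organization file, it must be included as well:

  • An organization configuration file in the /organizations directory.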

If the dataset and/or organization file specifies an image that does not yet belong to the /img subdirectory, then it must be added as well:

  • A dataset image file and/or an organization image file in the /img directory. The following image formats can be used: JPEG, PNG, SVG. SVG images are preferred, since they are smaller in size and do not suffer from resolution issues.

Pull request for an existing dataset

Feel free to improve the configurations for existing datasets.
