Toby Steiner edited this page Dec 22, 2023 · 389 revisions
Thoth

The Thoth User Manual, for publishers and other creators of metadata records in Thoth, can be found here.

The wiki below provides an overview of Thoth's approach to Data and Metadata and its interactions with the Open Access Book Supply Chain. It also provides an overview of the Thoth Archiving Network.

Data and Metadata

In the digital realm, a Work usually consists of two constituent parts: data and metadata. The data comprise the content of the publication: the information it contains, aimed at human readers, machine readers, or both. The metadata comprise all the data about the publication, such as its author, title, and subject classification.

Metadata are frequently also part of the Work. For example, the title and author are often mentioned on the opening pages, and the ISBNs are usually listed in the colophon. Despite this partial overlap, it is useful to distinguish between data and metadata, as they are handled differently in the Open Access book supply chain. There are several international Metadata Standards setting baseline quality criteria for metadata.

There are specific digital Data Formats and Metadata Formats that are supported by Thoth. An important subset of metadata is formed by Persistent Identifiers.

Open Access Book Supply Chain

Thoth operates at several levels in the Open Access book supply chain. We employ here the categorization of key stakeholders and intermediaries proposed in Michael Clarke and Laura Ricci's 2021 report OA Books Supply Chain Mapping (Clarke & Ricci 2021).

Content Funders

Funders

Work records in Thoth allow Content Creators to add information about Funding by referencing an Institution by means of Persistent Identifiers as well as further grant program and project information. Content Funders are able to harvest these data through one of the Thoth Metadata Formats or our Open API.
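As a rough illustration of the record structure described above, the sketch below models a Work whose Funding entries reference an Institution by Persistent Identifier and carry optional grant information. All class and field names are assumptions made for this example and do not reproduce Thoth's actual schema; the ROR value is a placeholder, not a real identifier.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Institution:
    # Hypothetical fields: a display name plus optional persistent identifiers.
    name: str
    ror: Optional[str] = None  # placeholder for a ROR ID
    doi: Optional[str] = None  # placeholder for a funder DOI


@dataclass
class Funding:
    # Links a Work to the Institution that funded it, with optional grant details.
    institution: Institution
    program: Optional[str] = None
    project_name: Optional[str] = None
    grant_number: Optional[str] = None


@dataclass
class Work:
    title: str
    doi: Optional[str] = None
    fundings: List[Funding] = field(default_factory=list)


def works_funded_by(works, ror):
    """Return the works with at least one funding entry whose institution has the given ROR."""
    return [w for w in works if any(f.institution.ror == ror for f in w.fundings)]


# Illustrative data only.
funder = Institution(name="Example Funder", ror="https://ror.org/00example")
works = [
    Work(
        title="Funded Book",
        fundings=[Funding(funder, program="Example Programme", grant_number="EX-001")],
    ),
    Work(title="Unfunded Book"),
]

funded_titles = [w.title for w in works_funded_by(works, "https://ror.org/00example")]
print(funded_titles)
```

Matching on the institution's identifier rather than its name is the point of the sketch: a funder harvesting records can find its own funded titles even when institution names are spelled inconsistently.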

Libraries

Libraries, both University Libraries and National Libraries, have become increasingly important Content Funders in the OA Book Supply Chain. Thoth is partially funded by library subscriptions through the Open Book Collective. In return, Thoth provides high-quality metadata in a range of Metadata Formats, including MARC 21, that libraries can ingest into their Library Management Systems.

Additionally, Thoth is working with University Libraries in the context of the Thoth Archiving Network.

Content Creators

Publishers

Thoth is primarily designed as a platform for Content Creators, in particular Open Access Publishers. Thoth provides integrated services for the maintenance, management, and dissemination of metadata records in a wide variety of Metadata Formats to a large selection of Content Platforms, Ebook Distributors, and Catalogs and Indices.

Publishers may use one of the available commercial Title Management Platforms or Publishing Platforms, which allow authors, editors, and publishers to collaborate in a digital, in-browser environment. Thoth is currently collaborating with Open Monograph Press and PubPub to improve integration with their in-platform metadata management functionalities.

Whereas this wiki focuses mainly on the digital OA book supply chain, many OA publishers also publish print books via one of the commercial Print Book Distributors.

Authors

Individual authors are not a targeted user group of Thoth. They may manage their private bibliographic metadata on one of the available commercial or open source Bibliographic Reference Management Platforms and upload their research directly to one of the Green OA Repositories. Thoth currently supports the export of metadata to all available Bibliographic Reference Management Platforms via BibTeX.
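To give a sense of what a BibTeX record looks like, here is a minimal sketch that renders a dictionary of metadata fields as one BibTeX entry. The function name, the citation key, and the choice of fields are assumptions for illustration only; this does not reproduce the output of Thoth's own BibTeX export, and it does not escape special characters.

```python
def to_bibtex(key, fields, entry_type="book"):
    """Render one BibTeX entry; fields with empty values are skipped."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        if value:
            lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Illustrative metadata only; not a real publication.
entry = to_bibtex(
    "doe2023example",
    {
        "title": "An Example Open Access Book",
        "author": "Doe, Jane",
        "publisher": "Example Press",
        "year": "2023",
        "isbn": "",  # empty values are left out of the entry
    },
)
print(entry)
```

Because BibTeX is a plain-text format that reference managers commonly import, a single export of this kind can serve many different platforms.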

Authors are also end-users of the metadata provided by Thoth: they encounter it when searching Knowledge Graphs and Web-Scale Search Engines, and when using any of the Content Platforms to access publications during their research phase.

Content Platforms

Ebook Aggregators

Ebook Aggregators "license and consolidate titles from many publishers into one combined database, [… and] often combine OA and paid-access titles for greater discoverability and convenience" (Clarke & Ricci 2021). Thoth currently supports the export of metadata to Baobab ebooks, EBSCO eBooks, JSTOR, Project MUSE, and ProQuest Ebook Central.

OA Platforms and Repositories

OA Platforms and Repositories "have no underlying infrastructure for the buying and selling of books, and are intended to host exclusively free or OA content" (Clarke & Ricci 2021). Thoth currently supports the export of metadata to OAPEN.

Digital OER Libraries form a special category of OA Platforms and Repositories, focusing mainly on textbooks rather than scholarly publications.

Consumer Ebook Platforms

Consumer Ebook Platforms "offer titles for an individual’s use and access, and do not actively support institutional or library integration" (Clarke & Ricci 2021). Thoth currently supports the export of metadata to Google Play Books.

Shadow Libraries

Shadow Libraries are "online databases of readily available content that is normally obscured or otherwise not readily accessible. Such content may be inaccessible for a number of reasons, including the use of paywalls, copyright controls, or other barriers to accessibility placed upon the content by its original owners" (Wikipedia). Thoth currently does not support export of metadata to any of the shadow libraries.

Ebook Distributors

Ebook Distributors differ from Digital Libraries in the sense that they do not claim to offer a scholarly function, be that to research institutions or to the general public. Distributors repackage and normalize ebook metadata. Most ebook distributors operate some form of monetization scheme, which may not be hospitable to OA books. Thoth currently supports the export of metadata to OverDrive and RNIB Bookshare.

Catalogs and Indices

Third-party Content Indices

Third-party Content Indices are more specialized types of products that promote metadata curation and discovery. Thoth currently supports the export of metadata to DOAB.

OER Discovery Platforms provide similar services for open textbooks.

Knowledge Bases

Knowledge Bases are library-agnostic global content indices. Thoth currently supports the export of metadata to BDSLive and EBSCO Knowledge Base.

Topic-specific bibliographies

Topic-specific Bibliographies are managed by scholarly organizations related to a specific field of inquiry.

Citation Indices

Citation Indices, such as OpenCitations, provide specific indexing for citations and references.

Thoth Archiving Network

The Thoth Archiving Network comprises an expanding number of institutional repositories that archive the metadata stored in Thoth and its linked data for long-term preservation purposes. These data and metadata are often preserved in specific Data Formats and Metadata Formats. Institutional repositories often operate through one of the available Repository Systems such as DSpace or Figshare. See also our blog posts here and here.

The following Preservation Repositories are currently connected to Thoth as part of the Thoth Archiving Network:
