Faidare

Peter Selby edited this page Jun 6, 2024 · 9 revisions

Documentation

  • Technology Description

    • (From https://urgi.versailles.inrae.fr/faidare/about) The purpose of this portal is to facilitate the discoverability of public data on plant biology from a federation of established data repositories. It indexes any type of data using a generic approach, hence providing full text search capabilities and general filters (species, datatype, source). In addition, it provides BrAPI enabled data repositories more specialized filters for germplasm and phenomic or genetic studies. Its data model relies on the Breeding API (BrAPI) specifications and facilitates the access to crop and forest plants dataset through an easy to use web interface. It also provides a standard interface that can be accessed programmatically through web services.
  • Learn from an expert

  • More Information

Pros and Cons

  • Cost to set up

    • The cost of joining the existing federation (FAIDARE) depends greatly on the technical expertise and resources available, and could range from one week to six months of work. See https://urgi.versailles.inrae.fr/faidare/join
    • Independently setting up the data-discovery software: the software is free. You must set up an Elasticsearch cluster and then deploy the application using the self-executing JAR with its embedded Tomcat. Maintenance cost depends on the volume of metadata but should be reasonable.
    • Both options will incur costs related to metadata curation.
  • Pros

    • Increases findability of data
    • Encourages accessibility, since data and metadata must be accessible to be used in Faidare
    • Use of the BrAPI schema can help with interoperability and reusability
    • Is a tool specific to the agriculture/breeding domain, unlike other technologies with generic data structures
  • Cons

    • Curating/structuring data and metadata to follow the BrAPI schema may take time
    • Developing BrAPI endpoints may be too costly for some providers; in that case, the generic indexing is an efficient workaround

Example use cases

  • Joining the existing federation

    • A database curating plant genotype, phenotype, and germplasm data and metadata wants to make its data more findable by the scientific community. By sending a request to the Faidare team and organizing its metadata in the BrAPI format, the database can join the public Faidare instance. Its data will be searchable alongside all the other public Faidare resources.
  • Establishing a new federation

    • A group of data repositories with similar data types wants a system for searching across all of their resources simultaneously. They use the open-source Faidare code to set up their own search server. Only data from the selected resources will be displayed there.
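
Both use cases involve organizing metadata in the BrAPI format. As a rough illustration only, the sketch below maps a local record onto a handful of BrAPI germplasm field names and serializes it as JSON. The local record, its keys, and the example.org URI are hypothetical; which fields a given Faidare instance actually requires is not stated on this page and should be checked with the Faidare team.

```python
import json

# Hypothetical local record; these keys and values are invented for illustration.
local_record = {
    "id": "ACC-0001",
    "name": "Example wheat accession",
    "uri": "https://example.org/germplasm/ACC-0001",
    "genus": "Triticum",
    "species": "aestivum",
    "crop": "Wheat",
}


def to_brapi_germplasm(record):
    """Map a local record onto a few BrAPI germplasm field names.

    This is a partial mapping for illustration, not the full BrAPI schema.
    """
    return {
        "germplasmDbId": record["id"],
        "germplasmPUI": record["uri"],  # persistent unique identifier
        "germplasmName": record["name"],
        "accessionNumber": record["id"],
        "genus": record["genus"],
        "species": record["species"],
        "commonCropName": record["crop"],
    }


brapi_doc = to_brapi_germplasm(local_record)
print(json.dumps(brapi_doc, indent=2))
```

Keeping the mapping in one small function makes it easier to extend as more BrAPI fields are curated.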

FAIR Principles

  • Findability - Metadata and data should be easy to find for both humans and computers.

    • F1 - (Meta)data are assigned a globally unique and persistent identifier

      Persistent identifiers can be stored; generating them is up to the providing database.

    • F2 - Data are described with rich metadata (defined by R1 below)

      Faidare supports the BrAPI format, including metadata.

    • F3 - Metadata clearly and explicitly include the identifier of the data they describe

      URIs are associated with data resources, mainly using germplasmPUI and studyPUI.

    • F4 - (Meta)data are registered or indexed in a searchable resource

      Faidare indexes (meta)data in a shared Elasticsearch engine and is by nature a searchable resource.

  • Accessibility - Once the user finds the required data, it should be clear how the data can be fully accessed.

    • A1.1 - The protocol is open, free, and universally implementable

      FAIDARE uses a standardized, open, freely available communication protocol.

    • A1.2 - The protocol allows for an authentication and authorization procedure, where necessary

      The primary use case for FAIDARE is published, public data, so authentication and authorization should not be necessary.

    • A2 - Metadata are accessible, even when the data are no longer available

      Metadata and data persistence is the responsibility of the original data provider.

  • Interoperability - The data should easily interoperate with other data, as well as applications for analysis, storage, and processing.

    • I1 - (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.

      The language for knowledge representation is JSON.

    • I2 - (Meta)data use vocabularies that follow FAIR principles

      Data in FAIDARE need to be structured; the BrAPI format is typically recommended.

    • I3 - (Meta)data include qualified references to other (meta)data

      FAIDARE is designed to display data from multiple compatible systems in one location.

  • Reusability - Metadata and data should be well-described so that they can be replicated and/or combined in different settings.

    • R1 - (Meta)data are richly described with a plurality of accurate and relevant attributes

      FAIDARE supports rich (meta)data; (meta)data description itself is up to the providing database.

    • R1.1 - (Meta)data are released with a clear and accessible data usage license

      FAIDARE gives access to datasets that each have their own license. Choosing and stating the license is up to the providing database.

    • R1.2 - (Meta)data are associated with detailed provenance

      Provenance description is up to the providing database.

    • R1.3 - (Meta)data meet domain-relevant community standards

      The BrAPI requirement helps meet domain-specific metadata requirements.
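
Several of the points above (F1, F3, A2) leave identifiers to the providing database. A provider might therefore want to check its own records before submitting them. The sketch below is one possible pre-submission check, not part of Faidare: it flags records with no URI-like value in germplasmPUI or studyPUI. The helper names and sample records are hypothetical, and the check only accepts URIs with both a scheme and a host (for example https URLs), which is a simplifying assumption.

```python
from urllib.parse import urlparse

# Identifier fields named in the F3 note above.
PUI_FIELDS = ("germplasmPUI", "studyPUI")


def looks_like_uri(value):
    """Return True if value is a string with both a scheme and a host."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value.strip())
    return bool(parsed.scheme) and bool(parsed.netloc)


def missing_pui(records):
    """Return the records that have no URI-like value in any PUI field."""
    return [
        record
        for record in records
        if not any(looks_like_uri(record.get(field)) for field in PUI_FIELDS)
    ]


# Hypothetical sample records for illustration.
sample_records = [
    {"germplasmName": "A", "germplasmPUI": "https://example.org/germplasm/A"},
    {"studyName": "B", "studyPUI": "https://example.org/study/B"},
    {"germplasmName": "C", "germplasmPUI": "ACC-0003"},  # local ID, not a URI
    {"germplasmName": "D"},  # no identifier at all
]

flagged = missing_pui(sample_records)
print([r.get("germplasmName") or r.get("studyName") for r in flagged])
```

Records flagged this way would need a persistent identifier assigned by the providing database before they satisfy F1 and F3.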