Bioschemas Specification Process

Ric Arcila edited this page Apr 12, 2018 · 6 revisions

This repository helps people interested in defining a Bioschemas Specification. Here you will find all the templates and documentation needed to become familiar with the Specification Process. The process starts with the Use Case Study and finishes with the RDFa generation.

Bioschemas Specification Process

If you want to modify the flow chart, change the image file called specification_process.png in the img folder.

The following README explains the process needed to create a Bioschemas Specification. There are 3 steps:

  1. Use Case Study
  2. Mapping
  3. Specification

For this explanation, the Bioschemas Tool Specification will be used as an example.

MAPPING

Once you have defined Use Cases you should start to consider what properties are needed to describe the data from your use case. Please try to reuse existing Schema.org properties (and definitions) where possible. It may be beneficial to try and extend (or specialise) an existing Schema.org Type. You may find reading other Bioschemas.org specifications useful.

To record the decisions your team has reached, follow these steps:

  1. Create a new folder for your type/profile in the bioschemas specification folder, named after the type or profile.
  2. In the template folder you will find the Simplified Bioschemas mapping template spreadsheet and the example spreadsheets. Make a copy of the "Simplified Bioschemas mapping template" spreadsheet in the folder you created in the previous step, naming the file <SPECIFICATION_NAME> mapping.
  3. Start filling in the copy you created. The structure of the file is as follows:

Warning: The Bioschemas fields sheet is formatted to work with bioschemas-goweb. Changing it could cause unexpected behaviour and results.

Bioschemas Tools Mapping empty file

On the "Specification Info" sheet you will find:

  • Title: The name of your profile/type, e.g., Gene, ProteinStructure, DataRecord
  • Subtitle: Short description of the specification.
  • Description: Extended specification description. If you want to have links in your text, add them in markdown format.
  • Version: Version you want to be displayed on the website. Format #.#, e.g., 0.3
  • Official Type: URL to an existing ontology if applicable, e.g., http://purl.obolibrary.org/obo/SO_0000704
  • Full Example: URL to the full example folder on the bioschemas specification GitHub.
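
As a rough illustration of the fields above, the sketch below models the "Specification Info" sheet as a Python dictionary and checks that Version follows the #.# format. The dictionary layout, the `is_valid_version` helper and the sample values are assumptions made for this example; they are not part of the template or of bioschemas-goweb.

```python
import re

# Hypothetical in-memory copy of the "Specification Info" sheet.
# The keys mirror the field names listed above; the values are samples.
spec_info = {
    "Title": "Gene",
    "Subtitle": "Short description of the specification.",
    "Description": "Extended description with a [link](http://bioschemas.org).",
    "Version": "0.3",
    "Official Type": "http://purl.obolibrary.org/obo/SO_0000704",
    "Full Example": "",  # URL to the full example folder, left empty here
}

# "#.#" is read here as: digits, a dot, digits (e.g. 0.3).
VERSION_PATTERN = re.compile(r"\d+\.\d+")


def is_valid_version(version):
    """Return True when the whole string matches the #.# format."""
    return VERSION_PATTERN.fullmatch(version) is not None


print(is_valid_version(spec_info["Version"]))
```

A check like this can be run before handing the spreadsheet over, so that a value such as `v0.3` or `0.3.1` is caught early.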

On the "Schema.org mapping" sheet you will find:

  • schema.org: These columns are copy-pasted from a schema.org type definition page, or filled with types of external ontologies.
    • Property: Name of the property from the selected schema.org type or external ontology type, e.g., SIO:is transcribe into. When an external ontology is used, you must select "external" from the dropdown in the Type column.
    • Expected Type: Expected type for the property. This could be a schema.org type, a bioschemas type or an external ontology one, e.g., URL, BioChemEntity, Thing. These values can be separated by " or " or ",". See Beacon Example.
    • Description: Description of the property. This field accepts Markdown formatting.
    • Type: Type for the property, possible types are: schema.org, pending, external or bioschemas. Leaving this field blank has the same effect as selecting schema.org. For external and bioschemas type please fill the Type URL column.
    • Type URL: URL for the property type; it should be filled when using bioschemas or external as type in order to link the property on the website.
  • bioschemas: These columns are defined for the properties that will be in the Bioschemas Specification file.
    • BSC Description: If the property needs an additional description, add text here that complements the schema.org description. This field accepts Markdown formatting.
    • Marginality: The template gives three options for the property specification of the new Bioschemas Type: Minimum, Recommended or Optional.
    • Cardinality: The template gives you the two possible cardinalities in a Bioschemas Specification (ONE or Many).
    • Controlled Vocabulary: This field contains a list of terms or ontologies that provide values for this property. This field accepts Markdown formatting.
    • Example: A small JSON-LD markup example for the property. Indentation will be kept when it is embedded on the website. See Gene Example.
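
The column rules above can be sketched as a small row check. This is only an illustration: the function names, the dictionary representation of a row and the sample rows are assumptions, and the check is not how bioschemas-goweb validates the sheet. Cardinality is compared case-insensitively because the template text writes the options as "ONE or Many".

```python
import re

ALLOWED_TYPES = {"schema.org", "pending", "external", "bioschemas"}
ALLOWED_MARGINALITY = {"Minimum", "Recommended", "Optional"}
ALLOWED_CARDINALITY = {"ONE", "MANY"}


def split_expected_types(value):
    """Split an Expected Type cell on " or " or ","."""
    parts = re.split(r"\s+or\s+|,", value)
    return [part.strip() for part in parts if part.strip()]


def check_row(row):
    """Return a list of problems found in one mapping row (empty if none)."""
    problems = []
    # A blank Type has the same effect as selecting schema.org.
    row_type = row.get("Type", "").strip() or "schema.org"
    if row_type not in ALLOWED_TYPES:
        problems.append("unknown Type: " + row_type)
    # external and bioschemas types need a Type URL to be linked on the website.
    if row_type in {"external", "bioschemas"} and not row.get("Type URL", "").strip():
        problems.append("Type URL is required for external and bioschemas types")
    if row.get("Marginality", "") not in ALLOWED_MARGINALITY:
        problems.append("Marginality must be Minimum, Recommended or Optional")
    if row.get("Cardinality", "").upper() not in ALLOWED_CARDINALITY:
        problems.append("Cardinality must be ONE or MANY")
    return problems


# Hypothetical rows, for illustration only.
good_row = {"Property": "url", "Expected Type": "URL", "Type": "",
            "Marginality": "Minimum", "Cardinality": "ONE"}
bad_row = {"Property": "exampleProperty", "Expected Type": "Thing",
           "Type": "external", "Type URL": "",
           "Marginality": "Recommended", "Cardinality": "Many"}

print(split_expected_types("URL or BioChemEntity, Thing"))
print(check_row(good_row))
print(check_row(bad_row))
```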
  1. Go to schema.org and find the definition of the type you want to reuse in the Bioschemas Structure. Return to the mapping file and replace the text <schema.org Type> in the merged cell (A6 to C6) with the name of the Type you want to reuse.

Schema.org extended Type

In their Use Case Study, the Tools Specification identified that the Schema.org Type SoftwareApplication was the best fit for their use cases.

  2. Copy the Type definition table from schema.org, starting from the first Property, but do not copy the table headers. For the Tool example, SoftwareApplication should look like this:

Copy Schema.org Type Definition

  1. Then paste into your mapping spreadsheet, starting at cell A7. For the Tools Specification you would have something like this:

Pasting Schema.org Type Definition to the mapping Template

  1. Fill in all the Use Cases for this Specification. For the Tools Specification you would have something like this:

Filling the Use Cases in the Mapping Template

The template gives you different colours for the Use Case Matching (Dark blue for Match, light blue for Partial Match and light orange for No Match).

  1. Fill in the Bioschemas fields. The bioschemas columns are defined for the properties that will be in the Bioschemas Specification file.

    1. BSC Description: A short description of what this property describes.
    2. Marginality: The template gives three options for the property specification of the new Bioschemas Type: Minimum, Recommended or Optional.
    3. Cardinality: The template gives you the two possible cardinalities in a Bioschemas Specification (ONE or Many).
    4. Controlled Vocabulary (CV): This field contains a list of terms and/or ontologies. This field accepts Markdown formatting. Please add links to the terms and/or ontologies included in this field, e.g., [uberon](http://purl.obolibrary.org/obo/uberon.owl), [emap](http://purl.obolibrary.org/obo/emap.owl). See the Markdown documentation for more information about link syntax.
  2. Go to the Bioschemas fields sheet to view the summary of your mapping. For the Tools Specification you would have something like this:

Mapping Summary

SPECIFICATION

There are several ways you can move your mapping to the web:

  1. Parse your mapping using the bioschemas-goweb tool and send a Pull Request that adds the result as your new profile in the _devSpec folder of the bioschemas website. Use the Gene profile as an example of the folder and file structure. If you have any questions, please feel free to contact arcila@ebi.ac.uk.
  2. Parse your mapping file into a YAML ready-for-bioschemas-web file using the bioschemas-goweb tool, then email the parsed mapping to arcila@ebi.ac.uk and kcm1@hw.ac.uk, cc'ing the mailing list.
  3. Email the link to the mapping spreadsheet to arcila@ebi.ac.uk and kcm1@hw.ac.uk, cc'ing the mailing list.

Extra information

Versioning

The assignment of version numbers is determined by the process. When you first ask for your gSheet document to be converted into a specification, it will be published on http://bioschemas.org/specifications/drafts. It will automatically receive the version number v0.1-draft.1. The next time you ask for the specification to be updated the version will become v0.1-draft.2. This will continue indefinitely.

When it is determined your specification is sufficiently mature to be published as a release version (on http://bioschemas.org/specifications) the version number will become v0.1.

Subsequent minor changes will be published on the drafts page, where the version number will be v0.2-draft.1. Again, once mature the specification will be published on the http://bioschemas.org/specifications page with version set to v0.2.

This iterative process may continue indefinitely.
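
The numbering scheme described above can be sketched as two small functions. The function names and the regular expression are assumptions made for this illustration, and the sketch only covers the minor-version sequence described in this section; version numbers are assigned by the process itself, not by this code.

```python
import re

# Matches v0.1 as well as v0.1-draft.2 (assumed textual form of the versions).
VERSION_RE = re.compile(r"v(\d+)\.(\d+)(?:-draft\.(\d+))?")


def next_draft(version=None):
    """Return the version assigned the next time an update is requested.

    No version yet  -> v0.1-draft.1
    A draft version -> same release number, draft counter + 1
    A release       -> next minor release number, draft counter 1
    """
    if version is None:
        return "v0.1-draft.1"
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError("unrecognised version: " + version)
    major, minor, draft = match.group(1), int(match.group(2)), match.group(3)
    if draft is None:
        return "v{}.{}-draft.1".format(major, minor + 1)
    return "v{}.{}-draft.{}".format(major, minor, int(draft) + 1)


def release(version):
    """Return the release version for a draft judged sufficiently mature."""
    match = VERSION_RE.fullmatch(version)
    if match is None or match.group(3) is None:
        raise ValueError("not a draft version: " + version)
    return "v{}.{}".format(match.group(1), match.group(2))


# Walk through the sequence described in this section.
v = next_draft()    # first conversion of the gSheet
v = next_draft(v)   # specification updated once more
v = release(v)      # published as a release version
v = next_draft(v)   # subsequent minor change goes back to the drafts page
print(v)
```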
