SONG is a robust metadata and validation system used to quickly and reliably track genome metadata scattered across multiple cloud storage systems. In genomics and bioinformatics, metadata managed with simple solutions such as spreadsheets and text files takes significant time and effort to maintain and keep reliable. With several users and thousands of genomic files, tracking the state of metadata and its associations can become a nightmare. SONG minimizes human intervention by imposing rules and structure on user uploads, which produces high-quality, reliable metadata with minimal effort. SONG is one of many products provided by Overture and is completely open-source and free for everyone to use.
For additional information on other products in the Overture stack, please visit https://overture.bio
Did you know?
SONG is actually a recursive acronym for SONG's Our New GNOS.
GNOS (Genomic Network Operating System) was a genomic data repository built and maintained by Annai Systems, which ceased operation in 2017. In response to the vacant position for a genomic metadata system, SONG was born.
- Synchronous and asynchronous metadata validation using JsonSchema
- Strictly enforced data relationships and fields
- Optional schema-less JSON info fields for user specific metadata
- Standard REST API that is easy to understand and work with
- Simple and fast metadata searching
- Export payloads for SONG mirroring
- Clear and concise error handling
- ACL security using OAuth2 and scopes based on study codes
- Unifies metadata with object data stored in SCORE
- Built-in Swagger UI for API interaction
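To illustrate how strictly enforced fields can coexist with a schema-less info field, here is a minimal sketch in plain Python. It does not use JsonSchema or SONG's actual schemas: the field names (`study`, `analysis_type`, `files`) and the `validate_payload` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical required fields and their expected types; NOT SONG's real schema.
REQUIRED_FIELDS = {"study": str, "analysis_type": str, "files": list}


def validate_payload(payload):
    """Return a list of error messages; an empty list means the payload passed."""
    errors = []
    # Strictly enforced fields: each must be present and have the expected type.
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in payload:
            errors.append(f"missing required field: {name}")
        elif not isinstance(payload[name], expected_type):
            errors.append(f"field {name} must be of type {expected_type.__name__}")
    # Reject any field that is neither required nor the free-form "info" field.
    for name in payload:
        if name not in REQUIRED_FIELDS and name != "info":
            errors.append(f"unknown field: {name}")
    # "info" is schema-less: any JSON object is accepted, its keys are not inspected.
    if "info" in payload and not isinstance(payload["info"], dict):
        errors.append("field info must be a JSON object")
    return errors


good = {"study": "ABC123", "analysis_type": "variant_call", "files": [], "info": {"lab": "x"}}
bad = {"study": "ABC123", "files": "not-a-list", "extra": 1}
print(validate_payload(good))
print(validate_payload(bad))
```

The design point is that user-specific metadata lives only under `info`, so the rest of the payload can be checked rigidly without blocking custom annotations.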
The data submission workflow can be separated into 4 main stages:
- Metadata Upload (SONG)
- Metadata Saving (SONG)
- Object data Upload (SCORE)
- Publishing Metadata (SONG)
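The four stages above can be pictured as a simple linear state machine. The sketch below is illustrative only and assumes the stages must happen in the listed order; the `Stage` and `Submission` names are hypothetical and are not part of SONG, SCORE, or the Python SDK.

```python
from enum import Enum


class Stage(Enum):
    UPLOADED = "metadata uploaded (SONG)"
    SAVED = "metadata saved (SONG)"
    OBJECTS_UPLOADED = "object data uploaded (SCORE)"
    PUBLISHED = "metadata published (SONG)"


# Enum members iterate in definition order, which mirrors the list of stages.
ORDER = list(Stage)


class Submission:
    """Tracks which stage a single (hypothetical) submission has reached."""

    def __init__(self):
        self.stage = None  # no stage completed yet

    def advance(self, target):
        # Only the stage immediately after the current one is allowed (assumption).
        expected_index = 0 if self.stage is None else ORDER.index(self.stage) + 1
        if expected_index >= len(ORDER) or ORDER[expected_index] is not target:
            raise ValueError(f"cannot move to {target.name} from {self.stage}")
        self.stage = target


submission = Submission()
for stage in Stage:
    submission.advance(stage)
    print(stage.value)
```

Under this assumption, attempting to publish before the object data has been uploaded raises a `ValueError` instead of silently succeeding.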
The following diagram summarizes the steps involved in a successful data submission using SONG and SCORE:
Legend:
- Cancer Collaboratory - Toronto : song.cancercollaboratory.org
- AWS - Virginia : virginia.song.icgc.org
The easiest way to understand SONG is to simply use it! Below is a short list of ways to get started with SONG.
The Docker for SONG tutorial is a great way to spin up SONG and all its dependent services using Docker on your host machine. Use this if you want to play with SONG locally. Refer to the Docker for SONG documentation.
The SONG Python SDK Tutorial introduces the Python client module used to interact with a running SONG server. Use it with one of the Projects Using SONG, or in combination with Docker for SONG. For more information about the Python SDK, refer to the SONG Python SDK documentation.
If you want to play with SONG from your browser, simply visit the Swagger UI for each server:
- Cancer Collaboratory - Toronto: https://song.cancercollaboratory.org/swagger-ui.html
- AWS - Virginia: https://virginia.song.icgc.org/swagger-ui.html
For more information about user access, refer to the User Access documentation.
If you want to deploy SONG onto a server, refer to the Deploying a SONG Server in Production documentation.
- Join our Gitter channel!
Copyright (c) 2018. Ontario Institute for Cancer Research
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.