Score doc revamp #154

Merged
merged 10 commits into from May 13, 2021
2 changes: 1 addition & 1 deletion markdown/documentation/dms/index.md
@@ -25,7 +25,7 @@ Illustrated above are the five core Overture components:

| Component | Purpose |
| --------------------| ------------|
| [Score](/products/score) | Manages cloud-based data object storage and transfer. |
| [Score](/documentation/score) | Manages cloud-based data object storage and transfer. |
| [Song](/documentation/song) | Manages the metadata associated with the data objects. |
| [Maestro](/documentation/maestro) | Indexes the metadata in Song into [Elasticsearch](https://www.elastic.co/). |
| [Arranger](/documentation/arranger) | Generates an easily-configurable web portal interface with faceted search against the Elasticsearch index. |
@@ -2,7 +2,7 @@
title: Setup Data Storage Buckets
---

The [Score service]((../../../../../score)) manages data transfer to (upload) and from (download) cloud object storage. As such, Score requires two specific buckets to get setup in advance in your storage service. These buckets are supplied as inputs to the DMS interactive configuration questionnaire later on.
The [Score service](/documentation/score) manages data transfer to (upload) and from (download) cloud object storage. As such, Score requires two specific buckets to get setup in advance in your storage service. These buckets are supplied as inputs to the DMS interactive configuration questionnaire later on.

Score requires two buckets to be setup in your storage:

4 changes: 2 additions & 2 deletions markdown/documentation/dms/installation/test-upload/index.md
@@ -186,7 +186,7 @@ client:

# Download and Configure Score Client

Next, you must download and configure the Score client. This command-line client is to upload and download data files to and from your configured object storage service. To understand how to use Score in more detail, see [here](../../../score).
Next, you must download and configure the Score client. This command-line client is used to upload and download data files to and from your configured object storage service. To understand how to use Score in more detail, see [here](../../../score).

1. Download and unzip the latest Score client from [here](https://artifacts.oicr.on.ca/artifactory/dcc-release/bio/overture/score-client/%5BRELEASE%5D/score-client-%5BRELEASE%5D-dist.tar.gz) or do so from your terminal command line, then switch to the unzipped directory:

@@ -227,7 +227,7 @@ For example:
accessToken=36099917-45b1-49f4-b91e-68a655eb6708

# The location of the metadata service (SONG)
metadata.url=http://locatlhost:80/song-api
metadata.url=http://localhost:80/song-api

# The location of the object storage service (SCORE)
storage.url=http://localhost:80/score-api
@@ -10,7 +10,7 @@ Before installing Maestro, the following software services needs to be installed
|---------|---------|-------------|-------------|
| [Elasticsearch](https://www.elastic.co/downloads/elasticsearch) | 7 or up | Required | For Maestro to build the index in |
| [Song](https://github.com/overture-stack/SONG/releases) | Latest | Required | See [here](/documentation/song/installation) for installation instructions |
| [Apache Kafka](https://kafka.apache.org/downloads/) | Latest | Optional | Optionaly, only needed if you want to setup event-based indexing |
| [Apache Kafka](https://kafka.apache.org/downloads/) | Latest | Optional | Optional, only needed if you want to setup event-based indexing |

# Installation

37 changes: 37 additions & 0 deletions markdown/documentation/score/_contents.yaml
@@ -0,0 +1,37 @@
sectionSlug: score
sectionTitle: Score
pages:
- title: Introduction
url: score
- title: Installation Guide
url: score/installation
isHeading: true
pages:
- title: Installation
url: score/installation/installation
- title: Configuration
url: score/installation/configuration
isHeading: true
pages:
- title: Run Profiles
url: score/installation/configuration/profiles
- title: Song Server Integration
url: score/installation/configuration/song
- title: Object Storage Integration
url: score/installation/configuration/object-storage
- title: Other Bootstrap Properties
url: score/installation/configuration/bootstrap
- title: Authentication
url: score/installation/authentication
- title: User Guide
url: score/user-guide
isHeading: true
pages:
- title: Setting Up the Score Client
url: score/user-guide/client-setup
- title: Uploading Data
url: score/user-guide/upload
- title: Downloading Data
url: score/user-guide/download
- title: Command Reference
url: score/user-guide/commands
49 changes: 49 additions & 0 deletions markdown/documentation/score/index.md
@@ -0,0 +1,49 @@
---
title: Introduction
---

Score facilitates the transfer and storage of your data seamlessly and flexibly for cloud-based projects. This storage and transfer system helps you manage data upload and download with powerful features such as file bundling and resumable downloads.

Score uses the concept of pre-signed URLs (see Amazon S3 definition [here](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html)) to manage data transfer to and from your cloud storage provider. As such, Score can be thought of as a broker between an object storage system (such as Amazon S3) and the user authorization system, with the responsibility of validating user access and generating the pre-signed URLs required for object access.
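
To make the broker idea concrete, the following is a minimal sketch of how a pre-signed URL can work in principle: after access has been validated, a time-limited URL is signed with a secret, and the storage side later checks the signature and expiry. This is not Score's implementation and not Amazon's actual signing algorithm; `SECRET_KEY`, `presign`, `verify`, and the example host are hypothetical names used only for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret shared by the broker and the storage service (illustration only).
SECRET_KEY = b"example-secret"


def _sign(path: str, expires: int) -> str:
    """Return an HMAC-SHA256 signature over the object path and expiry time."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def presign(base_url: str, path: str, ttl_seconds: int, now: Optional[int] = None) -> str:
    """Build a time-limited URL for one object once access has been validated."""
    now = int(time.time()) if now is None else now
    expires = now + ttl_seconds
    query = urlencode({"expires": expires, "signature": _sign(path, expires)})
    return f"{base_url}{path}?{query}"


def verify(url: str, now: Optional[int] = None) -> bool:
    """Check that the URL has not been tampered with and has not expired."""
    now = int(time.time()) if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    expected = _sign(parsed.path, expires)
    return hmac.compare_digest(signature, expected) and now <= expires
```

The point of the design is that whoever holds the URL never sees the storage credentials: the URL itself carries a short-lived, object-specific proof of authorization.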

Working together, Song and Score enable secure and distributed data management.
Score works with object-based storage including Amazon Web Services S3, Azure Storage,
and Openstack Ceph to enable file upload and download that can be parallelized into multiple
parts and easily resumed with high integrity for a fault-tolerant data transfer. Specific features to
support genomic data have been built into Song and Score: file bundling to match genomic files
with their index files, and slicing of a sequencing read file for a targeted region instead of
downloading the whole file.

# Features

## Support for Multiple Storage Providers

Score currently supports data transfer with several popular cloud-based storage providers:

* [Amazon S3](https://aws.amazon.com/s3/)
* [Microsoft Azure Storage](https://azure.microsoft.com/en-ca/services/storage/)
* [Openstack](https://www.openstack.org/) with [Ceph](https://ceph.io/)
* [Minio](https://min.io/)

## Multipart Uploads and Downloads

To enable high-performance transfers, Score supports multipart file uploads and downloads. By implementing a multipart transfer solution, Score provides several key benefits:

* File downloads can be done in parts, and paused and resumed as required by the user
* File transfers are automatically resumed if paused or interrupted mid-transfer (e.g. due to connection issues)
* Parallelization of these transfer operations makes upload and download of files efficient and fast
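
The idea behind multipart transfer can be sketched as follows: a file is split into byte ranges, each range can be transferred independently (and in parallel), and a resumed transfer only needs the ranges that have not completed yet. This is an illustrative sketch under those assumptions, not Score's actual part-sizing logic; `part_ranges` and `remaining_parts` are hypothetical helper names.

```python
from typing import Iterable, List, Tuple


def part_ranges(file_size: int, part_size: int) -> List[Tuple[int, int]]:
    """Split a file of file_size bytes into inclusive (start, end) byte ranges."""
    if file_size < 0 or part_size <= 0:
        raise ValueError("file_size must be >= 0 and part_size must be > 0")
    return [
        (start, min(start + part_size, file_size) - 1)
        for start in range(0, file_size, part_size)
    ]


def remaining_parts(
    ranges: List[Tuple[int, int]], completed: Iterable[int]
) -> List[Tuple[int, int]]:
    """Return the ranges still to transfer, given the indices of finished parts."""
    done = set(completed)
    return [r for index, r in enumerate(ranges) if index not in done]
```

For example, a 10-byte file with 4-byte parts yields three ranges; if the first and last parts already finished before an interruption, only the middle range is left to transfer.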

## Data Integrity

Score performs standard [MD5 validation](https://www.ietf.org/rfc/rfc1321.txt) against all file uploads and downloads to check for corrupted files and ensure data integrity.
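
As a rough illustration of what an MD5 integrity check involves, the sketch below computes a file's MD5 digest in chunks and compares it with an expected value. It uses only Python's standard `hashlib` module; `md5_of_file` and `is_intact` are hypothetical helper names, and this is not a description of how Score performs the check internally.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: str, expected_md5: str) -> bool:
    """Return True if the file's MD5 digest matches the expected digest."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

Reading in chunks keeps memory use constant, which matters for the large files typical of genomic data.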

## Applications to Genomics

Similar to other products in the [Overture](https://www.overture.bio/products/) software suite, Score has particularly useful applications in the field of Genomics, including the following features:

* Ability to slice BAM and CRAM files by genomic regions using integrated command line tools
* Integration of other samtools functionality in the Score client, such as ability to view reads from a BAM file

# Integrations

As a data transfer management system, Score is focused on managing data upload and download, and does not handle the complexities of file metadata validation. To handle this, Score is built to interact with a required companion application, [Song](/documentation/song). Song is responsible for validating file metadata, assigning unique global identifiers for data management, assigning permissions for open (public) versus controlled (authentication required) file access, and so on.
64 changes: 64 additions & 0 deletions markdown/documentation/score/installation/authentication.md
@@ -0,0 +1,64 @@
---
title: Authentication
---

# Application Authentication & Authorization

For an application to securely interact with Score, authentication and authorization must be provided. This ensures unauthorized users cannot access Score's API endpoints. To authorize properly with Score, either an authorized user's valid API key with appropriate scopes (permissions) must be supplied, or application-to-application authorization must be enabled following the [OAuth 2.0](https://oauth.net/2/) protocol.

Although configuring authentication and authorization is technically optional, it is **highly recommended**, especially for production environments. Settings are configured in the `auth` section of the `score-server-[version]/conf/application.properties` file, using these profiles:

| Profile | Requirement | Description |
|---------|-------------|-------------|
| secure | Required if using Ego | If the [Overture](https://overture.bio) product [Ego](/documentation/ego) is used as the authentication service for Score, this profile is required. It enables authentication for requests to the Score API using API keys issued by Ego. |
| jwt | Optional | Optionally, you can use this profile to support both JWT ([JSON Web Tokens](https://jwt.io/)) and API Key authentication for requests to Score. |

# Secure Profile Example

The `secure` profile is required if the [Overture](https://overture.bio) product [Ego](/documentation/ego) is used as the authentication service for Score. It enables authentication for requests to the Score API using API keys issued by Ego.

To configure authentication and authorization via Ego, in the `score-server-[version]/conf/application.properties` file, make sure the `secure` profile exists and configure these settings in the `auth -> server` section:

| Setting | Requirement | Description |
|---------|-------------|-------------|
| `auth.server.url` | Required | URL to the Ego API endpoint that is used to authenticate a user's API key (token). Specify the host and port where the endpoint is hosted. The endpoint to use is `/oauth/check_token`. See the example below for guidance. |
| `auth.server.tokenName` | Required | Name used to identify a token. Typically you should leave this set to the default value, `token`. |
| `auth.server.clientId` | Required | This is the client ID for the Score application as configured in Ego. |
| `auth.server.clientSecret` | Required | This is the client secret for the Score application as configured in Ego. |
| `auth.server.scope.download.system` | Required | Scope (permission) that a user's API key must have to enable system-level downloads from Score. Typically you should leave this set to the default value, `score.READ`. |
| `auth.server.scope.download.study.prefix` | Required | Prefix that must come before the Song study name when assigning study-level download scopes (permissions) for Score. Typically you should leave this set to the default value, `score.`. |
| `auth.server.scope.download.study.suffix` | Required | Suffix that must come after the Song study name when assigning study-level download scopes (permissions) for Score. Typically you should leave this set to the default value, `.READ`. |
| `auth.server.scope.upload.system` | Required | Scope (permission) that a user's API key must have to enable system-level uploads to Score. Typically you should leave this set to the default value, `score.WRITE`. |
| `auth.server.scope.upload.study.prefix` | Required | Prefix that must come before the Song study name when assigning study-level upload scopes (permissions) for Score. Typically you should leave this set to the default value, `score.`. |
| `auth.server.scope.upload.study.suffix` | Required | Suffix that must come after the Song study name when assigning study-level upload scopes (permissions) for Score. Typically you should leave this set to the default value, `.WRITE`. |

For example:

```shell
auth.server.url="https://localhost:8081/oauth/check_token"
auth.server.tokenName="token"
auth.server.clientId="<client ID from Ego>"
auth.server.clientSecret="<client secret from Ego>"
auth.server.scope.download.system="score.READ"
auth.server.scope.download.study.prefix="score."
auth.server.scope.download.study.suffix=".READ"
auth.server.scope.upload.system="score.WRITE"
auth.server.scope.upload.study.prefix="score."
auth.server.scope.upload.study.suffix=".WRITE"
```
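
To show how the prefix and suffix settings combine with a Song study name, here is a small sketch that builds a study-level scope string and checks it against the scopes on an API key. The composition (prefix, then study name, then suffix) follows the setting descriptions above; the rule that either the system-level scope or the matching study-level scope is sufficient is an assumption made for illustration, and `study_scope`, `can_download`, and the study name `ABC123-CA` are hypothetical.

```python
from typing import Iterable


def study_scope(prefix: str, study_id: str, suffix: str) -> str:
    """Compose a study-level scope: prefix, then the Song study name, then suffix."""
    return f"{prefix}{study_id}{suffix}"


def can_download(
    granted_scopes: Iterable[str],
    study_id: str,
    system_scope: str = "score.READ",
    prefix: str = "score.",
    suffix: str = ".READ",
) -> bool:
    """Assumed rule: a system-level or matching study-level scope allows download."""
    granted = set(granted_scopes)
    return system_scope in granted or study_scope(prefix, study_id, suffix) in granted
```

With the default values, download access to a study named `ABC123-CA` would correspond to the scope `score.ABC123-CA.READ`.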

# JWT Profile Example

The `jwt` profile can optionally be used if you want to support both JWT and API key authentication for requests to Score. Note that JWT authentication cannot be configured on its own; it still requires the aforementioned API key authentication to be set up first.

To make use of JWT authentication, in the `score-server-[version]/conf/application.properties` file, make sure the `jwt` profile exists and configure these settings in the `auth -> jwt` section:

| Setting | Requirement | Description |
|---------|-------------|-------------|
| `auth.jwt.publicKeyUrl` | Required | URL to the Ego API endpoint that is used to retrieve a user's public key. Specify the host and port where the endpoint is hosted. The endpoint to use is `/oauth/token/public_key`. See the example below for guidance. |

For example:

```shell
auth.jwt.publicKeyUrl="https://<host>:<port>/oauth/token/public_key"
```
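
For orientation, a JWT consists of three base64url-encoded segments separated by dots: header, payload, and signature. The sketch below decodes only the payload so its claims can be inspected during troubleshooting. It deliberately does not verify the signature, so it must never be used to make an authorization decision; `decode_jwt_payload` is a hypothetical helper and is not part of Score or Ego.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the claims of a JWT WITHOUT verifying its signature (inspection only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated segments")
    payload = parts[1]
    # base64url segments in a JWT omit padding, so restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Actual verification needs the public key and a proper JWT library; this helper only shows what the token carries.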
@@ -0,0 +1,27 @@
---
title: Other Bootstrap Properties
---

In addition to the `score-server-[version]/conf/application.properties` file that is created by default when you unzip the distribution, you must also create another file in the same `conf` folder. This file, `bootstrap.properties`, will contain some additional configurations required by the Score server.

Assuming the directory path of the distribution is `$SCORE_SERVER_HOME`, do the following:

1. Switch to the Score server configuration folder:

```bash
$ cd $SCORE_SERVER_HOME/conf
```

2. Using the text editor of your choice, create a new file in the `/conf` directory named `bootstrap.properties`, and add the following settings:

| Setting | Requirement | Description |
|---------|-------------|-------------|
| `spring.cloud.vault.enabled` | Required | If [HashiCorp's Vault](https://www.vaultproject.io/) solution is being used to manage your authentication secrets, set this to `true`. Otherwise, set this to `false`. Most deployments will not be using Vault, so this value should typically be set to `false`. |

For example:

```shell
spring.cloud.vault.enabled="false"
```

3. Save the file.
12 changes: 12 additions & 0 deletions markdown/documentation/score/installation/configuration/index.md
@@ -0,0 +1,12 @@
---
title: Configuring Score
---

There are several <span style="color:red"> required</span> components to configure for the Score server. These include:

- [Run Profiles](/documentation/score/installation/configuration/profiles)
- [Song Server Integration](/documentation/score/installation/configuration/song)
- [Object Storage Integration](/documentation/score/installation/configuration/object-storage)
- [Other Bootstrap Properties](/documentation/score/installation/configuration/bootstrap)

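With the exception of the bootstrap properties, which are kept in a separate `bootstrap.properties` file in the same `conf` folder, these configurations are managed within the same file, `score-server-[version]/conf/application.properties`.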