Workspaces
A workspace in the Scale system is a location where files are stored (source files, product files, etc). A workspace contains configuration specifying how files are stored into and retrieved from the workspace. Workspaces are configured to use various brokers, which know how to store/retrieve files in different storage systems (e.g. NFS).
Example workspace configuration:
{
"version": "1.0",
"broker": {
"type": "nfs",
"nfs_path": "host:/my/path"
}
}
The broker value is a JSON object providing the configuration for this workspace’s broker. The type value indicates that the NFS (Network File System) broker should be used for this workspace. The nfs_path field specifies the NFS host and path that should be mounted in order to access the files. To see all of the options for a workspace’s configuration, please refer to the Workspace Configuration Specification below.
Workspace Configuration Specification
A valid workspace configuration is a JSON document with the following structure:
{
"version": STRING,
"broker": {
"type": STRING
}
}
version
Type: String
Required: No
Defines the version of the configuration used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value, if not included, is the latest version (currently 1.0). It is recommended, though not required, that you include the version so that future changes to the specification will still accept your workspace configuration.
broker
Type: JSON Object
Required: Yes
Defines the broker that the workspace should use for retrieving and storing files.
type
Type: String
Required: Yes
Specifies the type of the broker to use. The other fields that configure the broker are based upon the type specified. The valid broker types are:
- host - A host broker mounts a local directory from the host into the job’s container. Usually this local directory is a shared file system that has been mounted onto the host.
- nfs - An nfs broker utilizes an NFS (Network File System) for file storage.
- s3 - An s3 broker utilizes the Amazon Web Services (AWS) Simple Storage Service (S3) for file storage.
Additional broker fields may be required depending on the type of broker selected. See below for more information on each broker type.
The host broker mounts a local directory from the host into the job’s container. This local directory should be a shared file system that has been mounted onto all hosts in the cluster. All hosts must have the same shared file system mounted at the same location for this broker to work properly.
??? danger "Permissions" The Scale Docker containers run with a UID and GID of 7498. To ensure that permissions are appropriately handled within Docker, make sure that your host’s local directory is owned by a user/group with UID/GID of 7498/7498.
??? danger "Security" There are potential security risks involved with mounting a host directory into a Docker container. It is recommended that you use another broker type if possible.
Example host broker configuration:
{
"version": "1.0",
"broker": {
"type": "host",
"host_path": "/the/absolute/host/path"
}
}
The host broker requires one additional field in its configuration:
host_path
Type: String
Required: Yes
Specifies the absolute path of the host’s local directory that should be mounted into a job’s container in order to access the workspace’s files.
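The Permissions note for the host broker can be checked before a directory is used as a host_path. The following is a minimal sketch, not part of Scale: the helper name `is_owned_by` is hypothetical, and it assumes the check runs on the host itself, where the numeric owner of the directory is visible.

```python
import os

# UID/GID that the Scale Docker containers run with, per the Permissions note.
SCALE_UID = 7498
SCALE_GID = 7498


def is_owned_by(path: str, uid: int = SCALE_UID, gid: int = SCALE_GID) -> bool:
    """Return True if path is owned by the given numeric user and group."""
    st = os.stat(path)
    return st.st_uid == uid and st.st_gid == gid


if __name__ == "__main__":
    # The path is the placeholder used in the example configuration.
    path = "/the/absolute/host/path"
    if os.path.isdir(path):
        print(path, "ownership ok:", is_owned_by(path))
```

The check only inspects the top-level directory; files and subdirectories beneath it would need the same ownership for a job to read and write them.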
The NFS broker mounts a remote network file system volume into the job’s container.
??? danger "Plugin Required" In order to use Scale’s NFS broker, you must install and run the Netshare Docker plugin. Please see http://netshare.containx.io/ for more information.
??? danger "Permissions" The Scale Docker containers run with a UID and GID of 7498. To ensure that permissions are appropriately handled within Docker, make sure that the directories in your NFS file volume are owned by a user/group with UID/GID of 7498/7498.
Example NFS broker configuration:
{
"version": "1.0",
"broker": {
"type": "nfs",
"nfs_path": "host:/my/path"
}
}
The NFS broker requires one additional field in its configuration:
nfs_path
Type: String
Required: Yes
Specifies the remote NFS path to use for storing and retrieving the workspace files. It should be in the format host:/path.
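As an illustration of the host:/path format, the sketch below splits an nfs_path value into its host and path parts and rejects values that do not fit the format. The function name `split_nfs_path` is hypothetical and this is not Scale's own validation; it also assumes a plain host name or IPv4 address, since it splits on the first colon.

```python
from typing import Tuple


def split_nfs_path(nfs_path: str) -> Tuple[str, str]:
    """Split an nfs_path of the form host:/path into (host, path)."""
    host, sep, path = nfs_path.partition(":")
    # Require a non-empty host, a colon, and an absolute remote path.
    if not sep or not host or not path.startswith("/"):
        raise ValueError(f"nfs_path must look like host:/path, got {nfs_path!r}")
    return host, path


if __name__ == "__main__":
    print(split_nfs_path("host:/my/path"))
```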
The S3 broker references a storage location that exists as an S3 bucket in your AWS account. Please take note of the bucket name, which is typically of the form my_name.domain.com since bucket names must be globally unique (See Bucket Restrictions). The bucket must be configured for read and/or write access through an appropriate IAM account (Identity and Access Management). Once the IAM account is created and granted permissions to the bucket, there are two ways to handle authentication. IAM roles can be used to automatically grant permissions to the EC2 instance executing the broker operations (See AWS Roles). This method is preferred because no secret keys are required. Alternatively, an ACCESS KEY ID and SECRET ACCESS KEY can be generated and used with this broker (See AWS Credentials). These tokens allow 3rd party software to access resources on behalf of the associated account.
??? danger "Security" A dedicated IAM account should be used rather than the root AWS account to limit the risk of damage if a leak were to occur. Similarly, the IAM account should be given the minimum possible permissions needed to work with the bucket. The access tokens should also be changed periodically to further protect against leaks.
??? danger "Security" While this broker is in the experimental phase, the access keys are currently stored in plain text within the Scale database and exposed via the REST interface. A future version will maintain these values using a more appropriate encrypted store service.
Example S3 broker configuration:
{
"version": "1.0",
"broker": {
"type": "s3",
"bucket_name": "my_bucket.domain.com",
"credentials": {
"access_key_id": "AKIAIOSFODNN7EXAMPLE",
"secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
},
"host_path": "/my_bucket",
"region_name": "us-east-1"
}
}
The S3 broker requires the following additional fields in its configuration:
bucket_name
Type: String
Required: Yes
Specifies the globally unique name of a storage bucket within S3. The bucket should be created before attempting to use it here.
credentials
Type: JSON Object
Required: No
Provides the necessary information to access the bucket. This attribute should be omitted when using IAM role-based security. If it is included for key-based security, then both sub-attributes must be included. An IAM account should be created and granted the appropriate permissions to the bucket before attempting to use it here.
access_key_id
Type: String
Required: No
A unique identifier for the user account in IAM that will be used as a proxy for read and write operations within Scale.
secret_access_key
Type: String
Required: No
A generated token that the system can use to prove it should be able to make requests on behalf of the associated IAM account without requiring the actual password used by that account.
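The rule that credentials is either omitted (IAM role-based security) or included with both sub-attributes (key-based security) can be sketched as a small builder. This is an illustration only: `build_s3_workspace` is a hypothetical helper, not part of Scale, and the bucket name and region are the placeholders from the example configuration above.

```python
import json
from typing import Any, Dict, Optional


def build_s3_workspace(
    bucket_name: str,
    region_name: Optional[str] = None,
    access_key_id: Optional[str] = None,
    secret_access_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an S3 workspace configuration as a plain dictionary."""
    # Key-based security needs both values; role-based security needs neither.
    if (access_key_id is None) != (secret_access_key is None):
        raise ValueError("access_key_id and secret_access_key must be given together")

    broker: Dict[str, Any] = {"type": "s3", "bucket_name": bucket_name}
    if access_key_id is not None:
        broker["credentials"] = {
            "access_key_id": access_key_id,
            "secret_access_key": secret_access_key,
        }
    if region_name is not None:
        broker["region_name"] = region_name
    return {"version": "1.0", "broker": broker}


if __name__ == "__main__":
    # Role-based configuration: no credentials object is emitted.
    config = build_s3_workspace("my_bucket.domain.com", region_name="us-east-1")
    print(json.dumps(config, indent=2))
```

Leaving the keys out of the configuration entirely, as in the role-based call, also avoids storing them in plain text in the Scale database.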
host_path
Type: String
Required: No
Adds S3 workspace support for locally mounted buckets and partial file read-only access. If a FUSE file system (such as s3fs or goofys) mounts the S3 bucket at the host_path location on all nodes, an alternative to downloading large files is available to jobs that use only portions of a file. The job interface must indicate partial equal to true for any input files to take advantage of host_path. Only read operations are performed using the mount; all write operations will use the S3 REST API.
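To show why a mounted bucket helps jobs that need only part of a file, the sketch below reads a byte range through the mount instead of fetching the whole object. It is a hedged illustration, not Scale code: `read_range` is a hypothetical helper, and it assumes the FUSE mount exposes each object at a relative path under host_path.

```python
import os


def read_range(host_path: str, file_name: str, offset: int, length: int) -> bytes:
    """Read length bytes starting at offset from a file under the mount."""
    full_path = os.path.join(host_path, file_name)
    # Opened read-only: writes are expected to go through the S3 REST API.
    with open(full_path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

For example, `read_range("/my_bucket", "large/input.dat", 1024, 4096)` would return up to 4096 bytes starting at byte 1024, where the file name is a made-up placeholder.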
region_name
Type: String
Required: No
Specifies the AWS region where the S3 bucket is located. This is not always required, as environment variables or configuration files could set the default region, but it is a highly recommended setting for explicitly indicating the bucket region.
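The required fields described in this specification can be summarized as a small check. The sketch below is an assumption-laden illustration rather than Scale's actual validation: `check_workspace_config` and the constants are hypothetical names, and it covers only the presence of required fields and the default version, not conversion of older versions.

```python
from typing import Any, Dict

# Required extra broker fields per type, as listed in the specification.
REQUIRED_BROKER_FIELDS = {
    "host": ["host_path"],
    "nfs": ["nfs_path"],
    "s3": ["bucket_name"],
}
LATEST_VERSION = "1.0"


def check_workspace_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with version defaulted, or raise ValueError."""
    result = dict(config)
    # version is optional and defaults to the latest version.
    result.setdefault("version", LATEST_VERSION)
    if not isinstance(result["version"], str):
        raise ValueError("version must be a string")

    broker = result.get("broker")
    if not isinstance(broker, dict):
        raise ValueError("broker is required and must be a JSON object")

    broker_type = broker.get("type")
    if broker_type not in REQUIRED_BROKER_FIELDS:
        raise ValueError(f"unknown broker type: {broker_type!r}")

    missing = [f for f in REQUIRED_BROKER_FIELDS[broker_type] if f not in broker]
    if missing:
        raise ValueError(f"{broker_type} broker is missing: {', '.join(missing)}")
    return result


if __name__ == "__main__":
    print(check_workspace_config({"broker": {"type": "nfs", "nfs_path": "host:/my/path"}}))
```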