# Working with data collections

<table align="left">

  <td>
    <a href="https://github.com/DataBiosphere/terra-axon-examples/blob/main/first_hour_on_terra/working_with_data_collections.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://github.com/DataBiosphere/terra-axon-examples/main/first_hour_on_terra/working_with_data_collections.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in a Terra notebook instance
    </a>
  </td>
</table>

## Overview

This notebook provides examples of working with data collections in Terra. Build upon the best practices described in this notebook to create and share your own data collections. 

### Objective

Perform common workspace resource operations including:

1. Create a new data collection from cloud data.
1. Share the data collection with collaborators.
1. Add the data collection as a resource to a new workspace.

#### How to run this notebook

Run this notebook cell by cell to set up your workspace. All setup steps are optional, but highly recommended so that your workspace is compatible with the Enterprise Terra tutorials.

#### Costs

This notebook takes less than a minute to run, which typically costs less than $0.01 in compute time on your cloud environment.

## Create a data collection

To create a data collection, you must first create a new workspace. Run the cell below to create one.

<div class="alert alert-block alert-success">
<b>Note:</b>
    If you'd like to restrict access to your data collection to members of a specific group, you'll need to provide the <a href="https://et-docs-tests.googleplex.com/docs/reference/glossary/#policy">group policy constraint</a> at the time of workspace creation.
    See <a href="../creating_a_group.ipynb">../creating_a_group.ipynb</a> for details on how to create a Terra group that can be used for group policy constraints on workspaces and data collections.
</div>

In [None]:
# Create the workspace that will become the data collection.
!terra workspace create \
--id=${GOOGLE_CLOUD_PROJECT}-dc-ws \
--name="${TERRA_OWNER_EMAIL} - My First Data Collection" \
--description='A new workspace which I will transform into a data collection.' \
--properties="'terra-type':'data-collection','terra-workspace-version':'1.0','terra-workspace-short-description':'An example data collection.'"
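
The command above reads two environment variables, `GOOGLE_CLOUD_PROJECT` and `TERRA_OWNER_EMAIL`. As a minimal sketch, assuming a POSIX shell (for example a terminal or a `%%bash` cell), a check like the following can confirm that both are set and preview the workspace ID before anything is created. The helper name `check_workspace_env` is hypothetical and is not part of the Terra CLI.

```shell
# Hypothetical helper (not part of the Terra CLI): report whether the two
# environment variables used by the workspace-creation command are set.
check_workspace_env() {
  missing=0
  for name in GOOGLE_CLOUD_PROJECT TERRA_OWNER_EMAIL; do
    # Indirect lookup via eval keeps this portable to plain POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

if check_workspace_env; then
  # Same ID pattern as the --id flag in the workspace-creation command.
  echo "Workspace ID will be: ${GOOGLE_CLOUD_PROJECT}-dc-ws"
  echo "Owner email: ${TERRA_OWNER_EMAIL}"
else
  echo "Set the missing variables before running 'terra workspace create'."
fi
```

If either variable is reported missing, the workspace ID or name would be built from an empty value, so it is probably worth fixing the environment first.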

Before converting the workspace into a data collection, we must set any relevant data policies.

## Provenance

Generate information about this notebook environment and the packages installed.

In [None]:
!date

Conda and pip installed packages:

In [None]:
!conda env export

JupyterLab extensions:

In [None]:
!jupyter labextension list

Number of cores:

In [None]:
!grep ^processor /proc/cpuinfo | wc -l

Memory:

In [None]:
!grep "^MemTotal:" /proc/meminfo
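
`/proc/meminfo` reports `MemTotal` in kB, which can be awkward to read at a glance. As a rough sketch, assuming a Linux environment with `awk` available, the value could be converted to GiB as follows. The helper name `kb_to_gib` is hypothetical and introduced only for this illustration.

```shell
# Hypothetical helper: convert a kibibyte count to GiB with two decimals.
kb_to_gib() {
  awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / 1048576 }'
}

# Read MemTotal (second field, in kB) only when /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
  mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
  echo "Memory: $(kb_to_gib "$mem_kb") GiB"
fi
```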

---
Copyright 2022 Verily Life Sciences LLC

Use of this source code is governed by a BSD-style   
license that can be found in the LICENSE file or at   
https://developers.google.com/open-source/licenses/bsd