# Quickstart Guide: Accessing Cube Data

In the Cube web application (https://cube.jax.org) you can search through various elements such as Studies, Assays, and Data Sets. Using the controls along the top of the table you can filter based on properties to narrow your search. From the "Data Sets" page, for example, you can select data to add to your collection. Once a data set is in your collection you can use the `cube_r` package to access and inspect those data.

The commands in this notebook show how to access data sets in your collection. Other elements, such as studies and assays, can be inspected here as well, but they carry only metadata and have no associated data.

In [None]:
# Load the required libraries (all of them are already installed)
library(devtools)
library(httr)
library(jsonlite)
library(stringr)
library(rapportools)
library(htmlwidgets)
library(rlist)
library(data.table)
library(cloudml)
library(readr)
library(logger)
library(cube)
# Work from the home directory
setwd("~/")
# Show log messages at INFO level and above
log_threshold(INFO)

## Logging in

After running the cell below to log in to the Cube service, you will see a link appear in the response. **You will need to click on the `verification_uri_complete` link to complete the login process.** That associates this notebook session with your user ID and lets you pull in data from your collection at https://cube.jax.org/.

In [None]:
# Create a CubeAPI object
cube_api = CubeAPI$new()
# You only need to log in once per month.
# After running this cell, click on "verification_uri_complete" to finish verification.
cube_api$login()

## Get the data from your collection

In [162]:
response = cube_api$get_metadata_collection()
json = response_json_to_data(response)
str(json$results)

INFO [2021-01-12 19:25:40] GET: http://10.105.16.22/metadata-service/metadata_repository/collection/
INFO [2021-01-12 19:25:40] status_code: 200


No encoding supplied: defaulting to UTF-8.



'data.frame':	1 obs. of  5 variables:
 $ id              : int 10
 $ collection_items:List of 1
  ..$ :'data.frame':	1 obs. of  7 variables:
  .. ..$ id           : int 352
  .. ..$ collection   :'data.frame':	1 obs. of  4 variables:
  .. .. ..$ id             : int 10
  .. .. ..$ collection_name: logi NA
  .. .. ..$ user_name      : chr "waad.939JXKnVNUKLJTw6JEVLxgIlZoYOeIfrJ58NLiCbVQc"
  .. .. ..$ date_created   : chr "2020-12-22T14:24:16.893718Z"
  .. ..$ collection_id: int 10
  .. ..$ accession_id : chr "JAXDS0000G"
  .. ..$ item_type    : chr "Data Set"
  .. ..$ item_label   : logi NA
  .. ..$ date_created : chr "2021-01-12T19:24:47.916557Z"
 $ user_name       : chr "waad.939JXKnVNUKLJTw6JEVLxgIlZoYOeIfrJ58NLiCbVQc"
 $ collection_name : logi NA
 $ date_created    : chr "2020-12-22T14:24:16.893718Z"
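
The nested structure printed above can also be walked by hand, which helps when checking what is in a collection before requesting data. The following is a minimal sketch in base R, not the way `cube_r` itself extracts IDs. It builds a small mock of `json$results` using a few values copied from the output above (the names `results`, `items`, and `data_set_ids` are illustrative), then pulls out the accession IDs of the items whose `item_type` is "Data Set".

```r
# Mock of json$results, mirroring part of the str() output above.
# Only a few of the fields are reproduced; the values are copied from that output.
items <- data.frame(
  id = 352L,
  collection_id = 10L,
  accession_id = "JAXDS0000G",
  item_type = "Data Set",
  stringsAsFactors = FALSE
)
results <- data.frame(id = 10L)
results$collection_items <- list(items)  # list column: one data frame per collection

# Stack the items of every collection into one table, then keep only data sets.
all_items <- do.call(rbind, results$collection_items)
data_set_ids <- all_items$accession_id[all_items$item_type == "Data Set"]
print(data_set_ids)
```

With a real response, the same two lines at the end should apply to `json$results` directly, assuming the structure matches the output shown above.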


## With the accession ID, get the pointer to the data

In [None]:
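# Extract the accession IDs from the collection response
accession_ids = cube_api$parse_accession_ids(response)
# Look up the element instances for those accession IDs
response = cube_api$get_element_instance(accession_ids = c(accession_ids))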

In [None]:
# Parse the storage URI from the response and keep the first row
uri = cube_api$parse_storage_uri(response)[1,]
uri

## Get the data file and read into a data frame

In [None]:
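# The third field of the parsed URI holds the bucket name, the fourth the file name
bucket_name = uri[[3]][[1]]
file_name = uri[[4]][[1]]
# Resolve the bucket to a local data directory (gs_data_dir comes from cloudml)
data_dir = gs_data_dir(bucket_name)
# The data file is tab-separated with a header row
df <- read.table(file.path(data_dir, file_name), sep = '\t', header = TRUE)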
head(df)
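
To try the read step without access to a bucket, the same `read.table` call can be run against a small local file. This is a sketch under stated assumptions: the temporary directory stands in for `data_dir`, the file name `example.tsv` is hypothetical, and the two data rows are invented placeholders rather than real Cube data.

```r
# Write a tiny placeholder TSV to a temporary directory (stand-in for data_dir).
example_dir <- tempdir()
example_file <- "example.tsv"  # hypothetical file name
example_path <- file.path(example_dir, example_file)
writeLines(c("sex\tJAX_ASSAY_BODYWEIGHT", "F\t20.5", "M\t25.1"), example_path)

# Fail early with a clear message if the file is not where we expect it.
if (!file.exists(example_path)) stop("File not found: ", example_path)

# Same call shape as in the cell above: tab-separated, first row is the header.
example_df <- read.table(example_path, sep = "\t", header = TRUE,
                         stringsAsFactors = FALSE)
str(example_df)
```

Checking `file.exists()` first gives a clearer error than the one `read.table` raises when the path is wrong.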

In [None]:
# Show only the columns of interest
df[c('CLIMB.ID', 'sex', 'line', 'strain', 'diet', 'treatment', 'JAX_ASSAY_BODYWEIGHT')]
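
Selecting columns by name fails with an "undefined columns selected" error if a data set lacks one of them. The sketch below shows one defensive pattern: report the missing names and select only the columns that are present. The column names come from the cell above. The table `example_df` and its values are invented placeholders, and `wanted`, `missing_cols`, and `subset_df` are illustrative names.

```r
# Placeholder table: column names come from the cell above, values are invented.
example_df <- data.frame(
  CLIMB.ID = c(1L, 2L),
  sex = c("F", "M"),
  JAX_ASSAY_BODYWEIGHT = c(20.5, 25.1),
  extra_column = c("a", "b"),
  stringsAsFactors = FALSE
)

wanted <- c("CLIMB.ID", "sex", "line", "strain", "diet", "treatment",
            "JAX_ASSAY_BODYWEIGHT")

# Report any requested columns that this table does not have.
missing_cols <- setdiff(wanted, names(example_df))
if (length(missing_cols) > 0) {
  message("Columns not found: ", paste(missing_cols, collapse = ", "))
}

# Keep only the requested columns that exist, in the requested order.
subset_df <- example_df[, intersect(wanted, names(example_df)), drop = FALSE]
print(subset_df)
```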