# Crucible Tutorial


## Part 1: Setup
- Install the Crucible Python client
- Import packages
- Retrieve your personal Crucible API key
- Initialize your client

#### Install the client from GitHub

In [None]:
!pip install git+https://github.com/MolecularFoundryCrucible/pycrucible.git

#### Import packages

In [None]:
import os
import json
import pprint
import uuid
from typing import List, Dict
from datetime import datetime
import h5py
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from pathlib import Path
import ipywidgets as widgets
from IPython.display import display, clear_output

from pycrucible import CrucibleClient, SecureInput

#### Retrieve your API key

In your web browser, navigate to https://crucible.lbl.gov/testapi/user_apikey.

You will be prompted to log in with your ORCID account.

Once you have logged in, run the cell below and paste the resulting API key into the box.

**Note:** If you do not have an ORCID, you can create one here: https://orcid.org/register

In [None]:
SecureInput(description = "Enter your API key:", var_name = 'CRUCIBLE_API_KEY')

#### Initialize the client

In [None]:
API_URL = "https://crucible.lbl.gov/testapi"
API_KEY = os.environ.get("CRUCIBLE_API_KEY")

# Initialize the client
client = CrucibleClient(API_URL, API_KEY)
print("Crucible client initialized successfully!")
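
Because `os.environ.get` returns `None` when a variable is missing, the cell above can build a client without a key and report success anyway. The following is a minimal sketch of a guard, assuming the key lives in the `CRUCIBLE_API_KEY` environment variable as above. `require_api_key` is a hypothetical helper written for this tutorial, not part of pycrucible.

```python
import os

def require_api_key(var_name="CRUCIBLE_API_KEY", env=None):
    """Return the API key stored under var_name, or raise if it is missing or blank."""
    # Hypothetical helper: 'env' defaults to the real environment but can be
    # replaced with a plain dict for testing.
    env = os.environ if env is None else env
    key = (env.get(var_name) or "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; run the SecureInput cell first.")
    return key
```

One way to use it would be `API_KEY = require_api_key()` before constructing the client, so a missing key fails immediately with a readable message.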

### Download a batch of perovskite data
For this demo, we will use data from a batch of perovskite wafers generated by Yi-Ru. The batch is named `S-pMeMBAI-pre-2` and has the unique ID `0t3h7ymbm5s27000z6tt82zvx4`.



##### Query the Data

In [None]:
# set the batch_id as a variable
batch_id = '0t3h7ymbm5s27000z6tt82zvx4'

In [None]:
# list all of the samples associated with this batch
client.list_samples(parent_id = batch_id)

In [None]:
# list all of the datasets associated with this batch
client.list_datasets(sample_id = batch_id)
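
The download cell later in this notebook indexes each dataset with `ds['unique_id']`, which suggests the listing calls return a list of dictionaries. Under that assumption, a compact one-line-per-record view can be easier to scan than the raw output. The sketch below uses made-up records; `summarize_records` is a hypothetical helper, and the `dataset_name` field is an assumption about what a record contains.

```python
def summarize_records(records, fields=("unique_id", "dataset_name")):
    """Return one short line per record, using '-' for fields that are absent."""
    lines = []
    for rec in records:
        lines.append(" | ".join(str(rec.get(f, "-")) for f in fields))
    return lines

# Made-up records standing in for the output of client.list_datasets(...)
example_records = [
    {"unique_id": "id-001", "dataset_name": "scan_a"},
    {"unique_id": "id-002"},
]
for line in summarize_records(example_records):
    print(line)
```

With a live client, the same helper could be applied to `client.list_datasets(sample_id = batch_id)`, adjusting `fields` to whatever keys the records actually carry.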

##### Download data files

After running the following cell, you can navigate to the file system on the right by clicking the folder icon. You should see a folder titled "crucible_downloads" containing all of the files you just downloaded.

In [None]:
batch_datasets = client.list_datasets(sample_id = batch_id)
for ds in batch_datasets[0:2]:
    print(ds)
    try:
        client.download_dataset(dsid = ds['unique_id'])
        print('downloaded')
    except Exception as err:
        print(err)
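
The loop above prints errors as they occur, which can be hard to review once many datasets are involved. The following sketch collects successes and failures instead. `download_all` is a hypothetical helper; it takes the download function as an argument so it can be tried without a network connection, and it assumes only that each record has a `unique_id` key and that the function accepts a `dsid` keyword, as in the cell above.

```python
def download_all(datasets, download_fn):
    """Call download_fn(dsid=...) for each record; return (succeeded, failed)."""
    succeeded, failed = [], {}
    for ds in datasets:
        dsid = ds["unique_id"]
        try:
            download_fn(dsid=dsid)
            succeeded.append(dsid)
        except Exception as err:  # keep going, but remember what went wrong
            failed[dsid] = str(err)
    return succeeded, failed
```

With the client from this notebook, a call might look like `ok, bad = download_all(batch_datasets[0:2], client.download_dataset)`, after which `bad` maps each failed ID to its error message.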

### Adding data with the API

#### Add a project you are working on

In [None]:
help(client.add_project)

In [None]:
client.add_project(project_info = {"project_id":"AUM_DEMO",
                                   "organization":"Summer School",
                                   "project_lead_email":"mkwall@lbl.gov"})

#### Add a sample

In [None]:
sample = client.add_sample()

#### Add a dataset from your Google Drive

In [None]:
# mount your google drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
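# choose a file (this example uses Colab's bundled sample data;
# point this at a path under /content/drive to use a file from your Drive)
your_file_path = "sample_data/california_housing_train.csv"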

# define some metadata you want to add to this dataset
metadata_to_add = {'comments': 'this is a fake dataset',
                   'weather': 'sunny',
                   'iphone_version': 11
                  }
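
If the metadata dictionary is sent to the server as JSON (an assumption here, not something this notebook confirms), values such as sets or NumPy arrays would fail to serialize. The sketch below checks a dictionary in advance; `non_json_keys` is a hypothetical helper, not part of pycrucible.

```python
import json

def non_json_keys(metadata):
    """Return the keys whose values json.dumps cannot serialize."""
    bad = []
    for key, value in metadata.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            bad.append(key)
    return bad

# Same shape as the metadata defined above
example_metadata = {"comments": "this is a fake dataset",
                    "weather": "sunny",
                    "iphone_version": 11}
print(non_json_keys(example_metadata))
```

An empty list means every value passed the check; any key that is reported can be converted (for example, an array to a list) before uploading.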

In [None]:
# fill out the fields and send the data to Crucible
results = client.build_new_dataset_from_file(files_to_upload = [your_file_path],
                                        dataset_name = None, # this will default to the file name
                                        project_id = None, # this will default to unknown
                                        instrument_name = None, # default is null
                                        measurement = None, # default is null
                                        session_name = None, # default is null
                                        source_folder = None, # this will default to the base directory
                                        scientific_metadata = metadata_to_add, # this is the dictionary you defined above
                                        keywords = [], # list any keywords you want to be able to search on
                                        ingestor = 'CrucibleDatasetIngestor', # use a generic ingestor
                                        verbose = False,
                                        wait_for_ingestion_response = True)

ds = results['created_record']
pprint.pprint(ds)

#### Associate this dataset with the sample you created

In [None]:
# define the dataset and sample
dataset_id = ds['unique_id']
sample_id = sample['unique_id']

# link them!
client.add_dataset_to_sample(dataset_id = dataset_id, sample_id = sample_id)

In [None]:
# see all the datasets associated with your sample
client.list_datasets(sample_id = sample_id)
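
Rather than reading through the listing by eye, the link can be checked programmatically. This is a minimal sketch, assuming the listing is a list of dictionaries that each carry a `unique_id` key, as the records used earlier in this notebook do. `is_linked` is a hypothetical helper.

```python
def is_linked(datasets, dataset_id):
    """Return True if any record in datasets has the given unique_id."""
    return any(rec.get("unique_id") == dataset_id for rec in datasets)

# Made-up records standing in for client.list_datasets(sample_id = sample_id)
example_listing = [{"unique_id": "ds-1"}, {"unique_id": "ds-2"}]
print(is_linked(example_listing, "ds-2"))
```

With a live client, the equivalent call would be `is_linked(client.list_datasets(sample_id = sample_id), dataset_id)`.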

#### Send your dataset from Crucible to SciCat

In [None]:
client.send_to_scicat(dsid = ds['unique_id'], wait_for_scicat_response = True)

Go to https://mf-scicat.lbl.gov to take a quick look at your data.