# Jupyter notebook in Scispot

- Scispot --> LabSheets --> Jupyter --> +New --> write script/notebook
  - ~~need jupyterhub server credentials?~~

- ? can I look up a UUID by sample name?
  - update-by-id to fetch UUID from Sample ID
- ? can I look up UUIDs by manifest name?
  - not yet! 
- ? can I specify other "variable" inputs/params? or can scientist edit the ipynb with those values?
  - A: yes, can code input parameters 
  - ? can I read in other sheets into pandas?
    - ~~nm don't worry about this for now...~~ done, see below
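
For the Sample ID to UUID lookup, a minimal sketch of pulling the UUID out of a lookup response. It assumes the response carries parallel `headers` and `row` lists (the shape the notebook's `json_to_pd_df` relies on for find-row) and a `uuid` column (as in the TSV export header); whether find-row-by-id returns exactly this shape is not confirmed. `uuid_from_response` and the mock response are hypothetical names and placeholder values, not part of the Scispot API.

```python
def uuid_from_response(resp, uuid_col="uuid"):
    """Return the uuid column value from a headers/row style response.

    Assumes `resp` has parallel 'headers' and 'row' lists; this shape is an
    assumption based on how the notebook parses find-row results.
    """
    record = dict(zip(resp["headers"], resp["row"]))
    if uuid_col not in record:
        raise KeyError(f"no {uuid_col!r} column in response headers")
    return record[uuid_col]


# mock response for illustration only (placeholder values, not real data)
mock_resp = {
    "headers": ["uuid", "Sample ID"],
    "row": ["00000000-0000-0000-0000-000000000000", "example_sample_1"],
}
print(uuid_from_response(mock_resp))
```

Zipping headers to values avoids depending on column order, which may matter if the sheet layout changes.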


### tips
- update-by-id to fetch UUID from Sample ID 
- https://www.loom.com/share/32c01c2cb1294720b15596b0a43af5e9

### notes 
- Sample IDs from 96-well manifest export → Protein Manager registrations
- Protein Manager registrations completed for following Sample manifests:
  - ChESS_dbgen1_plate1
  - ChESS_dbgen1_202212_plate1rep2 
  - ChESS_dbgen1_202212_plate1rep3
  - ... how to stage these in the future? LC-MS queue?

### todo
- Parent-Sample and Child-Protein UUID links need to be updated for historical Benchling protein manager entries
- Related: update programmatic entries with child linkages (i.e. update Sample Manager entries with Child links?)
- Feature “Request”: Protein Manager > Peptide Manager 
- UUIDs generated by the buggy add_entry call to Protein Manager:
  - 0344a99d-888a-443e-9a23-13eceb279ae1 - Sample Manager and Protein Manager and Peptide Manager and ...??
  - 90b19820-726d-4086-b74a-199d37899ea5 - Sample Manager and Protein Manager and Peptide Manager and ...??
  - 98f8e123-84ab-4171-b26e-6c3c0ad6ecf3 - Sample Manager and Protein Manager and Peptide Manager and ...??
  - f83d6043-0901-4362-bef9-324987e06451 - doesn't exist
  - 485546de-f451-46cd-afbb-7bad60e998d5
  - e4695003-2074-458c-8080-9a659bc208c5
- Pull a plate of samples by manifest name/location from the API? (not yet, per Satya)
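
For the Parent/Child link todo, a rough sketch of grouping child Protein UUIDs under their parent Sample UUID, so a `Children UUID`-style field on the Sample Manager side could be back-filled later. The `(child_uuid, parent_uuid)` pair input and the `children_by_parent` name are assumptions for illustration, the UUID strings are placeholders, and writing the links back to Scispot is not covered here.

```python
from collections import defaultdict


def children_by_parent(pairs):
    """Group child UUIDs under their parent UUID, preserving input order."""
    grouped = defaultdict(list)
    for child_uuid, parent_uuid in pairs:
        if parent_uuid:  # skip rows with no parent recorded
            grouped[parent_uuid].append(child_uuid)
    return dict(grouped)


# placeholder UUID strings for illustration only
pairs = [
    ("protein-uuid-1", "sample-uuid-A"),
    ("protein-uuid-2", "sample-uuid-A"),
    ("protein-uuid-3", "sample-uuid-B"),
    ("protein-uuid-4", ""),
]
links = children_by_parent(pairs)
for parent, children in links.items():
    print(parent, "->", ",".join(children))
```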


In [1]:
## script to generate Scispot registrations for ChESS 96-well plate
## lkp 2022/11/15

import pandas as pd
import requests
import json
import sys
import datetime

sys.stdout.write("Imported required packages")

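api_key_file = 'G:/My Drive/Lindsay Pino/proj/2023_scispot_utils/data/scispot_api_key.txt'
# read the key from the first line; strip the trailing newline so it is not sent to the API
with open(api_key_file, 'r') as fh:
    API_KEY = fh.readline().strip()
#print(API_KEY)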

Imported required packages
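
A possible variation on the key loading above, sketched under assumptions: prefer an environment variable and fall back to the key file, stripping whitespace either way. The `load_api_key` helper and the `SCISPOT_API_KEY` variable name are hypothetical, not something Scispot defines, and the demo uses a throwaway file with a dummy key.

```python
import os
import tempfile


def load_api_key(key_file, env_var="SCISPOT_API_KEY"):
    """Return the key from the environment if set, else the first line of key_file."""
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    with open(key_file, "r", encoding="utf-8") as fh:
        key = fh.readline().strip()
    if not key:
        raise ValueError(f"no API key found in {env_var} or {key_file}")
    return key


# demo with a throwaway file (dummy values only)
with tempfile.TemporaryDirectory() as tmp:
    demo_file = os.path.join(tmp, "scispot_api_key.txt")
    with open(demo_file, "w", encoding="utf-8") as fh:
        fh.write("dummy-key\n")
    os.environ.pop("SCISPOT_API_KEY", None)    # make sure the file path is exercised
    from_file = load_api_key(demo_file)
    os.environ["SCISPOT_API_KEY"] = "env-key"  # the environment takes precedence
    from_env = load_api_key(demo_file)
    os.environ.pop("SCISPOT_API_KEY", None)

print(repr(from_file), repr(from_env))
```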

In [3]:
## functions

# lookup an entry by UUID
def find_entry_from_uuid(manager, uuid_list):
    session = requests.Session()
    url = "https://api.scispot.io/tryingtofixcors/labsheets/find-row"
    payload = {
        "apiKey": API_KEY,
        "manager": manager,
        "uuid": uuid_list
    }
    ret = session.post(url, json=payload)
    return json.loads(ret.text)

# turn json Scispot return into a pandas df
def json_to_pd_df(json_in):
    df_out = pd.DataFrame(data=json_in['row']).T
    df_out.columns = list(json_in['headers'])
    return df_out

# fetch one row from a manager to get its columns
# defaults to the Protein Manager; pass "Sample Manager" for that sheet's columns
def fetch_random_entry(manager="Protein Manager"):
    session = requests.Session()
    url = "https://api.scispot.io/tryingtofixcors/labsheets/list-rows"
    payload = {
        "apiKey": API_KEY,
        "manager": manager,
        "pageSize": "1",
        "page": "1"
    }
    ret = session.post(url, json=payload)
    return json.loads(ret.text)

# fetch a Scispot entry based on the Sample ID
def fetch_entry_from_id(manager, sample_id):
    session = requests.Session()
    url = "https://api.scispot.io/tryingtofixcors/labsheets/find-row-by-id"
    payload = {
        "apiKey": API_KEY,
        "labsheet": manager,
        "id": sample_id
    }
    ret = session.post(url, json=payload)
    return json.loads(ret.text)

sys.stdout.write("Imported required functions")

Imported required functions
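
The request functions in this notebook differ mainly in the endpoint name and in which payload key names the sheet (`manager` for find-row, list-rows, and add-rows; `labsheet` for find-row-by-id). A small sketch that keeps that mapping in one place; `build_request`, `SHEET_KEY`, and `BASE_URL` are hypothetical names, and the mapping only reflects the payloads used in this notebook, not a documented API contract.

```python
BASE_URL = "https://api.scispot.io/tryingtofixcors/labsheets"

# payload key that names the sheet for each endpoint, as used in this notebook
SHEET_KEY = {
    "find-row": "manager",
    "list-rows": "manager",
    "find-row-by-id": "labsheet",
    "add-rows": "manager",
}


def build_request(endpoint, api_key, sheet, **fields):
    """Return (url, payload) for one of the labsheets endpoints used here."""
    if endpoint not in SHEET_KEY:
        raise ValueError(f"unknown endpoint: {endpoint}")
    payload = {"apiKey": api_key, SHEET_KEY[endpoint]: sheet}
    payload.update(fields)
    return f"{BASE_URL}/{endpoint}", payload


# dummy key and placeholder Sample ID for illustration
url, payload = build_request(
    "find-row-by-id", "dummy-key", "Sample Manager", id="example_sample_1"
)
print(url)
print(payload)
```

The returned pair can be handed to `requests.post(url, json=payload)`; adding a `timeout=` argument there keeps a stalled call from hanging the notebook.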

In [None]:
## examples

# look up an entry based on UUID
test_uuid = "0344a99d-888a-443e-9a23-13eceb279ae1"
find_entry_from_uuid('Sample Manager', test_uuid)

# make the scispot return a pretty dataframe
json_to_pd_df(find_entry_from_uuid('Sample Manager', test_uuid))

# initialize an empty Protein Manager
entry = fetch_random_entry()  # fetch once and reuse, instead of two API calls
prot_df = pd.DataFrame(data=entry['rows'], columns=entry['headers'])
prot_df = prot_df.drop(prot_df.index[0])
print(prot_df)

# look up an entry based on Sample ID
test_sampleid = "THP1_dbgen1_VTP50469_3"
fetch_entry_from_id("Sample Manager", test_sampleid)

In [4]:
# add a new row to the Protein Manager
def add_protein_manager_entry(new_row):
    url = "https://api.scispot.io/tryingtofixcors/labsheets/add-rows"
    payload = {
        "apiKey": API_KEY,
        "manager": "Protein Manager",
        "rows": [new_row]
    }
    headers = {"Content-Type": "application/json"}
    #print(new_row)
    
    response = requests.post(url, json=payload, headers=headers)
    print(response.text)

# generate a new row for the protein manager
def new_protein_row(sample_uuid, fraction):
    
    # generate protein ID from sample ID
    sample_id = json_to_pd_df(find_entry_from_uuid("Sample Manager", sample_uuid))['Sample ID'][0]
    #print(sample_id)
    protein_id = sample_id + "_" + fraction
          
    # build the row in Protein Manager column order
    row = [
        protein_id, # Protein ID
        sample_uuid, # Sample ID required, must be UUID
        "e45d66e9-abed-4d24-9dd8-8f074233c2d8", # Protocol ID required, must be UUID
        "01/01/2023", # "Extraction Date" column required, must be MM/DD/YYYY
        "Lindsay", # "Extraction Scientist" column required, must be name
        fraction, # Extraction fraction
        "extractionnote_here", # Extraction Note
        "0", # Protein Concentration (ug/mL)
        "bca_here", # BCA Results
        "", # Location
        "storageform_here", # Storage Form
        "storageamt_here", # Storage Amount (uL)
        "True", # "Pass" column required
        sample_uuid, # Parent UUID
        "", # Children UUID
        "", # Created By
        "" # Creation Date
    ]

    return row

sys.stdout.write("Imported functions for manipulating Protein Manager")

Imported functions for manipulating Protein Manager
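
The row template hard-codes the extraction date and relies on comments for the "must be UUID" and "MM/DD/YYYY" requirements. A small sketch of helpers that could enforce both before a row is posted; `extraction_date` and `is_uuid` are hypothetical names, and the checks only mirror the requirements noted in the comments above.

```python
import datetime
from uuid import UUID  # imported by name so a loop variable called `uuid` cannot shadow it


def extraction_date(day=None):
    """Format a date as MM/DD/YYYY; defaults to today."""
    day = day or datetime.date.today()
    return day.strftime("%m/%d/%Y")


def is_uuid(value):
    """True if value parses as a UUID string."""
    try:
        UUID(str(value))
    except ValueError:
        return False
    return True


print(extraction_date(datetime.date(2023, 1, 1)))        # 01/01/2023
print(is_uuid("e45d66e9-abed-4d24-9dd8-8f074233c2d8"))   # True
print(is_uuid("THP1_dbgen1_VTP50469_3"))                 # False
```

`UUID()` also accepts 32 hex digits without hyphens, so `is_uuid` is a loose sanity check that catches a Sample ID passed where a UUID is expected, not a strict format validator.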

In [None]:
##
## GENERATE PROTEIN SAMPLES
##

# read in a Scispot plate TSV export of Sample Manager
# ChESS_dbgen1_plate1 was done manually
# note: the rep2 and rep3 lines below point at the same export file
#sample_fi = "G:/My Drive/Lindsay Pino/proj/2023_scispot_utils/data/Sample-2023-04-10T19_24_09.tsv" # ChESS_dbgen1_202212_plate1rep2 
sample_fi = "G:/My Drive/Lindsay Pino/proj/2023_scispot_utils/data/Sample-2023-04-10T19_24_09.tsv" # ChESS_dbgen1_202212_plate1rep3
# utf-8-sig strips the byte-order mark that otherwise turns the first header into "ï»¿uuid"
input_df = pd.read_table(sample_fi, sep='\t', engine="python", encoding="utf-8-sig")
input_df.tail()

# define Protein fractions to be generated for each Sample
extraction_frx = ['cytosol',
                  'chromatin',
                  'insoluble']

# generate the new protein samples and add them to SciSpot Protein Manager
for uuid in input_df["uuid"]:

    # generate the Protein Manager row for each fraction
    for frx in extraction_frx:
        new_row = new_protein_row(uuid, frx)

        # add the row to the Protein Manager (left commented out for a dry run)
        #add_protein_manager_entry(new_row)
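
The odd `ï»¿uuid` header is what a UTF-8 byte-order mark looks like when the file is decoded with a single-byte Windows codec. A small standard-library sketch of the effect, using made-up bytes in place of a real export (the `Sample ID` column and both values are placeholders):

```python
import csv
import io

# first bytes of a UTF-8 export that starts with a byte-order mark (BOM)
raw = b"\xef\xbb\xbfuuid\tSample ID\nplaceholder-uuid\texample_sample_1\n"

# a single-byte Windows codec keeps the three BOM bytes as visible characters
print(raw.decode("cp1252").split("\t")[0])   # ï»¿uuid

# utf-8-sig consumes the BOM, leaving a clean header
text = raw.decode("utf-8-sig")
rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
print(list(rows[0].keys()))
```

pandas accepts the same codec name through the `encoding` argument of `pd.read_table`, so the column can then be addressed as plain `uuid`.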

In [None]:
##
## GENERATE PEPTIDE SAMPLES
##

