# Update and organize CAVE tables
After creating a table on [CAVE](https://github.com/seung-lab/CAVEclient), we often need to maintain it. `upload.py` provides several functions for such maintenance. Here, we demonstrate how to organize your CAVE tables, using as an example our upload of some neuronal nuclei to the soma table on CAVE.

## 0. Prepare CAVE client
First, initialize CAVEclient. If this is your first time using CAVE, visit [this page](https://globalv1.daf-apis.com/info/) to check which datasets you have access to. You can also consult the [official CAVEclient documentation](https://caveclient.readthedocs.io/en/latest/index.html).

In [1]:
import numpy as np
import pandas as pd
from caveclient import CAVEclient

import fanc.auth as auth
from fanc.upload import CAVEorganizer, xyz_StringSeries2List

In [2]:
datastack_name = 'fanc_production_mar2021'
client = CAVEclient(datastack_name)

## 1. Prepare a DataFrame with new somata

The FANC community members reported several neuronal somata that were missing from the original soma table. We assigned 17-digit nucleus IDs, starting from `10000000000000001`, to these manually identified somata. (By contrast, automatically detected nuclei have randomly generated 17-digit nucleus IDs with `7` as the most significant digit; see [here](https://github.com/seung-lab/cloud-volume/wiki/Graphene) for more detailed information.) These new somata were reported in a Google Sheet, so we first downloaded it as a `tsv` file and formatted it as shown in the `In [3]` cell below. Make sure each soma is stored in a separate row (annotation).
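
The ID convention described above can be sketched in a few lines of plain Python. This is an illustration only, not part of `fanc.upload`: the helper names are hypothetical, treating a leading `1` as the mark of a manually added soma is inferred from the starting value `10000000000000001`, and handing out IDs sequentially is an assumption.

```python
# First manually assigned nucleus id, as stated in the text above.
MANUAL_ID_START = 10000000000000001


def is_manual_nucleus_id(nucleus_id) -> bool:
    """Hypothetical check: 17 digits with a leading 1 means a manually added soma."""
    digits = str(int(nucleus_id))
    return len(digits) == 17 and digits[0] == "1"


def next_manual_nucleus_id(existing_ids) -> int:
    """Hypothetical helper: assumes manual ids are handed out sequentially."""
    manual = [int(i) for i in existing_ids if is_manual_nucleus_id(i)]
    return max(manual) + 1 if manual else MANUAL_ID_START


# Two automatically detected ids and one manual id, taken from the soma table.
ids = [72129612586418230, 10000000000000041, 73115256175461639]
print([is_manual_nucleus_id(i) for i in ids])  # [False, True, False]
print(next_manual_nucleus_id(ids))             # 10000000000000042
```

Note that 17-digit IDs exceed the range in which 64-bit floats represent every integer exactly, which is presumably why the cell below reads `Cell body ID` with `dtype` set to `object` rather than letting pandas infer a float column.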

In [3]:
path_to_tsv = "../Output/FANC missing soma - missing soma.tsv"
missing_soma = pd.read_table(path_to_tsv,
                             usecols = ['Voxel coordinate (x, y, z)', 'Added to CAVE table? (YYYYMMDD)', 'Cell body ID'],
                             dtype = {'Added to CAVE table? (YYYYMMDD)': str, 'Cell body ID': object})
missing_soma_notna = missing_soma[missing_soma['Cell body ID'].notna()]

to_be_added = missing_soma_notna.reindex(columns=['Voxel coordinate (x, y, z)'])
to_be_added = to_be_added.rename(columns={'Voxel coordinate (x, y, z)': 'pt_position'})
to_be_added['pt_position'] = xyz_StringSeries2List(to_be_added['pt_position'])
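
The last line of the cell above turns coordinate strings into `[x, y, z]` lists. As a rough sketch of that kind of conversion (not the actual `xyz_StringSeries2List` implementation, whose accepted input formats are not shown here), a hypothetical `parse_xyz` helper could look like this:

```python
import re


def parse_xyz(text: str) -> list:
    """Parse a string such as "49453, 171782, 50" into [49453, 171782, 50].

    Hypothetical helper for illustration; it accepts any three integers
    separated by non-digit characters and rejects everything else.
    """
    numbers = re.findall(r"-?\d+", text)
    if len(numbers) != 3:
        raise ValueError(f"expected three integers, got {text!r}")
    return [int(n) for n in numbers]


print(parse_xyz("49453, 171782, 50"))   # [49453, 171782, 50]
print(parse_xyz("(7440,102568,2126)"))  # [7440, 102568, 2126]
```

On a pandas Series of strings, the same function could be applied with `series.map(parse_xyz)`. Raising on malformed input is a deliberate choice in this sketch, so that a mistyped spreadsheet cell is caught before anything is uploaded.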

## 2. Instantiate `organizer.update_soma`
Next, instantiate a CAVE organizer from `fanc.upload` using the client you set up. Since we want to edit the soma table and its subset table(s) this time, we use `organizer.update_soma` and initialize it by specifying the soma table and which cell types we will add. This function often calls `caveclient.materializationengine.live_live_query()` [(code)](https://github.com/seung-lab/CAVEclient/blob/master/caveclient/materializationengine.py#982), which prints a WARNING message every time it runs. You can ignore these messages for now.

In [4]:
organizer = CAVEorganizer(client)
organizer.update_soma.initialize("somas_dec2022", "neuron")



Ready to update soma table: somas_dec2022 and subset soma table: neuron_somas_dec2022
Please make sure you have separate soma in each annotation and have all information required: ['pt'].


You can inspect the up-to-date version of the soma table with `organizer.update_soma.soma_table`, and the neuronal soma table with `organizer.update_soma.subset_table`.

In [5]:
organizer.update_soma.soma_table

| | id | valid | volume | pt_supervoxel_id | pt_root_id | pt_position | bb_start_position | bb_end_position |
|---|---|---|---|---|---|---|---|---|
| 0 | 72129612586418230 | t | 9.933692 | 72272137318248598 | 648518346487080213 | [7440, 102568, 2126] | [7168.0, 102048.0, 2099.0] | [7712.0, 103088.0, 2154.0] |
| 1 | 72129612653527663 | t | 25.513715 | 72272068800215905 | 648518346474233837 | [7056, 102224, 2500] | [6624.0, 101712.0, 2457.0] | [7488.0, 102736.0, 2543.0] |
| 2 | 72129612720635938 | t | 70.696932 | 72272068867345230 | 648518346500365939 | [7152, 101136, 2607] | [6512.0, 100544.0, 2551.0] | [7792.0, 101728.0, 2663.0] |
| 3 | 72129612787745097 | t | 30.472467 | 72272069001526660 | 648518346515615306 | [7528, 101640, 2884] | [7008.0, 101088.0, 2850.0] | [8048.0, 102192.0, 2918.0] |
| 4 | 72129612787745291 | t | 47.379510 | 72272069068774152 | 648518346507454262 | [6648, 100800, 3046] | [6192.0, 100208.0, 2983.0] | [7104.0, 101392.0, 3110.0] |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 16684 | 73115256242570721 | t | 44.748474 | 74172987233942977 | 648518346511845488 | [61640, 130136, 2964] | [61072.0, 129632.0, 2921.0] | [62208.0, 130640.0, 3007.0] |
| 16685 | 72764580417241202 | t | 40.386136 | 73471635449375881 | 648518346498317138 | [42192, 199680, 1501] | [41568.0, 199280.0, 1454.0] | [42816.0, 200080.0, 1549.0] |
| 16686 | 10000000000000041 | t | | 73752147614832606 | 648518346517540644 | [49453, 171782, 50] | [nan, nan, nan] | [nan, nan, nan] |
| 16687 | 73115256175461639 | t | 189.924239 | 74172987099839888 | 648518346470483710 | [63008, 130032, 2711] | [62096.0, 129328.0, 2628.0] | [63920.0, 130736.0, 2794.0] |


The code below uploads the new somas in your dataframe to both the soma table and the subset table on CAVE. Remove the `%%script false` line when running it. If the upload succeeds, you will see the message `Successfully uploaded!`. If your dataframe is not formatted correctly, you will receive an error message, e.g., when the dataframe has no `pt_position` column, when a row does not contain a single annotation, or when some of the new somas are already reported.

In [None]:
%%script false --no-raise-error

organizer.update_soma.add_dataframe(to_be_added)
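
Before running the upload, it can help to check locally for the failure cases listed above. The sketch below is a hypothetical pre-check and not what `add_dataframe` does internally: it works on plain dicts, and it detects "already reported" somas by exact coordinate match only, which is an assumption made for illustration.

```python
from numbers import Integral


def check_new_somas(rows, existing_positions):
    """Return a list of problems found in `rows` (a list of dicts); empty means OK.

    Hypothetical pre-check: duplicates are detected by exact [x, y, z] match.
    """
    problems = []
    seen = {tuple(p) for p in existing_positions}
    for i, row in enumerate(rows):
        if "pt_position" not in row:
            problems.append(f"row {i}: missing pt_position column")
            continue
        pos = row["pt_position"]
        is_point = (
            isinstance(pos, (list, tuple))
            and len(pos) == 3
            and all(isinstance(v, Integral) for v in pos)
        )
        if not is_point:
            problems.append(f"row {i}: pt_position must be one [x, y, z] point")
            continue
        if tuple(pos) in seen:
            problems.append(f"row {i}: soma at {list(pos)} is already reported")
        seen.add(tuple(pos))  # also catches duplicates within `rows`
    return problems


rows = [
    {"pt_position": [49453, 171782, 50]},
    {"pt_position": [7440, 102568, 2126]},
    {"position": [1, 2, 3]},
]
existing = [[7440, 102568, 2126]]
for problem in check_new_somas(rows, existing):
    print(problem)
# row 1: soma at [7440, 102568, 2126] is already reported
# row 2: missing pt_position column
```

With pandas, `rows` could be obtained from `to_be_added.to_dict("records")` and `existing` from the `pt_position` column of `organizer.update_soma.soma_table`.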

## 3. Preview somas
The following code generates a Neuroglancer URL that shows the soma points as an annotation layer. If you have just uploaded your somata to CAVE, you need to wait about an hour, because the server needs to ingest them and look up their supervoxels.

In [None]:
organizer.update_soma.update_tables()
organizer.update_soma.preview()