# Proofread in eCREST

Files generated by this script can also be opened in the original CREST, though some information may be lost when using the original CREST.py or .exe.

## Setup

Do the following two setup steps regardless of how you will be using this script. 

### 1. Imports

Run the following code cell to import the necessary packages and modules. 

In [37]:
############################################################################################################################ 
# Get the latest CREST files for each ID within the target folder (dirname)

from pathlib import Path
import json
from sqlite3 import connect as sqlite3_connect
from sqlite3 import DatabaseError
from igraph import Graph as ig_Graph
from igraph import plot as ig_plot
from scipy.spatial.distance import cdist
from random import choice as random_choice
from itertools import combinations
from numpy import array, unravel_index, argmin, mean,unique,nan
import pandas as pd
from copy import deepcopy
from datetime import datetime
from time import time
import neuroglancer
from webbrowser import open as wb_open
from webbrowser import open_new as wb_open_new

# from eCREST_cli_beta import ecrest, import_settings
from eCREST_cli import ecrest, import_settings

The 'ecrest' class has been imported from eCREST_cli.py

An instance of this object will be able to:
- open a neuroglancer viewer for proofreading (see "Proofread using eCREST")
    - add-remove segments (using graph feature for efficiency)
    - format itself and save itself as a CREST-style .json
- convert from neuroglancer json (see "Convert From Neuroglancer to eCREST")
    - format itself and save itself as a CREST-style .json
    


# USING THE CREST_JSON class

## Settings definitions

Whether you are converting from neuroglancer or creating a new reconstruction, the settings_dict parameter is needed to create correctly formatted CREST json files. 
- 'save_dir' : the directory where JSON files are saved 
- 'cred' and 'db_path' : specify the path to the agglomeration database file on your local computer. 

In [None]:
settings_dict = {
    'save_dir' : '/Users/kperks/Documents/eCREST-local-files/in-progress',
    'db_path' : '/Users/kperks/Documents/eCREST-local-files/Mariela_bigquery_exports_agglo_v230111c_16_crest_proofreading_database.db',
    'max_num_base_added' : 1000,
    'cell_structures' : ['unknown','axon', 'basal dendrite', 'apical dendrite', 'dendrite', 'multiple'],
    'annotation_points' : ['exit volume', 'natural end', 'uncertain', 'pre-synaptic', 'post-synaptic']
}

### Import settings

If you save a copy of settings_dict.json (found in the "under construction" directory of eCREST repo) locally somewhere outside the repo (like in your save_dir), then you can use the following code cell to import. This avoids needing to re-type the save_dir and db_path each time you "git pull" updates from the repo to this notebook.

In [38]:
path_to_settings_json = '/Users/kperks/Documents/ell-connectome/eCREST-local-files/settings_dict.json'
settings_dict = import_settings(path_to_settings_json)
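
Before launching a viewer, it can help to confirm that the imported settings contain every key shown in the example settings_dict above. The following is a minimal sketch, not part of eCREST: `missing_settings` and `EXPECTED_KEYS` are hypothetical names, and the paths in the example are placeholders.

```python
import json

# Keys that appear in the example settings_dict shown earlier in this notebook.
EXPECTED_KEYS = ("save_dir", "db_path", "max_num_base_added",
                 "cell_structures", "annotation_points")

def missing_settings(settings):
    """Return the expected keys that are absent from a settings dictionary."""
    return [key for key in EXPECTED_KEYS if key not in settings]

# Placeholder content standing in for a settings_dict.json file.
example_text = '{"save_dir": "/tmp/in-progress", "db_path": "/tmp/agglo.db"}'
example_settings = json.loads(example_text)
missing = missing_settings(example_settings)
print(missing)
```

An empty list means all the keys from the example are present; anything else names the entries to add to your local settings file.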

## Proofread using eCREST



### 1. Create a crest_json object that launches a proofreading instance of neuroglancer


Initialize with either:
- (segment_id, segment_list): the main_base_id from the neuroglancer file you are converting and a list of base_segments.
- (segment_id): a "main_base_id"
- (filepath): an existing CREST json file

#### NEW reconstruction from segment ID

If you wanted to start reconstructing a new cell from a main base segment, 
you would use the following code block to launch

In [None]:
segment_id = 558129604
crest = ecrest(settings_dict,segment_id = segment_id)
viewer_object = crest.neuroglancer_viewer()
crest.load_to_viewer()


#### EDIT reconstruction from file

If you wanted to edit a reconstruction from an existing file, 
you would use the following code block to launch

In [None]:
# json_path = Path('/Users/kperks/Documents/gdrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network')
# filename = 'cell_graph_213605530__2023-03-29 22.49.21.json'

json_path = Path(settings_dict['save_dir']) # / 'todo_post-synaptic'
filename = 'cell_graph_50844566__2023-04-21 14.47.44.json'

crest = ecrest(settings_dict,filepath= json_path / filename, launch_viewer=True)
# viewer_object = crest.neuroglancer_viewer()
# crest.load_to_viewer()

################
## You can also open using a full copy/pasted filepath 
## (instead of directory path and file name... which is better for looping over cells)

# filepath = r"C:\Users\PerksLab\Downloads\cell_graph_133378529__2023-03-27 14.03.02.json"
# crest = ecrest(settings_dict,filepath = filepath)
########################

#### If you want to change key bindings for functions..

In [None]:
with crest.viewer.config_state.txn() as s:
    s.input_event_bindings.data_view["alt+mousedown0"]="add-or-remove-seg"
    s.input_event_bindings.data_view["alt+mousedown2"]="mark-branch-in-colour"
    print(s.input_event_bindings.data_view)

#### Use the following to open a new cell in the same neuroglancer tab as is already opened

**DOES NOT WORK YET**

In [None]:
# json_path = Path(settings_dict['save_dir']) / 'todo_post-synaptic'
# filename = 'cell_graph_302637877__2023-04-09 19.21.28.json'

# crest = ecrest(settings_dict,filepath= json_path / filename)
# crest.neuroglancer_viewer(viewer_object)
# crest.load_to_viewer()


In [208]:
# Assign the cell type and which method you are using (manual or auto)
cell_type = 'uk'
method = 'manual'

## Do not edit
crest.define_ctype(cell_type,method)

### 2. SAVE YOUR WORK BEFORE CLOSING NEUROGLANCER! 

In [209]:
crest.save_cell_graph()

Saved cell 50844566 reconstruction locally at 2023-04-30 08.42.42


If you want to re-write the file you opened instead of saving with a new timestamp in the filename, run the following code cell instead of the previous one.

In [187]:
filepath = json_path / filename
crest.save_cell_graph(directory_path = filepath.parent, file_name=filepath.name, save_to_cloud=False); 

Saved cell 49376222 reconstruction locally at 2023-04-30 08.26.23


### 3. CELL TYPING

If part of your job as a reconstructor is to identify cell types, then you can use the following blocks of code.  
First, check if it is already defined (and what the cell type was defined as).  


After you are finished defining the cell type:  
**DON'T FORGET TO SAVE YOUR WORK!** (step 2)

In [16]:
# Assign which method you are using (manual or auto)
method = 'manual'

## Do not edit
crest.get_ctype(method)

'mg1'

1 other base segments in the agglo segment; max number can add is 1000
Added 1 base segments from agglomerated segment 391051932, linked base segments 392181168 and 391051932, 3753nm apart, 


If not defined (or defined incorrectly), then define it.
> OPTIONS: mg1, mg2, mgx, lg, lf, lx, mli, gc, gran, sg

In [210]:
# Assign the cell type and which method you are using (manual or auto)
cell_type = 'mgx'
method = 'manual'

## Do not edit
crest.define_ctype(cell_type,method)

## Check for DUPLICATES

Specify a folder of cells that you want to check for duplicates with the cell you are reconstructing.

The following code cell uses the function ```get_base_segments_dict``` in the crest instance to create a dictionary of all base segments for each cell within a specified directory. 

In [181]:
dirpath = Path(settings_dict['save_dir']) #/ 'todo_post-synaptic'

base_segments = crest.get_base_segments_dict(dirpath)


Load a cell that needs checking

In [205]:
json_path = Path(settings_dict['save_dir']) / 'todo_post-synaptic' / 'check-duplicates'
filename = 'cell_graph_50844566__2023-04-21 14.47.44.json'

crest = ecrest(settings_dict,filepath= json_path / filename, launch_viewer=False)

Then use the function ```check_duplicates``` in the crest instance to check whether it overlaps with any of the cells in that directory.

In [206]:
df = crest.check_duplicates(base_segments)

display(df)

Unnamed: 0,index,0
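
Conceptually, a duplicate check comes down to set intersections between the segments of one cell and the segments of every other cell. The sketch below illustrates that idea only. It assumes the dictionary maps each cell ID to a collection of base segment IDs, which may not match what `get_base_segments_dict` actually returns, and `find_overlaps` is a hypothetical helper.

```python
def find_overlaps(base_segments, query_segments):
    """Return {cell_id: shared segment IDs} for every cell that shares a segment."""
    query = set(query_segments)
    overlaps = {}
    for cell_id, segs in base_segments.items():
        shared = query & set(segs)
        if shared:  # keep only cells with at least one shared segment
            overlaps[cell_id] = shared
    return overlaps

# Toy data: cell "111" shares segment "c" with the cell being checked.
others = {"111": {"a", "b", "c"}, "222": {"d", "e"}}
hits = find_overlaps(others, {"c", "x"})
print(hits)
```

An empty result corresponds to the empty table displayed above: no other cell in the folder contains any of this cell's segments.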


## Add found missing segments to reconstructions

Manually go through each cell with missing segments, then search for and add them.

Keep a running "todo" list of any segments that should become a new reconstruction rather than being added to the current one.

In [None]:
missing_path = Path('/Users/kperks/Documents/gdrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network/todo_post-synaptic/reconstructed_missing_segs.json')
with open(missing_path,'r') as fp:
    reconstructed_segs=fp.read()
    reconstructed_segs = json.loads(reconstructed_segs)


In [None]:
keys = list(reconstructed_segs.keys())
# keys
len(keys)

For each key in the dict, open the crest file for that cell (from nodefiles) and visualize the missing segments.

In [None]:
path_to_settings_json = '/Users/kperks/Documents/ell-connectome/eCREST-local-files/settings_dict.json'
settings_dict = import_settings(path_to_settings_json)

dirpath = Path(settings_dict['save_dir'])
# dirpath = "/Users/kperks/Documents/gdrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network"

nodes = [child.name.split('_')[2] for child in sorted(dirpath.iterdir()) 
         if (child.name[0]!='.') and child.is_file()] # ignore hidden files

nodefiles = dict()
for child in sorted(dirpath.iterdir()):
    if (child.name[0]!='.') & (child.is_file()):
        nodefiles[child.name.split('_')[2]] = child
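
If a folder holds several saved versions of the same cell, the dictionary above keeps whichever file comes last in sorted order. A more explicit way to pick the most recent save is to parse the timestamp in the filename. This is a sketch under the assumption that every file follows the `cell_graph_<id>__<YYYY-MM-DD HH.MM.SS>.json` pattern seen in this notebook; `parse_crest_name` and `latest_per_cell` are hypothetical helpers.

```python
from datetime import datetime
from pathlib import Path

def parse_crest_name(name):
    """Split 'cell_graph_<id>__<timestamp>.json' into (cell id, datetime)."""
    stem = Path(name).stem              # drops only the final '.json'
    prefix, stamp = stem.split("__")    # 'cell_graph_<id>' and the timestamp
    cell_id = prefix.split("_")[2]
    return cell_id, datetime.strptime(stamp, "%Y-%m-%d %H.%M.%S")

def latest_per_cell(names):
    """Map each cell id to the filename with the most recent timestamp."""
    latest = {}
    for name in names:
        if name.startswith("."):        # ignore hidden files
            continue
        cell_id, when = parse_crest_name(name)
        if cell_id not in latest or when > latest[cell_id][0]:
            latest[cell_id] = (when, name)
    return {cell_id: name for cell_id, (when, name) in latest.items()}

example = [
    "cell_graph_50844566__2023-04-21 14.47.44.json",
    "cell_graph_50844566__2023-04-30 08.42.42.json",
    "cell_graph_42769344__2023-04-16 11.29.29.json",
    ".DS_Store",
]
newest = latest_per_cell(example)
print(newest)
```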
                    

In [None]:
reconstructed_segs

In [None]:
k = 15

crest = ecrest(settings_dict,filepath= nodefiles[keys[k]], launch_viewer=True)

print(reconstructed_segs[keys[k]])

In [None]:
with crest.viewer.config_state.txn() as s:
    s.input_event_bindings.data_view["alt+mousedown0"]="add-or-remove-seg"
    s.input_event_bindings.data_view["alt+mousedown2"]="mark-branch-in-colour"
    print(s.input_event_bindings.data_view)

In [None]:
crest.save_cell_graph()

In [None]:
crest.get_ctype('manual')

In [None]:
todo = [302453434,387230926,558315206,43622129,130671514,132032043,561440096,302453434]

## Convert From Neuroglancer to eCREST

Run the following code cell to convert neuroglancer json files to eCREST json files. 

Uses "conversion_specs.json" to batch process conversion.

Conversion using "conversion_specs.json" expects:
- a folder of neuroglancer json files (with filenames standardized like in Google Drive)
- a "dirname" entry giving the folder that contains the neuroglancer json files to be converted
- "conversion_specs.json" itself to be saved in the directory given by ```settings_dict['save_dir']```
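
For orientation, here is a sketch of what a "conversion_specs.json" could contain, inferred from the keys that the batch loop below reads (`dirname`, `cell_info`, `filename`, `neuroglancer_layer_name`, `crest_layer_name`, `errors`). The folder path and the filename are placeholders, and `mismatched_ids` is a hypothetical helper that mirrors the loop's check that the cell ID matches the second underscore-separated field of the filename.

```python
import json

conversion_specs = {
    "dirname": "/path/to/neuroglancer_jsons",          # placeholder folder
    "cell_info": {
        "304356725": {
            "filename": "ng_304356725_example.json",   # hypothetical filename
            "neuroglancer_layer_name": ["post-synaptic"],
            "crest_layer_name": ["post-synaptic"],
            "errors": [],                              # filled in by the batch loop
        }
    },
}

def mismatched_ids(specs):
    """Cell IDs whose key differs from the ID embedded in their filename."""
    return [cell_id for cell_id, info in specs["cell_info"].items()
            if info["filename"].split("_")[1] != cell_id]

bad = mismatched_ids(conversion_specs)
specs_text = json.dumps(conversion_specs, indent=4)  # what would be written to disk
print(bad)
```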

### Batch

In [None]:
conversion_specs_filename = "conversion_specs.json"

with open(Path(settings_dict['save_dir']) / conversion_specs_filename) as f:
    conversion_specs = json.load(f)

p = Path(conversion_specs['dirname'])

for cell_id, info in conversion_specs['cell_info'].items():
   
    f = info['filename']
    neuroglancer_layer_name = info['neuroglancer_layer_name']
    crest_layer_name = info['crest_layer_name']
  
    ## Get main_base_seg_ID from filename or from list of segment IDs
    main_base_id = f.split('_')[1] # gets the base segment ID from the name
    
    try:
        assert cell_id == main_base_id, f'cell id and filename do not match in conversion json; moving on to next cell without completing this one'
    except AssertionError as msg:
        print(msg)
        #add error message to json
        with open(Path(settings_dict['save_dir']) / conversion_specs_filename, "r") as f:
            loaded = json.load(f)
        loaded['cell_info'][cell_id]['errors'].append(str(msg))
        with open(Path(settings_dict['save_dir']) / conversion_specs_filename, "w") as f:
            json.dump(loaded, f, indent=4)
        continue
    
    ## Load the neuroglancer json
    print(f'you have selected cell {cell_id} to convert')
    
    with open(p / f, 'r') as myfile: # 'p' is the dirpath and 'f' is the filename, both taken from conversion_specs
        neuroglancer_data = json.load(myfile)

    print(f'Obtaining base_seg IDs from segmentation layer of neuroglancer json.')

    ## Obtain the list of base_segments from the neuroglancer json
    segmentation_layer = next((item for item in neuroglancer_data['layers'] if item["source"] == 'brainmaps://10393113184:ell:roi450um_seg32fb16fb_220930'), None)
    try:
        # add annotation layer
        
        base_segment_list_ng = segmentation_layer['segments']
    except TypeError as msg:
        print(msg, f': segmentation layer source is different; moving on to next cell without completing this one')
        #add error message to json
        with open(Path(settings_dict['save_dir']) / conversion_specs_filename, "r") as f:
            loaded = json.load(f)
        loaded['cell_info'][cell_id]['errors'].append(str(msg) + f': segmentation layer source is different; moving on to next cell without completing this one')
        with open(Path(settings_dict['save_dir']) / conversion_specs_filename, "w") as f:
            json.dump(loaded, f, indent=4)
        continue

    
    

    print(f'creating a crest_json object with no viewer for this cell')
    ## Create CREST instance with no viewer, segment_list, and segment_id
    crest = ecrest(settings_dict, segment_id = main_base_id, segment_list = base_segment_list_ng, launch_viewer=False)

    print(f'importing annotation layers from neuroglancer')
    ## Get annotations from neuroglancer -- iterate through one layer at a time to check for errors in layer names
    for nl_, cl_ in zip(neuroglancer_layer_name, crest_layer_name):

        # get the 'layers' dictionary that has that name
        neuroglancer_layer = next((item for item in neuroglancer_data['layers'] if item["name"] == nl_), None)

        if neuroglancer_layer != None:
            if cl_ in crest.point_types:
                # add annotation layer
                crest.import_annotations(neuroglancer_data, [nl_], [cl_])
                print(f"Imported - {nl_} - layer from neuroglancer annotations tabs for cell {crest.cell_data['metadata']['main_seg']['base']} as - {cl_} -.")
            else: 
                msg = f"CREST layer name - {cl_} - incorrect for cell {crest.cell_data['metadata']['main_seg']['base']} in conversion_json"
                print(msg)
                #add error message to json
                with open(crest.save_dir / conversion_specs_filename, "r") as f:
                    loaded = json.load(f)
                loaded['cell_info'][cell_id]['errors'].append(str(msg))
                with open(crest.save_dir / conversion_specs_filename, "w") as f:
                    json.dump(loaded, f, indent=4)
        else:
            msg = f"no layer by the name - {nl_} - in neuroglancer json for cell {crest.cell_data['metadata']['main_seg']['base']}"
            print(msg)
            #add error message to json
            with open(crest.save_dir / conversion_specs_filename, "r") as f:
                loaded = json.load(f)
            loaded['cell_info'][cell_id]['errors'].append(str(msg))
            with open(crest.save_dir / conversion_specs_filename, "w") as f:
                json.dump(loaded, f, indent=4)


    ## Save the cell_data as json
    print(f'saving cell {cell_id} with completed graph and annotations layers imported')
    crest.save_cell_graph() # If do not give file_path, then it will auto-generate one like CREST produces
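
The batch loop repeats the same read, append, and write sequence each time it records an error. One possible way to factor that out is a small helper like the one sketched here. `log_conversion_error` is a hypothetical name, not an eCREST function, and the demo writes to a temporary folder rather than to your save_dir.

```python
import json
import tempfile
from pathlib import Path

def log_conversion_error(specs_path, cell_id, msg):
    """Append an error message to one cell's 'errors' list in the specs file."""
    specs_path = Path(specs_path)
    with open(specs_path, "r") as fp:
        loaded = json.load(fp)
    # setdefault creates the list if this cell has no 'errors' entry yet
    loaded["cell_info"][cell_id].setdefault("errors", []).append(str(msg))
    with open(specs_path, "w") as fp:
        json.dump(loaded, fp, indent=4)

# Demo on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "conversion_specs.json"
    demo_path.write_text(json.dumps({"cell_info": {"123": {}}}))
    log_conversion_error(demo_path, "123", "layer missing")
    demo_result = json.loads(demo_path.read_text())
print(demo_result)
```

Wrapping the path in `Path(...)` inside the helper means it accepts either a string or a Path for the specs location.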

### Single file

Just make the "conversion_specs" file have one cell in it. The "batch" loop will still run on one cell. 

In [213]:
json_path = Path(settings_dict['save_dir']) #/ 'todo_post-synaptic'
filename = 'cell_graph_304356725__2023-04-14 20.58.51.json'

crest = ecrest(settings_dict,filepath= json_path / filename, launch_viewer=True)

updating viewer status message: Current Base Segment Counts: unknown: 1926, axon: 105, basal dendrite: 151, apical dendrite: 909, dendrite: 0, multiple: 0


### Just annotations layer

Use this when starting a reconstruction from scratch is faster than converting the base_segs into a graph, but you still want the annotations preserved.

In [214]:
neuroglancer_layer_name = ['post-synaptic']
crest_layer_name = ['post-synaptic']
neuroglancer_path = '/Users/kperks/Documents/gdrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network/Nate_neuroglancer_synapses/finished'
neuroglancer_path = Path(neuroglancer_path) / '304356725_nbs.json'

with open(neuroglancer_path, 'r') as myfile: # neuroglancer_path already points at the json file
    neuroglancer_data = json.load(myfile)
# for nl_, cl_ in zip(neuroglancer_layer_name, crest_layer_name):

for nl_, cl_ in zip(neuroglancer_layer_name, crest_layer_name):
    # get the 'layers' dictionary that has that name
    neuroglancer_layer = next((item for item in neuroglancer_data['layers'] if item["name"] == nl_), None)

    if neuroglancer_layer != None:
        if cl_ in crest.point_types:
            # add annotation layer
            crest.import_annotations(neuroglancer_data, [nl_], [cl_])
            print(f"Imported - {nl_} - layer from neuroglancer annotations tabs for cell {crest.cell_data['metadata']['main_seg']['base']} as - {cl_} -.")
        else: 
            msg = f"CREST layer name - {cl_} - incorrect for cell {crest.cell_data['metadata']['main_seg']['base']} in conversion_json"
            print(msg)

    else:
        msg = f"no layer by the name - {nl_} - in neuroglancer json for cell {crest.cell_data['metadata']['main_seg']['base']}"
        print(msg)

crest.load_annotation_layer_points()

Imported - post-synaptic - layer from neuroglancer annotations tabs for cell 304356725 as - post-synaptic -.


In [215]:
## Save the cell_data as json
print(f'saving cell {neuroglancer_path} with annotations layers imported')
crest.save_cell_graph() # If do not give file_path, then it will auto-generate one like CREST produces

saving cell /Users/kperks/Documents/gdrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network/Nate_neuroglancer_synapses/finished/304356725_nbs.json with annotations layers imported
Saved cell 304356725 reconstruction locally at 2023-04-30 19.25.05


Another version, for a json containing just one annotation layer (the file would be a list of dicts, where each dict is an annotation point).

In [None]:
annotations_path = json_path / 'tmp' /'annotations.json'

with open(annotations_path, 'r') as myfile: # annotations_path points at the single-layer json
    annotate_data = json.load(myfile)
# for nl_, cl_ in zip(neuroglancer_layer_name, crest_layer_name):

# annotate_data
annotation_list = []
for v in annotate_data:


    # for v in neuroglancer_layer['annotations']:
    corrected_location = crest.get_corrected_xyz(v['point'], 'seg')

    if 'segments' not in v.keys():
        annotation_list.extend([corrected_location])
    if 'segments' in v.keys():
        annotation_list.extend([corrected_location + v['segments'][0]])

# self.cell_data['end_points'][c].extend(annotation_list)

crest.cell_data['end_points']['post-synaptic'] = annotation_list

crest.load_annotation_layer_points()

### segments from an NG json into an existing CREST

In [216]:
neuroglancer_path = '/Users/kperks/Documents/gdrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network/Nate_neuroglancer_synapses/finished'
neuroglancer_path = Path(neuroglancer_path) / '304356725_nbs.json'

with open(neuroglancer_path, 'r') as myfile: # neuroglancer_path already points at the json file
    neuroglancer_data = json.load(myfile)

In [218]:
segmentation_layer = next((item for item in neuroglancer_data['layers'] if item["source"] == 'brainmaps://10393113184:ell:roi450um_seg32fb16fb_220930'), None)
base_segment_list_ng = segmentation_layer['segments']
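
The lookup above indexes `item["source"]` directly, so it would raise a KeyError if any layer in the file has no "source" entry, and the next line would raise a TypeError if no layer matches. A more defensive variant is sketched below on toy data. `find_layer` is a hypothetical helper and the `example://` sources are placeholders, not real data sources.

```python
def find_layer(state, key, value):
    """Return the first layer dict whose `key` equals `value`, or None."""
    return next((layer for layer in state.get("layers", [])
                 if layer.get(key) == value), None)

# Toy neuroglancer-style state: a list of layer dictionaries.
state = {"layers": [
    {"name": "em", "type": "image", "source": "example://em"},
    {"name": "seg", "type": "segmentation", "source": "example://seg",
     "segments": ["1", "2"]},
    {"name": "post-synaptic", "type": "annotation"},   # no 'source' key
]}

seg_layer = find_layer(state, "source", "example://seg")
segments = seg_layer["segments"] if seg_layer is not None else []
not_found = find_layer(state, "source", "example://other")
print(segments, not_found)
```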

In [220]:
base_ids_added = set()
anchor_cell = crest

for base_seg in base_segment_list_ng:
    
    if base_seg not in base_ids_added: # if this segment has not already been added

        # print(i,base_seg)
        agglo_seg = anchor_cell.get_agglo_seg_of_base_seg(base_seg)

        constituent_base_ids = anchor_cell.get_base_segs_of_agglo_seg(agglo_seg)
        # print(f'{len(constituent_base_ids)} other base segments in the agglo segment; max number can add is {crest.max_num_base_added}')


        if len(constituent_base_ids) > anchor_cell.max_num_base_added:
            base_ids = [base_seg]
            # anchor_cell.large_agglo_segs.add(agglo_seg)
            # print(f'{base_seg} part of an agglo seg {agglo_seg} that is too large to add, so just adding the one segment')
        else:
            base_ids = constituent_base_ids
        
        current_segs = anchor_cell.assert_segs_in_sync(return_segs=True)

        num_base_segs_this_agglo_seg = len(base_ids)
        base_ids = [x for x in base_ids if x not in current_segs]
        num_base_segs_not_already_included = len(base_ids)

        if num_base_segs_this_agglo_seg > num_base_segs_not_already_included:

            base_ids = [x for x in base_ids if x not in anchor_cell.cell_data['removed_base_segs']]

            if not base_seg in base_ids:
                base_ids.append(base_seg)
        
        anchor_cell.update_base_locations(base_ids)
        anchor_cell.pr_graph.add_vertices(base_ids)

        if len(base_ids) > 1:
            edges = anchor_cell.get_edges_from_agglo_seg(agglo_seg)
            edges = [x for x in edges if (x[0] in base_ids and x[1] in base_ids)]
            anchor_cell.pr_graph.add_edges(edges)

        join_msg = anchor_cell.add_closest_edge_to_graph(base_ids, base_seg) 
        

        # Update lists of base segments and displayed segs:
        anchor_cell.cell_data['base_segments']['unknown'].update(set(base_ids))

        with anchor_cell.viewer.txn(overwrite=True) as s:

            for bs in base_ids:
                s.layers['base_segs'].segment_colors[int(bs)] = '#ff0000' #'#d2b48c'
                s.layers['base_segs'].segments.add(int(bs))
                
        base_ids_added.update(base_ids)


        anchor_cell.update_displayed_segs() 
        anchor_cell.assert_segs_in_sync()


AssertionError: 

In [227]:
new_segs = base_ids

In [233]:
base_ids

['127776824']

In [224]:
seg_to_link = base_seg

In [225]:
assert len(crest.pr_graph.clusters(mode='weak')) == 2

In [228]:
# Some segments do not have locations recorded:
current_cell_node_list = [x['name'] for x in crest.pr_graph.vs if x['name'] not in new_segs]
current_cell_node_list = [x for x in current_cell_node_list if x in crest.cell_data['base_locations']]

In [230]:
# Then determine new segments that are acceptable as partners
if seg_to_link in crest.cell_data['base_locations'].keys():
    new_segs = [seg_to_link]
else:
    new_segs = [x for x in new_segs if x in crest.cell_data['base_locations']]

In [232]:
new_segs

['127776824']

In [235]:
sel_curr, sel_new, dist = crest.get_closest_dist_between_ccs(current_cell_node_list, new_segs)

In [238]:
dist

1147
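
The value returned here is the distance between the closest pair of segments, one from the existing cell and one from the new segments. As an illustration of that idea only, the sketch below finds the nearest pair across two groups of 3D points with the standard library. It is not the eCREST implementation of `get_closest_dist_between_ccs`; `closest_pair` and the coordinates are made up for the example.

```python
from itertools import product
from math import dist

def closest_pair(locations, group_a, group_b):
    """Return (id_a, id_b, distance) for the nearest pair across two groups."""
    id_a, id_b = min(
        product(group_a, group_b),
        key=lambda pair: dist(locations[pair[0]], locations[pair[1]]),
    )
    return id_a, id_b, dist(locations[id_a], locations[id_b])

# Toy coordinates: "b" and "c" are 5 units apart, all other pairs are farther.
locations = {"a": (0, 0, 0), "b": (10, 0, 0), "c": (13, 4, 0), "d": (50, 50, 50)}
sel_a, sel_b, d = closest_pair(locations, ["a", "b"], ["c", "d"])
print(sel_a, sel_b, d)
```

This brute-force version compares every pair, which is fine for small groups; a vectorized distance matrix would scale better for thousands of segments.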

In [240]:
crest.pr_graph.add_edges([(sel_curr, sel_new)])

In [241]:
crest.cell_data['added_graph_edges'].append([sel_curr, sel_new, dist])

In [244]:
crest.add_cc_bridging_edges_pairwise()

2 clusters of connected components. Connecting these clusters with nearest base segments.


KeyboardInterrupt: 

In [243]:
len(crest.pr_graph.clusters(mode='weak')) 

2

In [None]:
assert len(crest.pr_graph.clusters(mode='weak')) == 1     

print(f'linked base segments {sel_curr} and {sel_new}, {round(dist)}nm apart')

## Combine annotations and/or base segments across different CREST files

In [191]:
json_path = Path(settings_dict['save_dir']) #/ 'todo_post-synaptic'

Cell 1

In [192]:
filename = 'cell_graph_42769344__2023-04-16 11.29.29.json'

crest_1 = ecrest(settings_dict,filepath= json_path / filename, launch_viewer=True)

updating viewer status message: Current Base Segment Counts: unknown: 1692, axon: 3, basal dendrite: 1688, apical dendrite: 0, dendrite: 0, multiple: 0


In [198]:
crest_1.save_cell_graph()

Saved cell 42769344 reconstruction locally at 2023-04-30 08.31.57


Cell 2

In [193]:
filename = 'cell_graph_49686361__2023-04-21 15.00.25.json'
# Path("/Users/kperks/Documents/gdrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network/cell_graph_214581797__2023-04-09 10.58.57.json")

crest_2 = ecrest(settings_dict,filepath= json_path / filename, launch_viewer=True)

# with open(Path(ann_cell_path), 'r') as myfile: # 'p' is the dirpath and 'f' is the filename from the created 'd' dictionary
#     cell_data = json.load(myfile)

updating viewer status message: Current Base Segment Counts: unknown: 71, axon: 0, basal dendrite: 0, apical dendrite: 0, dendrite: 0, multiple: 0


In [178]:
crest_2.save_cell_graph()

Saved cell 474357461 reconstruction locally at 2023-04-30 07.52.38


### Get missing segments from one into the other...
and adjust graph too (find missing edges and vertices... instead of making new graph, use old?)

In [194]:
segs_1 = set([a for b in crest_1.cell_data['base_segments'].values() for a in b])
segs_2 = set([a for b in crest_2.cell_data['base_segments'].values() for a in b])

print(f'{len(segs_1.difference(segs_2))} segments in cell 1 that are not in cell 2')
print(f'{len(segs_2.difference(segs_1))} segments in cell 2 that are not in cell 1')

3313 segments in cell 1 that are not in cell 2
1 segments in cell 2 that are not in cell 1


### add segments missing from one reconstruction to another

Run this as a loop that keeps track of all added segments (and excludes them from later iterations), because several of them can belong to the same agglo segment. 

In [195]:
# assign which cell you want to add to (and then keep)
anchor_cell = crest_1

# assign which segments need to be added
base_ids_all = sorted(list(segs_2.difference(segs_1)))

In [196]:
base_ids_added = set()

for base_seg in base_ids_all:
    
    if base_seg not in base_ids_added: # if this segment has not already been added

        # print(i,base_seg)
        agglo_seg = anchor_cell.get_agglo_seg_of_base_seg(base_seg)

        constituent_base_ids = anchor_cell.get_base_segs_of_agglo_seg(agglo_seg)
        # print(f'{len(constituent_base_ids)} other base segments in the agglo segment; max number can add is {crest.max_num_base_added}')


        if len(constituent_base_ids) > anchor_cell.max_num_base_added:
            base_ids = [base_seg]
            # anchor_cell.large_agglo_segs.add(agglo_seg)
            # print(f'{base_seg} part of an agglo seg {agglo_seg} that is too large to add, so just adding the one segment')
        else:
            base_ids = constituent_base_ids
        
        current_segs = anchor_cell.assert_segs_in_sync(return_segs=True)

        num_base_segs_this_agglo_seg = len(base_ids)
        base_ids = [x for x in base_ids if x not in current_segs]
        num_base_segs_not_already_included = len(base_ids)

        if num_base_segs_this_agglo_seg > num_base_segs_not_already_included:

            base_ids = [x for x in base_ids if x not in anchor_cell.cell_data['removed_base_segs']]

            if not base_seg in base_ids:
                base_ids.append(base_seg)
        
        anchor_cell.update_base_locations(base_ids)
        anchor_cell.pr_graph.add_vertices(base_ids)

        if len(base_ids) > 1:
            edges = anchor_cell.get_edges_from_agglo_seg(agglo_seg)
            edges = [x for x in edges if (x[0] in base_ids and x[1] in base_ids)]
            anchor_cell.pr_graph.add_edges(edges)

        join_msg = anchor_cell.add_closest_edge_to_graph(base_ids, base_seg) 
        

        # Update lists of base segments and displayed segs:
        anchor_cell.cell_data['base_segments']['unknown'].update(set(base_ids))

        with anchor_cell.viewer.txn(overwrite=True) as s:

            for bs in base_ids:
                s.layers['base_segs'].segment_colors[int(bs)] = '#ff0000' #'#d2b48c'
                s.layers['base_segs'].segments.add(int(bs))
                
        base_ids_added.update(base_ids)


        anchor_cell.update_displayed_segs() 
        anchor_cell.assert_segs_in_sync()
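
The bookkeeping in this loop can be tried out without a database or viewer. The sketch below replaces the agglomeration lookups with two toy dictionaries (`agglo_of` and `members_of`, both hypothetical) and only plans which segments each iteration would add. It uses a plain membership test on the set of added IDs, since calling `set()` on an ID string such as `'123'` would split it into single characters.

```python
def plan_additions(base_ids_all, agglo_of, members_of, max_added):
    """Return [(clicked segment, segments added with it)], skipping repeats."""
    added = set()
    batches = []
    for base_seg in base_ids_all:
        if base_seg in added:           # already brought in with an earlier agglo
            continue
        members = members_of[agglo_of[base_seg]]
        # agglo segments that are too large contribute only the clicked segment
        batch = [base_seg] if len(members) > max_added else list(members)
        batch = [seg for seg in batch if seg not in added]
        batches.append((base_seg, batch))
        added.update(batch)
    return batches

# Toy agglomeration: segments "1" and "2" share agglo "A"; "3" is alone in "B".
agglo_of = {"1": "A", "2": "A", "3": "B"}
members_of = {"A": ["1", "2"], "B": ["3"]}
batches = plan_additions(["1", "2", "3"], agglo_of, members_of, max_added=1000)
print(batches)
```

Segment "2" never triggers its own iteration because it was already added together with "1".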


### Create new crest file from the union segment list...

In [62]:
new_seg_list = segs_1.union(segs_2)
segment_id = crest_1.cell_data['metadata']['main_seg']['base']

In [None]:
combo_crest = ecrest(settings_dict, segment_id = segment_id, segment_list = new_seg_list, launch_viewer=True)

Add annotations from one of the cells...

In [67]:
combo_crest.cell_data['end_points'] = crest_1.cell_data['end_points']

combo_crest.load_annotation_layer_points()

In [72]:
combo_crest.define_ctype('mg1','manual')

In [73]:
combo_crest.save_cell_graph()

Saved cell 387230926 reconstruction locally at 2023-04-29 07.45.26


#### DON'T FORGET TO SAVE YOUR WORK! 



## Other...

### Add vertex if missing (if can't remove a segment, sometimes this is the reason)

In [None]:
# ('479295220')
crest.cell_data['base_segments']['unknown'].add('565168297')

In [None]:
crest.pr_graph.vs.find("459940426")

In [None]:
crest.pr_graph.add_vertex(name='459940426')

In [None]:
crest.pr_graph.add_edges([(4966,323)])

### Try to add more than 1000 segments...


In [None]:
base_seg = 215728691
agglo_seg = crest.get_agglo_seg_of_base_seg(base_seg)

In [None]:
# First, load the original cell that is missing segments

segs_to_add_from_manual = set([a for b in crest.cell_data['base_segments'].values() for a in b])

In [None]:
len(segs_to_add_from_manual)

In [None]:
# then, without re-starting kernel, load the agglo cell from the missing segment

all_segs_current = set([a for b in crest.cell_data['base_segments'].values() for a in b])

In [None]:
len(all_segs_current)

In [None]:
# make sure you have fixed any merge errors in the agglo cell...
# then find which segments are missing from the agglo cell (because the agglo generally has more?... could do opposite)

base_ids_all = segs_to_add_from_manual - all_segs_current

In [None]:
len(base_ids_all)

all of these segments to add are not in the agglo... so loop through as if double clicked on each independently

> FORCE IT NOT TO ADD EXTRA (because the extra could be in list and will cause graph clustering errors when tried to be added... )

In [None]:
base_ids_all = sorted(list(base_ids_all))

In [None]:
len(base_ids_all)

In [None]:


for i,base_seg in enumerate(base_ids_all):

    print(i,base_seg)
    agglo_seg = crest.get_agglo_seg_of_base_seg(base_seg)

    constituent_base_ids = crest.get_base_segs_of_agglo_seg(agglo_seg)
    # print(f'{len(constituent_base_ids)} other base segments in the agglo segment; max number can add is {crest.max_num_base_added}')
    
    
    if len(constituent_base_ids) > crest.max_num_base_added:
        base_ids = [base_seg]
        crest.large_agglo_segs.add(agglo_seg)
    else:
        base_ids = constituent_base_ids

    current_segs = crest.assert_segs_in_sync(return_segs=True)

    num_base_segs_this_agglo_seg = len(base_ids)
    base_ids = [x for x in base_ids if x not in current_segs]
    num_base_segs_not_already_included = len(base_ids)

    if num_base_segs_this_agglo_seg > num_base_segs_not_already_included:

        base_ids = [x for x in base_ids if x not in crest.cell_data['removed_base_segs']]

        if not base_seg in base_ids:
            base_ids.append(base_seg)

    crest.update_base_locations(base_ids)
    crest.pr_graph.add_vertices(base_ids)

    if len(base_ids) > 1:
        edges = crest.get_edges_from_agglo_seg(agglo_seg)
        edges = [x for x in edges if (x[0] in base_ids and x[1] in base_ids)]
        crest.pr_graph.add_edges(edges)

    join_msg = crest.add_closest_edge_to_graph(base_ids, base_seg) 

    # Update lists of base segments and displayed segs:
    crest.cell_data['base_segments']['unknown'].update(set(base_ids))


    with crest.viewer.txn(overwrite=True) as s:

        for bs in base_ids:
            s.layers['base_segs'].segment_colors[int(bs)] = '#d2b48c'
            s.layers['base_segs'].segments.add(int(bs))


    crest.update_displayed_segs() 
    crest.assert_segs_in_sync()

### define cell type for a crest file

resaves as original file name (not with an updated timestamp)

In [None]:
dirpath = Path('/Users/kperks/Documents/gdrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network')
filepath = dirpath / 'cell_graph_307591597__2023-04-07 12.54.44.json'
cell_type = 'lf'

### 
crest = ecrest(settings_dict, filepath = filepath, launch_viewer=False);
crest.define_ctype(cell_type,'manual')
assert crest.get_ctype('manual') == cell_type # confirm the cell type was stored before saving
crest.save_cell_graph(directory_path = filepath.parent, file_name=filepath.name, save_to_cloud=False); #rewrites the original, not with a new time stamp

check cell type in neuroglancer

In [None]:
dirpath = Path('/Users/kperks/Documents/gdrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network')
filepath = dirpath / 'cell_graph_213605530__2023-03-29 22.49.21.json'

crest = ecrest(settings_dict, filepath = filepath, launch_viewer=True)

In [None]:
crest.save_cell_graph(directory_path = filepath.parent, file_name=filepath.name, save_to_cloud=False); 

### get cell types of neuroglancer reconstructions into crest json files


In [None]:
path_to_settings_json = '/Users/kperks/Documents/ell-connectome/eCREST-local-files/settings_dict.json'
settings_dict = import_settings(path_to_settings_json)

In [None]:
crestpath = "/Volumes/GoogleDrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network"
ngpath = "/Volumes/GoogleDrive/.shortcut-targets-by-id/16q1BuOMfD2ta0Cwq8CjMlRe4rDvbuWC5/ELL_connectome/CREST_reconstructions/mg-network/files_for_names"
ngfiles = [x.name for x in Path(ngpath).iterdir()]

In [None]:
ctype_list = []
has_ctype = set()
all_cells = set()

for fname in sorted(list(Path(crestpath).iterdir())):
    if (fname.name[0]!='.') & (fname.is_file()):
        # display(fname.name)
        crest = ecrest(settings_dict, filepath = fname, launch_viewer=False);
        main_base = crest.cell_data['metadata']['main_seg']['base']
        ngfile = list(filter(lambda x: main_base in x, ngfiles))
        
        all_cells = all_cells | {main_base}
        
        # only define a cell type when exactly one neuroglancer file matches this cell
        if len(ngfile)==1:
            ctype = ngfile[0].split('_')[3].lower()
            has_ctype = has_ctype | {main_base}
            ctype_list.append(ctype)
            crest.define_ctype(ctype,'manual');
            crest.save_cell_graph(directory_path = fname.parent, file_name=fname.name, save_to_cloud=False);

In [None]:
# make sure all crest cells have cell type definition from neuroglancer file name
all_cells-has_ctype

In [None]:
# check cell type labels
list(unique(ctype_list))

### resave a json file with formatting for readability

In [None]:
filepath = Path(r"D:\electric-fish\eCREST\CREST_settings.json") # raw string so backslashes are not treated as escapes
with open(filepath, "r") as f:
    loaded = json.load(f)

with open(filepath, "w") as f:
    json.dump(loaded, f, indent=4)