# prune_raw_precomputed_table
This notebook goes through the production spock table of precomputed raw imaging pipeline job ids and removes the entries that do not correspond to any entry in the ImagingChannel() table in the light sheet schema. These orphaned entries arose because I re-submitted the imaging entry form after an error. Somehow one of the raw data folders got overwritten, so the precomputed raw pipeline failed when it was later run on them. I have since moved the launch of the celery job that starts the raw precomputed pipeline to the point where an individual sample form is submitted, so this should no longer be a problem.
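
The pruning rule described above can be tried out without a database connection. The following is a minimal sketch on made-up data: the job ids, the `spock_rows` list and the `referenced` dict are all hypothetical stand-ins for what the notebook later fetches from `RawPrecomputedSpockJob()` and `ImagingChannel()`.

```python
# Hypothetical (lightsheet, jobid_step2) pairs standing in for the spock job table
spock_rows = [("left", "1001"), ("right", "1002"), ("left", "1003")]

# Hypothetical job ids still referenced by imaging channels, keyed by light sheet
referenced = {"left": {"1001"}, "right": {"1002"}}

# An entry is orphaned when its job id is not referenced for its light sheet
orphaned = [jobid for lightsheet, jobid in spock_rows
            if jobid not in referenced[lightsheet]]
print(orphaned)  # ['1003']
```

Only the orphaned ids are candidates for deletion; entries that are still referenced are left alone.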

In [1]:
import pandas as pd
import numpy as np
import datajoint as dj
import matplotlib
import matplotlib.pyplot as plt
from datetime import datetime
matplotlib.style.use('ggplot')
%matplotlib inline

<a id='connect'></a>
# Connect to the datajoint00 database
Your username and password are the same as your Princeton login.

**Note**: You need to be on campus or using a VPN to connect to the database.

In [3]:
netid = 'ahoag' # change this to your netid 
dj.config['database.host'] = 'datajoint00.pni.princeton.edu'
dj.config['database.user'] = netid
dj.conn()

Please enter DataJoint password: ········
Connecting ahoag@datajoint00.pni.princeton.edu:3306


DataJoint connection (connected) ahoag@datajoint00.pni.princeton.edu:3306

In [4]:
# Demo schemas (uncomment to test against these instead of production)
# db_lightsheet = dj.create_virtual_module('ahoag_lightsheet_demo','ahoag_lightsheet_demo')
# db_spockadmin = dj.create_virtual_module('ahoag_spockadmin_demo','ahoag_spockadmin_demo')

# Production schemas
db_lightsheet = dj.create_virtual_module('db_lightsheet','u19lightserv_lightsheet')
db_spockadmin = dj.create_virtual_module('db_spockadmin','u19lightserv_appcore')

In [5]:
# Fetch the light sheet and step 2 job id of every entry in the raw precomputed table
lightsheets,spock_step2_jobids = db_spockadmin.RawPrecomputedSpockJob().fetch('lightsheet','jobid_step2')

In [9]:
raw_precomputed_contents = db_spockadmin.RawPrecomputedSpockJob()
raw_precomputed_contents

Column notes: jobid_step2 is the jobid on spock of step2 (downsampling) in the precomputed pipeline, used as primary key so that the progress of the precomputed pipeline can be probed; lightsheet is left or right.
jobid_step2,timestamp,lightsheet,jobid_step0,jobid_step1,username,status_step0,status_step1,status_step2
20002337,2020-08-24 13:51:00,left,20002335,20002336,soline,SUBMITTED,SUBMITTED,SUBMITTED
20002337,2020-08-24 14:00:45,left,20002335,20002336,soline,COMPLETED,COMPLETED,COMPLETED
20002381,2020-08-24 14:01:45,right,20002379,20002380,soline,SUBMITTED,SUBMITTED,SUBMITTED
20002381,2020-08-24 14:06:34,right,20002379,20002380,soline,COMPLETED,COMPLETED,COMPLETED
20002385,2020-08-24 14:02:45,left,20002383,20002384,soline,SUBMITTED,SUBMITTED,SUBMITTED
20002385,2020-08-24 14:06:34,left,20002383,20002384,soline,COMPLETED,COMPLETED,COMPLETED
20002388,2020-08-24 14:03:45,right,20002386,20002387,soline,SUBMITTED,SUBMITTED,SUBMITTED
20002388,2020-08-24 14:06:34,right,20002386,20002387,soline,COMPLETED,COMPLETED,COMPLETED
20010217,2020-08-26 16:23:27,left,20010211,20010214,oostland,SUBMITTED,SUBMITTED,SUBMITTED
20010217,2020-08-27 21:20:54,left,20010211,20010214,oostland,COMPLETED,COMPLETED,COMPLETED


In [10]:
# Fetch the job ids referenced by ImagingChannel(); any spock entry whose job id is not referenced there will be removed
left_precomputed_jobids,right_precomputed_jobids = db_lightsheet.Request.ImagingChannel().fetch(
    'left_lightsheet_precomputed_spock_jobid','right_lightsheet_precomputed_spock_jobid')

In [11]:
# Job ids referenced by ImagingChannel(), keyed by light sheet
referenced_jobids = {'left': left_precomputed_jobids, 'right': right_precomputed_jobids}
bad_spock_step2_jobids = []
for lightsheet, jobid in zip(lightsheets, spock_step2_jobids):
    # Flag entries whose job id is not referenced for their light sheet
    if lightsheet in referenced_jobids and jobid not in referenced_jobids[lightsheet]:
        bad_spock_step2_jobids.append(jobid)
bad_spock_step2_jobids

['20383040',
 '20383047',
 '20383048',
 '20383049',
 '20383056',
 '20383057',
 '20383058',
 '20383064',
 '20383066',
 '20383067',
 '20383073',
 '20383074',
 '20383076',
 '20383083',
 '20383084',
 '20383085',
 '20383090',
 '20383093',
 '20383094',
 '20383101',
 '20383102',
 '20383103',
 '20383110',
 '20383111',
 '20383112',
 '20383119',
 '20383120',
 '20383121',
 '20383127',
 '20383129',
 '20383130',
 '20383137',
 '20383138',
 '20383139',
 '20383145',
 '20383147',
 '20383148',
 '20383155',
 '20383156',
 '20383157',
 '20383162',
 '20383165',
 '20383166',
 '20383171',
 '20383172',
 '20383175',
 '20383181',
 '20383183',
 '20383184',
 '20383191',
 '20383192',
 '20383193',
 '20383200',
 '20383201',
 '20383202',
 '20383391',
 '20383392',
 '20383395',
 '20383398',
 '20383404',
 '20383406',
 '20383407',
 '20383410',
 '20383413',
 '20383418',
 '20383419',
 '20383422',
 '20383425',
 '20383430',
 '20383431',
 '20383434',
 '20383437',
 '20383442',
 '20383443',
 '20383446',
 '20383450',
 '20383454',

In [12]:
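dj.config['safemode'] = False  # skip the interactive confirmation prompt on delete
for jobid in bad_spock_step2_jobids:
    print(jobid)
    # The job ids are strings, so quote the value in the restriction
    spock_table_contents = db_spockadmin.RawPrecomputedSpockJob() & f'jobid_step2="{jobid}"'
    if spock_table_contents:
        spock_table_contents.delete()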

20383040
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383047
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383048
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383049
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383056
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383057
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383058
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383064
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383066
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383067
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383073
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383074
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383076
`u19lightserv_appcore`.`raw_pre

`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383538
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383539
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383542
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383546
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383550
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383551
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383554
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383561
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383562
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383563
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383566
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20383571
`u19lightserv_appcore`.`raw_precomputed_

`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415913
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415916
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415917
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415918
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415927
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415928
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415929
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415930
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415939
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415940
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415941
`u19lightserv_appcore`.`raw_precomputed_spock_job`: 1 items
Committed.
20415942
`u19lightserv_appcore`.`raw_precomputed_

In [None]:
dj.config['safemode']=False
spock_table_contents.delete()

In [9]:
contents = db_spockadmin.RawPrecomputedSpockJob() & 'jobid_step2=18986528'
contents

Column notes: jobid_step2 is the jobid on spock of step2 (downsampling) in the precomputed pipeline, used as primary key so that the progress of the precomputed pipeline can be probed; lightsheet is left or right.
jobid_step2,timestamp,lightsheet,jobid_step0,jobid_step1,username,status_step0,status_step1,status_step2
18986528,2020-05-15 18:12:23,left,18986521,18986524,lightserv-test,SUBMITTED,SUBMITTED,SUBMITTED
18986528,2020-06-05 09:23:57,left,18986521,18986524,lightserv-test,COMPLETED,CANCELLED,CANCELLED


In [10]:

contents.delete()

`u19lightserv_appcore`.`raw_precomputed_spock_job`: 2 items
Committed.


In [None]:
spock_table_contents

In [None]:
spock_table_contents = db_spockadmin.RawPrecomputedSpockJob() & 'jobid_step2=18986528'
spock_table_contents

In [None]:
bool("False")  # True: any non-empty string is truthy, even the string "False"

In [None]:
spock_table_contents = db_spockadmin.RawPrecomputedSpockJob()
spock_table_contents

In [None]:
# Keep only the most recent entry (latest timestamp) for each job
job_contents = db_spockadmin.RawPrecomputedSpockJob()
unique_contents = dj.U('jobid_step2','username').aggr(
    job_contents,timestamp='max(timestamp)')*job_contents

In [None]:
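# Status codes that indicate a job is still in progress
ongoing_codes = ('SUBMITTED','RUNNING','PENDING','REQUEUED','RESIZING','SUSPENDED')
# The tuple's repr is a valid SQL list, so it can be interpolated directly
incomplete_contents = unique_contents & f'status_step2 in {ongoing_codes}'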

In [None]:
incomplete_contents
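
The two steps above (latest entry per job, then filter on ongoing status codes) can be checked offline with a rough pandas analogue. This is only a sketch: it uses pandas `groupby` rather than DataJoint's `dj.U(...).aggr(...)`, and the job ids and timestamps in `rows` are made up for illustration.

```python
import pandas as pd

# Hypothetical rows: job 1001 has a later COMPLETED entry, job 1002 is still SUBMITTED
rows = [
    ("1001", "2020-08-24 13:51:00", "SUBMITTED"),
    ("1001", "2020-08-24 14:00:45", "COMPLETED"),
    ("1002", "2020-08-24 14:01:45", "SUBMITTED"),
]
df = pd.DataFrame(rows, columns=["jobid_step2", "timestamp", "status_step2"])
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Keep the row with the latest timestamp for each job id
latest = df.loc[df.groupby("jobid_step2")["timestamp"].idxmax()]

# Keep only jobs whose latest status is still an ongoing code
ongoing_codes = ("SUBMITTED", "RUNNING", "PENDING", "REQUEUED", "RESIZING", "SUSPENDED")
incomplete = latest[latest["status_step2"].isin(ongoing_codes)]
incomplete_jobids = incomplete["jobid_step2"].tolist()
print(incomplete_jobids)  # ['1002']
```

Taking the latest entry first matters: filtering on status before deduplicating would wrongly report job 1001 as incomplete because of its older SUBMITTED row.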

In [None]:
jobids = list(incomplete_contents.fetch('jobid_step2'))

In [None]:
jobids

In [None]:
'20382672' in jobids

In [None]:
list(incomplete_contents.fetch('username'))