# Pipeline

The pipeline automatically runs the three basic processing steps:

- find objects on the first plate that have no corresponding object on the second plate;
- statistically compare the objects that have no match with the ones that do;
- display the non-matched objects that have a higher likelihood of being a "real event" (and 
  not a plate artifact).
  
The sequence is controlled by an entry in the *sequences* dictionary in file *settings.py*.
Each entry is a sequence of plate ID numbers; the pipeline processes each pair of consecutive 
plates in turn.

The sequences currently in the file were produced
by notebook *footoprints.ipynb*, and include all sequences for the **Grosser Schmidt-Spiegel**
telescope that contain plates that overlap on the sky by more than 50% and were taken on the 
same night, or one night before or after. Any other sequence one wants to study can be 
added there as well.
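
As a rough sketch of how such an entry might be consumed, assume each entry maps a sequence name to an ordered list of plate IDs. The name `seq 03` and the pair 9313, 9315 appear in this notebook; the third ID below is a made-up placeholder, and `consecutive_pairs` is a hypothetical helper, not part of *settings.py*.

```python
# Illustrative stand-in for the dictionary in settings.py.
# Only 9313 and 9315 are taken from this notebook; 9999 is a placeholder.
sequences = {
    "seq 03": [9313, 9315, 9999],
}


def consecutive_pairs(plate_ids):
    """Return 'first,second' keys for each pair of adjacent plates."""
    return [f"{a},{b}" for a, b in zip(plate_ids, plate_ids[1:])]


pair_keys = consecutive_pairs(sequences["seq 03"])
print(pair_keys)
```

A sequence of N plates therefore yields N-1 pairs, and a sequence with a single plate yields none.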

Each step of the pipeline is run by a separate notebook, with results written to subdirectory 
**./html/**. Results have the same format as the corresponding input notebook run for a 
particular pair of plates, but are read-only and formatted as an HTML web page. 
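
The sketch below shows one way the per-step conversion commands could be assembled for a pair of plates. The notebook names and output prefixes are copied from the loop in this notebook; `nbconvert_commands` and `STEPS` are hypothetical names introduced only for illustration, and the commands are built but not executed here.

```python
import os

# (notebook, output prefix) for each step, as used by the pipeline loop.
STEPS = [
    ("find_mismatches.ipynb", "find_mismatches_"),
    ("psf_analysis.ipynb", "psf_analysis_"),
    ("display_nonmatches.ipynb", "display_nomatches_"),
]


def nbconvert_commands(first_plate, second_plate, output_path):
    """Build one 'jupyter nbconvert' argument list per pipeline step."""
    suffix = f"{first_plate}_{second_plate}.html"
    commands = []
    for notebook, prefix in STEPS:
        target = os.path.join(output_path, prefix + suffix)
        commands.append(
            ["jupyter", "nbconvert", "--to", "html", "--execute",
             notebook, "--output", target]
        )
    return commands


commands = nbconvert_commands(9313, 9315, "html")
for command in commands:
    print(" ".join(command))
```

Outside of IPython, where the `!` shell escape is not available, each list could be passed to `subprocess.run(command, check=True)`.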

These notebooks can also be run separately in interactive form. The input for a notebook is controlled 
by the content of file *dataset.json*, which defines the pair of plates one wants to study. This file
is rewritten by the pipeline, so make sure it points to the correct pair of plates before running 
any notebook (pay attention to the syntax: JSON has strict formatting requirements).
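
Before running a notebook interactively, the file can be checked with a few lines of Python. This is a minimal sketch, assuming the file holds a single `current_dataset` key with the two plate IDs separated by a comma, which is what the `update_dataset` function in this notebook writes. `read_current_dataset` is a hypothetical helper, and how *settings.py* actually parses the file is not shown here.

```python
import json
import os
import tempfile


def read_current_dataset(path="dataset.json"):
    """Return the (first, second) plate IDs stored in dataset.json."""
    with open(path) as json_file:
        # json.load raises json.JSONDecodeError on malformed JSON
        content = json.load(json_file)
    key = content["current_dataset"]
    # unpacking raises ValueError unless there are exactly two IDs
    first, second = key.split(",")
    return first.strip(), second.strip()


# Demonstration on a temporary file, so the real dataset.json is untouched.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "dataset.json")
    with open(demo_path, "w") as json_file:
        json.dump({"current_dataset": "9313,9315"}, json_file, indent=4)
    pair = read_current_dataset(demo_path)

print(pair)
```

A malformed file or a key that does not contain exactly two IDs raises an exception instead of silently selecting the wrong plates.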

The pipeline requires that all data be previously installed in the data directory (defined in file
*settings.py*). The data can be automatically downloaded by notebook *download.ipynb* (still a 
work in progress; the database seems to be broken at the time of this writing). This notebook is
also driven by the sequences in *settings.py*.
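
A quick pre-flight check can catch a missing or empty data directory before a long run starts. The sketch below only tests that the directory exists and contains at least one entry; it makes no assumption about how the plate files are named, and `data_directory_ready` is a hypothetical helper rather than part of the pipeline.

```python
import os
import tempfile


def data_directory_ready(data_path):
    """True if data_path is an existing directory with at least one entry."""
    return os.path.isdir(data_path) and len(os.listdir(data_path)) > 0


# Demonstration with a temporary directory standing in for DATAPATH.
with tempfile.TemporaryDirectory() as tmp_dir:
    empty_ok = data_directory_ready(tmp_dir)       # directory exists but is empty
    with open(os.path.join(tmp_dir, "placeholder.dat"), "w") as f:
        f.write("")                                # stand-in for a data file
    filled_ok = data_directory_ready(tmp_dir)      # now holds one entry
    missing_ok = data_directory_ready(os.path.join(tmp_dir, "no_such_dir"))

print(empty_ok, filled_ok, missing_ok)
```

In the notebook itself the call would be `data_directory_ready(DATAPATH)`, with `DATAPATH` imported from *settings.py*.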

In [1]:
import os
import json
from importlib import reload

import settings
from settings import DATAPATH, sequences, current_dataset

In [2]:
# output path
output_path = os.path.join(DATAPATH, "html")

In [3]:
def update_dataset(key):
    # update the 'dataset.json' file with the name of the dataset to
    # be used next as a dict key by the pipeline code
    dataset_dict = {"current_dataset": key}
    try:
        # the 'with' block closes the file, so the content is flushed to
        # disk before settings is reloaded
        with open('dataset.json', 'w') as json_file:
            json.dump(dataset_dict, json_file, indent=4)
    except IOError as e:
        print(f"Error writing to file: {e}")

In [4]:
sequence = sequences['seq 03']

for i in range(len(sequence) - 1):
    
    plate_id_str = str(sequence[i])
    next_plate_id_str = str(sequence[i+1])

    key = plate_id_str + ',' + next_plate_id_str
    print("Start processing dataset: ", key)
    
    update_dataset(key)
    reload(settings)
    from settings import current_dataset
    
    suffix = plate_id_str + "_" + next_plate_id_str + ".html"
    
    filename = os.path.join(output_path, "find_mismatches_" + suffix)
    !jupyter nbconvert --to html --execute find_mismatches.ipynb --output $filename

    filename = os.path.join(output_path, "psf_analysis_" + suffix)
    !jupyter nbconvert --to html --execute psf_analysis.ipynb --output $filename
    
    filename = os.path.join(output_path, "display_nomatches_" + suffix)
    !jupyter nbconvert --to html --execute display_nonmatches.ipynb --output $filename


Start processing dataset:  9313,9315
[NbConvertApp] Converting notebook find_mismatches.ipynb to html
w0  -  0% .  0 11186 40349260002079     40349270025985    0.32900025523758814 0.1312381450105704
w1  -  3% .  2500 11635 40349260021686     40349270027234    0.04417058621584147 0.19276507877918903
w2  -  26% .  5500 10353 40349260037924     40349270023615    0.0549325991414662 0.31982726045782783
w1  -  23% .  3000 18161 40349260024487     40349270045856    0.14034093587724783 0.13063160434967358
w0  -  20% .  500 10957 40349260008600     40349270025358    0.3540247140563224 0.04050070722314558
w2  -  47% .  6000 14749 40349260041341     40349270035837    0.21285496102336765 0.07010937300151454
w1  -  44% .  3500 13779 40349260027396     40349270033199    0.06713128275350755 0.005269962846909948
w2  -  68% .  6500 18691 40349260046514     40349270047957    0.16831629077387333 0.11366486872361747
w0  -  41% .  1000 18352 40349260012484     40349270046625    0.0728260838513961 0.0105966

w4  -  74% .  9000 21797 40349300028717     40349340021798    0.37527955885252595 0.09795805858630047
w3  -  43% .  6500 37148 40349300047760     40349340037149    0.06708535847792518 0.1532494828381914
w2  -  63% .  5000 32567 40349300042053     40349340032568    0.04854868793700007 0.1773515981625451
w0  -  52% .  1000 43138 40349300055472     40349340043139    0.2690158388531927 0.1082573320843494
w3  -  69% .  7000 38539 40349300049592     40349340038540    0.2803708837063823 0.25306965148104155
w1  -  58% .  3000 51073 40349300065981     40349340051074    0.09085568854061421 0.5258919617347146
w2  -  90% .  5500 34119 40349300044059     40349340034120    0.03118035831448651 0.03896638552305376
w0  -  79% .  1500 44937 40349300057746     40349340044938    0.6314871317954385 1.0841674267595636
w3  -  95% .  7500 17226 40349300022944     40349340017227    0.4581126583616424 0.035563152499662465
w1  -  84% .  3500 54605 40349300070232     40349340054606    0.0750501140146298 0.0783728

w1  -  26% .  4000 20834 40349360022535     40349370036180    0.10581290621303197 0.06101611391500228
w2  -  37% .  7500 15253 40349360034256     40349370020913    0.2507331856122619 0.013694689005205873
w3  -  64% .  11500 17496 40349360049189     40349370027109    0.15239537667639524 0.1313496186952534
w0  -  31% .  1000 18920 40349360011175     40349370030956    0.17156868652818957 0.06133277831850137
w2  -  53% .  8000 24936 40349360035983     40349370049296    0.14072862529701524 0.12538343133030594
w1  -  42% .  4500 20081 40349360024175     40349370034125    0.3229731047213136 0.0673885088787074
w3  -  80% .  12000 18730 40349360052263     40349370030440    0.13721547950353852 0.010212852464519528
w2  -  69% .  8500 18783 40349360037601     40349370030627    0.26448938031080615 0.2622492982197855
w1  -  58% .  5000 15293 40349360025925     40349370021031    0.2949469523400694 0.16936624567449599
w0  -  47% .  1500 14295 40349360013367     40349370018028    0.17535695499191206 0.

w6  -  36% .  13000 8131 40349380008402     40349400008132    0.24744924803599133 0.08785127894270772
w4  -  16% .  8500 22916 40349380023299     40349400022917    0.1924483809943922 0.14192033164022178
w5  -  14% .  10500 29499 40349380030080     40349400029500    0.03540554453138611 0.06326913664338463
w6  -  61% .  13500 10481 40349380010746     40349400010482    0.13753057112353417 0.20102096739833542
w2  -  20% .  4500 36962 40349380037594     40349400036963    0.1251655930900597 0.18115175190303034
w3  -  18% .  6500 43949 40349380044440     40349400043950    0.17198384161929425 0.13568710972720055
w6  -  85% .  14000 12593 40349380012877     40349400012594    0.06971981274546124 0.28177371112860783
w4  -  40% .  9000 24553 40349380024991     40349400024554    0.10197632270774193 0.06422321355046279
w7  -  59% .  15500 18116 40349380018558     40349400018117    0.9510938634548438 0.008353681369044352
w0  -  24% .  500 50822 40349380051449     40349400050823    0.3201265895313554 

w4  -  31% .  8500 22141 40349400024355     40349320022142    0.7262306719894696 0.03155211403509384
w6  -  84% .  13500 11989 40349400013210     40349320011990    0.7700688885279305 0.10269467577757041
w5  -  32% .  10500 28260 40349400031237     40349320028261    0.0738602228466334 0.06992369391980446
w2  -  28% .  4500 33869 40349400037422     40349320033870    0.42040041428208497 0.14758858283983045
w7  -  61% .  15000 17273 40349400018981     40349320017274    0.8050344408502497 0.285539315098049
w0  -  25% .  500 46134 40349400051067     40349320046135    0.717713660355912 0.27885464735390997
w3  -  29% .  6500 40211 40349400044648     40349320040212    0.7016920735850363 0.00024429812413018226
w4  -  56% .  9000 23624 40349400026089     40349320023625    0.43483061938331957 0.12375405711395615
w1  -  26% .  2500 53909 40349400059771     40349320053910    0.591390486738419 0.033975565324340096
w7  -  86% .  15500 19100 40349400020866     40349320019101    1.1581922631194175 0.501