# Accessing SCoPe online
This notebook will go over how to use the `scope_client` class to access the SCoPe catalogs.

1. [Initialize](#initalize)
2. [`cone_search` and `cone_searches`](#cone)
3. [`ids_search`](#id)
4. [`search_by_classification` and `search_by_feature`](#4)
5. [Plot classifications](#plot)
6. [Retrieve light curves](#lc)
7. [Examples](#ex)
    1. [CMD](#cmd)
    2. [Single source](#singlesource)


(links are broken, will fix another time)

## <a id="initalize"></a> 1. Initialize `scope_client`

`scope_client` is built on top of `Kowalski`, so it needs the same credentials. The SCoPe tables all live on `gloria`, so it is the only necessary host; adding other hosts will break the code. Specify this information in the `config.yaml` file and use ONLY hosts that you have access to.

In [None]:
from SCoPe_db import scope_client
import yaml
import numpy as np
import pandas as pd
timeout = 120
C=scope_client(time_out=timeout)

`ZTF_source_features_DR16` and `ZTF_source_classifications_DR16` are the two tables that make up the SCoPe catalog. `ZTF_source_classifications_DR16` contains the classifications for all of the light curves (the same as the Zenodo repo). `ZTF_source_features_DR16` holds the features that were used to compute the classifications. Not all columns are equally useful, so `scope_client` has a built-in list of preselected columns that are returned by all queries.

In [None]:
#columns used for ZTF_source_classifications_DR16
print(C.classification_keys)

In [None]:
#columns used for ZTF_source_features_DR16
print(C.features_keys)

You can change the returned columns manually, as in the cell below; however, the preferred method is to change these variables in `config.yaml` before initializing a `scope_client`.

In [None]:
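# To change the returned columns, edit the key list, e.g. drop the first element
first_key=C.features_keys[0]
C.features_keys=C.features_keys[1:]
# then rebuild the projections
C._setup_projections_()
# Now undo this by putting the removed key back
C.features_keys=[first_key]+C.features_keys
C._setup_projections_()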

## <a id="cone"></a>2. `cone_search` and `cone_searches`
Use these functions to perform a cone search around one (`cone_search`) or many (`cone_searches`) coordinates.

In [None]:
out_data=C.cone_search(10,10,radius=1,unit='arcmin')
out_data

In [None]:
#multiple ra dec positions
pos=[(10,10),(0,0),(5,5)]
out_data=C.cone_searches(pos,radius=1,unit='arcmin')
out_data
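
A cone search keeps the sources whose angular separation from the target position is within the given radius. The sketch below illustrates that selection locally with NumPy; it assumes positions are RA/Dec in degrees, and the helper names `angular_separation_deg` and `in_cone` are hypothetical, not part of `scope_client`. How the server performs the search is not shown in this notebook.

```python
import numpy as np

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between positions in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    a = sin_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_dra**2
    # clip guards against rounding slightly outside [0, 1]
    return np.degrees(2.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0))))

def in_cone(ra, dec, ra0, dec0, radius_arcmin):
    """Boolean mask of the positions within radius_arcmin of (ra0, dec0)."""
    return angular_separation_deg(ra, dec, ra0, dec0) <= radius_arcmin / 60.0

# made-up catalog positions in degrees, for illustration only
ra = np.array([10.0, 10.01, 10.5])
dec = np.array([10.0, 10.0, 10.0])
mask = in_cone(ra, dec, 10.0, 10.0, radius_arcmin=1)
```

Here the first two made-up positions fall inside the 1 arcmin cone and the third does not.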

## <a id="id"></a>3. `ids_search`
If you already know the ids of the objects you are looking for, you can retrieve them directly:

In [None]:
ids=[10447433026230,10447432005323,10447433005770]
out_data=C.ids_search(ids,id_type='_id')


You can also search on `AllWISE___id`, `Gaia_EDR3___id` or `PS1_DR1___id` by changing the `id_type` kwarg to the desired id.

## <a id="4"></a> 4. `search_by_classification` and `search_by_feature`
The tables are indexed on field, so these functions run searches in parallel over fields while filtering on columns from either the classifications table or the features table. They are equivalent to a three-stage aggregation pipeline:
1. Select only rows in a specified `field`
2. Perform a `match` on columns of either `ZTF_source_classifications_DR16` or `ZTF_source_features_DR16` (not both)
3. `project` the desired columns
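
The three stages above can be sketched as a MongoDB-style aggregation pipeline, using the same operator syntax as the filter dictionaries in this notebook. The function `build_pipeline` and the column list are hypothetical and only illustrate the structure; the pipeline that `scope_client` actually sends may differ.

```python
def build_pipeline(field, filter_stage, columns):
    """Return a three-stage aggregation pipeline as a list of stage dictionaries."""
    return [
        {'$match': {'field': field}},           # 1. keep rows from one field
        {'$match': filter_stage},               # 2. apply the column filter
        {'$project': {c: 1 for c in columns}},  # 3. return only the listed columns
    ]

# example filter: period below 10 days AND DNN periodic score above 0.7
short_period = {'$and': [{'period': {'$lt': 10}}, {'pnp_dnn': {'$gt': .7}}]}

# one pipeline per field, which is what allows the searches to run in parallel
pipelines = [build_pipeline(f, short_period, ['_id', 'period', 'pnp_dnn'])
             for f in [447, 500, 396]]
```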

In [None]:
fields=[447,500,396]
# period is less than 10 days AND dnn periodic score is greater than .7 AND XGB periodic score is greater than .7
short_period_and_periodic={'$and':[
                                    {'period':{'$lt':10}},
                                    {'pnp_dnn':{'$gt':.7}},
                                    {'pnp_xgb':{'$gt':.7}}
                                ]
                            }

out_data=C.search_by_classification(fields,filter_stage=short_period_and_periodic)
out_data

In [None]:
fields=[447,500,396]
# period is less than 10 days AND significance greater than 10 AND an amplitude greater than 1
short_period_high_amplitude={'$and':[
                                    {'amplitude':{'$gt':1}},
                                    {'period_ELS_ECE_EAOV':{'$lt':10}},
                                    {'significance_ELS_ECE_EAOV':{'$gt':10}}
                                ]
                            }

out_data=C.search_by_feature(fields,filter_stage=short_period_high_amplitude)
out_data

## 5. <a id="plot"></a> View one source's classifications
`all_tax.yaml` has the full names for the columns in the classification database.

In [None]:
#get one row
out_data=C.ids_search([10500512001977],id_type='_id')
out_data

You can view the classifications using the scheme below. I have left this level of settings exposed so that you can customize the plots as you see fit. If you want more settings exposed or more plot features, open an issue on GitHub and I (Daniel Warshofsky) will look into it.

The top part of the circle shows the DNN scores and the lower part shows the XGB scores.

In [None]:
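# copy and plt are used below; import them explicitly rather than relying on the star import
import copy
import matplotlib.pyplot as plt
import yaml
from class_plot import *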
with open('./all_tax.yaml') as config_yaml:
    full_tree = yaml.load(config_yaml, Loader=yaml.FullLoader)
copy_tree=copy.deepcopy(full_tree)
# get just the Phenomenological tree
ph_tree=copy_tree['children'][0]
# get just the Ontological tree
on_tree=copy_tree['children'][1]
fig,axs=plt.subplots(2,figsize=(8,16))

s_ph={'skip_text':False,'cm':"Greens"}
s_on={'skip_text':False,'cm':"Greens"}
axs[0].set_title('Phenomenological')
plot_classifications(axs[0],out_data,ph_tree,sep=.3,settings=s_ph)
axs[1].set_title('Ontological')
plot_classifications(axs[1],out_data,on_tree,sep=.3,settings=s_on)
axs[0].set_xticks([])
axs[0].set_yticks([])
axs[1].set_xticks([])
axs[1].set_yticks([])
plt.tight_layout()

## 6. <a id="lc"></a> Retrieving light curves
You can get light curves by `_id` or by a single coordinate. SCoPe sources are defined per light curve, not per object, so each filter is separate! All light curves are returned in the same `DataFrame`, with the `_id` column denoting which light curve each row belongs to. You need access to `melman` for this functionality.
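
Because every light curve comes back in one `DataFrame`, a common first step is to split it by `_id`. The sketch below does this with `groupby` on made-up measurements; the column names `_id`, `hjd` and `mag` follow the cells in this notebook, while the values and the `per_curve` dictionary are illustrative assumptions.

```python
import pandas as pd

# made-up measurements for two light curves, for illustration only
lc = pd.DataFrame({
    '_id': [1, 1, 2, 2, 2],
    'hjd': [2458000.5, 2458001.5, 2458000.6, 2458001.6, 2458002.6],
    'mag': [15.2, 15.3, 14.9, 15.0, 14.8],
})

# one DataFrame per light curve, keyed by its _id as a plain Python int
per_curve = {int(lc_id): group.reset_index(drop=True)
             for lc_id, group in lc.groupby('_id')}

# number of observations in each light curve
n_points = {lc_id: len(group) for lc_id, group in per_curve.items()}
```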

In [None]:
ids=[10447433026230,10447432005323,10447433005770]
out_data=C.get_light_curves_by_id(ids)
out_data


In [None]:
out_data=C.get_light_curves_by_coord(5.008773,4.999681)
out_data

In [None]:
import matplotlib.pyplot as plt
fig,ax=plt.subplots()
t0=out_data['hjd'].min() # time of the first observation across all light curves
for lc_id in np.unique(out_data['_id']):
    mask=out_data['_id']==lc_id
    ax.plot(out_data[mask]['hjd']-t0,out_data[mask]['mag'],label=str(lc_id))
ax.legend()
ax.invert_yaxis()
ax.set_xlabel('Days since first observation')
ax.set_ylabel('mag')

## 7. <a id="ex"></a> Examples

### 7.1. <a id="cmd"></a> CMD

In [None]:
# Retrieve some data
fields=[500]
periodic={'$and':[
                    {'pnp_dnn':{'$gt':.7}},
                    {'pnp_xgb':{'$gt':.7}}
                ]
            }
non_periodic={'$and':[
                    {'pnp_dnn':{'$lt':.4}},
                    {'pnp_xgb':{'$lt':.4}}
                ]
            }

periodic_data=C.search_by_classification(fields,filter_stage=periodic)
non_periodic_data=C.search_by_classification(fields,filter_stage=non_periodic)

In [None]:
# keep only rows with a Gaia EDR3 match (NaN != NaN is always True, so use notna())
periodic_data=periodic_data[periodic_data['Gaia_EDR3___id'].notna()].copy()
non_periodic_data=non_periodic_data[non_periodic_data['Gaia_EDR3___id'].notna()].copy()

# absolute magnitude from the distance modulus with parallax in mas: M = m + 5*log10(parallax/1000) + 5
periodic_data['Abs_g_mag']=periodic_data["Gaia_EDR3__phot_g_mean_mag"] + 5.0 * np.log10(periodic_data["Gaia_EDR3__parallax"] / 1000) + 5.0
non_periodic_data['Abs_g_mag']=non_periodic_data["Gaia_EDR3__phot_g_mean_mag"] + 5.0 * np.log10(non_periodic_data["Gaia_EDR3__parallax"] / 1000) + 5.0

#keep only significant parallax (parallax / error > 3)
final_periodic=periodic_data[periodic_data["Gaia_EDR3__parallax"]/periodic_data['Gaia_EDR3__parallax_error']>3]
final_non_periodic=non_periodic_data[non_periodic_data["Gaia_EDR3__parallax"]/non_periodic_data['Gaia_EDR3__parallax_error']>3]
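
The absolute magnitude used for the CMD follows from the distance modulus, M = m - 5 log10(d / 10 pc), with the distance in parsecs taken as d = 1000 / parallax for a parallax in milliarcseconds. The sketch below is a standalone check of that relation on a toy value; the function `absolute_magnitude` is hypothetical, and it assumes the parallax is in milliarcseconds (the Gaia convention) and ignores extinction.

```python
import numpy as np

def absolute_magnitude(apparent_mag, parallax_mas):
    """Absolute magnitude from apparent magnitude and parallax in milliarcseconds."""
    distance_pc = 1000.0 / np.asarray(parallax_mas, dtype=float)
    return np.asarray(apparent_mag, dtype=float) - 5.0 * np.log10(distance_pc) + 5.0

# a star at 100 pc (parallax of 10 mas) with apparent magnitude 10
M = absolute_magnitude(10.0, 10.0)

# at 10 pc (parallax of 100 mas) absolute and apparent magnitudes coincide
M_at_10pc = absolute_magnitude(12.0, 100.0)
```

Inverting the parallax is only a reasonable distance estimate when the parallax is well measured, which is why the cell above keeps only sources with parallax over error greater than 3.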


In [None]:
fig,ax= plt.subplots(figsize=(8,8))
ax.scatter(final_non_periodic['Gaia_EDR3__phot_bp_mean_mag']-final_non_periodic['Gaia_EDR3__phot_rp_mean_mag'],final_non_periodic['Abs_g_mag'],label='Non-Periodic',s=5,alpha=.1)
ax.scatter(final_periodic['Gaia_EDR3__phot_bp_mean_mag']-final_periodic['Gaia_EDR3__phot_rp_mean_mag'],final_periodic['Abs_g_mag'],label='Periodic',s=5,alpha=.2)
ax.set_xlabel('BP-RP (mag)')
ax.set_ylabel('Abs G (mag)')
l=ax.legend()
for lh in l.legend_handles: 
    lh.set_alpha(1)
ax.invert_yaxis()


### 7.2. <a id="singlesource"></a> Inspecting a single object

In [None]:
# First, let's find some sources that are interesting
fields=[740]
#look for some very short period sources
periodic={'$and':[
                    {'period':{'$lt':.2}},
                    {'pnp_dnn':{'$gt':.7}},
                    {'pnp_xgb':{'$gt':.7}}
                ]
            }
periodic_data=C.search_by_classification(fields,filter_stage=periodic)


In [None]:
periodic_data

In [None]:
# now let's filter for only the most significant periods
q='significance_ELS > 100 & significance_EAOV >100 & significance_ECE >30 & significance_ELS_ECE_EAOV > 100' #this is arbitrary, just to find the source that I want
periodic_data_high_sig=periodic_data.query(q).reset_index(drop=True)
periodic_data_high_sig

In [None]:
# Look at the rows one by one until we find something cool

row=periodic_data_high_sig[0:1]
print(f"This source has id: {row['_id'].values}")

with open('./all_tax.yaml') as config_yaml:
    full_tree = yaml.load(config_yaml, Loader=yaml.FullLoader)
copy_tree=copy.deepcopy(full_tree)
# get just the Phenomenological tree
ph_tree=copy_tree['children'][0]
# get just the Ontological tree
on_tree=copy_tree['children'][1]
fig,axs=plt.subplots(2,figsize=(8,16))

s_ph={'skip_text':False,'cm':"Greens"}
s_on={'skip_text':False,'cm':"Greens"}
axs[0].set_title('Phenomenological')
plot_classifications(axs[0],row,ph_tree,sep=.3,settings=s_ph)
axs[1].set_title('Ontological')
plot_classifications(axs[1],row,on_tree,sep=.3,settings=s_on)
axs[0].set_xticks([])
axs[0].set_yticks([])
axs[1].set_xticks([])
axs[1].set_yticks([])
plt.tight_layout()

While both DNN and XGB agree that the source is periodic, DNN seems to think this source is a W Ursae Majoris variable (contact binary), while XGB thinks it is a Delta Scuti variable. This merits looking at the light curve!

In [None]:
# numpy ints are not JSON serializable, so cast the id to a Python int first; if you encounter similar issues, open an issue
light_curve=C.get_light_curves_by_id([int(row['_id'][0])])
light_curve['hjd']-=min(light_curve['hjd']) # set times relative to the first observation

period=row['period'].loc[0]
fig,axs=plt.subplots(2,figsize=(12,8))
axs[0].scatter(light_curve['hjd'],light_curve['mag'],s=2)
axs[0].invert_yaxis()
axs[0].set_ylim(15.65,15)
axs[0].set_xlabel('Days Since First Obs')
axs[0].set_ylabel('Mag')
axs[0].set_title('Full Lightcurve')


axs[1].scatter((light_curve['hjd']%(2*period))/period,light_curve['mag'],s=2)
axs[1].invert_yaxis()
axs[1].set_ylim(15.65,15)
axs[1].set_xlabel('Phase')
axs[1].set_ylabel('Mag')
axs[1].set_title(f'Phase folded Lightcurve at period {period:.4f}')
plt.tight_layout()
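
The phase folding in the lower panel takes the observation times modulo twice the period and divides by the period, so each point lands at a phase between 0 and 2 and two full cycles are shown. A minimal standalone sketch with made-up times follows; the helper `phase_fold` and its `n_cycles` parameter are hypothetical names introduced here.

```python
import numpy as np

def phase_fold(times, period, n_cycles=2):
    """Fold observation times on a period; phases run from 0 up to n_cycles."""
    times = np.asarray(times, dtype=float)
    return (times % (n_cycles * period)) / period

# made-up times in days, folded on a 0.5 day period
t = np.array([0.0, 0.25, 0.5, 0.75, 1.0, 1.25])
phase = phase_fold(t, period=0.5)
```

Folding over two cycles rather than one makes features that straddle phase 0 easier to see, at the cost of each cycle containing only part of the data.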

Seeing the light curve reveals that it is a Delta Scuti variable. This is a known source; see [here](https://simbad.u-strasbg.fr/simbad/sim-coo?protocol=html&NbIdent=us=30&Radius.unit=arcsec&CooFrame=FK5&CooEpoch=2000&CooEqui=2000&Coord=43.3365825d+48.2069786d).