## Demo: Importing a Catalogue from VizieR
**LINCC Frameworks** Lunch Talk Demo

Andy Tzanidakis, April 25, 2024


### Getting Started
- First, visit the VizieR Catalogue Collection: https://cdsarc.u-strasbg.fr/viz-bin/cat/
- Search by catalogue ID or reference (e.g., StarHorse)

In [46]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import astropy.units as u
%matplotlib inline
%config InlineBackend.figure_format = "retina"
from matplotlib import rcParams
rcParams['savefig.dpi'] = 550
rcParams['font.size'] = 20
plt.rc('font', family='serif')

import lsdb
from tqdm import tqdm
import dask
dask.config.set({"temporary-directory" :'/epyc/ssd/users/atzanida/tmp'})
dask.config.set({"dataframe.shuffle-compression": 'Snappy'})
dask.config.set({"dataframe.convert-string": False})
from dask.distributed import Client

# VizieR and Aladin querying
from pyvo import registry  # version >=1.4.1 
from mocpy import MOC
from ipyaladin import Aladin



In [None]:
# Initialize the Dask client (lowercase name so the Client class is not shadowed)
client = Client(n_workers=8, threads_per_worker=1, memory_limit='auto')

## VizieR Querying

In [47]:
# the catalogue name in VizieR
CATALOGUE = "I/354"

# Each resource in the VO has an identifier, called an ivoid. For VizieR catalogues,
# the VO IDs can be constructed like this:
catalogue_ivoid = f"ivo://CDS.VizieR/{CATALOGUE}"

# the actual query to the registry
voresource = registry.search(ivoid=catalogue_ivoid)[0]

tables = voresource.get_tables()

# We can also extract the table names for later use
tables_names = list(tables.keys())
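
The identifier construction above can be wrapped in a small helper. This is a minimal sketch: `vizier_ivoid` and `same_ivoid` are hypothetical names, not part of pyvo. The case-insensitive comparison is included because the registry echoes the identifier in lower case (`ivo://cds.vizier/i/354`) in the description printed later in this demo.

```python
def vizier_ivoid(catalogue: str) -> str:
    """Build the IVOA identifier for a VizieR catalogue name such as 'I/354'."""
    name = catalogue.strip().strip("/")
    if not name:
        raise ValueError("catalogue name must not be empty")
    return f"ivo://CDS.VizieR/{name}"


def same_ivoid(a: str, b: str) -> bool:
    """Compare two identifiers while ignoring letter case."""
    return a.casefold() == b.casefold()


ivoid = vizier_ivoid("I/354")
print(ivoid)
print(same_ivoid(ivoid, "ivo://cds.vizier/i/354"))
```

The resulting string can be passed as `ivoid=` to `registry.search`, exactly as in the cell above.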



In [48]:
print(f"Available table names: {tables_names}")

Available table names: ['I/354/starhorse2021']


In [49]:
# Let's quickly read the table description...
voresource.describe(verbose=True)

StarHorse2, Gaia EDR3 photo-astrometric distances
Short Name: I/354
IVOA Identifier: ivo://cds.vizier/i/354
Access modes: conesearch, hips#hips-1.0, tap#aux, web
Multi-capabilty service -- use get_service()

We present a catalogue of 362 million stellar parameters, distances, and
extinctions derived from Gaia's Early Data Release (EDR3) cross-matched with
the photometric catalogues of Pan-STARRS1, SkyMapper, 2MASS, and AllWISE. The
higher precision of the Gaia EDR3 data, combined with the broad wavelength
coverage of the additional photometric surveys and the new stellar-density
priors of the StarHorse code, allows us to substantially improve the accuracy
and precision over previous photo-astrometric stellarparameter estimates. At
magnitude G=14 (17), our typical precisions amount to 3% (15%) in distance,
0.13mag (0.15mag) in V-band extinction, and 140K (180K) in effective
temperature. Our results are validated by comparisons with open clusters, as
well as with asteroseismic and spectr

In [None]:
# Select the first table name
first_table_name = tables_names[0]

In [None]:
# Initialize the TAP service
tap_service = voresource.get_service("tap")

# ADQL query: cuts on logg50, teff50, declination, fidelity, and Gaia source ID
query = f"""
SELECT TOP 6000000 Source, RA_ICRS, DE_ICRS, teff50, logg50, met50,
       dist50, fidelity, GMAG0, "BP-RP0"
FROM "{first_table_name}"
WHERE (logg50 > 4.3) AND (logg50 < 4.72)
  AND (teff50 > 4000) AND (teff50 < 7220)
  AND (DE_ICRS BETWEEN -50 AND 50)
  AND (fidelity >= 0.9)
  AND (Source BETWEEN 161269789918889472 AND 9361269790018889472)
"""
tap_records = tap_service.run_sync(query)
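
When the cuts change often, it can help to assemble the ADQL string from parts rather than editing one long literal. The following is a rough sketch under that assumption; `quote_identifier` and `build_adql` are hypothetical helpers (not pyvo functions), and the column subset and `TOP` value are chosen only for illustration. Identifiers with characters outside letters, digits, and underscores (such as `BP-RP0`) are double-quoted, as in the query above.

```python
def quote_identifier(name: str) -> str:
    """Double-quote identifiers containing characters outside [A-Za-z0-9_]."""
    plain = name.replace("_", "").isalnum()
    return name if plain else f'"{name}"'


def build_adql(table, columns, conditions, top=None):
    """Compose a SELECT statement; each condition is AND-ed in parentheses."""
    select = "SELECT " + (f"TOP {top} " if top is not None else "")
    column_list = ", ".join(quote_identifier(c) for c in columns)
    where = " AND ".join(f"({c})" for c in conditions)
    query = f'{select}{column_list} FROM "{table}"'
    return f"{query} WHERE {where}" if where else query


query = build_adql(
    table="I/354/starhorse2021",
    columns=["Source", "RA_ICRS", "DE_ICRS", "teff50", "logg50", "BP-RP0"],
    conditions=["logg50 > 4.3", "logg50 < 4.72", "fidelity >= 0.9"],
    top=1000,
)
print(query)
```

The returned string could then be handed to the TAP service's `run_sync`, as in the cell above. A small `TOP` value is a cheap way to check the column names before launching the full multi-million-row query.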

In [None]:
table0 = tap_records.to_table()

In [None]:
# Convert to pandas dataframe
table_df = table0.to_pandas()

## Passing to LSDB

In [None]:
%%time
# HiPSCat catalogue with at most 1_000_000 rows per pixel (threshold)
star_horse_cat = lsdb.from_dataframe(
    table_df,
    catalog_name="StarHorse",
    catalog_type="object",
    lowest_order=5, 
    highest_order=8, 
    ra_column="RA_ICRS", 
    dec_column="DE_ICRS", 
    threshold=1_000_000)
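
To get a feel for what `lowest_order=5` and `highest_order=8` mean on the sky, the sketch below computes the number of HEALPix pixels and the pixel area at each order, using the standard relation of 12 × 4^order equal-area pixels. The helper names are illustrative only and are not part of lsdb.

```python
import math

# Area of the full sky in square degrees (about 41252.96)
FULL_SKY_DEG2 = 4 * math.pi * (180 / math.pi) ** 2


def healpix_npix(order: int) -> int:
    """Number of HEALPix pixels at a given order (nside = 2**order)."""
    return 12 * 4**order


def pixel_area_deg2(order: int) -> float:
    """Area of one equal-area HEALPix pixel in square degrees."""
    return FULL_SKY_DEG2 / healpix_npix(order)


for order in (5, 8):
    print(f"order {order}: {healpix_npix(order)} pixels, "
          f"{pixel_area_deg2(order):.4f} deg^2 each")
```

Each step up in order splits a pixel into four, so a denser region of sky can be stored at a higher order while keeping every partition under the row threshold.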

In [50]:
# Already completed this step; load the previously saved catalogue from disk
fgk_object = lsdb.read_hipscat("/nvme/users/atzanida/tmp/sample_final_starhorse_hips")

In [51]:
type(fgk_object)

lsdb.catalog.catalog.Catalog

In [52]:
fgk_object

npartitions=99,Source_StarHorse,RA_ICRS_StarHorse,DE_ICRS_StarHorse,teff50_StarHorse,logg50_StarHorse,met50_StarHorse,dist50_StarHorse,fidelity_StarHorse,GMAG0_StarHorse,BP-RP0_StarHorse,Norder_StarHorse,Dir_StarHorse,Npix_StarHorse,ps1_objid_ztf_dr14,ra_ztf_dr14,dec_ztf_dr14,ps1_gMeanPSFMag_ztf_dr14,ps1_rMeanPSFMag_ztf_dr14,ps1_iMeanPSFMag_ztf_dr14,nobs_g_ztf_dr14,nobs_r_ztf_dr14,nobs_i_ztf_dr14,mean_mag_g_ztf_dr14,mean_mag_r_ztf_dr14,mean_mag_i_ztf_dr14,Norder_ztf_dr14,Dir_ztf_dr14,Npix_ztf_dr14,_DIST,Norder,Dir,Npix
0,int64,float64,float64,float64,float64,float64,float64,float64,float64,float64,int64,int64,int64,int64,float64,float64,float64,float64,float64,int32,int32,int32,float64,float64,float64,int32,int32,int32,float64,uint8,uint64,uint64
288230376151711744,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12682136550675316736,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
18446744073709551615,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...


In [58]:
%%time
query_1 = fgk_object.cone_search(ra=131.5, dec=11.5, radius_arcsec=350)

CPU times: user 19 ms, sys: 6.92 ms, total: 25.9 ms
Wall time: 24.2 ms
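
As a sanity check on what a cone search selects, the sketch below computes the great-circle separation between the cone centre and one star with the haversine formula, using only the standard library. It illustrates the geometry and is not how lsdb implements `cone_search`. The star coordinates are taken from the result table of this demo, and the function names are hypothetical.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation via the haversine formula; inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600


def in_cone(ra, dec, ra0, dec0, radius_arcsec):
    """True if (ra, dec) lies within radius_arcsec of the centre (ra0, dec0)."""
    return angular_separation_arcsec(ra, dec, ra0, dec0) <= radius_arcsec


# Cone centre and radius used in the demo: ra=131.5, dec=11.5, 350 arcsec
sep = angular_separation_arcsec(131.516913, 11.406418, 131.5, 11.5)
print(f"separation: {sep:.1f} arcsec")
print(in_cone(131.516913, 11.406418, 131.5, 11.5, 350))
```

The same function can be applied to any row of a query result to confirm that it falls inside the requested radius.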


In [60]:
%%time
query_1.head(10)

CPU times: user 1.77 s, sys: 615 ms, total: 2.38 s
Wall time: 4.21 s


_hipscat_index,Source_StarHorse,RA_ICRS_StarHorse,DE_ICRS_StarHorse,teff50_StarHorse,logg50_StarHorse,met50_StarHorse,dist50_StarHorse,fidelity_StarHorse,GMAG0_StarHorse,BP-RP0_StarHorse,...,mean_mag_g_ztf_dr14,mean_mag_r_ztf_dr14,mean_mag_i_ztf_dr14,Norder_ztf_dr14,Dir_ztf_dr14,Npix_ztf_dr14,_DIST,Norder,Dir,Npix
1197863561548791808,598931784973616640,131.516913,11.406418,5466.05,4.597045,-0.446562,1.330821,1.0,5.618283,0.899752,...,16.74791,16.216818,16.053377,4,0,265,2.6e-05,1,0,4
1197675982639595520,598837978592281344,131.49218,11.407178,4971.93,4.597973,0.007115,1.182294,1.0,6.117981,1.137367,...,17.290233,16.475666,16.239628,4,0,265,3.2e-05,1,0,4
1197863489494843392,598931716254139264,131.552034,11.431177,5832.03,4.387356,-0.173844,1.033428,1.0,4.596917,0.776481,...,15.277122,14.796724,14.651127,4,0,265,3.8e-05,1,0,4
1197864225905573888,598932089915685760,131.551169,11.434792,5760.52,4.464173,-0.380127,2.618237,0.986816,4.999017,0.816458,...,17.676188,17.170518,17.040968,4,0,265,3.5e-05,1,0,4
1197864815444361216,598932399153342592,131.55043,11.462403,4075.75,4.380496,0.20447,1.229194,0.994629,7.27018,1.791975,...,19.139228,17.749946,17.143517,4,0,265,3.1e-05,1,0,4
1197868078726119424,598934018356606848,131.577509,11.494147,5813.61,4.4266,-0.203977,2.550507,0.987305,4.705297,0.82076,...,17.30064,16.805834,16.663889,4,0,265,3.3e-05,1,0,4
1203681371613036544,601840680424055168,131.45577,11.431003,4831.35,4.37069,-0.39797,0.365645,1.0,5.547179,1.116304,...,14.608643,13.756027,13.335592,4,0,267,2.7e-05,1,0,4
1203869057137246208,601934512573684480,131.469083,11.446394,5708.72,4.48501,-0.281889,2.325652,0.998535,5.032155,0.805952,...,17.405196,16.895617,16.765872,4,0,267,3.6e-05,1,0,4
1203869713306746880,601934856171068672,131.517089,11.464559,5415.8,4.556509,-0.352593,1.471636,1.0,5.486508,0.954173,...,16.8358,16.267802,16.10044,4,0,267,3.2e-05,1,0,4
1203869653294645248,601934826107096448,131.537837,11.466398,6124.96,4.31493,-0.177052,0.665078,1.0,4.058485,0.689887,...,13.843696,13.395432,13.265409,4,0,267,2.8e-05,1,0,4
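
The `_hipscat_index` values can be related to the `Norder`/`Npix` columns with a bit shift. The sketch below assumes the HiPSCat convention that the index stores the order-19 nested HEALPix pixel in the top 42 bits of a 64-bit integer; the values in the table above are consistent with that assumption. `index_to_pixel` is a hypothetical helper, not an lsdb or hipscat function.

```python
HIPSCAT_INDEX_ORDER = 19  # assumed HEALPix order encoded in _hipscat_index


def index_to_pixel(hipscat_index: int, order: int) -> int:
    """Nested HEALPix pixel at `order` that contains the given index value."""
    if not 0 <= order <= HIPSCAT_INDEX_ORDER:
        raise ValueError("order must be between 0 and 19")
    # A pixel at `order` needs 4 bits for the base pixel plus 2 bits per order,
    # so shifting away the remaining low bits yields the parent pixel number.
    return hipscat_index >> (64 - (4 + 2 * order))


idx = 1197863561548791808  # first row of the table above
print(index_to_pixel(idx, 1))  # compare with Npix where Norder = 1
print(index_to_pixel(idx, 4))  # compare with Npix_ztf_dr14 where Norder_ztf_dr14 = 4
```

Because rows are ordered by this index, stars that are close on the sky tend to sit in the same partition, which is what lets spatial queries such as a cone search skip most partitions.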
