# Data Selection (Query Gaia DR3)

This notebook retrieves Gaia DR3 data for stars in the vicinity of the **Pleiades open cluster** (Messier 45). We aim to:
- Extract astrometric (parallax, proper motion) and photometric (magnitudes) data.
- Prepare a clean dataset for downstream analysis (e.g., membership determination).

In [2]:
# Necessary imports
import numpy as np
import pandas as pd

# Astronomy tools
import astropy.units as u
from astropy.coordinates import SkyCoord

# Access astronomical databases
from astroquery.vizier import Vizier

---

### Step 1: Query Gaia DR3 data

#### Why Gaia DR3?
Gaia DR3 provides high-precision astrometry (parallaxes, proper motions) and photometry for over 1.8 billion stars. This makes it ideal for studying stellar clusters like the Pleiades.

### Target Coordinates
- **Pleiades Center**: RA = 56.87°, Dec = 24.11° ([ICRS](https://en.wikipedia.org/wiki/International_Celestial_Reference_System)).
- **Radius**: 1 degree (to capture cluster members and background/foreground stars).

In [None]:
# Configuration

Vizier.ROW_LIMIT = -1 # Disable row limit
catalog = "I/355/gaiadr3" # Gaia DR3 catalog ID
pleiades_ra, pleiades_dec = 56.87, 24.11 # Pleiades' ICRS coordinates

### Step 2: Fetch data

In [None]:
# Fet data within 1-degree radius

vizier = Vizier()
pleiades_ra_dec = f"{pleiades_ra} {pleiades_dec}"

result = vizier.query_region(
    pleiades_ra_dec,
    radius="1d",  # Radius of 1 degree
    catalog=catalog
)



A `User Warning` about coordinate interpretation is expected because we pass a string in degrees. This does not affect results.

In [5]:
result

TableList with 1 tables:
	'0:I/355/gaiadr3' with 57 column(s) and 50 row(s) 

In [6]:
result[0]

RA_ICRS,DE_ICRS,Source,e_RA_ICRS,e_DE_ICRS,Plx,e_Plx,PM,pmRA,e_pmRA,pmDE,e_pmDE,RUWE,FG,e_FG,Gmag,FBP,e_FBP,BPmag,FRP,e_FRP,RPmag,BP-RP,RV,e_RV,Vbroad,GRVSmag,QSO,Gal,NSS,XPcont,XPsamp,RVS,EpochPh,EpochRV,MCMCGSP,MCMCMSC,And,Teff,logg,[Fe/H],Dist,A0,HIP,PS1,SDSS13,SKYM2,TYC2,URAT1,AllWISE,APASS9,GSC23,RAVE5,2MASS,RAVE6,RAJ2000,DEJ2000
deg,deg,Unnamed: 2_level_1,mas,mas,mas,mas,mas / yr,mas / yr,mas / yr,mas / yr,mas / yr,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,mag,Unnamed: 16_level_1,Unnamed: 17_level_1,mag,Unnamed: 19_level_1,Unnamed: 20_level_1,mag,mag,km / s,km / s,km / s,mag,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,K,log(cm.s**-2),Unnamed: 40_level_1,pc,mag,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1,Unnamed: 48_level_1,Unnamed: 49_level_1,Unnamed: 50_level_1,Unnamed: 51_level_1,Unnamed: 52_level_1,Unnamed: 53_level_1,Unnamed: 54_level_1,deg,deg
float64,float64,int64,float64,float64,float64,float32,float64,float64,float32,float64,float32,float64,float64,float32,float64,float32,float32,float64,float32,float32,float64,float64,float64,float32,float64,float64,uint8,uint8,uint8,uint8,uint8,uint8,uint8,uint8,uint8,uint8,uint8,float64,float64,float64,float64,float64,int32,int64,int64,int32,str12,str15,str19,int32,str10,str16,str17,str21,float64,float64
56.29970655907,23.26100088624,64878779542177920,0.4013,0.2921,0.2190,0.4557,2.859,2.721,0.595,-0.875,0.363,1.098,275.57098,1.196,19.586782,112.9,6.887,20.207155,225.8,10.91,18.863811,1.343344,--,--,--,--,0,0,0,0,0,0,0,0,0,0,0,--,--,--,--,--,--,135910562997343759,--,--,,,,--,NC3R001125,,,,56.29969339355,23.26100477401
56.31187534304,23.25615636617,64878779542180096,0.5357,0.3904,0.8742,0.6493,5.453,-1.735,0.871,-5.170,0.487,0.984,188.11929,0.9717,20.001284,61.05,8.187,20.874315,200.2,12.21,18.994102,1.880213,--,--,--,--,0,0,0,0,0,0,0,0,0,0,0,--,--,--,--,--,--,135900563119097937,--,--,,,,--,NC3R002264,,,,56.31188373645,23.25617934188
56.30543515160,23.26484284000,64878779542260864,1.0335,0.7541,2.6486,1.1043,19.280,0.152,1.829,-19.280,0.918,1.064,122.81044,1.642,20.464280,49.71,8.23,21.097471,186.7,14.43,19.070179,2.027292,--,--,--,--,0,0,0,0,0,0,0,0,0,0,0,--,--,--,--,--,--,135910563054748477,--,--,,,J034513.30+231553.6,--,,,03451333+2315539,,56.30543441548,23.26492852824
56.28885509191,23.27596376545,64879604175898624,0.1305,0.0950,0.1802,0.1463,11.835,6.824,0.185,-9.670,0.121,1.008,1191.05275,1.683,17.997540,560,10.16,18.467978,880,10.07,17.386631,1.081346,--,--,--,--,0,0,0,0,0,0,0,0,1,1,0,4939.8,4.7814,-2.3683,1544.3237,0.0292,--,135930562888691737,--,--,,URAT1-567028005,J034509.31+231633.2,--,NC3R001123,,03450931+2316336,,56.28882207756,23.27600674439
56.27559048728,23.27885146249,64879608470745984,0.7507,0.5688,-0.3159,0.9182,3.107,-2.950,1.274,-0.975,0.699,0.998,133.42777,1.072,20.374250,60.69,12.46,20.880724,109.2,12.17,19.652622,1.228102,--,--,--,--,0,0,0,0,0,0,0,0,0,0,0,--,--,--,--,--,--,135930562756285187,--,--,,,,--,NC3R002484,,,,56.27560476142,23.27885579510
56.26915181062,23.27549627972,64879642830595712,2.6329,2.0902,--,--,--,--,--,--,--,--,81.25938,1.223,20.912683,47.88,15.12,21.138159,110.1,17.28,19.643484,1.494675,--,--,--,--,0,0,0,0,0,0,0,0,0,0,0,--,--,--,--,--,--,135930562691941172,--,--,,,,--,,,,,56.26915181062,23.27549627972
56.27480840954,23.28176848536,64879672895376256,0.2407,0.1686,0.3975,0.2760,3.140,-0.691,0.344,-3.063,0.205,1.125,690.30101,1.513,18.589771,303.5,9.319,19.133230,522.5,10.54,17.952698,1.180532,--,--,--,--,0,0,0,0,0,0,0,0,1,0,0,4556.2,4.8610,-1.4192,1513.5316,0.0053,--,135930562748398687,--,--,,URAT1-567027998,,--,NC3R001151,,,,56.27481175082,23.28178210047
56.27001240205,23.28184658425,64879672895376640,0.2812,0.2063,1.6016,0.3162,29.786,25.935,0.433,-14.649,0.251,1.105,512.07012,1.392,18.914043,72.48,5.137,20.688047,655.2,10.19,17.706924,2.981123,--,--,--,--,0,0,0,0,0,0,0,0,1,0,0,3291.4,4.9633,-0.2035,371.0366,0.5355,--,135930562699838803,--,--,,URAT1-567027994,J034504.79+231654.7,--,NC3R001158,,03450478+2316549,,56.26988691887,23.28191169119
56.27350772077,23.29409853091,64879672895379200,0.0809,0.0580,2.2309,0.0893,90.705,24.736,0.111,-87.267,0.071,0.995,2685.48168,2.458,17.114810,748.9,9.855,18.152431,2811,13.44,16.125580,2.026852,--,--,--,--,0,0,0,1,0,0,0,0,1,1,0,4277.5,4.6363,-0.1009,573.3726,1.3041,--,135950562734883658,--,--,,URAT1-567027997,J034505.62+231739.2,--,NC3R002605,,03450562+2317402,,56.27338802683,23.29448638534
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...


### Step 3: Convert to DataFrame and select key columns

Column Rationale:

- `RA_ICRS`, `DE_ICRS`: Celestial coordinates.

- `Plx`, `e_Plx`: Parallax and its error (critical for distance estimation).

- `pmRA`, `pmDE`: Proper motions (to identify cluster members).

- `Gmag`, `BPmag`, `RPmag`: Photometric magnitudes (for color-magnitude diagrams).

In [7]:
df = result[0].to_pandas()
df.head(3)

Unnamed: 0,RA_ICRS,DE_ICRS,Source,e_RA_ICRS,e_DE_ICRS,Plx,e_Plx,PM,pmRA,e_pmRA,...,TYC2,URAT1,AllWISE,APASS9,GSC23,RAVE5,2MASS,RAVE6,RAJ2000,DEJ2000
0,56.299707,23.261001,64878779542177920,0.4013,0.2921,0.219,0.4557,2.859,2.721,0.595,...,,,,,NC3R001125,,,,56.299693,23.261005
1,56.311875,23.256156,64878779542180096,0.5357,0.3904,0.8742,0.6493,5.453,-1.735,0.871,...,,,,,NC3R002264,,,,56.311884,23.256179
2,56.305435,23.264843,64878779542260864,1.0335,0.7541,2.6486,1.1043,19.28,0.152,1.829,...,,,J034513.30+231553.6,,,,03451333+2315539,,56.305434,23.264929


In [None]:
columns_need = ['RA_ICRS', 'DE_ICRS', 'Plx', 'e_Plx', 'pmRA', 'pmDE', 'Gmag', 'BPmag', 'RPmag']
df = df[columns_need]
df.head()

Unnamed: 0,RA_ICRS,DE_ICRS,Plx,e_Plx,pmRA,pmDE,Gmag,BPmag,RPmag
0,56.299707,23.261001,0.219,0.4557,2.721,-0.875,19.586782,20.207155,18.863811
1,56.311875,23.256156,0.8742,0.6493,-1.735,-5.17,20.001284,20.874315,18.994102
2,56.305435,23.264843,2.6486,1.1043,0.152,-19.28,20.46428,21.097471,19.070179
3,56.288855,23.275964,0.1802,0.1463,6.824,-9.67,17.99754,18.467978,17.386631
4,56.27559,23.278851,-0.3159,0.9182,-2.95,-0.975,20.37425,20.880724,19.652622


### Step 4: Save the data

The cleaned dataset is saved to `../data/raw/` for reproducibility in subsequent analyses.

In [10]:
df.to_csv("../data/raw/pleiades_gaia_cleaned.csv", index=False)