# NGC 7000 Data Extraction and 3D Velocity Calculations
*Eric G. Suchanek, Ph.D. 9/26/2019*

This code performs a Gaia rectangular search centered on the North America Nebula (NGC 7000) and extracts stars with non-null pmra and pmdec fields and parallax > 0. The query strings are stored in the mirapy module, in the file *egs.py*, and can be easily modified.
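
To make the selection criteria concrete, here is a minimal sketch of how such a rectangular ADQL query could be assembled. The actual query strings live in *egs.py* and are not reproduced in this notebook, so the function names (`sexagesimal_to_degrees`, `build_box_query`), the column list, and the choice of the `gaiadr2.gaia_source` table are assumptions made for illustration only.

```python
def sexagesimal_to_degrees(whole, minutes, seconds, hours=False):
    """Convert a non-negative sexagesimal angle to decimal degrees.

    Hypothetical helper: set hours=True when the angle is given in hours
    (right ascension), so that it is multiplied by 15 deg/hour.
    """
    value = whole + minutes / 60.0 + seconds / 3600.0
    return value * 15.0 if hours else value


def build_box_query(ra_deg, dec_deg, width_deg, height_deg,
                    table="gaiadr2.gaia_source"):
    """Assemble an ADQL box query keeping only stars with usable astrometry.

    Assumption: this mirrors the criteria described above (non-null pmra and
    pmdec, parallax > 0); it is not the query stored in egs.py.
    """
    return (
        "SELECT source_id, ra, dec, parallax, pmra, pmdec "
        f"FROM {table} "
        "WHERE CONTAINS(POINT('ICRS', ra, dec), "
        f"BOX('ICRS', {ra_deg:.6f}, {dec_deg:.6f}, "
        f"{width_deg:.6f}, {height_deg:.6f})) = 1 "
        "AND pmra IS NOT NULL AND pmdec IS NOT NULL AND parallax > 0"
    )


# Center of NGC 7000 as used in this notebook: 20h58m47s, +44d19m48s
ra0 = sexagesimal_to_degrees(20, 58, 47, hours=True)
dec0 = sexagesimal_to_degrees(44, 19, 48)

# A 0.5 x 0.5 degree box around the center
query = build_box_query(ra0, dec0, 0.5, 0.5)
print(query)
```

The resulting string could be submitted to the Gaia archive as an asynchronous job; the sketch only builds the text and makes no network call.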

3D velocities were calculated per the formulae discussed at http://www.astronexus.com/a-a/motions-long-term. These values were then written out to a csv file together with the query results. The approximate center of NGC 7000 is used as the query center. The approximate distance to the nebula is 675 pc.
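
As an illustration of the kind of calculation involved, the sketch below applies the standard relation between proper motion, distance, and tangential velocity, v_t = 4.74047 · μ[arcsec/yr] · d[pc] km/s. It is not the mirapy implementation: the function name `pm_velocity` is hypothetical, the distance is taken naively as 1000/parallax (reasonable only for small parallax errors), and the radial velocity defaults to zero because it is an assumption here that most stars in the extract lack one.

```python
import numpy as np

# km/s corresponding to 1 arcsec/yr of proper motion at a distance of 1 pc
K_KMS = 4.74047


def pm_velocity(pmra, pmdec, parallax, radial_velocity=0.0):
    """Return distance (pc), PM position angle (deg), v_tan and v_3d (km/s).

    pmra and pmdec are in mas/yr (Gaia's pmra already includes the cos(dec)
    factor), parallax is in mas, radial_velocity is in km/s.
    Hypothetical helper, not part of mirapy.
    """
    pmra = np.asarray(pmra, dtype=float)
    pmdec = np.asarray(pmdec, dtype=float)
    parallax = np.asarray(parallax, dtype=float)

    # Naive distance estimate; valid only for parallax > 0
    distance_pc = 1000.0 / parallax

    # Total proper motion (mas/yr) and its position angle, east of north
    pm_total = np.hypot(pmra, pmdec)
    pm_angle = np.degrees(np.arctan2(pmra, pmdec)) % 360.0

    # Tangential velocity: convert mas/yr to arcsec/yr before applying K
    v_tan = K_KMS * (pm_total / 1000.0) * distance_pc

    # Combine with the radial component for the total space velocity
    v_3d = np.hypot(v_tan, radial_velocity)
    return distance_pc, pm_angle, v_tan, v_3d


# Example: pm = (3, 4) mas/yr, parallax = 2 mas, radial velocity = 5 km/s
dist, angle, v_tan, v_3d = pm_velocity(3.0, 4.0, 2.0, radial_velocity=5.0)
print(dist, angle, v_tan, v_3d)
```

The function accepts scalars or NumPy arrays, so it can be applied column-wise to the query results before they are written to the csv file.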

The code is sufficiently general to be reused to perform stellar extractions against the Gaia DR2 database, perform PM angle and 3D velocity calculations, and write the output to a well-defined file in .csv format. Simply set the center point and extents as well as the project_name variables.


In [None]:
# Author: Eric G. Suchanek, Ph.D., MIRA
# Set up the libraries and initialize any global variables.
# This cell must be run before any other cells.
#
# Suppress warnings. Comment this out if you wish to see the warning messages
import warnings
warnings.filterwarnings('ignore')

# Plotly is used in offline mode, since everything runs locally
#
import numpy as np
from pandas import DataFrame

import plotly.offline
import plotly_express as px

from time import time
# make sure the mirapy module is in the python module include path
from mirapy import *
import mirapy.egs as egs
import mirapy.utils as utils
from mirapy.utils import pprint_elapsed

# global variable used for pprinting the elapsed time. one of linux, imac, pc
_sys = imac_ubuntu

# Extract stars over a systematically varied extraction size,
# centered on NGC 7000 (the North America Nebula)

# Specify the path prefix (relative to the user's HOME directory) for the
# location of the extracted stars. The same prefix is needed when reading them back!

"""
import sys
if sys.platform == 'win32':
    project_dir = "\\MIRA\\data\\PM_subset\\"
    file_prefix = "NGC7000PM_"
else:
    project_dir = "/MIRA/data/PM_subset/"
    file_prefix = "NGC7000PM_"
"""

project_dir = "/MIRA/data/PM_subset/"
file_prefix = "NGC7000PM_"

# Center of NGC 7000
ra_center = "20h58m47s" 
dec_center = "44d19m48s" 

star_count_list = []
time_list = []

start = time()

# list of extract sizes. 5m to 120m in 5m steps:
size_list = [str(size)+"m" for size in range(5,125,5)]

# loop over various extraction sizes. note we are doing square extractions so width = height
print("Starting batch run for sizes: ", size_list)
for sz in size_list:
    start2 = time()
    print("Starting query, size:", sz, "x", sz)
    star_count, secs, output_filename = egs.extract_Gaia_stars(ra_center, dec_center,
                                                               sz, sz,
                                                               project_dir, file_prefix)
    print('-- Returned:', star_count, 'stars')
    pprint_elapsed(start2, arch=_sys)
    star_count_list.append(star_count)
    time_list.append(secs) 

print('Batch extract completed.')
pprint_elapsed(start, _sys)

In [7]:

df = DataFrame()
df['starcount'] = star_count_list
df['Size'] = size_list
df['Seconds'] = time_list

title_str = "Star Count vs Extent"
fig = px.scatter(df, x="Size", y="starcount", template='plotly_dark', title=title_str,
                log_y=True)
fig.show()


title_str = "Query Time vs Extent"
fig2 = px.scatter(df, x="Size", y="Seconds", template='plotly_dark', title=title_str,
                 log_y=True)
fig2.show()
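
One way to read the star count plot quantitatively is to fit a straight line in log-log space: for a uniform stellar density the count should grow with the box area, giving a slope near 2. The sketch below assumes size labels of the form "5m" as built above; the helper name `loglog_slope` is hypothetical, and the counts are synthetic stand-ins rather than results from the Gaia query.

```python
import numpy as np


def loglog_slope(sizes, counts):
    """Fit log10(count) against log10(extent) and return the slope.

    sizes: labels such as "5m" (a number followed by the unit letter).
    counts: positive star counts, one per size.
    Hypothetical helper, not part of mirapy.
    """
    extents = np.array([float(s.rstrip("m")) for s in sizes])
    log_counts = np.log10(np.asarray(counts, dtype=float))
    slope, _intercept = np.polyfit(np.log10(extents), log_counts, 1)
    return slope


# Same size list as the batch run: 5m to 120m in 5m steps
sizes = [str(size) + "m" for size in range(5, 125, 5)]

# SYNTHETIC counts that grow exactly with area, for illustration only
synthetic_counts = [3.0 * size**2 for size in range(5, 125, 5)]

slope = loglog_slope(sizes, synthetic_counts)
print("log-log slope:", round(slope, 3))
```

With real data, `star_count_list` would replace the synthetic counts; a slope noticeably different from 2 would suggest that the stellar density varies across the field.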

In [None]:
# just trying out the Gaia class extract routine...
from astroquery.gaia import Gaia
import astropy.units as u

start = time()
result = Gaia.query_object_async("20h58m47s 44d19m48s", width=0.5*u.deg,
                                 height=0.5*u.deg, verbose=True)
print('Stars returned:', len(result))
pprint_elapsed(start, _sys)