# Galaxies in VVV

## Machine Learning classification using stars

We use a catalog of galaxies identified in VVV in tiles **d010** and **d115**, compiled by Baravalle L.

To locate these tiles on the sky we use the VVV map:

<img src='./imgs/survey-area-tile-nbrs-copy2.jpg'>

In these tiles, 574 objects were found with morphological, photometric and photochromatic properties typical of galaxies. Of these, 90 have been visually inspected and constitute a *bona fide* sample of galaxies in the VVV.

## Data analysis

First we load the required libraries:

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from IPython.display import display

from astropy.io import ascii
from astropy.table import Table, Column

%matplotlib inline

In [2]:
from astroquery.irsa_dust import IrsaDust

We read the catalog of remaining sources ("restos") of tile d010:

In [3]:
colnames = "ALPHA  DELTA  MAG_PSF_Ks  MAGERR_PSF_Ks  MAG_AUTO_Ks MAGERR_AUTO_Ks MAG_APER_Ks MAGERR_APER_Ks MAG_MODEL_Ks MAGERR_MODEL_Ks SPREAD_MODEL AMODEL_IMAGE BMODEL_IMAGE ELONGATION ELLIPTICITY A_IMAGE B_IMAGE KRON_RADIUS FLUX_RADIUS_02 FLUX_RADIUS_051 FLUX_RADIUS_08 SPHEROID_SERSICN CLASS_STAR MAG_PSF_H MAGERR_PSF_H MAG_AUTO_H MAGERR_AUTO_H MAG_APER_H MAGERR_APER_H MAG_PSF_J MAGERR_PSF_J MAG_AUTO_J MAGERR_AUTO_J MAG_APER_J MAGERR_APER_J C".split()
print(colnames)

['ALPHA', 'DELTA', 'MAG_PSF_Ks', 'MAGERR_PSF_Ks', 'MAG_AUTO_Ks', 'MAGERR_AUTO_Ks', 'MAG_APER_Ks', 'MAGERR_APER_Ks', 'MAG_MODEL_Ks', 'MAGERR_MODEL_Ks', 'SPREAD_MODEL', 'AMODEL_IMAGE', 'BMODEL_IMAGE', 'ELONGATION', 'ELLIPTICITY', 'A_IMAGE', 'B_IMAGE', 'KRON_RADIUS', 'FLUX_RADIUS_02', 'FLUX_RADIUS_051', 'FLUX_RADIUS_08', 'SPHEROID_SERSICN', 'CLASS_STAR', 'MAG_PSF_H', 'MAGERR_PSF_H', 'MAG_AUTO_H', 'MAGERR_AUTO_H', 'MAG_APER_H', 'MAGERR_APER_H', 'MAG_PSF_J', 'MAGERR_PSF_J', 'MAG_AUTO_J', 'MAGERR_AUTO_J', 'MAG_APER_J', 'MAGERR_APER_J', 'C']


In [4]:
d010 = ascii.read('./restos/RESTO_d010.cat', names=colnames)

In [5]:
d010

ALPHA,DELTA,MAG_PSF_Ks,MAGERR_PSF_Ks,MAG_AUTO_Ks,MAGERR_AUTO_Ks,MAG_APER_Ks,MAGERR_APER_Ks,MAG_MODEL_Ks,MAGERR_MODEL_Ks,SPREAD_MODEL,AMODEL_IMAGE,BMODEL_IMAGE,ELONGATION,ELLIPTICITY,A_IMAGE,B_IMAGE,KRON_RADIUS,FLUX_RADIUS_02,FLUX_RADIUS_051,FLUX_RADIUS_08,SPHEROID_SERSICN,CLASS_STAR,MAG_PSF_H,MAGERR_PSF_H,MAG_AUTO_H,MAGERR_AUTO_H,MAG_APER_H,MAGERR_APER_H,MAG_PSF_J,MAGERR_PSF_J,MAG_AUTO_J,MAGERR_AUTO_J,MAG_APER_J,MAGERR_APER_J,C
float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64
203.9948,-63.7845,17.0627,0.06,17.1152,0.07,17.2994,0.07,16.701,0.1,0.0021,9.35,0.148,2.04,0.509,1.54,0.757,3.5,0.903,0.53,2.57,9.894,0.84,17.82,0.0609,17.75,0.0811,17.88,0.0668,18.46,0.0841,18.71,0.0933,18.91,0.0895,2.273
204.0021,-63.8033,17.5726,0.09,18.047,0.12,18.0957,0.13,17.4643,0.1,-0.0046,0.75,0.445,1.19,0.16,0.81,0.68,3.5,0.643,0.37,1.49,10.0,0.795,17.99,0.0705,18.06,0.0952,18.08,0.0822,19.1,0.1176,19.43,0.1633,19.42,0.1408,1.822
204.0225,-63.8555,17.1464,0.11,17.3794,0.12,17.4788,0.08,16.839,0.74,0.0073,6.42,1.287,2.11,0.525,1.57,0.748,4.8,1.044,0.59,2.55,3.939,0.608,17.89,0.073,17.59,0.0957,18.02,0.0782,19.15,0.1131,18.76,0.1379,19.18,0.1147,1.939
204.047,-63.9181,17.1915,0.07,17.5174,0.09,17.5174,0.08,16.9157,0.27,0.0022,1.45,0.329,1.59,0.373,1.21,0.762,3.5,0.757,0.43,1.91,5.055,0.893,16.01,0.0271,16.43,0.0303,16.59,0.0223,18.18,0.0969,18.38,0.0895,18.67,0.0741,2.011
204.0511,-63.9287,17.8882,0.1,18.2675,0.14,18.3841,0.17,16.4123,0.41,-0.0083,506.35,5.461,1.22,0.184,0.82,0.67,3.5,0.611,0.36,1.4,9.995,0.627,18.19,0.0818,18.27,0.1188,18.31,0.1068,18.77,0.0869,19.03,0.1075,19.03,0.1013,1.797
204.058,-63.9462,16.9369,0.06,17.3914,0.08,17.4016,0.07,16.7857,0.06,-0.0013,0.3,0.289,1.58,0.366,1.2,0.759,3.5,0.66,0.4,1.67,1.09,0.948,17.17,0.0379,17.35,0.0574,17.41,0.0449,18.48,0.09,18.84,0.1145,18.95,0.0933,2.011
204.1018,-64.0565,18.0436,0.12,18.1416,0.14,18.1739,0.16,17.4578,0.67,-0.0037,11.6,0.444,1.26,0.21,0.84,0.665,3.5,0.739,0.44,1.97,5.585,0.605,18.17,0.0788,17.93,0.1047,18.17,0.0942,18.79,0.0779,18.56,0.1161,18.81,0.0876,2.127
204.0016,-63.8018,16.6425,0.04,16.7536,0.06,16.8451,0.05,16.3198,0.23,0.0006,2.24,0.465,1.5,0.332,1.43,0.954,3.5,0.828,0.48,2.16,5.807,0.991,16.81,0.0262,16.69,0.0384,16.87,0.0289,17.86,0.0481,18.11,0.0599,18.21,0.0506,2.082
204.069,-63.9741,16.4114,0.04,16.8186,0.05,16.8528,0.05,16.2626,0.04,-0.001,0.31,0.294,1.55,0.355,1.33,0.859,3.5,0.708,0.4,1.78,0.491,0.998,16.15,0.019,16.44,0.0268,16.46,0.0202,17.54,0.0397,17.68,0.0504,17.88,0.0394,2.001
204.0914,-64.0307,17.5289,0.07,17.4742,0.1,17.5019,0.08,16.9396,0.56,0.0033,25.59,0.277,1.12,0.105,1.1,0.987,3.5,0.866,0.5,2.14,9.293,0.081,18.33,0.0924,18.42,0.1291,18.45,0.1228,18.76,0.0777,18.78,0.0956,18.8,0.0868,1.967


We test the IRSA dust extinction interface using tables:

In [54]:
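# Keep only the coordinates needed by the IRSA dust service.
coord_table = d010[['ALPHA', 'DELTA']]

# Draw 20000 random row indices (np.random.choice samples with replacement by default).
rows_d010 = np.random.choice(len(coord_table), 20000)

submit_tab = coord_table[rows_d010]

# Constant 'size' of 2.0 for every row (range replaces the Python 2-only xrange).
submit_tab.add_column(Column(data=[2. for i in range(20000)], name='size'))

submit_tab.write('extinction_tab_d010.dat', format='ipac', names=['ra', 'dec', 'size'], overwrite=True)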

We uploaded the table written in IPAC format as a test, and it works.

The amount of data is huge (750k rows), which forces us to adopt a different strategy.

### Let's try astroquery

`astroquery` is used to query astronomical databases, following the Astropy philosophy.

To do so, we create the function `dered`, which takes one row of a table and applies the extinction correction using its coordinates and a query to the IRSA database.
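In equation form, the correction applied by `dered` can be summarized as follows. The coefficients are read directly from the code in the next cell; interpreting the factor 3.1 as the usual total-to-selective extinction ratio $R_V$ is an assumption, since the notebook does not say so explicitly.

$$
A_V = 3.1 \, E(B-V)_{\mathrm{SandF}}, \qquad
A_J = 0.28 \, A_V, \quad A_H = 0.184 \, A_V, \quad A_{K_s} = 0.118 \, A_V
$$

$$
m_X^{C} = m_X - A_X, \qquad X \in \{J, H, K_s\}
$$

Here $m_X$ stands for either the PSF or the aperture magnitude in band $X$, and $m_X^{C}$ is the corresponding column with the `_C` suffix.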

In [10]:
from retrying import retry

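# Retry the remote query up to 7 times before giving up.
@retry(stop_max_attempt_number=7)
def av(obj):
    # 'ext SandF mean' scaled by 3.1 is used as A_V.
    return IrsaDust.get_query_table(obj, section='ebv')['ext SandF mean']*3.1

def dered(row):
    # Build the "ra dec" query string from the row coordinates.
    obj = str(row['ALPHA'])+' '+str(row['DELTA'])

    av_SanF = av(obj)
    # Per-band extinction as a fraction of A_V.
    AJ=0.28*av_SanF
    AH=0.184*av_SanF
    AKs=0.118*av_SanF

    # Store the dereddened magnitudes in the *_C columns of the row.
    row['MAG_PSF_Ks_C']=row['MAG_PSF_Ks'] - AKs
    row['MAG_APER_Ks_C']=row['MAG_APER_Ks'] - AKs
    row['MAG_PSF_J_C']=row['MAG_PSF_J'] - AJ
    row['MAG_APER_J_C']=row['MAG_APER_J'] - AJ
    row['MAG_PSF_H_C']=row['MAG_PSF_H'] - AH
    row['MAG_APER_H_C']=row['MAG_APER_H'] - AH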

In [11]:
# Note: test_table is defined in cell In [13] below, so this cell was run out of order.
obj = str(test_table['ALPHA'][0])+' '+str(test_table['DELTA'][0])
av_SanF = IrsaDust.get_query_table(obj, section='ebv')['ext SandF mean']*3.1

We leave the correction of the small test table running.

In [12]:
from log_progress import log_progress

In [13]:
test_table = d010[0:200]

test_table['MAG_PSF_Ks_C']  = np.zeros(len(test_table))
test_table['MAG_APER_Ks_C'] = np.zeros(len(test_table))
test_table['MAG_PSF_J_C']   = np.zeros(len(test_table))
test_table['MAG_APER_J_C']  = np.zeros(len(test_table))
test_table['MAG_PSF_H_C']   = np.zeros(len(test_table))
test_table['MAG_APER_H_C']  = np.zeros(len(test_table))

%time for arow in log_progress(test_table, every=1): dered(arow)

CPU times: user 16.9 s, sys: 228 ms, total: 17.2 s
Wall time: 5min 4s


In [14]:
test_table.write('corrected_resto_d010.dat', format='ipac')

We see that it takes far too long to process just 200 rows.

It is important to find out which step takes the longest.
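The `%time` output above already gives a hint. The short calculation below is a rough sketch based only on the reported timings and on the 750k-row figure quoted earlier; it assumes the per-row rate would stay constant over the full catalog, which is not verified here.

```python
# Timings reported by %time for the 200-row test above.
n_rows = 200
wall_s = 5 * 60 + 4      # "Wall time: 5min 4s"
cpu_s = 17.2             # "CPU times: ... total: 17.2 s"

per_row_s = wall_s / float(n_rows)
cpu_fraction = cpu_s / wall_s

# Extrapolation to the full catalog, assuming ~750k rows at the same rate.
full_rows = 750000
full_days = per_row_s * full_rows / 86400.0

print("seconds per row: %.2f" % per_row_s)
print("CPU share of wall time: %.1f%%" % (100 * cpu_fraction))
print("estimated days for 750k rows: %.1f" % full_days)
```

Only a small share of the wall time is CPU time, so most of it is spent waiting. With one remote query per row, that wait is presumably dominated by the round trip to IRSA rather than by the arithmetic in `dered`.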

Finally, in the cells above, 20000 sample objects were selected from tile d010 for the extinction correction. Now 20000 more objects will be selected from tile d115.

In [35]:
d115 = ascii.read('./restos/RESTO_d115.cat', names=colnames)

Below is the cell used to compute the objects to correct. It is now frozen so that the file does not get overwritten.
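The frozen cell itself is not reproduced here. As a possible alternative to freezing it, the sketch below stores the sampled row indices on disk and reuses them on later runs, so rerunning the cell cannot change the selection. The helper `select_rows`, the file name and the seed are all hypothetical and are not part of the original notebook.

```python
import os
import tempfile

import numpy as np

def select_rows(n_total, n_sample, index_file, seed=0):
    """Return sampled row indices, reusing `index_file` if it already exists.

    `index_file` should end in '.npy', because np.save appends that
    extension otherwise and the existence check would never succeed.
    """
    if os.path.exists(index_file):
        return np.load(index_file)
    rng = np.random.RandomState(seed)
    rows = rng.choice(n_total, n_sample)  # with replacement, like np.random.choice
    np.save(index_file, rows)
    return rows

# Demo in a temporary directory: the second call ignores the new seed
# because the index file written by the first call already exists.
demo_dir = tempfile.mkdtemp()
index_file = os.path.join(demo_dir, 'rows_demo.npy')
first = select_rows(1000, 20, index_file, seed=1)
second = select_rows(1000, 20, index_file, seed=2)
print((first == second).all())
```

In the notebook this might be called as `select_rows(len(d115), 20000, 'rows_d115.npy')`, with the file name again being an assumption.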

## Correction of magnitudes

Now we can apply the correction, using the result tables from IRSA:

In [57]:
d010 = ascii.read('./restos/RESTO_d010.cat', names=colnames)[rows_d010]
d115 = ascii.read('./restos/RESTO_d115.cat', names=colnames)[rows_d115]

In [61]:
exct_d010 = ascii.read('extinction_d010.tbl', format='ipac')
exct_d115 = ascii.read('extinction_d115.tbl', format='ipac')
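The column-wise correction that follows only makes sense if each row of the extinction table refers to the same object as the corresponding catalog row. That IRSA returns rows in the order they were submitted is an assumption here, so a quick coordinate comparison is a cheap safeguard. The helper `rows_aligned` and its tolerance are hypothetical; the sketch also ignores RA wrap-around at 0/360 degrees.

```python
import numpy as np

def rows_aligned(cat_ra, cat_dec, ext_ra, ext_dec, tol_deg=1e-4):
    """True if both tables have equal length and matching coordinates row by row."""
    cat_ra = np.asarray(cat_ra, dtype=float)
    cat_dec = np.asarray(cat_dec, dtype=float)
    ext_ra = np.asarray(ext_ra, dtype=float)
    ext_dec = np.asarray(ext_dec, dtype=float)
    if cat_ra.shape != ext_ra.shape or cat_dec.shape != ext_dec.shape:
        return False
    same_ra = np.all(np.abs(cat_ra - ext_ra) <= tol_deg)
    same_dec = np.all(np.abs(cat_dec - ext_dec) <= tol_deg)
    return bool(same_ra and same_dec)

# Toy coordinates taken from the first two rows of the d010 listing above.
ra = [203.9948, 204.0021]
dec = [-63.7845, -63.8033]
print(rows_aligned(ra, dec, ra, dec))              # same order
print(rows_aligned(ra, dec, ra[::-1], dec[::-1]))  # shuffled order
```

With the real tables the call would look like `rows_aligned(d115['ALPHA'], d115['DELTA'], exct_d115['ra'], exct_d115['dec'])`.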

The correction then takes the following form:

In [70]:
exct_d115.colnames

['objname',
 'ra',
 'dec',
 'cutout_size',
 'E_B_V_SandF',
 'mean_E_B_V_SandF',
 'stdev_E_B_V_SandF',
 'max_E_B_V_SandF',
 'min_E_B_V_SandF',
 'AV_SandF',
 'E_B_V_SFD',
 'mean_E_B_V_SFD',
 'stdev_E_B_V_SFD',
 'max_E_B_V_SFD',
 'min_E_B_V_SFD',
 'AV_SFD',
 'errmsg']

In [71]:
d115['MAG_PSF_Ks_C']=d115['MAG_PSF_Ks'] - exct_d115['AV_SandF']*0.118
d115['MAG_APER_Ks_C']=d115['MAG_APER_Ks'] - exct_d115['AV_SandF']*0.118

In [72]:
d115['MAG_PSF_J_C']=d115['MAG_PSF_J'] - exct_d115['AV_SandF']*0.28
d115['MAG_APER_J_C']=d115['MAG_APER_J'] - exct_d115['AV_SandF']*0.28

In [73]:
d115['MAG_PSF_H_C']=d115['MAG_PSF_H'] - exct_d115['AV_SandF']*0.184
d115['MAG_APER_H_C']=d115['MAG_APER_H'] - exct_d115['AV_SandF']*0.184
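The six assignments above follow a single pattern, so they could be condensed into a loop that also works for d010. The following is a minimal sketch, not the notebook's own code: `apply_extinction` and `EXT_COEFF` are hypothetical names, and the toy input uses an invented value of A_V = 1.0 purely for illustration.

```python
import numpy as np

# A_band / A_V ratios taken from the cells above.
EXT_COEFF = {'J': 0.28, 'H': 0.184, 'Ks': 0.118}

def apply_extinction(table, av, kinds=('PSF', 'APER')):
    """Add MAG_<kind>_<band>_C columns to `table`, in place.

    `table` only needs dict-style column access, so an astropy Table
    or a plain dict of arrays should both work.
    """
    av = np.asarray(av, dtype=float)
    for band, coeff in EXT_COEFF.items():
        for kind in kinds:
            src = 'MAG_%s_%s' % (kind, band)
            table[src + '_C'] = np.asarray(table[src], dtype=float) - coeff * av
    return table

# Toy table with the magnitudes of the first d010 row and a made-up A_V.
toy = {
    'MAG_PSF_J': np.array([18.46]), 'MAG_APER_J': np.array([18.91]),
    'MAG_PSF_H': np.array([17.82]), 'MAG_APER_H': np.array([17.88]),
    'MAG_PSF_Ks': np.array([17.0627]), 'MAG_APER_Ks': np.array([17.2994]),
}
apply_extinction(toy, av=[1.0])
print(sorted(name for name in toy if name.endswith('_C')))
```

Applied to the real data this would read `apply_extinction(d115, exct_d115['AV_SandF'])`, and likewise for d010 with `exct_d010`.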