# Building selection functions for Galaxia 3
Calculate the observed and intrinsic selection functions using Galaxia3 mock photometric and spectroscopic catalogues.

In [1]:
# Load modules
%load_ext autoreload
%autoreload 2

import os, sys
import numpy as np
import pandas as pd

def ErrorWarning():
    raise ValueError("Wait, don't run this, you don't want to!")

## 1. Create new data directory

In [2]:
path = "/<path to directory...>/"

In [3]:
from seestar import createNew
createNew.create()

KeyboardInterrupt: 

The newly created folder contains templates of all the files that are needed for calculating the selection function. Once you have created this folder, you must replace the template files with real field files:
* Spectroscopic catalogue (SURVEY_survey.csv). A comma-separated file with at least the following six columns (appropriately labelled): galactic longitude in radians ('glon'), galactic latitude in radians ('glat'), apparent magnitudes ('Happ', 'Japp', 'Kapp') and the field ID tag of the star ('fieldID'). The file can have other columns too, but they won't be used.
* Locations and IDs of the spectroscopic field pointings (SURVEY_fieldinfo.csv). This file gives the central galactic longitude in radians ('glon') and galactic latitude in radians ('glat') of each field, the half angle of the field in radians ('halfangle') and the colour and magnitude limits imposed by the spectroscopic survey ('Magmin', 'Magmax', 'Colmin', 'Colmax'). Where no limit is imposed, write "NoLimit".
* PARSEC isochrone files can be extracted from [isoPARSEC.tar.gz](https://drive.google.com/drive/folders/1mz09FRP6hJPo1zPBJHP1T0BNhtDOkdGs?usp=sharing). Move isochrone_interpolantinstances.pickle into the isochrones/ folder. Without the isochrones, you can still generate the selection function in observable coordinates.
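
As an illustration of the file layouts described above, the following sketch builds a toy spectroscopic catalogue and a toy field information file with pandas and writes them as CSV. All values are made up, and the file names `Toy_survey.csv` and `Toy_fieldinfo.csv` are hypothetical stand-ins for the SURVEY_ templates; the 'fieldID' column in the field file is assumed from the column headers printed by the file info object.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Toy values for illustration only; a real catalogue comes from the survey.
rng = np.random.default_rng(0)
n_stars = 5

survey = pd.DataFrame({
    "fieldID": np.full(n_stars, 1.0),
    "glon": rng.uniform(0.0, 2.0 * np.pi, n_stars),        # radians
    "glat": rng.uniform(-np.pi / 2, np.pi / 2, n_stars),   # radians
    "Japp": rng.uniform(10.0, 14.0, n_stars),
    "Kapp": rng.uniform(10.0, 14.0, n_stars),
    "Happ": rng.uniform(10.0, 14.0, n_stars),
})

# One field pointing; "NoLimit" marks limits the survey does not impose.
fieldinfo = pd.DataFrame({
    "fieldID": [1.0],
    "glon": [np.pi / 2],              # radians
    "glat": [0.0],                    # radians
    "halfangle": [np.radians(1.0)],   # 1 degree, stored in radians
    "Magmin": ["NoLimit"],
    "Magmax": [13.5],
    "Colmin": [0.2],
    "Colmax": ["NoLimit"],
})

# Write both files to a temporary folder and read them back as a check.
with tempfile.TemporaryDirectory() as folder:
    survey_file = os.path.join(folder, "Toy_survey.csv")        # hypothetical name
    field_file = os.path.join(folder, "Toy_fieldinfo.csv")      # hypothetical name
    survey.to_csv(survey_file, index=False)
    fieldinfo.to_csv(field_file, index=False)
    survey_back = pd.read_csv(survey_file)
    fieldinfo_back = pd.read_csv(field_file)
```

Converting the half angle with `np.radians` at the point of creation is one way to avoid accidentally storing degrees.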

### Check pickled instance of file locations

The newly created folder also has a file called SURVEY_fileinfo.pickle, which stores all the file locations and model setup parameters for evaluating the selection function. You can test the locations and print the values of all the parameters.

In [3]:
from seestar import surveyInfoPickler
fileinfo_path = os.path.join(path, 'Galaxia3_sf/Galaxia3_sf_fileinfo.pickle')
fileinfo = surveyInfoPickler.surveyInformation(fileinfo_path)

# Print values for the file locations
fileinfo.printValues()

Filename for spectroscopic catalogue:
     spectro_path: /home/andy/Documents/Research/SF/CodeDev/Notebooks/091118Tests/Galaxia3_sf/Galaxia3_sf_survey.csv
Filename for spectroscopic field pointings:
     field_path: /home/andy/Documents/Research/SF/CodeDev/Notebooks/091118Tests/Galaxia3_sf/Galaxia3_sf_fieldinfo.csv

Folder containing photometric samples in the spectroscopic field pointings:
     photo_path: /home/andy/Documents/Research/SF/CodeDev/Notebooks/091118Tests/Galaxia3_sf/photometric
Column headers and dtypes for spectroscopic files [ fieldID, Phi, Th, magA, magB, magC]
where magA-magB = Colour, magC = m (for selection limits):
     spectro_coords: ['fieldID', 'glon', 'glat', 'Japp', 'Kapp', 'Happ']
     spectro_dtypes: [<type 'numpy.float64'>, <type 'float'>, <type 'float'>, <type 'float'>, <type 'float'>, <type 'float'>]
Column headers for field pointing information [fieldID, Phi, Th, halfangle, Magmin, Magmax, Colmin, Colmax]:
     field_coords: ['fieldID', 'glon', 'glat', 

In [5]:
# Model for evaluating selection function
fileinfo.spectro_model = ('GMM', 10)
fileinfo.photo_model   = ('GMM', 10)

In [4]:
# Test and save
fileinfo.testFiles()
fileinfo.save()

1) Checking file paths exist:
OK

2) Checking spectroscopic catalogue file structure:
OK

3) Checking field information file structure:
(make sure halfangle is in units of radians.)
OK

4) Checking photometric catalogue file structure:
Checking 2.0.csv:
OK

5) Checking selection function pickle paths exist:
OK

6) Checking isochrone pickle files exist:
The premade interpolants (isochrone_interpolantinstances.pickle) will be automatically be used to calculate the selection function.


In [50]:
from seestar import FieldAssignment

fileinfo_path = os.path.join(fileinfo.survey_folder, "Galaxia3_sf_fileinfo.pickle")
# Galaxia 'photometric' files
files = [os.path.join(fileinfo.photo_path, "Galaxia_subset_"+str(n)+".csv") for n in range(1, 3)]
FA = FieldAssignment.FieldAssignment(fileinfo_path, files)
# ncores=1: SF will run the Serial version (not use multiprocessing)
# ncores>1: Parallel version on the number of cores given
# When running in parallel, the field counting can mess up a bit because the pools get disordered

# You can also specify memory= as an additional kwarg
# If memory is given: it's currently set up to use 5% of the given memory
# If no memory kwarg is given: it will use 5% of the memory available on the computer (not sure how this works on a cluster)

Checking photometric catalogue file structure:
Checking /home/andy/Documents/Research/SF/CodeDev/Notebooks/091118Tests/Galaxia3_sf/photometric/Galaxia_subset_1.csv:
File OK

Counting total number of stars..done
Total number of stars 2652527.
Importing 2417400 stars at a time. Iterating 1290982 stars at a time.
Field file path for field 1.0: /home/andy/Documents/Research/SF/CodeDev/Notebooks/091118Tests/Galaxia3_sf/photometric/1.0.csv
Clearing field files...
...done

File: Galaxia_subset_1.csv  Complete: 1500000/2652527(56.55%)  Time: 0.5m  Projected: 0h0m...Saving: 3/3       /1152525        
Total stars assigned to fields: 2022647.
Dictionary of stars per field in fileinfo.photo_field_starcount.
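
Field assignment amounts to deciding which photometric stars fall inside each circular field pointing. The sketch below shows one way this could be done: compute the great-circle separation between each star and the field centre and compare it with the half angle. The function names are hypothetical and this is not seestar's implementation, only an illustration of the geometry with all angles in radians.

```python
import numpy as np

def angular_separation(glon1, glat1, glon2, glat2):
    """Great-circle separation in radians; all inputs in radians."""
    cos_sep = (np.sin(glat1) * np.sin(glat2)
               + np.cos(glat1) * np.cos(glat2) * np.cos(glon1 - glon2))
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(cos_sep, -1.0, 1.0))

def in_field(star_glon, star_glat, field_glon, field_glat, halfangle):
    """Boolean mask of stars within `halfangle` of the field centre."""
    sep = angular_separation(star_glon, star_glat, field_glon, field_glat)
    return sep <= halfangle

# Toy stars: two close to the field centre at (1.0, 0.0), one far away.
star_glon = np.array([1.00, 1.01, 2.00])
star_glat = np.array([0.00, 0.005, 0.50])
mask = in_field(star_glon, star_glat, 1.0, 0.0, np.radians(1.0))
```

With overlapping fields, a star can satisfy this test for more than one pointing, which is why duplicates appear in fields 2 and 3 later in the notebook.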


## 2. Calculate observed and intrinsic selection functions

In [54]:
from seestar import SelectionGrid
SF = SelectionGrid.SFGenerator(fileinfo_path,ncores=3)
# ncores = 0: SF will run the Serial version (not use multiprocessing)
# ncores > 0: Parallel version on the number of cores given
# When running in parallel, the field counting can mess up a bit because the pools get disordered

Would you like the selection function in: a) observable, b) intrinsic, c) both? (return a, b or c)c
Path to intrinsic SF (Galaxia3_sf_SF.pickle) exists. Load SF in from here? (y/n)n
Will generate a new intrinsic SF.
No observed SF found at /home/andy/Documents/Research/SF/CodeDev/Notebooks/091118Tests/Galaxia3_sf/Galaxia3_sf_obsSF.pickle, we'll build observable seleciton function from scratch.
Path to interpolated isochrones (isochrone_interpolantinstances.pickle) exists. These will be used.
The spectro model description is:('GMM', 1)
The photo model description is:('GMM', 2)

{'glon': 'phi', 'Colmin': 'Colmin', 'glat': 'theta', 'Colmax': 'Colmax', 'Magmin': 'Magmin', 'Magmax': 'Magmax', 'fieldID': 'fieldID', 'halfangle': 'halfangle'}
Creating distance-age-metallicity interpolants...
Importing colour-magnitude isochrone interpolants...
...done
...done.

Importing data for colour-magnitude field interpolants...
{'Kapp': 'appB', 'Happ': 'appC', 'glon': 'phi', 'glat': 'theta', 'Japp': 'ap
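
Conceptually, a selection function in observable coordinates describes the fraction of photometric stars at a given colour and magnitude that also appear in the spectroscopic catalogue. The sketch below estimates that fraction on a coarse grid with plain histograms. It swaps in binning for the GMM models configured above and uses synthetic data with a made-up selection rule, so it is an illustration of the idea only, not the method seestar uses.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic photometric sample (toy numbers).
n_photo = 20000
colour = rng.normal(0.6, 0.2, n_photo)
mag = rng.uniform(8.0, 14.0, n_photo)

# Made-up selection rule: brighter stars are more likely to get a spectrum.
true_prob = 0.5 * np.clip((14.0 - mag) / 6.0, 0.0, 1.0)
observed = rng.random(n_photo) < true_prob

# Coarse colour-magnitude grid.
mag_edges = np.linspace(8.0, 14.0, 7)
col_edges = np.linspace(-0.2, 1.4, 5)

photo_counts, _, _ = np.histogram2d(mag, colour, bins=[mag_edges, col_edges])
spectro_counts, _, _ = np.histogram2d(mag[observed], colour[observed],
                                      bins=[mag_edges, col_edges])

# Fraction observed per cell; cells with no photometric stars are set to 0.
sf_grid = np.divide(spectro_counts, photo_counts,
                    out=np.zeros_like(photo_counts),
                    where=photo_counts > 0)
```

A smooth density model avoids the noise that binning produces in sparsely populated cells, which is one reason to prefer a mixture model over histograms.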

## 3. Calculate selection function probabilities (assuming duplicates still in samples)

In [None]:
from seestar import IsochroneScaling
# Load in Isochrone Calculator
IsoCalculator = IsochroneScaling.IntrinsicToObservable()
IsoCalculator.LoadColMag(SF.isocolmag_pickle)

In [55]:
# Load in spectroscopic data
survey = pd.read_csv(fileinfo.spectro_path)

# Separate spectroscopic data into field 1 and the overlapping fields 2 and 3
survey1 = survey[survey[fileinfo.spectro_coords[0]] == 1.0]
survey23 = survey[survey[fileinfo.spectro_coords[0]].isin([2.0, 3.0])]

In [56]:
# Load in photometric data in fields 1, 2, and 3
full1 = pd.read_csv(os.path.join(fileinfo.photo_path, str(1.0))+'.csv')
full1['Colour'] = full1[fileinfo.photo_coords[2]]-full1[fileinfo.photo_coords[3]]
full1.rename(index=str, columns={'feh':'mh', 'smass':'mass','rad':'s'}, inplace=True)

# Concatenate the photometric files of the overlapping fields 2 and 3
fields = [2.0, 3.0]
full23 = pd.concat([pd.read_csv(os.path.join(fileinfo.photo_path, str(field))+'.csv') for field in fields])
full23['Colour'] = full23[fileinfo.photo_coords[2]]-full23[fileinfo.photo_coords[3]]
full23.rename(index=str, columns={'feh':'mh', 'smass':'mass','rad':'s'}, inplace=True)

In [61]:
fileinfo.iso_data_path

'/home/andy/Documents/Research/SF/CodeDev/Notebooks/091118Tests/Galaxia3_sf/isochrones/iso_fulldata.pickle'

In [6]:
# Recalculate colour and apparent magnitudes
full23['Colour'], full23['Happ'] = IsoCalculator.ColourMapp(full23.ageGyr, full23.mh, full23.mass, full23.s)
full1['Colour'], full1['Happ'] = IsoCalculator.ColourMapp(full1.ageGyr, full1.mh, full1.mass, full1.s)

/home/andy/Documents/Research/SF/CodeDev/Notebooks/091118Tests/Galaxia3_sf/isochrones/iso_fulldata.pickle
Unpickling isochrone dictionaries...
...done.

Scaled masses spacing: 235 / 353



Scaled masses spacing: 352 / 353
Scaled masses points: 999 / 1000
nage, nmh, nmass: 353, 54, 500
((0.0, 13.220006551228417), (-2.2435283018867924, 0.5905283018867925), (0.0, 1.0))


NameError: name 'full23' is not defined
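
The recalculation step maps intrinsic properties (age, metallicity, mass, distance) to colour and apparent magnitude. The distance part of that mapping is the standard distance modulus. The sketch below assumes the distance `s` is in kpc (the usual unit of Galaxia's `rad` column, which is renamed to `s` above) and ignores extinction; the function name is hypothetical and how `ColourMapp` handles this internally is not shown in this notebook.

```python
import numpy as np

def apparent_magnitude(abs_mag, s_kpc):
    """Apparent magnitude from absolute magnitude and distance in kpc.

    Uses m = M + 5*log10(d / 10 pc), which for d in kpc becomes
    m = M + 5*log10(s_kpc) + 10. Extinction is ignored.
    """
    return np.asarray(abs_mag) + 5.0 * np.log10(np.asarray(s_kpc)) + 10.0

# A star with absolute magnitude 0 at 10 pc, 1 kpc and 10 kpc.
m = apparent_magnitude(0.0, np.array([0.01, 1.0, 10.0]))
```

Colour is a difference of two magnitudes in different bands, so without extinction the distance term cancels and the colour does not depend on `s`.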

In [24]:
# Calculate observable SF probabilities for field 1
full1 = SF(full1, method='observable', coords=['Happ', 'Colour'], angle_coords=['glon', 'glat'])
full1['union_obs'] = full1.union

Calculating all SF values...
...Assigning: 1969303/1969303        
Calculating: 1969303/1969303        ...done
Calculating union contribution...
...done


In [13]:
# Calculate intrinsic SF probabilities for field 1
full1 = SF(full1, method='intrinsic', coords=['ageGyr', 'mh', 's', 'mass'], angle_coords=['glon', 'glat'])
full1['union_int'] = full1.union

Calculating all SF values...
...Assigning: 1969303/1969303        
Calculating: 1556567/1969303        

AttributeError: SFGenerator instance has no attribute 'instanceSF'

In [25]:
# Calculate observable SF probabilities for field 2, 3
full23 = SF(full23, method='observable', coords=['Happ', 'Colour'], angle_coords=['glon', 'glat'])
full23['union_obs'] = full23.union

Calculating all SF values...
...Assigning: 14566/14566        
Calculating: 14566/14566        ...done
Calculating union contribution...
...done
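
Where fields overlap, a star can be selected through more than one pointing. If the selections in each field are treated as independent, the probability of being selected by at least one field is one minus the product of the per-field probabilities of not being selected. The sketch below computes this; that the `union` column is defined exactly this way is an assumption, and the function name is hypothetical.

```python
import numpy as np

def union_probability(field_probs):
    """P(selected in at least one field), assuming independent fields.

    field_probs: array of shape (n_stars, n_fields); use 0 for fields
    that do not contain the star.
    """
    p = np.asarray(field_probs, dtype=float)
    return 1.0 - np.prod(1.0 - p, axis=-1)

# Star 0 lies only in the first field; star 1 lies in both fields.
probs = np.array([[0.2, 0.0],
                  [0.2, 0.5]])
union = union_probability(probs)
```

For a star covered by a single field, the union reduces to that field's selection probability.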


In [15]:
# Calculate intrinsic SF probabilities for field 2, 3
full23 = SF(full23, method='intrinsic', coords=['ageGyr', 'mh', 's', 'mass'], angle_coords=['glon', 'glat'])
full23['union_int'] = full23.union

Calculating all SF values...
...Assigning: 14566/14566        
Calculating: 14566/14566        

AttributeError: SFGenerator instance has no attribute 'instanceSF'

## 4. Drop duplicates for overlapping fields

In [26]:
# Clear duplicates for photometric field 2, 3
full23_c = full23.drop_duplicates(subset=['glon','glat'], keep='first', inplace=False)

Duplicate progress: 14565 / 14566

In [17]:
# Clear duplicates for spectroscopic fields 2, 3
survey23_c = survey23.drop_duplicates(subset=['glon','glat'], keep='first', inplace=False)

Duplicate progress: 244 / 245

In [None]:
len(full23), len(full23_c), len(survey23), len(survey23_c)
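
To make the de-duplication step concrete, the toy example below applies the same `drop_duplicates` call to a small hand-made table in which one star appears in both overlapping fields. The values are made up for illustration.

```python
import pandas as pd

# The first two rows are the same star (same glon, glat) seen in fields 2 and 3.
stars = pd.DataFrame({
    "glon": [1.00, 1.00, 1.20],
    "glat": [0.10, 0.10, 0.30],
    "fieldID": [2.0, 3.0, 3.0],
})

# Keep only the first occurrence of each (glon, glat) pair.
unique_stars = stars.drop_duplicates(subset=["glon", "glat"], keep="first")
```

Matching on `glon` and `glat` relies on exact floating-point equality, which holds here because the duplicated rows originate from the same catalogue entry.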

## 5. Save all files

In [27]:
# Save all data with selection function probabilities counted
full1.to_csv(os.path.join(fileinfo.photo_path, 'solutionF1.csv'), index=False)
full23.to_csv(os.path.join(fileinfo.photo_path, 'solutionF23.csv'), index=False)
survey1.to_csv(os.path.join(fileinfo.photo_path, 'solutionS1.csv'), index=False)
survey23.to_csv(os.path.join(fileinfo.photo_path, 'solutionS23.csv'), index=False)
full23_c.to_csv(os.path.join(fileinfo.photo_path, 'solutionF23_c.csv'), index=False)
survey23_c.to_csv(os.path.join(fileinfo.photo_path, 'solutionS23_c.csv'), index=False)