# Data from the Monson et al paper

In this notebook we are going over the machine readable tables in the paper by Monson et al, 2017. We will be extracting data from the different tables such as magnitude, right ascension, declination, type of RR Lyrae and other parameters which may help choosing stars for observation in our project.

## Analysing Table 5: Intensity Mean Magnitudes from GLOESS Light Curves

### Importing the data

Start by getting the data from the link given in the paper. The table can easily be read by a human, but passing it to a pandas dataframe will require some playing around.

In [7]:
url_table_5 = "https://cfn-live-content-bucket-iop-org.s3.amazonaws.com/journals/1538-3881/153/3/96/revision1/ajaa531bt5_mrt.txt?AWSAccessKeyId=AKIAYDKQL6LTV7YY2HIK&Expires=1666713919&Signature=luLyHsRxW900dRL3k5KiOEsswwo%3D"

In [8]:
import pandas as pd

# Pass the table to a df
dataTable5 = pd.read_table(url_table_5, names = ['Data table 5'])

### Getting the header titles

First we want the header titles. These are in the first lines of the table, presented like this (only first lines shown)
```
   Bytes Format Units   Label   Explanations
--------------------------------------------------------------------------------
   1-  9 A9     ---     Name    Star Name
  11- 16 F6.3   mag     Umag    ?="" Intensity Mean U band magnitude
  18- 22 F5.3   mag   e_Umag    ?="" Uncertainty in Umag
  24- 29 F6.3   mag     Bmag    ?="" Intensity Mean B band magnitude
  31- 35 F5.3   mag   e_Bmag    ?="" Uncertainty in Bmag
  .
  .
  .
```
From here, we want to extract the labels and find the characters of the table they correspond to. It should be noted that due to the way that pandas imported, what we have at the moment is a dataframe with a single column and several rows, and we want to separate that, in this particular case, into 5 columns each with its own information.

In [9]:
# Rows in table 5 corresponding to the information in the header: 6-29
header_rows_temp_table_5 = dataTable5.iloc[6:29]

# Rename the column to something different, code can be refactored to delete this line
header_rows_temp_table_5 = header_rows_temp_table_5.rename(columns = {header_rows_temp_table_5.columns[0]:str(header_rows_temp_table_5.loc[7].values).strip('[\'   \']')})

# Reset index to make the df start from 1
header_rows_temp_table_5.reset_index(inplace = True, drop = True)

# Remove the first 3 lines and reset index again
header_rows_temp_table_5 = header_rows_temp_table_5[3:].reset_index(drop = True)

# Convert to series to be able to work with the Series.str method
header_rows_temp_table_5 = header_rows_temp_table_5[header_rows_temp_table_5.columns[0]]

Now that we have it in a series, we can use the pd.Series.str[slice_start:slice_end] method to divide each column into different ones and get the 5 columns we wanted at the beginning.

In [10]:
header_rows_final_table_5 = pd.DataFrame({'Bytes': header_rows_temp_table_5.str[0:8],
                       'Format': header_rows_temp_table_5.str[8:15],
                       'Units': header_rows_temp_table_5.str[15:20],
                       'Label': header_rows_temp_table_5.str[20:30],
                       'Explanation': header_rows_temp_table_5.str[30:]})
header_rows_final_table_5

Unnamed: 0,Bytes,Format,Units,Label,Explanation
0,1- 9,A9,---,Name,Star Name
1,11- 16,F6.3,mag,Umag,"?="""" Intensity Mean U band magnitude"
2,18- 22,F5.3,mag,e_Umag,"?="""" Uncertainty in Umag"
3,24- 29,F6.3,mag,Bmag,"?="""" Intensity Mean B band magnitude"
4,31- 35,F5.3,mag,e_Bmag,"?="""" Uncertainty in Bmag"
5,37- 42,F6.3,mag,Vmag,"?="""" Intensity Mean V band magnitude"
6,44- 48,F5.3,mag,e_Vmag,"?="""" Uncertainty in Vmag"
7,50- 55,F6.3,mag,Rcmag,"?="""" Intensity Mean Rc band magnitude"
8,57- 61,F5.3,mag,e_Rcmag,"?="""" Uncertainty in Rcmag"
9,63- 68,F6.3,mag,Icmag,"?="""" Intensity Mean Ic band magnitude"


### Extracting the star data

We now follow the same procedure to get the actual magnitude from stars.

In [11]:
data_rows_temp_table_5 = dataTable5.iloc[31:]

# Make it into a series
data_rows_temp_table_5 = data_rows_temp_table_5[data_rows_temp_table_5.columns[0]]

And we use the column labels in the label column from our header dataframe to name the columns we will be splitting the dataframe into. First it needs a bit of formatting, however...

In [12]:
# getting the labels for the columns
column_labels_table_5 = str(header_rows_final_table_5['Label'].values).strip('[\'   \']')
# Get rid of white spaces
column_labels_table_5 = column_labels_table_5.replace(' ', '')
# Get rid of line breaks
column_labels_table_5 = column_labels_table_5.replace('\n', '')
# Get rid of all the weird '' in the text
column_labels_table_5 = column_labels_table_5.replace('\'\'', ',')
#Make it into a list
column_labels_table_5 = column_labels_table_5.split(',')

Finally, we can use the list we just created and the bytes position from the byte column in the header dataframe to split everything nicely. Python indexing starts at 0, so that's why every slice starts a bit earlier that its header dataframe counterpart.

In [26]:
data_rows_final_table_5 = pd.DataFrame({column_labels_table_5[0]: data_rows_temp_table_5.str[0:8],
                                column_labels_table_5[1]: data_rows_temp_table_5.str[10:15],
                                column_labels_table_5[2]: data_rows_temp_table_5.str[17:21],
                                column_labels_table_5[3]: data_rows_temp_table_5.str[23:28],
                                column_labels_table_5[4]: data_rows_temp_table_5.str[30:34],
                                column_labels_table_5[5]: data_rows_temp_table_5.str[36:41],
                                column_labels_table_5[6]: data_rows_temp_table_5.str[43:47],
                                column_labels_table_5[7]: data_rows_temp_table_5.str[49:54],
                                column_labels_table_5[8]: data_rows_temp_table_5.str[56:60],
                                column_labels_table_5[9]: data_rows_temp_table_5.str[62:67],
                                column_labels_table_5[10]: data_rows_temp_table_5.str[69:73],
                                column_labels_table_5[11]: data_rows_temp_table_5.str[75:80],
                                column_labels_table_5[12]: data_rows_temp_table_5.str[82:86],
                                column_labels_table_5[13]: data_rows_temp_table_5.str[88:93],
                                column_labels_table_5[14]: data_rows_temp_table_5.str[95:99],
                                column_labels_table_5[15]: data_rows_temp_table_5.str[101:106],
                                column_labels_table_5[16]: data_rows_temp_table_5.str[108:112],
                                column_labels_table_5[17]: data_rows_temp_table_5.str[114:119],
                                column_labels_table_5[18]: data_rows_temp_table_5.str[121:125],
                                column_labels_table_5[19]: data_rows_temp_table_5.str[127:132]})

data_rows_final_table_5.head()
data_rows_temp_table_5.to_pickle('./MonsonTable5.pkl')

## Analysing Table 1: RRL Galactic Calibrators and Ephemerides

The type of RR Lyrae is in table 2, we are going to repeat the process from earlier with this table and join the tables together.

### Importing the data


In [14]:
url_table_1 = 'https://cfn-live-content-bucket-iop-org.s3.amazonaws.com/journals/1538-3881/153/3/96/revision1/ajaa531bt1_mrt.txt?AWSAccessKeyId=AKIAYDKQL6LTV7YY2HIK&Expires=1666713919&Signature=XFl2CIxMK%2BRSdygodvwEAz6vfxg%3D'

In [15]:
dataTable1 = pd.read_table(url_table_1, names = ['Data table 1'])

### Getting the header titles

In [16]:
# Rows in table 2 corresponding to the information in the header: 6-29
header_rows_temp_table_1 = dataTable1.iloc[6:19]

# Rename the column to something different, code can be refactored to delete this line
header_rows_temp_table_1 = header_rows_temp_table_1.rename(columns = {header_rows_temp_table_1.columns[0]:str(header_rows_temp_table_1.loc[7].values).strip('[\'   \']')})

# Reset index to make the df start from 1
header_rows_temp_table_1.reset_index(inplace = True, drop = True)

# Remove the first 3 lines and reset index again
header_rows_temp_table_1 = header_rows_temp_table_1[3:].reset_index(drop = True)

# Convert to series to be able to work with the Series.str method
header_rows_temp_table_1 = header_rows_temp_table_1[header_rows_temp_table_1.columns[0]]

In [17]:
header_rows_final_table_1 = pd.DataFrame({'Bytes': header_rows_temp_table_1.str[0:8],
                                          'Format': header_rows_temp_table_1.str[8:15],
                                          'Units': header_rows_temp_table_1.str[15:21],
                                          'Label': header_rows_temp_table_1.str[21:30],
                                          'Explanation': header_rows_temp_table_1.str[30:]})
header_rows_final_table_1.head()

Unnamed: 0,Bytes,Format,Units,Label,Explanation
0,1- 9,A9,---,Name,Star Name
1,11- 21,F11.9,d,PerF,Final period
2,23- 34,F12.4,d,HJD-ma,x TMMT HJD-max
3,36- 45,E10.3,d/yr,{zeta},"?="""" Quadratic O-C shape term, if required."
4,47- 50,A4,---,RRL,RR Lyrae Class


### Getting the star data

In [18]:
data_rows_temp_table_1 = dataTable1.iloc[40:]
# Make it into a series
data_rows_temp_table_1 = data_rows_temp_table_1[data_rows_temp_table_1.columns[0]]

In [19]:
# Getting the column labels
column_labels_table_1 = str(header_rows_final_table_1['Label'].values).strip('[\'   \']')
# Get rid of white spaces
column_labels_table_1 = column_labels_table_1.replace(' ', '')
# Get rid of line breaks
column_labels_table_1 = column_labels_table_1.replace('\n', '')
# Get rid of all the weird '' in the text
column_labels_table_1 = column_labels_table_1.replace('\'\'', ',')
#Make it into a list
column_labels_table_1 = column_labels_table_1.split(',')

In [28]:
data_rows_final_table_1 = pd.DataFrame({column_labels_table_1[0]: data_rows_temp_table_1.str[0:8],
                                        column_labels_table_1[1]: data_rows_temp_table_1.str[10:20],
                                        column_labels_table_1[2]: data_rows_temp_table_1.str[22:33],
                                        column_labels_table_1[3]: data_rows_temp_table_1.str[35:44],
                                        column_labels_table_1[4]: data_rows_temp_table_1.str[46:49],
                                        column_labels_table_1[5]: data_rows_temp_table_1.str[51:57],
                                        column_labels_table_1[6]: data_rows_temp_table_1.str[59:63],
                                        column_labels_table_1[7]: data_rows_temp_table_1.str[65:67],
                                        column_labels_table_1[8]: data_rows_temp_table_1.str[69:71],
                                        column_labels_table_1[9]: data_rows_temp_table_1.str[73:75],
                                        })
data_rows_final_table_1.to_pickle('./MonsonTable1.pkl')
data_rows_final_table_1.head()


Unnamed: 0,Name,PerF,HJD-ma,{zeta},RRL,PerBL,[Fe/H],r_Par-HI,r_Par-BW,r_Par-HS
40,SW And,0.4422602,2456876.92,1.72,RRa,36.8,-0.2,HI,1.0,
41,XX And,0.722757,2456750.915,,RRa,,-1.9,HI,,
42,WY Ant,0.5743456,2456750.384,-1.46,RRa,,-1.4,HI,,
43,X Ari,0.65117288,2456750.387,-2.4,RRa,,-2.4,HI,4.0,
44,ST Boo,0.622286,2456750.525,,RRa,284.0,-1.7,HI,,


##

# Getting these stars from SIMBAD

In [1]:
from astroquery.simbad import Simbad
import pandas as pd


Simbad.reset_votable_fields()
Simbad.add_votable_fields("distance", "flux(K)", "flux(G)", "flux(R)", "flux(B)")

table_1 = pd.read_pickle('MonsonTable1.pkl')
table_1['PerF']


40    0.4422602 
41    0.722757  
42    0.5743456 
43    0.65117288
44    0.622286  
45    0.65083   
46    0.553029  
47    0.41201459
48    0.56070478
49    0.46659934
50    0.47261673
51    0.66042001
52    0.56966993
53    0.58724622
54    0.713853  
55    0.39729   
56    0.39960010
57    0.45535984
58    0.4785428 
59    0.4796017 
60    0.4523933 
61    0.59743435
62    0.5668378 
63    0.5711625 
64    0.54258   
65    0.3903747 
66    0.640993  
67    0.48054884
68    0.7342073 
69    0.493355  
70    0.52207144
71    0.47747883
72    0.6422893 
73    0.59958113
74    0.46806   
75    0.5576587 
76    0.4756089 
77    0.31489   
78    0.31256107
79    0.329045  
80    0.2670274 
81    0.2734563 
82    0.30868   
83    0.33168   
84    0.311331  
85    0.362755  
86    0.25551053
87    0.390365  
88    0.377356  
89    0.34083   
90    0.3246846 
91    0.3168974 
92    0.4058016 
93    0.3071178 
94    0.4057605 
Name: PerF, dtype: object

In [7]:
Monson_Simbad_query = pd.DataFrame()
for star_name  in table_1['Name']:
    result =  Simbad.query_object(star_name)
    if result is None:
        print(star_name)
    else:
        Monson_Simbad_query = pd.concat([Monson_Simbad_query, result.to_pandas()])



V0440 Sg
V0675 Sg


In [3]:
# RV UMA

result = Simbad.query_object('RV UMa')
Monson_Simbad_query = pd.concat([Monson_Simbad_query, result.to_pandas()])

In [9]:
Monson_Simbad_query.to_pickle('./Monson_stars_simbad.pkl')

In [26]:
pd.read_csv('Observable_tonight.csv')['MAIN_ID']

0    V* SW And
1    V* UY Cyg
2    V* XZ Cyg
3    V* RR Lyr
4    V* AV Peg
5    V* RZ Cep
Name: MAIN_ID, dtype: object

In [28]:
table_1[table_1['Name'] == 'UY Cyg']

Unnamed: 0,Name,PerF,HJD-ma,{zeta},RRL,PerBL,[Fe/H],r_Par-HI,r_Par-BW,r_Par-HS


In [29]:
table_1

Unnamed: 0,Name,PerF,HJD-ma,{zeta},RRL,PerBL,[Fe/H],r_Par-HI,r_Par-BW,r_Par-HS
40,SW And,0.4422602,2456876.92,1.72,RRa,36.8,-0.2,HI,1.0,
41,XX And,0.722757,2456750.915,,RRa,,-1.9,HI,,
42,WY Ant,0.5743456,2456750.384,-1.46,RRa,,-1.4,HI,,
43,X Ari,0.65117288,2456750.387,-2.4,RRa,,-2.4,HI,4.0,
44,ST Boo,0.622286,2456750.525,,RRa,284.0,-1.7,HI,,
45,UY Boo,0.65083,2456750.522,,RRa,171.8,-2.5,HI,,
46,RR Cet,0.553029,2456750.365,,RRa,,-1.4,HI,,
47,W Crt,0.41201459,2456750.279,-9.4,RRa,,-0.5,HI,,
48,UY Cyg,0.56070478,2456750.608,,RRa,,-0.8,HI,,
49,XZ Cyg,0.46659934,2456750.55,,RRa,57.3,-1.4,HI,,HS
