# Catalogues exploration

## OpenNGC catalogue
See the [source](https://github.com/mattiaverga/OpenNGC).

In [484]:
import pandas as pd
import numpy as np
pd.set_option('display.max_columns', 500)
CHEMIN_SOURCES = "/Users/lyes/Desktop/MD5/Python/python_projet/Data/données brutes/"

In [124]:
col_names = ['NGC_IC_designation', 
    'object_type',
    'Right_Ascension',
    'Declinaison',
    'Constelation_name_abrv',
    'major_axis',
    'minor_axis',
    'major__axis_position_angle', 
    'B_Apparent_Magnitude',
    'V_Apparent_Magnitude',             
    'J_Apparent_Magnitude',
    'H_Apparent_Magnitude',
    'K_Apparent_Magnitude',
    'Mean_surface_brigthness', 
    'Hubble_morphological_type',
    'center_star_U_maghitude',
    'center_star_B_maghitude',
    'center_star_V_maghitude',
    'Messier_number',
    'NGC_numner',
    'IC_number',
    'Center_star_name',
    'Identifiers',
    'Common_names',       
    'NED_notes',  
    'OpenNGC_notes']

NGC_mattiaAverga_GitHub = pd.read_csv(CHEMIN_SOURCES+"NGC_mattiaAverga_GitHub.csv", sep=';',header=0, names=col_names)
NGC_mattiaAverga_GitHub.shape

(13958, 26)

### Addendum

In [125]:
addendum_NGC_mattiaAverga_GitHub = pd.read_csv(CHEMIN_SOURCES+"addendum_NGC_mattiaAverga_GitHub.csv", sep=';', header=0, names=col_names)
addendum_NGC_mattiaAverga_GitHub.shape

(20, 26)

## Messier Catalogue Construction


### Messier catalogue of Datastro
[link](https://www.datastro.eu/explore/dataset/catalogue-de-messier/table/?disjunctive.objet&disjunctive.mag&disjunctive.english_name_nom_en_anglais&disjunctive.french_name_nom_francais&disjunctive.latin_name_nom_latin&sort=messier)

In [227]:
col_names = ['Messier_number',
    'NGC_IC_designation', 
    'object_type',
    'Season',
    'Magnitude',
    'Constellation_EN',
    'Constellation_FR',
    'Constellation_Latin',
    'Right_Ascension',        
    'Declinaison',
    'Distance_light_year',
    'Size',
    'Discoverer',
    'Year',
    'Image1',
    'Image2',
    'Constelation_abr' ]
messier_Datastro_website = pd.read_csv(CHEMIN_SOURCES+"messier_Datastro_website.csv", sep=';', header=0, names=col_names)
messier_Datastro_website.shape

(110, 17)

In [297]:
# Correction: fix the transposed digits in 'NGC 9176' (should be 'NGC 1976')
messier_Datastro_website.loc[messier_Datastro_website['NGC_IC_designation']=='NGC 9176', 'NGC_IC_designation'] = 'NGC 1976'

In [228]:
NGC_mattiaAverga_GitHub.loc[(NGC_mattiaAverga_GitHub['NGC_IC_designation']=='IC4725') |
                        (NGC_mattiaAverga_GitHub['NGC_IC_designation']=='IC4715')]

Unnamed: 0,NGC_IC_designation,object_type,Right_Ascension,Declinaison,Constelation_name_abrv,major_axis,minor_axis,major__axis_position_angle,B_Apparent_Magnitude,V_Apparent_Magnitude,J_Apparent_Magnitude,H_Apparent_Magnitude,K_Apparent_Magnitude,Mean_surface_brigthness,Hubble_morphological_type,center_star_U_maghitude,center_star_B_maghitude,center_star_V_maghitude,Messier_number,NGC_numner,IC_number,Center_star_name,Identifiers,Common_names,NED_notes,OpenNGC_notes
4923,IC4715,*Ass,18:16:56.12,-18:30:52.4,Sgr,120.0,60.0,,,4.5,,,,,,,,,24.0,,,,,Small Sgr Star Cloud,Milky Way star cloud.,V-mag taken from HEASARC's messier table
4933,IC4725,OCl,18:31:46.77,-19:06:53.8,Sgr,14.1,,,5.29,4.6,,,,,,,,,25.0,,,,,,,


According to this [link](http://messier.obspm.fr/m-q&a.html):
- M24: the Milky Way patch in Sagittarius. It contains the 11th-magnitude cluster NGC 6603, which is sometimes erroneously listed as M24. IC 4715 may also correspond to M24.
- M25: this is IC 4725.
- M40: the double star Winnecke 4.
- M45: the Pleiades. This cluster is, however, associated with nebulae that have NGC numbers.

*Note:* M24 and M25 exist in `NGC_mattiaAverga_GitHub`, but M40 and M45 do not. We therefore split M40 and M45 off from the Messier dataset and merge them separately with the NGC addendum dataset, which contains these two objects.
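To confirm which Messier numbers are absent from a table, one option is a set difference against the full range 1 to 110. The sketch below is only an illustration: `missing_messier_numbers` is a hypothetical helper and `toy` is a made-up stand-in for `NGC_mattiaAverga_GitHub`; only the `Messier_number` column name comes from the notebook.

```python
import pandas as pd

def missing_messier_numbers(table, column="Messier_number"):
    """Return the Messier numbers (1..110) that never appear in `column`."""
    # The column is read as float because of NaN values, so cast after dropna.
    present = set(table[column].dropna().astype(int))
    return sorted(set(range(1, 111)) - present)

# Hypothetical stand-in for the OpenNGC table: every object except M40 and M45.
toy = pd.DataFrame(
    {"Messier_number": [float(n) for n in range(1, 111) if n not in (40, 45)]}
)
missing = missing_messier_numbers(toy)
print(missing)
```

Running the same helper on the real table would show whether M40 and M45 are indeed the only gaps.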

#### Splitting the Messier dataset

In [298]:
messier_Datastro_website_without_M40_M45 = messier_Datastro_website[(messier_Datastro_website['Messier_number']!='M40') &
                        (messier_Datastro_website['Messier_number']!='M45')].copy()
messier_Datastro_website_M40_M45 = messier_Datastro_website[(messier_Datastro_website['Messier_number']=='M40') |
                        (messier_Datastro_website['Messier_number']=='M45')].copy()
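The split above can also be expressed with a single boolean mask built by `Series.isin`, which keeps the two subsets complementary by construction. This is a minimal sketch on a made-up four-row frame (`toy`, `special` and `regular` are illustrative names, not part of the notebook).

```python
import pandas as pd

# Hypothetical four-row stand-in for messier_Datastro_website.
toy = pd.DataFrame({"Messier_number": ["M39", "M40", "M45", "M46"]})

# One mask, used once as-is and once negated.
is_special = toy["Messier_number"].isin(["M40", "M45"])
special = toy[is_special].copy()
regular = toy[~is_special].copy()

print(special["Messier_number"].tolist())
print(regular["Messier_number"].tolist())
```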

#### Merging of Messier and NGC datasets

In [299]:
# Normalise NGC_IC_designation to the OpenNGC form NGCnnnn
# 1) 8-character values (e.g. 'NGC 1976'): remove the space
messier_Datastro_website_without_M40_M45.loc[messier_Datastro_website_without_M40_M45['NGC_IC_designation'].str.len()==8
                             , 'NGC_IC_designation']= messier_Datastro_website_without_M40_M45['NGC_IC_designation'].str.replace(" ", "")

# 2) 7-character values (e.g. 'NGC 224'): replace the space with a zero
messier_Datastro_website_without_M40_M45.loc[messier_Datastro_website_without_M40_M45['NGC_IC_designation'].str.len()==7
                             , 'NGC_IC_designation']= messier_Datastro_website_without_M40_M45['NGC_IC_designation'].str.replace(" ", "0")

# Merge the first dataset (all objects except M40 and M45) with OpenNGC
merge_messier_Datastro_website_without_M40_M45 = messier_Datastro_website_without_M40_M45.merge(NGC_mattiaAverga_GitHub, how='left', on='NGC_IC_designation')
merge_messier_Datastro_website_without_M40_M45.shape

(108, 42)
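The length-based normalisation above works for NGC designations, but a seven-character IC value such as `IC 4725` would become `IC04725`, which does not match the `IC4725` form seen in the OpenNGC table. A regular expression that pads the number to four digits handles both prefixes. This is a sketch under the assumption that every OpenNGC designation is a prefix followed by a zero-padded four-digit number; `normalise_designation` is a hypothetical helper.

```python
import re

# Prefix (NGC or IC), optional spaces, then the catalogue number.
_DESIGNATION = re.compile(r"^\s*(NGC|IC)\s*(\d+)\s*$", re.IGNORECASE)

def normalise_designation(raw):
    """Turn 'NGC 224' into 'NGC0224'; return anything else unchanged."""
    if not isinstance(raw, str):
        return raw  # keep NaN / None as they are
    match = _DESIGNATION.match(raw)
    if match is None:
        return raw  # e.g. 'Winnecke 4'
    prefix, number = match.groups()
    return f"{prefix.upper()}{int(number):04d}"

for value in ["NGC 224", "NGC 1976", "IC 4725", "Winnecke 4"]:
    print(value, "->", normalise_designation(value))
```

On a DataFrame column, the helper could be applied with `Series.map(normalise_designation)`.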

Columns of the merged dataset, with the action planned for the overlapping or empty ones:
- `Messier_number_x`
- `NGC_IC_designation`
- *`object_type_x`* (keep; `object_type_y` is only an abbreviation and is deleted)
- `Season`
- `Magnitude`
- `Constellation_EN`
- `Constellation_FR`
- `Constellation_Latin`
- `Right_Ascension_x`: 1 NaN (keep `Right_Ascension_y`, which has no NaN)
- `Declinaison_x`: 1 NaN (keep `Declinaison_y`, which has no NaN)
- `Distance_light_year`
- `Size`
- `Discoverer`
- `Year`
- `Image1`
- `Image2`
- *`Constelation_abr`* (delete; note that it is not included in `drop_list` in the next cell)
- *`object_type_y`* (delete)
- *`Right_Ascension_y`*
- *`Declinaison_y`*
- `Constelation_name_abrv`
- `major_axis`
- `minor_axis`
- `major__axis_position_angle`
- `B_Apparent_Magnitude`
- `V_Apparent_Magnitude`
- `J_Apparent_Magnitude`
- `H_Apparent_Magnitude`
- `K_Apparent_Magnitude`
- `Mean_surface_brigthness`
- `Hubble_morphological_type`
- `center_star_U_maghitude`
- `center_star_B_maghitude`
- `center_star_V_maghitude`
- *`Messier_number_y`* (delete)
- `NGC_numner` (6 objects are double-listed in NGC)
- *`IC_number`* (delete: 108 NaN)
- `Center_star_name`
- `Identifiers`
- `Common_names`
- `NED_notes`
- `OpenNGC_notes`
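NaN counts like the ones noted in this list can be obtained with `DataFrame.isna().sum()`, which also makes it easy to spot columns that are entirely empty. The frame below is a made-up three-row example, not the real merge result; only the column names are taken from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical merge result with one overlapping pair and one empty column.
merged = pd.DataFrame({
    "Right_Ascension_x": ["05:35:17", np.nan, "18:31:47"],
    "Right_Ascension_y": ["05:35:16.48", "12:22:16.1", "18:31:46.77"],
    "IC_number": [np.nan, np.nan, np.nan],
})

# Number of missing values per column.
nan_counts = merged.isna().sum()

# Columns with no data at all are candidates for deletion.
all_empty = nan_counts[nan_counts == len(merged)].index.tolist()

print(nan_counts)
print(all_empty)
```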

In [363]:
drop_list = ['object_type_y', 'Right_Ascension_x', 'Declinaison_x', 'Messier_number_y', 'IC_number']
rename_lis = {'Messier_number_x' : 'Messier_number',
             'object_type_x' : 'object_type',
             'Right_Ascension_y' : 'Right_Ascension',
             'Declinaison_y' : 'Declinaison'}

merge_messier_Datastro_website_without_M40_M45 = merge_messier_Datastro_website_without_M40_M45.drop(columns=drop_list)
merge_messier_Datastro_website_without_M40_M45 = merge_messier_Datastro_website_without_M40_M45.rename(columns=rename_lis)
merge_messier_Datastro_website_without_M40_M45.shape

(108, 37)

In [317]:
# Merge of second dataset (M40 and M45)
messier_Datastro_website_M40_M45

Unnamed: 0,Messier_number,NGC_IC_designation,object_type,Season,Magnitude,Constellation_EN,Constellation_FR,Constellation_Latin,Right_Ascension,Declinaison,Distance_light_year,Size,Discoverer,Year,Image1,Image2,Constelation_abr
55,M45,,Open Cluster / Amas Ouvert,Winter / Hiver,1,,,,,,410.0,"120,0'",,,http://www.lasam.ca/messier/M045.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,
97,M40,Winnecke 4,Double star / Étoile Double,Spring / Printemps,9,,,,,,,,Hevelius,1660.0,http://www.lasam.ca/messier/M040.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,


In [352]:
Open_NGC_M40_M45 = addendum_NGC_mattiaAverga_GitHub[~addendum_NGC_mattiaAverga_GitHub['Messier_number'].isna()].copy()
Open_NGC_M40_M45['Messier_number_bis'] = 'M'+Open_NGC_M40_M45['Messier_number'].astype('int').astype('str')
Open_NGC_M40_M45 = Open_NGC_M40_M45.drop(columns='Messier_number')
Open_NGC_M40_M45 = Open_NGC_M40_M45.rename(columns={'Messier_number_bis': 'Messier_number'})
Open_NGC_M40_M45

Unnamed: 0,NGC_IC_designation,object_type,Right_Ascension,Declinaison,Constelation_name_abrv,major_axis,minor_axis,major__axis_position_angle,B_Apparent_Magnitude,V_Apparent_Magnitude,J_Apparent_Magnitude,H_Apparent_Magnitude,K_Apparent_Magnitude,Mean_surface_brigthness,Hubble_morphological_type,center_star_U_maghitude,center_star_B_maghitude,center_star_V_maghitude,NGC_numner,IC_number,Center_star_name,Identifiers,Common_names,NED_notes,OpenNGC_notes,Messier_number
13,M040,**,12:22:16.1,+58:05:04,UMa,,,,,8.0,,,,,,,,,,,,WDS J12222+5805AB,,,VMag taken from HEASARC's messier table,M40
14,Mel022,OCl,03:47:28.6,+24:06:19,Tau,150.0,150.0,90.0,,1.2,,,,,,,,,,,,MWSC 0305,Pleiades,,"Coordinates taken from Simbad, VMag taken from...",M45


In [364]:
merge_messier_Datastro_website_M40_M45 = messier_Datastro_website_M40_M45.merge(Open_NGC_M40_M45, how='left', on='Messier_number')
merge_messier_Datastro_website_M40_M45

Unnamed: 0,Messier_number,NGC_IC_designation_x,object_type_x,Season,Magnitude,Constellation_EN,Constellation_FR,Constellation_Latin,Right_Ascension_x,Declinaison_x,Distance_light_year,Size,Discoverer,Year,Image1,Image2,Constelation_abr,NGC_IC_designation_y,object_type_y,Right_Ascension_y,Declinaison_y,Constelation_name_abrv,major_axis,minor_axis,major__axis_position_angle,B_Apparent_Magnitude,V_Apparent_Magnitude,J_Apparent_Magnitude,H_Apparent_Magnitude,K_Apparent_Magnitude,Mean_surface_brigthness,Hubble_morphological_type,center_star_U_maghitude,center_star_B_maghitude,center_star_V_maghitude,NGC_numner,IC_number,Center_star_name,Identifiers,Common_names,NED_notes,OpenNGC_notes
0,M45,,Open Cluster / Amas Ouvert,Winter / Hiver,1,,,,,,410.0,"120,0'",,,http://www.lasam.ca/messier/M045.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,,Mel022,OCl,03:47:28.6,+24:06:19,Tau,150.0,150.0,90.0,,1.2,,,,,,,,,,,,MWSC 0305,Pleiades,,"Coordinates taken from Simbad, VMag taken from..."
1,M40,Winnecke 4,Double star / Étoile Double,Spring / Printemps,9,,,,,,,,Hevelius,1660.0,http://www.lasam.ca/messier/M040.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,,M040,**,12:22:16.1,+58:05:04,UMa,,,,,8.0,,,,,,,,,,,,WDS J12222+5805AB,,,VMag taken from HEASARC's messier table


In [371]:
# Copy the designation from NGC_IC_designation_y where _x is missing, before dropping _y
merge_messier_Datastro_website_M40_M45.loc[merge_messier_Datastro_website_M40_M45['NGC_IC_designation_x'].isna(),
                                          'NGC_IC_designation_x'] = merge_messier_Datastro_website_M40_M45['NGC_IC_designation_y']
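The `.loc` assignment above fills the gaps in one column from another. `Series.fillna` accepts a Series and does the same thing in one call, aligning on the index. A minimal sketch on made-up data shaped like the M40/M45 merge result:

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame: the first designation is missing on the _x side.
toy = pd.DataFrame({
    "NGC_IC_designation_x": [np.nan, "Winnecke 4"],
    "NGC_IC_designation_y": ["Mel022", "M040"],
})

# Existing _x values are kept; only NaN entries are taken from _y.
toy["NGC_IC_designation_x"] = toy["NGC_IC_designation_x"].fillna(
    toy["NGC_IC_designation_y"]
)

print(toy["NGC_IC_designation_x"].tolist())
```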

In [372]:
drop_list = ['NGC_IC_designation_y', 'object_type_y', 'Right_Ascension_x', 'Declinaison_x', 'IC_number']
rename_lis = {'NGC_IC_designation_x': 'NGC_IC_designation',
             'object_type_x' : 'object_type',
             'Right_Ascension_y' : 'Right_Ascension',
             'Declinaison_y' : 'Declinaison'}

merge_messier_Datastro_website_M40_M45 = merge_messier_Datastro_website_M40_M45.drop(columns=drop_list)
merge_messier_Datastro_website_M40_M45 = merge_messier_Datastro_website_M40_M45.rename(columns=rename_lis)
merge_messier_Datastro_website_M40_M45.shape

(2, 37)

In [392]:
messier_catalogue = pd.concat([merge_messier_Datastro_website_without_M40_M45, merge_messier_Datastro_website_M40_M45])
#messier_catalogue.sort_values(by='Messier_number')
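Sorting on `Messier_number` as a string gives lexicographic order (M1, M10, M100, M2, ...). If catalogue order is wanted, the `key` argument of `sort_values` can sort on the integer part instead. This is a sketch on a made-up frame, assuming every value has the form `M<number>` and a pandas version that supports `key` (1.1 or later).

```python
import pandas as pd

toy = pd.DataFrame({"Messier_number": ["M10", "M2", "M100", "M1"]})

# Plain string sort: lexicographic order.
lexicographic = toy.sort_values(by="Messier_number")["Messier_number"].tolist()

# Sort on the integer after the leading 'M': catalogue order.
numeric = toy.sort_values(
    by="Messier_number",
    key=lambda s: s.str[1:].astype(int),
    ignore_index=True,
)["Messier_number"].tolist()

print(lexicographic)
print(numeric)
```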

In [393]:
messier_catalogue[messier_catalogue['Hubble_morphological_type'].isna()]

### Messier Catalogue of jbcurtin_gitHub
https://github.com/jbcurtin/messier-catalogue

In [495]:
col_names = ['Messier_number_jbcurtin_gitHub',
    'NGC_IC_designation_jbcurtin_gitHub', 
    'Common_name_jbcurtin_gitHub',
    'image_jbcurtin_gitHub',
    'Object_type_jbcurtin_gitHub',
    'Distance_KLY_jbcurtin_gitHub',             
    'Constellation_jbcurtin_gitHub',
    'Apparent_magnitude_jbcurtin_gitHub',
    'Right_Ascension_jbcurtin_gitHub',        
    'Declinaison_jbcurtin_gitHub']
messier_with_picture_links_jbcurtin_gitHub = pd.read_csv(CHEMIN_SOURCES+"messier_with_picture_links_jbcurtin_gitHub.csv", keep_default_na=True, sep=';', header=0, names=col_names)
# A single space in the CSV means "no common name": store NaN instead
messier_with_picture_links_jbcurtin_gitHub.loc[messier_with_picture_links_jbcurtin_gitHub['Common_name_jbcurtin_gitHub']==' ',
                       'Common_name_jbcurtin_gitHub'] = np.nan
    
messier_with_picture_links_jbcurtin_gitHub.dtypes

Messier_number_jbcurtin_gitHub         object
NGC_IC_designation_jbcurtin_gitHub     object
Common_name_jbcurtin_gitHub            object
image_jbcurtin_gitHub                  object
Object_type_jbcurtin_gitHub            object
Distance_KLY_jbcurtin_gitHub           object
Constellation_jbcurtin_gitHub          object
Apparent_magnitude_jbcurtin_gitHub    float64
Right_Ascension_jbcurtin_gitHub        object
Declinaison_jbcurtin_gitHub            object
dtype: object

In [536]:
messier_catalogue2 = messier_with_picture_links_jbcurtin_gitHub.merge(messier_catalogue,how='left', left_on='Messier_number_jbcurtin_gitHub', right_on='Messier_number')
messier_catalogue2.shape

(110, 47)
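The shape alone does not show whether every Messier number found a partner. `DataFrame.merge` offers two options that help here: `validate` raises an error when keys are duplicated, and `indicator` adds a `_merge` column marking unmatched rows. The frames below are made up for illustration; only the key column names come from the notebook.

```python
import pandas as pd

# Hypothetical stand-ins for the two catalogues; M3 has no match on the right.
left = pd.DataFrame({"Messier_number_jbcurtin_gitHub": ["M1", "M2", "M3"]})
right = pd.DataFrame({
    "Messier_number": ["M1", "M2"],
    "object_type": ["Supernova remnant", "Globular cluster"],
})

checked = left.merge(
    right,
    how="left",
    left_on="Messier_number_jbcurtin_gitHub",
    right_on="Messier_number",
    validate="one_to_one",  # raises MergeError if a key is duplicated on either side
    indicator=True,         # adds '_merge' with values 'both' or 'left_only'
)

# Keys present on the left that found no partner on the right.
unmatched = checked.loc[
    checked["_merge"] == "left_only", "Messier_number_jbcurtin_gitHub"
].tolist()
print(unmatched)
```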

In [537]:
# Fill missing 'Common_name_jbcurtin_gitHub' values from 'Common_names';
# 'Common_names' is dropped below
messier_catalogue2.loc[messier_catalogue2['Common_name_jbcurtin_gitHub'].isna(),
                   'Common_name_jbcurtin_gitHub'] = messier_catalogue2['Common_names']

# 'Object_type_jbcurtin_gitHub' gives the galaxy subtype, so copy it into
# 'object_type' for galaxies; 'Object_type_jbcurtin_gitHub' is dropped below.
# na=False keeps the mask boolean if 'object_type' contains NaN.
messier_catalogue2.loc[messier_catalogue2['object_type'].str.contains('Galaxy', na=False),
                   'object_type'] = messier_catalogue2['Object_type_jbcurtin_gitHub']

# Fill missing 'Constellation_Latin' values from 'Constellation_jbcurtin_gitHub',
# which is dropped below
messier_catalogue2.loc[messier_catalogue2['Constellation_Latin'].isna(),
                   'Constellation_Latin'] = messier_catalogue2['Constellation_jbcurtin_gitHub']

drop_list = ['Common_names', 'Messier_number_jbcurtin_gitHub', 'NGC_IC_designation_jbcurtin_gitHub', 
             'Object_type_jbcurtin_gitHub', 'Distance_KLY_jbcurtin_gitHub', 'Constellation_jbcurtin_gitHub',
            'Right_Ascension_jbcurtin_gitHub', 'Declinaison_jbcurtin_gitHub']
rename_lis = {'Common_name_jbcurtin_gitHub': 'Common_name'}

messier_catalogue2 = messier_catalogue2.drop(columns=drop_list)
messier_catalogue2 = messier_catalogue2.rename(columns=rename_lis)
messier_catalogue2.shape


(110, 39)

### Messier Catalogue of Nexstarsite
https://www.nexstarsite.com/Book/DSO.htm

In [539]:
MessierObjects_nexstarsite = pd.read_excel(CHEMIN_SOURCES+"nexstarsite/"+"MessierObjects_nexstarsite.xls", skiprows=2)
MessierObjects_nexstarsite.dtypes

ObjectNum          int64
Name              object
Type              object
Constellation     object
RAHour             int64
RAMinute         float64
DecSign           object
DecDeg             int64
DecMinute        float64
Magnitude         object
Info              object
Distance (ly)      int64
dtype: object

This dataset contains no new data.

In [546]:
messier_catalogue_final = messier_catalogue2

## Caldwell Catalogue

In [547]:
messier_catalogue_final.dtypes

Common_name                            object
image_jbcurtin_gitHub                  object
Apparent_magnitude_jbcurtin_gitHub    float64
Messier_number                         object
NGC_IC_designation                     object
object_type                            object
Season                                 object
Magnitude                               int64
Constellation_EN                       object
Constellation_FR                       object
Constellation_Latin                    object
Distance_light_year                   float64
Size                                   object
Discoverer                             object
Year                                  float64
Image1                                 object
Image2                                 object
Constelation_abr                       object
Right_Ascension                        object
Declinaison                            object
Constelation_name_abrv                 object
major_axis                        

In [85]:
CaldwellObjects_nexstarsite = pd.read_excel(CHEMIN_SOURCES+"nexstarsite/"+"CaldwellObjects_nexstarsite.xls", skiprows=2)
CaldwellObjects_nexstarsite.shape

(109, 11)

Unnamed: 0,ObjectNum,Name,Type,Constellation,Magnitude,Info,RAHour,RAMinute,DecSign,DecDeg,DecMinute
0,1,NGC 188,Open Cluster,Cep,8.1,Size: 14,0,44.4,+,85,20
1,2,NGC 40,Planetary Nebula,Cep,12.4,Size: 0.6,0,13.0,+,72,32
2,3,NGC 4236,Spiral Galaxy,Dra,9.7,Size: 19 x 7,12,16.7,+,69,28
3,4,NGC 7023,Bright Nebula,Cep,--,Size: 18 x 18,21,1.8,+,68,12
4,5,IC 342,Spiral Galaxy,Cam,9.2,Size: 18 x 17,3,46.8,+,68,6
5,6,NGC 6543,Planetary Nebula,Dra,8.1,Size: 0.3/5.8,17,58.6,+,66,38
6,7,NGC 2403,Spiral Galaxy,Cam,8.4,Size: 18 x 10,7,36.9,+,65,36
7,8,NGC 559,Open Cluster,Cas,9.5,Size: 4,1,29.5,+,63,18
8,9,Sh2-155,Bright Nebula,Cep,--,Size: 50 x 10,22,56.8,+,62,37
9,10,NGC 663,Open Cluster,Cas,7.1,Size: 16,1,46.0,+,61,15
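The nexstarsite tables store coordinates as separate columns (`RAHour`, `RAMinute`, `DecSign`, `DecDeg`, `DecMinute`), whereas the OpenNGC table uses strings such as `18:16:56.12`. To compare the two, the split columns can be turned into the same string form. The helpers below are hypothetical and assume that the minute columns hold decimal minutes (as sample values like 44.4 suggest); precision is limited to tenths of a second.

```python
def _sexagesimal(units, minutes):
    """Format whole units plus decimal minutes as 'UU:MM:SS.s'."""
    # Work in tenths of a second so rounding can never yield '60.0' seconds.
    total_tenths = int(round((int(units) * 60 + float(minutes)) * 600))
    whole_units, remainder = divmod(total_tenths, 36000)
    whole_minutes, tenths = divmod(remainder, 600)
    return f"{whole_units:02d}:{whole_minutes:02d}:{tenths / 10:04.1f}"

def ra_to_string(ra_hour, ra_minute):
    """Right ascension from hours and decimal minutes."""
    return _sexagesimal(ra_hour, ra_minute)

def dec_to_string(dec_sign, dec_deg, dec_minute):
    """Declination from a sign character, degrees and decimal minutes."""
    return f"{dec_sign}{_sexagesimal(dec_deg, dec_minute)}"

# Values taken from the first row of the table above (NGC 188).
print(ra_to_string(0, 44.4))
print(dec_to_string("+", 85, 20))
```

Applied row by row (for example with `DataFrame.apply(..., axis=1)`), this would give two string columns comparable to `Right_Ascension` and `Declinaison`, within the lower precision of the source.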


## Herschel400 Catalogue

In [87]:
Herschel400_nexstarsite = pd.read_excel(CHEMIN_SOURCES+"nexstarsite/"+"Herschel400_nexstarsite.xls", skiprows=2)
Herschel400_nexstarsite.shape

(400, 11)

# Catalogues Merging