# Catalogue exploration
This notebook analyses and combines several data sources to build a reasonably exhaustive catalogue, accessible through a Web API.

## 1. OpenNGC catalogue
This catalogue was downloaded from [this link](https://github.com/mattiaverga/OpenNGC). It contains all the data present in the NGC catalogue.

In [1]:
#import sys
import os
import pandas as pd
import numpy as np
pd.set_option('display.max_columns', 500)
CHEMIN_SOURCES = os.path.join("..","1_Data_brute")

In [2]:
# Read the source file into a DataFrame
col_names = ['NGC_IC_designation', 
    'object_type',
    'Right_Ascension',
    'Declinaison',
    'Constelation_name_abrv',
    'major_axis',
    'minor_axis',
    'major__axis_position_angle', 
    'B_Apparent_Magnitude',
    'V_Apparent_Magnitude',             
    'J_Apparent_Magnitude',
    'H_Apparent_Magnitude',
    'K_Apparent_Magnitude',
    'Mean_surface_brigthness', 
    'Hubble_morphological_type',
    'center_star_U_maghitude',
    'center_star_B_maghitude',
    'center_star_V_maghitude',
    'Messier_number',
    'NGC_number',
    'IC_number',
    'Center_star_name',
    'Identifiers',
    'Common_names',       
    'NED_notes',  
    'OpenNGC_notes']

NGC_mattiaAverga_GitHub = pd.read_csv(os.path.join(CHEMIN_SOURCES, "NGC_mattiaAverga_GitHub.csv"), sep=';',header=0, names=col_names)
NGC_mattiaAverga_GitHub.shape

(13958, 26)

In [3]:
# Normalise the secondary NGC/IC numbers, where present, by adding the catalogue prefix
NGC_mattiaAverga_GitHub.loc[NGC_mattiaAverga_GitHub['NGC_number'].str.len()==4, 'NGC_number'] = 'NGC' + NGC_mattiaAverga_GitHub['NGC_number']
NGC_mattiaAverga_GitHub.loc[NGC_mattiaAverga_GitHub['IC_number'].str.len()==4, 'IC_number'] = 'IC' + NGC_mattiaAverga_GitHub['IC_number']

In [4]:
NGC_mattiaAverga_GitHub[~NGC_mattiaAverga_GitHub['NGC_number'].isnull()]

Unnamed: 0,NGC_IC_designation,object_type,Right_Ascension,Declinaison,Constelation_name_abrv,major_axis,minor_axis,major__axis_position_angle,B_Apparent_Magnitude,V_Apparent_Magnitude,J_Apparent_Magnitude,H_Apparent_Magnitude,K_Apparent_Magnitude,Mean_surface_brigthness,Hubble_morphological_type,center_star_U_maghitude,center_star_B_maghitude,center_star_V_maghitude,Messier_number,NGC_number,IC_number,Center_star_name,Identifiers,Common_names,NED_notes,OpenNGC_notes
10,IC0011,Dup,00:52:59.35,+56:37:18.8,Cas,,,,,,,,,,,,,,,NGC0281,,,,,,
25,IC0026,Dup,00:31:45.94,-13:20:14.9,Cet,,,,,,,,,,,,,,,NGC0135,,,,,,
38,IC0039,Dup,00:39:08.40,-14:10:22.2,Cet,,,,,,,,,,,,,,,NGC0178,,,,,,
43,IC0044,Dup,00:42:15.88,+00:50:43.8,Cet,,,,,,,,,,,,,,,NGC0223,,,,,,
90,IC0089,Dup,01:16:03.61,+04:17:38.8,Psc,,,,,,,,,,,,,,,NGC0446,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13749,NGC7643,G,23:22:50.40,+11:59:19.8,Peg,1.47,0.81,44.0,14.80,,11.02,10.31,10.04,23.33,Sab,,,,,NGC7644,,,"2MASX J23225040+1159195,MCG +02-59-033,PGC 071...",,Identification as NGC 7644 is doubtful; N7644 ...,
13750,NGC7644,Dup,23:22:50.40,+11:59:19.8,Peg,,,,,,,,,,,,,,,NGC7643,,,,,,
13882,NGC7769,G,23:51:03.97,+20:09:01.5,Peg,1.75,0.67,85.0,12.77,,9.90,9.21,8.93,21.79,Sb,,,,,7771W,,,"2MASX J23510396+2009014,IRAS 23485+1952,MCG +0...",,,
13883,NGC7770,G,23:51:22.54,+20:05:47.5,Peg,0.97,0.85,105.0,14.50,,11.75,11.11,10.75,23.06,S0-a,,,,,7771S,,,"2MASX J23512260+2005485,MCG +03-60-034,PGC 072...",,"Incorrectly noted as a ""double system"" in CGCG.",


This [source](http://messier.obspm.fr/m-q&a.html) states:
- M24: the Milky Way Patch in Sagittarius; it contains however the 11th magnitude cluster NGC 6603 which is sometimes erroneously listed as M24. Also, it may be that IC 4715 is M24. 
- M25: this is IC 4725, 
- M40: the double star Winnecke 4
- M45: the Pleiades; however, this cluster is associated with nebulae which have NGC numbers.

In [5]:
NGC_mattiaAverga_GitHub.loc[(NGC_mattiaAverga_GitHub['NGC_IC_designation']=='IC4725') |
                        (NGC_mattiaAverga_GitHub['NGC_IC_designation']=='IC4715')]

Unnamed: 0,NGC_IC_designation,object_type,Right_Ascension,Declinaison,Constelation_name_abrv,major_axis,minor_axis,major__axis_position_angle,B_Apparent_Magnitude,V_Apparent_Magnitude,J_Apparent_Magnitude,H_Apparent_Magnitude,K_Apparent_Magnitude,Mean_surface_brigthness,Hubble_morphological_type,center_star_U_maghitude,center_star_B_maghitude,center_star_V_maghitude,Messier_number,NGC_number,IC_number,Center_star_name,Identifiers,Common_names,NED_notes,OpenNGC_notes
4923,IC4715,*Ass,18:16:56.12,-18:30:52.4,Sgr,120.0,60.0,,,4.5,,,,,,,,,24.0,,,,,Small Sgr Star Cloud,Milky Way star cloud.,V-mag taken from HEASARC's messier table
4933,IC4725,OCl,18:31:46.77,-19:06:53.8,Sgr,14.1,,,5.29,4.6,,,,,,,,,25.0,,,,,,,


*Note:* M24 and M25 are present in NGC_mattiaAverga_GitHub, but M40 and M45 are not. They are in the Addendum.
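
One way to confirm a note like this is to compare the Messier numbers found in a frame with the ones we expect. The sketch below uses a small hypothetical frame (`toy`) as a stand-in for `NGC_mattiaAverga_GitHub`, and `missing_messier` is a helper name introduced here for illustration. Only the `Messier_number` and `NGC_IC_designation` column names come from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for NGC_mattiaAverga_GitHub: Messier_number is a float
# column holding NaN for objects that are not Messier objects.
toy = pd.DataFrame({
    "NGC_IC_designation": ["IC4715", "IC4725", "NGC0224", "NGC0188"],
    "Messier_number": [24.0, 25.0, 31.0, np.nan],
})

def missing_messier(df, wanted):
    """Return the Messier numbers in `wanted` that never appear in df."""
    present = set(df["Messier_number"].dropna().astype(int))
    return sorted(set(wanted) - present)

missing = missing_messier(toy, [24, 25, 40, 45])
print(missing)
```

Run against the real frame, the same helper would list every Messier number that has to come from another source such as the Addendum.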

### Addendum
The Addendum file contains objects that are not part of the NGC or IC catalogues but may interest amateur astronomers. The Messier objects M40 and M45 have no NGC identifier, so they are listed in this addendum (see [source](https://github.com/mattiaverga/OpenNGC/blob/master/README.md)).

In [6]:
# Read the source file into a DataFrame
addendum_NGC_mattiaAverga_GitHub = pd.read_csv(os.path.join(CHEMIN_SOURCES, "addendum_NGC_mattiaAverga_GitHub.csv"), sep=';', header=0, names=col_names)
addendum_NGC_mattiaAverga_GitHub.shape

(20, 26)

## 2. Building the Messier catalogue
In what follows, we build our Messier catalogue from several sources.

### 2.1. The Messier catalogue: *'Datastro'*
[Source](https://www.datastro.eu/explore/dataset/catalogue-de-messier/table/?disjunctive.objet&disjunctive.mag&disjunctive.english_name_nom_en_anglais&disjunctive.french_name_nom_francais&disjunctive.latin_name_nom_latin&sort=messier)

#### 2.1.1. Reading the file

In [7]:
# Read the source file into a DataFrame
col_names = ['Messier_number',
    'NGC_IC_designation', 
    'object_type',
    'Season',
    'Magnitude',
    'Constellation_EN',
    'Constellation_FR',
    'Constellation_Latin',
    'Right_Ascension',        
    'Declinaison',
    'Distance_light_year',
    'Size',
    'Discoverer',
    'Year',
    'Image1',
    'Image2',
    'Constelation_abr' ]
messier_Datastro_website = pd.read_csv(os.path.join(CHEMIN_SOURCES, "messier_Datastro_website.csv"), sep=';', header=0, names=col_names)
messier_Datastro_website.shape

(110, 17)

In [8]:
# Fix a mistyped NGC identifier (NGC 9176 -> NGC 1976)
messier_Datastro_website.loc[messier_Datastro_website['NGC_IC_designation']=='NGC 9176', 'NGC_IC_designation'] = 'NGC 1976'

#### 2.1.2. Splitting the Messier catalogue

M40 and M45 need to be handled separately: these two objects will be merged with the data in the *Addendum*, while the remaining Messier objects will be merged with the data in *NGC_mattiaAverga_GitHub*.

In [9]:
messier_Datastro_website_without_M40_M45 = messier_Datastro_website[(messier_Datastro_website['Messier_number']!='M40') &
                        (messier_Datastro_website['Messier_number']!='M45')].copy()

messier_Datastro_website_M40_M45 = messier_Datastro_website[(messier_Datastro_website['Messier_number']=='M40') |
                        (messier_Datastro_website['Messier_number']=='M45')].copy()
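
The same split can be expressed with a single boolean mask, which guarantees that the two parts are complementary. This is a minimal sketch on a hypothetical four-row frame (`toy`), not the notebook's data.

```python
import pandas as pd

# Hypothetical miniature of messier_Datastro_website.
toy = pd.DataFrame({"Messier_number": ["M1", "M40", "M45", "M110"]})

special = ["M40", "M45"]                       # objects handled via the Addendum
is_special = toy["Messier_number"].isin(special)

toy_special = toy[is_special].copy()           # to be merged with the Addendum
toy_regular = toy[~is_special].copy()          # to be merged with OpenNGC

print(toy_special["Messier_number"].tolist())
print(toy_regular["Messier_number"].tolist())
```

Because both parts derive from one mask, every row lands in exactly one of them.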

#### 2.1.3. Merging Messier with NGC

In [10]:
# Normalise NGC_IC_designation to the NGCnnnn form (remove or zero-fill the space)
messier_Datastro_website_without_M40_M45.loc[messier_Datastro_website_without_M40_M45['NGC_IC_designation'].str.len()==8
                             , 'NGC_IC_designation']= messier_Datastro_website_without_M40_M45['NGC_IC_designation'].str.replace(" ", "")

messier_Datastro_website_without_M40_M45.loc[messier_Datastro_website_without_M40_M45['NGC_IC_designation'].str.len()==7
                             , 'NGC_IC_designation']= messier_Datastro_website_without_M40_M45['NGC_IC_designation'].str.replace(" ", "0")
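
The length-based rules above assume an `NGC` prefix followed by a single space. A 7-character value such as `IC 4725` would match the second rule and become `IC04725`; whether the Datastro file contains such a value is not shown here. A prefix-aware alternative is sketched below, assuming designations of the form `NGC n` or `IC n` with at most four digits. The helper name `normalise_designation` is hypothetical.

```python
import re
import pandas as pd

def normalise_designation(value):
    """Turn 'NGC 650' or 'IC 342' into 'NGC0650' / 'IC0342'; leave other values untouched."""
    if not isinstance(value, str):
        return value                       # NaN and other non-strings pass through
    match = re.fullmatch(r"\s*(NGC|IC)\s*(\d{1,4})\s*", value)
    if match is None:
        return value                       # e.g. 'Sh2-155' or 'none'
    prefix, number = match.groups()
    return prefix + number.zfill(4)        # zero-pad the number to four digits

toy = pd.Series(["NGC 1976", "NGC 650", "NGC 40", "IC 4725", "IC 342", "Sh2-155"])
normalised = toy.map(normalise_designation)
print(normalised.tolist())
```

The same helper would cover the NGC and IC cases that the Caldwell section handles with six separate length rules.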

In [11]:
# Merge the DataFrame that contains every object except M40 and M45
merge_messier_Datastro_website_without_M40_M45 = messier_Datastro_website_without_M40_M45.merge(NGC_mattiaAverga_GitHub, how='left', on='NGC_IC_designation')
merge_messier_Datastro_website_without_M40_M45.shape

(108, 42)
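
A left merge keeps all 108 rows whether or not a match was found, so the shape alone does not show how many designations actually matched. The sketch below, on hypothetical toy frames, uses the `indicator` and `validate` options of `DataFrame.merge` to surface unmatched keys and duplicated right-hand keys. The third designation is deliberately wrong.

```python
import pandas as pd

# Hypothetical miniatures of the Messier frame (left) and OpenNGC (right).
left = pd.DataFrame({"NGC_IC_designation": ["NGC1976", "NGC0650", "NGC9999"],
                     "Messier_number": ["M42", "M76", "Mxx"]})
right = pd.DataFrame({"NGC_IC_designation": ["NGC1976", "NGC0650"],
                      "object_type": ["Neb", "PN"]})

merged = left.merge(right, how="left", on="NGC_IC_designation",
                    indicator=True,           # adds a '_merge' column
                    validate="many_to_one")   # raises if a right-hand key is duplicated

# Rows whose designation was not found in the right-hand frame.
unmatched = merged.loc[merged["_merge"] == "left_only", "NGC_IC_designation"].tolist()
print(unmatched)
```

An empty `unmatched` list on the real data would confirm that every normalised designation exists in OpenNGC.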

*List of features:*

- Messier_number_x
- NGC_IC_designation
- *object_type_x* (keep; delete object_type_y, which is an abbreviation)
- Season                         
- Magnitude                       
- Constellation_EN               
- Constellation_FR               
- Constellation_Latin            
- Right_Ascension_x (1 NaN; keep Right_Ascension_y, which has no NaN)
- Declinaison_x (1 NaN; keep Declinaison_y, which has no NaN)
- Distance_light_year           
- Size                           
- Discoverer                     
- Year                          
- Image1                         
- Image2                         
- *Constelation_abr (Delete)*             
- *object_type_y  (delete)*            
- *Right_Ascension_y*             
- *Declinaison_y*                  
- Constelation_name_abrv      
- major_axis                   
- minor_axis                    
- major__axis_position_angle    
- B_Apparent_Magnitude          
- V_Apparent_Magnitude          
- J_Apparent_Magnitude          
- H_Apparent_Magnitude          
- K_Apparent_Magnitude          
- Mean_surface_brigthness       
- Hubble_morphological_type      
- center_star_U_maghitude       
- center_star_B_maghitude       
- center_star_V_maghitude       
- *Messier_number_y (delete)*             
- NGC_number (6 double-listed in NGC)
- *IC_number (delete: 108 NaN)*
- Center_star_name               
- Identifiers                    
- Common_names                   
- NED_notes                      
- OpenNGC_notes
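
The NaN counts quoted in this list can be obtained per column with `isna().sum()`. A minimal sketch on a hypothetical three-row merged frame, reusing the notebook's column names with the default `_x` / `_y` suffixes:

```python
import numpy as np
import pandas as pd

# Hypothetical merged frame with the default '_x' / '_y' suffixes.
merged = pd.DataFrame({
    "Right_Ascension_x": ["05:35:17", np.nan, "01:42:19"],
    "Right_Ascension_y": ["05:35:17.3", "18:31:46.77", "01:42:19.69"],
    "IC_number": [np.nan, np.nan, np.nan],
})

nan_counts = merged.isna().sum()              # number of missing values per column
empty_cols = nan_counts[nan_counts == len(merged)].index.tolist()

print(nan_counts.to_dict())
print(empty_cols)
```

Columns that end up in `empty_cols` carry no information and are natural candidates for the drop list.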

In [12]:
# Drop unneeded or redundant columns
drop_list = ['object_type_y', 'Right_Ascension_x', 'Declinaison_x', 'Messier_number_y', 'IC_number']
rename_lis = {'Messier_number_x' : 'Messier_number',
             'object_type_x' : 'object_type',
             'Right_Ascension_y' : 'Right_Ascension',
             'Declinaison_y' : 'Declinaison'}

merge_messier_Datastro_website_without_M40_M45 = merge_messier_Datastro_website_without_M40_M45.drop(columns=drop_list)
merge_messier_Datastro_website_without_M40_M45 = merge_messier_Datastro_website_without_M40_M45.rename(columns=rename_lis)
merge_messier_Datastro_website_without_M40_M45.shape

(108, 37)

In [13]:
# Extract M40 and M45 from the Addendum
Open_NGC_M40_M45 = addendum_NGC_mattiaAverga_GitHub[~addendum_NGC_mattiaAverga_GitHub['Messier_number'].isna()].copy()
# Add the standardised Messier number (e.g. 40.0 -> 'M40')
Open_NGC_M40_M45['Messier_number_bis'] = 'M'+Open_NGC_M40_M45['Messier_number'].astype('int').astype('str')

# Replace the Messier_number column with the new column
Open_NGC_M40_M45 = Open_NGC_M40_M45.drop(columns='Messier_number')
Open_NGC_M40_M45 = Open_NGC_M40_M45.rename(columns={'Messier_number_bis': 'Messier_number'})
Open_NGC_M40_M45

Unnamed: 0,NGC_IC_designation,object_type,Right_Ascension,Declinaison,Constelation_name_abrv,major_axis,minor_axis,major__axis_position_angle,B_Apparent_Magnitude,V_Apparent_Magnitude,J_Apparent_Magnitude,H_Apparent_Magnitude,K_Apparent_Magnitude,Mean_surface_brigthness,Hubble_morphological_type,center_star_U_maghitude,center_star_B_maghitude,center_star_V_maghitude,NGC_number,IC_number,Center_star_name,Identifiers,Common_names,NED_notes,OpenNGC_notes,Messier_number
13,M040,**,12:22:16.1,+58:05:04,UMa,,,,,8.0,,,,,,,,,,,,WDS J12222+5805AB,,,VMag taken from HEASARC's messier table,M40
14,Mel022,OCl,03:47:28.6,+24:06:19,Tau,150.0,150.0,90.0,,1.2,,,,,,,,,,,,MWSC 0305,Pleiades,,"Coordinates taken from Simbad, VMag taken from...",M45


In [14]:
# Merge the DataFrame that contains M40 and M45
merge_messier_Datastro_website_M40_M45 = messier_Datastro_website_M40_M45.merge(Open_NGC_M40_M45, how='left', on='Messier_number')
merge_messier_Datastro_website_M40_M45

Unnamed: 0,Messier_number,NGC_IC_designation_x,object_type_x,Season,Magnitude,Constellation_EN,Constellation_FR,Constellation_Latin,Right_Ascension_x,Declinaison_x,Distance_light_year,Size,Discoverer,Year,Image1,Image2,Constelation_abr,NGC_IC_designation_y,object_type_y,Right_Ascension_y,Declinaison_y,Constelation_name_abrv,major_axis,minor_axis,major__axis_position_angle,B_Apparent_Magnitude,V_Apparent_Magnitude,J_Apparent_Magnitude,H_Apparent_Magnitude,K_Apparent_Magnitude,Mean_surface_brigthness,Hubble_morphological_type,center_star_U_maghitude,center_star_B_maghitude,center_star_V_maghitude,NGC_number,IC_number,Center_star_name,Identifiers,Common_names,NED_notes,OpenNGC_notes
0,M45,,Open Cluster / Amas Ouvert,Winter / Hiver,1,,,,,,410.0,"120,0'",,,http://www.lasam.ca/messier/M045.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,,Mel022,OCl,03:47:28.6,+24:06:19,Tau,150.0,150.0,90.0,,1.2,,,,,,,,,,,,MWSC 0305,Pleiades,,"Coordinates taken from Simbad, VMag taken from..."
1,M40,Winnecke 4,Double star / Étoile Double,Spring / Printemps,9,,,,,,,,Hevelius,1660.0,http://www.lasam.ca/messier/M040.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,,M040,**,12:22:16.1,+58:05:04,UMa,,,,,8.0,,,,,,,,,,,,WDS J12222+5805AB,,,VMag taken from HEASARC's messier table


In [15]:
# Fill missing NGC_IC_designation_x from NGC_IC_designation_y before dropping the latter
merge_messier_Datastro_website_M40_M45.loc[merge_messier_Datastro_website_M40_M45['NGC_IC_designation_x'].isna(),
                                          'NGC_IC_designation_x'] = merge_messier_Datastro_website_M40_M45['NGC_IC_designation_y']

In [16]:
# Drop unneeded or redundant columns
drop_list = ['NGC_IC_designation_y', 'object_type_y', 'Right_Ascension_x', 'Declinaison_x', 'IC_number']
rename_lis = {'NGC_IC_designation_x': 'NGC_IC_designation',
             'object_type_x' : 'object_type',
             'Right_Ascension_y' : 'Right_Ascension',
             'Declinaison_y' : 'Declinaison'}

merge_messier_Datastro_website_M40_M45 = merge_messier_Datastro_website_M40_M45.drop(columns=drop_list)
merge_messier_Datastro_website_M40_M45 = merge_messier_Datastro_website_M40_M45.rename(columns=rename_lis)
merge_messier_Datastro_website_M40_M45.shape

(2, 37)

#### 2.1.4. Concatenating the M40/M45 Messier objects with the remaining Messier objects

In [17]:
messier_catalogue = pd.concat([merge_messier_Datastro_website_without_M40_M45, merge_messier_Datastro_website_M40_M45])
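
By default `pd.concat` keeps each frame's own index, so labels can repeat. The next output shows M45 with label 0 after label 106. If unique row labels are wanted, `ignore_index=True` renumbers the rows. A minimal sketch on hypothetical two-row frames:

```python
import pandas as pd

# Hypothetical miniatures of the two merged frames.
a = pd.DataFrame({"Messier_number": ["M1", "M2"]})
b = pd.DataFrame({"Messier_number": ["M45", "M40"]})

kept = pd.concat([a, b])                        # index labels: 0, 1, 0, 1
fresh = pd.concat([a, b], ignore_index=True)    # index labels: 0, 1, 2, 3

print(kept.index.tolist(), kept.index.is_unique)
print(fresh.index.tolist(), fresh.index.is_unique)
```

Repeated labels are harmless for the merge that follows, but they make label-based selection with `.loc` return several rows.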

In [18]:
messier_catalogue[messier_catalogue['Hubble_morphological_type'].isna()]

Unnamed: 0,Messier_number,NGC_IC_designation,object_type,Season,Magnitude,Constellation_EN,Constellation_FR,Constellation_Latin,Distance_light_year,Size,Discoverer,Year,Image1,Image2,Constelation_abr,Right_Ascension,Declinaison,Constelation_name_abrv,major_axis,minor_axis,major__axis_position_angle,B_Apparent_Magnitude,V_Apparent_Magnitude,J_Apparent_Magnitude,H_Apparent_Magnitude,K_Apparent_Magnitude,Mean_surface_brigthness,Hubble_morphological_type,center_star_U_maghitude,center_star_B_maghitude,center_star_V_maghitude,NGC_number,Center_star_name,Identifiers,Common_names,NED_notes,OpenNGC_notes
2,M76,NGC0650,Planetary Nebula / Nébuleuse Planétaire,Autumn / Automne,10,Perseus,Persée,Perseus,8200.0,65'',Méchain,1780.0,http://www.lasam.ca/messier/M076.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,Per,01:42:19.69,+51:34:31.7,Per,1.12,,,12.20,10.1,,,,,,,,17.48,NGC0651,,"2MASX J01421808+5134243,IRAS 01391+5119,PN G13...","Barbell Nebula,Cork Nebula,Little Dumbbell Nebula",The 2MASX position is centered on a superposed...,
3,M80,NGC6093,Globular Cluster / Amas Globulaire,Summer / Été,7,Scorpion,Scorpion,Scorpius,36000.0,"5,1'",Messier,1781.0,http://www.lasam.ca/messier/M080.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,Sco,16:17:02.51,-22:58:30.4,Sco,5.70,,,,7.3,,,,,,,,,,,MWSC 2376,,,V-mag taken from LEDA
4,M93,NGC2447,Open Cluster / Amas Ouvert,Winter / Hiver,6,"Stern,Poop deck",Poupe,Puppis,3600.0,"22,0'",Messier,1781.0,http://www.lasam.ca/messier/M093.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,Pup,07:44:29.23,-23:51:11.1,Pup,15.00,,,6.57,6.2,,,,,,,,,,,MWSC 1324,,,
7,M37,NGC2099,Open Cluster / Amas Ouvert,Winter / Hiver,5,Charioteer,Cocher,Auriga,3600.0,"24,0'",Hodierna,1654.0,http://www.lasam.ca/messier/M037.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,Aur,05:52:18.35,+32:33:10.8,Aur,11.40,,,6.19,5.6,,,,,,,,,,,MWSC 0689,,,
8,M36,NGC1960,Open Cluster / Amas Ouvert,Winter / Hiver,6,Charioteer,Cocher,Auriga,3700.0,"12,0'",Hodierna,1654.0,http://www.lasam.ca/messier/M036.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,Aur,05:36:17.74,+34:08:26.7,Aur,7.20,,,6.09,6.0,,,,,,,,,,,MWSC 0594,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
104,M17,NGC6618,Emission Nebula / Nébuleuse à émission,Summer / Été,6,Archer,Sagittaire,Sagittarius,5870.0,"45,0' x 35,0'",Chéseaux,1746.0,http://www.lasam.ca/messier/M017.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,Sgr,18:20:47.11,-16:10:17.5,Sgr,12.60,,,6.00,7.0,,,,,,,,,,,"LBN 60,MWSC 2896","Checkmark Nebula,Lobster Nebula,Swan Nebula,om...",,"B-Mag taken from LEDA, V-mag taken from HEASAR..."
105,M27,NGC6853,Planetary Nebula / Nébuleuse Planétaire,Summer / Été,8,Little Fox,Petit Renard,Vulpecula,980.0,"348,0''",Messier,1764.0,http://www.lasam.ca/messier/M027.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,Vul,19:59:36.38,+22:43:15.7,Vul,6.70,,,7.60,7.4,11.79,10.61,,,,12.43,13.66,13.94,,BD+22 3878,"2MASX J19593637+2243157,PN G060.8-03.6",Dumbbell Nebula,,
106,M28,NGC6626,Globular Cluster / Amas Globulaire,Summer / Été,6,Archer,Sagittaire,Sagittarius,19000.0,"15,0'",Messier,1764.0,http://www.lasam.ca/messier/M028.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,Sgr,18:24:32.89,-24:52:11.4,Sgr,5.10,,,,6.9,,,,,,,,,,,MWSC 2908,,,V-mag taken from LEDA
0,M45,Mel022,Open Cluster / Amas Ouvert,Winter / Hiver,1,,,,410.0,"120,0'",,,http://www.lasam.ca/messier/M045.JPG,https://www.datastro.eu/api/v2/catalog/dataset...,,03:47:28.6,+24:06:19,Tau,150.00,150.0,90.0,,1.2,,,,,,,,,,,MWSC 0305,Pleiades,,"Coordinates taken from Simbad, VMag taken from..."


### 2.3. The Messier catalogue: *'jbcurtin_gitHub'*
This catalogue ([see reference](https://github.com/jbcurtin/messier-catalogue)) is mainly of interest for the images it provides for each object.

In [19]:
# Read the source file into a DataFrame
col_names = ['Messier_number_jbcurtin_gitHub',
    'NGC_IC_designation_jbcurtin_gitHub', 
    'Common_name_jbcurtin_gitHub',
    'image_jbcurtin_gitHub',
    'Object_type_jbcurtin_gitHub',
    'Distance_KLY_jbcurtin_gitHub',             
    'Constellation_jbcurtin_gitHub',
    'Apparent_magnitude_jbcurtin_gitHub',
    'Right_Ascension_jbcurtin_gitHub',        
    'Declinaison_jbcurtin_gitHub']
messier_with_picture_links_jbcurtin_gitHub = pd.read_csv(os.path.join(CHEMIN_SOURCES, "messier_with_picture_links_jbcurtin_gitHub.csv"), keep_default_na=True, sep=';', header=0, names=col_names)
messier_with_picture_links_jbcurtin_gitHub.loc[messier_with_picture_links_jbcurtin_gitHub['Common_name_jbcurtin_gitHub']==' ',
                       'Common_name_jbcurtin_gitHub'] = np.nan  # a lone space means no common name
    
messier_with_picture_links_jbcurtin_gitHub.dtypes

Messier_number_jbcurtin_gitHub         object
NGC_IC_designation_jbcurtin_gitHub     object
Common_name_jbcurtin_gitHub            object
image_jbcurtin_gitHub                  object
Object_type_jbcurtin_gitHub            object
Distance_KLY_jbcurtin_gitHub           object
Constellation_jbcurtin_gitHub          object
Apparent_magnitude_jbcurtin_gitHub    float64
Right_Ascension_jbcurtin_gitHub        object
Declinaison_jbcurtin_gitHub            object
dtype: object

In [20]:
# Merge 'messier_catalogue' with the new dataset
messier_catalogue2 = messier_with_picture_links_jbcurtin_gitHub.merge(messier_catalogue,how='left', left_on='Messier_number_jbcurtin_gitHub', right_on='Messier_number')
messier_catalogue2.shape

(110, 47)

In [21]:
# Keep 'Common_name_jbcurtin_gitHub' (filled from 'Common_names' where missing); 'Common_names' is dropped below
messier_catalogue2.loc[messier_catalogue2['Common_name_jbcurtin_gitHub'].isna(),
                   'Common_name_jbcurtin_gitHub'] = messier_catalogue2['Common_names']

# In 'Object_type_jbcurtin_gitHub', Galaxy is subtyped. So we should keep that info
# Keep 'object_type', delete 'Object_type_jbcurtin_gitHub'
messier_catalogue2.loc[messier_catalogue2['object_type'].str.contains('Galaxy'),
                   'object_type'] = messier_catalogue2['Object_type_jbcurtin_gitHub']

# Fill missing 'Constellation_Latin' from 'Constellation_jbcurtin_gitHub', which is dropped below
messier_catalogue2.loc[messier_catalogue2['Constellation_Latin'].isna(),
                   'Constellation_Latin'] = messier_catalogue2['Constellation_jbcurtin_gitHub']

drop_list = ['Common_names', 'Messier_number_jbcurtin_gitHub', 'NGC_IC_designation_jbcurtin_gitHub', 
             'Object_type_jbcurtin_gitHub', 'Distance_KLY_jbcurtin_gitHub', 'Constellation_jbcurtin_gitHub',
            'Right_Ascension_jbcurtin_gitHub', 'Declinaison_jbcurtin_gitHub', 'Constelation_abr']
rename_lis = {'Common_name_jbcurtin_gitHub': 'Common_name'}

messier_catalogue2 = messier_catalogue2.drop(columns=drop_list)
messier_catalogue2 = messier_catalogue2.rename(columns=rename_lis)
messier_catalogue2.shape


(110, 38)
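
The "fill where missing" assignments in the cell above follow one pattern: take a preferred column and fall back to another where it is NaN. `Series.fillna` with a second Series expresses this directly. A minimal sketch on a hypothetical three-row frame, reusing two of the notebook's column names:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Common_name_jbcurtin_gitHub": ["Crab Nebula", np.nan, np.nan],
    "Common_names": ["Crab Nebula", "Pleiades", np.nan],
})

# Take the first column where it is present, otherwise fall back to the second.
toy["Common_name"] = toy["Common_name_jbcurtin_gitHub"].fillna(toy["Common_names"])
print(toy["Common_name"].tolist())
```

Rows that are missing in both columns stay NaN, as they do with the `.loc` form.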

### 2.4. The Messier catalogue: *'Nexstarsite'*

https://www.nexstarsite.com/Book/DSO.htm

In [22]:
MessierObjects_nexstarsite = pd.read_excel(os.path.join(CHEMIN_SOURCES, "nexstarsite", "MessierObjects_nexstarsite.xls"), skiprows=2)
MessierObjects_nexstarsite.dtypes

ObjectNum          int64
Name              object
Type              object
Constellation     object
RAHour             int64
RAMinute         float64
DecSign           object
DecDeg             int64
DecMinute        float64
Magnitude         object
Info              object
Distance (ly)      int64
dtype: object

This dataset contains no new data.
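
Checking a claim like this involves bringing coordinates into a common form. Here they are split into `RAHour`, `RAMinute` (decimal minutes), `DecSign`, `DecDeg` and `DecMinute`, while the OpenNGC columns hold colon-separated strings. The sketch below converts a whole part plus decimal minutes into such a string. The helper name `sexagesimal` and the sample values are illustrative assumptions, not taken from the file.

```python
def sexagesimal(whole, minutes):
    """Format a whole part (hours or degrees) plus decimal minutes as 'WW:MM:SS.s'."""
    # Work in integer tenths of a second to avoid floating-point drift.
    tenths = int(round(float(whole) * 36000 + float(minutes) * 600))
    seconds, tenth = divmod(tenths, 10)
    w, rest = divmod(seconds, 3600)
    m, s = divmod(rest, 60)
    return f"{w:02d}:{m:02d}:{s:02d}.{tenth}"

# Illustrative values: RA 5h 34.5m, Dec +22 deg 1.0'
ra = sexagesimal(5, 34.5)
dec = "+" + sexagesimal(22, 1.0)
print(ra, dec)
```

The source tables only give coordinates to a fraction of an arcminute, so a comparison with OpenNGC positions would need a tolerance rather than string equality.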

In [23]:
messier_catalogue_final = messier_catalogue2

In [24]:
messier_catalogue_final.dtypes

Common_name                            object
image_jbcurtin_gitHub                  object
Apparent_magnitude_jbcurtin_gitHub    float64
Messier_number                         object
NGC_IC_designation                     object
object_type                            object
Season                                 object
Magnitude                               int64
Constellation_EN                       object
Constellation_FR                       object
Constellation_Latin                    object
Distance_light_year                   float64
Size                                   object
Discoverer                             object
Year                                  float64
Image1                                 object
Image2                                 object
Right_Ascension                        object
Declinaison                            object
Constelation_name_abrv                 object
major_axis                            float64
minor_axis                        

## 3. Building the Caldwell catalogue

In [25]:
# Read the source file into a DataFrame
col_names = ['Caldwell_number',
    'NGC_IC_designation', 
    'object_type',
    'Constelation_abr',
    'Magnitude',
    'Size',
    'Right_Ascension_Hour',
    'Right_Ascension_Minute', 
    'Declinaison_Sign',
    'Declinaison_Deg',
    'Declinaison_Minute'
     ]

CaldwellObjects_nexstarsite = pd.read_excel(os.path.join(CHEMIN_SOURCES, "nexstarsite", "CaldwellObjects_nexstarsite.xls"), skiprows=2, keep_default_na=True, names=col_names)
CaldwellObjects_nexstarsite.shape

(109, 11)

In [26]:
# Normalise NGC_IC_designation to the NGCnnnn form (remove or zero-fill the space)
CaldwellObjects_nexstarsite.loc[( (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.len()==8) &
                                (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.contains("NGC")) ), 
                                'NGC_IC_designation']= CaldwellObjects_nexstarsite['NGC_IC_designation'].str.replace(" ", "")

CaldwellObjects_nexstarsite.loc[( (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.len()==7) &
                                (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.contains("NGC")) ), 
                                'NGC_IC_designation']= CaldwellObjects_nexstarsite['NGC_IC_designation'].str.replace(" ", "0")

CaldwellObjects_nexstarsite.loc[( (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.len()==6) &
                                (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.contains("NGC")) ), 
                                'NGC_IC_designation']= CaldwellObjects_nexstarsite['NGC_IC_designation'].str.replace(" ", "00")

In [27]:
# Normalise NGC_IC_designation to the ICnnnn form (remove or zero-fill the space)
CaldwellObjects_nexstarsite.loc[( (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.len()==7) &
                                (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.contains("IC")) ), 
                                'NGC_IC_designation']= CaldwellObjects_nexstarsite['NGC_IC_designation'].str.replace(" ", "")

CaldwellObjects_nexstarsite.loc[( (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.len()==6) &
                                (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.contains("IC")) ), 
                                'NGC_IC_designation']= CaldwellObjects_nexstarsite['NGC_IC_designation'].str.replace(" ", "0")

CaldwellObjects_nexstarsite.loc[( (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.len()==5) &
                                (CaldwellObjects_nexstarsite['NGC_IC_designation'].str.contains("IC")) ), 
                                'NGC_IC_designation']= CaldwellObjects_nexstarsite['NGC_IC_designation'].str.replace(" ", "00")



In [28]:
CaldwellObjects_nexstarsite[~(CaldwellObjects_nexstarsite['NGC_IC_designation'].str.contains("IC")| CaldwellObjects_nexstarsite['NGC_IC_designation'].str.contains("NGC") )]

Unnamed: 0,Caldwell_number,NGC_IC_designation,object_type,Constelation_abr,Magnitude,Size,Right_Ascension_Hour,Right_Ascension_Minute,Declinaison_Sign,Declinaison_Deg,Declinaison_Minute
8,9,Sh2-155,Bright Nebula,Cep,--,Size: 50 x 10,22,56.8,+,62,37
40,41,none,Open Cluster,Tau,0.5,Size: 330,4,27.0,+,16,0
98,99,none,Dark Nebula,Cru,--,Size: 400 x 300,12,53.0,-,63,0


In [29]:
# Change the abbreviation 'Can' to 'Cnc'
CaldwellObjects_nexstarsite.loc[CaldwellObjects_nexstarsite['Constelation_abr']=='Can', 'Constelation_abr'] = 'Cnc'

Objects C9, C41 and C99 have no NGC/IC designation, so the left merge below cannot match them against OpenNGC.

In [30]:
# Merge the Caldwell objects with the OpenNGC data
cadlwell_catalogue = CaldwellObjects_nexstarsite.merge(NGC_mattiaAverga_GitHub, how='left', on='NGC_IC_designation')
cadlwell_catalogue.shape

(109, 36)

List of columns:

- Caldwell_number   (Keep)      
- NGC_IC_designation  (Keep)         
- object_type_x    (keep -rename)             
- Constelation_abr  (keep -rename)           
- Magnitude   (delete)            
- Size   (Keep)                        
- Right_Ascension_Hour  (delete)
- Right_Ascension_Minute  (delete)
- Declinaison_Sign  (delete)
- Declinaison_Deg  (delete)
- Declinaison_Minute (delete)
- object_type_y  (delete)
- Right_Ascension (Keep) 
- Declinaison (Keep) 
- Constelation_name_abrv (delete)
- major_axis (Keep) 
- minor_axis (Keep) 
- major__axis_position_angle (Keep) 
- B_Apparent_Magnitude  (Keep) 
- V_Apparent_Magnitude  (Keep) 
- J_Apparent_Magnitude (Keep) 
- H_Apparent_Magnitude  (Keep) 
- K_Apparent_Magnitude  (Keep) 
- Mean_surface_brigthness  (Keep) 
- Hubble_morphological_type  (Keep) 
- center_star_U_maghitude   (Keep) 
- center_star_B_maghitude (Keep) 
- center_star_V_maghitude   (Keep) 
- Messier_number  (delete)
- NGC_number     (Keep) 
- IC_number     (Keep) 
- Center_star_name (Keep) 
- Identifiers     (Keep) 
- Common_names     (Keep) 
- NED_notes       (Keep) 
- OpenNGC_notes   (Keep) 

In [31]:
drop_list = ["Magnitude", "Right_Ascension_Hour", "Right_Ascension_Minute", "Declinaison_Sign", 
            "Declinaison_Deg", "Declinaison_Minute", "object_type_y", "Constelation_name_abrv", "Messier_number"]
rename_lis = {'object_type_x': 'object_type',
             'Constelation_abr': 'Constelation_name_abrv'}

cadlwell_catalogue = cadlwell_catalogue.drop(columns=drop_list)
cadlwell_catalogue = cadlwell_catalogue.rename(columns=rename_lis)

In [32]:
# Add the standardised Caldwell number (e.g. 1 -> 'C1')
cadlwell_catalogue['Caldwell_number_bis'] = 'C'+cadlwell_catalogue['Caldwell_number'].astype('int').astype('str')

# Replace the Caldwell_number column with the new column
cadlwell_catalogue = cadlwell_catalogue.drop(columns='Caldwell_number')
cadlwell_catalogue = cadlwell_catalogue.rename(columns={'Caldwell_number_bis': 'Caldwell_number'})
cadlwell_catalogue

Unnamed: 0,NGC_IC_designation,object_type,Constelation_name_abrv,Size,Right_Ascension,Declinaison,major_axis,minor_axis,major__axis_position_angle,B_Apparent_Magnitude,V_Apparent_Magnitude,J_Apparent_Magnitude,H_Apparent_Magnitude,K_Apparent_Magnitude,Mean_surface_brigthness,Hubble_morphological_type,center_star_U_maghitude,center_star_B_maghitude,center_star_V_maghitude,NGC_number,IC_number,Center_star_name,Identifiers,Common_names,NED_notes,OpenNGC_notes,Caldwell_number
0,NGC0188,Open Cluster,Cep,Size: 14,00:47:27.53,+85:16:10.7,17.70,,,8.91,8.10,,,,,,,,,,,,"C 001,MWSC 0074",,,,C1
1,NGC0040,Planetary Nebula,Cep,Size: 0.6,00:13:01.03,+72:31:19.0,0.80,,,11.27,11.89,10.89,10.80,10.38,,,11.14,11.82,11.58,,,"HD 000826,HIP 001041,TYC 4302-01297-1","C 002,IRAS 00102+7214,PN G120.0+09.8",Bow-Tie nebula,,,C2
2,NGC4236,Spiral Galaxy,Dra,Size: 19 x 7,12:16:42.12,+69:27:45.3,23.50,6.85,161.0,10.58,10.08,9.91,9.17,9.01,24.58,SBd,,,,,,,"2MASX J12164211+6927452,C 003,IRAS 12143+6945,...",,,,C3
3,NGC7023,Bright Nebula,Cep,Size: 18 x 18,21:01:35.62,+68:10:10.4,10.00,8.00,,7.20,,,,,,,,,,,,,"C 004,IRAS 21009+6758,LBN 487",Iris Nebula,"Identified as IR cirrus by Strauss, et al (199...",,C4
4,IC0342,Spiral Galaxy,Cam,Size: 18 x 17,03:46:48.50,+68:05:46.9,19.77,18.79,0.0,10.50,,5.66,5.01,4.56,24.85,SABc,,,,,,,"2MASX J03464851+6805459,C 005,IRAS 03419+6756,...",,,,C5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
104,NGC4833,Globular Cluster,Mus,Size: 14,12:59:34.94,-70:52:28.5,8.40,,,8.72,7.79,,,,,,,,,,,,"C 105,MWSC 2077",,,,C105
105,NGC0104,Globular Cluster,Tuc,Size: 31,00:24:05.36,-72:04:53.2,31.80,,,5.78,4.09,2.29,1.76,1.54,,,,,,,,,"2MASX J00240535-7204531,C 106,MWSC 0038",47 Tuc Cluster,,,C106
106,NGC6101,Globular Cluster,Aps,Size: 11,16:25:48.57,-72:12:05.6,4.50,,,10.90,10.08,,,,,,,,,,,,"C 107,MWSC 2404",,,,C107
107,NGC4372,Globular Cluster,Mus,Size: 19,12:25:45.38,-72:39:32.7,12.00,,,10.86,9.85,,,,,,,,,,,,"C 108,MWSC 2029",,,,C108


In [33]:
caldwell_catalogue_final = cadlwell_catalogue

## 4. Building the Herschel 400 catalogue

https://www.go-astronomy.com/herschel-objects.htm

The Herschel 400 catalog is a subset of the original 2,500-object Herschel catalog. It contains 400 northern-sky deep-sky objects (galaxies, nebulae, and star clusters) for amateur astronomers who want a challenge after the easier, brighter Messier and Caldwell objects. The Herschel 400 calls for a somewhat larger telescope (6 inches or more) than the Messier objects, which can all be found with a 4-inch scope.

Herschel 400 objects are usually designated by their NGC number on constellation charts rather than by a Herschel H designation, unlike Messier objects, which are usually designated by their M number.

In [34]:
# Read the source file into a DataFrame
col_names = ['NGC_designation', 
    'name',
    'object_type',
    'Constelation_abr',
    'Right_Ascension_Hour',
    'Right_Ascension_Minute', 
    'Declinaison_Sign',
    'Declinaison_Deg',
    'Declinaison_Minute',
    'Magnitude',
    'info']

Herschel400_nexstarsite = pd.read_excel(os.path.join(CHEMIN_SOURCES, "nexstarsite","Herschel400_nexstarsite.xls"), skiprows=2, keep_default_na=True, names=col_names)
Herschel400_nexstarsite

Unnamed: 0,NGC_designation,name,object_type,Constelation_abr,Right_Ascension_Hour,Right_Ascension_Minute,Declinaison_Sign,Declinaison_Deg,Declinaison_Minute,Magnitude,info
0,40,PK 120+9.1,Planetary Nebula,Cep,0,13.0,+,72,32,11,"Size: 0.6 F, vS, R, vsmbM, *12 sp"
1,129,OCL 294,Open Cluster,Cas,0,29.9,+,60,14,6.5,"Size: 21. Cl, vL, pR, lC, st 9...13"
2,136,OCL 295,Open Cluster,Cas,0,31.5,+,61,32,Unknown,"Size: 1. glob. cl. , vF, S, eC"
3,157,MCG 2-2-56,Galaxy,Cet,0,34.8,-,8,24,10.4,"Size: 4.3 pB, L, E, bet 2 cB st"
4,185,UGC 396,Galaxy,Cas,0,39.0,+,48,20,9.2,"Size: 11.5 pB, vL, iR, vgmbM, r"
...,...,...,...,...,...,...,...,...,...,...,...
395,7723,MCG 2-60-5,Galaxy,Aqr,23,38.9,-,12,58,11.1,"Size: 3.6 cB, cL, E, gmbM, r"
396,7727,MCG 2-60-8,Galaxy,Aqr,23,39.9,-,12,18,10.7,"Size: 4.2 pB, pL, iR, mbM"
397,7789,OCL 269,Open Cluster,Cas,23,57.0,+,56,44,6.7,"Size: 16. Cl, vL, vRi, vmC, st 11...18"
398,7790,OCL 276,Open Cluster,Cas,23,58.4,+,61,13,8.5,"Size: 17. Cl, pRi, pC"


In [35]:
# Normalise NGC_designation to the form NGCnnnn (zero-padded to four digits, no spaces)
Herschel400_nexstarsite['NGC_designation'] = Herschel400_nexstarsite['NGC_designation'].astype(str)
Herschel400_nexstarsite.loc[Herschel400_nexstarsite['NGC_designation'].str.len()==2, 
                                'NGC_designation']= "NGC00" + Herschel400_nexstarsite['NGC_designation']
Herschel400_nexstarsite.loc[Herschel400_nexstarsite['NGC_designation'].str.len()==3, 
                                'NGC_designation']= "NGC0" + Herschel400_nexstarsite['NGC_designation']
Herschel400_nexstarsite.loc[Herschel400_nexstarsite['NGC_designation'].str.len()==4, 
                                'NGC_designation']= "NGC" + Herschel400_nexstarsite['NGC_designation']
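
The three length-based assignments above pad the number to four digits before adding the `NGC` prefix. As a minimal sketch of the same idea, assuming the column holds plain integers as in the table above, `str.zfill` does the padding in one step. The `numbers` Series is a toy stand-in, not the real spreadsheet column.

```python
import pandas as pd

# Toy stand-in for the NGC_designation column read from the spreadsheet.
numbers = pd.Series([40, 129, 7790])

# Cast to string, left-pad with zeros to four digits, then add the prefix.
designations = "NGC" + numbers.astype(str).str.zfill(4)
print(designations.tolist())  # ['NGC0040', 'NGC0129', 'NGC7790']
```

Unlike the length tests, this form also handles one-digit numbers, which does not matter here because the smallest Herschel 400 entry shown is NGC 40.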

In [36]:
Herschel400_catalogue = Herschel400_nexstarsite.merge(NGC_mattiaAverga_GitHub, how='left', left_on='NGC_designation', right_on='NGC_IC_designation')
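
A left merge silently keeps rows that found no match in OpenNGC, with every OpenNGC column set to NaN. The sketch below shows one way to surface such rows with the `indicator` and `validate` parameters of `DataFrame.merge`. The two frames are toy data: `NGC9999` is a deliberately non-existent designation and the coordinate strings are placeholders.

```python
import pandas as pd

# Toy frames; NGC9999 is a made-up designation used to force a miss.
herschel = pd.DataFrame({"NGC_designation": ["NGC0040", "NGC0129", "NGC9999"]})
openngc = pd.DataFrame({
    "NGC_IC_designation": ["NGC0040", "NGC0129"],
    "Right_Ascension": ["ra_placeholder_1", "ra_placeholder_2"],
})

merged = herschel.merge(
    openngc,
    how="left",
    left_on="NGC_designation",
    right_on="NGC_IC_designation",
    indicator=True,            # adds a "_merge" column: both / left_only
    validate="many_to_one",    # raises if a designation is duplicated on the right
)

# Designations that were not found in the OpenNGC table.
unmatched = merged.loc[merged["_merge"] == "left_only", "NGC_designation"]
print(unmatched.tolist())  # ['NGC9999']
```

An empty `unmatched` list would confirm that every Herschel 400 row picked up its OpenNGC data.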

List of columns after the merge:

- NGC_designation (Keep)
- name (Delete)
- object_type_x (Keep, rename to object_type)
- Constelation_abr (Delete)
- Right_Ascension_Hour (Delete)
- Right_Ascension_Minute (Delete)
- Declinaison_Sign (Delete)
- Declinaison_Deg (Delete)
- Declinaison_Minute (Delete)
- Magnitude (Keep)
- info (Delete*)
- NGC_IC_designation (Delete)
- object_type_y (Delete)
- Right_Ascension (Keep)
- Declinaison (Keep)
- Constelation_name_abrv (Keep)
- major_axis (Keep)
- minor_axis (Keep)
- major__axis_position_angle (Keep)
- B_Apparent_Magnitude (Keep)
- V_Apparent_Magnitude (Keep)
- J_Apparent_Magnitude (Keep)
- H_Apparent_Magnitude (Keep)
- K_Apparent_Magnitude (Keep)
- Mean_surface_brigthness (Keep)
- Hubble_morphological_type (Keep)
- center_star_U_maghitude (Keep)
- center_star_B_maghitude (Keep)
- center_star_V_maghitude (Keep)
- Messier_number (Keep)
- NGC_number (Keep)
- IC_number (Keep)
- Center_star_name (Keep)
- Identifiers (Keep)
- Common_names (Keep)
- NED_notes (Keep)
- OpenNGC_notes (Keep)

In [37]:
drop_list = ['name', 'Constelation_abr', 'Right_Ascension_Hour', 'Right_Ascension_Minute',
            'Declinaison_Sign', 'Declinaison_Deg', 'Declinaison_Minute', #'Magnitude', 
             'info', 'NGC_IC_designation',
            'object_type_y']
rename_lis = {'object_type_x': 'object_type'}

Herschel400_catalogue = Herschel400_catalogue.drop(columns=drop_list)
Herschel400_catalogue = Herschel400_catalogue.rename(columns=rename_lis)
Herschel400_catalogue.shape

(400, 27)

In [38]:
Herschel400_catalogue_final = Herschel400_catalogue

## 5. Adding constellation names
[Source](https://www.downloadexcelfiles.com/wo_en/download-excel-file-list-constellations#.Xj7g6CXjLxs)

In [39]:
# Read the source file into a DataFrame
constellations = pd.read_csv(os.path.join(CHEMIN_SOURCES, "list-constellations-677j.csv"), sep=',',encoding='ISO-8859-1')
constellations.shape

(92, 9)

In [40]:
# Cleaning Data
constellations = constellations[~constellations['SNo'].isnull()]
constellations = constellations[constellations['SNo'].str.isnumeric()]
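
The two filters above keep only rows whose `SNo` is a non-null, purely numeric string, which removes footer or note lines from the downloaded file. A minimal sketch of the same selection in one step, assuming `SNo` is read as text, uses `str.fullmatch` with `na=False`. The `raw` frame and its footer row are toy data.

```python
import pandas as pd

# Toy frame mimicking a CSV with a blank line and a footer note after the data.
raw = pd.DataFrame({
    "SNo": ["1", "2", None, "Source: example footer"],
    "Constellation": ["Andromeda", "Antlia", None, None],
    "IAU abbreviation": ["And", "Ant", None, None],
})

# True only where SNo consists entirely of digits; missing values count as False.
is_data_row = raw["SNo"].str.fullmatch(r"\d+", na=False)
clean = raw[is_data_row]
print(clean["Constellation"].tolist())  # ['Andromeda', 'Antlia']
```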

In [41]:
constellations = constellations[['Constellation', 'IAU abbreviation']]

In [42]:
constellations

Unnamed: 0,Constellation,IAU abbreviation
0,Andromeda,And
1,Antlia,Ant
2,Apus,Aps
3,Aquarius,Aqr
4,Aquila,Aql
...,...,...
83,Ursa Minor,UMi
84,Vela,Vel
85,Virgo,Vir
86,Volans,Vol


In [43]:
# Caldwell catalog
caldwell_catalogue_final = caldwell_catalogue_final.merge(constellations, how='left', left_on='Constelation_name_abrv', right_on='IAU abbreviation')

drop_list = ['IAU abbreviation']
rename_lis = {'Constellation': 'Constellation_Latin'}

caldwell_catalogue_final = caldwell_catalogue_final.drop(columns=drop_list)
caldwell_catalogue_final = caldwell_catalogue_final.rename(columns=rename_lis)

In [44]:
# Herschel400 catalog
Herschel400_catalogue_final = Herschel400_catalogue_final.merge(constellations, how='left', left_on='Constelation_name_abrv', right_on='IAU abbreviation')

drop_list = ['IAU abbreviation']
rename_lis = {'Constellation': 'Constellation_Latin'}

Herschel400_catalogue_final = Herschel400_catalogue_final.drop(columns=drop_list)
Herschel400_catalogue_final = Herschel400_catalogue_final.rename(columns=rename_lis)
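
The Caldwell and Herschel 400 cells above repeat the same merge, drop, and rename steps. One possible refactoring, sketched here with a hypothetical helper name `add_constellation_name` and toy frames, wraps those steps in a function that can be applied to either catalogue.

```python
import pandas as pd

def add_constellation_name(catalogue, constellations):
    """Attach the full constellation name via the IAU abbreviation (hypothetical helper)."""
    lookup = constellations.rename(columns={"Constellation": "Constellation_Latin"})
    merged = catalogue.merge(
        lookup,
        how="left",
        left_on="Constelation_name_abrv",
        right_on="IAU abbreviation",
    )
    # The abbreviation is already present in Constelation_name_abrv.
    return merged.drop(columns=["IAU abbreviation"])

# Toy data using two rows that appear in the Caldwell table.
constellations = pd.DataFrame({
    "Constellation": ["Draco", "Cepheus"],
    "IAU abbreviation": ["Dra", "Cep"],
})
catalogue = pd.DataFrame({
    "NGC_IC_designation": ["NGC4236", "NGC7023"],
    "Constelation_name_abrv": ["Dra", "Cep"],
})

result = add_constellation_name(catalogue, constellations)
print(result["Constellation_Latin"].tolist())  # ['Draco', 'Cepheus']
```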

## 6. Adding Messier and/or Caldwell designations to the Herschel 400 catalogue
The list of Messier and Caldwell objects that belong to the Herschel 400 catalogue comes from [this page](https://www.wikiwand.com/en/Herschel_400_Catalogue).

*NB*: The object M51B is not present in our Messier catalogue.

In [45]:
Messier_in_Herschel400_list=['M20', 'M33', 'M47', 'M48', 'M51', 'M61', 'M76', 'M82',
    'M91', 'M102', 'M104', 'M105', 'M106', 'M107', 'M108', 'M109', 'M110']
Messier_in_Herschel400 = messier_catalogue_final[messier_catalogue_final['Messier_number'].isin(Messier_in_Herschel400_list)]
Messier_in_Herschel400 = Messier_in_Herschel400[['Messier_number', 'NGC_IC_designation']]

In [46]:
Caldwell_in_Herschel400_list = ['C2', 'C6', 'C7', 'C8', 'C10', 'C12', 'C13', 'C14', 'C15', 'C16', 'C18', 'C20', 'C21', 'C22', 'C23', 'C25', 
 'C28', 'C29', 'C30', 'C32', 'C36', 'C37', 'C38', 'C39', 'C40', 'C42', 'C43', 'C44', 'C45', 'C47', 'C48','C50', 
 'C52', 'C53', 'C54', 'C55', 'C56', 'C58', 'C59', 'C60', 'C62', 'C64', 'C65', 'C66']
Caldwell_in_Herschel400 = caldwell_catalogue_final[caldwell_catalogue_final['Caldwell_number'].isin(Caldwell_in_Herschel400_list)]
Caldwell_in_Herschel400 = Caldwell_in_Herschel400[['Caldwell_number', 'NGC_IC_designation']]
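
Filtering with `isin` quietly ignores any designation in the list that does not exist in the catalogue. A small sketch of how such gaps could be reported follows. The `messier_catalogue` frame and the `wanted` list are toy data, with `M51B` included only to force a miss.

```python
import pandas as pd

# Toy catalogue with two real Messier entries.
messier_catalogue = pd.DataFrame({
    "Messier_number": ["M20", "M33"],
    "NGC_IC_designation": ["NGC6514", "NGC0598"],
})
# Toy request list; M51B is absent from the toy catalogue on purpose.
wanted = ["M20", "M33", "M51B"]

found = messier_catalogue[messier_catalogue["Messier_number"].isin(wanted)]

# Designations requested but not present in the catalogue.
missing = sorted(set(wanted) - set(found["Messier_number"]))
print(missing)  # ['M51B']
```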

In [47]:
# Merge the tables on the primary NGC designation
Herschel400_catalogue_final = Herschel400_catalogue_final.drop(columns='Messier_number')
Herschel400_catalogue_final = Herschel400_catalogue_final.merge(Messier_in_Herschel400, how='left', left_on='NGC_designation', right_on='NGC_IC_designation')
Herschel400_catalogue_final = Herschel400_catalogue_final.drop(columns='NGC_IC_designation')

Herschel400_catalogue_final = Herschel400_catalogue_final.merge(Caldwell_in_Herschel400, how='left', left_on='NGC_designation', right_on='NGC_IC_designation')
Herschel400_catalogue_final = Herschel400_catalogue_final.drop(columns='NGC_IC_designation')

In [48]:
# Merge again for objects that have a second NGC number
Herschel400_catalogue_final = Herschel400_catalogue_final.merge(Messier_in_Herschel400, how='left', left_on='NGC_number', right_on='NGC_IC_designation')
Herschel400_catalogue_final = Herschel400_catalogue_final.drop(columns='NGC_IC_designation')

Herschel400_catalogue_final = Herschel400_catalogue_final.merge(Caldwell_in_Herschel400, how='left', left_on='NGC_number', right_on='NGC_IC_designation')
Herschel400_catalogue_final = Herschel400_catalogue_final.drop(columns='NGC_IC_designation')

In [49]:
Herschel400_catalogue_final.loc[Herschel400_catalogue_final['Messier_number_x'].isnull(), 
                                'Messier_number_x'] = Herschel400_catalogue_final['Messier_number_y']
Herschel400_catalogue_final.loc[Herschel400_catalogue_final['Caldwell_number_x'].isnull(), 
                                'Caldwell_number_x'] = Herschel400_catalogue_final['Caldwell_number_y'] 

drop_list = ['Messier_number_y', 'Caldwell_number_y']
rename_lis = {'Messier_number_x': 'Messier_number',
             'Caldwell_number_x': 'Caldwell_number'}

Herschel400_catalogue_final = Herschel400_catalogue_final.drop(columns=drop_list)
Herschel400_catalogue_final = Herschel400_catalogue_final.rename(columns=rename_lis)
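
The cell above coalesces the `_x` and `_y` columns produced by the two successive merges: the first-pass value is kept, and the second-pass value is used only where the first is null. As a sketch of the same logic on toy data, `Series.fillna` accepts another Series and fills by index.

```python
import pandas as pd

# Toy frame: row 0 matched on the first merge, row 1 on the second, row 2 never.
df = pd.DataFrame({
    "Messier_number_x": ["M20", None, None],
    "Messier_number_y": [None, "M33", None],
})

# Keep _x where present, otherwise fall back to _y.
df["Messier_number"] = df["Messier_number_x"].fillna(df["Messier_number_y"])
df = df.drop(columns=["Messier_number_x", "Messier_number_y"])
print(df)
```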

In [50]:
Herschel400_catalogue_final[~(Herschel400_catalogue_final['Messier_number'].isnull())].shape

(16, 29)

In [51]:
Herschel400_catalogue_final[~(Herschel400_catalogue_final['Caldwell_number'].isnull())].shape

(44, 29)

## 7. Percentage of missing data

In [55]:
percent_missing = messier_catalogue_final.isnull().sum() * 100/ len(messier_catalogue_final)
percent_missing

Messier_number                         0.000000
NGC_IC_designation                     0.000000
object_type                            0.000000
Common_name                           66.363636
Constelation_name_abrv                 0.000000
Constellation_Latin                    0.000000
Constellation_FR                       4.545455
Constellation_EN                       4.545455
Hubble_morphological_type             63.636364
Discoverer                             2.727273
Year                                   3.636364
Season                                 0.000000
Right_Ascension                        0.000000
Declinaison                            0.000000
Distance_light_year                    2.727273
Size                                   0.909091
major_axis                             3.636364
minor_axis                            57.272727
major__axis_position_angle            62.727273
Apparent_magnitude_jbcurtin_gitHub     0.000000
Magnitude                              0

In [56]:
percent_missing = caldwell_catalogue_final.isnull().sum() * 100/ len(caldwell_catalogue_final)
percent_missing

Caldwell_number                0.000000
NGC_IC_designation             0.000000
object_type                    0.000000
Common_names                  61.467890
Constelation_name_abrv         0.000000
Constellation_Latin            0.000000
Hubble_morphological_type     67.889908
Right_Ascension                5.504587
Declinaison                    5.504587
Size                           0.000000
major_axis                     8.256881
minor_axis                    57.798165
major__axis_position_angle    67.889908
B_Apparent_Magnitude          14.678899
V_Apparent_Magnitude          27.522936
J_Apparent_Magnitude          60.550459
H_Apparent_Magnitude          60.550459
K_Apparent_Magnitude          59.633028
Mean_surface_brigthness       67.889908
Center_star_name              88.990826
center_star_U_maghitude       95.412844
center_star_B_maghitude       88.073394
center_star_V_maghitude       89.908257
NGC_number                    95.412844
IC_number                     99.082569


In [57]:
percent_missing = Herschel400_catalogue_final.isnull().sum() * 100/ len(Herschel400_catalogue_final)
percent_missing

NGC_designation                0.00
object_type                    0.00
Common_names                  92.75
Constelation_name_abrv         0.00
Constellation_Latin            0.00
Hubble_morphological_type     43.25
Right_Ascension                0.00
Declinaison                    0.00
major_axis                     2.50
minor_axis                    41.50
major__axis_position_angle    43.25
Magnitude                      0.00
B_Apparent_Magnitude           9.75
V_Apparent_Magnitude          30.25
J_Apparent_Magnitude          37.75
H_Apparent_Magnitude          38.00
K_Apparent_Magnitude          38.00
Mean_surface_brigthness       43.25
Center_star_name              95.00
center_star_U_maghitude       98.25
center_star_B_maghitude       94.75
center_star_V_maghitude       95.25
Messier_number                96.00
Caldwell_number               89.00
NGC_number                    94.25
IC_number                     98.75
Identifiers                    2.50
NED_notes                   
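
The figures in this section are the share of null cells in each column. A minimal sketch of that computation as a reusable function follows. The name `percent_missing_by_column` and the `toy` frame are illustrative assumptions, not part of the notebook.

```python
import numpy as np
import pandas as pd

def percent_missing_by_column(df):
    """Return the percentage of null values per column (hypothetical helper)."""
    # The mean of a boolean mask is the fraction of True values.
    return df.isnull().mean() * 100

toy = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": ["x", "y", "z", "w"],
})
result = percent_missing_by_column(toy).sort_values(ascending=False)
print(result)
```

Dividing by the length of the same frame that is being inspected, as `mean` does implicitly, avoids mixing row counts from two different DataFrames.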

## 8. Creating the final files

In [69]:
reorder = ['Messier_number', 'NGC_IC_designation', 'object_type', 'Common_name', 'Constelation_name_abrv',
'Constellation_Latin', 'Constellation_FR', 'Constellation_EN', 'Hubble_morphological_type', 'Discoverer', 'Year', 
'Season', 'Right_Ascension','Declinaison', 'Distance_light_year', 'Size', 'major_axis', 'minor_axis', 'major__axis_position_angle',
'Apparent_magnitude_jbcurtin_gitHub', 'Magnitude', 'B_Apparent_Magnitude', 'V_Apparent_Magnitude', 'J_Apparent_Magnitude',
'H_Apparent_Magnitude', 'K_Apparent_Magnitude', 'Mean_surface_brigthness',
'Center_star_name', 'center_star_U_maghitude', 'center_star_B_maghitude', 'center_star_V_maghitude',
'Image1', 'Image2', 'image_jbcurtin_gitHub', 'NGC_number', 'Identifiers', 'NED_notes', 'OpenNGC_notes']

messier_catalogue_final = messier_catalogue_final[reorder]

# Replace missing values with a dash before export
messier_catalogue_final = messier_catalogue_final.fillna('-')

messier_catalogue_final.to_csv("Final_messier_catalogue.csv", sep=";")
#messier_catalogue_final.to_csv(os.path.join(CHEMIN_OUTPUT, "Final_catalogue.csv"), sep=";")

In [67]:
reorder = ['Caldwell_number', 'NGC_IC_designation', 'object_type', 'Common_names', 'Constelation_name_abrv', 
           'Constellation_Latin', 'Hubble_morphological_type', 'Right_Ascension', 'Declinaison', 'Size', 'major_axis', 
           'minor_axis', 'major__axis_position_angle', #'Magnitude', 
          'B_Apparent_Magnitude', 'V_Apparent_Magnitude', 'J_Apparent_Magnitude',
          'H_Apparent_Magnitude', 'K_Apparent_Magnitude', 'Mean_surface_brigthness', 'Center_star_name', 'center_star_U_maghitude',
          'center_star_B_maghitude', 'center_star_V_maghitude', 'NGC_number', 'IC_number', 'Identifiers', 
           'NED_notes', 'OpenNGC_notes']

caldwell_catalogue_final = caldwell_catalogue_final[reorder]

# Replace missing values with a dash before export
caldwell_catalogue_final = caldwell_catalogue_final.fillna('-')

caldwell_catalogue_final.to_csv("Final_caldwell_catalogue.csv", sep=";")

In [70]:
reorder = ['NGC_designation', 'object_type', 'Common_names', 'Constelation_name_abrv', 'Constellation_Latin', 'Hubble_morphological_type',
          'Right_Ascension', 'Declinaison', 'major_axis', 'minor_axis', 'major__axis_position_angle', 'Magnitude',
           'B_Apparent_Magnitude',
          'V_Apparent_Magnitude', 'J_Apparent_Magnitude', 'H_Apparent_Magnitude', 'K_Apparent_Magnitude', 
          'Mean_surface_brigthness', 'Center_star_name', 'center_star_U_maghitude', 'center_star_B_maghitude',
          'center_star_V_maghitude', 'Messier_number', 'Caldwell_number', 'NGC_number', 'IC_number', 'Identifiers', 'NED_notes', 'OpenNGC_notes']

Herschel400_catalogue_final = Herschel400_catalogue_final[reorder]

# Replace missing values with a dash before export
Herschel400_catalogue_final = Herschel400_catalogue_final.fillna('-')

Herschel400_catalogue_final.to_csv("Herschel400_catalogue_final.csv", sep=";")
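
The three export cells in this section follow the same pattern: reorder the columns, replace missing values with a dash, and write a semicolon-separated file. A sketch of that pattern as a single function is shown below. `export_catalogue` is a hypothetical name, the frame is toy data, and an in-memory buffer stands in for the output file.

```python
import io

import numpy as np
import pandas as pd

def export_catalogue(df, columns, target):
    """Reorder columns, fill gaps with '-', write a ';'-separated CSV (hypothetical helper)."""
    ordered = df[columns].fillna("-")
    # The index is written as the first column, as in the cells above.
    ordered.to_csv(target, sep=";")
    return ordered

# Toy frame with one missing magnitude.
toy = pd.DataFrame({
    "Magnitude": [10.4, np.nan],
    "NGC_designation": ["NGC0157", "NGC0136"],
})

buffer = io.StringIO()
export_catalogue(toy, ["NGC_designation", "Magnitude"], buffer)
print(buffer.getvalue())
```

Passing `sep` by keyword keeps the call valid on recent pandas versions, where positional arguments after the path are deprecated.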