Machine Learning used on SDSS Data to Classify Stars, Galaxies and Quasars

WDoyle123/AstroClassifierML

AstroClassifierML

Overview

AstroClassifierML is a machine learning project that classifies astronomical objects. It uses data from the Sloan Digital Sky Survey (SDSS) to train a deep neural network and a support vector machine that label each object as a Galaxy, Star, or Quasar.

Technical Features

  • TensorFlow and Scikit-Learn for Machine Learning: This project employs TensorFlow for building and training neural network models and Scikit-Learn for SVM model implementation and data preprocessing.
  • Pandas for Data Handling: Uses Pandas to load and manipulate the SDSS data.
  • Data Preprocessing: Applies feature scaling and outlier removal, and handles imbalanced classes using SMOTE.
  • Model Optimisation and Evaluation: Trains the neural network with the Adam optimiser and evaluates models by classification accuracy.
  • Regularisation Techniques: Applies dropout in neural networks to prevent overfitting and ensure model generalisability.
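
The preprocessing steps listed above (outlier removal, feature scaling, and SMOTE-style oversampling) can be illustrated with a minimal NumPy sketch. This is not the code in data_handler.py: the function names, the z-score threshold, the neighbour count, and the synthetic data are all assumptions made for illustration, and the oversampler is a simplified from-scratch version of the SMOTE idea rather than a library implementation.

```python
import numpy as np


def remove_outliers(X, y, z_max=3.0):
    """Drop rows where any feature is more than z_max standard deviations from its mean."""
    z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
    keep = (z <= z_max).all(axis=1)
    return X[keep], y[keep]


def standardise(X):
    """Scale every feature to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def smote_like_oversample(X, y, k=5, seed=0):
    """Balance classes by interpolating between minority samples and their nearest neighbours."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for cls, count in zip(classes, counts):
        n_new = target - count
        X_cls = X[y == cls]
        if n_new == 0 or len(X_cls) < 2:
            continue
        # Pairwise distances inside the class; a sample is never its own neighbour.
        dists = np.linalg.norm(X_cls[:, None, :] - X_cls[None, :, :], axis=2)
        np.fill_diagonal(dists, np.inf)
        kk = min(k, len(X_cls) - 1)
        neighbours = np.argsort(dists, axis=1)[:, :kk]
        # Pick a random base sample and one of its neighbours for each synthetic point.
        base = rng.integers(0, len(X_cls), size=n_new)
        picked = neighbours[base, rng.integers(0, kk, size=n_new)]
        gap = rng.random((n_new, 1))
        synthetic = X_cls[base] + gap * (X_cls[picked] - X_cls[base])
        X_parts.append(synthetic)
        y_parts.append(np.full(n_new, cls, dtype=y.dtype))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Synthetic stand-in for the SDSS features (assumption: 4 numeric columns, 3 imbalanced classes).
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = np.array(["GALAXY"] * 200 + ["STAR"] * 70 + ["QSO"] * 30)
X[0] = 50.0  # one obvious outlier

X_clean, y_clean = remove_outliers(X, y)
X_scaled = standardise(X_clean)
X_bal, y_bal = smote_like_oversample(X_scaled, y_clean)
```

In a real workflow the oversampling step would normally be applied to the training split only, so that synthetic points never leak into the evaluation data.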

Machine Learning Models

  • Deep Neural Network (DNN) Model:
    • Built from dense layers, with dropout for regularisation.
    • Uses early stopping to halt training once the monitored metric stops improving.
  • Support Vector Machine (SVM) Model:
    • Implemented using Scikit-Learn's svm.SVC with an RBF kernel.
    • Optimised with hyperparameter tuning.
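
As a rough illustration of the SVM bullets above, the sketch below tunes an RBF-kernel `svm.SVC` with a cross-validated grid search. The README does not say which tuning method or parameter ranges the project uses, so `GridSearchCV`, the grid values, the pipeline, and the synthetic data are assumptions, not the contents of models.py.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic three-class data standing in for the SDSS features (assumption).
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each cross-validation fold is scaled independently.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Hypothetical search grid over the two main RBF hyperparameters.
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

`C` controls how strongly misclassified training points are penalised and `gamma` sets the width of the RBF kernel, which is why these two are the usual starting point for tuning.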

Files Description

  • main.py: Main script for executing the machine learning workflow, including data preprocessing, model training, and evaluation.
  • data_handler.py: Script for data extraction and preprocessing. Handles tasks like data cleaning and feature engineering.
  • models.py: Contains the implementation of the DNN and SVM models.
  • plotter.py: Handles all the plots generated from the models and data.

Output

The data contains a range of characteristics of Stars, Galaxies and Quasars, which can be visualised here: plot

The script outputs the accuracy of the DNN model and generates a plot showing the training and validation accuracy of the model over epochs.

plot
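
The training and validation accuracy curves come from the history that Keras records during `fit`. The sketch below shows one way a dense network with dropout, the Adam optimiser, and early stopping could produce them. Layer sizes, dropout rate, learning rate, patience, the monitored metric, and the synthetic data are all assumptions, not values taken from models.py.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data (assumption): 6 features, 3 classes derived from the first two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 6)).astype("float32")
y = np.digitize(X[:, 0] + X[:, 1], [-0.7, 0.7])  # labels 0, 1, 2

# Dense layers with dropout between them; sizes and rates are illustrative only.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(6,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(3, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Stop when validation loss has not improved for 3 epochs and keep the best weights.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)

history = model.fit(
    X, y,
    validation_split=0.2,
    epochs=20,
    batch_size=32,
    callbacks=[early_stopping],
    verbose=0,
)

# One value per completed epoch; these two lists are what an accuracy plot would show.
train_acc = history.history["accuracy"]
val_acc = history.history["val_accuracy"]
```

A widening gap between `train_acc` and `val_acc` is the usual sign of overfitting, which dropout and early stopping are both meant to limit.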

The SVM model's classification errors can be broken down by category, as shown here: plot
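
One simple way to break errors down by category is to count each (true class, predicted class) pair and keep the pairs that disagree. The sketch below uses only the standard library. The helper names, the class labels, and the toy predictions are assumptions for illustration and are not results from this project.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true class, predicted class) pair occurs."""
    return Counter(zip(y_true, y_pred))


def misclassifications(counts):
    """Return only the disagreeing pairs, most frequent first."""
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return sorted(errors.items(), key=lambda item: item[1], reverse=True)


# Toy labels, not real model output.
y_true = ["GALAXY", "GALAXY", "STAR", "QSO", "QSO", "STAR", "GALAXY", "QSO"]
y_pred = ["GALAXY", "QSO", "STAR", "QSO", "GALAXY", "STAR", "GALAXY", "GALAXY"]

counts = confusion_counts(y_true, y_pred)
errors = misclassifications(counts)

for (true_cls, pred_cls), n in errors:
    print(f"{true_cls} predicted as {pred_cls}: {n}")
```

scikit-learn's `confusion_matrix` yields the same counts arranged as a matrix, which is convenient for plotting as a heatmap.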

Acknowledgements

Sloan Digital Sky Survey
