The dataset can be found at https://www.kaggle.com/datasets/rainmaker29/sdss-images
To set up the environment, install the libraries listed in libraries_installed.txt using conda.
To reproduce the results, run the following notebooks:
- Data Preprocessing : .\Numeric data modelling\Testing_Normal_distribution_features.ipynb
- Traditional models on numeric data : .\Numeric data modelling\Traditional ml algorthims.ipynb
- Fetch images from tabular data (optional, since the shared dataset already includes the images fetched from the SDSS database) : .\Image modelling\Fetch images.ipynb
- SVM : .\Image modelling\SVM.ipynb
- CNN : .\Image modelling\CNN.ipynb
- DNN : .\Image modelling\DNN.ipynb
- ViT : .\Image modelling\ViT Train and Eval.ipynb
- Generate quasar augmentations : .\Imbalanced techniques\Augmentation\image_augmentation.ipynb
- CNN on augmented data : .\Imbalanced techniques\Augmentation\CNN_Augmentations.ipynb
- Transfer learning on all data : .\Imbalanced techniques\Transfer Learning\All data\all_data_DNN.ipynb
- Transfer learning on balanced data : .\Imbalanced techniques\Transfer Learning\Balanced data\post_DNN.ipynb
- SimCLR Pretraining : .\Imbalanced techniques\Pretraining\SimCLR Train.ipynb
- SimCLR + other classifiers (training and evaluation) : .\Imbalanced techniques\Pretraining\SimCLR Validation.ipynb
- BYOL Pretraining : .\Imbalanced techniques\Pretraining\BYOL Train.ipynb
- BYOL + other classifiers (training and evaluation) : .\Imbalanced techniques\Pretraining\BYOL Validation.ipynb
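
Before running anything, it can help to confirm that the notebooks listed above are present in your checkout. The following is a minimal sketch, not part of the repository: the `NOTEBOOKS` list and the `missing_notebooks` helper are hypothetical names introduced here, and the script assumes it is run from the repository root with the folder layout shown in the list above.

```python
from pathlib import Path

# Notebook paths copied from the list above, split into parts so the
# check works with both Windows and POSIX path separators.
# (NOTEBOOKS and missing_notebooks are illustrative names, not repo code.)
NOTEBOOKS = [
    ("Numeric data modelling", "Testing_Normal_distribution_features.ipynb"),
    ("Numeric data modelling", "Traditional ml algorthims.ipynb"),
    ("Image modelling", "Fetch images.ipynb"),
    ("Image modelling", "SVM.ipynb"),
    ("Image modelling", "CNN.ipynb"),
    ("Image modelling", "DNN.ipynb"),
    ("Image modelling", "ViT Train and Eval.ipynb"),
    ("Imbalanced techniques", "Augmentation", "image_augmentation.ipynb"),
    ("Imbalanced techniques", "Augmentation", "CNN_Augmentations.ipynb"),
    ("Imbalanced techniques", "Transfer Learning", "All data", "all_data_DNN.ipynb"),
    ("Imbalanced techniques", "Transfer Learning", "Balanced data", "post_DNN.ipynb"),
    ("Imbalanced techniques", "Pretraining", "SimCLR Train.ipynb"),
    ("Imbalanced techniques", "Pretraining", "SimCLR Validation.ipynb"),
    ("Imbalanced techniques", "Pretraining", "BYOL Train.ipynb"),
    ("Imbalanced techniques", "Pretraining", "BYOL Validation.ipynb"),
]


def missing_notebooks(repo_root="."):
    """Return the listed notebooks that do not exist under repo_root."""
    root = Path(repo_root)
    return [
        Path(*parts)
        for parts in NOTEBOOKS
        if not (root / Path(*parts)).is_file()
    ]


if __name__ == "__main__":
    missing = missing_notebooks()
    if missing:
        print("Missing notebooks:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All listed notebooks were found.")
```

The script only checks that the files exist; it does not execute any notebook.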
We also provide Colab/Kaggle versions of most of the notebooks above to make reproducing the results easier:
- Fetch images from tabular data (optional, since the shared dataset already includes the images fetched from the SDSS database) : https://www.kaggle.com/code/rainmaker29/sloan-dataset-code-amaan/notebook
- CNN : https://colab.research.google.com/drive/1rSXBMaxN1MXrGxNMDwEnVcipveR0iUa3?usp=sharing
- ViT : https://www.kaggle.com/rainmaker29/vision-transformer-vit-tutorial-baseline
- Generate quasar augmentations : https://colab.research.google.com/drive/19Mhhhs0O73tVKQHSW_anubQ0GXepfRIa?usp=sharing
- CNN on augmented data : https://colab.research.google.com/drive/1f7uU4YYBOJuAw66hR_r4fbSpV5L2PxtD?usp=sharing
- SimCLR Pretraining : https://www.kaggle.com/rainmaker29/simclr-ml701
- SimCLR + other classifiers (training and evaluation) : https://www.kaggle.com/code/rainmaker29/simclr-post-training-ml701-byol
- BYOL Pretraining : https://www.kaggle.com/rainmaker29/contrastive-learning-using-byol
- BYOL + other classifiers (training and evaluation) : https://www.kaggle.com/code/mohammadamaansayeed/contrastive-learning-using-byol