The High Time Resolution Universe (HTRU) Survey was conducted to search for pulsars and fast transients using the Parkes Telescope in Australia. The majority of the pulsar candidate detections were in fact false positives caused by radio frequency interference (RFI) and noise. We used modern machine learning techniques to evaluate feature importance and to compare the performance of different approaches for building a binary classifier that automatically labels real pulsar candidates. We addressed the class imbalance with the Synthetic Minority Oversampling Technique (SMOTE) and tuned the hyperparameters of our models to maximize accuracy and the geometric mean (G-mean).
The dataset contains 17,898 examples and 8 features.
- Standard Scaler
- Stratified train-test split
- Oversampling using SMOTE
- Decision Tree
- SVM
- XGBoost
- Neural Networks
- Calculating feature importance
- K-Means
- Agglomerative Clustering
- Confusion Matrix
- F-Score
- G-Mean
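
The SMOTE oversampling step listed above can be illustrated with a minimal, standard-library-only sketch. It is not the code used in this project (a library implementation is the more likely choice in practice); the function name `smote`, its parameters, and the toy data are assumptions made for illustration. The idea is to create each synthetic minority example by interpolating between a real minority example and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    Simplified variant: the base sample is drawn at random for every
    synthetic point instead of cycling through all minority samples.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base` (itself excluded).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class: the four corners of the unit square (hypothetical data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_synthetic=6, k=2)
```

Oversampling of this kind is normally applied to the training split only, after the stratified train-test split, so that the test set keeps the original class ratio.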
Note: Hyperparameters are tuned to maximize accuracy and the G-mean.
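
As a rough illustration of tuning against the G-mean, the sketch below runs an exhaustive grid search and keeps the parameter combination with the highest G-mean on a validation set. It assumes the common definition of G-mean as the geometric mean of sensitivity and specificity. The helper names (`g_mean`, `grid_search`, `threshold_model`) and the toy data are hypothetical stand-ins, not the project's models or search procedure.

```python
from itertools import product
from math import sqrt


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sqrt(sensitivity * specificity)


def grid_search(param_grid, fit_predict, X_train, y_train, X_val, y_val):
    """Try every combination in param_grid and return (best_params, best_score)."""
    keys = list(param_grid)
    best_params, best_score = None, -1.0
    for values in product(*(param_grid[key] for key in keys)):
        params = dict(zip(keys, values))
        y_pred = fit_predict(params, X_train, y_train, X_val)
        score = g_mean(y_val, y_pred)
        if score > best_score:  # ties keep the earlier combination
            best_params, best_score = params, score
    return best_params, best_score


def threshold_model(params, X_train, y_train, X_val):
    """Hypothetical stand-in model: predicts 1 when the feature reaches a threshold."""
    return [1 if x >= params["threshold"] else 0 for x in X_val]


# Toy one-feature validation data (hypothetical).
X_val = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
y_val = [0, 0, 0, 1, 1, 1]
best_params, best_score = grid_search(
    {"threshold": [0.0, 0.5, 1.0]}, threshold_model, [], [], X_val, y_val
)
```

Selecting on G-mean rather than accuracy alone penalizes a model that labels every candidate as the majority class, since a sensitivity of zero drives the G-mean to zero.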