JadeWibbels/MLFall2018

MLFall2018 - Kaggle Competition PLAsTiCC - Astronomy

https://www.kaggle.com/c/PLAsTiCC-2018

Objective

The Photometric LSST Astronomical Time-Series Classification Challenge (PLAsTiCC) asks Kagglers to help prepare to classify the data from this new survey. Competitors will classify astronomical sources that vary with time into different classes, scaling from a small training set to a very large test set of the type the LSST will discover.

https://arxiv.org/abs/1810.00001

Plan

  1. Start an AWS server for processing and import the data
  2. Use a naive neural network to establish a baseline
  3. Iterate on feature engineering:

Good resources for feature engineering ideas:

  • https://machinelearningmastery.com/an-introduction-to-feature-selection/
  • https://towardsdatascience.com/why-how-and-when-to-apply-feature-selection-e9c69adfabf2

  • Brainstorming or testing features
  • Deciding what features to create
  • Creating features
  • Checking how the features work with your model
  • Improving your features if needed
  • Going back to brainstorming and creating more features until the work is done

Feature engineering implemented:

  • One-hot encode passband values
  • Check the baseline with and without distmod (Alex similarity check)
  • Test for feature significance: F-test, mutual information, extra trees, variance threshold
  4. Iterate on neural network hyperparameters
  • (initial)
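The feature engineering steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code used in this repository: it assumes passbands are coded as integers 0 to 5, uses scikit-learn for the four significance tests, and runs on synthetic data. The helper names `one_hot_passband` and `score_features`, the variance threshold of 1e-3, and the feature names are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import VarianceThreshold, f_classif, mutual_info_classif

N_PASSBANDS = 6  # assumption: passbands are coded 0..5 (u, g, r, i, z, y)


def one_hot_passband(passband):
    """Return an (n_rows, 6) one-hot matrix for integer passband codes."""
    passband = np.asarray(passband, dtype=int)
    return np.eye(N_PASSBANDS, dtype=np.float32)[passband]


def score_features(X, y, names, random_state=0):
    """Score each column of X against labels y with four common tests."""
    f_stat, _ = f_classif(X, y)  # one-way ANOVA F-test per feature
    mi = mutual_info_classif(X, y, random_state=random_state)
    trees = ExtraTreesClassifier(n_estimators=100, random_state=random_state).fit(X, y)
    # Hypothetical threshold: flag features whose variance is below 1e-3.
    keep = VarianceThreshold(threshold=1e-3).fit(X).get_support()
    return {
        name: {
            "f_stat": float(f_stat[i]),
            "mutual_info": float(mi[i]),
            "tree_importance": float(trees.feature_importances_[i]),
            "passes_variance": bool(keep[i]),
        }
        for i, name in enumerate(names)
    }


# Synthetic stand-in data (NOT PLAsTiCC data): one informative feature,
# one pure-noise feature, and one nearly constant feature.
rng = np.random.default_rng(0)
n = 300
y = rng.integers(0, 3, size=n)
X = np.column_stack([
    y + rng.normal(0.0, 0.3, size=n),    # informative
    rng.normal(0.0, 1.0, size=n),        # noise
    5.0 + rng.normal(0.0, 0.01, size=n), # near constant
])
scores = score_features(X, y, ["informative", "noise", "near_constant"])

for name, s in scores.items():
    print(name, s)
```

In practice the one-hot columns would be joined to the per-observation rows before aggregating to one row per object, and the scoring would be run on the real engineered features (for example with and without distmod) rather than on synthetic columns.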

Results

Submissions are evaluated using a weighted multi-class logarithmic loss. The overall effect is such that each class is roughly equally important for the final score.

Each object has been labeled with one type. For each object, you must submit a set of predicted probabilities (one for every category).
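For local validation, a metric of this kind can be approximated as below. This is a hedged sketch, assuming the loss averages the log-probability of the true class within each class first and then takes a weighted average across classes, which is what makes each class count roughly equally. The function name, the default weights of 1.0, and the use of 0-based class indices are assumptions for illustration; the official class weights are not given here.

```python
import numpy as np


def weighted_multiclass_log_loss(y_true, probs, class_weights=None, eps=1e-15):
    """Class-balanced log loss.

    y_true: integer class indices in 0..K-1 (assumption: labels already remapped).
    probs:  (n_objects, K) array of predicted probabilities, one column per class.
    """
    y_true = np.asarray(y_true, dtype=int)
    probs = np.asarray(probs, dtype=float)
    n_classes = probs.shape[1]
    if class_weights is None:
        class_weights = np.ones(n_classes)  # assumption: equal weights
    class_weights = np.asarray(class_weights, dtype=float)

    # Clip to avoid log(0), then renormalize each row to sum to 1.
    probs = np.clip(probs, eps, 1.0 - eps)
    probs = probs / probs.sum(axis=1, keepdims=True)

    # Log-probability assigned to the true class of each object.
    log_p_true = np.log(probs[np.arange(len(y_true)), y_true])

    # Mean log-probability per class, so large classes do not dominate.
    counts = np.bincount(y_true, minlength=n_classes)
    sums = np.bincount(y_true, weights=log_p_true, minlength=n_classes)
    present = counts > 0  # classes absent from y_true are skipped
    per_class = sums[present] / counts[present]

    w = class_weights[present]
    return float(-np.sum(w * per_class) / np.sum(w))


# Uniform predictions over 3 classes give ln(3) regardless of class imbalance.
y_demo = [0, 0, 0, 1, 2]
uniform = np.full((5, 3), 1.0 / 3.0)
print(weighted_multiclass_log_loss(y_demo, uniform))
```

Because the per-class means are taken before weighting, a rare class that is predicted badly raises the loss as much as a common class would, which matches the "roughly equally important" behaviour described above.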
