# MALLORN Astronomical Classification Challenge

## Overview
The **Many Artificial LSST Lightcurves based on Observations of Real Nuclear transients (MALLORN) Classifier Challenge** invites participants to develop machine learning algorithms that identify **Tidal Disruption Events (TDEs)** from photometric lightcurves.


## Objective
* **Goal**: Detect TDEs (stars being torn apart by supermassive black holes) within a simulated LSST dataset.
* **Significance**: TDEs are rare (~100 known) but valuable for studying black holes.
* **Data**: Simulated lightcurves based on real observations from the Zwicky Transient Facility (ZTF).

## Evaluation
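* **Metric**: **F1 Score**, the harmonic mean of precision and recall for the TDE class.
* **Reasoning**: TDEs are significantly rarer than other transients, so the dataset is highly imbalanced. A classifier that always predicts `0` would reach high accuracy while finding no TDEs; F1 only rewards correctly identified TDEs.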
* **Target**: Binary classification:
    * `1`: TDE
    * `0`: Not TDE
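
To make the reasoning above concrete, here is a minimal sketch of a binary F1 computation on a toy, imbalanced label set. The helper `f1_score` is written here for illustration (it is not the competition's scoring code), the counts of 10 TDEs among 1000 objects are made up, and returning 0 when there are no true positives is an assumed convention.

```python
def f1_score(y_true, y_pred):
    """Binary F1 for labels in {0, 1}, treating 1 (TDE) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # Assumed convention: no true positives means an F1 of 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (illustrative numbers only): 10 TDEs among 1000 objects.
y_true = [1] * 10 + [0] * 990

# Baseline that never predicts a TDE.
all_negative = [0] * 1000
accuracy = sum(t == p for t, p in zip(y_true, all_negative)) / len(y_true)

# Hypothetical classifier: recovers 6 of the 10 TDEs, with 4 false alarms.
some_hits = [1] * 6 + [0] * 4 + [1] * 4 + [0] * 986

print(f"always-0 accuracy: {accuracy:.2f}")
print(f"always-0 F1:       {f1_score(y_true, all_negative):.2f}")
print(f"6-of-10 F1:        {f1_score(y_true, some_hits):.2f}")
```

The always-`0` baseline is right on 99% of objects yet scores an F1 of zero, while the imperfect classifier that actually finds TDEs scores well above it.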

## Submission Format
The submission file must be a CSV containing two columns: `object_id` and `prediction`.

| object_id | prediction |
| :--- | :--- |
| Eluwaith_Mithrim_nothrim | 0 |
| Eru_heledir_archam | 0 |
| ... | ... |
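
As a rough sketch of producing a file in this format, the snippet below writes the two-column CSV with Python's standard `csv` module. The helper name `write_submission` and the in-memory buffer are illustrative assumptions, and the two object IDs are simply the rows from the table above; in practice the predictions would come from a trained model and the handle would be an open file.

```python
import csv
import io


def write_submission(predictions, handle):
    """Write a {object_id: 0 or 1} mapping as CSV with the required header."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["object_id", "prediction"])
    for object_id, label in predictions.items():
        if label not in (0, 1):
            # Reject anything that is not a binary label.
            raise ValueError(f"{object_id}: prediction must be 0 or 1, got {label!r}")
        writer.writerow([object_id, int(label)])


# Placeholder predictions using the example IDs from the table above.
predictions = {
    "Eluwaith_Mithrim_nothrim": 0,
    "Eru_heledir_archam": 0,
}

buffer = io.StringIO()  # swap for open("submission.csv", "w", newline="") to write a file
write_submission(predictions, buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

Validating labels at write time catches probabilities or string labels before they end up in the file.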

## 1. Setup

Load libraries

In [2]:
import os
import zipfile
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import sncosmo

# Consistent plotting style: white grid background, 12x6 inch default figures
sns.set_theme(style="whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)

Download datasets

In [11]:
COMPETITION_NAME = 'mallorn-astronomical-classification-challenge'
RAW_DIR = '../data/raw'

# Download and extract the competition data only if it is not already present.
# The presence of train_log.csv is used as the marker of a previous extraction.
if not os.path.exists(os.path.join(RAW_DIR, 'train_log.csv')):
    print(f"Downloading {COMPETITION_NAME}...")
    # IPython shell escape; requires the Kaggle CLI and configured API credentials
    !kaggle competitions download -c {COMPETITION_NAME} -p {RAW_DIR}

    print("Unzipping data...")
    with zipfile.ZipFile(os.path.join(RAW_DIR, f"{COMPETITION_NAME}.zip"), 'r') as zip_ref:
        zip_ref.extractall(RAW_DIR)
    print("Download and Extraction Complete!")
else:
    print("Data already exists. Skipping download.")

Data already exists. Skipping download.
