# Notebook 1: Introduction

Welcome! This notebook explains the problem you will be working on, so that you are well prepared for the hackathon.

---

## Problem Description

### Introduction

Nanosatellites are increasingly used as low-cost dedicated sensing systems for astronomical data. Kyushu Institute of Technology (Kyutech) and collaborators have launched a joint venture for a nanosatellite mission, Visible Extragalactic background RadiaTion Exploration by CubeSat [(**VERTECS**)](https://www.spiedigitallibrary.org/conference-proceedings-of-spie/13092/130920W/Astronomical-6U-CubeSat-mission-VERTECS--scientific-objective-and-project/10.1117/12.3014708.short). The primary purpose is to gain information about the formation history of stars by observing the optical-wavelength extragalactic background light (EBL). The **VERTECS** satellite will be equipped with a small-aperture telescope and a high-precision attitude control system to capture the astronomical data for analysis on the ground.

<figure style="text-align:center;">
    <img src="pictures/SAT.png" class="center-image" style="width:900px; height:400px;">
    <figcaption>Figure 1: VERTECS Satellite Design</figcaption>
</figure>

However, nanosatellites like VERTECS face challenges due to their size and weight constraints. These limitations affect computational power, onboard storage, and communication capabilities, leading to slower data transmission. As a result, they struggle to send large volumes of data quickly, which can impact the efficiency of missions that rely on fast or continuous data updates.

To work around the slow data transmission of CubeSats, we can run machine learning techniques directly on the satellite. This allows the CubeSat to select **`priority`** data before sending it to the ground, so that the limited downlink is spent on the most valuable data.
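As a rough illustration of this idea, the sketch below ranks images by a classifier's confidence that they are priority data and keeps as many as fit within a downlink budget. The function name `select_for_downlink`, the probability scores, the image sizes, and the budget are all hypothetical; none of them come from the VERTECS mission or the paper.

```python
import numpy as np

def select_for_downlink(priority_prob, size_kb, budget_kb):
    """Greedily pick the images with the highest priority probability
    that still fit in the remaining downlink budget (hypothetical helper).

    Returns the selected image indices (in selection order) and the
    total size used, in kilobytes.
    """
    priority_prob = np.asarray(priority_prob, dtype=float)
    size_kb = np.asarray(size_kb, dtype=float)

    # Sort indices from most to least likely to be priority data.
    order = np.argsort(-priority_prob, kind="stable")

    selected = []
    used = 0.0
    for idx in order:
        # Skip any image that would exceed the budget.
        if used + size_kb[idx] <= budget_kb:
            selected.append(int(idx))
            used += float(size_kb[idx])
    return selected, used

# Made-up scores and sizes, purely for illustration.
probs = [0.10, 0.95, 0.60, 0.85]
sizes = [300.0, 300.0, 300.0, 300.0]
chosen, used_kb = select_for_downlink(probs, sizes, budget_kb=700.0)
print(chosen, used_kb)
```

A greedy ranking like this is only one possible policy; it is shown here because it is simple enough to run on constrained hardware.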

### The Goal of This Hackathon

Develop a machine learning model that accurately classifies data captured by CubeSats. The goal is to prioritize the images that are most valuable for transmission back to Earth, given the limited onboard resources and slow data downlink speeds. Your task is to create a lightweight model that improves on the efficiency and/or classification accuracy of the existing solution in this [paper](https://arxiv.org/pdf/2408.14865).
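To make "efficiency and/or classification accuracy" concrete, here is a minimal sketch of two numbers you might track for any candidate model: its accuracy on held-out labels and an approximate memory footprint derived from its parameter count. The helper names, the example labels, the parameter count, and the assumption of 4 bytes per parameter (float32 weights) are all illustrative and are not taken from the paper.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def model_size_kb(n_params, bytes_per_param=4):
    """Approximate model size in KB (1 KB = 1024 bytes).

    bytes_per_param=4 assumes float32 weights; this is an assumption.
    """
    return n_params * bytes_per_param / 1024

# Made-up labels and parameter count, purely for illustration.
y_true = [0, 1, 2, 2, 1, 0, 3, 3]
y_pred = [0, 1, 2, 1, 1, 0, 3, 2]

acc = accuracy(y_true, y_pred)
size = model_size_kb(50_000)
print(f"accuracy = {acc:.2f}, approx. size = {size:.1f} KB")
```

Reporting both numbers side by side makes it easier to argue that a smaller model is worth a small drop in accuracy, or the reverse.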

---

## Prizes

### A: Best Team per Regional Hackathon:

Every team will present its work at its local event in front of local organizers and peers. 🎤 Each team will have **6 minutes** to present. The winning team from each regional hackathon will receive prizes worth approximately 3,000 ZAR 🏆. Teams will be judged based on the following criteria:

1. **Clarity of Problem Description**
2. **Novelty**
3. **Use of Graphics**
4. **Presentation Skills**
5. **Team Collaboration**

For more detailed guidelines on how the judges will evaluate you, please visit the following [page](https://docs.google.com/document/d/1CUyLNZtQ8htKBXVJ-zhDiQpBoC8QY2eMx3aZMd1NhIo/edit).

**Note**: All team members must participate in the presentation, sharing different parts of it among themselves. For example, one member can describe the problem, another can discuss the solution, and so on.

### B: Best Overall Pipeline Across All Regional Hackathons:

Every team from **each regional hackathon** has the opportunity to compete for this prize. To participate, all teams must submit the following to a shared folder managed by their local organizers:

1. **Slides Presentation:** The slides used during their regional hackathon presentation.
2. **Notebook:** Submit one clear, comprehensive notebook containing all the steps needed to reach the final results, written so that anyone can run the code successfully. **The trained model** must also be included in the submission. 🛑 Submitting more than one notebook or pipeline will result in **disqualification**.
3. **Written Description:** Teams must submit a 200-350 word description, with relevant graphs if necessary, covering the following:
    1. **Preprocessing & ML Methods:** Summarize the preprocessing and machine learning techniques used, and justify the pipeline choice (through experimentation or literature review).
    2. **Results:** Highlight key outcomes and explain how they compare with previous work, emphasizing the novelty of your approach.
       
**To assist you in structuring your description, we’ve provided a [template](https://docs.google.com/document/d/1M5qOSSBSFUYCXGsg7KwuWY8gYlInKFP_PJHYLKjibRk/edit?tab=t.0#heading=h.gzmom18r0hd0) that you can use.**

A committee of experienced ML academics will evaluate each team’s pipeline based on the clarity of its slides, notebook, and write-up, as well as the novelty of its approach. All teams are highly encouraged to apply 🙌. Note that we will use a separate test set, so your results may differ from those you obtained on your validation set.

🎉 **Prize:** The winning team will be featured as co-authors on an academic paper written by Hack4dev and will receive an additional prize of **10,000 ZAR**.



⚠️ **Important**: The notebook, trained model, and slides must be submitted before the presentation begins on the final day of the hackathon. The write-up can be submitted within a week following the conclusion of your regional hackathon. 🛑 Ensure all submissions are on time to avoid disqualification. All documents must be submitted in English.

Good luck! 🍀

---

## Structure for the Upcoming Notebooks

- Notebook 2: **Data Reading** – This notebook handles reading and visualizing the data.
- Notebook 3: **ML** – Here we train a classical ML model and evaluate it against the model in Notebook 4.
- Notebook 4: **CubeSatNet_CNN** – Trains the model provided by the authors of this [paper](https://arxiv.org/pdf/2408.14865) and uses it as a baseline for comparison with the model in Notebook 3. Only run it if you’re adopting or experimenting with this model, as it can take up to two hours to train on the original data.
- Notebook 5: **Evaluation**
