# SpaceNet Dataset Overview

The dataset chosen for this task is SpaceNet, described below.

## Description
SpaceNet, created through a novel double-stage augmentation framework (FLARE), is a hierarchically structured, high-quality astronomical image dataset. The dataset is tailored for both fine-grained and macro classification tasks, enabling enhanced generalization on various recognition tasks. With approximately **12,900 samples**, SpaceNet integrates lower-resolution (LR) to higher-resolution (HR) conversions, standard augmentations, and a diffusion approach for synthetic sample generation. 

**Reference:** [FLARE Paper](https://arxiv.org/pdf/2405.13267)

## Dataset Structure

- **Fine-Grained Classes**: 8 distinct classes, covering a range of astronomical objects:
    - Planets
    - Galaxies
    - Asteroids
    - Nebulae
    - Comets
    - Black Holes
    - Stars
    - Constellations

## Dataset Composition

- **Total Samples**: Approximately 12,900 images
- **Class Distribution**:
    - **Asteroid**: 283 files
    - **Black Hole**: 656 files
    - **Comet**: 416 files
    - **Constellation**: 1,552 files
    - **Galaxy**: 3,984 files
    - **Nebula**: 1,192 files
    - **Planet**: 1,472 files
    - **Star**: 3,269 files
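
The listed counts sum to 12,824 images, and the largest class (Galaxy) is roughly 14 times the size of the smallest (Asteroid). The sketch below works through that arithmetic. The inverse-frequency class weights are one common way to compensate for such imbalance; they are an illustrative assumption here, not something the dataset release prescribes.

```python
# Per-class file counts copied from the Class Distribution list above.
class_counts = {
    "Asteroid": 283,
    "Black Hole": 656,
    "Comet": 416,
    "Constellation": 1552,
    "Galaxy": 3984,
    "Nebula": 1192,
    "Planet": 1472,
    "Star": 3269,
}

total = sum(class_counts.values())
num_classes = len(class_counts)

# Share of the dataset held by each class.
class_share = {name: count / total for name, count in class_counts.items()}

# Ratio between the largest and the smallest class.
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())

# Inverse-frequency ("balanced") weights: total / (num_classes * count).
# Illustrative assumption, not part of the SpaceNet release.
class_weights = {
    name: total / (num_classes * count) for name, count in class_counts.items()
}

# Print classes from largest to smallest.
for name in sorted(class_counts, key=class_counts.get, reverse=True):
    print(
        f"{name:<14}{class_counts[name]:>6}  "
        f"{class_share[name]:6.1%}  weight={class_weights[name]:.2f}"
    )
print(f"Total: {total}, imbalance ratio: {imbalance_ratio:.1f}")
```

If these weights are passed to Keras through the `class_weight` argument of `fit`, the keys must be integer class indices, so the names would first need to be mapped through the generator's `class_indices`.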

## Usage

SpaceNet is designed for:
1. **Training and Evaluation**: Suitable for training and evaluating machine learning models on fine-grained and macro-level astronomical classification tasks.
2. **Hierarchical Classification Research**: Facilitates research on hierarchical classification in the field of astronomy.
3. **Generalization Research**: Helps in developing robust models that generalize across in-domain and out-of-domain datasets.

---

This section provides an overview of the SpaceNet dataset's capabilities, structure, and suggested applications for machine learning models in astronomy.


# Checking Whether a GPU Is Available

In [None]:
import tensorflow as tf

# Check whether a GPU is available
gpus = tf.config.list_physical_devices('GPU')
print("Available GPUs:", gpus)

# Enable memory growth so TensorFlow allocates GPU memory on demand
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        print("GPU memory growth enabled.")
    except RuntimeError as e:
        # Memory growth must be set before the GPUs are initialized
        print(e)

# SpaceNet Astronomy Classification with CNNs

## 1. Setup
### Importing Libraries

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from sklearn.metrics import classification_report, confusion_matrix
import kagglehub

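Note: in the original run this cell raised `ModuleNotFoundError: No module named 'tensorflow'`, which means TensorFlow was not installed in that environment. Install it (for example with `pip install tensorflow`) before running the notebook.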

## 2. Dataset Download
### Using kagglehub to download the dataset from Kaggle

In [1]:
import kagglehub  # imported in this cell so it runs on its own

path = kagglehub.dataset_download("razaimam45/spacenet-an-optimally-distributed-astronomy-data")
print("Path to dataset files:", path)
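
To check the download against the counts listed under Dataset Composition, the files in each class folder can be counted directly. The sketch below assumes the downloaded directory contains one subfolder per class. `count_images_per_class` and the extension list are hypothetical helpers introduced for illustration, and `root` may need to point at a subdirectory of `path` depending on how the archive is laid out.

```python
from pathlib import Path

# Extensions treated as images; an assumption, adjust to the actual files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(root):
    """Return {class_folder_name: number_of_image_files} for each subfolder of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip loose files at the top level
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_class(path)` after the download returns a dictionary that can be compared with the class distribution table.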

## 3. Exploratory Dataset Analysis
### Configuring ImageDataGenerator to count and visualize the classes

In [None]:
datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
train_data = datagen.flow_from_directory(
    path,
    target_size=(150, 150),
    batch_size=32,
    class_mode='categorical',
    subset='training'
)
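
With `validation_split=0.2`, the `training` subset holds roughly 80% of each class. The sketch below estimates the per-class training counts from the file counts listed under Dataset Composition. It assumes that the split is applied per class by truncating `0.2 * n` to an integer and that every listed file is a readable image; both are assumptions about the Keras directory iterator and the download, so the exact numbers may differ.

```python
# Per-class file counts from the Dataset Composition list.
class_counts = {
    "Asteroid": 283,
    "Black Hole": 656,
    "Comet": 416,
    "Constellation": 1552,
    "Galaxy": 3984,
    "Nebula": 1192,
    "Planet": 1472,
    "Star": 3269,
}
VALIDATION_SPLIT = 0.2


def split_counts(n_files, validation_split):
    """Return (n_train, n_validation) for one class, assuming truncation."""
    n_val = int(validation_split * n_files)
    return n_files - n_val, n_val


train_total = 0
val_total = 0
for name, n_files in class_counts.items():
    n_train, n_val = split_counts(n_files, VALIDATION_SPLIT)
    train_total += n_train
    val_total += n_val
    print(f"{name:<14} train={n_train:>5}  validation={n_val:>4}")

print(f"training: {train_total}, validation: {val_total}")
```

Comparing the estimated training total with the "Found N images belonging to 8 classes" message printed by `flow_from_directory` is a quick sanity check that `path` points at the folder containing the class subdirectories.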

### Displaying the Class Distribution

In [None]:
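# Collect the class names and the number of classes.
# Note: class_indices maps each class name to its integer label, not to a sample count.
class_counts = train_data.class_indices
class_names = list(class_counts.keys())
num_classes = len(class_names)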