# Star classification with feedforward neural networks

Classification of celestial objects (star, galaxy or quasar) using scientific data from the Apache Point Observatory in New Mexico. The dataset is part of the Sloan Digital Sky Survey (SDSS) project and contains spectral features.

Dataset URL: https://www.kaggle.com/datasets/fedesoriano/stellar-classification-dataset-sdss17<br/>
Number of records: 100,000<br/>

Column descriptions:

<table>
    <thead>
        <tr>
            <td>#</td>
            <td>Column</td>
            <td>Description</td>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>1</td>
            <td>obj_ID</td>
            <td>Object Identifier, the unique value that identifies the object in the image catalog used by the CAS.</td>
        </tr>
        <tr>
            <td>2</td>
            <td>alpha</td>
            <td>Right Ascension angle (at J2000 epoch).</td>
        </tr>
        <tr>
            <td>3</td>
            <td>delta</td>
            <td>Declination angle (at J2000 epoch).</td>
        </tr>
        <tr>
            <td>4</td>
            <td>u</td>
            <td>Ultraviolet filter in the photometric system.</td>
        </tr>
        <tr>
            <td>5</td>
            <td>g</td>
            <td>Green filter in the photometric system.</td>
        </tr>
        <tr>
            <td>6</td>
            <td>r</td>
            <td>Red filter in the photometric system.</td>
        </tr>
        <tr>
            <td>7</td>
            <td>i</td>
            <td>Near Infrared filter in the photometric system.</td>
        </tr>
        <tr>
            <td>8</td>
            <td>z</td>
            <td>Infrared filter in the photometric system.</td>
        </tr>
        <tr>
            <td>9</td>
            <td>run_ID</td>
            <td>Run Number used to identify the specific scan.</td>
        </tr>
        <tr>
            <td>10</td>
            <td>rerun_ID</td>
            <td>Rerun Number to specify how the image was processed.</td>
        </tr>
        <tr>
            <td>11</td>
            <td>cam_col</td>
            <td>Camera column to identify the scanline within the run.</td>
        </tr>
        <tr>
            <td>12</td>
            <td>field_ID</td>
            <td>Field number to identify each field.</td>
        </tr>
        <tr>
            <td>13</td>
            <td>spec_obj_ID</td>
            <td>Unique ID used for optical spectroscopic objects (this means that 2 different observations with the same spec_obj_ID must share the output class).</td>
        </tr>
        <tr>
            <td>14</td>
            <td>class</td>
            <td>Object class (galaxy, star or quasar object).</td>
        </tr>
        <tr>
            <td>15</td>
            <td>redshift</td>
            <td>Redshift value based on the increase in wavelength.</td>
        </tr>
        <tr>
            <td>16</td>
            <td>plate</td>
            <td>Plate ID, identifies each plate in SDSS.</td>
        </tr>
        <tr>
            <td>17</td>
            <td>MJD</td>
            <td>Modified Julian Date, used to indicate when a given piece of SDSS data was taken.</td>
        </tr>
        <tr>
            <td>18</td>
            <td>fiber_ID</td>
            <td>Fiber ID that identifies the fiber that pointed the light at the focal plane in each observation.</td>
        </tr>
    </tbody>
</table>


# Libraries used in the project

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LeakyReLU
from tensorflow.keras.optimizers import Adagrad
from tensorflow.keras.metrics import Recall, BinaryAccuracy, Precision
from tensorflow.keras.callbacks import Callback, EarlyStopping, ModelCheckpoint
from tensorflow.keras.initializers import GlorotNormal
import matplotlib.pyplot as plt

`TqdmCallback` shows a progress bar in place of the per-epoch log lines that Keras prints while training the model

In [2]:
from tqdm.keras import TqdmCallback

# Data analysis

Loading the dataset

In [3]:
star_classification = pd.read_csv('data1/star_classification.csv') 
star_visualization = star_classification.copy()
star_classification.head()

Unnamed: 0,obj_ID,alpha,delta,u,g,r,i,z,run_ID,rerun_ID,cam_col,field_ID,spec_obj_ID,class,redshift,plate,MJD,fiber_ID
0,1.237661e+18,135.689107,32.494632,23.87882,22.2753,20.39501,19.16573,18.79371,3606,301,2,79,6.543777e+18,GALAXY,0.634794,5812,56354,171
1,1.237665e+18,144.826101,31.274185,24.77759,22.83188,22.58444,21.16812,21.61427,4518,301,5,119,1.176014e+19,GALAXY,0.779136,10445,58158,427
2,1.237661e+18,142.18879,35.582444,25.26307,22.66389,20.60976,19.34857,18.94827,3606,301,2,120,5.1522e+18,GALAXY,0.644195,4576,55592,299
3,1.237663e+18,338.741038,-0.402828,22.13682,23.77656,21.61162,20.50454,19.2501,4192,301,3,214,1.030107e+19,GALAXY,0.932346,9149,58039,775
4,1.23768e+18,345.282593,21.183866,19.43718,17.58028,16.49747,15.97711,15.54461,8102,301,3,137,6.891865e+18,GALAXY,0.116123,6121,56187,842


According to the column descriptions, the following columns are identifiers or observation metadata and add no value for training:

<ul>
    <li>obj_ID</li>
    <li>run_ID</li>
    <li>rerun_ID</li>
    <li>cam_col</li>
    <li>field_ID</li>
    <li>spec_obj_ID</li>
    <li>plate</li>
    <li>MJD</li>
    <li>fiber_ID</li>
</ul>

In [4]:
star_classification = star_classification.drop(columns=['obj_ID', 'run_ID', 'rerun_ID', 'cam_col', 'field_ID', 'spec_obj_ID', 'plate', 'MJD', 'fiber_ID'])
star_classification.head()

Unnamed: 0,alpha,delta,u,g,r,i,z,class,redshift
0,135.689107,32.494632,23.87882,22.2753,20.39501,19.16573,18.79371,GALAXY,0.634794
1,144.826101,31.274185,24.77759,22.83188,22.58444,21.16812,21.61427,GALAXY,0.779136
2,142.18879,35.582444,25.26307,22.66389,20.60976,19.34857,18.94827,GALAXY,0.644195
3,338.741038,-0.402828,22.13682,23.77656,21.61162,20.50454,19.2501,GALAXY,0.932346
4,345.282593,21.183866,19.43718,17.58028,16.49747,15.97711,15.54461,GALAXY,0.116123


# Data preprocessing

One-hot encoding of the categorical variable

In [5]:
one_hot_encoding = pd.get_dummies(star_classification["class"], prefix='class')
star_classification = star_classification.drop(columns=['class'])
star_classification.head()

Unnamed: 0,alpha,delta,u,g,r,i,z,redshift
0,135.689107,32.494632,23.87882,22.2753,20.39501,19.16573,18.79371,0.634794
1,144.826101,31.274185,24.77759,22.83188,22.58444,21.16812,21.61427,0.779136
2,142.18879,35.582444,25.26307,22.66389,20.60976,19.34857,18.94827,0.644195
3,338.741038,-0.402828,22.13682,23.77656,21.61162,20.50454,19.2501,0.932346
4,345.282593,21.183866,19.43718,17.58028,16.49747,15.97711,15.54461,0.116123
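As a small illustration of what `pd.get_dummies` produces, and of how the class can later be recovered from the encoded columns with `argmax`, here is a sketch on a made-up five-row label column. The labels are hypothetical, and `dtype=int` is passed only to make the 0/1 encoding explicit.

```python
import pandas as pd

# Hypothetical labels, not rows from the SDSS file.
labels = pd.Series(["GALAXY", "STAR", "QSO", "GALAXY", "STAR"], name="class")

# One column per class, in alphabetical order of the class names.
encoded = pd.get_dummies(labels, prefix="class", dtype=int)

# argmax over each row gives the position of the 1, which maps back to a class name.
positions = encoded.to_numpy().argmax(axis=1)
decoded = [encoded.columns[p].replace("class_", "") for p in positions]
print(list(encoded.columns), decoded)
```

The same `argmax` step is what the prediction section relies on to turn the three softmax outputs into a single class.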


Feature standardization (z-score: subtract the mean, divide by the standard deviation)

In [6]:
mean = np.mean(star_classification, axis=0)
std = np.std(star_classification, axis=0)
star_classification = (star_classification - mean) / std
star_classification = pd.concat([star_classification, one_hot_encoding], axis=1)
star_classification.head()

Unnamed: 0,alpha,delta,u,g,r,i,z,redshift,class_GALAXY,class_QSO,class_STAR
0,-0.434604,0.425529,0.059755,0.054926,0.403962,0.046007,0.003937,0.079557,1,0,0
1,-0.339921,0.363402,0.088045,0.072456,1.584406,1.185097,0.092835,0.277096,1,0,0
2,-0.367251,0.582713,0.103327,0.067165,0.519745,0.150019,0.008808,0.092423,1,0,0
3,1.669523,-1.249105,0.004921,0.10221,1.059904,0.80761,0.018321,0.48677,1,0,0
4,1.73731,-0.150242,-0.080055,-0.092948,-1.697421,-1.767887,-0.098468,-0.630267,1,0,0
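The standardization and its inverse can be sketched on a single made-up column. The magnitudes are hypothetical, and the variable names are chosen for this example only. The point is that the result has zero mean and unit variance, that it is not confined to the interval (-1, 1), and that multiplying by the standard deviation and adding the mean restores the original values.

```python
import numpy as np

# Hypothetical magnitudes for one photometric band (not taken from the dataset).
values = np.array([23.9, 24.8, 25.3, 22.1, 19.4])

mean_value = values.mean()
std_value = values.std()            # population standard deviation (ddof=0)

standardized = (values - mean_value) / std_value

# Inverse transform: back to the original units.
restored = standardized * std_value + mean_value
print(standardized.round(3), restored)
```

The notebook applies the same inverse (`x * std + mean`) later, when it displays the input parameters of the example objects.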


Check for null values

In [7]:
star_classification.isna().sum(axis=0)

alpha           0
delta           0
u               0
g               0
r               0
i               0
z               0
redshift        0
class_GALAXY    0
class_QSO       0
class_STAR      0
dtype: int64

# Training and test data

Splitting the data: 23% for modeling and 77% held out for validation and the demo

In [8]:
columnas_x = ["alpha", "delta", "u", "g", "r", "i", "z", "redshift"]
columnas_y = ["class_GALAXY", "class_QSO", "class_STAR"]
X_train_model, X_val, y_train_model, y_val = train_test_split(
    star_classification[columnas_x], 
    star_classification[columnas_y],
    test_size=0.77, random_state=2022)

Splitting the modeling data into training (20%) and test (80%) sets

In [9]:
X_train, X_test, y_train, y_test = train_test_split(
    X_train_model, 
    y_train_model, 
    test_size=0.8, random_state=2022)

Splitting the held-out data into demo (85%) and validation (15%) sets

In [10]:
X_demo, X_validation, y_demo, y_validation = train_test_split(
    X_val, 
    y_val, 
    test_size=0.15, random_state=2022)
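Because the three splits are chained, the final set sizes are easy to lose track of. The following sketch works them out for 100,000 rows. `split_sizes` is a hypothetical helper written for this example, and it assumes the second part of each split is rounded up to a whole row.

```python
import math

def split_sizes(n_rows, test_size):
    """Return (first_part, second_part) sizes, rounding the second part up."""
    second = math.ceil(n_rows * test_size)
    return n_rows - second, second

total = 100_000
modeling, held_out = split_sizes(total, 0.77)      # X_train_model / X_val
train, test = split_sizes(modeling, 0.8)           # X_train / X_test
demo, validation = split_sizes(held_out, 0.15)     # X_demo / X_validation
print(train, test, demo, validation)
```

Under these assumptions the network is fitted on 4,600 rows, and the validation set has 11,550 rows, which matches the support reported in the prediction section.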

# Keras model

## Optimizer

In [11]:
adagrad = Adagrad( learning_rate=0.01,
    initial_accumulator_value=0.1,
    epsilon=1e-07,
    name="Adagrad"
)
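For intuition about what the `learning_rate`, `initial_accumulator_value` and `epsilon` arguments do, here is a textbook-style scalar Adagrad loop. It is a sketch, not the Keras implementation: `adagrad_minimize` is a hypothetical helper, the objective is a toy function, and the exact placement of `epsilon` differs between implementations.

```python
import math

def adagrad_minimize(grad_fn, w, learning_rate=0.01, accumulator=0.1,
                     epsilon=1e-7, steps=3):
    """Run a few scalar Adagrad updates; return the final weight and step sizes."""
    step_sizes = []
    for _ in range(steps):
        g = grad_fn(w)
        accumulator += g * g                        # running sum of squared gradients
        step = learning_rate * g / (math.sqrt(accumulator) + epsilon)
        w -= step
        step_sizes.append(abs(step))
    return w, step_sizes

# Toy objective f(w) = w**2, whose gradient is 2*w (an assumption for illustration).
final_w, step_sizes = adagrad_minimize(lambda w: 2 * w, w=1.0)
print(final_w, step_sizes)
```

Since the accumulator only grows, the effective step shrinks over time, which is one reason the final model below is given many more epochs than the comparison runs.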

## Custom callback

In [12]:
class custom_callback(Callback):
    """Tracks the F1 score each epoch and stops training once accuracy exceeds 0.95."""

    def __init__(self):
        super().__init__()
        self.f1score = 0

    def get_f1score(self):
        return self.f1score

    def set_f1score(self, recall, precision):
        # F1 is the harmonic mean of precision and recall; guard against 0/0.
        denominador = recall + precision
        if denominador == 0:
            self.f1score = 0
        else:
            self.f1score = 2 * (recall * precision) / denominador

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        recall = logs.get('recall', 0)
        accuracy = logs.get('accuracy', 0)
        precision = logs.get('precision', 0)
        self.set_f1score(recall, precision)

        # Stop as soon as the training accuracy passes 95%.
        if accuracy > 0.95:
            self.model.stop_training = True
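The F1 computation inside the callback can be checked in isolation. The sketch below uses a hypothetical standalone function, `f1_score`, and feeds it the precision and recall that this notebook prints for the sigmoid model further down.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total

# Precision and recall as printed for the sigmoid model in this notebook.
f1 = f1_score(precision=0.5905473828315735, recall=0.5323913097381592)
print(f1)
```

The result agrees with the F1-Score line printed for that model, to within floating-point rounding.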

## Sequential model with sigmoid

In [13]:
modelo_sigmoid = Sequential()
modelo_sigmoid.add(Dense(4, input_shape=(8,), activation="sigmoid", kernel_initializer=GlorotNormal()))
modelo_sigmoid.add(Dropout(0.1, input_shape=(8,)))
modelo_sigmoid.add(Dense(8, activation="sigmoid", kernel_initializer=GlorotNormal()))
modelo_sigmoid.add(Dropout(0.1, input_shape=(6,)))
modelo_sigmoid.add(Dense(6, activation="sigmoid", kernel_initializer=GlorotNormal()))
modelo_sigmoid.add(Dropout(0.1, input_shape=(3,)))
modelo_sigmoid.add(Dense(3, activation="softmax"))

callbacks_sigmoid = custom_callback()
modelo_sigmoid.compile(
    loss="categorical_crossentropy", 
    optimizer=adagrad, 
    metrics=[Recall(name="recall"), BinaryAccuracy(name="accuracy"), Precision(name="precision")])

historial_prediccion_sigmoid = modelo_sigmoid.fit(
    X_train, 
    y_train, 
    validation_data=(X_test, y_test), 
    epochs=20,
    batch_size=128, 
    verbose=0, 
    callbacks=[callbacks_sigmoid, TqdmCallback(verbose=1)])
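The network defined here is small. As a rough check, the number of trainable parameters can be counted by hand from the layer widths (8 inputs, hidden layers of 4, 8 and 6 units, 3 outputs). `dense_params` is a hypothetical helper. Dropout layers add no parameters.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

layer_sizes = [8, 4, 8, 6, 3]   # input features, three hidden layers, output classes
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer, total_params)
```

The first hidden layer compresses the 8 features into 4 units before the network widens again, so it acts as a bottleneck for everything that follows.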

In [14]:
print("Loss:", historial_prediccion_sigmoid.history["loss"][-1])
print("Recall:", historial_prediccion_sigmoid.history["recall"][-1])
print("Accuracy:", historial_prediccion_sigmoid.history["accuracy"][-1])
print("Precision:", historial_prediccion_sigmoid.history["precision"][-1])
print("F1-Score:", callbacks_sigmoid.get_f1score())

Loss: 0.967513382434845
Recall: 0.5323913097381592
Accuracy: 0.7210870385169983
Precision: 0.5905473828315735
F1-Score: 0.5599634186416097
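The `accuracy` reported here comes from `BinaryAccuracy`, which thresholds each of the three outputs at 0.5 and scores them separately. With one-hot labels this can read higher than the share of objects whose most probable class is correct. A minimal sketch with made-up labels and probabilities shows the gap:

```python
import numpy as np

# Hypothetical one-hot labels and softmax outputs for four objects.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
y_prob = np.array([[0.60, 0.30, 0.10],
                   [0.40, 0.35, 0.25],
                   [0.25, 0.30, 0.45],
                   [0.10, 0.20, 0.70]])

# Binary accuracy: threshold every output at 0.5 and score all 12 entries.
binary_accuracy = np.mean((y_prob > 0.5).astype(int) == y_true)

# Categorical accuracy: one decision per object, taken from the largest probability.
categorical_accuracy = np.mean(y_prob.argmax(axis=1) == y_true.argmax(axis=1))
print(binary_accuracy, categorical_accuracy)
```

In this toy case only half of the objects are classified correctly, yet two thirds of the individual outputs match. This is worth keeping in mind when comparing the accuracy printed during training with the per-class figures in the classification report.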


## Sequential model with ReLU

In [15]:
modelo_relu = Sequential()
modelo_relu.add(Dense(4, input_shape=(8,), activation="relu", kernel_initializer=GlorotNormal()))
modelo_relu.add(Dropout(0.1, input_shape=(8,)))
modelo_relu.add(Dense(8, activation="relu", kernel_initializer=GlorotNormal()))
modelo_relu.add(Dropout(0.1, input_shape=(6,)))
modelo_relu.add(Dense(6, activation="relu", kernel_initializer=GlorotNormal()))
modelo_relu.add(Dropout(0.1, input_shape=(3,)))
modelo_relu.add(Dense(3, activation="softmax"))

callbacks_relu = custom_callback()
modelo_relu.compile(
    loss="categorical_crossentropy", 
    optimizer=adagrad, 
    metrics=[Recall(name="recall"), BinaryAccuracy(name="accuracy"), Precision(name="precision")])

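# Fit modelo_relu itself. The original cell called modelo_sigmoid.fit here, so the
# metrics recorded below come from further training of the sigmoid model.
historial_prediccion_relu = modelo_relu.fit(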
    X_train, 
    y_train, 
    validation_data=(X_test, y_test), 
    epochs=20,
    batch_size=128, 
    verbose=0, 
    callbacks=[callbacks_relu, TqdmCallback(verbose=1)])

In [16]:
print("Loss:", historial_prediccion_relu.history["loss"][-1])
print("Recall:", historial_prediccion_relu.history["recall"][-1])
print("Accuracy:", historial_prediccion_relu.history["accuracy"][-1])
print("Precision:", historial_prediccion_relu.history["precision"][-1])
print("F1-Score:", callbacks_relu.get_f1score())

Loss: 0.9642006158828735
Recall: 0.5363043546676636
Accuracy: 0.7213768362998962
Precision: 0.5903326272964478
F1-Score: 0.5620230185761342


## Sequential model with leaky ReLU

In [17]:
modelo_leakyrelu = Sequential()
modelo_leakyrelu.add(Dense(4, input_shape=(8,), activation=LeakyReLU(alpha=0.01), kernel_initializer=GlorotNormal()))
modelo_leakyrelu.add(Dropout(0.1, input_shape=(8,)))
modelo_leakyrelu.add(Dense(8, activation=LeakyReLU(alpha=0.01), kernel_initializer=GlorotNormal()))
modelo_leakyrelu.add(Dropout(0.1, input_shape=(6,)))
modelo_leakyrelu.add(Dense(6, activation=LeakyReLU(alpha=0.01), kernel_initializer=GlorotNormal()))
modelo_leakyrelu.add(Dropout(0.1, input_shape=(3,)))
modelo_leakyrelu.add(Dense(3, activation="softmax"))

callbacks_leakyrelu = custom_callback()
modelo_leakyrelu.compile(
    loss="categorical_crossentropy", 
    optimizer=adagrad, 
    metrics=[Recall(name="recall"), BinaryAccuracy(name="accuracy"), Precision(name="precision")])

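# Fit modelo_leakyrelu itself. The original cell called modelo_sigmoid.fit here, so the
# metrics recorded below come from further training of the sigmoid model.
historial_prediccion_leakyrelu = modelo_leakyrelu.fit(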
    X_train, 
    y_train, 
    validation_data=(X_test, y_test), 
    epochs=20,
    batch_size=128, 
    verbose=0, 
    callbacks=[callbacks_leakyrelu, TqdmCallback(verbose=1)])

In [18]:
print("Loss:", historial_prediccion_leakyrelu.history["loss"][-1])
print("Recall:", historial_prediccion_leakyrelu.history["recall"][-1])
print("Accuracy:", historial_prediccion_leakyrelu.history["accuracy"][-1])
print("Precision:", historial_prediccion_leakyrelu.history["precision"][-1])
print("F1-Score", callbacks_leakyrelu.get_f1score())

Loss: 0.9629098176956177
Recall: 0.5408695936203003
Accuracy: 0.7223187685012817
Precision: 0.5912547707557678
F1-Score 0.5649409864277204


## Sequential model with tanh (hyperbolic tangent)

In [19]:
modelo_tanh = Sequential()
modelo_tanh.add(Dense(4, input_shape=(8,), activation="tanh", kernel_initializer=GlorotNormal()))
modelo_tanh.add(Dropout(0.1, input_shape=(8,)))
modelo_tanh.add(Dense(8, activation="tanh", kernel_initializer=GlorotNormal()))
modelo_tanh.add(Dropout(0.1, input_shape=(6,)))
modelo_tanh.add(Dense(6, activation="tanh", kernel_initializer=GlorotNormal()))
modelo_tanh.add(Dropout(0.1, input_shape=(3,)))
modelo_tanh.add(Dense(3, activation="softmax"))

callbacks_tanh = custom_callback()
modelo_tanh.compile(
    loss="categorical_crossentropy", 
    optimizer=adagrad, 
    metrics=[Recall(name="recall"), BinaryAccuracy(name="accuracy"), Precision(name="precision")])

historial_prediccion_tanh = modelo_tanh.fit(
    X_train, 
    y_train, 
    validation_data=(X_test, y_test), 
    epochs=20,
    batch_size=128, 
    verbose=0, 
    callbacks=[callbacks_tanh, TqdmCallback(verbose=1)])

In [20]:
print("Loss:", historial_prediccion_tanh.history["loss"][-1])
print("Recall:", historial_prediccion_tanh.history["recall"][-1])
print("Accuracy:", historial_prediccion_tanh.history["accuracy"][-1])
print("Precision:", historial_prediccion_tanh.history["precision"][-1])
print("F1-Score", callbacks_tanh.get_f1score())

Loss: 0.6119504570960999
Recall: 0.716304361820221
Accuracy: 0.8207247257232666
Precision: 0.7381272315979004
F1-Score 0.727052077202592


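## Final model

In the recorded runs, the hyperbolic tangent scored better than the sigmoid, ReLU and leaky ReLU runs. The hyperbolic tangent maps every real input to the open interval -1 < f(x) < 1 for -∞ < x < ∞ and is centered at zero. The standardized inputs are also centered at zero, although standardization does not confine them to that interval (the standardized table above contains values such as 1.73), so the match between the data distribution and tanh is a plausible explanation rather than a demonstrated one.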

<img src="data1/tanh.png" />
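To make the range argument concrete, the sketch below evaluates tanh and the sigmoid on a few inputs, including some outside (-1, 1), and checks the standard identity tanh(x) = 2·sigmoid(2x) − 1. The input values are made up and `sigmoid` is a helper defined for this example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Standardized inputs can fall outside (-1, 1); tanh still maps them inside it.
inputs = [-3.0, -1.0, 0.0, 1.0, 3.0]
tanh_out = [math.tanh(x) for x in inputs]
sigmoid_out = [sigmoid(x) for x in inputs]

# tanh is a rescaled, zero-centered sigmoid: tanh(x) = 2*sigmoid(2x) - 1.
identity_gap = max(abs(math.tanh(x) - (2 * sigmoid(2 * x) - 1)) for x in inputs)
print(tanh_out, sigmoid_out, identity_gap)
```

The two functions have the same shape. The practical difference is that tanh outputs are centered at zero while sigmoid outputs are centered at 0.5, which may matter for how easily the next layer trains.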

In [21]:
modelo_final = Sequential()
modelo_final.add(Dense(4, input_shape=(8,), activation="tanh", kernel_initializer=GlorotNormal()))
modelo_final.add(Dropout(0.1, input_shape=(8,)))
modelo_final.add(Dense(8, activation="tanh", kernel_initializer=GlorotNormal()))
modelo_final.add(Dropout(0.1, input_shape=(6,)))
modelo_final.add(Dense(6, activation="tanh", kernel_initializer=GlorotNormal()))
modelo_final.add(Dropout(0.1, input_shape=(3,)))
modelo_final.add(Dense(3, activation="softmax"))

# Stop early if the training loss has not improved for 100 epochs
my_early_top = EarlyStopping(monitor='loss', patience=100)

# Save the weights with the best validation accuracy
modelo_checkpoint_callback = ModelCheckpoint(
    filepath='data1/modelo_final',
    save_weights_only=True,
    monitor='val_accuracy',
    mode='max',
    save_best_only=True)

callbacks_final = custom_callback()
modelo_final.compile(
    loss="categorical_crossentropy", 
    optimizer=adagrad, 
    metrics=[Recall(name="recall"), BinaryAccuracy(name="accuracy"), Precision(name="precision")])

historial_prediccion_final = modelo_final.fit(
    X_train, 
    y_train, 
    validation_data=(X_test, y_test), 
    epochs=1000,
    batch_size=1000, 
    verbose=0, 
    callbacks=[callbacks_final, TqdmCallback(verbose=1), my_early_top, modelo_checkpoint_callback])
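The `patience` argument can be pictured as a counter of consecutive epochs without improvement. The following is a simplified sketch of that logic, not the Keras source: `early_stop_epoch` is a hypothetical function and the loss values are made up.

```python
def early_stop_epoch(losses, patience):
    """Return the 0-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best, waited = loss, 0      # improvement: reset the counter
        else:
            waited += 1                 # no improvement this epoch
            if waited >= patience:
                return epoch
    return None

stop_at = early_stop_epoch([1.0, 0.9, 0.95, 0.93, 0.92, 0.91], patience=3)
never = early_stop_epoch([1.0, 0.9, 0.8, 0.7], patience=3)
print(stop_at, never)
```

With `patience=100`, as used here, a long plateau in the training loss is needed before this callback ends the run. The custom callback's 0.95 accuracy threshold can stop training earlier.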

In [22]:
print("Loss:", historial_prediccion_final.history["loss"][-1])
print("Recall:", historial_prediccion_final.history["recall"][-1])
print("Accuracy:", historial_prediccion_final.history["accuracy"][-1])
print("Precision:", historial_prediccion_final.history["precision"][-1])
print("F1-Score", callbacks_final.get_f1score())

Loss: 0.2666434049606323
Recall: 0.9230434894561768
Accuracy: 0.9501447677612305
Precision: 0.9270742535591125
F1-Score 0.9250544806900022


# Prediction

In [23]:
predicciones = modelo_final.predict(X_validation, batch_size=1000)
print(classification_report(
    np.array(y_validation).argmax(axis=1), 
    predicciones.argmax(axis=1), 
    target_names=y_validation.columns))

              precision    recall  f1-score   support

class_GALAXY       0.95      0.95      0.95      6881
   class_QSO       0.93      0.88      0.90      2220
  class_STAR       0.92      0.98      0.95      2449

    accuracy                           0.94     11550
   macro avg       0.94      0.94      0.94     11550
weighted avg       0.94      0.94      0.94     11550
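The summary rows of the report can be approximately reproduced from the per-class rows. The sketch below recomputes the macro and weighted averages of recall from the rounded values printed above. Because the inputs are rounded to two decimals, the results are only approximate.

```python
# Per-class recall and support, copied from the classification report above.
recall = {"class_GALAXY": 0.95, "class_QSO": 0.88, "class_STAR": 0.98}
support = {"class_GALAXY": 6881, "class_QSO": 2220, "class_STAR": 2449}

total = sum(support.values())

# Weighted average: each class counts in proportion to its number of objects.
weighted_recall = sum(recall[c] * support[c] for c in recall) / total

# Macro average: every class counts equally, regardless of size.
macro_recall = sum(recall.values()) / len(recall)
print(total, round(weighted_recall, 2), round(macro_recall, 2))
```

The support-weighted recall is the same quantity as overall accuracy, which is why both appear as 0.94 in the report.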



# Visualization of real data

The following positions hold the first galaxy, quasar and star in the validation set; the model classifies all three correctly, as the predictions that follow show:

In [24]:
galaxia = np.argmax(y_validation.class_GALAXY==1)
quasar = np.argmax(y_validation.class_QSO==1)
estrella = np.argmax(y_validation.class_STAR==1)
predicion_correcta = [galaxia, quasar, estrella]
print( predicion_correcta )

[0, 3, 9]


Input parameters:

In [25]:
x_visualizacion = X_validation.iloc[predicion_correcta,:]
x_visualizacion = x_visualizacion*std + mean
x_visualizacion

Unnamed: 0,alpha,delta,u,g,r,i,z,redshift
71516,249.733,18.116355,22.34891,20.82807,19.02194,18.34502,17.99161,0.336377
13155,189.808522,46.621074,19.14531,19.01218,19.1075,18.91895,18.91119,1.900497
72991,210.20409,45.132961,20.81794,20.38485,20.27744,20.37429,20.39449,0.000336


Prediction:

In [26]:
prediccion = pd.DataFrame(predicciones[predicion_correcta,:], columns=y_validation.columns)
prediccion

Unnamed: 0,class_GALAXY,class_QSO,class_STAR
0,0.977616,0.018901,0.003483
1,0.030821,0.968965,0.000215
2,0.069331,0.001176,0.929493


Highest probability:

In [27]:
# Mark each row's own maximum with 1 (keepdims makes the comparison row-wise).
probabilidades = predicciones[predicion_correcta, :]
pd_prediccion = pd.DataFrame((probabilidades == probabilidades.max(axis=1, keepdims=True)).astype(int), columns=y_validation.columns)
pd_prediccion

Unnamed: 0,class_GALAXY,class_QSO,class_STAR
0,1,0,0
1,0,1,0
2,0,0,1
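One detail worth checking in this kind of conversion is how NumPy broadcasts the comparison. Comparing an (n, 3) array against a 1-D vector of row maxima lines the vector up with the columns, not the rows. The sketch below, on made-up probabilities, contrasts that with a row-wise comparison using `keepdims=True`.

```python
import numpy as np

# Hypothetical class probabilities for three objects.
probs = np.array([[0.1, 0.8, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.5, 0.3, 0.2]])

# Row-wise: each row is compared with its own maximum, shape (3, 1).
one_hot = (probs == probs.max(axis=1, keepdims=True)).astype(int)

# Column-wise by accident: the 1-D vector of maxima is broadcast across columns.
misaligned = (probs >= probs.max(axis=1)).astype(int)
print(one_hot.tolist(), misaligned.tolist())
```

The two results coincide only when the winning classes happen to fall on the diagonal, as they do for the galaxy, quasar and star selected here.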


Joining the input parameters with the prediction:

In [28]:
visualicacion_final = x_visualizacion.reset_index(drop=True).join(pd_prediccion)
visualicacion_final

Unnamed: 0,alpha,delta,u,g,r,i,z,redshift,class_GALAXY,class_QSO,class_STAR
0,249.733,18.116355,22.34891,20.82807,19.02194,18.34502,17.99161,0.336377,1,0,0
1,189.808522,46.621074,19.14531,19.01218,19.1075,18.91895,18.91119,1.900497,0,1,0
2,210.20409,45.132961,20.81794,20.38485,20.27744,20.37429,20.39449,0.000336,0,0,1


According to the scientific data from the Apache Point Observatory in New Mexico, part of the Sloan Digital Sky Survey (SDSS) project, the actual objects are the following:

## Galaxy

<a href="http://skyserver.sdss.org/dr17/VisualTools/explore/summary?id=1237664887465902678">http://skyserver.sdss.org/dr17/VisualTools/explore/summary?id=1237664887465902678</a>

In [29]:
visualicacion_final.iloc[0,:]

alpha           249.733000
delta            18.116355
u                22.348910
g                20.828070
r                19.021940
i                18.345020
z                17.991610
redshift          0.336377
class_GALAXY      1.000000
class_QSO         0.000000
class_STAR        0.000000
Name: 0, dtype: float64

<table>
    <tr>
        <td><img src="data1/galaxia.jpg" ></td>
        <td><img src="data1/galaxia-zoomout.png" ></td>
        <td><img src="data1/galaxia-spectrum.jpg" ></td>
    </tr>
</table>

## Quasar

<a href="http://skyserver.sdss.org/dr17/VisualTools/explore/summary?id=1237661360757342276">http://skyserver.sdss.org/dr17/VisualTools/explore/summary?id=1237661360757342276</a>

In [30]:
visualicacion_final.iloc[1,:]

alpha           189.808522
delta            46.621074
u                19.145310
g                19.012180
r                19.107500
i                18.918950
z                18.911190
redshift          1.900497
class_GALAXY      0.000000
class_QSO         1.000000
class_STAR        0.000000
Name: 1, dtype: float64

<table>
    <tr>
        <td><img src="data1/quasar.jpg" ></td>
        <td><img src="data1/quasar-zoomout.png" ></td>
        <td><img src="data1/quasar-spectrum.jpg" ></td>
    </tr>
</table>

## Star

<a href="http://skyserver.sdss.org/dr17/VisualTools/explore/summary?id=1237661362374181020">http://skyserver.sdss.org/dr17/VisualTools/explore/summary?id=1237661362374181020</a>

In [31]:
visualicacion_final.iloc[2,:]

alpha           210.204090
delta            45.132961
u                20.817940
g                20.384850
r                20.277440
i                20.374290
z                20.394490
redshift          0.000336
class_GALAXY      0.000000
class_QSO         0.000000
class_STAR        1.000000
Name: 2, dtype: float64

<table>
    <tr>
        <td><img src="data1/estrella.jpg" ></td>
        <td><img src="data1/estrella-zoomout.png" ></td>
        <td><img src="data1/estrella-spectrum.jpg" ></td>
    </tr>
</table>

# Saving the model and statistics

In [38]:
# Saving the model
modelo_final.save('data1/modelo/modelo_final.h5')

In [41]:
# Saving the statistics
mean.to_csv("data1/modelo/mean.csv", header=False)
std.to_csv("data1/modelo/std.csv", header=False)
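Since the statistics are written without a header row, whatever loads them later has to be told so. A possible way to read such a file back into a Series is sketched below, using an in-memory buffer in place of a file. The values are hypothetical and the loading code is an assumption about how a demo might consume the file.

```python
import io
import pandas as pd

# Hypothetical statistics for two features (not the values from this notebook).
mean = pd.Series({"alpha": 177.6, "redshift": 0.58})

# Write without a header row, as above, then read it back.
buffer = io.StringIO()
mean.to_csv(buffer, header=False)
buffer.seek(0)

# First column becomes the index (feature names); the remaining column holds the values.
loaded = pd.read_csv(buffer, header=None, index_col=0).iloc[:, 0]
print(loaded)
```

With both `mean` and `std` restored this way, new observations can be standardized with the same `(x - mean) / std` expression used for training.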

In [46]:
# Saving data for the demo
(X_demo*std + mean).to_csv("data1/modelo/X_demo.csv", index=False)
y_demo.to_csv("data1/modelo/y_demo.csv", index=False)

In [44]:
type(X_demo)

pandas.core.frame.DataFrame