# Exploratory Analysis

## Answering the following research question:

### "Is there any correlation between the regions defined by a celestial body's alpha and delta angles and its light emission, specifically with respect to the observed color band?"

In [1]:
#!pip install -r requirements.txt

In [2]:
import pandas as pd
import numpy as np

In [3]:
df = pd.read_csv('../data/star_classification_10.csv')
df = df[['obj_ID', 'alpha', 'delta', 'u', 'g', 'r', 'i', 'z', 'redshift']]
df.head()

| | obj_ID | alpha | delta | u | g | r | i | z | redshift |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.237663e+18 | 15.342907 | 0.794882 | 18.74547 | 17.49025 | 16.89122 | 16.5735 | 16.2991 | 0.042002 |
| 1 | 1.237664e+18 | 120.365538 | 55.660432 | 19.99985 | 19.68133 | 19.50156 | 19.17364 | 19.16122 | 1.633797 |
| 2 | 1.237655e+18 | 245.610038 | 42.974786 | 23.11792 | 20.81292 | 18.88351 | 18.12335 | 17.68182 | 0.454852 |
| 3 | 1.23766e+18 | 127.957356 | 6.647703 | 21.94454 | 21.01012 | 20.93496 | 20.93184 | 20.56855 | 2.608515 |
| 4 | 1.237665e+18 | 159.174526 | 35.881846 | 18.89945 | 17.68422 | 17.02925 | 16.6 | 16.36798 | 0.083804 |


### Identifying the units of measurement used

Alpha, also known as right ascension, is the angle measured along the celestial equator, eastward from the vernal equinox to the hour circle of the object. Right ascension ranges from 0° to 360°.

Delta takes negative values, so it can only be declination, the angle measured north (positive) or south (negative) of the celestial equator. Declination ranges from -90° to +90°. We can convert it to polar distance so that it ranges from 0° to 180°.
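The conversion to polar distance mentioned above is simply 90° minus the declination. A minimal sketch, assuming the angles are given in degrees; the helper name `polar_distance` is hypothetical and not part of the notebook:

```python
import numpy as np

def polar_distance(delta_deg):
    """Convert declination in degrees (-90..+90) to polar distance in degrees (0..180)."""
    return 90.0 - np.asarray(delta_deg, dtype=float)

# South celestial pole, celestial equator and north celestial pole
extremes = polar_distance([-90.0, 0.0, 90.0])
print(extremes)  # 180, 90 and 0 degrees respectively
```

In the notebook this could be applied as `polar_distance(df['delta'])`, which would map the observed range of declinations onto positive values only.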

In [4]:
coord_cols = ['alpha', 'delta']
light_cols = ['u', 'g', 'r', 'i']

In [5]:
coord = df[coord_cols]
coord.describe()

| | alpha | delta |
|---|---|---|
| count | 10000.0 | 10000.0 |
| mean | 178.026487 | 23.777924 |
| std | 97.206836 | 19.555968 |
| min | 0.011684 | -16.450911 |
| 25% | 127.309188 | 4.765674 |
| 50% | 180.777242 | 22.985017 |
| 75% | 234.307319 | 39.855873 |
| max | 359.97891 | 82.5675 |


### Deriving Cartesian coordinates (x, y, z) on the unit sphere from alpha and delta

In [6]:
# NOTE: alpha and delta are in degrees, but np.cos and np.sin expect radians, so these
# values match the true sky positions only if the angles go through np.deg2rad first.
coord = coord.assign(x=np.cos(coord['alpha']) * np.cos(coord['delta']),
                     y=np.sin(coord['alpha']) * np.cos(coord['delta']),
                     z=np.sin(coord['delta']))  # assign() returns a new DataFrame
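Since alpha and delta are given in degrees while NumPy's trigonometric functions work in radians, a variant of the cell above that converts the angles first could look like the sketch below. The helper name `unit_sphere_xyz` and the three sample positions are hypothetical, chosen so that the expected points on the unit sphere are easy to check by hand.

```python
import numpy as np
import pandas as pd

def unit_sphere_xyz(alpha_deg, delta_deg):
    """Map right ascension and declination (degrees) to Cartesian points on the unit sphere."""
    alpha = np.deg2rad(alpha_deg)  # degrees -> radians
    delta = np.deg2rad(delta_deg)
    x = np.cos(delta) * np.cos(alpha)
    y = np.cos(delta) * np.sin(alpha)
    z = np.sin(delta)
    return x, y, z

# Hypothetical sample: two points on the celestial equator and the north celestial pole
sample = pd.DataFrame({'alpha': [0.0, 90.0, 180.0], 'delta': [0.0, 0.0, 90.0]})
x, y, z = unit_sphere_xyz(sample['alpha'], sample['delta'])
xyz = sample.assign(x=x, y=y, z=z)  # assign() returns a new DataFrame
print(xyz.round(6))
```

Up to floating-point rounding, the three rows map to (1, 0, 0), (0, 1, 0) and (0, 0, 1), and every row has unit norm.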


In [7]:
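coord_cols = ['alpha', 'delta', 'x','y','z']
# Caution: df already has a 'z' column (the z photometric band); this overwrites it with the Cartesian z.
df[['x','y','z']] = coord[['x','y','z']]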

In [8]:
for c in light_cols:
    # Pearson correlation between the Cartesian coordinates and photometric band c
    correlation_matrix = df[['x', 'y', 'z', c]].corr()

    print('suco: \n', correlation_matrix)

suco: 
           x         y         z         u
x  1.000000  0.003615  0.005868  0.009552
y  0.003615  1.000000 -0.002166  0.012940
z  0.005868 -0.002166  1.000000  0.008445
u  0.009552  0.012940  0.008445  1.000000
suco: 
           x         y         z         g
x  1.000000  0.003615  0.005868  0.009685
y  0.003615  1.000000 -0.002166  0.013083
z  0.005868 -0.002166  1.000000  0.008840
g  0.009685  0.013083  0.008840  1.000000
suco: 
           x         y         z         r
x  1.000000  0.003615  0.005868  0.003886
y  0.003615  1.000000 -0.002166 -0.003232
z  0.005868 -0.002166  1.000000  0.044566
r  0.003886 -0.003232  0.044566  1.000000
suco: 
           x         y         z         i
x  1.000000  0.003615  0.005868  0.002787
y  0.003615  1.000000 -0.002166 -0.004029
z  0.005868 -0.002166  1.000000  0.055464
i  0.002787 -0.004029  0.055464  1.000000


In [9]:
for c in light_cols:
    # Pearson correlation between the raw angles and photometric band c
    correlation_matrix = df[['alpha', 'delta', c]].corr()

    print('suco: \n', correlation_matrix)

suco: 
           alpha     delta         u
alpha  1.000000  0.134858 -0.004653
delta  0.134858  1.000000  0.011703
u     -0.004653  0.011703  1.000000
suco: 
           alpha     delta         g
alpha  1.000000  0.134858 -0.005107
delta  0.134858  1.000000  0.012147
g     -0.005107  0.012147  1.000000
suco: 
           alpha     delta         r
alpha  1.000000  0.134858 -0.033189
delta  0.134858  1.000000 -0.013308
r     -0.033189 -0.013308  1.000000
suco: 
           alpha     delta         i
alpha  1.000000  0.134858 -0.036893
delta  0.134858  1.000000 -0.009532
i     -0.036893 -0.009532  1.000000
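The per-band loops above repeat the coordinate-to-coordinate entries in every matrix. As a more compact alternative, one could compute a single correlation matrix and keep only the coordinate rows and band columns. The sketch below assumes a hypothetical helper `coord_band_correlations` and runs on synthetic random data rather than on the notebook's dataset, so its numbers say nothing about the real catalog.

```python
import numpy as np
import pandas as pd

def coord_band_correlations(frame, coord_cols, band_cols):
    """Return Pearson correlations with coordinates as rows and bands as columns."""
    corr = frame[coord_cols + band_cols].corr()
    return corr.loc[coord_cols, band_cols]

# Synthetic stand-in data (assumption: independent random columns, not real observations)
rng = np.random.default_rng(0)
n = 1000
demo = pd.DataFrame({
    'alpha': rng.uniform(0, 360, n),
    'delta': rng.uniform(-90, 90, n),
    'u': rng.normal(20, 2, n),
    'g': rng.normal(19, 2, n),
})

table = coord_band_correlations(demo, ['alpha', 'delta'], ['u', 'g'])
print(table)
```

With the notebook's data, the equivalent call would be `coord_band_correlations(df, coord_cols, light_cols)`.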


# Conclusion

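There is no meaningful linear correlation: every Pearson coefficient between the coordinates (alpha, delta, x, y, z) and the u, g, r and i bands has an absolute value below 0.06.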
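To put the largest coefficient in perspective, the squared Pearson coefficient gives the share of variance that a linear relationship would explain. A small sketch using the largest absolute value printed above (Cartesian z against the i band); the variable names are illustrative:

```python
# Largest absolute Pearson coefficient in the outputs above (Cartesian z vs. the i band)
r_max = 0.055464

# Share of variance in the band explained by a linear fit on that coordinate
r_squared = r_max ** 2

print(f"r = {r_max:.6f}, r^2 = {r_squared:.6f} ({100 * r_squared:.2f}% of variance)")
# r = 0.055464, r^2 = 0.003076 (0.31% of variance)
```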