<a href="https://colab.research.google.com/github/kbcvcbk/cesar-school/blob/master/estatistica-e-probabilidade/nascer-do-sol/Ricardo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Step 1: Data preprocessing
## Links
- [Assignment on Classroom](https://classroom.google.com/u/0/c/MTQ4MzY4NzYyMDM3/a/MTU1OTk5ODg0OTcx/details)
- [Ricardo's article](https://drive.google.com/file/d/1dNBfmRlWIvUj_8mzpTEBtRqQF9tVl0kh/view)
- [Sunrise and sunset calculation](https://www.inf.ufrgs.br/~cabral/Nascer_Por_Sol.html#:~:text=Subtraindo%206%20horas%20do%20meio,pa%C3%ADs%2C%20estes%20hor%C3%A1rios%20seriam%20corretos.)

## Assignment
This step consists of converting the formulas from the article into code and preparing the data.

The data preparation should focus on the sunrise time in the city of Água Branca - AL, converting the total daylight time into minutes per day.

To do this, you need to calculate the time at which the sun rises on a given day (use the article "Cálculo do Nascer e Pôr do Sol").

The submission is due by November 3 (03/11) at 23:59.

## Steps
1. Convert the formulas from the article
2. Calculate the time the sun rises each day
3. Convert the total daylight time into minutes per day

In [31]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

### Read the XLS and build the DataFrame
I uploaded the file to this Colab session (you can see it by clicking the folder icon on the left), so it should be available to everyone.

### Refs
- [read_excel function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html)

In [32]:
pag = pd.read_excel("Janeiro 2008.xls",
                    sheet_name = [0, 1, 2],
                    skiprows = [0, 2, 3],
                    header = 0)

df = pd.concat([pag[0], pag[1], pag[2]])
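
To see what `skiprows = [0, 2, 3]` combined with `header = 0` does, here is a minimal sketch on an in-memory CSV instead of the XLS file. The four-line preamble (file info, column names, units, aggregation type) is an assumption about how the logger export is laid out, the sample rows are made up for illustration, and `pd.read_csv` is used as a stand-in on the assumption that its `skiprows` and `header` arguments behave like those of `read_excel`.

```python
import io

import pandas as pd

# Assumed layout: row 0 = file info, row 1 = column names,
# row 2 = units, row 3 = aggregation type, then the data rows.
raw = io.StringIO(
    "TOA5,station,CR1000\n"
    "TIMESTAMP,RECORD,TempAR_Avg\n"
    "TS,RN,Deg C\n"
    ",,Avg\n"
    "2008-01-01 00:00:00,243801,20.06\n"
    "2008-01-01 00:01:00,243802,20.07\n"
)

# Rows 0, 2 and 3 are dropped first; header=0 then refers to the first
# remaining row, which holds the column names.
demo = pd.read_csv(raw, skiprows=[0, 2, 3], header=0, parse_dates=["TIMESTAMP"])
print(demo)
```

Only the column-name row and the data rows survive, which is why the DataFrame displayed in the next cell has clean column labels.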

In [33]:
df

Unnamed: 0,TIMESTAMP,RECORD,CR1000_Bat_Avg,VelVento,DirVento,TempAR_Avg,RH_Max,RadHZtot_Avg,RadPAR_Avg,IlumHZ_Avg,IlumNORTE_Avg,IlumSUL_Avg,IlumLESTE_Avg,IlumOESTE_Avg
0,2008-01-01 00:00:00,243801,13.35,6.147,111.8,20.06,80.4,0.0,-0.011,0.067,18.590,0.067,1.549,1.326
1,2008-01-01 00:01:00,243802,13.35,5.113,111.1,20.07,80.4,0.0,0.022,0.067,18.590,0.067,1.549,1.318
2,2008-01-01 00:02:00,243803,13.35,5.265,107.3,20.04,80.4,0.0,0.045,0.067,18.590,0.067,1.549,1.324
3,2008-01-01 00:03:00,243804,13.35,5.289,105.4,20.04,80.5,0.0,0.011,0.067,18.590,0.067,1.549,1.329
4,2008-01-01 00:04:00,243805,13.35,6.436,105.8,20.04,80.6,0.0,-0.067,0.067,18.590,0.067,1.549,1.340
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
15835,2008-01-31 23:55:00,288436,13.31,1.300,130.1,21.54,94.9,0.0,-0.011,0.067,3.637,0.067,1.145,1.347
15836,2008-01-31 23:56:00,288437,13.31,1.522,118.3,21.54,94.8,0.0,-0.045,0.067,3.637,0.067,1.145,1.347
15837,2008-01-31 23:57:00,288438,13.31,1.799,115.0,21.54,94.7,0.0,-0.101,0.067,3.637,0.067,1.145,1.347
15838,2008-01-31 23:58:00,288439,13.31,1.584,122.6,21.55,94.7,0.0,-0.112,0.067,3.637,0.067,1.145,1.360


### Converting the formulas from the article
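
For reference, the relations implemented by the functions in this section can be written as follows. This is a transcription of the code, not a quotation from the article, and the symbols are chosen here for illustration: $n$ is the day of the year, $\varphi$ the latitude, $\lambda$ the longitude, $\lambda_f$ the time-zone meridian (`fuse`), with angles in degrees and times in hours.

$$
\delta = 23.45^\circ \,\sin\!\left(\frac{360^\circ}{365}\,(284 + n)\right)
$$

$$
T_d = \frac{2}{15}\,\arccos\!\left(-\tan\varphi\,\tan\delta\right)
$$

$$
t_{\text{sunrise}} = 12 - \frac{T_d}{2} + \Delta,
\qquad
t_{\text{sunset}} = 12 + \frac{T_d}{2} + \Delta,
\qquad
\Delta = \frac{|\lambda| - |\lambda_f|}{15}
$$

The correction $\Delta$ amounts to 4 minutes of clock time per degree of distance from the time-zone meridian.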

In [34]:
from numpy import (
    sin,
    cos,
    tan,
    arccos,
    radians,
    degrees,
    absolute,
)

from datetime import (
    timedelta,
    datetime,
    date,
)

In [35]:
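def earth_declination(n):
    """Solar declination in degrees for the n-th day of the year."""
    return 23.45 * sin(radians(360/365 * (284+n)))

def td(lat, day):
    """Day length in hours at latitude `lat` (degrees) on day-of-year `day`."""
    dec = earth_declination(day)
    cofactor = -(tan(radians(lat)) * tan(radians(dec)))
    return 2/15 * degrees(arccos(cofactor))

def longitude_correction(lng, fuse):
    """Time shift for the distance between `lng` and the time-zone meridian `fuse`."""
    diff = absolute(lng) - absolute(fuse)

    # 15 degrees of longitude correspond to 60 minutes
    return timedelta(
        minutes=(diff * 60) / 15
    )

def day_range(daytime, lng, fuse):
    """Sunrise and sunset as offsets from midnight, given the day length in hours."""
    half_day = daytime / 2
    sunrise = timedelta(hours=12-half_day)
    sunset = timedelta(hours=12+half_day)

    correction = longitude_correction(lng, fuse)
    sunrise += correction
    sunset += correction

    return (sunrise, sunset)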

In [36]:
"""
Test with the data from the article
expected values:
daytime = 11.15174
sunrise = 6h 32min 4s
sunset = 17h 41min 05s
"""

lat = -23.543333
lng = 46.633056
fuse = 45
day = 119

daytime = td(lat, day)
sunrise, sunset = day_range(daytime, lng, fuse)

print(f"""daytime = {daytime}
sunrise = {sunrise}
sunset = {sunset}""")

daytime = 11.151741131164016
sunrise = 6:31:58.799404
sunset = 17:41:05.067476
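
The printed sunrise differs from the article's value by about five seconds, while the day length and the sunset agree. A minimal sketch of turning this comparison into an automated check is shown below. The helper `sunrise_sunset` is hypothetical and simply repeats the formulas above in one self-contained function, and the 10-second tolerance is an assumption chosen to absorb what is presumably rounding in the article.

```python
from datetime import timedelta

from numpy import arccos, degrees, radians, sin, tan


def sunrise_sunset(lat, lng, fuse, n):
    """Hypothetical helper: day length (hours), sunrise and sunset offsets."""
    dec = 23.45 * sin(radians(360 / 365 * (284 + n)))
    hours = 2 / 15 * degrees(arccos(-tan(radians(lat)) * tan(radians(dec))))
    shift = timedelta(minutes=(abs(lng) - abs(fuse)) * 4)  # 4 min per degree
    half = timedelta(hours=hours / 2)
    noon = timedelta(hours=12)
    return hours, noon - half + shift, noon + half + shift


# Values from the article's worked example
hours, sunrise, sunset = sunrise_sunset(-23.543333, 46.633056, 45, 119)

tolerance = timedelta(seconds=10)  # assumed tolerance, not from the article
assert abs(hours - 11.15174) < 1e-4
assert abs(sunrise - timedelta(hours=6, minutes=32, seconds=4)) < tolerance
assert abs(sunset - timedelta(hours=17, minutes=41, seconds=5)) < tolerance
print("matches the article within", tolerance)
```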


### Calculate the time the sun rises each day

In [37]:
# Collect the distinct calendar dates present in the measurements
all_days = map(lambda ts: ts.date(), df["TIMESTAMP"])
days = pd.Series(all_days).unique()

In [41]:
def date_to_nth_day(d):
    """Return the 1-based day of the year for date `d` (January 1 is day 1)."""
    new_year = date(d.year, 1, 1)
    delta = (d - new_year)
    delta += timedelta(days = 1)
    return delta.days
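
As an alternative, the standard library already exposes the day of the year through `date.timetuple().tm_yday`. The sketch below uses a hypothetical helper name, `day_of_year`, and shows that day 119 from the article's test corresponds to April 29 in a non-leap year.

```python
from datetime import date


def day_of_year(d):
    """Hypothetical helper: 1-based day of the year via the standard library."""
    return d.timetuple().tm_yday


print(day_of_year(date(2008, 1, 1)))   # 1
print(day_of_year(date(2007, 4, 29)))  # 119 (non-leap year)
print(day_of_year(date(2008, 4, 29)))  # 120 (2008 is a leap year)
```

Since 2008 is a leap year, dates after February 29 are shifted by one day relative to the 365-day year assumed by the declination formula. This does not affect the January data used here.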

In [42]:
# Coordinates of Água Branca - AL
lat = -9.25402
lng = -37.9449
fuse = 45  # time-zone meridian in degrees (UTC-3 is centered on 45° W)

def daytime_from_day(day):
    day_int = date_to_nth_day(day)
    daytime = td(lat, day_int)
    sunrise, sunset = day_range(daytime, lng, fuse)

    # Attach the offsets to the calendar date to get full timestamps
    daytime = pd.Timedelta(daytime, 'h')
    sunrise = pd.to_datetime(day) + sunrise
    sunset = pd.to_datetime(day) + sunset

    return (daytime, sunrise, sunset)

pd.DataFrame(map(daytime_from_day, days),
             columns=["Horas de Sol", "Nascer", "Pôr"])

Unnamed: 0,Horas de Sol,Nascer,Pôr
0,0 days 12:31:44.651722800,2008-01-01 05:15:54.450139,2008-01-01 17:47:39.101861
1,0 days 12:31:37.152210,2008-01-02 05:15:58.199894,2008-01-02 17:47:35.352106
2,0 days 12:31:29.034609599,2008-01-03 05:16:02.258696,2008-01-03 17:47:31.293304
3,0 days 12:31:20.303727600,2008-01-04 05:16:06.624136,2008-01-04 17:47:26.927864
4,0 days 12:31:10.964719200,2008-01-05 05:16:11.293640,2008-01-05 17:47:22.258360
5,0 days 12:31:01.023052800,2008-01-06 05:16:16.264474,2008-01-06 17:47:17.287526
6,0 days 12:30:50.484506400,2008-01-07 05:16:21.533747,2008-01-07 17:47:12.018253
7,0 days 12:30:39.355160400,2008-01-08 05:16:27.098419,2008-01-08 17:47:06.453581
8,0 days 12:30:27.641386800,2008-01-09 05:16:32.955306,2008-01-09 17:47:00.596694
9,0 days 12:30:15.349834800,2008-01-10 05:16:39.101083,2008-01-10 17:46:54.450917
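
The third step, converting the total daylight time into minutes per day, can be sketched as below. The two day lengths are copied from the first rows of the table above. The column name `Minutos de Sol` is hypothetical, and dividing `total_seconds()` by 60 is one possible approach rather than the method prescribed by the assignment.

```python
import pandas as pd

# First two day lengths copied from the table above
sun = pd.DataFrame({
    "Horas de Sol": pd.to_timedelta([
        "0 days 12:31:44.651722800",
        "0 days 12:31:37.152210",
    ])
})

# Hypothetical column name; total_seconds() / 60 gives fractional minutes
sun["Minutos de Sol"] = sun["Horas de Sol"].dt.total_seconds() / 60
print(sun)
```

On the full result, the same line can be applied to the `Horas de Sol` column of the DataFrame built from `daytime_from_day`.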
