---
# **Financial Data Analysis Tools**
---

This notebook computes simple moving averages (SMA) over several periods for a financial time series and then calculates the percentage distance between each closing price and those moving averages. It also plots histograms of the distributions of these percentage distances, and a candlestick chart of the price with the moving averages overlaid.
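
In symbols, writing $P_t$ for the closing price on day $t$ and $n$ for the window length (notation introduced here for illustration; it does not appear in the code), the two quantities computed below are:

$$
\mathrm{SMA}_n(t) = \frac{1}{n}\sum_{i=0}^{n-1} P_{t-i}, \qquad
D_n(t) = 100 \times \frac{P_t - \mathrm{SMA}_n(t)}{\mathrm{SMA}_n(t)}
$$

A positive $D_n(t)$ means the close is above its $n$-period average. For example, a close of 11 against an average of 10 gives $D_n(t) = 10\%$.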

### **Installing and Importing Libraries**

In [None]:
import yfinance as yf
import pandas as pd
import numpy as np
import seaborn as sns
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import scipy.stats as stats


### **Picking the Stock and Performing Calculations**

In [None]:
ticker = 'PETR4.SA'

In [None]:
data = yf.download(ticker, period = 'max')
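
Depending on the installed yfinance version, the downloaded frame may come back with two-level columns (price field, then ticker) rather than plain names such as `Close`. The functions below assume plain names, so a defensive step like the following sketch may help. It uses only pandas; the helper name `flatten_price_columns` is hypothetical, and the level order (field first) is an assumption to verify against the actual download.

```python
import pandas as pd

def flatten_price_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose columns are plain field names such as 'Close'.

    Hypothetical helper: keeps only the first column level when the
    columns form a MultiIndex, and leaves a flat DataFrame untouched.
    """
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex):
        # Assumption: level 0 holds the price field, level 1 the ticker.
        out.columns = out.columns.get_level_values(0)
    return out

# Synthetic stand-in for a single-ticker download with two-level columns.
columns = pd.MultiIndex.from_product([["Open", "Close"], ["PETR4.SA"]])
raw = pd.DataFrame([[5.8, 5.9], [5.5, 5.6]], columns=columns)

flat = flatten_price_columns(raw)
print(list(flat.columns))
```

With real data this would be called once, right after the download, as `data = flatten_price_columns(data)`.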



In [None]:
def SMA(data, *periods, column='Close'):
    """Add one SMA_<period> column per requested period (modifies data in place)."""
    for period in periods:
        # Rolling mean over the last `period` rows; the first period - 1 rows are NaN
        data[f'SMA_{period}'] = data[column].rolling(window=period).mean()
    return data

def calculate_distances(data, *periods, column='Close'):
    """Add the percentage distance between `column` and each existing SMA column."""
    for period in periods:
        sma_column = f'SMA_{period}'
        if sma_column in data.columns:
            # Positive values mean the price is above its moving average
            data[f'Distance_SMA{period} %'] = ((data[column] - data[sma_column]) / data[sma_column]) * 100
        else:
            raise ValueError(f'Missing SMA column for period {period}. Make sure to calculate SMA before calling this function.')
    return data

# Example usage:
data = SMA(data, 20, 50, 100)
data = calculate_distances(data, 20, 50, 100)
print(data.head())


             Open   High    Low  Close  Adj Close       Volume  SMA_20  \
Date                                                                     
2000-01-03  5.875  5.875  5.875  5.875   1.402440  35389440000     NaN   
2000-01-04  5.550  5.550  5.550  5.550   1.324858  28861440000     NaN   
2000-01-05  5.494  5.494  5.494  5.494   1.311491  43033600000     NaN   
2000-01-06  5.475  5.475  5.475  5.475   1.306955  34055680000     NaN   
2000-01-07  5.500  5.500  5.500  5.500   1.312922  20912640000     NaN   

            SMA_50  SMA_100  Distance_SMA20 %  Distance_SMA50 %  \
Date                                                              
2000-01-03     NaN      NaN               NaN               NaN   
2000-01-04     NaN      NaN               NaN               NaN   
2000-01-05     NaN      NaN               NaN               NaN   
2000-01-06     NaN      NaN               NaN               NaN   
2000-01-07     NaN      NaN               NaN               NaN   
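
The NaN values in these first rows are expected: with the default settings used above, a rolling mean over `period` rows is undefined until `period` observations are available. A minimal sketch on a toy series (the values and the 3-row window are made up for illustration) shows this warm-up, and how the `min_periods` argument of pandas' `rolling` changes it:

```python
import pandas as pd

prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# Default behavior: the first window - 1 rows are NaN.
strict = prices.rolling(window=3).mean()

# min_periods=1 averages whatever rows are available so far.
partial = prices.rolling(window=3, min_periods=1).mean()

print(strict.tolist())
print(partial.tolist())
```

With `min_periods=1`, the early values are averages over fewer than `window` rows, so they are not comparable to the later ones. Keeping the NaN rows, as the notebook does, avoids that.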

           

### **Creating a Function to Plot Histograms of the Distance Between the Moving Averages and the Current Closing Price**

In [None]:
def plot_distance_histograms(data, nbins=50):
    # Identify the distance columns
    distance_columns = [col for col in data.columns if col.startswith('Distance_SMA') and col.endswith('%')]

    if not distance_columns:
        raise ValueError("No distance columns found in the DataFrame. Make sure to calculate distances first.")

    # Color palette for the histograms
    colors = ['blue', 'green', 'red', 'purple', 'orange', 'brown', 'pink', 'gray', 'olive', 'cyan']

    # Create one subplot row per distance column
    fig = make_subplots(rows=len(distance_columns), cols=1, subplot_titles=[f'Distribution of {col}' for col in distance_columns])

    # Add the histograms (subplot rows are 1-based, hence start=1)
    for i, col in enumerate(distance_columns, start=1):
        color = colors[(i - 1) % len(colors)]  # Start at the first color and cycle through the palette if needed
        fig.add_trace(go.Histogram(x=data[col], name=col, nbinsx=nbins, marker_color=color, opacity=0.75), row=i, col=1)

    # Adjust the layout
    fig.update_layout(
        height=400*len(distance_columns),  # Scale the height with the number of subplots for readability
        title_text="Histograms of Distance Percentages to SMAs",
        title_x=0.5,
        showlegend=False,
        template='plotly_white',
        margin=dict(l=50, r=50, t=80, b=50)
    )

    # Set the axis titles
    fig.update_xaxes(title_text="Distance Percentage (%)")
    fig.update_yaxes(title_text="Frequency")

    fig.show()

# Example usage:
plot_distance_histograms(data, nbins=300)
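
The histograms show the shape of each distribution. A numeric summary can complement them, for instance the quantiles of each distance column. The sketch below is one possible way to do this with pandas; the helper name `summarize_distances`, the chosen quantiles and the toy values are all assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

def summarize_distances(data: pd.DataFrame, quantiles=(0.05, 0.5, 0.95)) -> pd.DataFrame:
    """Hypothetical helper: one row per distance column, one column per quantile."""
    distance_columns = [c for c in data.columns
                        if c.startswith('Distance_SMA') and c.endswith('%')]
    if not distance_columns:
        raise ValueError("No distance columns found in the DataFrame.")
    # quantile() skips NaN values, so the SMA warm-up rows need no special handling.
    return data[distance_columns].quantile(list(quantiles)).T

# Toy data standing in for the real distance columns.
toy = pd.DataFrame({
    'Close': [10.0, 11.0, 12.0, 13.0, 14.0],
    'Distance_SMA20 %': [float('nan'), -2.0, 0.0, 2.0, 4.0],
})
summary = summarize_distances(toy)
print(summary)
```

On the real data, `summarize_distances(data)` would return one row each for the 20-, 50- and 100-period distances.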


**Normality Test**

In [None]:
def check_normality(data):
    # Identify the distance columns
    distance_columns = [col for col in data.columns if col.startswith('Distance_SMA') and col.endswith('%')]

    if not distance_columns:
        raise ValueError("No distance columns found in the DataFrame. Make sure to calculate distances first.")

    normality_results = {}

    for col in distance_columns:
        # Shapiro-Wilk test on the non-missing values of each distance column
        stat, p_value = stats.shapiro(data[col].dropna())
        normality_results[col] = {'Statistic': stat, 'p-value': p_value}

    return normality_results

# Example usage:
normality_results = check_normality(data)

# Display the results
for col, result in normality_results.items():
    print(f"{col}:")
    print(f"  Statistic: {result['Statistic']}")
    print(f"  p-value: {result['p-value']}\n")


Distance_SMA20 %:
  Statistic: 0.946096658706665
  p-value: 1.1448608453533755e-42

Distance_SMA50 %:
  Statistic: 0.9711710214614868
  p-value: 2.393739410803066e-33

Distance_SMA100 %:
  Statistic: 0.9829591512680054
  p-value: 2.1526966397591666e-26




p-value may not be accurate for N > 5000.
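
This warning comes from SciPy: the series has more than 5000 observations, so the Shapiro-Wilk p-values above are approximate. One option, not used in the original notebook, is to cross-check with a second test such as the D'Agostino-Pearson test, available as `scipy.stats.normaltest`. The sketch below applies it to synthetic, deliberately skewed data; the helper name `check_normality_large_n` is hypothetical.

```python
import numpy as np
import pandas as pd
import scipy.stats as stats

def check_normality_large_n(data: pd.DataFrame) -> dict:
    """Hypothetical helper: D'Agostino-Pearson test on each distance column."""
    distance_columns = [c for c in data.columns
                        if c.startswith('Distance_SMA') and c.endswith('%')]
    if not distance_columns:
        raise ValueError("No distance columns found in the DataFrame.")
    results = {}
    for col in distance_columns:
        # Drop the NaN warm-up rows before testing.
        stat, p_value = stats.normaltest(data[col].dropna().to_numpy())
        results[col] = {'Statistic': float(stat), 'p-value': float(p_value)}
    return results

# Synthetic, strongly skewed sample standing in for a real distance column.
rng = np.random.default_rng(0)
toy = pd.DataFrame({'Distance_SMA20 %': rng.exponential(scale=1.0, size=6000)})
results = check_normality_large_n(toy)
print(results)
```

With thousands of observations, even small departures from normality produce very small p-values, so the histograms remain a useful complement to either test.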



**Plotting Candlestick Chart with SMAs**

In [None]:
def plot_candlestick_with_sma(data, column='Close'):
    # Check that the required columns are present
    required_columns = ['Open', 'High', 'Low', 'Close']
    for col in required_columns:
        if col not in data.columns:
            raise ValueError(f"Data must contain '{col}' column.")

    # Identify the SMA columns
    sma_columns = [col for col in data.columns if col.startswith('SMA_')]

    if not sma_columns:
        raise ValueError("No SMA columns found in the DataFrame. Make sure to calculate SMAs first.")

    # Create the candlestick figure
    fig = go.Figure(data=[go.Candlestick(
        x=data.index,
        open=data['Open'],
        high=data['High'],
        low=data['Low'],
        close=data['Close'],
        name='Candlestick'
    )])

    # Add each SMA to the chart as a line
    for sma_column in sma_columns:
        fig.add_trace(go.Scatter(
            x=data.index,
            y=data[sma_column],
            mode='lines',
            name=sma_column
        ))

    # Update the chart layout
    fig.update_layout(
        title='Candlestick Chart with SMAs',
        xaxis_title='Date',
        yaxis_title='Price',
        template='plotly_white'
    )

    fig.show()

# Example usage (the SMAs were already computed above; recomputing them is harmless):
data = SMA(data, 20, 50, 100)
plot_candlestick_with_sma(data)
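
Plotting the full history in one chart can make individual candles hard to see. One possible approach, sketched here on synthetic prices, is to compute the SMAs on the full history first and only then slice the rows to plot, so the moving averages at the start of the window are already defined. The data, the 5-row window and the cut-off date are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes standing in for the downloaded data.
index = pd.date_range('2024-01-01', periods=30, freq='D')
history = pd.DataFrame({'Close': np.linspace(10.0, 39.0, 30)}, index=index)

# Compute the SMA on the full history first...
history['SMA_5'] = history['Close'].rolling(window=5).mean()

# ...then keep only the rows to be plotted.
recent = history.loc['2024-01-15':]

# Slicing first would leave the start of the window without an SMA.
sliced_first = history.loc['2024-01-15':, ['Close']].copy()
sliced_first['SMA_5'] = sliced_first['Close'].rolling(window=5).mean()

print(recent['SMA_5'].isna().sum(), sliced_first['SMA_5'].isna().sum())
```

With the real data, a call such as `plot_candlestick_with_sma(data.loc['2023':])` would plot only the rows from 2023 onward while keeping SMAs that were computed on the full series.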
