# Statistical and Machine Learning Models for Fundamental Data

This notebook is aimed at investors interested in the Brazilian macroeconomic scenario. It analyzes the Selic rate alongside the trend predictions of a classification model to support investment decisions, with a focus on identifying opportunities and mitigating risks. Interactive visualizations keep it accessible and practical for both experienced investors and beginners.

## Initial Setup

### Install Packages

In [1]:
%pip install pandas -q
%pip install plotly -q
%pip install scikit-learn -q

Note: you may need to restart the kernel to use updated packages.


### Import Libraries

In [23]:
import os
from pathlib import Path
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import warnings
warnings.filterwarnings('ignore')

### Set the Default File Path

In [3]:
file_path_scored = str(Path(os.getcwd()).parent.parent.parent / "data/scored_base")

### Pandas Config

In [4]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

### Load data

In [5]:
df_macroeconomic_scored = pd.read_csv(file_path_scored + "/macroeconomic_selic_scored.csv")
df_macroeconomic_scored.head()

| Unnamed: 0 | selic | selic_trend | predicted_trend | probability_descend | probability_ascend | date |
|---|---|---|---|---|---|---|
| 0 | 6.5 | 0 | 0 | 1.0 | 0.0 | 2019-01-31 |
| 1 | 6.5 | 0 | 0 | 0.798864 | 0.201136 | 2019-02-28 |
| 2 | 6.5 | 0 | 1 | 0.414501 | 0.585499 | 2019-03-31 |
| 3 | 6.5 | 0 | 0 | 0.922727 | 0.077273 | 2019-04-30 |
| 4 | 6.5 | 0 | 0 | 0.729672 | 0.270328 | 2019-05-31 |
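
The `Unnamed: 0` column in the preview above is most likely the row index that was written out when the CSV was saved. A minimal sketch, assuming the file layout matches the preview, reads that column back as the index and parses `date` into real timestamps so that sorting and date filtering behave as expected. The inline `sample_csv` string only reproduces the five preview rows so the snippet runs without the file; `df_sample` is a name introduced here for illustration.

```python
import io

import pandas as pd

# The five preview rows, reproduced inline so the example is self-contained.
sample_csv = """,selic,selic_trend,predicted_trend,probability_descend,probability_ascend,date
0,6.5,0,0,1.0,0.0,2019-01-31
1,6.5,0,0,0.798864,0.201136,2019-02-28
2,6.5,0,1,0.414501,0.585499,2019-03-31
3,6.5,0,0,0.922727,0.077273,2019-04-30
4,6.5,0,0,0.729672,0.270328,2019-05-31
"""

# index_col=0 consumes the unnamed first column; parse_dates converts `date`.
df_sample = pd.read_csv(io.StringIO(sample_csv), index_col=0, parse_dates=["date"])
print(df_sample.dtypes)
```

The same two arguments can be passed to the `pd.read_csv` call that loads `macroeconomic_selic_scored.csv`.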


## Insights on Models Output (RandomForestClassifier)

### Line Chart Timeline for Selic

In [6]:
fig = px.line(df_macroeconomic_scored, x='date', y='selic',
              color_discrete_sequence=['rgb(100, 195, 181)'])
# Title, theme, and axis labels are all set in one place.
fig.update_layout(title_text='Timeline for Selic', template='plotly_dark',
                  xaxis_title='Date', yaxis_title='Selic (%)')

fig.show()

**`Selic Rate and Predictive Trend Analysis`**

This analysis covers the Selic rate, the central interest rate of Brazil, and its `predictive trends` from **`January 2019`** to **`April 2021`**.

- The **`Selic`** rate began at **6.50%** in **`January 2019`** and showed `stability` until **`July 2019`**.

- A downward trend commenced in **`August 2019`**, with a `decrease` to **5.00%** by **`October 2019`**, and reaching a low of **2.00%** from **`August 2020`** onwards.

- The predictive trend indicates `fluctuations`, with a probability of descent as `high` as **1.000000** in **`December 2019`**, signaling that the model was fully confident of a `decrease`.

- Conversely, the probability of an `ascent` peaked at **0.649136** in **`January`** and **`July 2020`**, highlighting expectations of a `rate increase`.

- The rate experienced a `slight uptick` to **2.75%** in **`March 2021`**, with a predicted trend of `ascent` and a `higher probability` (**0.570726**) of `increasing further`.

The graph depicts a period of economic easing with a pronounced reduction in the Selic rate, possibly in response to economic conditions. The model's predicted trends and probabilities broadly follow these movements, indicating a responsiveness to macroeconomic indicators and future expectations.
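
The headline figures quoted in the bullets above (starting rate, lowest rate, first month in which the rate changed, and the peak probability of an ascent) can be recomputed directly from the scored table. The sketch below is one possible way to do that; `summarize_selic` is a hypothetical helper, and the five preview rows stand in for the full file, so the values it prints describe the preview only.

```python
import pandas as pd


def summarize_selic(df: pd.DataFrame) -> dict:
    """Collect the headline figures for the Selic series (hypothetical helper)."""
    df = df.sort_values("date")
    # Dates on which the rate differs from the previous observation.
    changed = df.loc[df["selic"].diff().fillna(0) != 0, "date"]
    return {
        "start_rate": df["selic"].iloc[0],
        "min_rate": df["selic"].min(),
        "first_change": changed.iloc[0] if not changed.empty else None,
        "peak_ascend_prob": df["probability_ascend"].max(),
        "peak_ascend_date": df.loc[df["probability_ascend"].idxmax(), "date"],
    }


# Stand-in data: the five rows shown in the data preview.
preview = pd.DataFrame({
    "date": pd.to_datetime(["2019-01-31", "2019-02-28", "2019-03-31",
                            "2019-04-30", "2019-05-31"]),
    "selic": [6.5, 6.5, 6.5, 6.5, 6.5],
    "probability_ascend": [0.0, 0.201136, 0.585499, 0.077273, 0.270328],
})

summary = summarize_selic(preview)
print(summary)
```

Calling `summarize_selic(df_macroeconomic_scored)` on the full dataframe would return the same fields for the whole period.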


### Probabilities of Upside and Downside

In [7]:
color_discrete_map = {
    'probability_descend': 'rgb(169, 214, 206)',
    'probability_ascend':'rgb(100, 195, 181)'
}

fig = px.bar(df_macroeconomic_scored, x='date', y=['probability_ascend', 'probability_descend'],
             title='High and Low Probabilities Over Time',
             labels={'value': 'Probability', 'variable': 'Type of Probability'},
             template='plotly_dark', color='variable',
             color_discrete_map=color_discrete_map)
# The probabilities are fractions between 0 and 1, so the axis is not labeled as a percentage.
fig.update_layout(xaxis_title='Date', yaxis_title='Probability')

fig.show()


**`Probability Trends for Selic Rate Changes`**

This bar chart represents the `probability trends` for the Selic rate, the central interest rate of Brazil, from **`January 2019`** to **`April 2021`**.

- The probability of a `descent` in the Selic rate reached its **peak** of **100%** several times during the period, indicating that the model was at times fully confident of a `decrease`.

- The probability of an `ascent` fluctuated, reaching as `high` as **0.649136** in **`January and July 2020`**, which suggests a significant expectation of an `increase` in the Selic rate.

- The `light teal bars` represent the `descending probability`, while the `darker teal bars` indicate the `ascending probability`.

- A `notable shift` in probabilities can be observed in **`March 2021`**, where the likelihood of an increase `rose` to a probability of **0.570726**, marking a potential change in the economic outlook.

This chart provides a visual representation of the model's predicted probabilities for the Selic rate's movement, offering insight into economic sentiment and potential monetary policy shifts over the analyzed period.
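
The two probability columns and the `predicted_trend` column are closely related. The sketch below checks one plausible relationship, namely that the predicted class is simply whichever side has the higher probability. This is an assumption about how the scored file was produced, not something the notebook states, and the five preview rows are used as stand-in data.

```python
import pandas as pd

# Stand-in data: the five rows shown in the data preview.
preview = pd.DataFrame({
    "predicted_trend": [0, 0, 1, 0, 0],
    "probability_descend": [1.0, 0.798864, 0.414501, 0.922727, 0.729672],
    "probability_ascend": [0.0, 0.201136, 0.585499, 0.077273, 0.270328],
})

# Assumption: class 1 (ascent) is predicted when its probability is the larger one.
derived_trend = (preview["probability_ascend"] > preview["probability_descend"]).astype(int)

# Compare the derived labels with the labels stored in the file.
matches = bool((derived_trend == preview["predicted_trend"]).all())
print(matches)
```

If the check also holds on the full dataframe, the bars can be read directly as the model's vote for each month: whichever bar is taller is the predicted direction.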


In [8]:
color_ascend = 'rgba(169, 214, 206, 0.5)'

fig = px.area(df_macroeconomic_scored, x='date',
              y=['probability_ascend', 'probability_descend'],
              title='Probabilities Over Time',
              labels={'value': 'Probability', 'variable': 'Type of Probability'},
              color_discrete_map={'probability_ascend': color_ascend},
              color='variable', template='plotly_dark')
# Fill the descent area in semi-transparent red.
fig.update_traces(fillcolor='rgba(255, 0, 0, 0.5)', selector=dict(name='probability_descend'))
# The probabilities are fractions between 0 and 1, so the axis is not labeled as a percentage.
fig.update_layout(xaxis_title='Date', yaxis_title='Probability')
fig.show()


**`Analysis of Probability Trends for Selic Rate Changes`**

This area chart depicts the changing `probability trends` for the Selic rate, Brazil's central interest rate, from **`January 2019`** to **`April 2021`**.

- The `ascending probability` (chance of increase) is shown in a `light teal area`, while the `descending probability` (chance of decrease) is represented by a `dark red area`.
  
- Throughout the timeline, the `descending probability` is consistently `high`, dominating the chart with values close to **1.00 (100%)**.

- The `ascending probability` sees several spikes, most notably in **`July 2019`**, **`January 2020`**, and **`July 2020`**, where it peaks just above **0.60 (60%)**, suggesting a significant but not dominant expectation of an `increase` in the Selic rate.

- The chart culminates in **`April 2021`** with a sharp increase in `ascending probability`, indicating a shifting market sentiment towards a potential `rise` in the interest rate.

This visual analysis gives an overview of the model's predictions: the overall dominance of the `descending probability` points to a period of expected rate `decreases`, while the spikes in `ascending probability` reflect brief expectations of policy tightening.
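
One detail worth keeping in mind when reading the area chart: `px.area` stacks its series by default. If the two probabilities are complementary, the top of the stack sits at 1.0 for every date, and it is the height of each band, not its upper edge, that carries the information. A minimal check of that complementarity, using the five preview rows as stand-in data, could look like this:

```python
import pandas as pd

# Stand-in data: the five rows shown in the data preview.
preview = pd.DataFrame({
    "probability_descend": [1.0, 0.798864, 0.414501, 0.922727, 0.729672],
    "probability_ascend": [0.0, 0.201136, 0.585499, 0.077273, 0.270328],
})

# Sum the two class probabilities for every row.
total = preview["probability_ascend"] + preview["probability_descend"]

# Allow a small tolerance for floating-point rounding in the stored values.
sums_to_one = bool(((total - 1.0).abs() < 1e-6).all())
print(sums_to_one)
```

When the sum is always 1, plotting a single series (for example only `probability_ascend`) conveys the same information with less visual clutter.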


In [19]:
fig = px.line(df_macroeconomic_scored, x='date', y='predicted_trend', markers=True, title='Predicted Trend and Selic Over Time')

fig.add_scatter(x=df_macroeconomic_scored['date'], y=df_macroeconomic_scored['selic'], mode='lines+markers', name='Selic')
fig.add_scatter(x=df_macroeconomic_scored['date'], y=df_macroeconomic_scored['selic_trend'], mode='lines+markers', name='Selic Trend')

fig.show()


In [24]:
# Build the predicted-trend trace explicitly so it can be added to the Selic figure.
line_trace = go.Scatter(x=df_macroeconomic_scored['date'], y=df_macroeconomic_scored['predicted_trend'],
                        mode='lines+markers', name='Predicted Trend')

fig = px.line(df_macroeconomic_scored, x='date', y='selic', markers=True,
              title='Predicted Trend and Selic Over Time')
fig.add_trace(line_trace)
# Selic is drawn a second time with a name so that it has a legend entry.
fig.add_scatter(x=df_macroeconomic_scored['date'], y=df_macroeconomic_scored['selic'], mode='lines+markers', name='Selic')
fig.add_scatter(x=df_macroeconomic_scored['date'], y=df_macroeconomic_scored['selic_trend'], mode='lines+markers', name='Selic Trend')
fig.update_layout(template='plotly_dark')

fig.show()

**`Predicted Trend and Selic Rate Over Time with Seasonal Adjustments`**

This graph provides an updated visualization of the `predicted trend`, actual `Selic rate`, and the `Selic trend` from **`January 2019`** to **`April 2021`**.

- The actual **`Selic`** rate is indicated by a `cyan line` with markers, which shows a consistent `decrease` from approximately **6.50%** in **`January 2019`** to about **2.00%** in **`January 2021`**.

- The **`Predicted Trend`** is shown by an `orange line` with markers. This reflects the expected movement in the Selic rate, which shows `spikes` at various points, potentially indicating anticipated rate changes.

- The **`Selic Trend`**, highlighted by a `purple line` with markers, represents the trend based on the COPOM meetings. A value of **0** here indicates periods of `no variation`, which may be attributed to the `seasonality` of COPOM's meetings. This same value of **0** can also signify periods where a `decrease` in the rate was expected or occurred.

- It's important to note the `seasonality` in the Selic trend, where the value of **0** often precedes or aligns with the COPOM meeting schedule, emphasizing the influence of these meetings on the Selic rate's stability or change.

This graph underscores the importance of considering `seasonal adjustments` and `institutional schedules` when interpreting economic data. The Selic trend line provides a nuanced understanding of the rate's movements, particularly in relation to the timing of COPOM meetings and their impact on monetary policy.
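
The notebook does not show how `selic_trend` was built. One reading consistent with the description above (a value of **0** for both unchanged and decreasing months) is a flag that is 1 only when the rate rose relative to the previous observation. The sketch below implements that reading; `rate_trend` is a hypothetical helper, and the three example values are taken from the narrative (2.00% through February 2021, 2.75% in March 2021), not from the file.

```python
import pandas as pd


def rate_trend(selic: pd.Series) -> pd.Series:
    """Return 1 where the rate rose versus the previous row, else 0 (assumed rule)."""
    # The first row has no predecessor; its NaN difference compares as False, giving 0.
    return (selic.diff() > 0).astype(int)


# Illustrative values based on the narrative, indexed by month-end date.
example = pd.Series(
    [2.00, 2.00, 2.75],
    index=pd.to_datetime(["2021-01-31", "2021-02-28", "2021-03-31"]),
)

trend = rate_trend(example)
print(trend.tolist())
```

Under this rule a flat stretch between COPOM meetings and a rate cut both map to 0, which is why the flag alone cannot distinguish stability from easing.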


## TL;DR

**`Comprehensive Analysis of Selic Rate and Predictive Trends`**

A detailed examination of the provided dataset reveals a narrative about Brazil's central interest rate, the Selic rate, and the predictive trends from **`January 2019`** to **`April 2021`**.

- Starting at **6.50%** in **`January 2019`**, the **`Selic`** rate displayed initial `stability` but began a downward trend after **`July 2019`**, eventually settling at around **2.00%**.
- The `predictive trends` and associated probabilities indicated the model's expectations, with **`100%`** probabilities of a descent at times, signifying strong conviction in an impending decrease.
- The value of **0** in `selic_trend` often indicated `no variation` due to the `seasonal nature` of COPOM meetings, but also aligned with periods of actual `declines` in the Selic rate.
- Notable peaks in the probability of an ascent, particularly in **`January`** and **`July 2020`**, suggested anticipated increases, which provides insight into market sentiment and the challenges of forecasting.
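
To put a number on the "challenges of forecasting" mentioned above, the predicted labels can be compared with the observed ones. The sketch below computes a plain accuracy and a confusion table with pandas alone, using the five preview rows as stand-in data, so the figures describe the preview and not the full period.

```python
import pandas as pd

# Stand-in data: the five rows shown in the data preview.
preview = pd.DataFrame({
    "selic_trend": [0, 0, 0, 0, 0],
    "predicted_trend": [0, 0, 1, 0, 0],
})

# Share of months where the predicted label equals the observed label.
accuracy = (preview["selic_trend"] == preview["predicted_trend"]).mean()

# Rows are observed labels, columns are predicted labels.
confusion = pd.crosstab(
    preview["selic_trend"], preview["predicted_trend"],
    rownames=["actual"], colnames=["predicted"],
)

print(f"accuracy = {accuracy:.2f}")
print(confusion)
```

Because the scikit-learn package is already installed in this notebook, `sklearn.metrics.accuracy_score` could be used instead and would give the same accuracy value.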

**`Potential Further Analyses with Expanded Data`**

With a broader dataset, the analysis could extend to:

- **`Correlation Studies`**: Exploring how the Selic rate interacts with other macroeconomic variables.
- **`Predictive Modeling`**: Employing advanced econometric models to forecast future economic conditions.
- **`Sentiment Analysis`**: Assessing market sentiment through qualitative data to understand its influence on monetary policy.
- **`Sectoral Impact Assessment`**: Investigating the differential impact of the Selic rate across economic sectors.

**`Conclusion for Novice Investors`**

For novice investors, the insights gleaned from this analysis highlight the importance of understanding:

- The impact of the **`Selic rate`** on the broader economy and investment decisions.
- How market expectations, as reflected by probabilities, can guide investment strategies.
- The value of continual learning and staying informed about macroeconomic trends for informed investment decisions.
- How recognizing the seasonality inherent in monetary policy decisions can provide a strategic edge in investment planning.

In essence, novice investors can learn that a keen understanding of monetary policy dynamics and economic indicators is essential for successful navigation in the investment landscape.
