# **Exploratory Data Analysis: Companies**

In the notebook dedicated to the exploratory data analysis (EDA) of companies listed on B3, the filters are specifically designed to meet the needs of investors with a Buy and Hold strategy. These filters allow for better visualization and interpretation of data, facilitating the identification of companies that offer good long-term appreciation and stability, which is essential for this type of investment.

Here are the fundamental metrics that are used to filter and assess the companies:

- **`market_cap`**: to filter companies by their market capitalization, allowing the identification of companies with greater market stability and solidity.
- **`dividend_rate`**: to select companies based on the dividend rate paid, which can be an indicator of consistent cash generation.
- **`trailing_annual_dividend_yield`**: to select companies based on the dividend yield of the past 12 months, a metric that relates the dividends paid to the stock price.
- **`debt_to_equity`**: to analyze the ratio between the company's total debt and its equity, which helps to understand the level of indebtedness and financial risk.
- **`trailing_pe`**: (trailing Price to Earnings) to compare the current stock price with the earnings per share over the last 12 months, a classic valuation metric.
- **`forward_pe`**: (forward Price to Earnings) to compare the current stock price with expected earnings per share, providing a forward-looking estimate of the price-earnings ratio.
- **`price_to_book`**: for comparison between the market value of the company and its book value, which can signal stocks traded below their net asset value.
- **`earnings_quarterly_growth`**: to assess the company's earnings growth compared to the same quarter of the previous year, showing growth or decline trends.

These columns are at the core of a Buy and Hold analysis, as they reflect the fundamental aspects long-term investors generally seek: sustainable growth, financial solidity, and dividend return. Visualizing the data filtered by these columns makes the investment analysis more efficient and better founded.
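As a minimal sketch of how such a filter could look, the snippet below screens a tiny frame built from the first rows of the fundamentals table shown later in this notebook. The threshold constants (`MIN_MARKET_CAP`, `MIN_DIVIDEND_YIELD`, `MAX_TRAILING_PE`) are hypothetical and chosen only for illustration; they are not values used by the notebook.

```python
import pandas as pd

# Small illustrative frame; values copied from the first rows of the
# fundamentals table displayed in the "Load data" section.
df_example = pd.DataFrame(
    {
        "ticker": ["ABCB4.SA", "AGRO3.SA", "RAIL3.SA"],
        "market_cap": [4_265_434_000.0, 2_466_480_000.0, 42_288_820_000.0],
        "trailing_annual_dividend_yield": [0.080687, 0.132029, 0.002993],
        "trailing_pe": [4.069768, 9.450382, 54.309525],
    }
)

# Hypothetical thresholds (assumptions, not taken from the notebook).
MIN_MARKET_CAP = 3_000_000_000  # R$ 3 billion
MIN_DIVIDEND_YIELD = 0.05       # 5%, expressed as a fraction
MAX_TRAILING_PE = 15

# Combine the conditions with "&"; each comparison yields a boolean Series.
mask = (
    (df_example["market_cap"] >= MIN_MARKET_CAP)
    & (df_example["trailing_annual_dividend_yield"] >= MIN_DIVIDEND_YIELD)
    & (df_example["trailing_pe"] > 0)
    & (df_example["trailing_pe"] <= MAX_TRAILING_PE)
)
df_screened = df_example[mask]
print(df_screened["ticker"].tolist())
```

Requiring `trailing_pe > 0` is a design choice for this sketch: in the table, a value of `0.0` seems to stand in for a missing or negative ratio, so it should not be read as "cheap".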

## **Initial Setup**

### Install Packages

In [2]:
%pip install pandas -q
%pip install plotly -q

Note: you may need to restart the kernel to use updated packages.


### Import libs

In [3]:
import os
import itertools
import pandas as pd
from pathlib import Path
import plotly.express as px
import plotly.graph_objects as go
import plotly.subplots as sp

### Pandas Config

In [4]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

### Create a file path default

In [5]:
file_path_book = str(Path(os.getcwd()).parent.parent.parent / "data/book")

## Companies

### Load data

In [6]:
df_fundamentals_book = pd.read_csv(file_path_book + "/fundamentals_book.csv")
df_fundamentals_book.head(5)

Unnamed: 0,ticker,long_name,sector,industry,market_cap,enterprise_value,total_revenue,profit_margins,operating_margins,dividend_rate,beta,ebitda,trailing_pe,forward_pe,volume,average_volume,fifty_two_week_low,fifty_two_week_high,price_to_sales_trailing_12_months,fifty_day_average,two_hundred_day_average,trailing_annual_dividend_rate,trailing_annual_dividend_yield,book_value,price_to_book,total_cash,total_cash_per_share,total_debt,earnings_quarterly_growth,revenue_growth,gross_margins,ebitda_margins,return_on_assets,return_on_equity,gross_profits,total_assets_approx,asset_turnover,earnings_growth_rate,dividend_payout_ratio,equity,debt_to_equity,roi,roce
0,ABCB4.SA,Banco ABC Brasil S.A.,Financial Services,Banks - Regional,4265434000.0,14773390000.0,1941779000.0,0.41576,0.38826,1.56,0.679,0.0,4.069768,4.706601,92300.0,747165.0,15.85,21.99,2.196663,19.3382,18.14667,1.55,0.080687,24.518,0.785138,7774306000.0,35.162,18298460000.0,0.001,0.003,0.0,0.0,0.0153,0.1568,1973086000.0,7774306000.0,0.249769,0.1,155000.0,-10524160000.0,-1.73871,0.131438,0.0
1,AGRO3.SA,BrasilAgro - Companhia Brasileira de Proprieda...,Consumer Defensive,Farm Products,2466480000.0,2912933000.0,1249437000.0,0.21493,0.25031,3.21,0.432,264892000.0,9.450382,6.332481,298100.0,666692.0,22.29,32.71,1.974073,27.0106,25.58635,3.24,0.132029,22.237,1.11346,383837000.0,3.885,872075000.0,6.801,0.671,0.25252,0.21201,0.03839,0.1217,315504000.0,383837000.0,3.255124,680.1,47.640053,-488238000.0,-1.786168,0.428927,0.079343
2,RAIL3.SA,Rumo S.A.,Industrials,Railroads,42288820000.0,55243050000.0,10317460000.0,0.07639,0.33544,0.07,0.227,4522541000.0,54.309525,21.72381,5733400.0,14644522.0,16.21,24.44,4.098764,22.5852,20.95235,0.066,0.002993,8.334,2.736981,7656040000.0,4.132,21843200000.0,3.935,0.121,0.34493,0.43834,0.04252,0.05163,3146360000.0,7656040000.0,1.347623,393.5,1.677255,-14187160000.0,-1.539646,0.186765,0.070519
3,ALPA3.SA,Alpargatas S.A.,Consumer Cyclical,Footwear & Accessories,5309793000.0,6482982000.0,4022153000.0,-0.05671,-0.06434,0.4,0.571,-198000.0,0.0,0.0,1100.0,3953.0,7.27,17.8,1.320137,8.7146,9.6354,0.0,0.0,7.867,1.008008,414288000.0,0.614,1550341000.0,0.0,-0.127,0.43246,-5e-05,-0.0091,-0.04153,1968303000.0,414288000.0,9.708591,0.0,0.0,-1136053000.0,-1.364673,0.620417,-2.9e-05
4,ALPA4.SA,Alpargatas S.A.,Consumer Cyclical,Footwear & Accessories,5350758000.0,6395236000.0,4022153000.0,-0.05671,-0.06434,0.43,0.571,-198000.0,0.0,14.555555,1132100.0,5605825.0,6.81,22.51,1.330322,8.3228,9.2729,0.0,0.0,7.867,0.99911,414288000.0,0.614,1550341000.0,0.0,-0.127,0.43246,-5e-05,-0.0091,-0.04153,1968303000.0,414288000.0,9.708591,0.0,0.0,-1136053000.0,-1.364673,0.62893,-2.9e-05


In [7]:
df_fundamentals_numeric_cols = df_fundamentals_book.select_dtypes(include="number")
df_fundamentals_numeric_cols.head(5)

Unnamed: 0,market_cap,enterprise_value,total_revenue,profit_margins,operating_margins,dividend_rate,beta,ebitda,trailing_pe,forward_pe,volume,average_volume,fifty_two_week_low,fifty_two_week_high,price_to_sales_trailing_12_months,fifty_day_average,two_hundred_day_average,trailing_annual_dividend_rate,trailing_annual_dividend_yield,book_value,price_to_book,total_cash,total_cash_per_share,total_debt,earnings_quarterly_growth,revenue_growth,gross_margins,ebitda_margins,return_on_assets,return_on_equity,gross_profits,total_assets_approx,asset_turnover,earnings_growth_rate,dividend_payout_ratio,equity,debt_to_equity,roi,roce
0,4265434000.0,14773390000.0,1941779000.0,0.41576,0.38826,1.56,0.679,0.0,4.069768,4.706601,92300.0,747165.0,15.85,21.99,2.196663,19.3382,18.14667,1.55,0.080687,24.518,0.785138,7774306000.0,35.162,18298460000.0,0.001,0.003,0.0,0.0,0.0153,0.1568,1973086000.0,7774306000.0,0.249769,0.1,155000.0,-10524160000.0,-1.73871,0.131438,0.0
1,2466480000.0,2912933000.0,1249437000.0,0.21493,0.25031,3.21,0.432,264892000.0,9.450382,6.332481,298100.0,666692.0,22.29,32.71,1.974073,27.0106,25.58635,3.24,0.132029,22.237,1.11346,383837000.0,3.885,872075000.0,6.801,0.671,0.25252,0.21201,0.03839,0.1217,315504000.0,383837000.0,3.255124,680.1,47.640053,-488238000.0,-1.786168,0.428927,0.079343
2,42288820000.0,55243050000.0,10317460000.0,0.07639,0.33544,0.07,0.227,4522541000.0,54.309525,21.72381,5733400.0,14644522.0,16.21,24.44,4.098764,22.5852,20.95235,0.066,0.002993,8.334,2.736981,7656040000.0,4.132,21843200000.0,3.935,0.121,0.34493,0.43834,0.04252,0.05163,3146360000.0,7656040000.0,1.347623,393.5,1.677255,-14187160000.0,-1.539646,0.186765,0.070519
3,5309793000.0,6482982000.0,4022153000.0,-0.05671,-0.06434,0.4,0.571,-198000.0,0.0,0.0,1100.0,3953.0,7.27,17.8,1.320137,8.7146,9.6354,0.0,0.0,7.867,1.008008,414288000.0,0.614,1550341000.0,0.0,-0.127,0.43246,-5e-05,-0.0091,-0.04153,1968303000.0,414288000.0,9.708591,0.0,0.0,-1136053000.0,-1.364673,0.620417,-2.9e-05
4,5350758000.0,6395236000.0,4022153000.0,-0.05671,-0.06434,0.43,0.571,-198000.0,0.0,14.555555,1132100.0,5605825.0,6.81,22.51,1.330322,8.3228,9.2729,0.0,0.0,7.867,0.99911,414288000.0,0.614,1550341000.0,0.0,-0.127,0.43246,-5e-05,-0.0091,-0.04153,1968303000.0,414288000.0,9.708591,0.0,0.0,-1136053000.0,-1.364673,0.62893,-2.9e-05


In [52]:
df_top_companies = df_fundamentals_book.nlargest(18, ['market_cap', 'dividend_rate', 'trailing_annual_dividend_yield', 'trailing_annual_dividend_rate', 'debt_to_equity', 'return_on_equity', 'trailing_pe', 'forward_pe', 'price_to_book', 'earnings_quarterly_growth']).drop_duplicates(subset='long_name', keep='first').reset_index(drop=True)
df_top_companies

Unnamed: 0,ticker,long_name,sector,industry,market_cap,enterprise_value,total_revenue,profit_margins,operating_margins,dividend_rate,beta,ebitda,trailing_pe,forward_pe,volume,average_volume,fifty_two_week_low,fifty_two_week_high,price_to_sales_trailing_12_months,fifty_day_average,two_hundred_day_average,trailing_annual_dividend_rate,trailing_annual_dividend_yield,book_value,price_to_book,total_cash,total_cash_per_share,total_debt,earnings_quarterly_growth,revenue_growth,gross_margins,ebitda_margins,return_on_assets,return_on_equity,gross_profits,total_assets_approx,asset_turnover,earnings_growth_rate,dividend_payout_ratio,equity,debt_to_equity,roi,roce
0,PETR3.SA,Petróleo Brasileiro S.A. - Petrobras,Energy,Oil & Gas Integrated,483211900000.0,729364000000.0,581563000000.0,0.26889,0.39067,9.28,1.097,291087000000.0,2.914199,4.821608,2993300.0,13234123.0,23.61,41.86,0.830885,37.0388,31.98075,9.657,0.247298,28.417,1.3506,60985000000.0,4.675,279375000000.0,-0.47,-0.334,0.50633,0.50053,0.15427,0.39988,334100000000.0,60985000000.0,9.536164,-47.0,0.0,-218390000000.0,-1.279248,0.797356,0.38171
1,VALE3.SA,Vale S.A.,Basic Materials,Other Industrial Metals & Mining,277694700000.0,339907100000.0,206414000000.0,0.27585,0.33672,3.84,0.821,85364000000.0,5.319043,5.421849,4387500.0,25978987.0,61.0,98.29,1.345329,66.0796,73.07495,4.037,0.063147,43.051,1.498688,24238000000.0,5.58,79346000000.0,-0.848,-0.131,0.39091,0.41356,0.09975,0.29588,102313000000.0,24238000000.0,8.516132,-84.8,0.0,-55108000000.0,-1.439827,0.607266,0.239088
2,ITUB4.SA,Itaú Unibanco Holding S.A.,Financial Services,Banks - Regional,248683200000.0,707787500000.0,118938000000.0,0.26661,0.35896,1.5,0.499,0.0,8.557632,6.8675,10480300.0,24528203.0,22.62,31.29,2.090864,27.277,26.66895,0.503,0.018602,18.25,1.505205,394325000000.0,40.237,827034000000.0,0.181,0.135,0.0,0.0,0.01419,0.18295,115570000000.0,394325000000.0,0.301624,18.1,277.900552,-432709000000.0,-1.911294,0.168042,0.0
3,ABEV3.SA,Ambev S.A.,Consumer Defensive,Beverages - Brewers,200723300000.0,188845900000.0,82710540000.0,0.17263,0.18666,0.76,0.604,22251540000.0,14.166667,12.75,8618800.0,24680549.0,12.33,16.88,2.426816,13.4434,14.1037,0.762,0.061254,5.403,2.3598,12430520000.0,0.789,4096231000.0,-0.157,0.051,0.49945,0.26903,0.08271,0.16919,39286760000.0,12430520000.0,6.653829,-15.7,0.0,8334286000.0,0.491492,0.437979,0.10864
4,BPAC5.SA,Banco BTG Pactual S.A.,Financial Services,Capital Markets,148326600000.0,43454110000.0,27751350000.0,0.30292,0.3496,0.35,1.291,0.0,11.260274,0.0,2600.0,6184.0,4.55,9.08,5.344842,8.365,6.8769,0.0,0.0,9.808,0.838091,221143400000.0,46.436,220720500000.0,0.181,0.021,0.0,0.0,0.0191,0.18284,26639300000.0,221143400000.0,0.12549,18.1,0.0,422953000.0,521.855886,0.638636,0.0
5,BBDC4.SA,Banco Bradesco S.A.,Financial Services,Banks - Regional,142338800000.0,542111000000.0,72893190000.0,0.20718,0.22324,0.23,0.675,0.0,10.649254,6.150862,14450200.0,33278303.0,12.39,20.56,1.952704,14.612,14.7858,0.401,0.028359,15.641,0.912346,231553400000.0,21.758,622578600000.0,-0.447,-0.27,0.0,0.0,0.00853,0.09476,84471660000.0,231553400000.0,0.314801,-44.7,0.0,-391025200000.0,-1.59217,0.134462,0.0
6,BBAS3.SA,Banco do Brasil S.A.,Financial Services,Banks - Regional,140406000000.0,957440600000.0,95328930000.0,0.3483,0.45908,0.94,0.803,0.0,3.770115,3.805105,1529200.0,8738341.0,30.25,51.99,1.472858,47.7504,44.66415,4.5,0.091743,54.955,0.895278,82065870000.0,28.757,895776500000.0,0.087,0.091,0.0,0.0,0.01734,0.22474,89250530000.0,82065870000.0,1.161615,8.7,5172.413793,-813710600000.0,-1.100854,0.099566,0.0
7,WEGE3.SA,WEG S.A.,Industrials,Specialty Industrial Machinery,138442600000.0,130366400000.0,31758310000.0,0.15827,0.20575,0.32,0.513,6480928000.0,26.829268,24.444445,8227600.0,7064393.0,31.47,42.42,4.359256,35.499,38.03615,0.67,0.02129,3.666,9.001637,5480459000.0,1.306,3346563000.0,0.499,0.137,0.31911,0.20407,0.13783,0.34605,8695487000.0,5480459000.0,5.794826,49.9,134.268537,2133896000.0,1.568288,0.243608,0.045708
8,ITSA4.SA,Itaúsa S.A.,Industrials,Conglomerates,86580980000.0,93975280000.0,7807000000.0,1.69976,0.20113,0.74,0.647,2240000000.0,6.740458,5.553459,2919900.0,22110422.0,7.65,10.09,11.090173,9.11561,8.937748,0.515,0.059127,7.923,1.114477,7606000000.0,0.784,13237000000.0,0.168,-0.117,0.34546,0.28692,0.00854,0.18054,2875000000.0,7606000000.0,1.026426,16.8,306.547619,-5631000000.0,-2.350737,0.083075,0.022441
9,ELET3.SA,Centrais Elétricas Brasileiras S.A. - Eletrobrás,Utilities,Utilities - Renewable,79292630000.0,120118400000.0,35511730000.0,0.04797,0.42603,0.22,0.629,12748600000.0,34.7,8.361445,867000.0,8876606.0,29.81,52.49,2.232858,35.4158,35.99495,1.494,0.043646,49.431,0.701989,15801230000.0,7.011,58713390000.0,0.206,0.044,0.58912,0.359,0.02306,0.01719,20697320000.0,15801230000.0,2.247404,20.6,725.242718,-42912170000.0,-1.368223,0.295639,0.092377


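To promote effective portfolio diversification, we refine the dataset to include only one ticker per company. The cell above takes the 18 rows with the largest `market_cap` (the remaining columns in the list only break ties) and drops duplicate `long_name` entries, which leaves the 10 companies shown.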
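The selection step can be illustrated on a small frame. This is a sketch using four rows copied from the tables above; `df_example` and `top` are names introduced here for illustration only. It shows two properties of `nlargest` followed by `drop_duplicates`: the first column in the list drives the ranking, and `n` has to be larger than the number of companies wanted because some rows are dropped afterwards.

```python
import pandas as pd

# Four rows taken from the fundamentals table; Alpargatas has two tickers.
df_example = pd.DataFrame(
    {
        "ticker": ["ALPA3.SA", "ALPA4.SA", "ABCB4.SA", "RAIL3.SA"],
        "long_name": [
            "Alpargatas S.A.",
            "Alpargatas S.A.",
            "Banco ABC Brasil S.A.",
            "Rumo S.A.",
        ],
        "market_cap": [
            5_309_793_000.0,
            5_350_758_000.0,
            4_265_434_000.0,
            42_288_820_000.0,
        ],
        "dividend_rate": [0.40, 0.43, 1.56, 0.07],
    }
)

top = (
    df_example
    # Rank by market_cap; dividend_rate is only consulted for ties.
    .nlargest(3, ["market_cap", "dividend_rate"])
    # Keep the first (largest) ticker of each company.
    .drop_duplicates(subset="long_name", keep="first")
    .reset_index(drop=True)
)
print(top[["ticker", "market_cap"]])
```

Asking for 3 rows returns only 2 companies here, because both Alpargatas tickers fall in the top 3 and one is dropped. Banco ABC Brasil is left out despite having the highest `dividend_rate`, since its `market_cap` ranks fourth.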

### Companies Analysis

#### Outlier Analysis

In [48]:
df_top_companies_cols_num = df_top_companies.select_dtypes(include='number')

top_companies_cols = df_top_companies_cols_num.columns
num_columns = len(top_companies_cols)
num_rows = (num_columns + 2) // 3

subplot_titles = [str(col) for col in top_companies_cols]

fig = sp.make_subplots(rows=num_rows, cols=3, subplot_titles=subplot_titles)

for i, column in enumerate(top_companies_cols, start=1):
    row = (i - 1) // 3 + 1
    col = (i - 1) % 3 + 1

    trace = go.Box(y=df_top_companies[column], name=column, 
                   marker_color='lightseagreen', boxpoints='outliers', 
                   jitter=0.7, hoverinfo='y+text', 
                   text=(df_top_companies['ticker']+ ' - ' + df_top_companies['long_name']))

    fig.add_trace(trace, row=row, col=col)

fig.update_layout(title_text='Boxplot of Numerical Variables', height=300*num_rows, showlegend=False, template='plotly_dark')
fig.show()

**Petróleo Brasileiro S.A. - Petrobras (PETR3.SA)**

- **`market_cap`**: Petrobras has an atypical market capitalization of **R$ 483,211,900,000**, reflecting its significant size in the energy sector.
- **`total_revenue`**: **R$ 581.563 billion**. This metric is the total revenue generated by Petrobras's operations.
- **`dividend_rate`**: Petrobras's dividend rate of **9.28** is an annual amount per share in R$ rather than a percentage. It is high relative to the other companies, which may appeal to investors seeking significant dividends.
- **`trailing_annual_dividend_rate`**: The unusual value of **9.657** for Petrobras can be considered an outlier, indicating a dividend rate significantly higher than that of the other companies.
- **`trailing_annual_dividend_yield`**: The value of **0.2473** (about 24.7% of the stock price, since this column is a fraction) is the highest among the ten selected companies, although it may still be within a reasonable range when compared with other companies in the same sector.
- **`gross_profits`**: Gross profit is **R$ 334.1 billion**, that is, revenue minus the cost of goods sold, before operating expenses, taxes, and other charges are deducted.
- **`roce`**: Return on Capital Employed is **0.38171**, indicating the company's efficiency in generating a return on invested capital. The higher the value, the more efficiently the company uses its financial resources.
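The relationship between the two trailing dividend columns can be checked with a little arithmetic. The sketch below assumes the usual definition, yield = annual dividend per share / share price; the notebook does not state how the columns were computed, so this is only a plausibility check on the PETR3.SA row.

```python
# Values copied from the PETR3.SA row of the table above.
trailing_annual_dividend_rate = 9.657       # R$ per share
trailing_annual_dividend_yield = 0.247298   # fraction of the share price
fifty_two_week_low = 23.61
fifty_two_week_high = 41.86

# Assumed definition: yield = rate / price, so price = rate / yield.
implied_price = trailing_annual_dividend_rate / trailing_annual_dividend_yield

# A yield stored as a fraction is converted to a percentage by multiplying by 100.
yield_percent = trailing_annual_dividend_yield * 100

print(f"implied share price: R$ {implied_price:.2f}")
print(f"dividend yield: {yield_percent:.1f}%")
print(fifty_two_week_low <= implied_price <= fifty_two_week_high)
```

The implied price falls inside the 52-week range of the stock, which supports reading the rate as an amount in R$ per share and the yield as a fraction.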

**Itaúsa S.A. (ITSA4.SA)**

- **`profit_margins`**: Itaúsa's profit margin is **1.69976**. Like the other values in this column, it is a fraction of total revenue, so it corresponds to roughly 170%: the reported profit exceeds the reported revenue. This is far above the margins of the other companies, which is why it stands out as an outlier.
- **`price_to_sales_trailing_12_months`**: The value of **11.09** for Itaúsa indicates that the market values the company at about 11.09x its sales revenue over the past 12 months, which may reflect optimism regarding the company's financial performance.
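Both Itaúsa figures can be reproduced from other columns of the same row. The sketch below assumes the standard definitions (price-to-sales = market capitalization / total revenue, profit margin = net income / total revenue); the notebook does not document the formulas, so treat this as a consistency check.

```python
# Values copied from the ITSA4.SA row of the table above.
market_cap = 86_580_980_000.0
total_revenue = 7_807_000_000.0
profit_margins = 1.69976                  # fraction of revenue
reported_price_to_sales = 11.090173

# Assumed definition: price-to-sales = market cap / trailing revenue.
price_to_sales = market_cap / total_revenue

# Assumed definition: profit margin = net income / revenue,
# so the implied net income is margin * revenue.
implied_net_income = profit_margins * total_revenue

print(f"price to sales: {price_to_sales:.2f}x")
print(f"implied net income: R$ {implied_net_income / 1e9:.2f} billion")
print(implied_net_income > total_revenue)
```

A margin above 1 means the implied net income is larger than the revenue itself, which explains why this value sits far from the rest of the column in the boxplot.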

**Banco BTG Pactual S.A. (BPAC5.SA)**

- **`beta`**: The beta value of **1.291** for Banco BTG Pactual S.A. indicates that its shares are relatively volatile in relation to the market.
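The points a boxplot draws as outliers can also be listed numerically. The sketch below applies the common 1.5 × IQR rule to two columns of the top-10 table; `iqr_outliers` is a helper introduced here for illustration, and it relies on the default pandas quantile interpolation, so its fences may differ slightly from the ones Plotly computes.

```python
import pandas as pd

def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return series[(series < lower) | (series > upper)]

# beta and market_cap copied from the top-10 table above, indexed by ticker.
tickers = [
    "PETR3.SA", "VALE3.SA", "ITUB4.SA", "ABEV3.SA", "BPAC5.SA",
    "BBDC4.SA", "BBAS3.SA", "WEGE3.SA", "ITSA4.SA", "ELET3.SA",
]
df_example = pd.DataFrame(
    {
        "beta": [1.097, 0.821, 0.499, 0.604, 1.291,
                 0.675, 0.803, 0.513, 0.647, 0.629],
        "market_cap": [483_211_900_000.0, 277_694_700_000.0, 248_683_200_000.0,
                       200_723_300_000.0, 148_326_600_000.0, 142_338_800_000.0,
                       140_406_000_000.0, 138_442_600_000.0, 86_580_980_000.0,
                       79_292_630_000.0],
    },
    index=tickers,
)

beta_outliers = iqr_outliers(df_example["beta"])
market_cap_outliers = iqr_outliers(df_example["market_cap"])
print(beta_outliers.index.tolist())
print(market_cap_outliers.index.tolist())
```

Under this rule, only BPAC5.SA is flagged for `beta` and only PETR3.SA for `market_cap`, matching the observations above.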

In [41]:
# Grouped bars comparing profit and operating margins for each selected company
fig = px.bar(df_top_companies, x='long_name', y=['profit_margins', 'operating_margins'],
             title='Profit Margins of the Top 10 Companies by Market Cap',
             labels={'value': 'Margin', 'variable': 'Margin Type'},
             barmode='group')

fig.update_layout(xaxis={'categoryorder':'total descending'}, xaxis_title='Company', yaxis_title='Margin')
fig.show()
