# *Fundamental Results - Financial and Company Data Extraction*
*This script extracts financial data from a table on the results page of the "[Fundamentus](https://www.fundamentus.com.br)" website, and ticker and company information from another table on the details page of the same site.*

## *Installing the Required Libraries*

In [1]:
%pip install selenium
%pip install webdriver-manager
%pip install pandas
%pip install lxml

Note: you may need to restart the kernel to use updated packages.


## *Importing the Required Libraries*

In [18]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from io import StringIO
from selenium.common.exceptions import StaleElementReferenceException
import pandas as pd


## *Configuring and Initializing the Selenium WebDriver*

In [3]:
# Configure and initialize the Selenium WebDriver in headless mode
options = webdriver.ChromeOptions()
options.add_argument('--headless')
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)

## *URLs of the Pages to Access*

In [4]:
url_pagina1 = 'https://www.fundamentus.com.br/resultado.php'
url_pagina2 = 'https://www.fundamentus.com.br/detalhes.php'

## *Locating and Extracting the Tables*

In [5]:
# Open page 1 (results)
driver.get(url_pagina1)

# Locate the first table (financial data)
local_tabela = '/html/body/div[1]/div[2]/table'
elemento = driver.find_element("xpath", local_tabela)

# Get the table's HTML
html_tabela = elemento.get_attribute('outerHTML')

# Wrap the HTML in StringIO so pandas can read it as a file-like object
html_string_io = StringIO(html_tabela)
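
To make concrete what the `outerHTML` string holds before pandas parses it, the sketch below walks a small hand-written table with Python's standard `html.parser` module. The `TableRows` class, the `table_rows` helper, and `sample_html` are hypothetical names introduced for illustration, and the sample markup is an assumption rather than the actual Fundamentus page structure. It is meant as a sanity-check aid, not as a replacement for `pd.read_html`.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being built, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags (such as links) still belongs to the cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_rows(html):
    """Return the table in `html` as a list of rows of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical markup, shaped like a tiny results table.
sample_html = (
    "<table><thead><tr><th>Papel</th><th>Cotação</th></tr></thead>"
    "<tbody><tr><td><a href='#'>CSNA3</a></td><td>13,10</td></tr></tbody></table>"
)
rows = table_rows(sample_html)
print(rows)
```

In the notebook, calling `table_rows(html_tabela)` and checking the length of the result would be one quick way to confirm that the XPath matched the intended table before converting it.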

In [6]:
# Open page 2 (company details)
driver.get(url_pagina2)

# Set an implicit wait: element lookups poll for up to 10 seconds before failing
driver.implicitly_wait(10)

# XPath of the second table (tickers and company names)
local_tabela_2 = '/html/body/div[1]/div[2]/div[1]/div/table'

# Locate the table and read its HTML, retrying once if the reference goes stale
try:
    elemento_2 = driver.find_element("xpath", local_tabela_2)
    # Get the table's HTML
    html_tabela_2 = elemento_2.get_attribute('outerHTML')
except StaleElementReferenceException:
    # The element was replaced in the DOM: locate it again
    elemento_2 = driver.find_element("xpath", local_tabela_2)
    # Get the table's HTML
    html_tabela_2 = elemento_2.get_attribute('outerHTML')

# Wrap the HTML in StringIO so pandas can read it as a file-like object
html_string_io_2 = StringIO(html_tabela_2)
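
The cell above retries the lookup exactly once by duplicating the code in the `except` branch. A minimal sketch of a more general approach is a small retry helper. The names `retry` and `FlakyLookup` are hypothetical and not part of Selenium; `LookupError` stands in for `StaleElementReferenceException` so that the sketch runs without a browser.

```python
import time


def retry(action, retry_on, attempts=3, delay=0.5):
    """Call action() until it succeeds or `attempts` calls have failed.

    `retry_on` is an exception class (or tuple of classes) that triggers
    another attempt; any other exception propagates immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)


class FlakyLookup:
    """Stand-in for an element lookup that fails a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise LookupError("stale element (simulated)")
        return "<table></table>"


lookup = FlakyLookup(failures=1)
html = retry(lookup, LookupError, attempts=3, delay=0)
print(html, lookup.calls)
```

In the notebook, the same helper could wrap the real lookup, for example `retry(lambda: driver.find_element("xpath", local_tabela_2).get_attribute('outerHTML'), StaleElementReferenceException)`.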

## Converting to Pandas DataFrames

In [7]:
# The site uses Brazilian number formatting: ',' for decimals and '.' for thousands
df_financas = pd.read_html(html_string_io, decimal=',', thousands='.')[0]

df_empresas = pd.read_html(html_string_io_2)[0]

# Keep the ticker (Papel) and company name (Nome); drop the legal-name column (Razão Social)
df_empresas.columns = ['Papel', 'Nome', 'Razão Social']
df_empresas.drop(columns=['Razão Social'], inplace=True)
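
The `decimal=','` and `thousands='.'` arguments convert plain numeric columns, but the displayed output shows percentage columns such as `Mrg Ebit` and `ROE` still holding strings like `"-208,15%"`. The sketch below is one possible way to turn such strings into fractions. The helper names `parse_br_number` and `parse_br_percent` are hypothetical, and the sketch assumes every value follows the Brazilian format seen in the output.

```python
def parse_br_number(text):
    """Convert a Brazilian-formatted number such as '1.234,56' to a float."""
    # Drop thousands separators first, then switch the decimal comma to a point.
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return float(cleaned)


def parse_br_percent(text):
    """Convert a percentage string such as '-208,15%' to a fraction (-2.0815)."""
    stripped = text.strip()
    if not stripped.endswith("%"):
        raise ValueError(f"not a percentage: {text!r}")
    return parse_br_number(stripped[:-1]) / 100


for raw in ["0,00%", "-208,15%", "11,14%"]:
    print(raw, parse_br_percent(raw))
```

Assuming a column holds only strings of this shape, it could be converted in the notebook with `df_financas['ROE'].map(parse_br_percent)`.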


## Displaying the DataFrames

In [8]:
# Table with the financial data
print("Table with the financial data:")
display(df_financas)

Table with the financial data:


| | Papel | Cotação | P/L | P/VP | PSR | Div.Yield | P/Ativo | P/Cap.Giro | P/EBIT | P/Ativ Circ.Liq | ... | EV/EBITDA | Mrg Ebit | Mrg. Líq. | Liq. Corr. | ROIC | ROE | Liq.2meses | Patrim. Líq | Dív.Brut/ Patrim. | Cresc. Rec.5a |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CLAN3 | 0.00 | 0.00 | 0.00 | 0.000 | 0,00% | 0.000 | 0.00 | 0.00 | 0.00 | ... | 0.00 | 0,00% | 0,00% | 0.00 | 0,00% | -1,05% | 0.0 | 1.012240e+09 | 0.00 | -63,96% |
| 1 | PORP4 | 2.40 | 0.00 | 0.00 | 0.000 | 0,00% | 0.000 | 0.00 | 0.00 | 0.00 | ... | 0.00 | 0,00% | 0,00% | 0.00 | 0,00% | -2,08% | 0.0 | 2.239900e+07 | 0.00 | 13,66% |
| 2 | MNSA4 | 0.47 | 0.00 | 0.00 | 0.000 | 0,00% | 0.000 | 0.00 | 0.00 | 0.00 | ... | 0.00 | -208,15% | -362,66% | 3.63 | -13,50% | 145,70% | 0.0 | -9.105000e+06 | -6.52 | -41,11% |
| 3 | MNSA3 | 0.42 | 0.00 | 0.00 | 0.000 | 0,00% | 0.000 | 0.00 | 0.00 | 0.00 | ... | 0.00 | -208,15% | -362,66% | 3.63 | -13,50% | 145,70% | 0.0 | -9.105000e+06 | -6.52 | -41,11% |
| 4 | CFLU4 | 1000.00 | 0.00 | 0.00 | 0.000 | 0,00% | 0.000 | 0.00 | 0.00 | 0.00 | ... | 0.00 | 8,88% | 10,72% | 1.10 | 17,68% | 32,15% | 0.0 | 6.035100e+07 | 0.06 | 8,14% |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 981 | PDTC3 | 2.00 | 819.28 | 1.14 | 0.457 | 2,30% | 0.302 | 0.79 | 9.80 | -14.22 | ... | 6.63 | 4,66% | 0,06% | 2.17 | 3,82% | 0,14% | 148720.0 | 1.390150e+08 | 1.07 | -6,51% |
| 982 | CSNA3 | 13.10 | 939.58 | 1.00 | 0.396 | 11,14% | 0.189 | 2.36 | 2.88 | -0.44 | ... | 4.98 | 13,77% | 1,70% | 1.29 | 8,82% | 0,11% | 106609000.0 | 1.729740e+10 | 2.68 | 14,14% |
| 983 | UBBR11 | 14.75 | 1201.81 | 3.91 | 0.000 | 0,00% | 0.000 | 0.00 | 0.00 | 0.00 | ... | 0.00 | 0,00% | 0,00% | 0.00 | 0,00% | 0,33% | 0.0 | 1.031720e+10 | 0.00 | 10,58% |
| 984 | UBBR3 | 18.00 | 1466.61 | 4.77 | 0.000 | 0,00% | 0.000 | 0.00 | 0.00 | 0.00 | ... | 0.00 | 0,00% | 0,00% | 0.00 | 0,00% | 0,33% | 0.0 | 1.031720e+10 | 0.00 | 10,58% |


In [9]:
print("\nTable with the tickers and their names:")
display(df_empresas)


Table with the tickers and their names:


| | Papel | Nome |
|---|---|---|
| 0 | AALR3 | ALLIAR |
| 1 | ABCB3 | ABC Brasil |
| 2 | ABCB4 | ABC Brasil |
| 3 | ABEV3 | AMBEV S/A |
| 4 | ABRE3 | SOMOS EDUCA |
| ... | ... | ... |
| 1028 | WMBY3 | WEMBLEY SOCIEDADE ANÔNIMA |
| 1029 | WSON33 | Wilson Sons |
| 1030 | YBRA4 | OPPORTUNITY ALEF S/A |
| 1031 | YDUQ3 | YDUQS PART |


## Closing the Selenium Driver

In [10]:
# Close the Selenium WebDriver once the data has been extracted and displayed
driver.quit()
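
Because `driver.quit()` sits in the last cell, an exception in any earlier cell would leave the browser process running. When the steps are moved into a single script, one option is a context manager that always calls `quit()`. The sketch below assumes only that the driver object exposes a `quit()` method; the names `quitting` and `FakeDriver` are hypothetical, and `FakeDriver` replaces the real browser so the example runs on its own.

```python
from contextlib import contextmanager


@contextmanager
def quitting(make_driver):
    """Create a driver with make_driver() and always call quit() on exit."""
    driver = make_driver()
    try:
        yield driver
    finally:
        # Runs whether the body finished normally or raised.
        driver.quit()


class FakeDriver:
    """Stand-in for a WebDriver; records whether quit() was called."""

    def __init__(self):
        self.quit_called = False

    def quit(self):
        self.quit_called = True


fake = FakeDriver()
try:
    with quitting(lambda: fake) as driver:
        raise RuntimeError("simulated scraping failure")
except RuntimeError:
    pass

print(fake.quit_called)
```

With the real browser, the factory would be something like `lambda: webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)`, reusing the objects defined earlier in the notebook.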
