# Web Scraping Script
This is the code used to scrape the InfoMoney website. It uses the requests library together with beautifulsoup4. After reading the structure of the sitemap 'https://www.infomoney.com.br/news-sitemap.xml', which contains the news from the last three days, I built a scraper to collect the title, date, and URL of each article and save them to a CSV file. I then built a second scraper that, using the URLs saved in this CSV, downloaded the content of each article and stored it in a JSON file. Finally, I applied a cleanup step to remove advertising-related text from the site, keeping only the relevant content.
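
The cleanup step described above could look roughly like the following sketch. It is not the code actually used for this project: the function name `remove_ad_paragraphs` and the marker phrases in `AD_PATTERNS` are hypothetical, and the real phrases would need to be taken from the scraped text itself.

```python
import re

# Hypothetical advertising markers (assumption): replace these with the
# phrases that actually appear in the scraped articles.
AD_PATTERNS = [
    re.compile(r"continua depois da publicidade", re.IGNORECASE),
    re.compile(r"publicidade", re.IGNORECASE),
]


def remove_ad_paragraphs(paragraphs):
    """Return the paragraphs with empty and ad-only entries removed."""
    cleaned = []
    for text in paragraphs:
        normalized = text.strip()
        if not normalized:
            continue  # skip empty paragraphs
        # Drop a paragraph only when it consists entirely of an ad marker
        if any(pattern.fullmatch(normalized) for pattern in AD_PATTERNS):
            continue
        cleaned.append(normalized)
    return cleaned


sample = [
    "A empresa divulgou seus resultados.",
    "Continua depois da publicidade",
    "   ",
    "O lucro subiu no trimestre.",
]
print(" ".join(remove_ad_paragraphs(sample)))
```

Filtering at the paragraph level, before the paragraphs are joined into a single string, keeps the ad markers from ending up embedded in the middle of the article text.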

## Installing Dependencies
Install the required libraries: requests and beautifulsoup4 for scraping, and label-studio for labeling.

In [None]:
!pip install requests
!pip install beautifulsoup4
# lxml is needed by BeautifulSoup's 'xml' parser
!pip install lxml
# Install Label Studio - this can be done from the Anaconda PowerShell Prompt
!conda create --name label-studio
!conda activate label-studio
!conda install psycopg2
!pip install label-studio

## Import Libraries

In [1]:
import requests
from bs4 import BeautifulSoup
import csv
import json

## Sitemap Scraper

In [10]:
# Download the news sitemap and parse it as XML
response = requests.get("https://www.infomoney.com.br/news-sitemap.xml")
response.raise_for_status()
soup = BeautifulSoup(response.content, 'xml')

# Each article is described by one <url> element in the sitemap
url_elements = soup.find_all('url')

data = []

for url_element in url_elements:
    loc = url_element.find('loc').text
    title = url_element.find('news:title').text
    publication_date = url_element.find('news:publication_date').text
    data.append([title, loc, publication_date])

with open('news_data.csv', mode='w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(["Title", "URL", "Publication Date"])
    writer.writerows(data)

print("Data saved to news_data.csv")


2024-07-29 21:38:11 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443
2024-07-29 21:38:11 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /news-sitemap.xml HTTP/1.1" 200 None


Data saved to news_data.csv
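
To make the structure the sitemap scraper relies on more concrete, here is a minimal offline sketch that parses a small made-up sample with the standard library's `xml.etree.ElementTree` instead of BeautifulSoup. The sample entries, the function name `parse_news_sitemap`, and the namespace URIs (the standard sitemap and Google News sitemap namespaces) are assumptions; the real InfoMoney sitemap may differ in detail.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the usual news-sitemap layout (not real InfoMoney data)
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://example.com/markets/first-story/</loc>
    <news:news>
      <news:publication_date>2024-07-29T20:00:00-03:00</news:publication_date>
      <news:title>First story</news:title>
    </news:news>
  </url>
  <url>
    <loc>https://example.com/economy/second-story/</loc>
    <news:news>
      <news:publication_date>2024-07-29T21:30:00-03:00</news:publication_date>
      <news:title>Second story</news:title>
    </news:news>
  </url>
</urlset>"""

# Prefix-to-URI mapping used to resolve the prefixes in the search paths below
NAMESPACES = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "news": "http://www.google.com/schemas/sitemap-news/0.9",
}


def parse_news_sitemap(xml_text):
    """Return one [title, url, publication_date] row per <url> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for url_el in root.findall("sm:url", NAMESPACES):
        loc = url_el.findtext("sm:loc", default="", namespaces=NAMESPACES)
        title = url_el.findtext(
            "news:news/news:title", default="", namespaces=NAMESPACES
        )
        date = url_el.findtext(
            "news:news/news:publication_date", default="", namespaces=NAMESPACES
        )
        rows.append([title.strip(), loc.strip(), date.strip()])
    return rows


for row in parse_news_sitemap(SAMPLE_SITEMAP):
    print(row)
```

Using `findtext` with a default value means an entry with a missing field yields an empty string instead of raising an `AttributeError`, which is what `.find(...).text` does when the element is absent.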


## News Article Scraper

In [12]:
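def scrape_content(url):
    """Download an article page and return its paragraph text as one string."""
    response = requests.get(url)
    response.raise_for_status()
    soup = BeautifulSoup(response.content, 'html.parser')

    # The article body is expected inside <article class="im-article">
    article = soup.find('article', class_='im-article')
    if article:
        paragraphs = article.find_all('p')
        content = ' '.join([p.get_text() for p in paragraphs])
        return content
    # No matching article element: return an empty string
    return ""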

data = []
with open('news_data.csv', mode='r', encoding='utf-8') as file:
    reader = csv.DictReader(file)
    for row in reader:
        title = row['Title']
        url = row['URL']
        publication_date = row['Publication Date']
        
        print(f"Scraping content from: {url}")
        content = scrape_content(url)
        
        data.append({
            'title': title,
            'url': url,
            'date': publication_date,
            'content': content
        })

with open('news_data.json', mode='w', encoding='utf-8') as file:
    json.dump(data, file, ensure_ascii=False, indent=4)

print("Data saved to news_data.json")


2024-07-29 23:05:39 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/petrobras-petr4-producao-de-petroleo-gas-segundo-trimestre-2024/


2024-07-29 23:05:39 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/petrobras-petr4-producao-de-petroleo-gas-segundo-trimestre-2024/ HTTP/1.1" 200 None
2024-07-29 23:05:40 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/consumo/lotofacil-premio-de-r-4-milhoes-e-sorteado-nesta-segunda-veja-dezenas/


2024-07-29 23:05:40 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /consumo/lotofacil-premio-de-r-4-milhoes-e-sorteado-nesta-segunda-veja-dezenas/ HTTP/1.1" 200 None
2024-07-29 23:05:40 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/consumo/quina-sorteia-premio-de-r-48-milhoes-nesta-segunda-veja-numeros/


2024-07-29 23:05:41 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /consumo/quina-sorteia-premio-de-r-48-milhoes-nesta-segunda-veja-numeros/ HTTP/1.1" 200 None
2024-07-29 23:05:41 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/consumo/olimpiadas-confira-a-programacao-desta-terca-feira/


2024-07-29 23:05:41 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /consumo/olimpiadas-confira-a-programacao-desta-terca-feira/ HTTP/1.1" 200 None
2024-07-29 23:05:41 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/lider-da-oposicao-diz-que-gonzalez-obteve-70-dos-votos-na-venezuela/


2024-07-29 23:05:42 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/lider-da-oposicao-diz-que-gonzalez-obteve-70-dos-votos-na-venezuela/ HTTP/1.1" 200 None
2024-07-29 23:05:42 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/economia/governo-adia-inicio-da-vigencia-de-regra-sobre-trabalho-do-comercio-em-feriados/


2024-07-29 23:05:42 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /economia/governo-adia-inicio-da-vigencia-de-regra-sobre-trabalho-do-comercio-em-feriados/ HTTP/1.1" 200 None
2024-07-29 23:05:42 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/protestos-eclodem-em-caracas-apos-maduro-reivindicar-vitoria-em-reeleicao/


2024-07-29 23:05:43 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/protestos-eclodem-em-caracas-apos-maduro-reivindicar-vitoria-em-reeleicao/ HTTP/1.1" 200 None
2024-07-29 23:05:43 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/ccr-ccro3-resultados-segundo-trimestre-2024/


2024-07-29 23:05:43 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/ccr-ccro3-resultados-segundo-trimestre-2024/ HTTP/1.1" 200 None
2024-07-29 23:05:44 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/brasil-mexico-e-colombia-negociam-declaracao-conjunta-sobre-eleicoes-na-venezuela/


2024-07-29 23:05:45 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/brasil-mexico-e-colombia-negociam-declaracao-conjunta-sobre-eleicoes-na-venezuela/ HTTP/1.1" 200 None
2024-07-29 23:05:45 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/minhas-financas/bc-comunica-vazamento-de-dados-de-chaves-pix-de-cooperativas-filiadas-a-unicred/


2024-07-29 23:05:45 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /minhas-financas/bc-comunica-vazamento-de-dados-de-chaves-pix-de-cooperativas-filiadas-a-unicred/ HTTP/1.1" 200 None
2024-07-29 23:05:45 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/venezuela-expulsa-corpo-diplomatico-de-sete-paises-apos-eleicoes-presidenciais/


2024-07-29 23:05:46 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/venezuela-expulsa-corpo-diplomatico-de-sete-paises-apos-eleicoes-presidenciais/ HTTP/1.1" 200 None
2024-07-29 23:05:46 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/telefonica-dona-da-vivo-vivt3-registra-lucro-de-r-12-bilhao-no-segundo-tri/


2024-07-29 23:05:47 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/telefonica-dona-da-vivo-vivt3-registra-lucro-de-r-12-bilhao-no-segundo-tri/ HTTP/1.1" 200 None
2024-07-29 23:05:47 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/minhas-financas/ministro-afirma-que-nao-ha-previsao-de-reavaliacao-do-limite-de-juros-para-consignado/


2024-07-29 23:05:47 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /minhas-financas/ministro-afirma-que-nao-ha-previsao-de-reavaliacao-do-limite-de-juros-para-consignado/ HTTP/1.1" 200 None
2024-07-29 23:05:47 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/onde-investir/tesouro-direto-bate-recorde-de-aplicacoes-em-junho-veja-titulos-mais-buscados/


2024-07-29 23:05:48 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /onde-investir/tesouro-direto-bate-recorde-de-aplicacoes-em-junho-veja-titulos-mais-buscados/ HTTP/1.1" 200 None
2024-07-29 23:05:48 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/sp-500-tem-pequena-alta-antes-de-balancos-importantes-dados-dos-eua-e-fed/


2024-07-29 23:05:48 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/sp-500-tem-pequena-alta-antes-de-balancos-importantes-dados-dos-eua-e-fed/ HTTP/1.1" 200 None
2024-07-29 23:05:48 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/dolar-hoje-abertura-fechamento-comercial-turismo-29072024/


2024-07-29 23:05:49 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/dolar-hoje-abertura-fechamento-comercial-turismo-29072024/ HTTP/1.1" 200 None
2024-07-29 23:05:49 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/usiminas-bofa-rebaixa-usim5-a-venda-apos-resultados-fracos-no-2o-tri-e-acao-cai-5/


2024-07-29 23:05:50 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/usiminas-bofa-rebaixa-usim5-a-venda-apos-resultados-fracos-no-2o-tri-e-acao-cai-5/ HTTP/1.1" 200 None
2024-07-29 23:05:50 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/politica/governo-quer-terminar-ano-com-800-mil-verificacoes-de-beneficios-temporarios-do-inss/


2024-07-29 23:05:50 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /politica/governo-quer-terminar-ano-com-800-mil-verificacoes-de-beneficios-temporarios-do-inss/ HTTP/1.1" 200 None
2024-07-29 23:05:50 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/petrobras-acoes-caem-cerca-de-3-com-petroleo-em-baixa-e-possiveis-compras-no-radar/


2024-07-29 23:05:51 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/petrobras-acoes-caem-cerca-de-3-com-petroleo-em-baixa-e-possiveis-compras-no-radar/ HTTP/1.1" 200 None
2024-07-29 23:05:51 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/ibovespa-hoje-bolsa-de-valores-ao-vivo-29072024/


2024-07-29 23:05:51 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/ibovespa-hoje-bolsa-de-valores-ao-vivo-29072024/ HTTP/1.1" 200 None
2024-07-29 23:05:52 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/retrato-perdido-de-henrique-viii-encontrado-na-inglaterra-apos-postagem-no-x/


2024-07-29 23:05:52 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/retrato-perdido-de-henrique-viii-encontrado-na-inglaterra-apos-postagem-no-x/ HTTP/1.1" 200 None
2024-07-29 23:05:53 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/taxas-futuras-de-juros-sobem-a-espera-de-decisoes-de-copom-fed-e-banco-do-japao/


2024-07-29 23:05:53 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/taxas-futuras-de-juros-sobem-a-espera-de-decisoes-de-copom-fed-e-banco-do-japao/ HTTP/1.1" 200 None
2024-07-29 23:05:53 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/grupo-sbf-sbfg3-resultados-segundo-trimestre-2024-analise-desempenho-acoes/


2024-07-29 23:05:54 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/grupo-sbf-sbfg3-resultados-segundo-trimestre-2024-analise-desempenho-acoes/ HTTP/1.1" 200 None
2024-07-29 23:05:54 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/economia/exportacao-de-carne-bovina-do-brasil-marca-recorde-antes-do-mes-acabar/


2024-07-29 23:05:54 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /economia/exportacao-de-carne-bovina-do-brasil-marca-recorde-antes-do-mes-acabar/ HTTP/1.1" 200 None
2024-07-29 23:05:54 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/petroleo-fecha-em-queda-diante-de-alta-do-dolar-e-incertezas-sobre-demanda/


2024-07-29 23:05:55 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/petroleo-fecha-em-queda-diante-de-alta-do-dolar-e-incertezas-sobre-demanda/ HTTP/1.1" 200 None
2024-07-29 23:05:55 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/politica/isencoes-da-cesta-basica-nao-devem-reduzir-os-precos-com-reforma-dizem-especialistas/


2024-07-29 23:05:55 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /politica/isencoes-da-cesta-basica-nao-devem-reduzir-os-precos-com-reforma-dizem-especialistas/ HTTP/1.1" 200 None
2024-07-29 23:05:55 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/acao-da-heineken-desaba-apos-balanco-frustrar-qual-a-leitura-para-a-ambev-no-brasil/


2024-07-29 23:05:56 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/acao-da-heineken-desaba-apos-balanco-frustrar-qual-a-leitura-para-a-ambev-no-brasil/ HTTP/1.1" 200 None
2024-07-29 23:05:56 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/consumo/encomenda-retida-correios-alertam-para-golpe-em-compras-veja-como-se-proteger/


2024-07-29 23:05:56 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /consumo/encomenda-retida-correios-alertam-para-golpe-em-compras-veja-como-se-proteger/ HTTP/1.1" 200 None
2024-07-29 23:05:57 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/tim-tims3-e-vivo-vivt3-devem-trazer-bons-dados-no-2t-e-manter-tendencia-positiva/


2024-07-29 23:05:57 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/tim-tims3-e-vivo-vivt3-devem-trazer-bons-dados-no-2t-e-manter-tendencia-positiva/ HTTP/1.1" 200 None
2024-07-29 23:05:57 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/eua-acusam-venezuela-de-manipulacao-eleitoral-e-repressao/


2024-07-29 23:05:58 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/eua-acusam-venezuela-de-manipulacao-eleitoral-e-repressao/ HTTP/1.1" 200 None
2024-07-29 23:05:58 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/minhas-financas/moro-em-um-imovel-ha-cinco-anos-posso-provar-usucapiao/


2024-07-29 23:05:58 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /minhas-financas/moro-em-um-imovel-ha-cinco-anos-posso-provar-usucapiao/ HTTP/1.1" 200 None
2024-07-29 23:05:59 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/onde-investir/mega-sena-acumulada-sorteia-r-100-milhoes-nesta-terca-quanto-rendem-na-poupanca/


2024-07-29 23:05:59 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /onde-investir/mega-sena-acumulada-sorteia-r-100-milhoes-nesta-terca-quanto-rendem-na-poupanca/ HTTP/1.1" 200 None
2024-07-29 23:06:00 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/netanyahu-diz-que-hamas-esta-impedindo-acordo-e-que-israel-nao-mudou-condicoes/


2024-07-29 23:06:00 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/netanyahu-diz-que-hamas-esta-impedindo-acordo-e-que-israel-nao-mudou-condicoes/ HTTP/1.1" 200 None
2024-07-29 23:06:00 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/conselho-eleitoral-proclama-nicolas-maduro-como-presidente-da-venezuela/


2024-07-29 23:06:01 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/conselho-eleitoral-proclama-nicolas-maduro-como-presidente-da-venezuela/ HTTP/1.1" 200 None
2024-07-29 23:06:01 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/politica/discussao-e-sobre-o-quao-proximo-estamos-de-cumprir-meta-fiscal-diz-ceron/


2024-07-29 23:06:01 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /politica/discussao-e-sobre-o-quao-proximo-estamos-de-cumprir-meta-fiscal-diz-ceron/ HTTP/1.1" 200 None
2024-07-29 23:06:01 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/biden-propoe-limites-de-mandato-para-juizes-da-suprema-corte-dos-eua/


2024-07-29 23:06:02 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/biden-propoe-limites-de-mandato-para-juizes-da-suprema-corte-dos-eua/ HTTP/1.1" 200 None
2024-07-29 23:06:02 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/procurador-geral-acusa-lider-da-oposicao-de-sabotagem-eleitoral-na-venezuela/


2024-07-29 23:06:02 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/procurador-geral-acusa-lider-da-oposicao-de-sabotagem-eleitoral-na-venezuela/ HTTP/1.1" 200 None
2024-07-29 23:06:02 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/acoes-europeias-fecham-em-baixa-antes-de-decisao-do-fed-e-de-balancos/


2024-07-29 23:06:03 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/acoes-europeias-fecham-em-baixa-antes-de-decisao-do-fed-e-de-balancos/ HTTP/1.1" 200 None
2024-07-29 23:06:03 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/carreira/trf-5-divulga-edital-de-concurso-publico-vagas-tem-salario-inicial-de-r-8-500/


2024-07-29 23:06:03 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /carreira/trf-5-divulga-edital-de-concurso-publico-vagas-tem-salario-inicial-de-r-8-500/ HTTP/1.1" 200 None
2024-07-29 23:06:03 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/economia/desesperada-por-dolares-cuba-proibe-empresas-de-usarem-bancos-dos-eua/


2024-07-29 23:06:04 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /economia/desesperada-por-dolares-cuba-proibe-empresas-de-usarem-bancos-dos-eua/ HTTP/1.1" 200 None
2024-07-29 23:06:04 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/business/apple-vai-adiar-implementacao-de-recursos-de-ia-em-novo-sistema-operacional/


2024-07-29 23:06:04 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /business/apple-vai-adiar-implementacao-de-recursos-de-ia-em-novo-sistema-operacional/ HTTP/1.1" 200 None
2024-07-29 23:06:04 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/ex-presidente-zuma-e-expulso-de-seu-partido-na-africa-do-sul-apos-derrota-eleitoral/


2024-07-29 23:06:05 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/ex-presidente-zuma-e-expulso-de-seu-partido-na-africa-do-sul-apos-derrota-eleitoral/ HTTP/1.1" 200 None
2024-07-29 23:06:05 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/onde-investir/petroliferas-ou-bancos-qual-setor-paga-mais-dividendos/


2024-07-29 23:06:05 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /onde-investir/petroliferas-ou-bancos-qual-setor-paga-mais-dividendos/ HTTP/1.1" 200 None
2024-07-29 23:06:05 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/paises-latino-americanos-pedem-reuniao-urgente-da-oea-por-eleicoes-na-venezuela/


2024-07-29 23:06:06 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/paises-latino-americanos-pedem-reuniao-urgente-da-oea-por-eleicoes-na-venezuela/ HTTP/1.1" 200 None
2024-07-29 23:06:06 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/onde-investir/tesouro-direto-taxas-de-prefixados-voltam-a-subir-e-se-aproximam-de-recorde-do-ano/


2024-07-29 23:06:07 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /onde-investir/tesouro-direto-taxas-de-prefixados-voltam-a-subir-e-se-aproximam-de-recorde-do-ano/ HTTP/1.1" 200 None
2024-07-29 23:06:07 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/hamas-acusa-netanyahu-de-acrescentar-novas-condicoes-a-proposta-de-cessar-fogo/


2024-07-29 23:06:07 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/hamas-acusa-netanyahu-de-acrescentar-novas-condicoes-a-proposta-de-cessar-fogo/ HTTP/1.1" 200 None
2024-07-29 23:06:07 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/minhas-financas/meis-e-empregadores-domesticos-devem-se-cadastrar-no-det-ate-quinta-1-veja-como/


2024-07-29 23:06:08 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /minhas-financas/meis-e-empregadores-domesticos-devem-se-cadastrar-no-det-ate-quinta-1-veja-como/ HTTP/1.1" 200 None
2024-07-29 23:06:08 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/usiminas-usim5-deve-ter-novamente-custos-pressionados-no-3o-trimestre-diz-analista/


2024-07-29 23:06:08 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/usiminas-usim5-deve-ter-novamente-custos-pressionados-no-3o-trimestre-diz-analista/ HTTP/1.1" 200 None
2024-07-29 23:06:08 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/voos-em-beirute-sao-cancelados-ou-adiados-em-meio-a-temores-de-ataque-israelense/


2024-07-29 23:06:09 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/voos-em-beirute-sao-cancelados-ou-adiados-em-meio-a-temores-de-ataque-israelense/ HTTP/1.1" 200 None
2024-07-29 23:06:09 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/economia/atividade-manufatureira-da-china-deve-registrar-queda-crescente-em-julho/


2024-07-29 23:06:10 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /economia/atividade-manufatureira-da-china-deve-registrar-queda-crescente-em-julho/ HTTP/1.1" 200 None
2024-07-29 23:06:10 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/e-se-petrobras-comprar-bloco-na-namibia-oportunidade-ou-so-pressao-para-dividendos/


2024-07-29 23:06:10 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/e-se-petrobras-comprar-bloco-na-namibia-oportunidade-ou-so-pressao-para-dividendos/ HTTP/1.1" 200 None
2024-07-29 23:06:10 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/elon-musk-viola-suas-proprias-regras-no-x-e-posta-video-falso-de-kamala-harris/


2024-07-29 23:06:11 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/elon-musk-viola-suas-proprias-regras-no-x-e-posta-video-falso-de-kamala-harris/ HTTP/1.1" 200 None
2024-07-29 23:06:11 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/minhas-financas/restituicao-do-imposto-de-renda-como-saber-se-vou-receber-valores-no-3o-lote/


2024-07-29 23:06:11 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /minhas-financas/restituicao-do-imposto-de-renda-como-saber-se-vou-receber-valores-no-3o-lote/ HTTP/1.1" 200 None
2024-07-29 23:06:11 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/governo-brasileiro-pede-verificacao-imparcial-dos-resultados-na-venezuela/


2024-07-29 23:06:12 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/governo-brasileiro-pede-verificacao-imparcial-dos-resultados-na-venezuela/ HTTP/1.1" 200 None
2024-07-29 23:06:12 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/china-evergrande-auto-sofre-pedido-de-falencia-e-acao-tomba-em-hong-kong/


2024-07-29 23:06:12 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/china-evergrande-auto-sofre-pedido-de-falencia-e-acao-tomba-em-hong-kong/ HTTP/1.1" 200 None
2024-07-29 23:06:12 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/minhas-financas/foi-vitima-de-golpe-do-pix-errado-saiba-quando-o-seguro-cobre-a-transacao-bancaria/


2024-07-29 23:06:13 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /minhas-financas/foi-vitima-de-golpe-do-pix-errado-saiba-quando-o-seguro-cobre-a-transacao-bancaria/ HTTP/1.1" 200 None
2024-07-29 23:06:13 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/caixa-seguridade-cxse3-elevada-compra-bradesco-bbi-dividendo-fornece-protecao-em-ambiente-incerto/


2024-07-29 23:06:14 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/caixa-seguridade-cxse3-elevada-compra-bradesco-bbi-dividendo-fornece-protecao-em-ambiente-incerto/ HTTP/1.1" 200 None
2024-07-29 23:06:14 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/allos-alos3-diz-que-brmalls-vai-emitir-ate-r-25-bilhoes-em-debentures/


2024-07-29 23:06:15 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/allos-alos3-diz-que-brmalls-vai-emitir-ate-r-25-bilhoes-em-debentures/ HTTP/1.1" 200 None
2024-07-29 23:06:15 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/franca-suspeita-que-grupos-de-extrema-esquerda-estejam-por-tras-da-sabotagem-de-trens/


2024-07-29 23:06:15 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/franca-suspeita-que-grupos-de-extrema-esquerda-estejam-por-tras-da-sabotagem-de-trens/ HTTP/1.1" 200 None
2024-07-29 23:06:15 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/vale-vale3-diz-que-avalia-oportunidades-apos-rumores-sobre-compra-da-teck/


2024-07-29 23:06:16 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/vale-vale3-diz-que-avalia-oportunidades-apos-rumores-sobre-compra-da-teck/ HTTP/1.1" 200 None
2024-07-29 23:06:16 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/usiminas-diz-que-csn-descumpriu-decisao-da-justica-para-venda-de-acoes/


2024-07-29 23:06:17 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/usiminas-diz-que-csn-descumpriu-decisao-da-justica-para-venda-de-acoes/ HTTP/1.1" 200 None
2024-07-29 23:06:17 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/politica/lula-sanciona-lei-que-cria-a-letra-de-credito-de-desenvolvimento-sem-vetos/


2024-07-29 23:06:17 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /politica/lula-sanciona-lei-que-cria-a-letra-de-credito-de-desenvolvimento-sem-vetos/ HTTP/1.1" 200 None
2024-07-29 23:06:17 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/governos-da-russia-china-cuba-ira-e-bolivia-parabenizam-maduro-pela-vitoria/


2024-07-29 23:06:18 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/governos-da-russia-china-cuba-ira-e-bolivia-parabenizam-maduro-pela-vitoria/ HTTP/1.1" 200 None
2024-07-29 23:06:18 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/onde-investir/por-que-bacon-cannabis-e-ia-podem-alavancar-ipos-nos-eua-e-como-participar/


2024-07-29 23:06:18 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /onde-investir/por-que-bacon-cannabis-e-ia-podem-alavancar-ipos-nos-eua-e-como-participar/ HTTP/1.1" 200 None
2024-07-29 23:06:18 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/ibovespa-futuro-cai-antes-decisoes-de-bcs-e-producao-da-petrobras-petr4/


2024-07-29 23:06:19 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/ibovespa-futuro-cai-antes-decisoes-de-bcs-e-producao-da-petrobras-petr4/ HTTP/1.1" 200 None
2024-07-29 23:06:19 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/minerio-de-ferro-fecha-em-leve-alta-em-dalian-com-dados-industriais-da-china/


2024-07-29 23:06:19 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/minerio-de-ferro-fecha-em-leve-alta-em-dalian-com-dados-industriais-da-china/ HTTP/1.1" 200 None
2024-07-29 23:06:19 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/business/medsystems-pioneira-em-tecnologias-do-mercado-de-estetica-diversifica-o-negocio/


2024-07-29 23:06:20 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /business/medsystems-pioneira-em-tecnologias-do-mercado-de-estetica-diversifica-o-negocio/ HTTP/1.1" 200 None
2024-07-29 23:06:20 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/autoridades-israelenses-dizem-que-buscam-evitar-guerra-total-em-retaliacao-ao-libano/


2024-07-29 23:06:20 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/autoridades-israelenses-dizem-que-buscam-evitar-guerra-total-em-retaliacao-ao-libano/ HTTP/1.1" 200 None
2024-07-29 23:06:20 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/zamp-grupo-sbf-vale-vivo-e-mais-acoes-para-acompanhar-nesta-2a/


2024-07-29 23:06:21 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/zamp-grupo-sbf-vale-vivo-e-mais-acoes-para-acompanhar-nesta-2a/ HTTP/1.1" 200 None
2024-07-29 23:06:21 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/economia/deficit-primario-em-junho-foi-de-r-409-bilhoes-e-chegou-a-244-do-pib-em-12-meses/


2024-07-29 23:06:21 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /economia/deficit-primario-em-junho-foi-de-r-409-bilhoes-e-chegou-a-244-do-pib-em-12-meses/ HTTP/1.1" 200 None
2024-07-29 23:06:21 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/heineken-sofre-prejuizo-inesperado-no-1o-semestre-de-95-milhoes-de-euros/


2024-07-29 23:06:22 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/heineken-sofre-prejuizo-inesperado-no-1o-semestre-de-95-milhoes-de-euros/ HTTP/1.1" 200 None
2024-07-29 23:06:22 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/economia/boletim-focus-projecoes-de-ipca-e-pib-de-2024-e-2025-voltam-a-subir/


2024-07-29 23:06:23 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /economia/boletim-focus-projecoes-de-ipca-e-pib-de-2024-e-2025-voltam-a-subir/ HTTP/1.1" 200 None
2024-07-29 23:06:23 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/nao-vivemos-cenario-de-responsabilidade-fiscal-diz-analista-sobre-fala-de-lula-na-tv/


2024-07-29 23:06:23 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/nao-vivemos-cenario-de-responsabilidade-fiscal-diz-analista-sobre-fala-de-lula-na-tv/ HTTP/1.1" 200 None
2024-07-29 23:06:23 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/politica/corte-de-juro-nao-deve-ocorrer-nesta-semana-em-meio-a-incertezas-diz-guilherme-mello/


2024-07-29 23:06:24 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /politica/corte-de-juro-nao-deve-ocorrer-nesta-semana-em-meio-a-incertezas-diz-guilherme-mello/ HTTP/1.1" 200 None
2024-07-29 23:06:24 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/economia/confianca-da-industria-cresce-em-julho-e-engata-4o-mes-seguido-de-alta/


2024-07-29 23:06:24 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /economia/confianca-da-industria-cresce-em-julho-e-engata-4o-mes-seguido-de-alta/ HTTP/1.1" 200 None
2024-07-29 23:06:24 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/zamp-zamp3-aprova-aumento-de-capital-no-valor-de-ate-r-450-milhoes/


2024-07-29 23:06:25 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/zamp-zamp3-aprova-aumento-de-capital-no-valor-de-ate-r-450-milhoes/ HTTP/1.1" 200 None
2024-07-29 23:06:25 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/advisor/como-o-assessor-de-investimentos-pode-prospectar-mais-clientes/


2024-07-29 23:06:25 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /advisor/como-o-assessor-de-investimentos-pode-prospectar-mais-clientes/ HTTP/1.1" 200 None
2024-07-29 23:06:26 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/enauta-enat3-retoma-producao-no-campo-de-atlanta/


2024-07-29 23:06:26 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/enauta-enat3-retoma-producao-no-campo-de-atlanta/ HTTP/1.1" 200 None
2024-07-29 23:06:26 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/mercados-day-trade-hoje-analistas-29072024/


2024-07-29 23:06:27 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/mercados-day-trade-hoje-analistas-29072024/ HTTP/1.1" 200 None
2024-07-29 23:06:27 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/onde-investir/bitcoin-hoje-atinge-maior-preco-em-6-semanas-apos-discurso-de-trump/


2024-07-29 23:06:27 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /onde-investir/bitcoin-hoje-atinge-maior-preco-em-6-semanas-apos-discurso-de-trump/ HTTP/1.1" 200 None
2024-07-29 23:06:27 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/futuro-bitcoin-bitq24-analise-tecnica-29072024/


2024-07-29 23:06:28 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/futuro-bitcoin-bitq24-analise-tecnica-29072024/ HTTP/1.1" 200 None
2024-07-29 23:06:28 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/maduro-e-reeleito-na-venezuela-mas-resultado-oficial-e-contestado/


2024-07-29 23:06:28 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/maduro-e-reeleito-na-venezuela-mas-resultado-oficial-e-contestado/ HTTP/1.1" 200 None
2024-07-29 23:06:28 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/bc-divulga-estatisticas-fiscais-em-semana-de-copom-producao-da-petrobras-e-mais/


2024-07-29 23:06:29 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/bc-divulga-estatisticas-fiscais-em-semana-de-copom-producao-da-petrobras-e-mais/ HTTP/1.1" 200 None
2024-07-29 23:06:29 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/minidolar-hoje-futuro-do-dolar-wdoq24-29072024/


2024-07-29 23:06:29 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/minidolar-hoje-futuro-do-dolar-wdoq24-29072024/ HTTP/1.1" 200 None
2024-07-29 23:06:29 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/dow-jones-futuro-sobe-em-inicio-de-semana-marcada-por-fed-e-dados-de-trabalho/


2024-07-29 23:06:30 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/dow-jones-futuro-sobe-em-inicio-de-semana-marcada-por-fed-e-dados-de-trabalho/ HTTP/1.1" 200 None
2024-07-29 23:06:30 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/mini-indice-hoje-futuro-ibovespa-winq24-29072024/


2024-07-29 23:06:31 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/mini-indice-hoje-futuro-ibovespa-winq24-29072024/ HTTP/1.1" 200 None
2024-07-29 23:06:32 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/business/global/como-o-starbucks-desvalorizou-sua-propria-marca/


2024-07-29 23:06:32 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /business/global/como-o-starbucks-desvalorizou-sua-propria-marca/ HTTP/1.1" 200 None
2024-07-29 23:06:32 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/politica/lula-diz-que-em-seu-governo-o-brasil-se-reencontrou-com-a-civilizacao/


2024-07-29 23:06:32 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /politica/lula-diz-que-em-seu-governo-o-brasil-se-reencontrou-com-a-civilizacao/ HTTP/1.1" 200 None
2024-07-29 23:06:33 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/aiatola-khamenei-endossa-presidente-eleito-pezeshkian-e-ataca-israel-em-discurso/


2024-07-29 23:06:33 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/aiatola-khamenei-endossa-presidente-eleito-pezeshkian-e-ataca-israel-em-discurso/ HTTP/1.1" 200 None
2024-07-29 23:06:33 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/consumo/deadpool-wolverine-estreia-nos-eua-e-canada-arrecadando-us-205-milhoes/


2024-07-29 23:06:33 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /consumo/deadpool-wolverine-estreia-nos-eua-e-canada-arrecadando-us-205-milhoes/ HTTP/1.1" 200 None
2024-07-29 23:06:34 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/ibovespa-petr4-vale3-bitcoin-analise-tecnica-cohen-28072024/


2024-07-29 23:06:34 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/ibovespa-petr4-vale3-bitcoin-analise-tecnica-cohen-28072024/ HTTP/1.1" 200 None
2024-07-29 23:06:34 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/putin-alerta-eua-sobre-crise-de-misseis-ao-estilo-da-guerra-fria/


2024-07-29 23:06:35 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/putin-alerta-eua-sobre-crise-de-misseis-ao-estilo-da-guerra-fria/ HTTP/1.1" 200 None
2024-07-29 23:06:35 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/parodia-de-a-ultima-ceia-nao-quis-desrespeitar-diz-comite-organizador-de-paris/


2024-07-29 23:06:35 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/parodia-de-a-ultima-ceia-nao-quis-desrespeitar-diz-comite-organizador-de-paris/ HTTP/1.1" 200 None
2024-07-29 23:06:36 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/eleicao-na-venezuela-tem-grandes-filas-e-promessas-de-respeito-as-urnas/


2024-07-29 23:06:36 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/eleicao-na-venezuela-tem-grandes-filas-e-promessas-de-respeito-as-urnas/ HTTP/1.1" 200 None
2024-07-29 23:06:36 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/franca-investiga-ameacas-de-morte-contra-atletas-olimpicos-israelenses/


2024-07-29 23:06:36 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/franca-investiga-ameacas-de-morte-contra-atletas-olimpicos-israelenses/ HTTP/1.1" 200 None
2024-07-29 23:06:37 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/brasil-ganha-suas-tres-primeiras-medalhas-em-paris-judoca-willian-lima-e-prata/


2024-07-29 23:06:37 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/brasil-ganha-suas-tres-primeiras-medalhas-em-paris-judoca-willian-lima-e-prata/ HTTP/1.1" 200 None
2024-07-29 23:06:37 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/business/falha-da-crowdstrike-lanca-foco-sobre-fragilidade-no-resultado-financeiro-da-empresa/


2024-07-29 23:06:38 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /business/falha-da-crowdstrike-lanca-foco-sobre-fragilidade-no-resultado-financeiro-da-empresa/ HTTP/1.1" 200 None
2024-07-29 23:06:38 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/web-stories/6-passaportes-mais-poderosos-do-mundo-em-2024/


2024-07-29 23:06:38 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /web-stories/6-passaportes-mais-poderosos-do-mundo-em-2024/ HTTP/1.1" 200 None
2024-07-29 23:06:38 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/politica/lula-fara-balanco-de-sua-gestao-em-rede-nacional-neste-domingo/


2024-07-29 23:06:39 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /politica/lula-fara-balanco-de-sua-gestao-em-rede-nacional-neste-domingo/ HTTP/1.1" 200 None
2024-07-29 23:06:39 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/campanha-de-kamala-harris-arrecada-us-200-milhoes-em-uma-semana/


2024-07-29 23:06:39 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/campanha-de-kamala-harris-arrecada-us-200-milhoes-em-uma-semana/ HTTP/1.1" 200 None
2024-07-29 23:06:39 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/business/os-20-atletas-mais-bem-pagos-de-paris-2024-e-o-lider-da-lista-nao-esta-na-nba/


2024-07-29 23:06:40 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /business/os-20-atletas-mais-bem-pagos-de-paris-2024-e-o-lider-da-lista-nao-esta-na-nba/ HTTP/1.1" 200 None
2024-07-29 23:06:40 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/business/queijo-artesanal-cai-no-gosto-do-consumidor-e-ja-representa-20-do-mercado-nacional/


2024-07-29 23:06:41 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /business/queijo-artesanal-cai-no-gosto-do-consumidor-e-ja-representa-20-do-mercado-nacional/ HTTP/1.1" 200 None
2024-07-29 23:06:41 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/business/apostas-da-mega-financiam-atletas-olimpicos-diretor-do-cob-explica/


2024-07-29 23:06:41 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /business/apostas-da-mega-financiam-atletas-olimpicos-diretor-do-cob-explica/ HTTP/1.1" 200 None
2024-07-29 23:06:42 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/maior-importacao-de-diesel-russo-reduz-pressao-sobre-precos-da-petrobras-petr4/


2024-07-29 23:06:42 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/maior-importacao-de-diesel-russo-reduz-pressao-sobre-precos-da-petrobras-petr4/ HTTP/1.1" 200 None
2024-07-29 23:06:42 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/ataque-que-matou-criancas-e-adolescentes-de-israel-pode-escalar-a-guerra/


2024-07-29 23:06:43 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/ataque-que-matou-criancas-e-adolescentes-de-israel-pode-escalar-a-guerra/ HTTP/1.1" 200 None
2024-07-29 23:06:43 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/consumo/taxacao-de-alugueis-deve-dobrar-apos-reforma-calcula-entidade-da-construcao/


2024-07-29 23:06:43 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /consumo/taxacao-de-alugueis-deve-dobrar-apos-reforma-calcula-entidade-da-construcao/ HTTP/1.1" 200 None
2024-07-29 23:06:43 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/onde-investir/o-que-levou-a-athena-capital-a-apostar-em-joia-brasileira-da-tecnologia/


2024-07-29 23:06:44 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /onde-investir/o-que-levou-a-athena-capital-a-apostar-em-joia-brasileira-da-tecnologia/ HTTP/1.1" 200 None
2024-07-29 23:06:44 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/agenda-economica-copom-fomc-e-dados-de-trabalho-no-brasil-e-nos-eua-o-que-acompanhar-na-semana/


2024-07-29 23:06:44 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/agenda-economica-copom-fomc-e-dados-de-trabalho-no-brasil-e-nos-eua-o-que-acompanhar-na-semana/ HTTP/1.1" 200 None
2024-07-29 23:06:44 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/governo-tem-projeto-para-substituir-o-gas-de-cozinha-em-regioes-isoladas/


2024-07-29 23:06:45 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/governo-tem-projeto-para-substituir-o-gas-de-cozinha-em-regioes-isoladas/ HTTP/1.1" 200 None
2024-07-29 23:06:45 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/onde-investir/dividendos-da-semana-pagamentos-entre-29-de-julho-a-2-de-agosto-2024/


2024-07-29 23:06:46 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /onde-investir/dividendos-da-semana-pagamentos-entre-29-de-julho-a-2-de-agosto-2024/ HTTP/1.1" 200 None
2024-07-29 23:06:46 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/business/como-triatlo-impulsionou-carreira-destes-executivos/


2024-07-29 23:06:47 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /business/como-triatlo-impulsionou-carreira-destes-executivos/ HTTP/1.1" 200 None
2024-07-29 23:06:47 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/consumo/vai-mudar-de-pais-brisbane-na-australia-atrai-brasileiros-por-baixo-custo-de-vida/


2024-07-29 23:06:48 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /consumo/vai-mudar-de-pais-brisbane-na-australia-atrai-brasileiros-por-baixo-custo-de-vida/ HTTP/1.1" 200 None
2024-07-29 23:06:48 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/onde-investir/entrevista-geoff-white/


2024-07-29 23:06:49 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /onde-investir/entrevista-geoff-white/ HTTP/1.1" 200 None
2024-07-29 23:06:49 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/consumo/lotofacil-concurso-3166-numeros-sorteados-rateio-premio-4-milhoes/


2024-07-29 23:06:49 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /consumo/lotofacil-concurso-3166-numeros-sorteados-rateio-premio-4-milhoes/ HTTP/1.1" 200 None
2024-07-29 23:06:50 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/paris-2024-deve-custar-us-3-bi-a-menos-do-que-rio-2016-saiba-por-que/


2024-07-29 23:06:50 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/paris-2024-deve-custar-us-3-bi-a-menos-do-que-rio-2016-saiba-por-que/ HTTP/1.1" 200 None
2024-07-29 23:06:50 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/consumo/quina-concurso-6492-numeros-sorteados-rateio-premio-48-milhoes/


2024-07-29 23:06:51 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /consumo/quina-concurso-6492-numeros-sorteados-rateio-premio-48-milhoes/ HTTP/1.1" 200 None
2024-07-29 23:06:51 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/consumo/mega-sena-concurso-2754-numeros-sorteados-rateio-premio-100-milhoes/


2024-07-29 23:06:52 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /consumo/mega-sena-concurso-2754-numeros-sorteados-rateio-premio-100-milhoes/ HTTP/1.1" 200 None
2024-07-29 23:06:52 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/business/global/chefes-admitem-que-retorno-presencial-buscava-demissoes-voluntarias/


2024-07-29 23:06:53 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /business/global/chefes-admitem-que-retorno-presencial-buscava-demissoes-voluntarias/ HTTP/1.1" 200 None
2024-07-29 23:06:53 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mercados/klabin-ambev-weg-gerdau-e-mais-confira-a-agenda-de-balancos-da-semana/


2024-07-29 23:06:54 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mercados/klabin-ambev-weg-gerdau-e-mais-confira-a-agenda-de-balancos-da-semana/ HTTP/1.1" 200 None
2024-07-29 23:06:54 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.infomoney.com.br:443


Scraping content from: https://www.infomoney.com.br/mundo/maduro-x-oposicao-como-sera-a-eleicao-na-venezuela-hoje/


2024-07-29 23:06:55 [urllib3.connectionpool] DEBUG: https://www.infomoney.com.br:443 "GET /mundo/maduro-x-oposicao-como-sera-a-eleicao-na-venezuela-hoje/ HTTP/1.1" 200 None


Os dados foram salvos em news_data.json


## Data cleaning: removing advertising-related text from the scraped content

In [14]:
def remove_publi(content):
    # Strip the site's "continues after the advertisement" marker, including its trailing space
    return content.replace("Continua depois da publicidade ", "")

with open('news_data.json', mode='r', encoding='utf-8') as file:
    data = json.load(file)

for item in data:
    item['content'] = remove_publi(item['content'])

with open('news_data.json', mode='w', encoding='utf-8') as file:
    json.dump(data, file, ensure_ascii=False, indent=4)

print("fim")


fim


In [15]:
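def remove_publi(content):
    # Second pass: strip any remaining markers that have no trailing space (e.g. at the end of the text)
    return content.replace("Continua depois da publicidade", "")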

with open('news_data.json', mode='r', encoding='utf-8') as file:
    data = json.load(file)

for item in data:
    item['content'] = remove_publi(item['content'])

with open('news_data.json', mode='w', encoding='utf-8') as file:
    json.dump(data, file, ensure_ascii=False, indent=4)

print("fim")

fim
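
The two passes above (marker with and without a trailing space) could also be folded into a single helper. The following is only a minimal sketch under that assumption: the name `remove_marker` and the whitespace clean-up step are illustrative and not part of the original notebook.

```python
import re

# Marker text inserted by the site between paragraphs (taken from the cells above)
MARKER = "Continua depois da publicidade"


def remove_marker(content, marker=MARKER):
    """Remove every occurrence of `marker` and tidy the whitespace left behind."""
    # Replace the marker with a single space so neighbouring words do not get glued together
    cleaned = content.replace(marker, " ")
    # Collapse runs of spaces or tabs created by the removal
    cleaned = re.sub(r"[ \t]{2,}", " ", cleaned)
    return cleaned.strip()
```

It would be applied the same way as `remove_publi`, for example `item['content'] = remove_marker(item['content'])` inside the loop over `data`.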


Function to remove ads: set the ad text in `ads` and run the cell. The ad is removed only when it appears at the very start of an article's content.

In [8]:
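def remove_ads(content, phrase):
    # Drop the ad only when it is the prefix of the article text
    if content.startswith(phrase):
        return content[len(phrase):]
    return content

# Note: the input is read from the Dados/ folder, while the output below is written to the working directory
with open('Dados/news_data.json', 'r', encoding='utf-8') as file:
    data = json.load(file)

ads = "Herança Desvendada: baixe agora ebook gratuito que ensina tudo sobre planejamento sucessório "
for item in data:
    item['content'] = remove_ads(item['content'], ads)

# ensure_ascii=False keeps accented characters readable, matching the earlier cells
with open('news_data.json', 'w', encoding='utf-8') as outfile:
    json.dump(data, outfile, ensure_ascii=False, indent=4)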

print("Concluído com sucesso")


Concluído com sucesso
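
When several different ads need to be removed, re-running the cell once per phrase becomes tedious. A possible variant, sketched here as an assumption rather than as the approach actually used in this notebook, takes a list of phrases and strips them from the start of the text until none matches. The name `remove_leading_ads` and the sample phrases in the usage note are hypothetical.

```python
def remove_leading_ads(content, phrases):
    """Strip any of `phrases` from the start of `content`, repeating until none matches."""
    changed = True
    while changed:
        changed = False
        for phrase in phrases:
            # Skip empty phrases: startswith("") is always True and would loop forever
            if phrase and content.startswith(phrase):
                content = content[len(phrase):].lstrip()
                changed = True
    return content
```

For example, with `phrases = ["Ad one ", "Ad two "]`, the call `remove_leading_ads("Ad two Ad one Body", phrases)` strips both prefixes regardless of their order, while text that merely contains an ad phrase further down is left untouched.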


## Run Label Studio

In [None]:
!label-studio