# Exercise No. 1
## “Monitor with Threads”

The goal of this exercise is to write multithreaded Python code that monitors
changes on news sites. Consider the following four Brazilian news sites:

- https://g1.globo.com
- https://noticias.uol.com.br
- https://www.r7.com
- https://www.cnnbrasil.com.br
 
The code must start 4 threads, one for each site.
The monitoring function must receive the **URL**, the **CHECK INTERVAL**, and the
**NUMBER OF CHECKS**. In other words, these three parameters must be passed to the
function associated with each thread.

> When processing finishes, each thread must report whether or not the site was modified during the monitored period.
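
The requirement above can be sketched without any network access. This is a minimal illustration, not the notebook's implementation: the `monitor` function, the injected `fetch` callable, the `results` dictionary, and the two `.example` URLs are all hypothetical names chosen for the example. Each thread receives the URL, the interval, and the number of checks, and reports its verdict when it finishes.

```python
import hashlib
from threading import Thread
from time import sleep


def monitor(url, interval, checks, fetch, results):
    """Check `url` `checks` times, `interval` seconds apart, and report whether it changed."""
    previous = None
    changed = False
    for _ in range(checks):
        digest = hashlib.sha224(fetch(url).encode('utf-8')).hexdigest()
        if previous is not None and digest != previous:
            changed = True
        previous = digest
        sleep(interval)
    results[url] = changed
    print(f"{url}: {'modified' if changed else 'not modified'} during the monitored period")


# Hypothetical in-memory "sites" standing in for real HTTP requests
pages = {
    'https://static.example': iter(['a', 'a', 'a']),
    'https://dynamic.example': iter(['a', 'b', 'b']),
}


def fake_fetch(url):
    # Each URL has its own iterator, so threads do not share state here
    return next(pages[url])


results = {}
threads = [Thread(target=monitor, args=(url, 0.01, 3, fake_fetch, results)) for url in pages]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Swapping `fake_fetch` for a function that wraps `requests.get(url).text` would turn the sketch into a real monitor.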

In [None]:
import requests
import hashlib
import pandas as pd

from time import sleep, perf_counter
from datetime import datetime

# Constants
# MARK: - Parameters
SITES = ['https://g1.globo.com/',
         'https://noticias.uol.com.br/',
         'https://www.r7.com/',
         'https://www.cnnbrasil.com.br/']

MONITORING_INTERVAL = 2
MONITORING_COUNT_TARGET = 5

# MARK: - Utils
HEADERS = {'User-Agent': 'Mozilla/5.0'}

class HashComparer:
    def __init__(self):
        self.previous = None
        self.latest = None

    def update(self, new_hash):
        self.previous = self.latest
        self.latest = new_hash   

    def did_change(self):
        if self.previous is not None and self.latest is not None:
            return self.previous != self.latest
        return False

class SiteMonitor:
    def __init__(self):
        self._data = pd.DataFrame(columns = ['timestamp', 'attempt', 'url', 'changed', 'latest', 'previous'])
    
    def monitor(self, site: str,
                interval: float = MONITORING_INTERVAL,
                count_target: int = MONITORING_COUNT_TARGET):
        # Receives the URL, the interval between checks (in seconds) and the number of checks
        hash_comparer = HashComparer()

        for count in range(count_target): # Attempts 0 .. count_target - 1

            response = requests.get(
                site,
                headers = HEADERS,
                timeout = 10 # Do not block forever on an unresponsive site
            )

            utf8_content = repr(response.text).encode('utf-8')
            response_hash = hashlib.sha224(utf8_content).hexdigest()

            hash_comparer.update(response_hash)

            row = pd.DataFrame({
                'timestamp': [datetime.now()],
                'attempt': [count],
                'url': [site],
                'changed': [hash_comparer.did_change()],
                'latest': [hash_comparer.latest],
                'previous': [hash_comparer.previous]
            })
            # Concatenating with an empty frame can raise a pandas FutureWarning, so the first row is assigned directly
            self._data = row if self._data.empty else pd.concat([self._data, row], ignore_index=True)

            print(f'[{datetime.now()}] Monitoring attempt: ({count}) Url: {site} Did change?: {hash_comparer.did_change()} | Hashes latest: {hash_comparer.latest} VS. previous: {hash_comparer.previous}')

            sleep(interval)
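
The change detection above relies on comparing digests of consecutive responses. As a small sketch of that idea (the `content_hash` helper and the sample HTML strings are hypothetical, and this version hashes the text directly rather than its `repr`):

```python
import hashlib


def content_hash(text: str) -> str:
    """Return the SHA-224 hex digest of a page body (hypothetical helper)."""
    return hashlib.sha224(text.encode('utf-8')).hexdigest()


first = content_hash('<html>headline A</html>')
same = content_hash('<html>headline A</html>')
other = content_hash('<html>headline B</html>')

print(first == same)   # identical content gives an identical digest
print(first == other)  # any difference in content changes the digest
print(len(first))      # a SHA-224 digest is 56 hex characters long
```

Because any byte-level difference changes the digest, pages that embed timestamps, tokens, or rotating ads would be flagged as changed on nearly every check, which may be part of what the logs in the following sections show.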


## 1 Sequential

In [2]:
sites_data = []

start_time = perf_counter()

for site in SITES: # For each URL
    site_monitor = SiteMonitor()
    site_monitor.monitor(site)
    
    sites_data.append(site_monitor._data)

end_time = perf_counter()

print(f'The tasks took {end_time - start_time:0.2f} second(s) to run.')

[2025-09-02 22:17:09.483203] Monitoring attempt: (0) Url: https://g1.globo.com/ Did change?: False | Hashes latest: b466f1a49cd9ed7fd37f9943c64b660931254e352ccc5d52e81b3668 VS. previous: None
[2025-09-02 22:17:12.325392] Monitoring attempt: (1) Url: https://g1.globo.com/ Did change?: True | Hashes latest: 1355a70021c95160df1f1e11dda99818d6b49508935290e440427a3a VS. previous: b466f1a49cd9ed7fd37f9943c64b660931254e352ccc5d52e81b3668
[2025-09-02 22:17:15.180796] Monitoring attempt: (2) Url: https://g1.globo.com/ Did change?: True | Hashes latest: f7845cce336ccfe5cc66f16f6cff3881613f3785bec52841f8c204e1 VS. previous: 1355a70021c95160df1f1e11dda99818d6b49508935290e440427a3a
[2025-09-02 22:17:18.009847] Monitoring attempt: (3) Url: https://g1.globo.com/ Did change?: True | Hashes latest: 62391c1684356d5cc9ed7ca04be98623a5f2dab46ef255ead49486ea VS. previous: f7845cce336ccfe5cc66f16f6cff3881613f3785bec52841f8c204e1
[2025-09-02 22:17:20.842798] Monitoring attempt: (4) Url: https://g1.globo.com/

[2025-09-02 22:17:23.040805] Monitoring attempt: (0) Url: https://noticias.uol.com.br/ Did change?: False | Hashes latest: 28adbf45c4e8386c8efc883478ad3150d5be5434a2be78207c373bdf VS. previous: None
[2025-09-02 22:17:25.083310] Monitoring attempt: (1) Url: https://noticias.uol.com.br/ Did change?: True | Hashes latest: 0dddcd58d9784efd9dc3fe07cf3334e318481ff86712e81f12802c24 VS. previous: 28adbf45c4e8386c8efc883478ad3150d5be5434a2be78207c373bdf
[2025-09-02 22:17:27.131626] Monitoring attempt: (2) Url: https://noticias.uol.com.br/ Did change?: True | Hashes latest: 95fc240802fb4d30e5372f18ca590fd7e5194c3d9f69effa8f87e34b VS. previous: 0dddcd58d9784efd9dc3fe07cf3334e318481ff86712e81f12802c24
[2025-09-02 22:17:29.178481] Monitoring attempt: (3) Url: https://noticias.uol.com.br/ Did change?: True | Hashes latest: 07c9f67d24385afb57bb1b4473b01abebeb802c38fd457dcef1afe79 VS. previous: 95fc240802fb4d30e5372f18ca590fd7e5194c3d9f69effa8f87e34b
[2025-09-02 22:17:31.226685] Monitoring attempt: (4

[2025-09-02 22:17:33.576734] Monitoring attempt: (0) Url: https://www.r7.com/ Did change?: False | Hashes latest: f47862f74afa642ad6e7f336093985ce0e6dcef96c5bdb47ed5116f2 VS. previous: None
[2025-09-02 22:17:35.773781] Monitoring attempt: (1) Url: https://www.r7.com/ Did change?: True | Hashes latest: 6453ee0bf82c63795c597cd86308e36898f06c1ffc0debaa9d7f5e94 VS. previous: f47862f74afa642ad6e7f336093985ce0e6dcef96c5bdb47ed5116f2
[2025-09-02 22:17:37.960977] Monitoring attempt: (2) Url: https://www.r7.com/ Did change?: True | Hashes latest: 2f29e232dca61453cd3f952c15a84c6dd6b6ea5f7b51a27042b6bc91 VS. previous: 6453ee0bf82c63795c597cd86308e36898f06c1ffc0debaa9d7f5e94
[2025-09-02 22:17:40.097613] Monitoring attempt: (3) Url: https://www.r7.com/ Did change?: True | Hashes latest: 84e10c03602a4f0ed67dbbb965ad52323591d61e66e0ebc8e555dd3e VS. previous: 2f29e232dca61453cd3f952c15a84c6dd6b6ea5f7b51a27042b6bc91
[2025-09-02 22:17:42.294072] Monitoring attempt: (4) Url: https://www.r7.com/ Did chang

[2025-09-02 22:17:44.553618] Monitoring attempt: (0) Url: https://www.cnnbrasil.com.br/ Did change?: False | Hashes latest: acd4068a5f9463bf01b7da8eb0e75ac28b8b878647a8afa4192e282c VS. previous: None
[2025-09-02 22:17:46.599910] Monitoring attempt: (1) Url: https://www.cnnbrasil.com.br/ Did change?: False | Hashes latest: acd4068a5f9463bf01b7da8eb0e75ac28b8b878647a8afa4192e282c VS. previous: acd4068a5f9463bf01b7da8eb0e75ac28b8b878647a8afa4192e282c
[2025-09-02 22:17:48.937206] Monitoring attempt: (2) Url: https://www.cnnbrasil.com.br/ Did change?: True | Hashes latest: 87f503eab3fbde1b4cc53f42a3ef324d4986bc545ed217434ab1734a VS. previous: acd4068a5f9463bf01b7da8eb0e75ac28b8b878647a8afa4192e282c
[2025-09-02 22:17:50.978273] Monitoring attempt: (3) Url: https://www.cnnbrasil.com.br/ Did change?: False | Hashes latest: 87f503eab3fbde1b4cc53f42a3ef324d4986bc545ed217434ab1734a VS. previous: 87f503eab3fbde1b4cc53f42a3ef324d4986bc545ed217434ab1734a
[2025-09-02 22:17:53.077351] Monitoring attem

In [3]:
for data in sites_data:
    display(data)

index,timestamp,attempt,url,changed,latest,previous
0,2025-09-02 22:17:09.478989,0,https://g1.globo.com/,False,b466f1a49cd9ed7fd37f9943c64b660931254e352ccc5d...,
1,2025-09-02 22:17:12.324271,1,https://g1.globo.com/,True,1355a70021c95160df1f1e11dda99818d6b49508935290...,b466f1a49cd9ed7fd37f9943c64b660931254e352ccc5d...
2,2025-09-02 22:17:15.179708,2,https://g1.globo.com/,True,f7845cce336ccfe5cc66f16f6cff3881613f3785bec528...,1355a70021c95160df1f1e11dda99818d6b49508935290...
3,2025-09-02 22:17:18.008770,3,https://g1.globo.com/,True,62391c1684356d5cc9ed7ca04be98623a5f2dab46ef255...,f7845cce336ccfe5cc66f16f6cff3881613f3785bec528...
4,2025-09-02 22:17:20.841713,4,https://g1.globo.com/,True,f7845cce336ccfe5cc66f16f6cff3881613f3785bec528...,62391c1684356d5cc9ed7ca04be98623a5f2dab46ef255...


index,timestamp,attempt,url,changed,latest,previous
0,2025-09-02 22:17:23.039408,0,https://noticias.uol.com.br/,False,28adbf45c4e8386c8efc883478ad3150d5be5434a2be78...,
1,2025-09-02 22:17:25.082159,1,https://noticias.uol.com.br/,True,0dddcd58d9784efd9dc3fe07cf3334e318481ff86712e8...,28adbf45c4e8386c8efc883478ad3150d5be5434a2be78...
2,2025-09-02 22:17:27.130505,2,https://noticias.uol.com.br/,True,95fc240802fb4d30e5372f18ca590fd7e5194c3d9f69ef...,0dddcd58d9784efd9dc3fe07cf3334e318481ff86712e8...
3,2025-09-02 22:17:29.177418,3,https://noticias.uol.com.br/,True,07c9f67d24385afb57bb1b4473b01abebeb802c38fd457...,95fc240802fb4d30e5372f18ca590fd7e5194c3d9f69ef...
4,2025-09-02 22:17:31.224149,4,https://noticias.uol.com.br/,True,fe2074a1842782115bbf49475cc7e64e730e6ae400a96d...,07c9f67d24385afb57bb1b4473b01abebeb802c38fd457...


index,timestamp,attempt,url,changed,latest,previous
0,2025-09-02 22:17:33.575320,0,https://www.r7.com/,False,f47862f74afa642ad6e7f336093985ce0e6dcef96c5bdb...,
1,2025-09-02 22:17:35.772201,1,https://www.r7.com/,True,6453ee0bf82c63795c597cd86308e36898f06c1ffc0deb...,f47862f74afa642ad6e7f336093985ce0e6dcef96c5bdb...
2,2025-09-02 22:17:37.959947,2,https://www.r7.com/,True,2f29e232dca61453cd3f952c15a84c6dd6b6ea5f7b51a2...,6453ee0bf82c63795c597cd86308e36898f06c1ffc0deb...
3,2025-09-02 22:17:40.096545,3,https://www.r7.com/,True,84e10c03602a4f0ed67dbbb965ad52323591d61e66e0eb...,2f29e232dca61453cd3f952c15a84c6dd6b6ea5f7b51a2...
4,2025-09-02 22:17:42.292405,4,https://www.r7.com/,True,3983bf6b24e077850fe784873f19600479c9797f7a8a28...,84e10c03602a4f0ed67dbbb965ad52323591d61e66e0eb...


index,timestamp,attempt,url,changed,latest,previous
0,2025-09-02 22:17:44.552261,0,https://www.cnnbrasil.com.br/,False,acd4068a5f9463bf01b7da8eb0e75ac28b8b878647a8af...,
1,2025-09-02 22:17:46.598048,1,https://www.cnnbrasil.com.br/,False,acd4068a5f9463bf01b7da8eb0e75ac28b8b878647a8af...,acd4068a5f9463bf01b7da8eb0e75ac28b8b878647a8af...
2,2025-09-02 22:17:48.936188,2,https://www.cnnbrasil.com.br/,True,87f503eab3fbde1b4cc53f42a3ef324d4986bc545ed217...,acd4068a5f9463bf01b7da8eb0e75ac28b8b878647a8af...
3,2025-09-02 22:17:50.977247,3,https://www.cnnbrasil.com.br/,False,87f503eab3fbde1b4cc53f42a3ef324d4986bc545ed217...,87f503eab3fbde1b4cc53f42a3ef324d4986bc545ed217...
4,2025-09-02 22:17:53.072872,4,https://www.cnnbrasil.com.br/,False,87f503eab3fbde1b4cc53f42a3ef324d4986bc545ed217...,87f503eab3fbde1b4cc53f42a3ef324d4986bc545ed217...


## 2 Parallel

In [4]:
from threading import Thread

In [5]:
# Task
def task(site: str, sites_data: list):
    site_monitor = SiteMonitor()
    site_monitor.monitor(site)

    sites_data.append(site_monitor._data)

In [6]:
sites_data = []

start_time = perf_counter()

# One thread for each site
threads = []
for site in SITES:
    threads.append(
        Thread(
            target = task,
            args = (site, sites_data)
        )
    )

# Start the 4 threads
for thread in threads:
    thread.start()

# Wait until all 4 threads have finished
for thread in threads:
    thread.join()

end_time = perf_counter()

print(f'The tasks took {end_time - start_time:0.2f} second(s) to run.')

[2025-09-02 22:17:55.326229] Monitoring attempt: (0) Url: https://noticias.uol.com.br/ Did change?: False | Hashes latest: e731f78889026e93c62979bf347c2b600f170927534cc0940085577c VS. previous: None
[2025-09-02 22:17:55.401694] Monitoring attempt: (0) Url: https://www.r7.com/ Did change?: False | Hashes latest: a515b4a268318ce5a225b8a2cbe4d7bb5d6483e36132c905b9b62b06 VS. previous: None
[2025-09-02 22:17:55.486391] Monitoring attempt: (0) Url: https://www.cnnbrasil.com.br/ Did change?: False | Hashes latest: 1958f8e1360171c78756bf7b90045327f6f29aaf9c7873e376ee09a7 VS. previous: None
[2025-09-02 22:17:55.965846] Monitoring attempt: (0) Url: https://g1.globo.com/ Did change?: False | Hashes latest: a8baf0368a4be8f227cdd4f648e2f19f2b5105af282cda0a77084e44 VS. previous: None
[2025-09-02 22:17:57.373216] Monitoring attempt: (1) Url: https://noticias.uol.com.br/ Did change?: True | Hashes latest: ae9d64af54b193fa7282bb0a361f79237e60f12b6598c251690475f2 VS. previous: e731f78889026e93c62979bf34
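
Judging by the logged timestamps, the sequential run spans roughly 45 seconds while the threaded run spans roughly 14 seconds. This is expected for I/O-bound work: the threads spend most of their time waiting, so their waits overlap. A minimal sketch of the effect, where the hypothetical `fake_monitor` function uses `sleep` as a stand-in for the network wait and the pause between checks:

```python
from threading import Thread
from time import sleep, perf_counter


def fake_monitor(delay):
    # Stands in for one site's requests and sleeps (assumption for illustration)
    sleep(delay)


delays = [0.1, 0.1, 0.1, 0.1]  # one hypothetical delay per site

# Sequential: total time is roughly the sum of the delays
start = perf_counter()
for delay in delays:
    fake_monitor(delay)
sequential = perf_counter() - start

# Threaded: total time is roughly the longest single delay
start = perf_counter()
threads = [Thread(target=fake_monitor, args=(delay,)) for delay in delays]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
threaded = perf_counter() - start

print(f'sequential: {sequential:.2f}s, threaded: {threaded:.2f}s')
```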

In [7]:
for data in sites_data:
    display(data)

index,timestamp,attempt,url,changed,latest,previous
0,2025-09-02 22:17:55.324270,0,https://noticias.uol.com.br/,False,e731f78889026e93c62979bf347c2b600f170927534cc0...,
1,2025-09-02 22:17:57.372085,1,https://noticias.uol.com.br/,True,ae9d64af54b193fa7282bb0a361f79237e60f12b6598c2...,e731f78889026e93c62979bf347c2b600f170927534cc0...
2,2025-09-02 22:17:59.420003,2,https://noticias.uol.com.br/,True,c44067ce1a5c393961f73432e31fa6aedb7ea8c11e79db...,ae9d64af54b193fa7282bb0a361f79237e60f12b6598c2...
3,2025-09-02 22:18:01.461855,3,https://noticias.uol.com.br/,True,a344db221a137126e6da87a175c9937c161c3dda4ca2b5...,c44067ce1a5c393961f73432e31fa6aedb7ea8c11e79db...
4,2025-09-02 22:18:03.503982,4,https://noticias.uol.com.br/,True,b5da2e0b17f3dd02b3d82a138f04763cd6d3ba85e028f0...,a344db221a137126e6da87a175c9937c161c3dda4ca2b5...


index,timestamp,attempt,url,changed,latest,previous
0,2025-09-02 22:17:55.400449,0,https://www.r7.com/,False,a515b4a268318ce5a225b8a2cbe4d7bb5d6483e36132c9...,
1,2025-09-02 22:17:57.534791,1,https://www.r7.com/,True,a7072b3120489049e35fce33b8e6e7c42706e9022f9a5d...,a515b4a268318ce5a225b8a2cbe4d7bb5d6483e36132c9...
2,2025-09-02 22:17:59.702996,2,https://www.r7.com/,True,f33edb2dd1552cf3ff5deca5f213fcdbb0b5e3b4d70b35...,a7072b3120489049e35fce33b8e6e7c42706e9022f9a5d...
3,2025-09-02 22:18:01.878157,3,https://www.r7.com/,True,a507cdc34af59bfcf68087db677d72b41b0dfec901ecda...,f33edb2dd1552cf3ff5deca5f213fcdbb0b5e3b4d70b35...
4,2025-09-02 22:18:04.006450,4,https://www.r7.com/,True,dc2a34465704a25903ac76f02041dc910fda46fa940f01...,a507cdc34af59bfcf68087db677d72b41b0dfec901ecda...


index,timestamp,attempt,url,changed,latest,previous
0,2025-09-02 22:17:55.485056,0,https://www.cnnbrasil.com.br/,False,1958f8e1360171c78756bf7b90045327f6f29aaf9c7873...,
1,2025-09-02 22:17:57.707735,1,https://www.cnnbrasil.com.br/,True,9f6403a44ea0058e334fde56d3be49f4c89bf28d35b56c...,1958f8e1360171c78756bf7b90045327f6f29aaf9c7873...
2,2025-09-02 22:17:59.749785,2,https://www.cnnbrasil.com.br/,False,9f6403a44ea0058e334fde56d3be49f4c89bf28d35b56c...,9f6403a44ea0058e334fde56d3be49f4c89bf28d35b56c...
3,2025-09-02 22:18:01.941308,3,https://www.cnnbrasil.com.br/,True,f8932944c0cb26331b6b055edc47b5690b3a4196ec96be...,9f6403a44ea0058e334fde56d3be49f4c89bf28d35b56c...
4,2025-09-02 22:18:04.110962,4,https://www.cnnbrasil.com.br/,True,db00a270f6479b4fec9b01814df6d9826518cb904db967...,f8932944c0cb26331b6b055edc47b5690b3a4196ec96be...


index,timestamp,attempt,url,changed,latest,previous
0,2025-09-02 22:17:55.964432,0,https://g1.globo.com/,False,a8baf0368a4be8f227cdd4f648e2f19f2b5105af282cda...,
1,2025-09-02 22:17:58.796098,1,https://g1.globo.com/,False,a8baf0368a4be8f227cdd4f648e2f19f2b5105af282cda...,a8baf0368a4be8f227cdd4f648e2f19f2b5105af282cda...
2,2025-09-02 22:18:01.639637,2,https://g1.globo.com/,False,a8baf0368a4be8f227cdd4f648e2f19f2b5105af282cda...,a8baf0368a4be8f227cdd4f648e2f19f2b5105af282cda...
3,2025-09-02 22:18:04.466767,3,https://g1.globo.com/,False,a8baf0368a4be8f227cdd4f648e2f19f2b5105af282cda...,a8baf0368a4be8f227cdd4f648e2f19f2b5105af282cda...
4,2025-09-02 22:18:07.297988,4,https://g1.globo.com/,False,a8baf0368a4be8f227cdd4f648e2f19f2b5105af282cda...,a8baf0368a4be8f227cdd4f648e2f19f2b5105af282cda...
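
The exercise asks for a final verdict per site, which the tables above contain only implicitly. One way to derive it, sketched here with plain lists, is to check whether any attempt was flagged as changed. The `changed_flags` dictionary copies the `changed` column of the four parallel-run tables; the `was_modified` helper is a hypothetical name.

```python
# 'changed' flags copied from the parallel-run tables, one list per site
changed_flags = {
    'https://noticias.uol.com.br/': [False, True, True, True, True],
    'https://www.r7.com/': [False, True, True, True, True],
    'https://www.cnnbrasil.com.br/': [False, True, False, True, True],
    'https://g1.globo.com/': [False, False, False, False, False],
}


def was_modified(flags):
    """A site counts as modified if at least one check differed from the previous one."""
    return any(flags)


summary = {url: was_modified(flags) for url, flags in changed_flags.items()}
for url, modified in summary.items():
    print(f"{url}: {'modified' if modified else 'not modified'} during the monitored period")
```

With the DataFrames collected in `sites_data`, an expression such as `bool(data['changed'].any())` should give the same answer for each site.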
