# Exercise No. 1
“Monitor with Threads”

The goal of this exercise is to write multithreaded Python code that monitors
changes on news sites. Consider the following four Brazilian news sites:

- https://g1.globo.com/
- https://noticias.uol.com.br/
- https://www.r7.com/
- https://www.cnnbrasil.com.br
 
The code must start 4 threads, one per site.
The monitoring function must receive the **URL**, the **CHECK INTERVAL**, and the
**NUMBER OF CHECKS**. In other words, these three parameters must be passed to the
function associated with each thread.

> When processing finishes, each thread must report whether or not the site was
> modified during the monitored period.
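
One way to meet the three-parameter requirement is sketched below. This is a minimal sketch, not the notebook's implementation: `monitor_site`, `fetch`, `fake_fetch`, and `results` are hypothetical names, and the page download is passed in as a callable so that the example runs without network access.

```python
import hashlib
from threading import Thread
from time import sleep


def monitor_site(url, interval, checks, fetch, results):
    """Poll `url` `checks` times, `interval` seconds apart, and record
    in `results[url]` whether the content hash ever changed."""
    previous = None
    changed = False
    for _ in range(checks):
        digest = hashlib.sha224(fetch(url).encode('utf-8')).hexdigest()
        if previous is not None and digest != previous:
            changed = True
        previous = digest
        sleep(interval)
    results[url] = changed


# Hypothetical stand-in for an HTTP request: page "a" changes on every
# call, page "b" never changes.
calls = {'a': 0, 'b': 0}


def fake_fetch(url):
    calls[url] += 1
    return f'version {calls[url]}' if url == 'a' else 'static page'


results = {}
threads = [
    Thread(target=monitor_site, args=(url, 0.01, 3, fake_fetch, results))
    for url in ('a', 'b')
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Final report: one verdict per monitored URL.
for url, changed in sorted(results.items()):
    print(f'{url}: {"modified" if changed else "not modified"}')
```

In a real run, `fetch` would presumably wrap `requests.get(url).text`, and the interval would be measured in seconds rather than hundredths of a second.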

In [29]:
import requests
import hashlib
import pandas as pd
from time import sleep, perf_counter
from datetime import datetime

# Constants
# MARK: - Parameters
SITES = ['https://g1.globo.com/',
         'https://noticias.uol.com.br/',
         'https://www.r7.com/',
         'https://www.cnnbrasil.com.br/']
MONITORING_INTERVAL = 2
MONITORING_COUNT_TARGET = 5

# MARK: - Utils
HEADERS = {'User-Agent': 'Mozilla/5.0'}

class HashComparer:
    def __init__(self):
        self.previous = None
        self.latest = None

    def update(self, new_hash):
        self.previous = self.latest
        self.latest = new_hash   

    def did_change(self):
        if self.previous is not None and self.latest is not None:
            return self.previous != self.latest
        return False

class SiteMonitor:
    def __init__(self):
        self._data = pd.DataFrame(columns = ['timestamp', 'attempt', 'url', 'changed', 'latest', 'previous'])
    
    def monitor(self, site: str):
        
        hash_comparer = HashComparer()

        for count in range(MONITORING_COUNT_TARGET):  # Counts from 0 to TARGET - 1

            response = requests.get(
                site,
                headers = HEADERS
            )

            utf8_content = repr(response.text).encode('utf-8')
            response_hash = hashlib.sha224(utf8_content).hexdigest()

            hash_comparer.update(response_hash)

            self._data = pd.concat([self._data, pd.DataFrame({
                'timestamp': [datetime.now()],
                'attempt': [count],
                'url': [site],
                'changed': [hash_comparer.did_change()],
                'latest': [hash_comparer.latest],
                'previous': [hash_comparer.previous]
            })], ignore_index=True)

            print(f'[{datetime.now()}] Monitoring attempt: ({count}) Url: {site} Did change?: {hash_comparer.did_change()} | Hashes latest: {hash_comparer.latest} VS. previous: {hash_comparer.previous}')

            sleep(MONITORING_INTERVAL)
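
The change detection in `SiteMonitor.monitor` comes down to comparing digests of consecutive responses. The sketch below illustrates that idea in isolation. `content_hash` is a hypothetical helper, and the HTML strings are made-up stand-ins for `response.text`. It hashes the text directly rather than its `repr`, which should detect the same changes.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return the SHA-224 hex digest of a page's text."""
    return hashlib.sha224(text.encode('utf-8')).hexdigest()


# Made-up page bodies standing in for response.text.
first = content_hash('<html><h1>Headline A</h1></html>')
second = content_hash('<html><h1>Headline A</h1></html>')
third = content_hash('<html><h1>Headline B</h1></html>')

print(first == second)  # identical content gives identical digests
print(first == third)   # a difference in content changes the digest
print(len(first))       # SHA-224 hex digests are 56 characters long
```

Because the whole page is hashed, dynamic fragments such as ads or embedded timestamps can also change the digest, which may explain why some sites report a change on almost every check.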


## 1 Sequential

In [33]:
sites_data = []

start_time = perf_counter()

for site in SITES: # For each URL
    site_monitor = SiteMonitor()
    site_monitor.monitor(site)
    
    sites_data.append(site_monitor._data)

end_time = perf_counter()

print(f'The tasks took {end_time - start_time:0.2f} second(s) to run.')

[2025-08-26 22:58:11.307453] Monitoring attempt: (0) Url: https://g1.globo.com/ Did change?: False | Hashes latest: 1f6c37479658af54de5fc86f5a7d18d2aa93acdeeec0bfe6b310ae14 VS. previous: None
[2025-08-26 22:58:14.141887] Monitoring attempt: (1) Url: https://g1.globo.com/ Did change?: False | Hashes latest: 1f6c37479658af54de5fc86f5a7d18d2aa93acdeeec0bfe6b310ae14 VS. previous: 1f6c37479658af54de5fc86f5a7d18d2aa93acdeeec0bfe6b310ae14
[2025-08-26 22:58:16.983013] Monitoring attempt: (2) Url: https://g1.globo.com/ Did change?: True | Hashes latest: 213d202ae65421cd666c0f9a458d583d80cc239057590f83428b665d VS. previous: 1f6c37479658af54de5fc86f5a7d18d2aa93acdeeec0bfe6b310ae14
[2025-08-26 22:58:19.822586] Monitoring attempt: (3) Url: https://g1.globo.com/ Did change?: False | Hashes latest: 213d202ae65421cd666c0f9a458d583d80cc239057590f83428b665d VS. previous: 213d202ae65421cd666c0f9a458d583d80cc239057590f83428b665d
[2025-08-26 22:58:22.652687] Monitoring attempt: (4) Url: https://g1.globo.co

[2025-08-26 22:58:24.838239] Monitoring attempt: (0) Url: https://noticias.uol.com.br/ Did change?: False | Hashes latest: 18144de00394f8c37ba60e1cc436419c40e3c1b1942b5524816b0cad VS. previous: None
[2025-08-26 22:58:26.888865] Monitoring attempt: (1) Url: https://noticias.uol.com.br/ Did change?: True | Hashes latest: ff0b351ece6298514672fb1e36636e3ca646ba8515511ce8bd18dbae VS. previous: 18144de00394f8c37ba60e1cc436419c40e3c1b1942b5524816b0cad
[2025-08-26 22:58:28.945369] Monitoring attempt: (2) Url: https://noticias.uol.com.br/ Did change?: True | Hashes latest: fed553f287a91053caf5152e3aa620e912089d8d348d185f7acbdf97 VS. previous: ff0b351ece6298514672fb1e36636e3ca646ba8515511ce8bd18dbae
[2025-08-26 22:58:30.986530] Monitoring attempt: (3) Url: https://noticias.uol.com.br/ Did change?: True | Hashes latest: e82cd6aba6e93510d85870671b3028a68cc808fb2125e9db66073f87 VS. previous: fed553f287a91053caf5152e3aa620e912089d8d348d185f7acbdf97
[2025-08-26 22:58:33.025903] Monitoring attempt: (4

[2025-08-26 22:58:35.451172] Monitoring attempt: (0) Url: https://www.r7.com/ Did change?: False | Hashes latest: 84472923ec9c1dc7d26df927430e5da7fe2e8f874ecb5c4758fb7e5d VS. previous: None
[2025-08-26 22:58:37.883667] Monitoring attempt: (1) Url: https://www.r7.com/ Did change?: True | Hashes latest: 829ee75aeaa1f6c9ba9f997975cda3b2c09f61951b3f7ec9fff8c54c VS. previous: 84472923ec9c1dc7d26df927430e5da7fe2e8f874ecb5c4758fb7e5d
[2025-08-26 22:58:40.030307] Monitoring attempt: (2) Url: https://www.r7.com/ Did change?: True | Hashes latest: e168c389a4db1e81215ece8a38cad8f2ed4670c9383845c142409ed7 VS. previous: 829ee75aeaa1f6c9ba9f997975cda3b2c09f61951b3f7ec9fff8c54c
[2025-08-26 22:58:42.166750] Monitoring attempt: (3) Url: https://www.r7.com/ Did change?: True | Hashes latest: 95a98da40e130ecd35cf7d2b195811d927f1246377bda597d9464723 VS. previous: e168c389a4db1e81215ece8a38cad8f2ed4670c9383845c142409ed7
[2025-08-26 22:58:44.309305] Monitoring attempt: (4) Url: https://www.r7.com/ Did chang

[2025-08-26 22:58:46.507986] Monitoring attempt: (0) Url: https://www.cnnbrasil.com.br/ Did change?: False | Hashes latest: 8b7388ce2923534c94c21238b0d0a8c4aceb80fe1512b056c7504165 VS. previous: None
[2025-08-26 22:58:48.561749] Monitoring attempt: (1) Url: https://www.cnnbrasil.com.br/ Did change?: False | Hashes latest: 8b7388ce2923534c94c21238b0d0a8c4aceb80fe1512b056c7504165 VS. previous: 8b7388ce2923534c94c21238b0d0a8c4aceb80fe1512b056c7504165
[2025-08-26 22:58:50.964173] Monitoring attempt: (2) Url: https://www.cnnbrasil.com.br/ Did change?: True | Hashes latest: 19fa17bf66c79062d67b80019a0cdbef190c1bdcfab3dcdb8dfe0037 VS. previous: 8b7388ce2923534c94c21238b0d0a8c4aceb80fe1512b056c7504165
[2025-08-26 22:58:53.011538] Monitoring attempt: (3) Url: https://www.cnnbrasil.com.br/ Did change?: False | Hashes latest: 19fa17bf66c79062d67b80019a0cdbef190c1bdcfab3dcdb8dfe0037 VS. previous: 19fa17bf66c79062d67b80019a0cdbef190c1bdcfab3dcdb8dfe0037
[2025-08-26 22:58:55.053062] Monitoring attem

In [34]:
for data in sites_data:
    display(data)

| | timestamp | attempt | url | changed | latest | previous |
|---|---|---|---|---|---|---|
| 0 | 2025-08-26 22:58:11.305778 | 0 | https://g1.globo.com/ | False | 1f6c37479658af54de5fc86f5a7d18d2aa93acdeeec0bf... | |
| 1 | 2025-08-26 22:58:14.140809 | 1 | https://g1.globo.com/ | False | 1f6c37479658af54de5fc86f5a7d18d2aa93acdeeec0bf... | 1f6c37479658af54de5fc86f5a7d18d2aa93acdeeec0bf... |
| 2 | 2025-08-26 22:58:16.981973 | 2 | https://g1.globo.com/ | True | 213d202ae65421cd666c0f9a458d583d80cc239057590f... | 1f6c37479658af54de5fc86f5a7d18d2aa93acdeeec0bf... |
| 3 | 2025-08-26 22:58:19.821203 | 3 | https://g1.globo.com/ | False | 213d202ae65421cd666c0f9a458d583d80cc239057590f... | 213d202ae65421cd666c0f9a458d583d80cc239057590f... |
| 4 | 2025-08-26 22:58:22.651575 | 4 | https://g1.globo.com/ | False | 213d202ae65421cd666c0f9a458d583d80cc239057590f... | 213d202ae65421cd666c0f9a458d583d80cc239057590f... |


| | timestamp | attempt | url | changed | latest | previous |
|---|---|---|---|---|---|---|
| 0 | 2025-08-26 22:58:24.836762 | 0 | https://noticias.uol.com.br/ | False | 18144de00394f8c37ba60e1cc436419c40e3c1b1942b55... | |
| 1 | 2025-08-26 22:58:26.887193 | 1 | https://noticias.uol.com.br/ | True | ff0b351ece6298514672fb1e36636e3ca646ba8515511c... | 18144de00394f8c37ba60e1cc436419c40e3c1b1942b55... |
| 2 | 2025-08-26 22:58:28.944154 | 2 | https://noticias.uol.com.br/ | True | fed553f287a91053caf5152e3aa620e912089d8d348d18... | ff0b351ece6298514672fb1e36636e3ca646ba8515511c... |
| 3 | 2025-08-26 22:58:30.985483 | 3 | https://noticias.uol.com.br/ | True | e82cd6aba6e93510d85870671b3028a68cc808fb2125e9... | fed553f287a91053caf5152e3aa620e912089d8d348d18... |
| 4 | 2025-08-26 22:58:33.024749 | 4 | https://noticias.uol.com.br/ | True | 3cd29cef23a33fd9e1d0a5f078cb0ec916d64a79fa1c21... | e82cd6aba6e93510d85870671b3028a68cc808fb2125e9... |


| | timestamp | attempt | url | changed | latest | previous |
|---|---|---|---|---|---|---|
| 0 | 2025-08-26 22:58:35.447363 | 0 | https://www.r7.com/ | False | 84472923ec9c1dc7d26df927430e5da7fe2e8f874ecb5c... | |
| 1 | 2025-08-26 22:58:37.882526 | 1 | https://www.r7.com/ | True | 829ee75aeaa1f6c9ba9f997975cda3b2c09f61951b3f7e... | 84472923ec9c1dc7d26df927430e5da7fe2e8f874ecb5c... |
| 2 | 2025-08-26 22:58:40.029231 | 2 | https://www.r7.com/ | True | e168c389a4db1e81215ece8a38cad8f2ed4670c9383845... | 829ee75aeaa1f6c9ba9f997975cda3b2c09f61951b3f7e... |
| 3 | 2025-08-26 22:58:42.165686 | 3 | https://www.r7.com/ | True | 95a98da40e130ecd35cf7d2b195811d927f1246377bda5... | e168c389a4db1e81215ece8a38cad8f2ed4670c9383845... |
| 4 | 2025-08-26 22:58:44.308209 | 4 | https://www.r7.com/ | True | 132785ccd38f50202d204e3ff209b7f8ff1700cba79367... | 95a98da40e130ecd35cf7d2b195811d927f1246377bda5... |


| | timestamp | attempt | url | changed | latest | previous |
|---|---|---|---|---|---|---|
| 0 | 2025-08-26 22:58:46.506675 | 0 | https://www.cnnbrasil.com.br/ | False | 8b7388ce2923534c94c21238b0d0a8c4aceb80fe1512b0... | |
| 1 | 2025-08-26 22:58:48.560305 | 1 | https://www.cnnbrasil.com.br/ | False | 8b7388ce2923534c94c21238b0d0a8c4aceb80fe1512b0... | 8b7388ce2923534c94c21238b0d0a8c4aceb80fe1512b0... |
| 2 | 2025-08-26 22:58:50.963133 | 2 | https://www.cnnbrasil.com.br/ | True | 19fa17bf66c79062d67b80019a0cdbef190c1bdcfab3dc... | 8b7388ce2923534c94c21238b0d0a8c4aceb80fe1512b0... |
| 3 | 2025-08-26 22:58:53.010469 | 3 | https://www.cnnbrasil.com.br/ | False | 19fa17bf66c79062d67b80019a0cdbef190c1bdcfab3dc... | 19fa17bf66c79062d67b80019a0cdbef190c1bdcfab3dc... |
| 4 | 2025-08-26 22:58:55.051999 | 4 | https://www.cnnbrasil.com.br/ | False | 19fa17bf66c79062d67b80019a0cdbef190c1bdcfab3dc... | 19fa17bf66c79062d67b80019a0cdbef190c1bdcfab3dc... |
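
The exercise asks for a final verdict per site. As a sketch, the `changed` flags from the four tables above can be reduced with `any`. The dictionary below is copied by hand from those tables, and `flags_by_site` and `verdicts` are hypothetical names. With the DataFrames themselves, `data['changed'].any()` should give the same answer.

```python
# `changed` column of each sequential run, copied from the tables above.
flags_by_site = {
    'https://g1.globo.com/': [False, False, True, False, False],
    'https://noticias.uol.com.br/': [False, True, True, True, True],
    'https://www.r7.com/': [False, True, True, True, True],
    'https://www.cnnbrasil.com.br/': [False, False, True, False, False],
}

# A site counts as modified if any check differed from the previous one.
verdicts = {site: any(flags) for site, flags in flags_by_site.items()}

for site, modified in verdicts.items():
    status = 'modified' if modified else 'not modified'
    print(f'{site}: {status} during the monitored period')
```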


## 2 Parallel

In [16]:
from threading import Thread

In [36]:
# Task
def task(site: str, sites_data: list):
    site_monitor = SiteMonitor()
    site_monitor.monitor(site)

    sites_data.append(site_monitor._data)
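
Because each thread appends when it finishes, `sites_data` ends up in completion order rather than in `SITES` order. One possible alternative is to key the results by site and guard the shared dictionary with a `Lock`. The sketch below uses hypothetical names (`collect`, `results`, `results_lock`, `jobs`) and a stand-in workload instead of `SiteMonitor`.

```python
from threading import Lock, Thread
from time import sleep

results = {}
results_lock = Lock()


def collect(site: str, delay: float):
    """Stand-in for the monitoring task: wait, then store a result by key."""
    sleep(delay)  # simulates network latency and the polling interval
    with results_lock:
        results[site] = f'data for {site}'


# The first site is given the longest delay, so it finishes last.
jobs = [('site-a', 0.03), ('site-b', 0.02), ('site-c', 0.01)]
threads = [Thread(target=collect, args=job) for job in jobs]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Reading by key restores the original order regardless of finish order.
ordered = [results[site] for site, _ in jobs]
print(ordered)
```

In CPython a single `list.append` or dictionary assignment is already atomic, so the lock here is a defensive choice rather than a strict requirement.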

In [37]:
sites_data = []

start_time = perf_counter()

# One thread per site
threads = []
for site in SITES:
    threads.append(
        Thread(
            target = task,
            args = (site, sites_data)
        )
    )

# Start the 4 threads
for thread in threads:
    thread.start()

# Wait until all 4 threads have finished
for thread in threads:
    thread.join()

end_time = perf_counter()

print(f'The tasks took {end_time - start_time:0.2f} second(s) to run.')

[2025-08-26 23:01:48.371327] Monitoring attempt: (0) Url: https://www.r7.com/ Did change?: False | Hashes latest: 0080d9ca3c66792087f8747bc14944f661b05ec7c1f9887143b40281 VS. previous: None
[2025-08-26 23:01:48.497292] Monitoring attempt: (0) Url: https://www.cnnbrasil.com.br/ Did change?: False | Hashes latest: 68cfbc79f32145a6217b37be237f330829e926019685e006c377a873 VS. previous: None
[2025-08-26 23:01:49.005949] Monitoring attempt: (0) Url: https://g1.globo.com/ Did change?: False | Hashes latest: 8121ed7669ace4671f951b66d189d0652584f926a564dbc73f4031dd VS. previous: None
[2025-08-26 23:01:50.539915] Monitoring attempt: (1) Url: https://www.cnnbrasil.com.br/ Did change?: False | Hashes latest: 68cfbc79f32145a6217b37be237f330829e926019685e006c377a873 VS. previous: 68cfbc79f32145a6217b37be237f330829e926019685e006c377a873
[2025-08-26 23:01:50.559998] Monitoring attempt: (1) Url: https://www.r7.com/ Did change?: True | Hashes latest: 6066f85e4a2b0c7ab75082138963c6eb27b6a0694c5c717ce0c15

In [38]:
for data in sites_data:
    display(data)

| | timestamp | attempt | url | changed | latest | previous |
|---|---|---|---|---|---|---|
| 0 | 2025-08-26 23:01:48.495988 | 0 | https://www.cnnbrasil.com.br/ | False | 68cfbc79f32145a6217b37be237f330829e926019685e0... | |
| 1 | 2025-08-26 23:01:50.538819 | 1 | https://www.cnnbrasil.com.br/ | False | 68cfbc79f32145a6217b37be237f330829e926019685e0... | 68cfbc79f32145a6217b37be237f330829e926019685e0... |
| 2 | 2025-08-26 23:01:52.583534 | 2 | https://www.cnnbrasil.com.br/ | False | 68cfbc79f32145a6217b37be237f330829e926019685e0... | 68cfbc79f32145a6217b37be237f330829e926019685e0... |
| 3 | 2025-08-26 23:01:54.625947 | 3 | https://www.cnnbrasil.com.br/ | False | 68cfbc79f32145a6217b37be237f330829e926019685e0... | 68cfbc79f32145a6217b37be237f330829e926019685e0... |
| 4 | 2025-08-26 23:01:56.800885 | 4 | https://www.cnnbrasil.com.br/ | True | eee00cb9f5fc06e4c19b596031be40e9daf2d953f185c2... | 68cfbc79f32145a6217b37be237f330829e926019685e0... |


| | timestamp | attempt | url | changed | latest | previous |
|---|---|---|---|---|---|---|
| 0 | 2025-08-26 23:01:48.369083 | 0 | https://www.r7.com/ | False | 0080d9ca3c66792087f8747bc14944f661b05ec7c1f988... | |
| 1 | 2025-08-26 23:01:50.558822 | 1 | https://www.r7.com/ | True | 6066f85e4a2b0c7ab75082138963c6eb27b6a0694c5c71... | 0080d9ca3c66792087f8747bc14944f661b05ec7c1f988... |
| 2 | 2025-08-26 23:01:52.816854 | 2 | https://www.r7.com/ | True | 332cc470b2212fe4fd510409ccbf73f0cf057d730b49b0... | 6066f85e4a2b0c7ab75082138963c6eb27b6a0694c5c71... |
| 3 | 2025-08-26 23:01:55.046494 | 3 | https://www.r7.com/ | True | 4aa64ad1ef898dbdf881951a8c187e193f514e73429445... | 332cc470b2212fe4fd510409ccbf73f0cf057d730b49b0... |
| 4 | 2025-08-26 23:01:57.235350 | 4 | https://www.r7.com/ | True | 3dd40d9031023602673b0992c56ded69f9ebead5d2ece1... | 4aa64ad1ef898dbdf881951a8c187e193f514e73429445... |


| | timestamp | attempt | url | changed | latest | previous |
|---|---|---|---|---|---|---|
| 0 | 2025-08-26 23:01:51.196982 | 0 | https://noticias.uol.com.br/ | False | 93703a5362b1f8d6685ded8b53af9d0db7823c4e4e34ae... | |
| 1 | 2025-08-26 23:01:53.239700 | 1 | https://noticias.uol.com.br/ | True | 80e8759cf6fe3ffbe7028085241e939679bde97c50c623... | 93703a5362b1f8d6685ded8b53af9d0db7823c4e4e34ae... |
| 2 | 2025-08-26 23:01:55.283190 | 2 | https://noticias.uol.com.br/ | True | bf541c8b8b41d6f7c4153039b2b51a57ada1d8f0e038b5... | 80e8759cf6fe3ffbe7028085241e939679bde97c50c623... |
| 3 | 2025-08-26 23:01:57.327985 | 3 | https://noticias.uol.com.br/ | True | 4762491c299d8157b76db7f2ccabcd8d084a6f42028747... | bf541c8b8b41d6f7c4153039b2b51a57ada1d8f0e038b5... |
| 4 | 2025-08-26 23:01:59.366503 | 4 | https://noticias.uol.com.br/ | True | c41ce4237b67d4d08d3df342f159d04b49c301a6991b2b... | 4762491c299d8157b76db7f2ccabcd8d084a6f42028747... |


| | timestamp | attempt | url | changed | latest | previous |
|---|---|---|---|---|---|---|
| 0 | 2025-08-26 23:01:49.004669 | 0 | https://g1.globo.com/ | False | 8121ed7669ace4671f951b66d189d0652584f926a564db... | |
| 1 | 2025-08-26 23:01:51.835799 | 1 | https://g1.globo.com/ | False | 8121ed7669ace4671f951b66d189d0652584f926a564db... | 8121ed7669ace4671f951b66d189d0652584f926a564db... |
| 2 | 2025-08-26 23:01:54.669964 | 2 | https://g1.globo.com/ | False | 8121ed7669ace4671f951b66d189d0652584f926a564db... | 8121ed7669ace4671f951b66d189d0652584f926a564db... |
| 3 | 2025-08-26 23:01:57.506835 | 3 | https://g1.globo.com/ | True | 93b2fe4cb378874a609b407a2a4ea2a04704b3f7cb25a1... | 8121ed7669ace4671f951b66d189d0652584f926a564db... |
| 4 | 2025-08-26 23:02:00.340534 | 4 | https://g1.globo.com/ | False | 93b2fe4cb378874a609b407a2a4ea2a04704b3f7cb25a1... | 93b2fe4cb378874a609b407a2a4ea2a04704b3f7cb25a1... |
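
The elapsed-time lines are cut off in the captured output, but the timestamps in the tables allow a rough comparison of the two runs. The sketch below computes the span between the first and last recorded check of each run, along with the sleep-only lower bounds. The four timestamps are copied from the tables, and the helper name `span_seconds` is hypothetical.

```python
from datetime import datetime

FORMAT = '%Y-%m-%d %H:%M:%S.%f'


def span_seconds(first: str, last: str) -> float:
    """Seconds between two timestamps in the notebook's log format."""
    delta = datetime.strptime(last, FORMAT) - datetime.strptime(first, FORMAT)
    return delta.total_seconds()


# First and last recorded checks of each run, copied from the tables.
sequential = span_seconds('2025-08-26 22:58:11.305778',
                          '2025-08-26 22:58:55.051999')
parallel = span_seconds('2025-08-26 23:01:48.369083',
                        '2025-08-26 23:02:00.340534')

# Sleep-only lower bounds: 5 checks x 2 s per site, for 4 sites.
sites, checks, interval = 4, 5, 2
sequential_floor = sites * checks * interval  # one site after another
parallel_floor = checks * interval            # all sites sleep concurrently

print(f'sequential: {sequential:.1f} s observed span, {sequential_floor} s floor')
print(f'parallel:   {parallel:.1f} s observed span, {parallel_floor} s floor')
print(f'ratio: {sequential / parallel:.1f}x')
```

These spans exclude the first request and the final `sleep` of each run, so they are not the same quantity that `perf_counter` measured. They only indicate the approximate speedup from running the four monitors concurrently.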
