# Parsing WARC File

### 🛠 Installing Necessary Libraries

Toplam işlenen kayıt: 98984
Toplam geçen süre: 7211.72 saniye
Saniyede ortalama kayıt: 13.73

Bu kayıtlar için başlık, url ve html bilgileri çıkarılarak sqlite db'ye kaydedildi

In [9]:
pip install warcio beautifulsoup4 psycopg2-binary

Note: you may need to restart the kernel to use updated packages.


In [10]:
# from warcio.archiveiterator import ArchiveIterator
# from bs4 import BeautifulSoup
# import sqlite3
# import time

# def process_warc_file_in_chunks(warc_path, db_path, batch_size=100):
#     """
#     WARC dosyasını parça parça işleyerek SQLite'a kaydeder
    
#     Args:
#         warc_path: WARC dosyasının yolu
#         db_path: SQLite veritabanının yolu
#         batch_size: Her defasında işlenecek kayıt sayısı
#     """
#     # SQLite bağlantısını oluştur
#     conn = sqlite3.connect(db_path)
#     c = conn.cursor()
    
#     # Tablo oluştur (eğer yoksa)
#     c.execute('CREATE TABLE IF NOT EXISTS pages (id INTEGER PRIMARY KEY, url TEXT, title TEXT, html TEXT)')
    
#     # İşleme istatistikleri
#     start_time = time.time()
#     total_records = 0
#     batch_records = []
    
#     # WARC dosyasını aç ve kayıtları işle
#     with open(warc_path, 'rb') as stream:
#         for record in ArchiveIterator(stream):
#             if record.rec_type == 'response':
#                 try:
#                     url = record.rec_headers.get_header('WARC-Target-URI')
#                     raw_html = record.content_stream().read().decode('utf-8', errors='replace')
                    
#                     soup = BeautifulSoup(raw_html, 'html.parser')
#                     title = soup.title.string if soup.title else ''
                    
#                     batch_records.append((url, title, raw_html))
#                     total_records += 1
                    
#                     # Belirli bir sayıda kayıt toplandıysa veritabanına kaydet
#                     if len(batch_records) >= batch_size:
#                         c.executemany('INSERT INTO pages (url, title, html) VALUES (?, ?, ?)', batch_records)
#                         conn.commit()
                        
#                         elapsed_time = time.time() - start_time
#                         print(f"İşlenen kayıt: {total_records}, Geçen süre: {elapsed_time:.2f} saniye")
                        
#                         # Belleği temizle
#                         batch_records = []
                
#                 except Exception as e:
#                     print(f"Hata oluştu: {e} - URL: {url if 'url' in locals() else 'bilinmiyor'}")
    
#     # Kalan kayıtları kaydet
#     if batch_records:
#         c.executemany('INSERT INTO pages (url, title, html) VALUES (?, ?, ?)', batch_records)
#         conn.commit()
    
#     # İstatistikleri göster
#     total_time = time.time() - start_time
#     print(f"Toplam işlenen kayıt: {total_records}")
#     print(f"Toplam geçen süre: {total_time:.2f} saniye")
#     print(f"Saniyede ortalama kayıt: {total_records/total_time:.2f}")
    
#     conn.close()

# # Kodu çalıştır
# warc_file = "./website_data.warc"
# db_file = "websites.db"
# process_warc_file_in_chunks(warc_file, db_file, batch_size=500)

### WARC dosyasının parse edilip PostgreSQL DB'ye kaydedilmesi

In [11]:
from warcio.archiveiterator import ArchiveIterator
from bs4 import BeautifulSoup
import psycopg2
import time
from psycopg2.extras import execute_values
from tqdm import tqdm  # tqdm kütüphanesini ekleyin

def process_warc_file(warc_path, db_params, batch_size=100):
    """
    WARC dosyasını parça parça işleyerek PostgreSQL'e kaydeder
    
    Args:
        warc_path: WARC dosyasının yolu
        db_params: PostgreSQL bağlantı parametreleri (dict)
        batch_size: Her defasında işlenecek kayıt sayısı
    """
    # PostgreSQL bağlantısını oluştur
    conn = psycopg2.connect(**db_params)
    print("connection success", conn)
    cur = conn.cursor()
    print("cursor success", cur)
    
    # İşleme istatistikleri
    start_time = time.time()
    total_records = 0
    batch_records = []
    
    # WARC dosyasını aç ve kayıtları işle
    with open(warc_path, 'rb') as stream:
        # tqdm ile ilerleme çubuğunu ekleyin
        print("started processing warc file");
        for record in tqdm(ArchiveIterator(stream), desc="İşlenen Kayıtlar"):
            if record.rec_type == 'response':
                try:
                    url = record.rec_headers.get_header('WARC-Target-URI')
                    raw_html = record.content_stream().read().decode('utf-8', errors='replace')
                    # remove null characters
                    raw_html = raw_html.replace('\x00', '')
                    
                    soup = BeautifulSoup(raw_html, 'html.parser')
                    title = soup.title.string if soup.title else ''
                    
                    batch_records.append((url, title, raw_html))
                    total_records += 1
                    
                    # Belirli bir sayıda kayıt toplandıysa veritabanına kaydet
                    if len(batch_records) >= batch_size:
                        print("inserting records to db", len(batch_records));
                        execute_values(cur,
                            'INSERT INTO pages (url, title, html) VALUES %s',
                            batch_records,
                            template='(%s, %s, %s)'
                        )
                        conn.commit()
                        
                        # Belleği temizle
                        batch_records = []
            
                except Exception as e:
                    print(f"Hata oluştu: {e} - URL: {url if 'url' in locals() else 'bilinmiyor'}")
    
    # Kalan kayıtları kaydet
    if batch_records:
        execute_values(cur,
            'INSERT INTO pages (url, title, html) VALUES %s',
            batch_records,
            template='(%s, %s, %s)'
        )
        conn.commit()
    
    # İstatistikleri göster
    total_time = time.time() - start_time
    print(f"Toplam işlenen kayıt: {total_records}")
    print(f"Toplam geçen süre: {total_time:.2f} saniye")
    print(f"Saniyede ortalama kayıt: {total_records/total_time:.2f}")
    
    cur.close()
    conn.close()

In [12]:
# PostgreSQL bağlantı parametreleri
db_params = {
    'dbname': 'websites',
    'user': 'postgres',
    'password': '1234567',
    'host': 'localhost',
    'port': '5432'
}

# WARC dosyasının yolu
warc_file = "../website_data.warc"

# Kodu çalıştır
process_warc_file(warc_file, db_params, batch_size=500)

connection success <connection object at 0x10d2a2650; dsn: 'user=postgres password=xxx dbname=websites host=localhost port=5432', closed: 0>
cursor success <cursor object at 0x115a9d030; closed: 0>
started processing warc file


İşlenen Kayıtlar: 0it [00:00, ?it/s]

İşlenen Kayıtlar: 497it [00:41,  5.05it/s]

inserting records to db 500


İşlenen Kayıtlar: 999it [01:22, 18.14it/s]

inserting records to db 500


İşlenen Kayıtlar: 1293it [01:47, 29.37it/s]

Hata oluştu: The markup you provided was rejected by the parser. Trying a different parser or a different encoding may help.

Original exception(s) from parser:
 AssertionError: expected name token at '<![��P���a�9\x02l~Q����' - URL: https://files.dainikshiksha.com/136301/conversions/najirpur-thumb.webp


İşlenen Kayıtlar: 1499it [02:03,  5.19it/s]

inserting records to db 500


İşlenen Kayıtlar: 1501it [02:05,  3.61it/s]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/thyssenkrupp-aktie-kursbewegung-02-01-2025-11201173
inserting records to db 501
Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/de/apps/tag/fleet-tracking-software?country=ms
inserting records to db 502


İşlenen Kayıtlar: 1503it [02:07,  2.23it/s]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/bilgi/insanlarin-97si-bulamiyor-harflerin-arasinda-5-tanesi-saklaniyor
inserting records to db 503


İşlenen Kayıtlar: 1504it [02:08,  1.80it/s]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: http://www.samanyoluhaber.com/eski-istihbaratcisi-sabri-uzun-adli-kontrolle-serbest-birakildi-bassavcilik-itiraz-edecek-haberi/1472095/
inserting records to db 504


İşlenen Kayıtlar: 1505it [02:09,  1.58it/s]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/eidiseis/348833/panakriba-dora-gia-ton-tzo-mpainten-apo-xenoys-igetes-alla-akribotero-pige-stin
inserting records to db 505


İşlenen Kayıtlar: 1506it [02:11,  1.37it/s]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204728-amundi-index-msci-emerging-markets-ucits-etf-dr-usd-d-net-asset-value-s-015.htm
inserting records to db 506


İşlenen Kayıtlar: 1507it [02:12,  1.18it/s]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/thyssenkrupp-aktie-kursbewegung-02-01-2025-11202068
inserting records to db 507


İşlenen Kayıtlar: 1508it [02:13,  1.10it/s]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/de/apps/tag/freight-management-software?country=ms
inserting records to db 508


İşlenen Kayıtlar: 1509it [02:14,  1.04it/s]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/bilgi/o-sarj-aletleri-icin-kritik-uyari-hayati-risk-tasiyorlar
inserting records to db 509


İşlenen Kayıtlar: 1510it [02:15,  1.02s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: http://www.samanyoluhaber.com/las-vegas-saldirgani-abd-ozel-kuvvetler-askeri-cikti-haberi/1472093/
inserting records to db 510


İşlenen Kayıtlar: 1511it [02:17,  1.05s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/eidiseis/348844/thessaloniki-peripoy-6000-hristoygenniatika-dentra-ektimatai-pos-tha-anakyklothoyn
inserting records to db 511


İşlenen Kayıtlar: 1512it [02:18,  1.07s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204784-zerez-eintrag-ab-1-februar-fuer-alle-photovoltaik-anlagen-verbindlich-032.htm
inserting records to db 512


İşlenen Kayıtlar: 1513it [02:19,  1.07s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/en/apps/tag/fleet-maintenance-software?country=ms
inserting records to db 513


İşlenen Kayıtlar: 1514it [02:20,  1.12s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/thyssenkrupp-aktie-kursbewegung-03-01-2025-11210144
inserting records to db 514


İşlenen Kayıtlar: 1515it [02:21,  1.09s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/bilgi/zamli-memur-maaslari-2025-ogretmen-muhendis-doktor-maaslari-ne-kadar-oldu
inserting records to db 515


İşlenen Kayıtlar: 1516it [02:22,  1.10s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: http://www.samanyoluhaber.com/ocalanla-gorusen-dem-parti-bahceliden-neler-istedi-rudaw-yazdi-haberi/1472094/
inserting records to db 516


İşlenen Kayıtlar: 1517it [02:24,  1.27s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/epiheiriseis/348827/nyt-o-mpainten-mplokarei-tin-exagora-tis-us-steel-apo-tin-iaponiki-nippon
inserting records to db 517


İşlenen Kayıtlar: 1518it [02:25,  1.25s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204800-arbeitslosigkeit-legt-jahreszeitlich-bedingt-zu-003.htm
inserting records to db 518


İşlenen Kayıtlar: 1519it [02:26,  1.22s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/en/apps/tag/fleet-management-software?country=ms
inserting records to db 519


İşlenen Kayıtlar: 1520it [02:27,  1.18s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/dunya/almanya-disisleri-bakani-baerbock-sama-celik-yelekle-geldi
inserting records to db 520


İşlenen Kayıtlar: 1521it [02:29,  1.24s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/tilray-aktie-kursbewegung-02-01-2025-11202144
inserting records to db 521


İşlenen Kayıtlar: 1522it [02:30,  1.20s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: http://www.samanyoluhaber.com/tutuklama-emri-guney-koreyi-karistirdi-baskanlik-guvenlik-servisiyle-polis-karsi-karsiya-haberi/1472091/
inserting records to db 522


İşlenen Kayıtlar: 1523it [02:31,  1.17s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/epiheiriseis/348835/terna-energeiaki-i-masdar-apektise-289960-metohes-axias-58-ekat-eyro
inserting records to db 523


İşlenen Kayıtlar: 1524it [02:32,  1.19s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204801-eil-im-dezember-170-000-mehr-arbeitslose-als-vor-einem-jahr-003.htm
inserting records to db 524


İşlenen Kayıtlar: 1525it [02:33,  1.23s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/en/apps/tag/fleet-tracking-software?country=ms
inserting records to db 525


İşlenen Kayıtlar: 1526it [02:35,  1.21s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/dunya/guney-afrikada-ucakta-daha-fazla-alkol-verilmeyen-yolcu-olay-cikardi
inserting records to db 526


İşlenen Kayıtlar: 1527it [02:36,  1.18s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/epiheiriseis/348841/i-gek-terna-anakoinonei-tin-apohorisi-toy-k-gagik-apkarian
inserting records to db 527


İşlenen Kayıtlar: 1528it [02:37,  1.16s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204802-deutschland-nimmt-an-trumps-amtseinfuehrung-teil-003.htm
inserting records to db 528


İşlenen Kayıtlar: 1529it [02:38,  1.16s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/en/apps/tag/freight-management-software?country=ms
inserting records to db 529


İşlenen Kayıtlar: 1530it [02:39,  1.16s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/dunya/samin-ara-sokaklarinin-altindaki-tunelden-dikkat-ceken-goruntuler
inserting records to db 530


İşlenen Kayıtlar: 1531it [02:40,  1.18s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/tuerkei-hohe-inflation-schwaecht-sich-sechsten-monat-in-folge-ab-14126051
inserting records to db 531


İşlenen Kayıtlar: 1532it [02:42,  1.19s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/epiheiriseis/348847/tayros-gia-ti-theon-i-edison-timi-stohos-sta-2120-eyro
inserting records to db 532


İşlenen Kayıtlar: 1533it [02:43,  1.17s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204803-suedkorea-soldaten-verhindern-verhaftung-von-yoon-003.htm
inserting records to db 533


İşlenen Kayıtlar: 1534it [02:44,  1.15s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/es/apps/tag/fleet-maintenance-software?country=ms
inserting records to db 534


İşlenen Kayıtlar: 1535it [02:45,  1.15s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/egitim-haberleri/ogrencilere-ulasim-destegi-ust-limiti-1250-liradan-1900-liraya-yukseltildi
inserting records to db 535


İşlenen Kayıtlar: 1536it [02:46,  1.18s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/tui-aktie-kursbewegung-02-01-2025-12205846
inserting records to db 536


İşlenen Kayıtlar: 1537it [02:47,  1.17s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/oikonomia/348092/me-blemma-stis-anadyomenes-agores-poy-stoheyei-i-oikonomiki-diplomatia-2025
inserting records to db 537


İşlenen Kayıtlar: 1538it [02:48,  1.18s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204804-baerbock-reist-ueberraschend-nach-syrien-003.htm
inserting records to db 538


İşlenen Kayıtlar: 1539it [02:50,  1.18s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/es/apps/tag/fleet-management-software?country=ms
inserting records to db 539


İşlenen Kayıtlar: 1540it [02:51,  1.17s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/ekonomi/2025-yili-sgk-prim-ve-odenek-tutarlari
inserting records to db 540


İşlenen Kayıtlar: 1541it [02:52,  1.31s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/tui-aktie-kursbewegung-02-01-2025-12209256
inserting records to db 541


İşlenen Kayıtlar: 1542it [02:54,  1.30s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/oikonomia/348832/i-oykrania-diekopse-tis-roes-rosikoy-aerioy-pros-tin-eyropi-poios-zimionetai
inserting records to db 542


İşlenen Kayıtlar: 1543it [02:55,  1.31s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204806-eqs-news-p-p-group-frequenz-rekordergebnis-fuer-erlebniscenter-flair-fuerth-im-weihnachtsgeschaeft-ebenfalls-spitzenwert-bei-wunschbaum-aktion-zuguns-022.htm
inserting records to db 543


İşlenen Kayıtlar: 1544it [02:56,  1.28s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/es/apps/tag/fleet-tracking-software?country=ms
inserting records to db 544


İşlenen Kayıtlar: 1545it [02:57,  1.24s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/ekonomi/aralik-2024te-fiyati-en-fazla-artan-ve-dusen-urunler-belli-oldu
inserting records to db 545


İşlenen Kayıtlar: 1546it [02:59,  1.24s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/tui-aktie-kursbewegung-03-01-2025-12208617
inserting records to db 546


İşlenen Kayıtlar: 1547it [03:00,  1.20s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/oikonomia/348836/toyrkia-o-plithorismos-frenare-perissotero-toy-anamenomenoy-ton-dekembrio
inserting records to db 547


İşlenen Kayıtlar: 1548it [03:01,  1.17s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204815-deutschland-zahl-der-arbeitslosen-steigt-im-dezember-016.htm
inserting records to db 548


İşlenen Kayıtlar: 1549it [03:02,  1.16s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/ekonomi/mehmet-simsekten-2024-yili-muhasebesi
inserting records to db 549


İşlenen Kayıtlar: 1550it [03:03,  1.19s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://dunya.com.pk/index.php/city/lahore/2025-01-03/2456098
inserting records to db 550


İşlenen Kayıtlar: 1551it [03:05,  1.22s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.star.com.tr/guncel/20-yildir-kayipti-nezaket-uyur-olayi-cozuldu-olduruldugu-ortaya-cikti-haber-1915975/
inserting records to db 551


İşlenen Kayıtlar: 1552it [03:06,  1.37s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/uber-aktie-kursbewegung-02-01-2025-11202141
inserting records to db 552


İşlenen Kayıtlar: 1553it [03:08,  1.37s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/es/apps/tag/freight-management-software?country=ms
inserting records to db 553


İşlenen Kayıtlar: 1554it [03:09,  1.36s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/oikonomia/348837/allazo-systima-thermansis-kai-thermosifona-sti-dimosiotita-o-odigos-toy
inserting records to db 554


İşlenen Kayıtlar: 1555it [03:10,  1.33s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.globenewswire.com/news-release/2025/01/02/3003360/0/sv/Virtune-AB-Publ-Virtune-har-genomf%C3%B6rt-den-m%C3%A5natliga-rebalanseringen-f%C3%B6r-december-2024-av-Virtune-Crypto-Altcoin-Index-ETP.html
inserting records to db 555


İşlenen Kayıtlar: 1556it [03:11,  1.27s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204816-schweizer-einkaufsmanagerindizes-im-dezember-uneinheitlich-095.htm
inserting records to db 556


İşlenen Kayıtlar: 1557it [03:13,  1.24s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/gundem/antalya-2024-yilinda-17-milyon-ziyaretci-ile-rekor-kirdi
inserting records to db 557


İşlenen Kayıtlar: 1558it [03:14,  1.33s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://dunya.com.pk/index.php/city/lahore/2025-01-03/2456099
inserting records to db 558


İşlenen Kayıtlar: 1559it [03:15,  1.28s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.star.com.tr/guncel/istanbul-bogazinda-gemi-trafigi-gecici-olarak-cift-yonde-askiya-alindi-haber-1915955/
inserting records to db 559


İşlenen Kayıtlar: 1560it [03:17,  1.27s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/ueberblick-am-abend-konjunktur-zentralbanken-politik-14125398
inserting records to db 560


İşlenen Kayıtlar: 1561it [03:18,  1.24s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/fr/apps/tag/fleet-maintenance-software?country=ms
inserting records to db 561


İşlenen Kayıtlar: 1562it [03:19,  1.24s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/oikonomia/348842/neoi-agrotes-2021-allages-stis-desmeyseis-poy-aforoyn-ton-kyklo-ergasion
inserting records to db 562


İşlenen Kayıtlar: 1563it [03:20,  1.20s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.globenewswire.com/news-release/2025/01/03/3003871/0/sv/Virtune-AB-Publ-Virtune-har-genomf%C3%B6rt-den-m%C3%A5natliga-rebalanseringen-f%C3%B6r-december-2024-av-Virtune-Crypto-Top-10-Index-ETP-Nordens-f%C3%B6rsta-kryptoindex-ETP.html
inserting records to db 563


İşlenen Kayıtlar: 1564it [03:21,  1.18s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204817-zahl-der-firmenkonkurse-steigt-im-2024-auf-hoechststand-095.htm
inserting records to db 564


İşlenen Kayıtlar: 1565it [03:22,  1.16s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/gundem/bakan-yumakli-duyurdu-istilaci-balon-baligi-avlayan-balikcilara-2-yil-daha-destekleme
inserting records to db 565


İşlenen Kayıtlar: 1566it [03:23,  1.16s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/fr/apps/tag/fleet-management-software?country=ms
inserting records to db 566


İşlenen Kayıtlar: 1567it [03:25,  1.19s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.star.com.tr/guncel/kayaliktan-dustu-dedigi-karisini-iterek-oldurmustu-cezasi-belli-oldu-haber-1915974/
inserting records to db 567


İşlenen Kayıtlar: 1568it [03:26,  1.30s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/ueberblick-am-mittag-konjunktur-zentralbanken-politik-14124449
inserting records to db 568


İşlenen Kayıtlar: 1569it [03:28,  1.36s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.zdf.de/nachrichten/heute-sendungen/aussenministerin-annalena-baerbock-besuch-syrien-video-100.html
inserting records to db 569


İşlenen Kayıtlar: 1570it [03:29,  1.34s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://dunya.com.pk/index.php/city/lahore/2025-01-03/2456100
inserting records to db 570


İşlenen Kayıtlar: 1571it [03:30,  1.33s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/oikonomia/348848/sto-96-ypohorise-i-anergia-ton-noembrio
inserting records to db 571


İşlenen Kayıtlar: 1572it [03:32,  1.33s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204818-ch-eroeffnung-smi-startet-freundlich-ins-boersenjahr-2025-095.htm
inserting records to db 572


İşlenen Kayıtlar: 1573it [03:33,  1.26s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/gundem/ferdi-tayfurun-cenazeci-istanbula-getirildi
inserting records to db 573


İşlenen Kayıtlar: 1574it [03:34,  1.23s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/fr/apps/tag/fleet-tracking-software?country=ms
inserting records to db 574


İşlenen Kayıtlar: 1575it [03:35,  1.19s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.star.com.tr/guncel/msb-acikladi-anitkabire-rekor-ziyaret-haber-1915969/
inserting records to db 575


İşlenen Kayıtlar: 1576it [03:36,  1.23s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/ueberblick-am-morgen-konjunktur-zentralbanken-politik-14126064
inserting records to db 576


İşlenen Kayıtlar: 1577it [03:38,  1.31s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.zdf.de/nachrichten/heute-sendungen/berlin-explosion-verletzte-polizisten-video-100.html
inserting records to db 577


İşlenen Kayıtlar: 1578it [03:39,  1.31s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/politiki/348846/sti-damasko-oi-ypex-gallias-kai-germanias-gia-na-synantithoyn-me-neo-igeti-tis
inserting records to db 578


İşlenen Kayıtlar: 1579it [03:41,  1.32s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://dunya.com.pk/index.php/city/lahore/2025-01-03/2456101
inserting records to db 579


İşlenen Kayıtlar: 1580it [03:42,  1.33s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204829-eqs-afr-yoc-ag-vorabbekanntmachung-ueber-die-veroeffentlichung-von-finanzberichten-gemaess-114-115-117-wphg-022.htm
inserting records to db 580


İşlenen Kayıtlar: 1581it [03:43,  1.32s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/gundem/maliyet-olarak-cin-otoyoluyla-karsilastirilan-istanbul-izmir-otoyolu-iddialarinin-asli
inserting records to db 581


İşlenen Kayıtlar: 1582it [03:44,  1.27s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/fr/apps/tag/freight-management-software?country=ms
inserting records to db 582


İşlenen Kayıtlar: 1583it [03:45,  1.24s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.star.com.tr/guncel/otoyol-ve-koprulerden-gecis-yapan-arac-sayisi-belli-oldu-haber-1915960/
inserting records to db 583


İşlenen Kayıtlar: 1584it [03:47,  1.27s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/ukraine-meldet-beschuss-von-kommandostelle-bei-kursk-14125117
inserting records to db 584


İşlenen Kayıtlar: 1585it [03:48,  1.23s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.zdf.de/nachrichten/heute-sendungen/brasilien-waldbraende-2024-flaeche-deutschlands-verbrannt-video-100.html
inserting records to db 585


İşlenen Kayıtlar: 1586it [03:49,  1.22s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/real-estate/348828/anakainizo-noikiazo-neo-programma-gia-12500-kleista-spitia
inserting records to db 586


İşlenen Kayıtlar: 1587it [03:51,  1.28s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://dunya.com.pk/index.php/city/lahore/2025-01-03/2456206
inserting records to db 587


İşlenen Kayıtlar: 1588it [03:52,  1.24s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204857-amundi-jpx-nikkei-400-ucits-etf-daily-hedged-usd-c-net-asset-value-s-015.htm
inserting records to db 588


İşlenen Kayıtlar: 1589it [03:53,  1.21s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/gundem/suriye-ve-orta-dogudaki-krizlerin-mimari-kasim-suleymaninin-kanli-gecmisi
inserting records to db 589


İşlenen Kayıtlar: 1590it [03:54,  1.27s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/he/apps/tag/fleet-maintenance-software?country=ms
inserting records to db 590


İşlenen Kayıtlar: 1591it [03:55,  1.23s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.star.com.tr/guncel/sabri-uzuna-tutuklama-talebi-haber-1915958/
inserting records to db 591


İşlenen Kayıtlar: 1592it [03:57,  1.27s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/ukraine-wehrt-russische-drohnen-ab-14124253
inserting records to db 592


İşlenen Kayıtlar: 1593it [03:58,  1.23s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.zdf.de/nachrichten/heute-sendungen/busunfall-verletzte-baden-wuerttemberg-schnee-video-100.html
inserting records to db 593


İşlenen Kayıtlar: 1594it [03:59,  1.20s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/tax-labour/348845/aade-i-diarkeia-misthosis-basiko-kritirio-gia-ti-forologiki-antimetopisi-ton
inserting records to db 594


İşlenen Kayıtlar: 1595it [04:00,  1.21s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204858-amundi-global-government-inflation-linked-bond-1-10y-ucits-etf-dist-net-asset-value-s-015.htm
inserting records to db 595


İşlenen Kayıtlar: 1596it [04:01,  1.19s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/ic-haber/kutahyada-banka-gorevlisi-gibi-davranip-50-bin-lira-dolandirdilar
inserting records to db 596


İşlenen Kayıtlar: 1597it [04:03,  1.19s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://dunya.com.pk/index.php/city/lahore/2025-01-03/2456208
inserting records to db 597


İşlenen Kayıtlar: 1598it [04:04,  1.18s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/he/apps/tag/fleet-management-software?country=ms
inserting records to db 598


İşlenen Kayıtlar: 1599it [04:05,  1.23s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/ukraine-will-exporte-trotz-krieges-weiter-erhoehen-14125076
inserting records to db 599


İşlenen Kayıtlar: 1600it [04:06,  1.25s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.zdf.de/nachrichten/heute-sendungen/fbi-new-orleans-einzeltaeter-amokfahrt-video-100.html
inserting records to db 600


İşlenen Kayıtlar: 1601it [04:08,  1.23s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/tax-labour/348850/oi-5-stratigikoi-stohoi-tis-aade-eos-2029
inserting records to db 601


İşlenen Kayıtlar: 1602it [04:09,  1.21s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/kralspor/futbol/recep-ucardan-transfer-sozleri
inserting records to db 602


İşlenen Kayıtlar: 1603it [04:10,  1.22s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204859-amundi-russell-2000-ucits-etf-usd-c-net-asset-value-s-015.htm
inserting records to db 603


İşlenen Kayıtlar: 1604it [04:11,  1.22s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://dunya.com.pk/index.php/city/lahore/2025-01-03/2456211
inserting records to db 604


İşlenen Kayıtlar: 1605it [04:12,  1.21s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/he/apps/tag/fleet-tracking-software?country=ms
inserting records to db 605


İşlenen Kayıtlar: 1606it [04:14,  1.26s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/unfallforscherin-verkehrstote-gehen-uns-alle-an-14125935
inserting records to db 606


İşlenen Kayıtlar: 1607it [04:15,  1.21s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.zdf.de/nachrichten/heute-sendungen/suedkorea-praesident-yoon-festnahme-gescheitert-militaer-video-100.html
inserting records to db 607


İşlenen Kayıtlar: 1608it [04:16,  1.18s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/toyrismos/348831/kroyaziera-nea-istorika-ypsila-2024-oi-ektimiseis-kai-oi-problimatismoi-gia-neo
inserting records to db 608


İşlenen Kayıtlar: 1609it [04:17,  1.17s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.ensonhaber.com/kralspor/futbol/rizespor-besiktas-macinin-vari-sarper-baris-saka
inserting records to db 609


İşlenen Kayıtlar: 1610it [04:18,  1.16s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanznachrichten.de/nachrichten-2025-01/64204860-amundi-global-aggregate-green-bond-1-10y-ucits-etf-gbp-hedged-dist-net-asset-value-s-015.htm
inserting records to db 610


İşlenen Kayıtlar: 1611it [04:19,  1.15s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://dunya.com.pk/index.php/city/lahore/2025-01-03/2456212
inserting records to db 611


İşlenen Kayıtlar: 1612it [04:21,  1.20s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://webcatalog.io/he/apps/tag/freight-management-software?country=ms
inserting records to db 612


İşlenen Kayıtlar: 1613it [04:22,  1.34s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.finanzen.net/nachricht/aktien/unilever-aktie-kursbewegung-02-01-2025-11201216
inserting records to db 613


İşlenen Kayıtlar: 1614it [04:24,  1.31s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.zdf.de/nachrichten/heute-sendungen/syrien-baerbock-besuch-keine-finanzversprechen-erwartet-video-100.html
inserting records to db 614


İşlenen Kayıtlar: 1615it [04:25,  1.30s/it]

Hata oluştu: A string literal cannot contain NUL (0x00) characters. - URL: https://www.insider.gr/toyrismos/348840/i-skopelos-anamesa-stoys-10-proorismoys-tis-eyropis-me-tis-kalyteres-paralies-gia
inserting records to db 615


İşlenen Kayıtlar: 1615it [04:25,  6.07it/s]


KeyboardInterrupt: 