v0.5.5
Extend Publisher Support & Maintenance
This release expands our publisher coverage with 9 new publishers from 7 countries, increasing Fundus' total to 171 supported news outlets.
Alongside the expansion, we maintained existing publishers and enhanced the robustness of the forward crawler to better handle unexpected exceptions when fetching HTML files.
Quality of Life Improvements
- Improve robustness of `fetch` method for `WebSource` by @MaxDall in #875
- Rework break transformation by @MaxDall in #885
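To illustrate what handling unexpected exceptions during fetching can look like, here is a minimal sketch. It is not Fundus' actual `WebSource.fetch` implementation: the names `fetch_robustly` and `fetch_one` are hypothetical, and the downloader is passed in as a plain callable so the idea stays independent of any HTTP library.

```python
import logging
from typing import Callable, Iterable, Iterator, Tuple

logger = logging.getLogger(__name__)


def fetch_robustly(
    urls: Iterable[str],
    fetch_one: Callable[[str], str],  # hypothetical downloader: URL -> HTML
) -> Iterator[Tuple[str, str]]:
    """Yield (url, html) pairs, skipping URLs whose download raises."""
    for url in urls:
        try:
            html = fetch_one(url)
        except Exception as error:
            # Deliberately broad: a single failing page should be logged
            # and skipped instead of ending the whole crawl.
            logger.warning("Skipping %s due to %r", url, error)
            continue
        yield url, html
```

Catching `Exception` rather than a narrow set of network errors is a trade-off: it keeps long crawls alive, at the cost of possibly hiding programming errors, so logging the skipped URL matters.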
New Publishers
🇩🇪
- Add LTO (Legal Tribune Online) publisher by @elias-polyapp in #799
🇻🇳
- Add VN publisher (VnExpress) by @bachthyaglx in #802
🇸🇪
🇮🇩
🇺🇦
🇱🇧
- LBC publisher integrated by @nancyboukamel-ds in #814
🇿🇦
- Add `TheCitizen` by @addie9800 in #847
- Add `EyethuNews` by @addie9800 in #835
- Add `Ilanga` by @addie9800 in #848
Updated Publishers
- Update `Tageblatt` by @addie9800 in #868
- Fix `SeznamZpravy` by @addie9800 in #873
- Fix paragraph selector for `LeMonde` parser by @MaxDall in #878
- Fix paragraph selector for `stern` parser by @MaxDall in #879
- Update `Landesspiegel` parser by @MaxDall in #877
- Update `HankookIlbo` parser by @MaxDall in #882
Deprecated
- Deprecate `NikkanGeadai` by @addie9800 in #872
- Deprecate `LesothoTimes` by @MaxDall in #880
Bug fixes
- Update error message by @addie9800 in #869
- Skip functioning publishers in publisher coverage by @addie9800 in #871
- Fix a bug with `VALID_UNTIL` date in long crawls by @MaxDall in #876
- Fix error message in `BaseParser` by @MaxDall in #881
- Remove unfinished bar in `check_coverage` by @MaxDall in #883
- Ignore capitalization in supported_publishers.md ordering by @addie9800 in #886
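As a rough illustration of the capitalization fix for the ordering in supported_publishers.md, the sketch below sorts names case-insensitively. The helper `order_publishers` is a hypothetical name, not the project's actual code, and the sample names are only taken from this release for demonstration.

```python
from typing import Iterable, List


def order_publishers(names: Iterable[str]) -> List[str]:
    """Sort publisher names alphabetically, ignoring capitalization."""
    # str.casefold normalizes case, so "stern" is compared as if it
    # started with the same case as "Tageblatt".
    return sorted(names, key=str.casefold)


names = ["stern", "Tageblatt", "LeMonde", "Ilanga"]
case_sensitive = sorted(names)        # uppercase letters sort before lowercase
case_insensitive = order_publishers(names)
```

With a plain `sorted`, the lowercase `stern` ends up after every capitalized name; with the `casefold` key it lands between `LeMonde` and `Tageblatt`.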
New Contributors
- @elias-polyapp made their first contribution in #799
- @bachthyaglx made their first contribution in #802
- @rekordii made their first contribution in #803
- @vrdhn91 made their first contribution in #804
- @bucheben made their first contribution in #807
- @nancyboukamel-ds made their first contribution in #814
Full Changelog: v0.5.4...v0.5.5






