Update faz.net.txt (#1381)
The articles now have different layouts, which need different body selectors.
HolgerAusB committed May 21, 2024
1 parent 36d754d commit 8e2db00
Showing 1 changed file with 16 additions and 3 deletions.
19 changes: 16 additions & 3 deletions faz.net.txt
@@ -21,22 +21,34 @@ date: //span[@class='Datum'],/span
single_page_link: //a[contains(@href, 'printPagedArticle')]

# Content is here

body: //article[@class='storytelling']
body: //article[@class='article']//div[contains(@class,'body-elements')] | (//div[contains(@class,'header-teaser__image')])[1] | (//div[contains(concat(' ',normalize-space(@class),' '),' header-teaser ')])[last()]
body: //article[1]
body: //div[@class='Artikel']


# Tidy up before article
strip: //div[@id='FAZHeaderNeu']
strip: //h2[@itemprop='headline']
strip: //span[@class='Datum']
strip: //span[@class='Autor']
strip_id_or_class: ArticlePagerTop
strip_id_or_class: header-detail
strip_id_or_class: intro-text
strip: //button[contains(@class,'image-toggle')]

# General cleanup
strip: //div[@class='clear']
strip: //a[@title='Zur Homepage FAZ.NET']
strip: //iframe
#strip: //iframe
replace_string( · ):
strip_id_or_class: TeaserMore
strip_id_or_class: plista_alternativ
strip_id_or_class: paywall
#strip: //button
strip_id_or_class: header-teaser__image-details
strip_id_or_class: tik4-sharing

# Remove tracking and ads
strip_image_src: /l.gif?
@@ -57,6 +69,7 @@ strip_id_or_class: MultimediaNavigation
strip_id_or_class: IndexTitel
strip_id_or_class: cbx-Author-is-in-article-container-info
strip_id_or_class: BigBox
strip_id_or_class: upper-toolbar

# Fix picture caps and pictures (use better resolution and remove clutter)
strip_id_or_class: LightBoxOverlay
@@ -113,10 +126,10 @@ strip: //footer[@class='tsr-Base_ContentMeta']
strip_id_or_class: aut-Teaser

# Try it yourself
test_url: https://www.faz.net/aktuell/feuilleton/bilder-und-zeiten/lord-byron-in-venedig-als-pilger-popstar-und-poet-19648440.html
test_url: https://www.faz.net/aktuell/politik/europawahl/europawahl-2024-wer-in-die-eu-einzahlt-und-wer-mittel-erhaelt-19707069.html
test_url: http://www.faz.net/aktuell/feuilleton/zum-tod-von-margaret-thatcher-die-reizfigur-12141919.html#Drucken
test_url: http://www.faz.net/aktuell/politik/inland/allensbach-analyse-im-namen-des-volkes-13106492.html
test_url: http://www.faz.net/aktuell/feuilleton/kino/video-filmkritiken/video-filmkritik-when-animals-dream-zerrissene-jugend-13105772.html
test_url: https://www.faz.net/aktuell/feuilleton/debatten/keine-smart-city-in-toronto-google-stadt-ist-abgesagt-16763217.html?GEPC=s5

# Article with F+ Advert
test_url: https://www.faz.net/aktuell/wirtschaft/wohnen/christian-voelkers-der-immobilienmakler-der-superreichen-17778869.html
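
Note on the new body selector: it mixes two ways of testing the class attribute. `contains(@class,'header-teaser__image')` is a plain substring test, while `contains(concat(' ',normalize-space(@class),' '),' header-teaser ')` only matches a whole class token, so it presumably avoids also catching `header-teaser__image`. The following is a minimal Python sketch of that difference, not part of the site config; the helper names are hypothetical and the functions only mimic the two XPath expressions on a raw class string.

```python
def has_class_substring(class_attr: str, needle: str) -> bool:
    """Mimic XPath contains(@class, needle): a plain substring test."""
    return needle in class_attr


def has_class_token(class_attr: str, token: str) -> bool:
    """Mimic contains(concat(' ', normalize-space(@class), ' '), ' token ').

    normalize-space() trims the string and collapses whitespace runs,
    which " ".join(s.split()) approximates here.
    """
    normalized = " ".join(class_attr.split())
    return f" {token} " in f" {normalized} "


if __name__ == "__main__":
    # Sample class strings; the second one is made up for illustration.
    for cls in ("header-teaser__image", "  teaser   header-teaser "):
        print(repr(cls),
              has_class_substring(cls, "header-teaser"),
              has_class_token(cls, "header-teaser"))
```

The substring test accepts both sample strings, while the token test accepts only the one that carries `header-teaser` as a separate class.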
