Update scraper.rb
BfB-Schenefeld committed Apr 22, 2024
1 parent f3adecc commit 7bb2175
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions scraper.rb
@@ -31,6 +31,10 @@ def scrape_top_details(top_url)
puts "Accessing TOP page: #{top_url}"
document = Nokogiri::HTML(open(top_url))

# Extract the complete main content, collapsing whitespace runs into single spaces
main_content = document.at_css('#mainContent').text.strip.gsub(/\s+/, ' ')
puts "Main content: #{main_content}"

# Extract the subject of the submission (Vorlage), if present
vorlagen_betreff_element = document.at_css('span#vobetreff a')
if vorlagen_betreff_element
@@ -42,6 +46,9 @@ def scrape_top_details(top_url)
puts "No submission (Vorlage) available."
["-", "-"]
end

# Return the main content and further details
return main_content
end

# Example URL for a TOP page
@@ -60,3 +67,4 @@ def scrape_top_details(top_url)




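Note on the added extraction line: Nokogiri's `at_css` returns `nil` when no element matches, so calling `.text` directly on its result raises `NoMethodError` on pages without `#mainContent`. The following is a minimal sketch of the whitespace normalization in isolation, not part of this commit. The helper name `normalize_text` and the sample string are hypothetical; the helper accepts `nil` so that a missing node yields an empty string instead of an exception.

```ruby
# Hypothetical helper (not in scraper.rb): applies the same strip + gsub
# normalization as the added line, but tolerates nil input.
def normalize_text(text)
  # nil.to_s is "", so a missing node becomes an empty string.
  text.to_s.strip.gsub(/\s+/, ' ')
end

# Made-up sample resembling text pulled from an HTML page.
raw = "  Tagesordnungspunkt 5\n\n   Beschluss:\t angenommen  "

puts normalize_text(raw)          # prints: Tagesordnungspunkt 5 Beschluss: angenommen
puts normalize_text(nil).inspect  # prints: ""
```

Inside `scrape_top_details` this could be combined with safe navigation, for example `normalize_text(document.at_css('#mainContent')&.text)`, assuming a Ruby version that supports `&.` (2.3 or later).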