Merge pull request #3995 from wisp3rwind/pr_lyrics_tekstowo_no_crashes
Crash-resilient Tekstowo lyrics source
sampsyo committed Jul 5, 2021
2 parents ed695f2 + c336191 commit 0f9ffee
Showing 2 changed files with 23 additions and 15 deletions.
31 changes: 20 additions & 11 deletions beetsplug/lyrics.py
@@ -442,14 +442,15 @@ def fetch(self, artist, title):
        search_results = self.fetch_url(url)
        if not search_results:
            return None
        song_page_url = self.parse_search_results(search_results)

        song_page_url = self.parse_search_results(search_results)
        if not song_page_url:
            return None

        song_page_html = self.fetch_url(song_page_url)
        if not song_page_html:
            return None

        return self.extract_lyrics(song_page_html)

    def parse_search_results(self, html):
@@ -460,20 +461,27 @@ def parse_search_results(self, html):
        if not soup:
            return None

        song_rows = soup.find("div", class_="content"). \
            find("div", class_="card"). \
            find_all("div", class_="box-przeboje")
        content_div = soup.find("div", class_="content")
        if not content_div:
            return None

        card_div = content_div.find("div", class_="card")
        if not card_div:
            return None

        song_rows = card_div.find_all("div", class_="box-przeboje")
        if not song_rows:
            return None

        song_row = song_rows[0]

        if not song_row:
            return None

        href = song_row.find('a').get('href')
        return self.BASE_URL + href
        link = song_row.find('a')
        if not link:
            return None

        return self.BASE_URL + link.get('href')

    def extract_lyrics(self, html):
        html = _scrape_strip_cruft(html)
@@ -483,10 +491,11 @@ def extract_lyrics(self, html):
        if not soup:
            return None

        c = soup.find("div", class_="song-text")
        if c:
            return c.get_text()
        return None
        lyrics_div = soup.find("div", class_="song-text")
        if not lyrics_div:
            return None

        return lyrics_div.get_text()


def remove_credits(text):
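
The change to parse_search_results replaces one chained lookup with a guard after every step, so a page that lacks the expected markup yields None instead of an AttributeError on None. The following is a minimal sketch of that pattern, not the plugin's code: it uses the standard library's xml.etree.ElementTree on well-formed markup as a stand-in for BeautifulSoup, and the names first_song_url and BASE_URL (with its placeholder value) are hypothetical. The final href check is an extra guard that the commit itself does not include.

```python
import xml.etree.ElementTree as ET

# Placeholder base URL, assumed for illustration only.
BASE_URL = "https://lyrics.example.invalid"


def first_song_url(markup):
    """Return the first search result's absolute URL, or None if any step fails."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return None

    # ElementTree elements with no children are falsy, so compare to None
    # explicitly rather than writing `if not element`.
    content_div = root.find(".//div[@class='content']")
    if content_div is None:
        return None

    card_div = content_div.find(".//div[@class='card']")
    if card_div is None:
        return None

    song_rows = card_div.findall(".//div[@class='box-przeboje']")
    if not song_rows:
        return None

    link = song_rows[0].find(".//a")
    if link is None:
        return None

    # Extra guard (not part of the commit): an anchor without an href
    # would otherwise make the concatenation below raise TypeError.
    href = link.get("href")
    if not href:
        return None

    return BASE_URL + href
```

Each early return corresponds to one way the search page can differ from what the scraper expects, for example a "no results" page that has no card container.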
7 changes: 3 additions & 4 deletions docs/changelog.rst
@@ -212,6 +212,9 @@ Other new things:
* Get ISRC identifiers from musicbrainz
  Thanks to :user:`aereaux`.
* :doc:`/plugins/metasync`: The ``metasync`` plugin now also fetches the ``Date Added`` field from iTunes databases and stores it in the ``itunes_dateadded`` field. Thanks to :user:`sandersantema`.
* :doc:`/plugins/lyrics`: Added Tekstowo.pl lyrics provider. Thanks to various
  people for the implementation and for reporting issues with the initial version.
  :bug:`3344` :bug:`3904` :bug:`3905` :bug:`3994`

.. _py7zr: https://pypi.org/project/py7zr/

@@ -294,8 +297,6 @@ Fixes:
* Removed ``@classmethod`` decorator from dbcore.query.NoneQuery.match method
  failing with AttributeError when called. It is now an instance method.
  :bug:`3516` :bug:`3517`
* :doc:`/plugins/lyrics`: Added Tekstowo.pl lyrics provider
  :bug:`3344`
* :doc:`/plugins/lyrics`: Tolerate missing lyrics div in Genius scraper.
  Thanks to :user:`thejli21`.
  :bug:`3535` :bug:`3554`
@@ -355,8 +356,6 @@ Fixes:
  :bug:`3870`
* Allow equals within ``--set`` value when importing.
  :bug:`2984`
* :doc:`/plugins/lyrics`: Fix crashes for Tekstowo false positives
  :bug:`3904`
* :doc:`/reference/cli`: Remove reference to rarfile version in link
* Fix :bug:`2873`. Duplicates can now generate checksums. Thanks :user:`wisp3rwind`
  for the pointer to how to solve. Thanks to :user:`arogl`.