Timeouts at seemingly random moments #121

Arsanian · 2019-11-07T15:14:49Z

I'm trying to download a huge number of lyrics for a university project. I have one file per genre, each listing 50 artists whose lyrics I want to download.

So I wrote a Python script that scans the folder and reads the lists one by one, trying to download the lyrics for every artist in these lists.

Sometimes the following happens:

Timeout raised and caught:
HTTPSConnectionPool(host='api.genius.com', port=443): Read timed out. (read timeout=5)
Traceback (most recent call last):
  File "lyricsapi.py", line 54, in <module>
    artist = api.search_artist(a.strip(), max_songs=max_songs, sort="title")
  File "/home/duke/anaconda3/envs/dynamusic/lib/python3.7/site-packages/lyricsgenius/api.py", line 356, in search_artist
    song = Song(info, lyrics)
  File "/home/duke/anaconda3/envs/dynamusic/lib/python3.7/site-packages/lyricsgenius/song.py", line 26, in __init__
    self._body = json_dict['song'] if 'song' in json_dict else json_dict
TypeError: argument of type 'NoneType' is not iterable

This error happens pretty randomly, sometimes after 50 texts, sometimes after 600. Earlier today it happened after downloading 113 texts by Eminem, but in the next try it managed to download all 490 of his songs, just to fail after a few songs from the next artist in line.
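
The last frame of the traceback hints at the mechanism: the timeout appears to have been caught and reported, but the lookup then carried on with `None` in place of the JSON response, and a membership test against `None` raises the `TypeError`. A minimal sketch of that failure and of one possible guard; `extract_body` and `extract_body_safe` are hypothetical stand-ins written for illustration, not LyricsGenius functions:

```python
def extract_body(json_dict):
    # Same expression as the failing line shown in the traceback.
    return json_dict['song'] if 'song' in json_dict else json_dict


def extract_body_safe(json_dict):
    # Hypothetical guard: treat a missing response as "no song".
    if json_dict is None:
        return None
    return extract_body(json_dict)


try:
    extract_body(None)  # what a swallowed timeout would hand over
except TypeError as err:
    message = str(err)

print(message)
print(extract_body_safe(None))
print(extract_body_safe({'song': {'title': 'x'}}))
```

Whether the library should return early or re-raise in this situation is a design question for the maintainers; the sketch only shows why the final error is a `TypeError` rather than a timeout.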

This also happened when I ran the script on my server, which has a separate internet connection.

Version info

Package version 1.7.0
OS: Ubuntu 19.10 (also happened on a 18.04 machine)

mxdillon · 2019-11-16T14:34:45Z

I'm facing the same issue

GiorgioGhisotti · 2019-12-14T15:34:13Z

A workaround for this is to use a try...except block and place the request in a while loop:

artists = []
while True:
    try:
        artists.append(genius.search_artist(artist, max_songs=10000))
        break
    except Exception:  # a bare `except:` would also swallow Ctrl+C
        pass

This will simply retry the call until it works. I successfully used this to scrape the full discography of 50 artists and I didn't run into any further problems.
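
Since a `while True` loop retries forever if the failure is permanent (bad token, no network), a bounded variant may be safer. A minimal sketch, assuming a hypothetical helper `retry_call` that is not part of LyricsGenius: it retries only on the exception types passed in and waits longer after each failure. The builtin `TimeoutError` is used so the example runs offline; which exception actually reaches your code seems to depend on the library version (the first traceback in this thread ends in a `TypeError`, a later one in requests' `Timeout`).

```python
import time


def retry_call(func, *args, retries=5, base_delay=1.0,
               exceptions=(TimeoutError,), sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on the given exceptions.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts
    and re-raises the last exception once `retries` attempts have failed.
    """
    for attempt in range(retries):
        try:
            return func(*args, **kwargs)
        except exceptions:
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Offline demonstration: a stand-in that times out twice, then succeeds.
calls = {"count": 0}


def flaky_search(name):
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated read timeout")
    return "artist:" + name


delays = []  # record the waits instead of actually sleeping
result = retry_call(flaky_search, "Eminem", base_delay=0.5, sleep=delays.append)
print(result, calls["count"], delays)
```

With a real client, the call would presumably look like `retry_call(genius.search_artist, artist, max_songs=10000, exceptions=(...))`, with the tuple filled in according to what your version raises.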

dmlunde · 2020-02-27T16:12:09Z

@Arsanian how did you manage to narrow down the Eminem number of songs to 490?

danielhorizon · 2020-04-23T21:18:56Z

I've tried the above and am still getting a timeout..

"HTTPSConnectionPool(host='api.genius.com', port=443): Read timed out. (read timeout=5)"

Any suggestions? I've tried using a 60-second timeout as well, along with the while loop and try/except.

ArinkB · 2020-10-06T14:50:16Z

I'm having the same issue. My loop pulls lyrics by artist name and song title, then appends them to a list. I have a try/except and the error still pops up. I also have time.sleep(15) just in case.
The code can run anywhere from 30 minutes to 5 hours, so it needs a lot of monitoring.

allerter · 2020-10-06T15:01:18Z

@ArinkB, could you please provide the following info so we can re-create and debug your issue:

the version of LyricsGenius
your traceback
a minimal working script so that we can re-create the error.

ArinkB · 2020-10-06T16:38:00Z

sure, the dataframe:

lyrics = []

def get_lyrics(): #no arguments needed
    while len(lyrics) != len(end_df): 
        genius = lyricsgenius.Genius("API KEY") # call to lyricsgenius
        for track in end_df.values: 
            song = genius.search_song(track[2], track[0])
            try:    
                lyrics.append(song.lyrics) 
            except:
                lyrics.append(np.NAN) 
        time.sleep(40)

The error:
D:\Anaconda\lib\site-packages\lyricsgenius\api\base.py in make_request(self, path, method, params, public_api, **kwargs)
58 except Timeout as e:
59 error = "Request timed out:\n{e}".format(e=e)
---> 60 raise Timeout(error)
61 except HTTPError as e:
62 error = str(e)

Timeout: Request timed out:
HTTPSConnectionPool(host='api.genius.com', port=443): Read timed out. (read timeout=5)

allerter · 2020-10-06T17:58:58Z

@ArinkB, thanks for providing the information. Although the timeout is probably a valid issue, I don't think it's the primary problem with your script. I tested Spotify's Viral 50 songs using your script and here are a couple of things that you could improve:

import lyricsgenius
import numpy as np
from requests.exceptions import Timeout

lyrics = []


def get_lyrics():
    # while len(lyrics) != len(end_df): #1
    genius = lyricsgenius.Genius(token)
    genius.timeout = 15
    genius.sleep_time = 40  # 2
    # or: Genius(token, timeout=15, sleep_time=40)
    for track in end_df.values:
        retries = 0
        while retries < 3:
            try:
                song = genius.search_song(track[2], track[0])
            except Timeout:
                retries += 1
                continue
            if song is not None:
                lyrics.append(song.lyrics)
            else:
                lyrics.append(np.nan)
            break
        else:
            # All three attempts timed out: keep lyrics aligned with end_df.
            lyrics.append(np.nan)

1. This will result in an infinite loop since some songs can't be found, and there's no need for it in the first place.
2. With the genius.sleep_time attribute, there's no need for time.sleep(40) anymore. Also, I don't think there's a need for a 40-sec sleep from the API's end. When I tested your script, I removed the time.sleep(40) line and everything worked fine.

Now your script will search for the songs and, in case of a timeout, retry the search up to three times before moving on to the next song (this should probably be a feature, @johnwmillr).
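
The retry-then-move-on behaviour can be checked without touching the API by passing the search function in as a parameter. A sketch under stated assumptions: `collect_lyrics`, `FakeSong`, and `fake_search` are hypothetical names, `None` stands in for the NaN placeholder, and the builtin `TimeoutError` stands in for `requests.exceptions.Timeout`. It also records which tracks were given up on, so a long run needs less monitoring.

```python
from collections import namedtuple

FakeSong = namedtuple("FakeSong", "lyrics")  # stand-in for a Song object


def collect_lyrics(tracks, search, max_retries=3, timeout_exc=TimeoutError):
    """Return (lyrics, gave_up): one lyrics entry per track, in order.

    Entries are None when the song was not found or every attempt timed
    out; gave_up lists the indices of tracks that timed out every time.
    """
    lyrics, gave_up = [], []
    for index, (title, artist) in enumerate(tracks):
        song = None
        for _ in range(max_retries):
            try:
                song = search(title, artist)
                break
            except timeout_exc:
                continue
        else:
            # The loop ran out of attempts without a successful call.
            gave_up.append(index)
        lyrics.append(song.lyrics if song is not None else None)
    return lyrics, gave_up


# Offline stand-in: "A" succeeds, "B" is not found, "C" always times out.
def fake_search(title, artist):
    if title == "C":
        raise TimeoutError("simulated read timeout")
    return FakeSong("la la") if title == "A" else None


lyrics, gave_up = collect_lyrics([("A", "x"), ("B", "y"), ("C", "z")], fake_search)
print(lyrics, gave_up)
```

With the real client, `search` would presumably be `genius.search_song` and `timeout_exc` the requests `Timeout` class; the indices in `gave_up` can then be retried in a later run.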

ArinkB · 2020-10-06T21:00:21Z

@allerter Thank you! I appreciate your help and insight. It has been pulling for 3 hours now and no issues so far.

NIkitabala · 2020-11-02T13:19:03Z

@ArinkB Hi, can you show me exactly how you use your script? I'm trying to use this solution, but I'm still getting an error.

ArinkB · 2020-11-02T17:17:28Z

@NIkitabala sure,
this is the notebook I used it in, I modified it slightly because my original project plan didn't work out at the time:
https://github.com/ArinkB/Predicting-Song-Skips/blob/master/1_Data%20Acquisition.ipynb

allerter · 2020-11-02T21:41:48Z

Based on this comment that I posted on #168, I think these random timeout errors will be solved by #162. We'll see.

johnwmillr added the bug label Nov 7, 2019

allerter linked a pull request Nov 6, 2020 that will close this issue

reconfigured types and added album, added retries and etc #162

Merged

johnwmillr closed this as completed in #162 Nov 6, 2020

zuziaszwedo mentioned this issue Dec 16, 2022

Request timed out during API call for songs of an artist #252

Closed
