timeframe doesn't work - Google returned a response with code 429 #596

Open
dbitton opened this issue Jul 30, 2023 · 13 comments

Comments

@dbitton

dbitton commented Jul 30, 2023

Passing the timeframe parameter (I tested every hourly and daily value) breaks the code: Google returns a 429 even on the first call (so this is not over-calling), in code that had been running for a long time without any issues. I am on the latest version of pytrends.

This works:
kw_list=['']
pytrend.build_payload(kw_list)
related_queries = pytrend.related_queries()

This doesn't:
pytrend.build_payload(kw_list, timeframe="now 1-H")
related_queries = pytrend.related_queries()

@Helldez

Helldez commented Jul 31, 2023

Same here, for the past month.

@haghft

haghft commented Jul 31, 2023

At first it returned a 500 error, then it changed to 429.

@2uanDM

2uanDM commented Aug 1, 2023

I switched to scraping with Selenium instead, over a month ago :)))

@mouize

mouize commented Aug 1, 2023

> I switched to scraping with Selenium instead, over a month ago :)))

I'm trying the same with Puppeteer, but it isn't stable: from time to time I still get the 429 in headless mode. Is it fully working for you? Any tips?

@2uanDM

2uanDM commented Aug 1, 2023

I will try switching to headless mode to see whether it works.

First I go to https://trends.google.com/trends/, then to https://trends.google.com.vn/trends/explore to get the cookies, and then I perform a fake search ("shirt", for example). After that I can open another permalink with the geo and time frame I want. I scrape by clicking the CSV download button and parsing that CSV file to get what I need.
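For the CSV-parsing step described above, here is a minimal sketch using only the standard library. It assumes the downloaded file starts with a short preamble (such as a category line), then a blank line, then a header row and data rows. That layout is an assumption about the export, not something confirmed in this thread. The function name `parse_trends_csv` and the sample text are made up for illustration.

```python
import csv
import io


def parse_trends_csv(text):
    """Parse a Trends-style CSV export into a list of row dicts.

    Assumption: an optional preamble, then a blank line, then a header
    row followed by data rows. If no blank line precedes any content,
    the whole text is treated as the table.
    """
    lines = text.splitlines()
    start = 0
    for i, line in enumerate(lines):
        # Skip the preamble only if real content follows the blank line.
        if not line.strip() and any(rest.strip() for rest in lines[i + 1:]):
            start = i + 1
            break
    table = "\n".join(lines[start:])
    return list(csv.DictReader(io.StringIO(table)))


# Made-up sample in the assumed layout (not real Trends data).
SAMPLE = """Category: All categories

Time,shirt: (Vietnam)
2023-08-01T10:00:00,45
2023-08-01T10:01:00,52
"""

rows = parse_trends_csv(SAMPLE)
```

Each row is a plain dict keyed by the header names, so it can be handed to pandas or written elsewhere without further cleanup.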

I found that Google can detect whether an IP address is scraping, so you should wait a random interval between automation steps, and I highly recommend using proxies. An IP can scrape data again without getting a 429 if it "rests" for more than one or two hours after downloading a batch of keywords.
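The two ideas above (random pauses between steps, and letting an IP rest after a batch) could be sketched roughly as follows. The names `random_pause` and `CooldownTracker` are hypothetical helpers, and the two-hour default is only taken from the "one or two hours" observation above, not from any documented Google limit.

```python
import random
import time


def random_pause(min_seconds, max_seconds, rng=random, sleep=time.sleep):
    """Sleep for a random duration in [min_seconds, max_seconds]; return it."""
    delay = rng.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


class CooldownTracker:
    """Track when each IP/proxy finished a batch and whether it has rested."""

    def __init__(self, cooldown_seconds=2 * 60 * 60, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self._rest_started = {}

    def mark_used(self, address):
        """Record that this address just finished a batch of downloads."""
        self._rest_started[address] = self.clock()

    def is_rested(self, address):
        started = self._rest_started.get(address)
        return started is None or self.clock() - started >= self.cooldown_seconds

    def pick(self, addresses):
        """Return the first rested address, or None if all are still resting."""
        for address in addresses:
            if self.is_rested(address):
                return address
        return None
```

In a scraper, `random_pause(2, 8)` would go between automation steps, and `pick(...)` would choose which proxy to use for the next batch.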

@Helldez

Helldez commented Aug 1, 2023

Does it work if you use pytrends with a list of proxies?

@2uanDM

2uanDM commented Aug 1, 2023

> Does it work if you use pytrends with a list of proxies?

I haven't tried it yet, but I don't think it will solve the problem: you can still scrape the data with pytrends if your time frame is more than a year, so I think the problem is that your requests are recognized as coming from a bot, not a client.

@2uanDM

2uanDM commented Aug 1, 2023

I see that pytrends is working well again.

@francksa

francksa commented Aug 1, 2023

Are you sure?

@2uanDM

2uanDM commented Aug 1, 2023

You can try it: my automated program has been crawling with the 1-H time frame for more than 3 hours without 429 errors. If that holds, the problem is in the Google API backend :v

@jeffsnack

jeffsnack commented Oct 17, 2023

So, has this issue been resolved? I tried the aforementioned method of adding the custom header to "dailydata.py", but I still get the 429 error...

It even triggered the error without executing data for six months.

Here is my code and exception.

from pytrends.request import TrendReq
import json
import concurrent.futures
from pytrends import dailydata
import pandas as pd
import time


#pytrend = TrendReq(hl='en-US',tz=360)



df = dailydata.get_daily_data('Rice', 2004, 1, 2022, 9, geo = 'US')
df.to_excel('Rice_6000.xlsx')
print('Complete')

Rice:2004-01-01 2004-01-31
---------------------------------------------------------------------------
TooManyRequestsError                      Traceback (most recent call last)
<ipython-input-1-44c7758dfcd6> in <module>
     11 
     12 
---> 13 df = dailydata.get_daily_data('Rice', 2004, 1, 2022, 9, geo = 'US')
     14 df.to_excel('Rice_6000.xlsx')
     15 print('Complete')

~\Anaconda3\lib\site-packages\pytrends\dailydata.py in get_daily_data(word, start_year, start_mon, stop_year, stop_mon, geo, verbose, wait_time)
    140         if verbose:
    141             print(f'{word}:{timeframe}')
--> 142         results[current] = _fetch_data(pytrends, build_payload, timeframe)
    143         current = last_date_of_month + timedelta(days=1)
    144         sleep(wait_time)  # don't go too fast or Google will send 429s

~\Anaconda3\lib\site-packages\pytrends\dailydata.py in _fetch_data(pytrends, build_payload, timeframe)
     70         else:
     71             fetched = True
---> 72     return pytrends.interest_over_time()
     73 
     74 

~\Anaconda3\lib\site-packages\pytrends\request.py in interest_over_time(self)
    233             method=TrendReq.GET_METHOD,
    234             trim_chars=5,
--> 235             params=over_time_payload,
    236         )
    237 

~\Anaconda3\lib\site-packages\pytrends\dailydata.py in _get_data(self, url, method, trim_chars, **kwargs)
     35 class CustomTrendReq(TrendReq):
     36     def _get_data(self, url, method=TrendReq.GET_METHOD, trim_chars=0, **kwargs):
---> 37         return super()._get_data(url, method=TrendReq.GET_METHOD, trim_chars=trim_chars, headers=headers, **kwargs)
     38 
     39 def get_last_date_of_month(year: int, month: int) -> date:

~\Anaconda3\lib\site-packages\pytrends\request.py in _get_data(self, url, method, trim_chars, **kwargs)
    156         else:
    157             if response.status_code == status_codes.codes.too_many_requests:
--> 158                 raise exceptions.TooManyRequestsError.from_response(response)
    159             raise exceptions.ResponseError.from_response(response)
    160 

TooManyRequestsError: The request failed: Google returned a response with code 429
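One thing that can be tried around a call like the one above is a retry with exponential backoff. This is only a sketch: `call_with_backoff` is a hypothetical helper, not part of pytrends, and the default delays are arbitrary. It will not help if Google rejects the request regardless of rate, as some comments in this thread suggest.

```python
import time


def call_with_backoff(func, retry_on, max_attempts=5, base_delay=60.0,
                      factor=2.0, sleep=time.sleep):
    """Call func(); on retry_on, wait base_delay * factor**attempt and retry.

    Re-raises the last exception once max_attempts calls have failed.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * factor ** attempt)
```

With pytrends, `retry_on` would be `pytrends.exceptions.TooManyRequestsError` (the exception in the traceback above) and `func` something like `lambda: dailydata.get_daily_data('Rice', 2004, 1, 2022, 9, geo='US')`. Note that each retry restarts the whole date range from the beginning.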

@Raidus

Raidus commented Oct 19, 2023

Just be patient. We had this issue a month ago and it disappeared after a few days. If iframe embedding is not working, then crawling is likely not working either.

@Karlheinzniebuhr

Same here, the example from the documentation fails with:

TooManyRequestsError                      Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_23216/3968946112.py in <module>
      8 
      9 # Interest Over Time
---> 10 interest_over_time_df = pytrend.interest_over_time()
     11 print(interest_over_time_df.head())
     12 

c:\ProgramData\anaconda3\envs\ML\lib\site-packages\pytrends\request.py in interest_over_time(self)
    230 
    231         # make the request and parse the returned json
--> 232         req_json = self._get_data(
    233             url=TrendReq.INTEREST_OVER_TIME_URL,
    234             method=TrendReq.GET_METHOD,

c:\ProgramData\anaconda3\envs\ML\lib\site-packages\pytrends\request.py in _get_data(self, url, method, trim_chars, **kwargs)
    157         else:
    158             if response.status_code == status_codes.codes.too_many_requests:
--> 159                 raise exceptions.TooManyRequestsError.from_response(response)
    160             raise exceptions.ResponseError.from_response(response)
    161 

TooManyRequestsError: The request failed: Google returned a response with code 429
