Zeroes returned for blocks of dates #440

Open
coleeagland opened this issue Dec 5, 2022 · 8 comments

@coleeagland

coleeagland commented Dec 5, 2022

I am seeing blocks of zeroes returned in the interest_over_time data that don't make sense to me.

R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)
gtrendsR 1.5.1

Edit: I was trying to use a bit of shorthand here, but on rereading, that makes it less clear. I am looking specifically at interest_over_time.

gtrends("crib", geo = "US")$interest_over_time returns

[image: interest_over_time output for "crib", with zeroes starting 2019-05-05]

In all, there are 87 straight weeks of zeroes. It is not just this term - it also happened with the search term "espresso". The image shows the zeroes starting 2019-05-05 for the search term "crib". Oddly, espresso is missing the 87 weeks up until 2019-04-28 - the week before 2019-05-05.

[image]

gtrends("crib")$interest_over_time does not have this same issue with these search terms - but it does happen with other search terms.

Things I have tried:

  1. Confirmed that the Google Trends website does not have the issue
  2. Python's pytrends - same result! So it doesn't seem specific to this package, but obviously still an issue.
  3. Different computer (same result)
  4. Called and asked someone I know to try it for me on their computer (same result).

It does not happen with every search term - I'd love to share some kind of pattern, but I'm just not seeing it.

I suspect this issue might not exist tomorrow with these terms but will be found on others, but... hard to say before tomorrow. I looked at some data I've saved from gtrends() calls in the past and this didn't seem to be happening in August but was happening at the beginning of October.

@PMassicotte
Owner

I took a quick look, and there is not much difference between our query and Google's.

Google query using the webpage:

{
  "time": "2017-12-06 2022-12-06",
  "resolution": "WEEK",
  "locale": "en-US",
  "comparisonItem": [
    {
      "geo": { "country": "US" },
      "complexKeywordsRestriction": {
        "keyword": [{ "type": "BROAD", "value": "crib" }]
      }
    }
  ],
  "requestOptions": { "property": "", "backend": "IZG", "category": 0 },
  "userConfig": { "userType": "USER_TYPE_LEGIT_USER" }
}

Our query:

{
  "time": "2017-12-06 2022-12-06",
  "resolution": "WEEK",
  "locale": "en-US",
  "comparisonItem": [
    {
      "geo": { "country": "US" },
      "complexKeywordsRestriction": {
        "keyword": [{ "type": "BROAD", "value": "crib" }]
      }
    }
  ],
  "requestOptions": { "category": 0, "backend": "IZG", "property": "" },
  "userConfig": { "userType": "USER_TYPE_SCRAPER" }
}

The only difference I can see is the userType. It seems that Google is able to detect that we are scraping their data. I could not find a way to bypass this, but I suspect it is related to the token request:

widget <- curl::curl_fetch_memory(url, handle = .pkgenv[["cookie_handler"]])

If anyone has a solution, I would be happy to look at it.
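The field-by-field comparison of the two payloads above can be sketched in base R. This is only an illustration: `diff_fields` is a hypothetical helper (not part of gtrendsR), the two lists are hand-transcribed from the JSON shown above, and the approach assumes a single keyword per query so that flattened field names are unique.

```r
# The two payloads from the comment above, transcribed as nested R lists.
google <- list(
  time = "2017-12-06 2022-12-06",
  resolution = "WEEK",
  locale = "en-US",
  comparisonItem = list(list(
    geo = list(country = "US"),
    complexKeywordsRestriction = list(
      keyword = list(list(type = "BROAD", value = "crib"))
    )
  )),
  requestOptions = list(property = "", backend = "IZG", category = 0),
  userConfig = list(userType = "USER_TYPE_LEGIT_USER")
)

# Same query as sent by the package; note the different key order
# inside requestOptions and the different userType.
ours <- google
ours$requestOptions <- list(category = 0, backend = "IZG", property = "")
ours$userConfig <- list(userType = "USER_TYPE_SCRAPER")

# Hypothetical helper: flatten both payloads to named character vectors
# (e.g. "userConfig.userType") and return the names whose values differ.
# Lookup is by name, so key order does not matter.
diff_fields <- function(a, b) {
  fa <- unlist(a)
  fb <- unlist(b)
  keys <- union(names(fa), names(fb))
  same <- mapply(identical, unname(fa[keys]), unname(fb[keys]))
  keys[!unname(same)]
}

print(diff_fields(google, ours))
```

With the payloads as transcribed, the only field reported is `userConfig.userType`, which matches the observation above.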

@charlesnuttens

Hi,

Many thanks @PMassicotte for your excellent work on this package. It's greatly appreciated.

I have a similar issue for the keyword "lyme" for France.

lyme <- gtrends(
  keyword = "lyme",
  geo = "FR",
  time = "2004-01-01 2022-12-01",
  gprop = c("web"),
  onlyInterest = TRUE
)$interest_over_time

plot(lyme$date, lyme$hits, type = "l", ylim = c(0, 100))

This returns:
[image: plot of lyme$hits over time from gtrendsR]

gtrendsR returns no hits between 2010 and 2014, even though the Google Trends website shows hits for that period:
[image: Google Trends site]

I tried on a different computer and different versions of gtrendsR.
There is no issue when using the keyword "maladie de lyme".
There is no issue for some other countries.

Have you observed something similar on your side?

Best,
Charles

@henriquefpires

Same problem here!

@ghost

ghost commented Dec 9, 2022

Hi again,

Problem solved for France, using the exact same code and the same versions of R, RStudio, and gtrendsR.

[image: FR]

But the problem has now appeared for the UK (ISO2 "GB") with the "lyme" keyword.
[image: GB]

As with France, using "lyme disease" solves the problem in the UK.
[image: GB2]

Best,
Charles

@MattCowgill

I'm encountering this same issue on an unrelated search term ("inflation")

@alberto-agudo

Same issue happening recently. It does not seem to depend on the particular search term: it has affected different terms depending on when I send the query. Below is an example covering a wide variety of telco / insurance terms, where each query is written as gtrendsR::gtrends(term, geo = "GB", time = "today+5-y").

It might be related to the UK only, but I also searched for US terms like "Biden" and got blocks of zeros.

[image: indexes_by_term_2024-01-02 0852]


Configuration:

devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16 ucrt)
#>  os       Windows 10 x64 (build 18362)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United Kingdom.utf8
#>  ctype    English_United Kingdom.utf8
#>  tz       Europe/Madrid
#>  date     2024-01-02
#>  pandoc   3.1.8 @ C:/Users/ALBERT~1.AGU/AppData/Local/Pandoc/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cachem        1.0.8   2023-05-01 [1] CRAN (R 4.3.1)
#>  callr         3.7.3   2022-11-02 [1] CRAN (R 4.3.1)
#>  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.1)
#>  crayon        1.5.2   2022-09-29 [1] CRAN (R 4.3.1)
#>  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.3.1)
#>  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.1)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.3.1)
#>  evaluate      0.22    2023-09-29 [1] CRAN (R 4.3.1)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.1)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.1)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.1)
#>  gtrendsR  * 1.5.1.9000 2023-11-23 [1] Github (pmassicotte/gtrendsR@d53b9b7)
#>  htmltools     0.5.6   2023-08-10 [1] CRAN (R 4.3.1)
#>  htmlwidgets   1.6.2   2023-03-17 [1] CRAN (R 4.3.1)
#>  httpuv        1.6.11  2023-05-11 [1] CRAN (R 4.3.1)
#>  knitr         1.44    2023-09-11 [1] CRAN (R 4.3.1)
#>  later         1.3.1   2023-05-02 [1] CRAN (R 4.3.1)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.2)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.1)
#>  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.3.1)
#>  mime          0.12    2021-09-28 [1] CRAN (R 4.3.0)
#>  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.3.1)
#>  pkgbuild      1.4.2   2023-06-26 [1] CRAN (R 4.3.1)
#>  pkgload       1.3.3   2023-09-22 [1] CRAN (R 4.3.1)
#>  prettyunits   1.2.0   2023-09-24 [1] CRAN (R 4.3.1)
#>  processx      3.8.2   2023-06-30 [1] CRAN (R 4.3.1)
#>  profvis       0.3.8   2023-05-02 [1] CRAN (R 4.3.1)
#>  promises      1.2.1   2023-08-10 [1] CRAN (R 4.3.1)
#>  ps            1.7.5   2023-04-18 [1] CRAN (R 4.3.1)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.1)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.1)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.3.1)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.1)
#>  Rcpp          1.0.11  2023-07-06 [1] CRAN (R 4.3.1)
#>  remotes       2.4.2.1 2023-07-18 [1] CRAN (R 4.3.1)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.3.1)
#>  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.3.1)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.1)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
#>  shiny         1.7.5   2023-08-12 [1] CRAN (R 4.3.1)
#>  stringi       1.7.12  2023-01-11 [1] CRAN (R 4.3.0)
#>  stringr       1.5.0   2022-12-02 [1] CRAN (R 4.3.1)
#>  styler        1.10.2  2023-08-29 [1] CRAN (R 4.3.1)
#>  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.3.1)
#>  usethis       2.2.2   2023-07-06 [1] CRAN (R 4.3.1)
#>  vctrs         0.6.3   2023-06-14 [1] CRAN (R 4.3.1)
#>  withr         2.5.2   2023-10-30 [1] CRAN (R 4.3.2)
#>  xfun          0.40    2023-08-09 [1] CRAN (R 4.3.1)
#>  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.3.1)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.3.0)
#> 
#>  [1] C:/Program Files/R/R-4.3.1/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Created on 2024-01-02 with reprex v2.0.2

@eddelbuettel
Collaborator

"The free service giveth, the free service taketh." We do not put the zeros in, that may just be what (some ?) Google backends deliver for (some ?) combinations of terms. Hard to say more.

@ilolic

ilolic commented Jan 18, 2024

When you try downloading multiple Trends series, Google returns zeros to stop you.
I have added a function that checks for these blocks and repeats the download if blocks occur.
I tried that for several days, and it helped, but in the end, I still needed to download some series manually.
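The check-and-retry approach described above might look roughly like the following. This is a minimal sketch, not the function mentioned in the comment: `longest_zero_run` and `fetch_with_retry` are hypothetical names, the threshold of 8 consecutive zeros and the retry count are arbitrary assumptions, and `fetch` is any zero-argument function that returns a data frame with a `hits` column.

```r
# Hypothetical helper: length of the longest run of consecutive zeros.
# Hits may arrive as character (e.g. "<1"), so "<1" is mapped to a
# small nonzero value before the comparison.
longest_zero_run <- function(hits) {
  h <- suppressWarnings(
    as.numeric(ifelse(hits == "<1", "0.5", as.character(hits)))
  )
  runs <- rle(!is.na(h) & h == 0)
  zero_lengths <- runs$lengths[runs$values]
  if (length(zero_lengths) == 0) 0L else max(zero_lengths)
}

# Hypothetical wrapper: call fetch() until the result has no suspicious
# block of zeros, or until max_tries is reached. The thresholds are
# assumptions to be tuned for the series at hand.
fetch_with_retry <- function(fetch, max_zero_run = 8, max_tries = 3,
                             wait_seconds = 0) {
  result <- NULL
  for (attempt in seq_len(max_tries)) {
    result <- fetch()
    if (longest_zero_run(result$hits) <= max_zero_run) {
      return(result)
    }
    if (attempt < max_tries) Sys.sleep(wait_seconds)
  }
  warning("Still seeing a long block of zeros after ", max_tries, " tries")
  result
}
```

It could be used as `fetch_with_retry(function() gtrendsR::gtrends("crib", geo = "US")$interest_over_time, wait_seconds = 60)`. A legitimately rare term can have long zero stretches of its own, so the threshold needs judgment, and as noted above, retrying did not recover every series.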
