unsampled report downloads #44

MarkEdmondson1234 · 2016-10-18T12:12:00Z

Fetch from management API for 360 properties

j450h1 · 2017-11-22T06:11:12Z

I can work on this one before the end of the year. Feel free to assign it to me.

MarkEdmondson1234 · 2017-11-22T08:28:25Z

@j450h1 that would be great! Several management API functions have been requested, mostly ones that help with setup (like editing filters), so if you are looking for more just say!

j450h1 · 2017-12-18T06:48:41Z

See PR #139. A couple things to note:

User must authenticate with the additional google drive scope. I don't know where this should be mentioned?

options(googleAuthR.scopes.selected = c("https://www.googleapis.com/auth/analytics",
"https://www.googleapis.com/auth/drive"))

I'm aware of a bug where the progress bar overwrites the text if there are multiple reports with the same report title. (I also made the decision to request the title, which may not be unique, rather than the id. I'm not sure that's the best way; thinking from the user's perspective, would they get the name from the UI, or would they list all reports first in R, in which case the id might be better?)
This function currently only supports files < 25 MB, however I do have pseudocode which I think should work (maybe I'll get to it in the new year):
- if the file is over 25 MB, save the response as html instead of csv
- parse that html for the "confirm" link
- download with that confirm link, saving as csv
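
The middle step of the pseudocode above (parsing the html for the "confirm" link) could be sketched roughly as below. This is only an illustration: extract_confirm_link is a hypothetical helper, not part of the PR, and the sample page is made up, so the markup Drive actually returns for large files is an assumption that would need checking.

```r
# Hypothetical helper (not in the PR): return the first link in an HTML
# string whose URL contains "confirm=", or NA if there is none.
extract_confirm_link <- function(html) {
  # find the first href attribute whose value contains "confirm="
  m <- regmatches(html, regexpr('href="[^"]*confirm=[^"]*"', html))
  if (length(m) == 0) {
    return(NA_character_)
  }
  # strip the href="..." wrapper and decode escaped ampersands
  link <- sub('"$', "", sub('^href="', "", m))
  gsub("&amp;", "&", link, fixed = TRUE)
}

# Made-up page for illustration only; the real response may look different
page <- '<html><a href="/uc?export=download&amp;confirm=AbCd&amp;id=123">Download anyway</a></html>'
confirm_link <- extract_confirm_link(page)
```

The returned link would then be requested again and the response saved as csv, as in the last step of the list.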

MarkEdmondson1234 · 2018-01-19T11:48:11Z

@j450h1 did you look into using googledrive as a method of downloading? It may mean some code can be taken out if we depend on that.

j450h1 · 2018-01-20T01:01:15Z

Yes, I explored that option. I believe the problem was that v3 of the API only allows you to export Google Doc type files (mimeType): https://developers.google.com/apis-explorer/#p/drive/v3/drive.files.export and I believe that package uses v3. I think another issue was that the file was not in "my" google drive, which is where the functions in that package allow you to download/upload files from. Nonetheless, I obviously tried to go with a simple approach first.

MarkEdmondson1234 · 2018-01-20T08:13:08Z

Working with it now. To handle the extra scope issues, I don't want to add the scope to the whole package; I'd rather just have some documentation on adding the scope when needed, and a check at the start of the download that checks options(googleAuthR.scopes.selected) that raises an error if the drive scope isn't present.
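
A minimal sketch of what such a check could look like, assuming the scope string from the options() call earlier in the thread. check_drive_scope is a hypothetical name and the error message is illustrative; this is not the code that ended up in the package.

```r
# Hypothetical helper: stop early if the Drive scope has not been selected.
check_drive_scope <- function(scopes = getOption("googleAuthR.scopes.selected")) {
  drive_scope <- "https://www.googleapis.com/auth/drive"
  # %in% is FALSE when scopes is NULL, so an unset option also fails
  if (!drive_scope %in% scopes) {
    stop("The Google Drive scope is required for this download. Add '",
         drive_scope,
         "' to options(googleAuthR.scopes.selected) and re-authenticate.",
         call. = FALSE)
  }
  invisible(TRUE)
}
```

Calling it as the first line of the download function would fail fast, before any API request is made.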

Also I'd like ga_unsampled_download to do a little less: only download the file for a given reportTitle, so it can chain with ga_unsampled_list, i.e. make the output of ga_unsampled_list easier to work with, and then pass that to ga_unsampled_download(reportTitle) so it requires fewer arguments.

j450h1 · 2018-01-20T10:06:04Z

Sure sounds good to me. Whatever will make it easier to work with.

j450h1 · 2018-01-27T09:14:16Z

Just following up on this one, do you want me to remove the save to dataframe option or are you going to take care of it?

MarkEdmondson1234 · 2018-01-27T18:31:22Z

Didn't want to remove the save to dataframe option, but rather change ga_unsampled_list() so the parsing currently done at the start of ga_unsampled_download() is unnecessary.

That way it will only need these arguments:

ga_unsampled_download <- function(reportTitle, 
                                  file=sprintf("%s.csv", reportTitle), 
                                  downloadFile=TRUE)

...and could be chained and looped via something like:

library(googleAnalyticsR)
library(tidyverse)

## download all unsampled reports
ga_unsampled_list(accountId, webPropertyId, profileId) %>%
   pull(title) %>%   # pull() gives a vector of titles, so map() calls the download once per title
   map(ga_unsampled_download)

Some more documentation and examples won't hurt either.

j450h1 · 2018-02-03T10:42:30Z

Okay, so "a check at the start of the download that checks options(googleAuthR.scopes.selected) that raises an error if the drive scope isn't present." is done.

Regarding the 2nd point, I have changed ga_unsampled_list to return a dataframe. This might require updating other code that uses this function.

However, it is working quite well for this ga_unsampled_download function. I'm having a little trouble using walk2 and map2 in a pipe, so as you can see the code is split up when it shouldn't need to be. (Technically map2 isn't needed; it would only be there in case the user enters the title when they want the dataframes, even though that argument won't be used.) I managed to get these two test cases working for now:

library(tidyverse)
## download all unsampled reports and create a list of dataframes
test <- ga_unsampled_list(accountId, webPropertyId, profileId) %>%
  select(driveDownloadDetails, title) %>% 
  na.omit() 

#download
walk2(unlist(test$driveDownloadDetails), 
      unlist(test$title),
      ga_unsampled_download)

#list of dataframes
dataframes_test <- map(unlist(test$driveDownloadDetails), 
      ga_unsampled_download, downloadFile=FALSE)

j450h1 · 2018-02-03T10:49:50Z

If a user wanted to download just 1 report:

reportTitle <- "googleanalyticsR_test_download"

small <- ga_unsampled_list(accountId, webPropertyId, profileId) %>%
  filter(title == reportTitle) %>%
  select(driveDownloadDetails) %>% 
  na.omit() %>%
  walk(ga_unsampled_download)

This is what I mean: calls to walk2, map2, or walk like this one should all be in one pipeline (not sure if there is a better word?). I just couldn't get it working; maybe you know the correct syntax? This one only works because it is 1 item; if it wasn't filtered it wouldn't work properly. So I realize it definitely needs more testing.

Also, by default right now the filename is the documentId/driveDownloadDetails (example: 19dydgPj1A9L7QRDgNvE6qTNcXN0rqBxj.csv) unless the user also selects the title column, which should be best practice when downloading (hence the use of walk2). When creating a list of dataframes the title is not needed/used.

j450h1 · 2018-02-03T10:55:17Z

The downside of this approach is that a user who doesn't want to use pipes will need to enter the driveDownloadDetails and can't just enter the report name. That was the advantage of the old approach. But I guess that's fine if it's documented to review the list first and choose the driveDownloadDetails, or maybe most people are using %>% now.

MarkEdmondson1234 · 2018-02-05T22:23:40Z

It's perhaps easier to download the files as they are named in the GA UI (perhaps with the date too), then show an example of how to rename them if they want. That way you don't have to juggle two argument loops.
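
A rename example along those lines might look like the sketch below. It runs in a temporary directory with two made-up file names standing in for downloaded reports, and the date-prefix scheme is just one possible choice.

```r
# Work in a temporary directory so the sketch is safe to run anywhere
dl_dir <- tempfile("unsampled_")
dir.create(dl_dir)

# Stand-ins for files downloaded under their GA UI report titles (made up)
file.create(file.path(dl_dir, c("report_a.csv", "report_b.csv")))

# Prefix every csv with today's date, e.g. 2018-02-05_report_a.csv
old_paths <- list.files(dl_dir, pattern = "\\.csv$", full.names = TRUE)
new_paths <- file.path(dl_dir,
                       paste0(format(Sys.Date(), "%Y-%m-%d"), "_", basename(old_paths)))
renamed <- file.rename(old_paths, new_paths)
```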

My overall strategy over time has been to get each function to be as useful as possible while doing as little as possible, with sensible defaults for beginners and the arguments in there for advanced users if they want them, as it's the most flexible and easiest to maintain.

If it's just one report, I guess they could just read the reportTitle from the UI, and pass that in:

ga_unsampled_download(what_I_copied_from_ui)

How come you needed na.omit() in your example? Is that something that could perhaps be within ga_unsampled_list()?

j450h1 · 2018-02-05T22:58:59Z

I think that approach makes a lot of sense.

Regarding na.omit(), it is required because the list includes what appear to be standard reports not created by the user, which therefore have no download details. Yes, I can include it as part of ga_unsampled_list. Should a tibble be returned instead of a dataframe? Just trying to be consistent with the rest of the repo.
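
As a rough illustration of that filtering step, here is a sketch on a made-up listing. The column names title and driveDownloadDetails come from the examples above; the values are invented and the real listing has more columns.

```r
# Made-up listing: the second row mimics a report with no download details
reports <- data.frame(
  title = c("report_a", "standard_report", "report_b"),
  driveDownloadDetails = c("doc_id_1", NA, "doc_id_2"),
  stringsAsFactors = FALSE
)

# Keep only the rows that can actually be downloaded
downloadable <- reports[!is.na(reports$driveDownloadDetails), , drop = FALSE]
```

Filtering on the one column, rather than calling na.omit() on the whole dataframe, avoids dropping rows that happen to have NA in an unrelated column.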

j450h1 · 2018-02-05T23:04:13Z

Regarding allowing the user to enter the reportTitle: if we go with that approach, ga_unsampled_download will either require all the same arguments as ga_unsampled_list (as was originally done) or the dataframe/tibble returned from that function. So there is a tradeoff: you can enter the reportTitle, which will essentially just filter the dataframe, but then you cannot simply pipe the reportTitle as mentioned. It looks like the simpler approach for the user is to allow the reportTitle to be entered, so I will change it back to that approach. What do you think of requiring the same arguments as ga_unsampled_list versus just requiring the returned df/tibble?

MarkEdmondson1234 · 2018-02-06T10:26:41Z

"Should a tibble be returned instead of a dataframe? Just trying to be consistent with the rest of the repo."

I'm not sure it is consistent in the package, but IIRC it should fall back to data.frame from tibble if they don't have it loaded - ga_account_summary() does that.
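
One way such a fallback could be written is sketched below. It is not taken from the package source: as_tibble_if_available is a hypothetical name, and checking whether tibble is installed (rather than loaded) is an assumption.

```r
# Hypothetical helper: return a tibble when the tibble package is installed,
# otherwise hand back the plain data.frame unchanged.
as_tibble_if_available <- function(df) {
  if (requireNamespace("tibble", quietly = TRUE)) {
    tibble::as_tibble(df)
  } else {
    df
  }
}

# Made-up listing for illustration
out <- as_tibble_if_available(data.frame(title = c("report_a", "report_b")))
```

Either way the result is still a data.frame underneath, so downstream code like out$title works in both cases.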

It doesn't necessarily have to be piped; in fact, one motive is to allow for other uses that we haven't thought of. A base R example may be:

library(googleAnalyticsR)

## download all unsampled reports
unsample_df <- ga_unsampled_list(accountId, webPropertyId, profileId)

# you need the title to pass in to ga_unsampled_download
lapply(unsample_df$title, ga_unsampled_download)

j450h1 · 2018-02-08T05:16:45Z

Ok thanks for clarifying. I think I know what updates to make now.

j450h1 · 2018-02-08T06:46:35Z

Give this a crack and let me know. Note: accountId, webPropertyId, profileId are required again because I have to call ga_unsampled_list within ga_unsampled_download. The only other option I see is to pass the dataframe/tibble object itself (similar to what I was doing before this) and the user cannot then enter a reportTitle if they wanted to do that. Anyways here it is:

# Download multiple reports with lapply
## download all unsampled reports
unsample_df <- ga_unsampled_list(accountId, webPropertyId, profileId)
lapply(unsample_df$title, ga_unsampled_download, accountId, webPropertyId, profileId)

library(tidyverse)
# Download multiple reports with pipes
ga_unsampled_list(accountId, webPropertyId, profileId) %>%
  select(title) %>%
  unlist() %>%
  map(ga_unsampled_download, accountId, webPropertyId, profileId)

j450h1 · 2018-02-08T06:51:36Z

# Download 1 file
# The user can enter the title directly, without having to explicitly call
# ga_unsampled_list first and then possibly filter, which would be needed if
# we went with the 2nd option of using a dataframe/tibble as the argument
# instead of reportTitle and the 3 other things
reportTitle <- "googleanalyticsR_test_download"
ga_unsampled_download(reportTitle,
                      accountId,
                      webPropertyId,
                      profileId)

j450h1 · 2018-05-17T06:32:39Z

Hey Mark. Just wondering if this issue can be closed or is there a reason it is still open? I'm happy to clear up any loose ends if there are any!

MarkEdmondson1234 · 2018-05-17T08:57:55Z

I just forgot to close it :)

MarkEdmondson1234 added the enhancement label Dec 8, 2016

MarkEdmondson1234 added the hacktoberfest label Oct 2, 2017

MarkEdmondson1234 added a commit that referenced this issue Jan 19, 2018

Style tweaks after pull #139 for #44

b60f62e

j450h1 mentioned this issue Feb 3, 2018

unsampled report downloads #44 #146

Merged

MarkEdmondson1234 added a commit that referenced this issue Feb 8, 2018

Merge pull request #146 from j450h1/master

0f3c83e

MarkEdmondson1234 added a commit that referenced this issue Feb 8, 2018

Add tests for #44 related to pull #146 - add contributors.md

781e233

MarkEdmondson1234 added a commit that referenced this issue Feb 8, 2018

add examples to function documentation for #44

ca28d64

MarkEdmondson1234 added a commit that referenced this issue Feb 8, 2018

add news entry for #44

cf08010

MarkEdmondson1234 closed this as completed May 17, 2018
