Can't join across sources #219
Need to adjust
Possibly related: tidyverse/dplyr#2993 |
Better approach, which works in dev version.
|
Has this been resolved? I'm still getting the error. Is this an adequate reprex?
library(magrittr)
library(ggplot2)
library(bigrquery)
library(tidyverse)
library(dplyr, warn.conflicts = FALSE)
library(knitr)
library(dbplyr)
#>
#> Attaching package: 'dbplyr'
#> The following objects are masked from 'package:dplyr':
#>
#> ident, sql
conn = DBI::dbConnect(
  bigrquery::bigquery(),
  project = "elite-magpie-257717",
  dataset = "TEST_DATASET",
  KeyFilePath = "google_service_key.json",
  OAuthMechanism = 0
)
df = data.frame(
  geo_id = c("US", "CA"),
  socialization = c("Obnoxious", "Polite")
)
DBI::dbWriteTable(
  conn = conn,
  name = "mytable",
  value = df,
  overwrite = TRUE
)
#> Using an auto-discovered, cached token.
#> To suppress this message, modify your code or options to clearly consent to the use of a cached token.
#> See gargle's "Non-interactive auth" vignette for more details:
#> https://gargle.r-lib.org/articles/non-interactive-auth.html
#> The bigrquery package is using a cached token for ariel.balter@gmail.com.
country_data = tbl(conn, "mytable")
covid_conn = DBI::dbConnect(
  bigrquery::bigquery(),
  project = "bigquery-public-data",
  dataset = "covid19_ecdc"
)
covid_data =
  tbl(covid_conn, "covid_19_geographic_distribution_worldwide") %>%
  select(geo_id, pop_data_2019)
inner_join(country_data, covid_data, by = "geo_id")
#> Error: `x` and `y` must share the same src, set `copy` = TRUE (may be slow).
Created on 2020-10-02 by the reprex package (v0.3.0) |
@abalter you need to use the same connection object |
But those are different data sources. One is a google-public-dataset. One is mine. |
Ok, I think I see what you mean. I can create a connection without specifying a dataset, and then use that same connection to access a table outside the connection's project by giving a fully qualified path (project.dataset.table). This was not obvious.
library(bigrquery)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
con <- DBI::dbConnect(
  bigrquery::bigquery(),
  project = bq_test_project()
)
country_data = tbl(con, "elite-magpie-257717.TEST_DATASET.mytable")
#> Using an auto-discovered, cached token.
#> To suppress this message, modify your code or options to clearly consent to the use of a cached token.
#> See gargle's "Non-interactive auth" vignette for more details:
#> https://gargle.r-lib.org/articles/non-interactive-auth.html
#> The bigrquery package is using a cached token for ariel.balter@gmail.com.
covid_data =
  tbl(con, "bigquery-public-data.covid19_ecdc.covid_19_geographic_distribution_worldwide") %>%
  select(geo_id, pop_data_2019) %>%
  distinct()
inner_join(country_data, covid_data, by = "geo_id") %>% head(10)
#> Warning: `...` is not empty.
#>
#> We detected these problematic arguments:
#> * `needs_dots`
#>
#> These dots only exist to allow future extensions and should be empty.
#> Did you misspecify an argument?
#> # Source: lazy query [?? x 3]
#> # Database: BigQueryConnection
#> socialization geo_id pop_data_2019
#> <chr> <chr> <int>
#> 1 Obnoxious US 329064917
#> 2 Polite CA 37411038
Created on 2020-10-04 by the reprex package (v0.3.0) |
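For anyone landing here later, the pattern from the reprex above can be wrapped in two small helpers. This is only a sketch, not part of bigrquery: `qualified_table()` and `join_across_projects()` are hypothetical names, and `"my-billing-project"` in the commented usage is a placeholder. It assumes, as in the reprex, that `tbl()` accepts a `project.dataset.table` string; newer dbplyr versions may want the name wrapped in `I()` instead.

```r
# Hypothetical helper (not part of bigrquery): build a fully qualified
# BigQuery table name of the form "project.dataset.table".
qualified_table <- function(project, dataset, table) {
  stopifnot(
    is.character(project), length(project) == 1, nzchar(project),
    is.character(dataset), length(dataset) == 1, nzchar(dataset),
    is.character(table), length(table) == 1, nzchar(table)
  )
  paste(project, dataset, table, sep = ".")
}

# Hypothetical helper: create both lazy tables from the SAME connection,
# so dplyr sees a single src and the join can run inside BigQuery.
join_across_projects <- function(con, x_id, y_id, by) {
  x <- dplyr::tbl(con, x_id)
  y <- dplyr::tbl(con, y_id)
  dplyr::inner_join(x, y, by = by)
}

# Usage (commented out because it needs BigQuery credentials;
# "my-billing-project" is a placeholder):
# con <- DBI::dbConnect(bigrquery::bigquery(), project = "my-billing-project")
# joined <- join_across_projects(
#   con,
#   qualified_table("elite-magpie-257717", "TEST_DATASET", "mytable"),
#   qualified_table("bigquery-public-data", "covid19_ecdc",
#                   "covid_19_geographic_distribution_worldwide"),
#   by = "geo_id"
# )
```

The connection's project only determines where the query is billed; the fully qualified names decide which tables are read, which is why one connection is enough.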
(Part of #101)