Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Sql functions when joining datasets #4214

Closed
javierluraschi opened this issue Feb 22, 2019 · 5 comments
Closed

Feature Request: Sql functions when joining datasets #4214

javierluraschi opened this issue Feb 22, 2019 · 5 comments

Comments

@javierluraschi
Copy link
Contributor

Can we use custom SQL functions when joining tables under the by parameter? Seems to me this in unsupported but is quite important when doing Geo Spatial analysis in SQL.

This is how a query looks in GeoSpark with sparklyr, which works fine in DBI:

DBI::dbGetQuery(sc, "
  SELECT area, state, count(*) cnt FROM
    (SELECT area, ST_GeomFromWKT(polygons.geom ,'4326') as y FROM polygons) polygons
  INNER JOIN
    (SELECT ST_GeomFromWKT (points.geom,'4326') as x, state, city FROM points) points
  WHERE ST_Contains(polygons.y,points.x) GROUP BY area, state")

However, the following throws an error:

library(dplyr)
polygons_wkt <- mutate(polygons_wkt, y = st_geomfromwkt(geom, "4326"))
points_wkt <- mutate(points_wkt, x = st_geomfromwkt(geom, "4326"))

inner_join(polygons_wkt, points_wkt, by = st_contains(y, x)) %>%
  group_by(area, state) %>%
  summarise(cnt = n())
Error in st_contains(y, x) : could not find function "st_contains"

Is there a known workaround?

CC: @edgararuiz

@krlmlr
Copy link
Member

krlmlr commented Feb 22, 2019

There is #2240 for dplyr, but it's going to take a while. For databases we already have a workaround in the dev version: tidyverse/dbplyr#146.

@javierluraschi
Copy link
Contributor Author

Wasn't aware of the workaround! That helps a lot! But it would be really nice to support this, long-term. Thank you!

@romainfrancois
Copy link
Member

I'll close this as a duplicate of #2240. Flexible joins are on the radar, but that might not happen in the 0.9.* series.

@krlmlr
Copy link
Member

krlmlr commented Jun 29, 2019

FWIW, flexible joins are now available in dbplyr with the new sql_on argument.

@lock
Copy link

lock bot commented Dec 26, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Dec 26, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants