Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GeoSpatial use cases #7

Open
jeroen opened this issue May 2, 2015 · 15 comments
Open

GeoSpatial use cases #7

jeroen opened this issue May 2, 2015 · 15 comments

Comments

@jeroen
Copy link
Owner

jeroen commented May 2, 2015

MongoDB has some nice geospacial operators, for example $geoWithin queries for all points that lie within a certain area (polygon). This might make an interesting use case where we query data from a certain region or location. @sckott do you have some example data for this?

@sckott
Copy link

sckott commented May 2, 2015

Most likely. When I get to my laptop I'll get something.

@sckott
Copy link

sckott commented May 2, 2015

There's US states in geojson format https://github.com/glynnbird/usstatesgeojson

@sckott
Copy link

sckott commented May 2, 2015

congressional district boundaries https://github.com/JeffreyBLewis/congressional-district-boundaries

@jeroen
Copy link
Owner Author

jeroen commented May 2, 2015

Thanks. Do you also have an example data with lat/long coordinates in the US? They idea is to put these in the db and use a geojson to query records that appear within a certain geojson region.

@sckott
Copy link

sckott commented May 2, 2015

us cities in geojson https://gist.github.com/sckott/97c28209169d64938714

@jeroen
Copy link
Owner Author

jeroen commented Feb 27, 2017

Maybe should do a separate chapter on this in the book.

@mlampros
Copy link

First of all, thank you for the useful mongolite package.

I try to reproduce the code of a blog post. The author populates and queries a Mongo-database using GeoJson data. Assuming I have the following .geojson files (geo1.geojson, geo2.geojson, geo3.geojson, geo4.geojson) saved in a folder (GEO_DATA)

# geo1.geojson

{
    "name" : "Squaw Valley",
    "location" : {
        "type" : "Point",
        "coordinates" : [
            -120.24,
            39.21
        ]
    }
}


# geo2.geojson

{
    "name" : "Mammoth Lakes",
    "location" : {
        "type" : "Point",
        "coordinates" : [
            -118.9,
            37.61
        ]
    }
}


# geo3.geojson

{
    "name" : "Aspen",
    "location" : {
        "type" : "Point",
        "coordinates" : [
            -106.82,
            39.18
        ]
    }
}


# geo4.geojson

{
    "name" : "Whistler",
    "location" : {
        "type" : "Point",
        "coordinates" : [
            -122.95,
            50.12
        ]
    }
}

First, I read the files and I populate the Mongo-database using the mongolite package,


library(mongolite)


init_quer = mongo("GeoJson_query")

geo_json_files = list.files('/GEO_DATA', full.names = T)   

for (i in 1:length(geo_json_files)) {
  
  dat_geom = geojsonR::shiny_from_JSON(geo_json_files[i])           # read data using 'shiny_from_JSON' 
  
  # dat_geom = jsonlite::read_json(geo_json_files[i])               # OR using 'jsonlite'
  
  init_quer$insert(dat_geom)
}

I don't face any problems when I do simple queries,


# simple query

subs = init_quer$find(query = '{"name" : "Whistler"}',
                       
                       fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
)

subs
     coordinates     name
1 -122.95, 50.12 Whistler

However, when I attempt to use the $geoIntersects or the $geoWithin operator I do get an empty data frame as an output,


# 'geoIntersects' from mongodb

subs_geointersect = init_quer$find(query =

    '{"location": {
       "$geoIntersects": {
        "$geometry": {
          "type": "Polygon",
            "coordinates": [[
              [-109, 41],
              [-102, 41],
              [-102, 37],
              [-109, 37],
              [-109, 41]
              ]]
          }
        }
      }
    }',
    
    fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
)


subs_geointersect
data frame with 0 columns and 0 rows

Am I doing something wrong?


@mlampros
Copy link

I know it's a while since I asked, however I came to a solution, which is somehow involved (just for reference in case anyone is interested to use the MongoDB geospatial features).

This week I came across a blog post that uses MongoDB and mongolite for geospatial queries/analysis. The fact that the author used the MongoDB-Compass tool helped a lot to find out what exactly led to the previously mentioned empty data frame.

First I modified the data insertion code chunk in the following way:

library(mongolite)

init_quer = mongo(collection = "GeoJson_query", db = "GeoJson_db", 

                  url = "mongodb://localhost", verbose = T)

geo_json_files = list.files('/GEO_DATA', full.names = T)   

for (i in 1:length(geo_json_files)) {
  
  dat_geom = jsonlite::read_json(geo_json_files[i], simplifyVector = T)
  
  init_quer$insert(dat_geom)
}

Then I opened an ubuntu console and typed

sudo mongod --dbpath /var/lib/mongodb

to start the mongodb service (defining the path where the database is saved).

After installing the MongoDB-Compass I followed the instructions in the blog post to create a (geospatial) index for the GeoJson_query collection of the GeoJson_db database. However I got an error,

( Can't extract geo keys: { _id: ObjectId('5968cf7942b25619d34254c1'), name: [ "Squaw Valley" ], location: { type: [ "Point" ], coordinates: [ -120.24, 39.21 ] } } unknown GeoJSON type: { type: [ "Point" ], coordinates: [ -120.24, 39.21 ] }

That because the type field of each geospatial file was saved initially as an array ( ["Point"] ) rather than as a string ( I don't know if that has to do with the fact that the insert() method of the mongolite package accepts a data-frame, named list or a character vector as input).

Thus I had to open a new MongoDB session and modify the saved data,

mongo

use GeoJson_db                 # switch to the relevant database

and then I used the following query to modify the type field and save the updated data to a new collection GeoJson_query_updated ( probably there's a better mongodb-query to modify the data),

db.GeoJson_query.aggregate([{ 

  "$project": { 

    "name": 1, 

    "location": {

      "type": { "$arrayElemAt": [ "$location.type", 0 ] } , 

      "coordinates": 1}}}, 

  {$out : "GeoJson_query_updated"}

  ]
)

After that I reopened the MongoDB-Compass tool, I navigated to the GeoJson_query_updated collection and I created the (geospatial) index successfully.

Finally, by opening a new R-session I was able to get the correct output,

modified_db = mongo(collection = "GeoJson_query_updated", db = "GeoJson_db", 
                        url = "mongodb://localhost", verbose = T)

subs_geointersect = modified_db$find(query =

                                     '{"location": {
       "$geoIntersects": {
        "$geometry": {
          "type": "Polygon",
            "coordinates": [[
              [-109, 41],
              [-102, 41],
              [-102, 37],
              [-109, 37],
              [-109, 41]
              ]]
          }
        }
      }
    }', fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
)
subs_geointersect
   name    coordinates
1 Aspen -106.82, 39.18

I'd like to know if there's a simpler solution to this issue from within an R-session.

@SymbolixAU
Copy link

@mlampros - I may have missed the point of your issue, but I was able to create a geospatial index using

m$index((add = '{"geometry" : "2dsphere"}'))
## where m is my 'mongolite' connection object 

@mlampros
Copy link

@SymbolixAU, thanks for making me aware of the index operator. In my initial example, although I create the geospatial index as you pointed out, I continue to receive an empty data frame,

> init_quer$index((add = '{"geometry" : "2dsphere"}'))
  v key._id key.geometry              name                       ns 2dsphereIndexVersion
1 1       1         <NA>              _id_ GeoJson_db.GeoJson_query                   NA
2 1      NA     2dsphere geometry_2dsphere GeoJson_db.GeoJson_query                    3

> subs_geointersect = init_quer$find(query =
+                                      
+                                      '{"location": {
+        "$geoIntersects": {
+         "$geometry": {
+           "type": "Polygon",
+             "coordinates": [[
+               [-109, 41],
+               [-102, 41],
+               [-102, 37],
+               [-109, 37],
+               [-109, 41]
+               ]]
+           }
+         }
+       }
+     }',
+                                    
+     fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
+ )
 Imported 0 records. Simplifying into dataframe...

> subs_geointersect
data frame with 0 columns and 0 rows

However, you should know since my second post in this issue (July) I invested some time and I created the GeoMongo package, which performs geospatial queries using the reticulate package. I added also details of the package in a blog post.

@SymbolixAU
Copy link

SymbolixAU commented Aug 28, 2017

@mlampros I think your issue might be that you've specified the index on the geometry column, yet your first line in the query is querying against { "location" : {. I think this should be { "geometry" : {

@mlampros
Copy link

@SymbolixAU, would you mind sharing an R script solution which results in the same output as is the case for the "$geoIntersects" operator in the mentioned blog post.

@SymbolixAU
Copy link

@mlampros - I decided to write my own blog post as the example data was already available and I could read it directly into R. This shows how I used $geoIntersects. Hopefully it's reproducible.

@mlampros
Copy link

mlampros commented Sep 5, 2017

@SymbolixAU, thanks for sharing (blog post)

@SymbolixAU
Copy link

referencing my post on issue #109

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants