GeoSpatial use cases #7

Open
jeroen opened this issue May 2, 2015 · 15 comments
@jeroen (Owner) commented May 2, 2015

MongoDB has some nice geospatial operators; for example, $geoWithin selects all points that lie within a given area (polygon). This might make an interesting use case in which we query data from a certain region or location. @sckott do you have some example data for this?

@sckott commented May 2, 2015

Most likely. When I get to my laptop I'll get something.

@sckott commented May 2, 2015

There's US states in geojson format https://github.com/glynnbird/usstatesgeojson

@sckott commented May 2, 2015

@jeroen (Owner, Author) commented May 2, 2015

Thanks. Do you also have an example dataset with lat/long coordinates in the US? The idea is to put these in the db and use a geojson to query records that fall within a certain geojson region.

@jeroen (Owner, Author) commented Feb 27, 2017

Maybe we should do a separate chapter on this in the book.

@mlampros commented Apr 20, 2017

First of all, thank you for the useful mongolite package.

I am trying to reproduce the code from a blog post in which the author populates and queries a MongoDB database using GeoJSON data. Assume I have the following .geojson files (geo1.geojson, geo2.geojson, geo3.geojson, geo4.geojson) saved in a folder (GEO_DATA):

# geo1.geojson

{
    "name" : "Squaw Valley",
    "location" : {
        "type" : "Point",
        "coordinates" : [
            -120.24,
            39.21
        ]
    }
}


# geo2.geojson

{
    "name" : "Mammoth Lakes",
    "location" : {
        "type" : "Point",
        "coordinates" : [
            -118.9,
            37.61
        ]
    }
}


# geo3.geojson

{
    "name" : "Aspen",
    "location" : {
        "type" : "Point",
        "coordinates" : [
            -106.82,
            39.18
        ]
    }
}


# geo4.geojson

{
    "name" : "Whistler",
    "location" : {
        "type" : "Point",
        "coordinates" : [
            -122.95,
            50.12
        ]
    }
}

First, I read the files and populate the MongoDB database using the mongolite package:


library(mongolite)


init_quer = mongo("GeoJson_query")

geo_json_files = list.files('/GEO_DATA', full.names = TRUE)

for (i in seq_along(geo_json_files)) {
  
  dat_geom = geojsonR::shiny_from_JSON(geo_json_files[i])           # read data using 'shiny_from_JSON' 
  
  # dat_geom = jsonlite::read_json(geo_json_files[i])               # OR using 'jsonlite'
  
  init_quer$insert(dat_geom)
}

I don't run into any problems with simple queries:


# simple query

subs = init_quer$find(query = '{"name" : "Whistler"}',
                       
                       fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
)

subs
     coordinates     name
1 -122.95, 50.12 Whistler

However, when I use the $geoIntersects or the $geoWithin operator I get an empty data frame as output:


# 'geoIntersects' from mongodb

subs_geointersect = init_quer$find(query =

    '{"location": {
       "$geoIntersects": {
        "$geometry": {
          "type": "Polygon",
            "coordinates": [[
              [-109, 41],
              [-102, 41],
              [-102, 37],
              [-109, 37],
              [-109, 41]
              ]]
          }
        }
      }
    }',
    
    fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
)


subs_geointersect
data frame with 0 columns and 0 rows

Am I doing something wrong?


@mlampros commented Jul 15, 2017

I know it has been a while since I asked, but I arrived at a solution, which is somewhat involved (posted here for reference in case anyone wants to use the MongoDB geospatial features).

This week I came across a blog post that uses MongoDB and mongolite for geospatial queries/analysis. The fact that the author used the MongoDB-Compass tool helped a lot to find out what exactly led to the previously mentioned empty data frame.

First I modified the data insertion code chunk in the following way:

library(mongolite)

init_quer = mongo(collection = "GeoJson_query", db = "GeoJson_db",
                  url = "mongodb://localhost", verbose = TRUE)

geo_json_files = list.files('/GEO_DATA', full.names = TRUE)

for (i in seq_along(geo_json_files)) {

  # read each file into an R list, simplifying JSON arrays to vectors
  dat_geom = jsonlite::read_json(geo_json_files[i], simplifyVector = TRUE)

  init_quer$insert(dat_geom)
}

Then I opened an ubuntu console and typed

sudo mongod --dbpath /var/lib/mongodb

to start the mongodb service (defining the path where the database is saved).

After installing MongoDB-Compass I followed the instructions in the blog post to create a (geospatial) index for the GeoJson_query collection of the GeoJson_db database. However, I got an error:

Can't extract geo keys: { _id: ObjectId('5968cf7942b25619d34254c1'), name: [ "Squaw Valley" ], location: { type: [ "Point" ], coordinates: [ -120.24, 39.21 ] } } unknown GeoJSON type: { type: [ "Point" ], coordinates: [ -120.24, 39.21 ] }

That is because the type field of each geospatial file was initially saved as an array ( ["Point"] ) rather than as a string (I don't know whether that has to do with the fact that the insert() method of the mongolite package accepts a data frame, a named list or a character vector as input).
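
A possible explanation, offered as an assumption about the cause rather than a confirmed reading of mongolite's internals: jsonlite serialises every length-one R vector as a JSON array unless auto_unbox = TRUE is set, which matches the [ "Point" ] seen in the error. The sketch below reproduces that behaviour with jsonlite alone (no database needed); the `doc` list is a hand-written stand-in for one of the files above.

```r
library(jsonlite)

# Hand-written stand-in for one parsed .geojson file (illustrative only)
doc <- list(
  name = "Aspen",
  location = list(type = "Point", coordinates = c(-106.82, 39.18))
)

# Default serialisation: length-one vectors become JSON arrays
boxed <- toJSON(doc)
print(boxed)
# {"name":["Aspen"],"location":{"type":["Point"],"coordinates":[-106.82,39.18]}}

# With auto_unbox = TRUE, length-one vectors become JSON scalars
unboxed <- toJSON(doc, auto_unbox = TRUE)
print(unboxed)
# {"name":"Aspen","location":{"type":"Point","coordinates":[-106.82,39.18]}}
```

Only the second form is a valid GeoJSON Point, since GeoJSON requires "type" to be a string.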

Thus I had to open a new MongoDB session and modify the saved data,

mongo

use GeoJson_db                 # switch to the relevant database

and then I used the following query to modify the type field and save the updated data to a new collection GeoJson_query_updated ( probably there's a better mongodb-query to modify the data),

db.GeoJson_query.aggregate([
  { "$project": {
      "name": 1,
      "location": {
        "type": { "$arrayElemAt": [ "$location.type", 0 ] },
        "coordinates": 1
      }
  }},
  { "$out": "GeoJson_query_updated" }
])

After that I reopened MongoDB-Compass, navigated to the GeoJson_query_updated collection and created the (geospatial) index successfully.

Finally, by opening a new R-session I was able to get the correct output,

modified_db = mongo(collection = "GeoJson_query_updated", db = "GeoJson_db",
                    url = "mongodb://localhost", verbose = TRUE)

subs_geointersect = modified_db$find(query =

    '{"location": {
       "$geoIntersects": {
        "$geometry": {
          "type": "Polygon",
            "coordinates": [[
              [-109, 41],
              [-102, 41],
              [-102, 37],
              [-109, 37],
              [-109, 41]
              ]]
          }
        }
      }
    }',

    fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
)
subs_geointersect
   name    coordinates
1 Aspen -106.82, 39.18

I'd like to know if there's a simpler solution to this issue from within an R-session.
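
One candidate, sketched under assumptions rather than tested against this exact setup: mongolite's insert() is documented to accept a character vector of JSON strings, so the files can be passed through as raw text without being parsed into R lists first, which should leave "type" as a string. The helper names `read_json_string` and `insert_geojson_files` are hypothetical, and each file is assumed to contain exactly one JSON document.

```r
# Read one file as a single raw JSON string, without parsing it into an R list
read_json_string <- function(path) {
  paste(readLines(path, warn = FALSE), collapse = "\n")
}

# Insert every .geojson file in `dir` through the connection `con`.
# `con` is assumed to be a mongolite connection, i.e. an object whose
# insert() method accepts a JSON string.
insert_geojson_files <- function(con, dir) {
  files <- list.files(dir, pattern = "\\.geojson$", full.names = TRUE)
  for (f in files) {
    con$insert(read_json_string(f))
  }
  invisible(length(files))  # number of files handed to insert()
}

# Usage (requires a running MongoDB server, so it is left commented out):
# con <- mongolite::mongo(collection = "GeoJson_query", db = "GeoJson_db",
#                         url = "mongodb://localhost")
# insert_geojson_files(con, "/GEO_DATA")
```

With the documents stored this way, the aggregation step in the mongo shell should no longer be needed.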

@SymbolixAU commented Aug 24, 2017

@mlampros - I may have missed the point of your issue, but I was able to create a geospatial index using

m$index(add = '{"geometry" : "2dsphere"}')
## where m is my 'mongolite' connection object 

@mlampros commented Aug 25, 2017

@SymbolixAU, thanks for making me aware of the index operator. In my initial example, even after creating the geospatial index as you pointed out, I still receive an empty data frame:

> init_quer$index((add = '{"geometry" : "2dsphere"}'))
  v key._id key.geometry              name                       ns 2dsphereIndexVersion
1 1       1         <NA>              _id_ GeoJson_db.GeoJson_query                   NA
2 1      NA     2dsphere geometry_2dsphere GeoJson_db.GeoJson_query                    3

> subs_geointersect = init_quer$find(query =
+                                      
+                                      '{"location": {
+        "$geoIntersects": {
+         "$geometry": {
+           "type": "Polygon",
+             "coordinates": [[
+               [-109, 41],
+               [-102, 41],
+               [-102, 37],
+               [-109, 37],
+               [-109, 41]
+               ]]
+           }
+         }
+       }
+     }',
+                                    
+     fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
+ )
 Imported 0 records. Simplifying into dataframe...

> subs_geointersect
data frame with 0 columns and 0 rows

However, you should know that since my second post in this issue (July) I have invested some time and created the GeoMongo package, which performs geospatial queries using the reticulate package. I also described the package in a blog post.

@SymbolixAU commented Aug 28, 2017

@mlampros I think your issue might be that you've specified the index on the geometry column, yet your first line in the query is querying against { "location" : {. I think this should be { "geometry" : {
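
A minimal sketch of keeping the two in step, assuming the documents from earlier in this thread, where the geometry lives in `location` (so that is the field used here, not `geometry`). The variable names are illustrative. Note that, per the MongoDB documentation, $geoIntersects and $geoWithin do not strictly need a geospatial index, so a mismatched index may not be the only cause of an empty result; the array-valued "type" described above would still have to be fixed.

```r
# Field that holds the GeoJSON geometry in the documents of this thread
geo_field <- "location"

# Index specification built from the same field name the query uses
index_spec <- sprintf('{"%s" : "2dsphere"}', geo_field)

# $geoIntersects query against that same field (polygon copied from above)
query <- sprintf(
  '{"%s": {"$geoIntersects": {"$geometry": {"type": "Polygon", "coordinates": [[[-109, 41], [-102, 41], [-102, 37], [-109, 37], [-109, 41]]]}}}}',
  geo_field
)

# Usage with a mongolite connection (requires a running MongoDB server):
# init_quer$index(add = index_spec)
# init_quer$find(query = query,
#                fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}')
```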

@mlampros commented Aug 28, 2017

@SymbolixAU, would you mind sharing an R script that produces the same output as the "$geoIntersects" operator does in the mentioned blog post?

@SymbolixAU commented Sep 4, 2017

@mlampros - I decided to write my own blog post as the example data was already available and I could read it directly into R. This shows how I used $geoIntersects. Hopefully it's reproducible.

@mlampros commented Sep 5, 2017

@SymbolixAU, thanks for sharing (blog post)

@SymbolixAU commented Aug 13, 2018

referencing my post on issue #109
