GeoSpatial use cases #7
Comments
Most likely. When I get to my laptop I'll get something. |
There's US states in geojson format https://github.com/glynnbird/usstatesgeojson |
congressional district boundaries https://github.com/JeffreyBLewis/congressional-district-boundaries |
Thanks. Do you also have example data with lat/long coordinates in the US? The idea is to put these in the db and use a GeoJSON polygon to query the records that fall within that region. |
us cities in geojson https://gist.github.com/sckott/97c28209169d64938714 |
Maybe should do a separate chapter on this in the book. |
First of all, thank you for the useful mongolite package. I am trying to reproduce the code of a blog post in which the author populates and queries a MongoDB database using GeoJSON data. Assume I have the following .geojson files (geo1.geojson, geo2.geojson, geo3.geojson, geo4.geojson) saved in a folder (GEO_DATA):
# geo1.geojson
{
"name" : "Squaw Valley",
"location" : {
"type" : "Point",
"coordinates" : [
-120.24,
39.21
]
}
}
# geo2.geojson
{
"name" : "Mammoth Lakes",
"location" : {
"type" : "Point",
"coordinates" : [
-118.9,
37.61
]
}
}
# geo3.geojson
{
"name" : "Aspen",
"location" : {
"type" : "Point",
"coordinates" : [
-106.82,
39.18
]
}
}
# geo4.geojson
{
"name" : "Whistler",
"location" : {
"type" : "Point",
"coordinates" : [
-122.95,
50.12
]
}
}
First, I read the files and populate the MongoDB database using the mongolite package:
library(mongolite)
init_quer = mongo("GeoJson_query")
geo_json_files = list.files('/GEO_DATA', full.names = T)
for (i in 1:length(geo_json_files)) {
dat_geom = geojsonR::shiny_from_JSON(geo_json_files[i]) # read data using 'shiny_from_JSON'
# dat_geom = jsonlite::read_json(geo_json_files[i]) # OR using 'jsonlite'
init_quer$insert(dat_geom)
}
I don't face any problems when I run simple queries:
# simple query
subs = init_quer$find(query = '{"name" : "Whistler"}',
fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
)
subs
     coordinates     name
1 -122.95, 50.12 Whistler
However, when I attempt to use the $geoIntersects or the $geoWithin operator, I get an empty data frame as output:
# 'geoIntersects' from mongodb
subs_geointersect = init_quer$find(query =
'{"location": {
"$geoIntersects": {
"$geometry": {
"type": "Polygon",
"coordinates": [[
[-109, 41],
[-102, 41],
[-102, 37],
[-109, 37],
[-109, 41]
]]
}
}
}
}',
fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
)
subs_geointersect
data frame with 0 columns and 0 rows
Am I doing something wrong? |
I know it has been a while since I asked, but I arrived at a solution, albeit a somewhat involved one (posting it for reference in case anyone wants to use the MongoDB geospatial features). This week I came across a blog post that uses MongoDB and mongolite for geospatial queries and analysis. The fact that the author used the MongoDB Compass tool helped a lot in finding out what exactly led to the empty data frame mentioned above. First, I modified the data insertion code chunk as follows:
library(mongolite)
init_quer = mongo(collection = "GeoJson_query", db = "GeoJson_db",
url = "mongodb://localhost", verbose = T)
geo_json_files = list.files('/GEO_DATA', full.names = T)
for (i in 1:length(geo_json_files)) {
dat_geom = jsonlite::read_json(geo_json_files[i], simplifyVector = T)
init_quer$insert(dat_geom)
}
Then I opened an Ubuntu console and typed
sudo mongod --dbpath /var/lib/mongodb
to start the MongoDB service (specifying the path where the database is stored). After installing MongoDB Compass, I followed the instructions in the blog post to create a geospatial index for the GeoJson_query collection of the GeoJson_db database. However, I got an error:
Can't extract geo keys: { _id: ObjectId('5968cf7942b25619d34254c1'), name: [ "Squaw Valley" ], location: { type: [ "Point" ], coordinates: [ -120.24, 39.21 ] } } unknown GeoJSON type: { type: [ "Point" ], coordinates: [ -120.24, 39.21 ] }
That is because the type field of each document was initially saved as an array ( ["Point"] ) rather than as a string (I don't know whether that has to do with the fact that the insert() method of the mongolite package accepts a data frame, named list or character vector as input). Thus I had to open a new MongoDB session and modify the saved data:
mongo
use GeoJson_db # switch to the relevant database
and then I used the following query to fix the type field and save the updated data to a new collection, GeoJson_query_updated (there is probably a better MongoDB query for this):
db.GeoJson_query.aggregate([{
"$project": {
"name": 1,
"location": {
"type": { "$arrayElemAt": [ "$location.type", 0 ] } ,
"coordinates": 1}}},
{$out : "GeoJson_query_updated"}
]
)
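As a rough sketch, the same two-stage pipeline as the shell query above could also be written as a JSON string in R. This assumes that mongolite's aggregate() method passes the pipeline to the server unchanged, including the "$out" stage. The wrapper name run_fix and its argument con are hypothetical, and the call is not executed here because it needs a running server.

```r
library(jsonlite)

# Same two stages as the shell query above, written as strict JSON
# (keys such as "$out" must be quoted when the pipeline is an R string).
pipeline <- '[
  {"$project": {
    "name": 1,
    "location": {
      "type": {"$arrayElemAt": ["$location.type", 0]},
      "coordinates": 1
    }
  }},
  {"$out": "GeoJson_query_updated"}
]'

# Cheap sanity check before sending anything to the server.
stopifnot(validate(pipeline))

# Hypothetical wrapper: `con` is assumed to be a mongolite connection
# such as init_quer. It is only defined here, not called.
run_fix <- function(con) con$aggregate(pipeline)
```

Validating the string locally catches quoting mistakes (for example an unquoted $out key, which the mongo shell accepts but a JSON parser does not) before the query reaches the database.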
After that I reopened MongoDB Compass, navigated to the GeoJson_query_updated collection and created the geospatial index successfully. Finally, in a new R session, I was able to get the correct output:
modified_db = mongo(collection = "GeoJson_query_updated", db = "GeoJson_db",
url = "mongodb://localhost", verbose = T)
subs_geointersect = modified_db$find(query =
'{"location": {
"$geoIntersects": {
"$geometry": {
"type": "Polygon",
"coordinates": [[
[-109, 41],
[-102, 41],
[-102, 37],
[-109, 37],
[-109, 41]
]]
}
}
}
}', fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
)
subs_geointersect
name coordinates
1 Aspen -106.82, 39.18
I'd like to know if there's a simpler solution to this issue from within an R-session. |
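One possible explanation for the array-valued type field in the error message earlier in this comment, offered as a sketch rather than a confirmed diagnosis: when an R list is serialised with jsonlite, length-one vectors become JSON arrays unless auto_unbox = TRUE is set. The example below shows both shapes without needing a server. Since insert() is described above as accepting a character vector, passing each file's raw text might avoid the round trip through R lists entirely; the helper read_raw_json is hypothetical, and the commented insert call is an assumption that has not been run here.

```r
library(jsonlite)

# A document shaped like geo3.geojson, held as an R list.
doc <- list(
  name = "Aspen",
  location = list(type = "Point", coordinates = c(-106.82, 39.18))
)

# Default serialisation boxes length-one vectors into arrays,
# giving the {"type": ["Point"]} shape that MongoDB rejected.
boxed <- toJSON(doc)

# auto_unbox = TRUE keeps length-one vectors as JSON scalars.
unboxed <- toJSON(doc, auto_unbox = TRUE)

# Hypothetical helper: read a file as a single JSON string, untouched by R.
read_raw_json <- function(path) {
  paste(readLines(path, warn = FALSE), collapse = "\n")
}

# Assumed usage with the collection from the thread (needs a running server):
# for (f in geo_json_files) init_quer$insert(read_raw_json(f))
```

Printing boxed and unboxed side by side makes the difference in the "type" field visible before anything is written to the database.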
@mlampros - I may have missed the point of your issue, but I was able to create a geospatial index using
|
@SymbolixAU, thanks for making me aware of the index() method. In my initial example, although I created the geospatial index as you pointed out, I still receive an empty data frame:
> init_quer$index((add = '{"geometry" : "2dsphere"}'))
v key._id key.geometry name ns 2dsphereIndexVersion
1 1 1 <NA> _id_ GeoJson_db.GeoJson_query NA
2 1 NA 2dsphere geometry_2dsphere GeoJson_db.GeoJson_query 3
> subs_geointersect = init_quer$find(query =
+
+ '{"location": {
+ "$geoIntersects": {
+ "$geometry": {
+ "type": "Polygon",
+ "coordinates": [[
+ [-109, 41],
+ [-102, 41],
+ [-102, 37],
+ [-109, 37],
+ [-109, 41]
+ ]]
+ }
+ }
+ }
+ }',
+
+ fields = '{ "location.coordinates" : true, "name" : true, "_id" : false}'
+ )
Imported 0 records. Simplifying into dataframe...
> subs_geointersect
data frame with 0 columns and 0 rows
However, you should know that since my second post in this issue (July) I have invested some time and created the GeoMongo package, which performs geospatial queries using the reticulate package. I also added details of the package in a blog post. |
@mlampros I think your issue might be that you've specified the index on the |
@SymbolixAU, would you mind sharing an R script solution which results in the same output as is the case for the "$geoIntersects" operator in the mentioned blog post. |
@mlampros - I decided to write my own blog post as the example data was already available and I could read it directly into R. This shows how I used |
@SymbolixAU, thanks for sharing (blog post) |
MongoDB has some nice geospatial operators; for example, $geoWithin queries for all points that lie within a certain area (polygon). This might make an interesting use case where we query data from a certain region or location. @sckott do you have some example data for this?
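To make the idea of a point-within-polygon query concrete, here is a small planar ray-casting sketch in plain R, using the four places and the query polygon that appear in this thread. It is only an illustration: MongoDB evaluates GeoJSON $geometry queries on a sphere, so results can differ for large regions, and the function name point_in_polygon is hypothetical.

```r
# Ray-casting test: is the point (x, y) inside the polygon `poly`?
# `poly` is a two-column matrix (lon, lat) with the first vertex repeated last.
point_in_polygon <- function(x, y, poly) {
  inside <- FALSE
  n <- nrow(poly)
  j <- n
  for (i in seq_len(n)) {
    xi <- poly[i, 1]; yi <- poly[i, 2]
    xj <- poly[j, 1]; yj <- poly[j, 2]
    # Does the edge (j -> i) straddle the horizontal line through y?
    crosses <- (yi > y) != (yj > y)
    # Toggle when the crossing lies to the right of the point.
    if (crosses && x < (xj - xi) * (y - yi) / (yj - yi) + xi) inside <- !inside
    j <- i
  }
  inside
}

# Places taken from the .geojson files quoted in this thread.
places <- data.frame(
  name = c("Squaw Valley", "Mammoth Lakes", "Aspen", "Whistler"),
  lon  = c(-120.24, -118.9, -106.82, -122.95),
  lat  = c(39.21, 37.61, 39.18, 50.12),
  stringsAsFactors = FALSE
)

# The polygon used in the $geoIntersects queries in this thread.
region <- matrix(c(-109, 41, -102, 41, -102, 37, -109, 37, -109, 41),
                 ncol = 2, byrow = TRUE)

is_inside <- mapply(point_in_polygon, places$lon, places$lat,
                    MoreArgs = list(poly = region))
hits <- places$name[is_inside]
```

Here hits contains only "Aspen", which matches the result reported in the thread for the corrected collection.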