Capture Zoom Level (from map) for Reports #600
This task is to capture as much information as possible about the level of geographic granularity that was used to geocode a report. Ideally, reports that are geocoded using the map interface would capture the zoom level at the time the report was made. If reports were made from smartphones, the report would contain information about this; whether or not GPS was enabled; etc.
Why is this important?
For example: If User X wants to make a report about the great BBQ restaurant near David’s office, they would zoom into the exact building, and place their marker on that building. On the other hand, User Y might want to make the same report but can’t quite remember where the exact building was, but still wants to report that there is a good BBQ restaurant in the general vicinity of David’s office. User Y may zoom into the general vicinity and place their marker at random.
The problem is that the act of placing a marker defines an exact location in geographic space (latitude/longitude) whether or not that marker is meant to identify that exact location or is simply marking a general area. As an end-consumer of Ushahidi data I have no way of differentiating these two reports but the difference is crucial:
User X is providing immediately actionable information (e.g. I can go to that exact BBQ restaurant) while User Y is providing general information that requires additional action (“there’s a good BBQ restaurant around this area but you’ll need to ask people in the neighborhood or do further research to find the exact address”).
Anyone who is trying to respond to a report in a humanitarian context will want to know the difference between these two reports. Providing the zoom level of each report would be an important piece of information. I could reasonably assume that reports entered using a lower zoom level (e.g. zoom level 0, or the entire Earth) are very general in nature: often signifying something happening within the country. On the other hand, reports entered at a higher zoom level (e.g. zoom level 16) would be much more precise and refer to very specific events or places.
Ultimately, the challenge is that a specific point (latitude/longitude) should only be used to represent other specific points in space (e.g. a well, a building, an intersection) and is not an appropriate way to signify general areas. Area is typically represented using a polygon in most geographic data. However, the simplicity of point data means that they are pressed into this service very often.
Capturing the zoom level employed at the time the report was made would, at least, help GIS analysts sort reports based on their granularity. Other datasets have dealt with ambiguous point data by adopting a “precision code” that quantitatively or qualitatively attempts to describe the scale at which each point was created. See this example of precision codes as part of point data to describe the location of humanitarian and development project locations.
Created by Shadrock on 2014-05-14 08:32:44.
Imported from https://phabricator.ushahidi.com/T248
Comment by Shadrock on 2014-11-26 17:53:27:
Folks: I'd really like to see this get higher priority. To summarize, very bluntly, what I explained in the description above: without this feature, the geographic information provided by an Ushahidi instance will remain largely unusable.
Comment by @rjmackay on 2014-11-27 09:02:33:
ping @anarghya @shadowhand if you want to include/prioritize this somewhere.
Comment by Shadrock on 2014-11-27 16:34:09:
Thanks Robbie. I understand juggling priorities, but I really feel like this is more than a minor detail. Ushahidi prioritizes maps, but they remain largely for show. I'll be curious to see what the others think. I'm happy to contribute to this task however I can to help move it along. Thanks again for the quick response.
Comment by shadowhand on 2014-11-27 17:20:03:
i'm not at all convinced that zoom level is a proper indicator of granularity. fwiw, we currently have multiple kinds of "location" input: lat/lon points, as well as geometries. wouldn't a more accurate approach include a user friendly way to apply either a shape (area) or a point as the location identifier?
Comment by Shadrock on 2014-12-05 16:43:40:
I think I’m not explaining something very well here. Let me try again.
The problem is that zoom (scale) affects point and vector (geometry) data equally: they are both directly tied to the scale at which they are created. Allowing a user to differentiate between when it’s appropriate to use a point to identify something and when it’s appropriate to use geometry certainly is important, but it doesn’t resolve questions about precision. I’ll respond to each point separately:
Allowing user to choose point v. geometry
The discussion about including geometry in Ushahidi began back in 2010 when I started making noise about it during Haiti. At that time, I was working with Ros Sewell and Patrick Meier on the issue ([[ https://www.youtube.com/watch?v=eRdNUAqEiIU | I briefly mention it here ]]) and Ushahidi only allowed points, [[ http://resources.arcgis.com/en/help/getting-started/articles/026n0000000n000000.htm | which is only 1 of 3 primary ways to represent geography data visually ]]. The data I was working with really was most appropriately represented with polygons.
Eventually, geometry was added to the platform, which is great. But that still doesn’t address the issue of precision. I wasn’t the only person to notice this. The need to improve the accuracy of geographic precision and generally improve metadata came up in most internal and external reviews and case studies of that time… [[ http://www.ushahidi.com/2012/03/20/predicting-locations-of-emergency-damage-during-disaster-using-vgi-data/ | including on our own blog ]]. It’s an issue that continues to come up.
Zoom as indicator of precision
If zoom isn’t a proper indicator of granularity what would be? Zoom is, in fact, an accepted standard indicator of granularity in GIS workflows because you can’t map what you can’t see.
Geographers use it all the time as a standard part of quality control for both point and vector (geometry) data. Best practices in GIS metadata creation specifically ask that it be included and [[ http://wiki.openstreetmap.org/wiki/Zoom_levels | OpenStreetMap explicitly recognizes that different levels of zoom correspond to different levels of geographic precision ]].
Simple test: Choose your favorite mapping platform and try to drop a pin on your house from zoom level 5 (no cheating!). Then from zoom level 10. Finally from zoom level 15. Which one is closest? Which one of those pins is more granular in its precision? Now, if you do that same thing in Ushahidi (make a report about your house), the report at zoom level 15 is clearly more precise (e.g. it’s closest to your actual house) but anyone using the data can’t possibly know that since each one of the reports will result in a lat/long: so there’s no way to distinguish which one was made at the greater level of zoom/granularity. I'm getting around this on the [[ http://rni.ushahidi.com/ | RNI deployment ]] right now by using a [[ http://iatistandard.org/codelists/GeographicalPrecision/ | standard (if somewhat subjective) indicator of precision ]] as part of the report form precisely //because the platform doesn't already capture this information//. This is the voice of dogfood talking!
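To put rough numbers on the pin-drop test, here is a minimal sketch of how much ground a single screen pixel covers at each zoom level. It assumes a Web Mercator basemap with standard 256-pixel tiles (the scheme described on the OpenStreetMap zoom levels page); the function name `metres_per_pixel` is made up for illustration and is not part of Ushahidi.

```python
import math

# Equatorial circumference of the WGS84 ellipsoid, in metres.
EARTH_CIRCUMFERENCE_M = 40_075_016.686
# Assumption: the basemap uses standard 256-pixel Web Mercator tiles.
TILE_SIZE_PX = 256


def metres_per_pixel(zoom: int, latitude_deg: float = 0.0) -> float:
    """Approximate ground distance covered by one screen pixel.

    Each zoom level doubles the number of tiles across the map, so the
    distance per pixel halves. The cosine term accounts for the Mercator
    projection stretching the map away from the equator.
    """
    return (
        EARTH_CIRCUMFERENCE_M
        * math.cos(math.radians(latitude_deg))
        / (TILE_SIZE_PX * 2 ** zoom)
    )


# The three zoom levels from the pin-drop test, evaluated at the equator.
for zoom in (5, 10, 15):
    print(zoom, round(metres_per_pixel(zoom), 1))
```

At the equator this works out to roughly 4,900 m per pixel at zoom 5, about 150 m at zoom 10, and under 5 m at zoom 15, so a pin that is a few pixels off at zoom 5 can miss the house by tens of kilometres.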
Note that I’m talking about granularity of precision (i.e. spatial precision) not the accuracy of identifying what something is (e.g. whether or not a building is a hospital) – you can find [[ http://www.colorado.edu/geography/gcraft/notes/error/error_f.html | more on this distinction here ]]. So if your concern is whether or not the report is accurate… you’re correct, zoom level will not help with that. However, zoom level will distinguish those reports that were placed with greater //intention to be precise// (e.g. the user actually zoomed in to try and find the right building versus staying zoomed out at a “city” level).
As both a GIS project manager and a GIS grunt, I’ve had to develop, or follow, rules about the level of zoom that production staff could use to ensure a) the appropriate level of precision and b) uniformity across the data set. This is well documented in GIS literature (academic and trade) under the “minimum mapping unit” and “scale.” So, just to be clear, GIS professionals already use zoom as an indicator of precision all the time. It is an accepted industry best practice (if not standard).
Moreover, as a deployer and volunteer on several Ushahidi instances over the years, I’ve had the opportunity to observe how users interact with the platform and craft instructions specifically addressing these issues. In the vast majority of cases, what I’m saying here holds true: users who are trying to be more precise zoom in farther than those who are trying to be general. Yes, there will be some who zoom in, get frustrated, and just drop a pin anyways, but in my experience this is far less common. This, actually, gets to another issue with Ushahidi: our reliance on other people's basemaps but that's another post...
We’ve tried different things on different deployments. On one (I think it was the Sudan referendum) we gave users a checkbox for granularity: they could choose if their report was at the “city” level; the “neighborhood” level; or “an exact location.” The primary problem is that all of those terms are subjective: there is no one definition of what a city is, much less what that would mean in terms of true, Euclidean, space. Same with “neighborhood.” For somebody in one city it might be a few blocks, while in NYC it might be a whole borough. The other problem was that nobody checked the boxes... just one extra thing to do.
Is this making sense? Sometimes it’s a lot easier to show in person. Perhaps a good argument for me to devote some time to a quick GIS talk at the team meeting… if it wouldn’t bore too many people to death!
Comment by @rjmackay on 2014-12-05 19:39:32:
Thanks for the super write up.
I get that. More so now you've explained.
I guess the follow up to this is.. given a deployment will contain a mix of data collected at various different zoom levels. How do we combine that in a sane way?
I'll happily spend an entire day talking GIS :) There's only so much I can teach myself..
Comment by Shadrock on 2014-12-06 01:17:26:
I totally understand. And I appreciate using Phab as a place to have these discussions: it gives folks an opportunity to plead their case as the devs dig in and make hard choices. So that's what I'm doing here: making my case to y'all.
Well, the short answer is, "we don't: that's the deployer's responsibility." If we can, indeed, include the zoom level capture for each report then end-users of a deployment's data can at least triage and aggregate reports by some indicator (or proxy) of precision. That goes a long way towards helping. After that, task T264 (associated with this one) would probably be the best answer: give deployers a functionality that allows them to set zoom limits that enforce certain levels of precision. There are already other companies out there who are acknowledging this as an important function and allow setting the zoom for viewing or editing ([[ https://www.mapbox.com/tilemill/docs/guides/advanced-map-design/ | Mapbox does some of the more comprehensive work I've seen ]]). @shadowhand has marked T264 to a low priority, which I think is appropriate (even though it's associated) given the release schedule and, frankly, the fact that //eliminating// a user's functionality (zooming in or out) is not a very elegant solution given the constraints of the UI (as I've seen it). On a GIS production line, you don't limit the zoom of the analyst when they're creating point or vector data: they still need to zoom in and out to gain context for what they're doing (esp. important if tracing satellite imagery) but they have the discipline to actually create the data at a given scale. I honestly don't ever expect this to be the case for large Ushahidi deployments, but it's possible when only a few people are geocoding.
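As a rough illustration of what "triage and aggregate by zoom" could look like for someone consuming exported data, here is a minimal sketch. The report structure (dicts with `id` and `zoom` keys), the bucket names, and the thresholds of 10 and 15 are all assumptions made up for this example; a deployer would choose cut-offs that suit their own basemap and context.

```python
from collections import defaultdict


def precision_bucket(zoom):
    """Map a captured zoom level to a coarse precision bucket.

    The thresholds are hypothetical, not an Ushahidi or GIS standard.
    """
    if zoom is None:
        return "unknown"  # e.g. imported data with no zoom recorded
    if zoom >= 15:
        return "fine"
    if zoom >= 10:
        return "medium"
    return "coarse"


def triage(reports):
    """Group report ids by precision bucket."""
    buckets = defaultdict(list)
    for report in reports:
        buckets[precision_bucket(report.get("zoom"))].append(report["id"])
    return dict(buckets)


# Hypothetical exported reports.
reports = [
    {"id": 1, "zoom": 16},
    {"id": 2, "zoom": 4},
    {"id": 3, "zoom": 12},
    {"id": 4},  # zoom was never captured
]
print(triage(reports))
```

Reports without a zoom value land in their own bucket rather than being guessed at, which keeps the triage honest about what is actually known.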
Comment by @rjmackay on 2014-12-08 11:38:33:
That's what worries me. Most deployers don't have much or any GIS knowledge. As much as possible, we need to try and provide sane suggestions and defaults. Obviously collecting zoom level is the first step to be able to do sensible things with it.
I can imagine a few things we could do :
Obviously those still don't work for everyone and could still be hugely confusing :/
@benstoltz this needs some work on both the API and frontend.. if you're keen to have a go at it let me know.
There's a lot of comments but the short version is:
Basic series of tasks:
There is so much more that we can do, to enrich and make the geographical data that we manage more useful. I'm just going to suggest this now:
In the implementation, instead of just adding a column for the zoom level in the database, could we add a column for collection metadata (i.e. in JSON)?
We should try to save the method by which the coordinates were provided. I don't think it's always a click on the map. Sometimes it's GPS, or Geocoding. If it's a click on the map, then, yes, let's capture the zoom level.
and this would work for this issue.
For GPS, and Geocoding we could leave
Also, sometimes, we won't know the source, as it may be coming from an API client that doesn't collect this info (i.e. importing a data set from somewhere else)
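A minimal sketch of what such a collection-metadata value might look like, assuming it is stored as a JSON string alongside the coordinates. The key names (`method`, `zoom`), the method labels, and the helper `collection_metadata` are hypothetical and only illustrate the idea; they are not an existing Ushahidi schema.

```python
import json
from typing import Optional

# Hypothetical labels for how the coordinates were obtained.
KNOWN_METHODS = {"map_click", "gps", "geocoding"}


def collection_metadata(method: str, zoom: Optional[int] = None) -> str:
    """Build a JSON string describing how a location was collected.

    The zoom level is only recorded for map clicks, since it has no
    meaning for GPS fixes or geocoded addresses. Anything unrecognised
    (for example an import from another data set) is stored as "unknown".
    """
    if method not in KNOWN_METHODS:
        method = "unknown"
    metadata = {"method": method}
    if method == "map_click" and zoom is not None:
        metadata["zoom"] = zoom
    return json.dumps(metadata, sort_keys=True)


print(collection_metadata("map_click", 16))
print(collection_metadata("gps"))
print(collection_metadata("csv_import"))
```

Because the value is free-form JSON, further details (GPS accuracy, geocoder used, and so on) could be added later without another schema change.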
Reading through this...
Would asking users with a radio button choice on the location capture field something to the effect of:
Go some way to helping at least, differentiate the location data between 'general' and 'exact' and when the general is selected, the reports get a 'lower priority' in an export or at least an automatic tag that reads something like 'Location Vague'
Exact reports could have a zoom level that is closer and 'vague' or 'general' could have a further out zoom level.
It kind of draws upon this:
But doesn't make an assumption that the user needs to know the definition of city/town/neighbourhood etc. What the new radio button/check box would indicate is the user's degree of comfort in self-declaring how accurate they believe they are. 'Vague' ticked would be a good indicator for any deployment owners to know what they need to follow up on because a user has self-described that they are not sure.
Actually doesn't exist right? We can 'draw' a boundary layer for location that captures a 'bounded area'
This should be in a separate ticket somewhere and should be something of a priority.
I don't quite parse what @tuxpiper is saying in as far as the metadata/JSON stuff but in general, capturing and displaying what the deployment is doing and allowing for deployment owners to learn how to work with that is A++ in my UX books.
Also 100% agree this should be hidden for anon settings
Also geez, like why so much questioning re. zoom level as a measurement of accuracy. It's better than the zilch-o we have now.
In retrospect, my comment probably amounted to a lot of blabber for saying a very simple thing: let's make database storage of geographic data as unconstrained as possible (store JSON)... so that any minute detail about geo data that may come up in the future can be stored there.