Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test code: how to download data for a bounding box #1

Closed
cbeddow opened this issue Jun 17, 2021 · 7 comments
Closed

Test code: how to download data for a bounding box #1

cbeddow opened this issue Jun 17, 2021 · 7 comments

Comments

@cbeddow
Copy link
Member

cbeddow commented Jun 17, 2021

Please take a look at the below Python code and ask any questions or make suggestions to improve the code. This is not our real code, but we will want to make a function that appears like:

get_data(bbox,type,filter)

In this case bbox is the geographic bounding area, type is the API endpoint (points, traffic signs, images, sequences), and filter would likely want to use a format with an operator (>=, ==, <, contains, etc) for the properties of the returned data, for example a map feature in the example code matching the value for a street light, but we could filter also by date > than a given input.

We also want to consider a function that would save the data to file, maybe like saveGeoJSON(path).

import mercantile, mapbox_vector_tile, requests, json
from vt2geojson.tools import vt_bytes_to_geojson

# define an empty geojson as output
output= { "type": "FeatureCollection", "features": [] }

# vector tile endpoints -- change this in the API request to reference the correct endpoint
tile_points = 'mly_map_feature_point'
tile_traffic_signs = 'mly_map_feature_traffic_sign'

# Mapillary access token -- user should provide their own
access_token = 'MLY|XXX'

# a bounding box in [east_lng,_south_lat,west_lng,north_lat] format
east, south, west, north = [-80.13423442840576,25.77376933762778,-80.1264238357544,25.788608487732198]

# list of values to filter for and keep -- update this if changing to traffic signs
filter_values = ['object--support--utility-pole','object--street-light']

# get the tiles with x and y coordinates which intersect our bounding box, MUST be at zoom level 14 where the data is available
tiles = list(mercantile.tiles(east, south, west, north, 14))

# loop through all tiles to get IDs of Mapillary data
for tile in tiles:
    tile_url = 'https://tiles.mapillary.com/maps/vtp/{}/2/{}/{}/{}?access_token={}'.format(tile_points,tile.z,tile.x,tile.y,access_token)
    response = requests.get(tile_url)
    data = vt_bytes_to_geojson(response.content, tile.x, tile.y, tile.z)

    # filter for only features matching the values in filter list above
    filtered_data = [feature for feature in data['features'] if feature['properties']['value'] in filter_values]

    # check if features are inside bbox, and push to output geojson object if yes
    for feature in filtered_data:
        if (feature['geometry']['coordinates'][0] > east and feature['geometry']['coordinates'][0] < west)\
        and (feature['geometry']['coordinates'][1] > south and feature['geometry']['coordinates'][1] < north):
            output['features'].append(feature)

# save a local geojson with the filtered data
with open('mydata.geojson', 'w') as f:
    json.dump(output, f)
@OmarMuhammedAli
Copy link
Contributor

OmarMuhammedAli commented Jun 20, 2021

Hey @cbeddow! I have a couple of questions:

  1. Could you please elaborate on the filter parameter of the get_data function? And can you please explain the filtering criteria in the following snippet?
# list of values to filter for and keep -- update this if changing to traffic signs
filter_values = ['object--support--utility-pole','object--street-light']
  1. What does zoom level mean? and why does it have to be 14?
  2. If we already get the bounded tiles here
# get the tiles with x and y coordinates which intersect our bounding box, MUST be at zoom level 14 where the data is available
tiles = list(mercantile.tiles(east, south, west, north, 14))

Why do we need to check if the features exist within the bbox again here?

 # check if features are inside bbox, and push to output geojson object if yes
    for feature in filtered_data:
        if (feature['geometry']['coordinates'][0] > east and feature['geometry']['coordinates'][0] < west)\
        and (feature['geometry']['coordinates'][1] > south and feature['geometry']['coordinates'][1] < north):
            output['features'].append(feature)

@cbeddow
Copy link
Member Author

cbeddow commented Jun 22, 2021

@OmarMuhammedAli

  1. in a get_data function, a request will be made to the Mapillary API seeking all data within a specified boundary. If this is sent to the map feature tile endpoint, then the data will include the following: street lamps, utility poles, crosswalks, trash bins, shop signs, fire hydrants. Others will be included also. These each have a technical tag name such as object--street-light. I will get a list of these for us. So the use case here is a user with filter_values = ['object--support--utility-pole','object--street-light'] in a list, could pass this to the function, something like get_data(filter=filter_values) and return only utility poles and street lights, rather than returning all classes of data if left unspecified.

  2. Zoom level refers to the zoom level of a tile service. I recommend watching this video to take a look: https://www.youtube.com/watch?v=savQWL0kq_g

So there are 22 zoom levels on a web map. Zoom level=1 (or maybe 0) is a world map scale, which zoom level 14 is roughly a city-level. Check out this link: https://mapsam.com/map/#14/41.0016/29.0348

In this URL, the format is map/#ZOOM/LONGITUDE/LATITUDE. When you load it, the zoom is at level 14. See the red squares on the map, each of these is a level 14 tile. If you zoom in or out it changes, so test how it looks with a level 5 tile, and a level 18 tile.

When we access the raw data of the tile for this Python package, it will always be at level 14 (or the server may reject the call and say no data exists). So If we want to download all data for Istanbul city, we need to probably download data from multiple tiles, each at level 14 size.

For example, we can download the geojson here, which has a polygon representing Istanbul: https://spelunker.whosonfirst.org/id/85679237/

Or we can use the bounding box of Istanbul: [27.972623,40.801277,29.915492,41.585109]

Either of this should be given to the function, so we can say:

`get_data(location=[27.972623,40.801277,29.915492,41.585109],filter=['object--street-light'])` 

Then we would have a returned function that is a geojson point data format.

  1. In the first function, mercantile.tiles() will get all the zoom 14 tiles that intersect the input bounding box. So if our bounding box covers Istanbul, some of the zoom 14 tiles returned may only partially intersect our bounding box. So when we do the check to see if the features exist, we want to eliminate the features that are part of a tile which intersects the box, but where the actual feature lies outside of the box.

image

For example, in this image if the red is out bounding box, then the tiles with color are the ones which intersect it. We only want to keep point features that fall inside the yellow area, and not the orange area despite those tiles intersecting the bbox, so this second check would filter the partially intersecting tile features.

@Rubix982
Copy link
Contributor

Rubix982 commented Jun 24, 2021

@cbeddow

Acknowledged and read. I watched the lecture you mentioned above and got through that as well. For now, all this makes sense to me, but I'll definitely ask if I have more questions. Thank you!

@OmarMuhammedAli
Copy link
Contributor

@cbeddow All clear now. Thank you for the thorough answers!

@Rubix982
Copy link
Contributor

@OmarMuhammedAli should I close this?

@OmarMuhammedAli
Copy link
Contributor

@OmarMuhammedAli should I close this?

Leave it open as a reference for #18 and #19

@Rubix982
Copy link
Contributor

Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

No branches or pull requests

3 participants