Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function to load raster tile maps using contextily #2125

Merged
merged 35 commits into from
Mar 2, 2023

Conversation

weiji14
Copy link
Member

@weiji14 weiji14 commented Sep 19, 2022

Description of proposed changes

New dataset function to load XYZ tiles! Uses contextily to retrieve the tiles based on a bounding box.

Preview at https://pygmt-dev--2125.org.readthedocs.build/en/2125/api/generated/pygmt.datasets.load_tile_map.html

This is the first part of #2115 that does the tile loading and georeferencing to an xarray.DataArray. A follow up PR will be needed to take the output xarray.DataArray from pygmt.datasets.load_tile_map and plot it using fig.grdimage (to be discussed in #2115).

Usage:

import contextily
from pygmt.datasets import load_tile_map

raster = load_tile_map(
    region=[103.60, 104.06, 1.22, 1.49],  # West, East, South, North
    source=contextily.providers.Stamen.TerrainBackground,
    lonlat=True,  # bounding box coordinates are longitude/latitude
)

print(raster)
# <xarray.DataArray (band: 3, y: 1024, x: 1536)>
# array([[[208, 208, 205, ..., 210, 213, 213],
#         [208, 208, 205, ..., 213, 213, 213],
#         [205, 205, 204, ..., 213, 213, 213],
#         ...,
#         [153, 153, 153, ..., 153, 153, 153],
#         [153, 153, 153, ..., 153, 153, 153],
#         [153, 153, 153, ..., 153, 153, 153]],
# 
#        [[214, 214, 213, ..., 217, 218, 218],
#         [214, 214, 213, ..., 219, 220, 219],
#         [213, 213, 212, ..., 219, 220, 219],
#         ...,
#         [179, 179, 179, ..., 179, 179, 179],
#         [179, 179, 179, ..., 179, 179, 179],
#         [179, 179, 179, ..., 179, 179, 179]],
# 
#        [[175, 175, 174, ..., 186, 185, 185],
#         [175, 175, 174, ..., 188, 188, 188],
#         [174, 174, 171, ..., 188, 188, 188],
#         ...,
#         [204, 204, 204, ..., 204, 204, 204],
#         [204, 204, 204, ..., 204, 204, 204],
#         [204, 204, 204, ..., 204, 204, 204]]], dtype=uint8)
# Coordinates:
#   * band     (band) int64 0 1 2
#   * y        (y) float64 1.49 1.49 1.489 1.489 1.489 ... 1.221 1.221 1.22 1.22
#   * x        (x) float64 103.6 103.6 103.6 103.6 ... 104.1 104.1 104.1 104.1

Alternatively, they can use a custom Web Map Tile url like so:

raster = load_tile_map(
    region=[103.60, 104.06, 1.22, 1.49],  # West, East, South, North
    source="https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png",
    lonlat=True,  # bounding box coordinates are longitude/latitude
)
print(raster)
# <xarray.DataArray (band: 3, y: 1024, x: 1536)>
# array([[[174, 174, 174, ..., 200, 200, 200],
#         [181, 188, 181, ..., 200, 200, 200],
#         [201, 196, 197, ..., 200, 200, 200],
#         ...,
#         [170, 170, 170, ..., 170, 170, 170],
#         [170, 170, 170, ..., 170, 170, 170],
#         [170, 170, 170, ..., 170, 170, 170]],
# 
#        [[223, 223, 223, ..., 215, 215, 215],
#         [225, 227, 225, ..., 215, 215, 215],
#         [215, 196, 203, ..., 215, 215, 215],
#         ...,
#         [211, 211, 211, ..., 211, 211, 211],
#         [211, 211, 211, ..., 211, 211, 211],
#         [211, 211, 211, ..., 211, 211, 211]],
# 
#        [[163, 163, 163, ..., 171, 171, 171],
#         [170, 178, 170, ..., 171, 171, 171],
#         [199, 196, 195, ..., 171, 171, 171],
#         ...,
#         [223, 223, 223, ..., 223, 223, 223],
#         [223, 223, 223, ..., 223, 223, 223],
#         [223, 223, 223, ..., 223, 223, 223]]], dtype=uint8)
# Coordinates:
#   * band     (band) int64 0 1 2
#   * y        (y) float64 1.663e+05 1.663e+05 1.663e+05 ... 1.272e+05 1.272e+05
#   * x        (x) float64 1.153e+07 1.153e+07 1.153e+07 ... 1.158e+07 1.158e+07

References:

Addresses first part of #2115

Reminders

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst.
  • Write detailed docstrings for all functions/methods.
  • If wrapping a new module, open a 'Wrap new GMT module' issue and submit reasonably-sized PRs.
  • If adding new functionality, add an example to docstrings or tutorials.

Slash Commands

You can write slash commands (/command) in the first line of a comment to perform
specific operations. Supported slash commands are:

  • /format: automatically format and lint the code
  • /test-gmt-dev: run full tests on the latest GMT development version

New dataset function to load XYZ tiles! Uses contextily to retrieve the tiles based on a bounding box. Included an example doctest that shows how the map tiles can be loaded into an xarray.DataArray. Added a new section in the API docs and intersphinx mappings for contextily and xyzservices.
@weiji14 weiji14 added the feature Brand new feature label Sep 19, 2022
@weiji14 weiji14 added this to the 0.8.0 milestone Sep 19, 2022
@weiji14 weiji14 self-assigned this Sep 19, 2022
Can't assume that the input bounding box (which can be in longitude/latitude) is the same as the returned extent (which is always in EPSG:3857).
To fix pylint `C0103: Argument name "ll" doesn't conform to snake_case naming style (invalid-name)`.
Remove the extent and _image variables to prevent `R0914: Too many local variables (17/15) (too-many-locals)`.
Let the Continuous Integration tests run with `contextily`, include it in pyproject.toml and environment.yml, and document it in `doc/install.rst` as an optional dependency.
import numpy as np
import xarray as xr

__doctest_requires__ = {("load_map_tiles"): ["contextily"]}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a nice way to skip the doctest when contextily is not installed (e.g. in the Python 3.8 / NumPy 1.21 tests) using https://github.com/astropy/pytest-doctestplus/tree/v0.12.1#doctest-dependencies. Thanks to #1790 for adding pytest-doctestplus!

doc/api/index.rst Outdated Show resolved Hide resolved
pygmt/datasets/map_tiles.py Outdated Show resolved Hide resolved
pygmt/datasets/map_tiles.py Outdated Show resolved Hide resolved
@seisman seisman modified the milestones: 0.8.0, 0.9.0 Dec 11, 2022
@weiji14 weiji14 mentioned this pull request Dec 20, 2022
65 tasks
@yvonnefroehlich
Copy link
Member

yvonnefroehlich commented Dec 25, 2022

This PR was bumped back to the 0.9.0 milestone by @seisman two weeks ago. But @weiji14 you added this PR to the list of priority issues/PRs for release v.0.8.0 some days ago. Currently, this PR is still in draft status and there are some conflicts with the main branch. Do you feel you can finish this PR and there is enough time for reviewers to look at it 🙂?

@weiji14 weiji14 marked this pull request as ready for review December 28, 2022 01:44
@weiji14
Copy link
Member Author

weiji14 commented Dec 28, 2022

This PR was bumped back to the 0.9.0 milestone by @seisman two weeks ago. But @weiji14 you added this PR to the list of priority issues/PRs for release v.0.8.0 some days ago. Currently, this PR is still in draft status and there are some conflicts with the main branch. Do you feel you can finish this PR and there is enough time for reviewers to look at it slightly_smiling_face?

Thanks for the ping. I've resolved the conflicts and removed the draft status. No strong opinion on whether to have it in v0.8.0 or v0.9.0, if it takes too long to review, we can merge it next year (2023).

@yvonnefroehlich
Copy link
Member

This PR was bumped back to the 0.9.0 milestone by @seisman two weeks ago. But @weiji14 you added this PR to the list of priority issues/PRs for release v.0.8.0 some days ago. Currently, this PR is still in draft status and there are some conflicts with the main branch. Do you feel you can finish this PR and there is enough time for reviewers to look at it slightly_smiling_face?

Thanks for the ping. I've resolved the conflicts and removed the draft status. No strong opinion on whether to have it in v0.8.0 or v0.9.0, if it takes too long to review, we can merge it next year (2023).

Hm. As this PR was in draft status, I thought you plan more changes here. But if it is already ready for review let's see, if people
can have a look at it.

@seisman
Copy link
Member

seisman commented Feb 27, 2023

Perhaps we should convert the returned image to longitude/latitude because it's more commonly used than the Spherical Mercator coordinate system, following https://contextily.readthedocs.io/en/latest/warping_guide.html.

Most tiles are originally served as Web Mercator (EPSG:3857), and reprojecting to latlon (EPSG:4326) would result in distortion. I'd prefer the reprojection to be a user controlled step (e.g. using rioxarray's .rio.to_crs), because if latlon isn't what the user wants, there would be double reprojection (EPSG:3857 ->EPSG:4326 ->EPSG:????) which is less accurate than single reprojection (EPSG:3857 ->EPSG:????).

What about having an option so that users can decide on the desired projection and don't have to learn the syntax of the rioxarray package?

Maybe in a separate PR?

This PR looks good to me, but I'm still unsure about the default coordinate system of the returned grid. Currently it's EPSG:3857. Are we going to change the default coordinate system to EPSG:4326 if the option mentioned above is implemented?

Comment on lines 35 to 44
source : xyzservices.TileProvider or str
Optional. The tile source: web tile provider or path to a local file.
The web tile provider can be in the form of a
:class:`xyzservices.TileProvider` object or a URL. The placeholders for
the XYZ in the URL need to be {x}, {y}, {z}, respectively. For local
file paths, the file is read with :doc:`rasterio <rasterio:index>` and
all bands are loaded into the basemap. IMPORTANT: tiles are assumed to
be in the Spherical Mercator projection (EPSG:3857). [Default is
``xyzservices.providers.Stamen.Terrain``, i.e. Stamen Terrain web
tiles].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, xyzservices.TileProvider is linked to its API documentation (https://xyzservices.readthedocs.io/en/stable/api.html#xyzservices.TileProvider) which is very technical.

Perhaps we should add a link to https://contextily.readthedocs.io/en/latest/intro_guide.html#Providers so that users can quickly see the list of all available providers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've linked to https://contextily.readthedocs.io/en/stable/providers_deepdive.html which seems a bit better (actually mentioned before in #2125 (comment)) and split out the three possible source options as bullet points. Done in commit aa2d5fb. Take a look to see if it's ok.

@weiji14
Copy link
Member Author

weiji14 commented Mar 2, 2023

Perhaps we should convert the returned image to longitude/latitude because it's more commonly used than the Spherical Mercator coordinate system, following https://contextily.readthedocs.io/en/latest/warping_guide.html.

Most tiles are originally served as Web Mercator (EPSG:3857), and reprojecting to latlon (EPSG:4326) would result in distortion. I'd prefer the reprojection to be a user controlled step (e.g. using rioxarray's .rio.to_crs), because if latlon isn't what the user wants, there would be double reprojection (EPSG:3857 ->EPSG:4326 ->EPSG:????) which is less accurate than single reprojection (EPSG:3857 ->EPSG:????).

What about having an option so that users can decide on the desired projection and don't have to learn the syntax of the rioxarray package?

Maybe in a separate PR?

This PR looks good to me, but I'm still unsure about the default coordinate system of the returned grid. Currently it's EPSG:3857. Are we going to change the default coordinate system to EPSG:4326 if the option mentioned above is implemented?

Even with the option to reproject to EPSG:4326, I still think the default should be EPSG:3857 since most web tiles are served using EPSG:3857 as mentioned in #2125 (comment). We shouldn't be reprojecting (and causing a distortion in the pixels) by default to EPSG:4326.

@seisman seisman added final review call This PR requires final review and approval from a second reviewer and removed needs review This PR has higher priority and needs review. labels Mar 2, 2023
pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
@michaelgrund
Copy link
Member

Will be a great improvement!

Co-authored-by: Michael Grund <23025878+michaelgrund@users.noreply.github.com>
@weiji14
Copy link
Member Author

weiji14 commented Mar 2, 2023

Thanks everyone! I'll leave this open for another day in case there's anything else.

Comment on lines 32 to 33
zoom : int
Optional. Level of detail. [Default is 'auto'].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, from the default it seems like that there is also string input possible.
Currently, I am wondering what "level of detail" means. Do users have to pass a percentage value to zoom? Or is zoom something similar we have for rivers of pygmt.Figure.coast? Maybe we should also state what "auto" actually means?

Suggested change
zoom : int
Optional. Level of detail. [Default is 'auto'].
zoom : int or str
Optional. Level of detail [Default is ``"auto"``].

Copy link
Member Author

@weiji14 weiji14 Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Yes, I did wonder if people actually knew what to use for zoom! The zoom here refers to an integer from 0 to 22, where 0 is the most zoomed out (i.e. the whole globe), and 22 is the most zoomed in (i.e. you can see buildings. See https://docs.mapbox.com/help/glossary/zoom-level or https://leafletjs.com/examples/zoom-levels for a good definition.

There is a small complication in that not all tile sources supports zoom up to level 22, so it's a bit hard to just document something like "Use an integer between 0 and 22 (inclusive)". But let me try adding a better description of this zoom parameter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, more detail added in 9f14d31. Let me know if that reads ok!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that the description of zoom should be expanded a little bit since in my opinion this is a great (GMT-) improvement for people doing work on small scale levels such as cities or communities and want to have geographic content (streets, buidldings, parks etc.) in the background as reference. I missed such things in the past and used other libraries or created customized stuff. Definitely a feature we should highlight in the next release! Great work @weiji14 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @weiji14 for expanding the docstring for zoom! Now it is clearer how to use this parameter 🙂.

Co-Authored-By: Yvonne Fröhlich <94163266+yvonnefroehlich@users.noreply.github.com>
Copy link
Member

@michaelgrund michaelgrund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description of zoom should be sufficient to understand what the parameter does.

pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
Co-authored-by: Michael Grund <23025878+michaelgrund@users.noreply.github.com>
pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
pygmt/datasets/tile_map.py Outdated Show resolved Hide resolved
Co-authored-by: Yvonne Fröhlich <94163266+yvonnefroehlich@users.noreply.github.com>
@weiji14 weiji14 merged commit f379164 into main Mar 2, 2023
@weiji14 weiji14 deleted the contextily/load_map_tiles branch March 2, 2023 22:31
@weiji14 weiji14 removed the final review call This PR requires final review and approval from a second reviewer label Mar 2, 2023
@seisman
Copy link
Member

seisman commented Mar 3, 2023

The doctest fails for the CI run with Windows + Python 3.11 + NumPy 1.24 (see https://github.com/GenericMappingTools/pygmt/actions/runs/4318832687/jobs/7537596112), but passes on all other CI runs.

_______________ [doctest] pygmt.datasets.tile_map.load_tile_map _______________
094     >>> import contextily
095     >>> from pygmt.datasets import load_tile_map
096     >>> raster = load_tile_map(
097     ...     region=[103.60, 104.06, 1.22, 1.49],  # West, East, South, North
098     ...     source=contextily.providers.Stamen.TerrainBackground,
099     ...     lonlat=True,  # bounding box coordinates are longitude/latitude
100     ... )
101     >>> raster.sizes
102     Frozen({'band': 3, 'y': 1024, 'x': 1536})
103     >>> raster.coords
Differences (unified diff with -expected +actual):
    @@ -1,4 +1,4 @@
     Coordinates:
    -  * band     (band) int64 0 1 2
    -  * y        (y) float64 1.663e+05 1.663e+05 1.663e+05 ... 1.272e+05 ...
    -  * x        (x) float64 1.153e+07 1.153e+07 1.153e+07 ... 1.158e+07 ...
    +  * band     (band) int32 0 1 2
    +  * y        (y) float64 1.663e+05 1.663e+05 1.663e+05 ... 1.272e+05 1.272e+05
    +  * x        (x) float64 1.153e+07 1.153e+07 1.153e+07 ... 1.158e+07 1.158e+07

@weiji14
Copy link
Member Author

weiji14 commented Mar 3, 2023

The doctest fails for the CI run with Windows + Python 3.11 + NumPy 1.24 (see https://github.com/GenericMappingTools/pygmt/actions/runs/4318832687/jobs/7537596112), but passes on all other CI runs.

_______________ [doctest] pygmt.datasets.tile_map.load_tile_map _______________
094     >>> import contextily
095     >>> from pygmt.datasets import load_tile_map
096     >>> raster = load_tile_map(
097     ...     region=[103.60, 104.06, 1.22, 1.49],  # West, East, South, North
098     ...     source=contextily.providers.Stamen.TerrainBackground,
099     ...     lonlat=True,  # bounding box coordinates are longitude/latitude
100     ... )
101     >>> raster.sizes
102     Frozen({'band': 3, 'y': 1024, 'x': 1536})
103     >>> raster.coords
Differences (unified diff with -expected +actual):
    @@ -1,4 +1,4 @@
     Coordinates:
    -  * band     (band) int64 0 1 2
    -  * y        (y) float64 1.663e+05 1.663e+05 1.663e+05 ... 1.272e+05 ...
    -  * x        (x) float64 1.153e+07 1.153e+07 1.153e+07 ... 1.158e+07 ...
    +  * band     (band) int32 0 1 2
    +  * y        (y) float64 1.663e+05 1.663e+05 1.663e+05 ... 1.272e+05 1.272e+05
    +  * x        (x) float64 1.153e+07 1.153e+07 1.153e+07 ... 1.158e+07 1.158e+07

Hmm, I don't see why the band coordinate would be int32 on Windows but int64 on Linux. Also the fact that it's only on Windows+NumPy 1.24? To be honest though, that dtype should really just be uint8.

The y and x coordinates are of different lengths, but the numbers look the same. Not sure what we can do about that, but let me open up a patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Brand new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants