Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What's the most Pythonic way for long GMT arguments? #1082

Open
seisman opened this issue Mar 20, 2021 · 9 comments
Open

What's the most Pythonic way for long GMT arguments? #1082

seisman opened this issue Mar 20, 2021 · 9 comments
Labels
discussions Need more discussion before taking further actions feature request New feature wanted help wanted Helping hands are appreciated longterm Long standing issues that need to be resolved
Milestone

Comments

@seisman
Copy link
Member

seisman commented Mar 20, 2021

For each GMT option, PyGMT usually provides a long-form parameter. For example, GMT CLI option -B is equivalent to frame in PyGMT. This is done by the alias system (the use_alias decorator), which automatically converts long-form PyGMT parameters to short-form GMT options. The alias system works well but has known limitations and issues (e.g., #256, #262).

Sometimes, GMT arguments are still specified using long unreadable strings. For example,

As these common options are used a lot, we may take some time to find and implement a more Pythonic way to specify projection, frame et al.

However, there are still a lot of other long GMT arguments. Here is a simple example showing how to plot a GMT logo on maps.

import pygmt

fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 2], projection="X6c", frame=["WSen", "xaf", "yaf"])
fig.logo(position="jTR+o0.3c/0.6c+w3c", box="+p1p+glightblue")
fig.show()

The above Python script is equivalent to the following GMT CLI script:

gmt begin
gmt basemap -R0/10/0/2 -JX6c -B
gmt logo -DjTR+o0.3c/0.6c+w3c -F+p1p+glightblue
gmt end show

The long argument for position and box are really difficult to remember and write. There are definitely some better and more Pythonic ways to specify these arguments. As GMT has too many options, we must provide a universal way to specify these long arguments.

Perhaps the simplest way is to pass these long arguments as dict, i.e.,

fig.logo(
    position={
        location: "jTR",
        offset: [0.3, 0.6],
        width: "3c",
    },
    box={
        pen: "1p",
        fill: "lightblue"
    }
)

Are there any better ways?

@seisman seisman added the question Further information is requested label Mar 20, 2021
@weiji14 weiji14 added this to To do in Pythonic GMT arguments Apr 6, 2021
@weiji14
Copy link
Member

weiji14 commented Apr 6, 2021

This has come up again and again in different forms, so I've started a project at https://github.com/GenericMappingTools/pygmt/projects/6 to track the various issues and PRs tackling this. Let's make this issue the central point to gather all the ideas. Here's my attempt at summarize the various candidates (feel free to edit this comment anyone, or suggest alternatives)

Dictionary

Pass the arguments into the parameter as a Python dictionary. E.g. at #262 (comment), #1082 (comment)

fig.logo(position={location: "jTR", offset: [0.3, 0.6], width: "3c"})

Pros: Appears easy to implement, sort of intuitive for new Python users
Cons: Parameters like location/offset/width cannot be tab-completed, every PyGMT module will need to be updated to handle these dict inputs

Functions

Expand the PyGMT module's keyword argument list to specify every possible parameter. E.g. at #1173 (comment)

fig.logo(position_location="jTR", position_offset=[0.3, 0.6], position_width="3c")

Pros: Can tab-complete inside PyGMT module, more discoverable by new users
Cons: May result in long list of parameters to scroll through, will need to be added to docstring of every module
Note: if it's a one-off thing though, this functional style might be suitable, e.g. for the text 'paragraph' option at #1078 (comment)

Classes

Set up a dedicated PyGMT Python class to handle the parameter/arguments. E.g. at #356 (comment), #262 (comment).

position = pygmt.param.Position(location="jTR", offset=[0.3, 0.6], width="3c")
fig.logo(position=position)

Pros: Can be tab-completed, only need to implement once and be used throughout all PyGMT modules, can plug into existing workflow
Cons: May not be as discoverable by new users, will be a bit harder to program up front


Yes, my preference is for the 'Classes' option, so I'll elaborate on it a bit more. The main idea is to have a params.py file with lots of pygmt.param.Abcde classes (or some other name). E.g.:

Implementation (of pygmt.param.ClassName)

It's been mentioned at #249 (comment) and attempted for e.g. at #379, but I'll summarize the idea again. A class like pygmt.param.Pen for example, will need to have a __str__ method to produce the correct -W argument to be passed into pen:

pen = pygmt.param.Pen(width="1p", color="blue", style="-")
print(pen)
# 1p,blue,-

Since PyGMT has adopted NEP29 #1074 and is Python 3.7+ now, we can use Python dataclasses to implement this (see https://realpython.com/python-data-classes), and this means less boilerplate when coding things up (compared to traditional Python classes).

TLDR: Let's start small-ish, e.g. @willschlitzer's issue on pen (-W) at #1173 will be a great start, and we can work out the implementation details along the way.

@weiji14 weiji14 added this to the 0.5.0 milestone Apr 6, 2021
@weiji14 weiji14 added feature request New feature wanted help wanted Helping hands are appreciated longterm Long standing issues that need to be resolved labels Apr 6, 2021
@seisman
Copy link
Member Author

seisman commented Apr 6, 2021

The "Classes" option sounds the best way, but we need to think about a general way to implement classes for all options.

@maxrjones
Copy link
Member

maxrjones commented Apr 27, 2021

I agree that the classes option is good. I have a couple further opinions about the implementation of long GMT arguments:

  1. Based on this discussion, we will implement full-word modifiers for GMT arguments in the PyGMT library independently of the planned effort to support long-option modifiers in core-GMT (see https://docs.generic-mapping-tools.org/dev/devdocs/long_options.html and the newly added project https://github.com/GenericMappingTools/gmt/projects/1). I agree with this decision but would like to see a system for ensuring consistent attribute names throughout the GMT ecosystem. So far, only GMT.jl has implemented long-format for both GMT options and modifiers (https://www.generic-mapping-tools.org/GMT.jl/dev/quick_learn/#Case-2.-You-are-a-new-GMT-user-or-one-that-wants-to-use-long-verbose-options). My opinion is that it will help GMT sustainability in the long run if all the GMT-related projects are consistent with at least the names of the modifiers even if the implementation is different. Do others think this is something worth tracking?
  2. I would prefer to use a different method than __str__ to format into GMT style, such that print gives user-readable information rather than the sometimes convoluted GMT syntax.

@weiji14
Copy link
Member

weiji14 commented Apr 27, 2021

I agree that the classes option is good. I have a couple further opinions about the implementation of long GMT arguments:

  1. Based on this discussion, we will implement full-word modifiers for GMT arguments in the PyGMT library independently of the planned effort to support long-option modifiers in core-GMT (see https://docs.generic-mapping-tools.org/dev/devdocs/long_options.html and the newly added project https://github.com/GenericMappingTools/gmt/projects/1). I agree with this decision but would like to see a system for ensuring consistent attribute names throughout the GMT ecosystem. So far, only GMT.jl has implemented long-format for both GMT options and modifiers (https://www.generic-mapping-tools.org/GMT.jl/dev/quick_learn/#Case-2.-You-are-a-new-GMT-user-or-one-that-wants-to-use-long-verbose-options). My opinion is that it will help GMT sustainability in the long run if all the GMT-related projects are consistent with at least the names of the modifiers even if the implementation is different. Do others think this is something worth tracking?

Yep, we should definitely have standardized aliases across all GMT wrappers. There was a lot of discussion around this in 2019/2020 on GitHub and the GMT forum (e.g. https://forum.generic-mapping-tools.org/t/standardized-human-readable-gmt-parameters-for-pygmt-matlab-julia-pyshtools-etc/77) but the conversation stalled for some reason, so thanks for picking this up again and starting the project to keep track!

  1. I would prefer to use a different method than __str__ to format into GMT style, such that print gives user-readable information rather than the sometimes convoluted GMT syntax.

This can be arranged. We can have both a __str__ and __repr__ method in the pygmt.param.ClassName class (see https://stackoverflow.com/questions/1436703/what-is-the-difference-between-str-and-repr), one which is human-readable (for the user), and one which is machine-readable (for GMT 😛). I'll work on it in the RFC PR at #1239.

Edit: Actually, Python 3.7 data classes already implements a __repr__ method (see https://realpython.com/python-data-classes/#basic-data-classes). So the current implementation at #1239 does the following:

pen = pygmt.param.Pen(style=".-.-", color="120-1-1", width="0.5c")

str(pen)  #'0.5c,120-1-1,.-.-'
print(pen)  # 0.5c,120-1-1,.-.-
repr(pen)  # "Pen(width='0.5c', color='120-1-1', style='.-.-')"
pen  # Pen(width='0.5c', color='120-1-1', style='.-.-')

For the last one, simply putting pen at the end in a Jupyter cell would give the information-rich __repr__ representation of the Pen object, same as calling repr(pen). Calling str(pen) or print(pen) would give the GMT style string. It sounds though that you prefer it to be the other way around @meghanrjones?

@maxrjones
Copy link
Member

Yep, we should definitely have standardized aliases across all GMT wrappers. There was a lot of discussion around this in 2019/2020 on GitHub and the GMT forum (e.g. https://forum.generic-mapping-tools.org/t/standardized-human-readable-gmt-parameters-for-pygmt-matlab-julia-pyshtools-etc/77) but the conversation stalled for some reason, so thanks for picking this up again and starting the project to keep track!

Cool, this is on the agenda to discuss at the next GMT community meeting. Of course, we could also discuss at the next PyGMT meeting in case anyone is unavailable May 6.

For the last one, simply putting pen at the end in a Jupyter cell would give the information-rich __repr__ representation of the Pen object, same as calling repr(pen). Calling str(pen) or print(pen) would give the GMT style string. It sounds though that you prefer it to be the other way around @meghanrjones?

Yes, but I could be convinced elsewise if others feel strongly that __str__ should be used for the GMT readable formatting.

@leouieda
Copy link
Member

leouieda commented May 6, 2021

Sorry, jumping in here a bit late here. A quick impression is that something like the dictionary approach would be the easiest way to implement but not necessarily the best way to use the library. Mainly because the dictionary arguments won't be tab-completed. Same thing goes for the classes. It would be nice as a developer but I really wouldn't want to have to create separate instances just to pass them as arguments to a function.

This is something I feel is very easy to forget when making software: good APIs optimize usability. So if we forget about the implementation details: How would you want to use PyGMT? Not as a GMT guru but as a Python user who wants to make some nice maps.

As a user, I feel like the easiest thing to use would be just flat function arguments. Then I can type fig.grdimage in a notebook and press tab to see what I have to give it very easily from the signature. The current implementation of our argument conversion makes this very difficult since we have to look up alias names etc. It was relatively easy to implement but it's painful to use in practice. Basically, anything that hinders tab-completion is going to hinder usability.

So if we take something like this:

import pygmt

# Load sample grid and point datasets
grid = pygmt.datasets.load_earth_relief()
points = pygmt.datasets.load_ocean_ridge_points()
# Sample the bathymetry along the world's ocean ridges at specified track points
track = pygmt.grdtrack(points=points, grid=grid, newcolname="bathymetry")

fig = pygmt.Figure()
# Plot the earth relief grid on Cylindrical Stereographic projection, masking land areas
fig.basemap(region="g", projection="Cyl_stere/150/-20/15c", frame=True)
pygmt.makecpt(cmap="gray", series=[0, 800])
fig.grdimage(grid=grid, cmap=True)
fig.coast(land="#666666")
# Plot the sampled bathymetry points using circles (c) of 0.15 cm
# Points are colored using elevation values (normalized for visual purposes)
fig.plot(
    x=track.longitude,
    y=track.latitude,
    style="c0.15c",
    cmap="terra",
    color=(track.bathymetry - track.bathymetry.mean()) / track.bathymetry.std(),
)
fig.show()

As a user, I would want to write something like this instead:

import pygmt

# These lines are fine
grid = pygmt.datasets.load_earth_relief()
points = pygmt.datasets.load_ocean_ridge_points()
track = pygmt.grdtrack(points=points, grid=grid, newcolname="bathymetry")

fig = pygmt.Figure()
fig.basemap(
    region="g", 
    projection=pygmt.CylindricalStereographic(central_longitude=150, central_latitude=-20),  # long names are tab-completed
    width="15c",  # Makes no sense to have this as part of the project. Could also have a default value
    frame=True,  # Should probably be True by default
)
# makecpt is called internally. It's not a name anyone new to GMT would think to use.
fig.grdimage(
    grid=grid, 
    cmap=pygmt.cmap.gray(vmin=0, vmax=800),  # Can tab complete the cmap name and keyword arguments
    # alternatively
    # cmap=pygmt.cmap("gray", vmin=0, vmax=800),  # not as good since can't tab-complete the name but still nice
)
fig.coast(land="#666666")
fig.plot(
    x=track.longitude,
    y=track.latitude,
    style="c",  # It's really confusing to have c0.15c (what are all the cs?)
    size="0.15c",
    cmap="terra",   # This is also OK if you don't need to edit CPT properties
    color=(track.bathymetry - track.bathymetry.mean()) / track.bathymetry.std(),
)
fig.show()

And going back to the logo example:

fig.logo(
    location="TR", 
    offset=[0.3, 0.6], 
    width="3c", 
    justify="default",  # or "mirror" for J
)
# if you want to plot the timestamp
fig.timestamp(location=..., offset=..., ...)

To me, every time multiple options have the same modifier that is an indication that this function is trying to do too many things. The mapping of "GMT module = PyGMT function/method" is not great for usability. We can always internally combine the options into something GMT wants and call the appropriate module. Decoupling the GMT I/O from the API is something that would make this a lot easier on our side but that is a huge amount of work.

You all have been putting in the time here so of course you get to decide and I'm perfectly happy with whatever the group settles on. You have been an awesome group and I wish I had more time to be interact more with you 🙂

@leouieda
Copy link
Member

leouieda commented May 6, 2021

The frame argument is another one that I would break into a dozen separate ones. Passing in a list with crazy strings is not something I've enjoyed recently.

@leouieda
Copy link
Member

leouieda commented May 7, 2021

Been thinking about this some more. The classes option is not a bad idea but there are some things to watch out for:

  1. Going overboard with classes. The Pen example is a good one. "1p,black,-" is quite reasonable in terms of complexity and should be somewhat familiar to matplotlib users (like "-k" in plt.plot). Other things like frames and positioning of boxes are the worse in terms of complexity and difficulty to remember. There are many that fall in between and I'm not sure where to draw the line.
  2. Generating confusion with regards to using classes or strings. Having almost every argument in a function come from a separate class would be quite tedious to use. But having some be strings and others be classes would also make it hard to remember which is which.

As a coder, I would definitely prefer implementing the classes since they would make our job easier. But as a user, I would probably end up passing the crazy GMT strings instead since creating all these different instances would be a bit tedious.

@seisman seisman unpinned this issue Mar 17, 2023
@seisman seisman pinned this issue Jul 4, 2023
@weiji14 weiji14 unpinned this issue Aug 24, 2023
@seisman seisman added discussions Need more discussion before taking further actions and removed question Further information is requested labels Nov 19, 2023
@seisman
Copy link
Member Author

seisman commented Nov 22, 2023

Dictionary

Pass the arguments into the parameter as a Python dictionary. E.g. at #262 (comment), #1082 (comment)

fig.logo(position={location: "jTR", offset: [0.3, 0.6], width: "3c"})

Pros: Appears easy to implement, sort of intuitive for new Python users Cons: Parameters like location/offset/width cannot be tab-completed, every PyGMT module will need to be updated to handle these dict inputs

Now it's possible to tab-complete dict keys, see #2793. It works in VSCode and other IDEs, but I can't make it work in Jupyter notebooks.

Functions

Expand the PyGMT module's keyword argument list to specify every possible parameter. E.g. at #1173 (comment)

fig.logo(position_location="jTR", position_offset=[0.3, 0.6], position_width="3c")

Pros: Can tab-complete inside PyGMT module, more discoverable by new users Cons: May result in long list of parameters to scroll through, will need to be added to docstring of every module Note: if it's a one-off thing though, this functional style might be suitable, e.g. for the text 'paragraph' option at #1078 (comment)

The parameter names are also too long and it means a lot of work for maintainers, as we have to convert position_location="jTR", position_offset=[0.3, 0.6], position_width="3c" to -DjTR+o0.3/0.6+w3c.

Classes

Set up a dedicated PyGMT Python class to handle the parameter/arguments. E.g. at #356 (comment), #262 (comment).

position = pygmt.param.Position(location="jTR", offset=[0.3, 0.6], width="3c")
fig.logo(position=position)

Pros: Can be tab-completed, only need to implement once and be used throughout all PyGMT modules, can plug into existing workflow Cons: May not be as discoverable by new users, will be a bit harder to program up front

I feel we can have both classes and dictionary. We can use classes for common options (e.g., -B, -J) and dictionary for non-common options (e.g., text's -M option).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussions Need more discussion before taking further actions feature request New feature wanted help wanted Helping hands are appreciated longterm Long standing issues that need to be resolved
Projects
Development

No branches or pull requests

4 participants