Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap grdcut #492

Merged
merged 12 commits into from
Jun 28, 2020
Merged

Wrap grdcut #492

merged 12 commits into from
Jun 28, 2020

Conversation

seisman
Copy link
Member

@seisman seisman commented Jun 24, 2020

Description of proposed changes

Wrap the grdcut function, which is the first step is to redesign
the load_earth_relief() function (see #489).

Ideally it should take a grid file or a xarray.DataArray as input, and
store output as a grid file or return a xarray.DataArray.

Supported features:

  • input: grid file; output: grid file
  • input: grid file; output: xarray.DataArray
  • input: xarray.DataArray; output: grid file
  • input: xarray.DataArray; output: xarray.DataArray

Passing xarray.DataArray to grdcut is not implenmented in the
GMT 6.0.0. See GenericMappingTools/gmt#3532 for
a feature request to the upstream GMT.

The long-form names of the grdcut function come from GMT.jl, but they're open for better naming.

The implementation of the grdcut() function is very similar to the surface() function.

Reminders

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst.
  • Write detailed docstrings for all functions/methods.
  • If adding new functionality, add an example to docstrings or tutorials.

Wrap the `grdcut` function.

Ideally it should take a grid file or a `xarray.DataArray` as input, and
store output as a grid file or return a `xarray.DataArray`.

Supported features:

- [X] input: grid file; output: grid file
- [X] input: grid file; output: `xarray.DataArray`
- [ ] input: `xarray.DataArray`; output: grid file
- [ ] input: `xarray.DataArray`; output: `xarray.DataArray`

Passing `xarray.DataArray` to `grdcut` is not implenmented in the
GMT 6.0.0. See GenericMappingTools/gmt#3532 for
a feature request to the upstream GMT.
@weiji14 weiji14 added feature request New feature wanted feature Brand new feature and removed feature request New feature wanted labels Jun 24, 2020
@seisman seisman requested review from weiji14 and leouieda June 24, 2020 22:50
grdinfo in GMT 6.1.0 reports the grid registration in the last column,
while GMT 6.0.0 doesn't.
@weiji14
Copy link
Member

weiji14 commented Jun 24, 2020

This is good work @seisman! It was on my TODO list a while back, but I realize that cropping a Cartesian bounding box could be done easily enough in xarray using xr.sel. I do know that grdcut handles more than just rectangular Cartesian cuts though (e.g. circular cuts -S, and Geographic ones), and it would be good to support those too sorry, didn't realize you aliased -S already, all good!

pygmt/modules.py Outdated Show resolved Hide resolved
pygmt/modules.py Outdated Show resolved Hide resolved
@vercel vercel bot temporarily deployed to Preview June 24, 2020 23:09 Inactive
Co-authored-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
@vercel vercel bot temporarily deployed to Preview June 24, 2020 23:10 Inactive
@seisman
Copy link
Member Author

seisman commented Jun 24, 2020

I will open a feature request or a bug report to remind us that grdcut doesn't support xarray.DataArray due to a missing feature from upstream GMT.

@seisman
Copy link
Member Author

seisman commented Jun 24, 2020

Do you have any idea why the test test_grdcut_dataarray_in_fail fails?

@seisman
Copy link
Member Author

seisman commented Jun 24, 2020

There is one more failure on Windows:

    def test_grdcut_file_in_dataarray_out():
        "grdcut an input grid file, and output as DataArray"
>       outgrid = grdcut("@earth_relief_01d", region="0/180/0/90")

C:\Miniconda\envs\testing\Lib\site-packages\pygmt\tests\test_grdcut.py:27: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
C:\Miniconda\envs\testing\Lib\site-packages\pygmt\helpers\decorators.py:194: in new_module
    return module_func(*args, **kwargs)
C:\Miniconda\envs\testing\Lib\site-packages\pygmt\helpers\decorators.py:304: in new_module
    return module_func(*args, **kwargs)
C:\Miniconda\envs\testing\Lib\site-packages\pygmt\modules.py:251: in grdcut
    return result
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pygmt.helpers.tempfile.GMTTempFile object at 0x000001FD000AA250>
args = (None, None, None)

    def __exit__(self, *args):
        if os.path.exists(self.name):
>           os.remove(self.name)
E           PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\VSSADM~1\\AppData\\Local\\Temp\\pygmt-llrbi2c3.nc'

You use

pygmt/pygmt/gridding.py

Lines 90 to 94 in 516e799

if outfile == tmpfile.name: # if user did not set outfile, return DataArray
with xr.open_dataarray(outfile) as dataarray:
result = dataarray.load()
elif outfile != tmpfile.name: # if user sets an outfile, return None
result = None

while I use

        if outgrid == tmpfile.name:  # if user did not set outgrid, return DataArray
            result = xr.open_dataarray(outgrid)
        else:
            result = None  # if user sets an outgrid, return None

Is it the reason?

@weiji14
Copy link
Member

weiji14 commented Jun 24, 2020

You use

pygmt/pygmt/gridding.py

Lines 90 to 94 in 516e799

if outfile == tmpfile.name: # if user did not set outfile, return DataArray
with xr.open_dataarray(outfile) as dataarray:
result = dataarray.load()
elif outfile != tmpfile.name: # if user sets an outfile, return None
result = None

while I use

        if outgrid == tmpfile.name:  # if user did not set outgrid, return DataArray
            result = xr.open_dataarray(outgrid)
        else:
            result = None  # if user sets an outgrid, return None

Is it the reason?

Yes, see #245 (comment) and commit 3324235, the comment which reads:

The xarray.Dataset may sometimes be lazily loaded, so we use .load() to force loading of the data. See e.g. https://github.com/pydata/xarray/blame/9f4474d657193f1c7c9aac25bb2edf94755a8593/doc/io.rst#L169-L170. This is important because we want to persist the data even after the tempfile is deleted.

@seisman
Copy link
Member Author

seisman commented Jun 24, 2020

Do you have any idea why the test test_grdcut_dataarray_in_fail fails?

Never mind. I just forgot to push my changes.

pygmt/modules.py Outdated Show resolved Hide resolved
pygmt/__init__.py Outdated Show resolved Hide resolved
doc/api/index.rst Outdated Show resolved Hide resolved
Co-authored-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
@vercel vercel bot temporarily deployed to Preview June 25, 2020 00:41 Inactive
Co-authored-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
@vercel vercel bot temporarily deployed to Preview June 26, 2020 05:26 Inactive
@seisman seisman requested a review from weiji14 June 26, 2020 05:27
Copy link
Member

@weiji14 weiji14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks for working on this! I think it's fine to merge this now, and we can refactor later. Just one comment, which is more about integrating all the other grid related PRs we have open.

lib.call_module("grdcut", arg_str)

if outgrid == tmpfile.name: # if user did not set outgrid, return DataArray
with xr.open_dataarray(outgrid) as dataarray:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After starting #494, I'm starting to wonder if we should have a pygmt.open function to load NetCDF files with the node_offset attribute inside the xarray.DataArray, as needed if we're going to start chaining operations as you mentioned in #476 (comment).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the GMT C API correctly, GMT has its own data types (internally) for grids, CPT and data tables. It also provides API functions GMT_Read_Data and GMT_Write_Data, and two special modules gmtread and gmtwrite which I presume can read and write any GMT data types. Perhaps in the future we need to wrap these GMT data types. In this case, pygmt.read() and pygmt.write() may make more sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Brand new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants