Adds a validate cli command #151

gadomski · 2021-06-23T22:10:51Z

This is a very simple cli command that runs validate (for Items) or validate_all (for Collections and Catalogs). There is also an --only flag to only run validate for container objects.

The output is a simple success message, or the exception message and source. This should hopefully help folks track down validation errors in existing STAC objects that can be hard to parse from exception output.

cholmes

Looks great to me! (But I'm just looking at the CLI commands; I don't code Python.)

gadomski · 2021-06-24T15:29:32Z

Sweet, thanks! I'm going to leave this open for now in case we wanted to add more superpowers (e.g. to help debug #124 and friends). Right now it's pretty naive (e.g. it probably won't give you pretty output if it can't read all linked children).

cholmes · 2021-06-24T15:33:16Z

Yeah, it'll help with my issues for sure. I'm sure I'll have more feedback as I use it, but seems like a good iterative step.

cholmes · 2021-06-28T16:15:49Z

Ok, just tried out the validate command and got an error that didn't show which line it came from. It'd be ideal if the validate command always showed the exact line of the exact file containing the link that generated the error.

(venv) cholmes@C02Y3151JHD3 stactools-pete % ./scripts/stac validate /Users/cholmes/Repos/planet-orders/new-stacs/planet-stac/collection.json 
Traceback (most recent call last):
  File "/opt/salt/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/salt/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/stactools/cli/__main__.py", line 4, in <module>
    run_cli()
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/stactools/cli/cli.py", line 40, in run_cli
    cli(prog_name='stac')
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/stactools/cli/commands/validate.py", line 24, in validate_command
    object.validate_all()
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/catalog.py", line 777, in validate_all
    self.validate()
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/stac_object.py", line 58, in validate
    return pystac.validation.validate(self)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/validation/__init__.py", line 33, in validate
    stac_dict=stac_object.to_dict(),
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/collection.py", line 537, in to_dict
    d = super().to_dict(include_self_link)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/catalog.py", line 459, in to_dict
    "links": [link.to_dict() for link in links],
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/catalog.py", line 459, in <listcomp>
    "links": [link.to_dict() for link in links],
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/link.py", line 249, in to_dict
    d["href"] = self.get_href()
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/link.py", line 121, in get_href
    if href and is_absolute_href(href) and self.owner and self.owner.get_root():
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/stac_object.py", line 202, in get_root
    root_link.resolve_stac_object()
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/link.py", line 213, in resolve_stac_object
    obj = stac_io.read_stac_object(target_href, root=root)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/stac_io.py", line 225, in read_stac_object
    d = self.read_json(source, *args, **kwargs)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/stac_io.py", line 196, in read_json
    txt = self.read_text(source, *args, **kwargs)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/stac_io.py", line 278, in read_text
    return self.read_text_from_href(href)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/stactools/core/io/__init__.py", line 24, in read_text_from_href
    with fsspec.open(href, "r") as f:
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/fsspec/core.py", line 102, in __enter__
    f = self.fs.open(self.path, mode=mode)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/fsspec/spec.py", line 968, in open
    **kwargs,
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/fsspec/implementations/local.py", line 144, in _open
    return LocalFileOpener(path, mode, fs=self, **kwargs)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/fsspec/implementations/local.py", line 235, in __init__
    self._open()
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/fsspec/implementations/local.py", line 240, in _open
    self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/cholmes/Repos/planet-orders/new-stacs/planet-stac/planet-stac/collection.json'

(I think I can figure out the issue, as it's a pretty constrained catalog at this point, but in a more complex catalog I wouldn't be sure.)
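For context, a relative link href is usually resolved against the directory of the file that contains it. The sketch below is an illustration only: `resolve_href` and the example paths are hypothetical, and it is not taken from PySTAC or from the catalog in the traceback. It shows one way a doubled directory like `planet-stac/planet-stac` could arise, assuming the href repeats the name of the containing directory.

```python
import posixpath


def resolve_href(containing_file: str, href: str) -> str:
    """Resolve href relative to the directory of the file that contains it.

    Hypothetical helper for illustration; absolute hrefs are returned as-is.
    """
    if posixpath.isabs(href):
        return href
    base_dir = posixpath.dirname(containing_file)
    return posixpath.normpath(posixpath.join(base_dir, href))


# Example paths are made up for this sketch.
source = "/data/planet-stac/collection.json"

# An href relative to the containing directory resolves as expected.
print(resolve_href(source, "./collection.json"))

# An href that repeats the directory name produces a doubled path.
print(resolve_href(source, "planet-stac/collection.json"))
```

The second call returns `/data/planet-stac/planet-stac/collection.json`, which has the same shape as the missing path reported above.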

gadomski · 2021-06-28T17:37:01Z

Right on, thanks for the feedback. I've added some additional logic to walk children and look for missing links. Output looks something like this:

There isn't the exact file line (getting the exact line would be some additional lifting), but it gives you the file containing the link and the exact text of the link, which hopefully is easily findable. Is this helpful for your debugging scenario?
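The idea of walking children and reporting the file that contains a missing link can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the code in this PR: `find_missing_links` is a hypothetical name, only local `child` and `item` links are followed, and remote hrefs are skipped.

```python
import json
import os
from typing import Iterator, Tuple


def find_missing_links(path: str, rels=("child", "item")) -> Iterator[Tuple[str, str]]:
    """Yield (containing_file, href) for each followed link whose target is missing.

    Hypothetical helper: follows only local hrefs with the given rel types.
    """
    with open(path) as f:
        stac = json.load(f)
    base_dir = os.path.dirname(os.path.abspath(path))
    for link in stac.get("links", []):
        if link.get("rel") not in rels:
            continue
        href = link.get("href", "")
        if "://" in href:
            # Remote hrefs would need a network request; skipped in this sketch.
            continue
        target = os.path.normpath(os.path.join(base_dir, href))
        if not os.path.exists(target):
            # Report the file holding the link and the link text as written.
            yield path, href
        else:
            # Descend into the child or item and keep looking.
            yield from find_missing_links(target, rels)
```

Iterating over `find_missing_links("collection.json")` would give pairs of the file containing the link and the href text, which is enough to search for the offending entry. Restricting the walk to `child` and `item` links avoids looping back through `parent`, `root`, or `self` links.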

cholmes · 2021-06-28T18:56:24Z

Cool, that helps for some situations. Actual line numbers are likely not necessary in most situations. They would have helped with the last step, but that one was easy to figure out. It doesn't seem to catch my next particular one though (which feels pretty weird):

% ./scripts/stac validate /Users/cholmes/Repos/planet-orders/new-stacs/planet-stac/collection.json
OK! STAC object at /Users/cholmes/Repos/planet-orders/new-stacs/planet-stac/collection.json is valid!
% stac move-assets collection.json
Traceback (most recent call last):
  File "/Users/cholmes/Repos/planet-orders/venv/bin/stac", line 8, in <module>
    sys.exit(run_cli())
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/stactools/cli/cli.py", line 40, in run_cli
    cli(prog_name='stac')
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/stactools/cli/commands/copy.py", line 37, in move_assets_command
    copy=copy)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/stactools/core/copy.py", line 187, in move_all_assets
    for item in catalog.get_all_items():
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/pystac/catalog.py", line 437, in get_all_items
    yield from self.get_items()
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/pystac/stac_object.py", line 292, in get_stac_objects
    link.resolve_stac_object(root=self.get_root())
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/pystac/stac_object.py", line 202, in get_root
    root_link.resolve_stac_object()
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/pystac/link.py", line 213, in resolve_stac_object
    obj = stac_io.read_stac_object(target_href, root=root)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/pystac/stac_io.py", line 172, in read_stac_object
    d = self.read_json(source)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/pystac/stac_io.py", line 151, in read_json
    txt = self.read_text(source, *args, **kwargs)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/pystac/stac_io.py", line 215, in read_text
    return self.read_text_from_href(href, *args, **kwargs)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/stactools/core/io/__init__.py", line 24, in read_text_from_href
    with fsspec.open(href, "r") as f:
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/fsspec/core.py", line 102, in __enter__
    f = self.fs.open(self.path, mode=mode)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/fsspec/spec.py", line 968, in open
    **kwargs,
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/fsspec/implementations/local.py", line 132, in _open
    return LocalFileOpener(path, mode, fs=self, **kwargs)
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/fsspec/implementations/local.py", line 220, in __init__
    self._open()
  File "/Users/cholmes/Repos/planet-orders/venv/lib/python3.7/site-packages/fsspec/implementations/local.py", line 225, in _open
    self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/collection.json'

(I changed the path to just be a completely relative one, but it still seems to have problems with it, though maybe I wrote it wrong? But it gets past the 'validate' command.)

cholmes · 2021-06-28T18:57:14Z

(I think there's also a problem with merge, as that seemed to add on relative path prefixes, but I've got more work to do to make a clean bug report on it).

gadomski · 2021-06-28T22:55:52Z

> (I think there's also a problem with merge, as that seemed to add on relative path prefixes, but I've got more work to do to make a clean bug report on it).

Roger. I've added items to the validate command (and made the output a bit prettier) so you (hopefully) can get something like this instead of a traceback:

LMK if that helps.

(as an aside, the colorification is with an eye towards #70, so we could e.g. do yellow for best practices, etc)

This returns a lot of "self" link spam when testing on the data-files catalogs, but according to the spec these are bad links so I guess that's ok? Maybe we'll want to add a flag to quiet self flags later.

cholmes · 2021-07-19T19:39:25Z

Ok, I've been using this a lot. Going to post a number of potential improvements and test cases where more information would help (or it may just be a bug). But it'd be great to get this into the next release, even in its current form, as it's definitely a helpful tool.

The first suggestion is to add a 'check-links' option or something like that, which would actually follow all the asset hrefs (probably the link ones too) and tell you whether they point to valid locations. I'm hoping to catch typos and cases where people only think they're properly linking to the asset. I think these could just be warnings: it's valid STAC, but the links aren't working. I put basically the same suggestion at https://github.com/sparkgeo/stac-validator as well.
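A rough sketch of what such an asset check might look like, using only the standard library. The function name `check_asset_hrefs` and the warning format are assumptions for illustration, not an existing stactools API, and only local hrefs are checked.

```python
import json
import os
from typing import List


def check_asset_hrefs(item_path: str) -> List[str]:
    """Return one warning per local asset href that does not exist on disk.

    Hypothetical helper: the object may still be valid STAC, so these are
    warnings rather than validation errors.
    """
    with open(item_path) as f:
        item = json.load(f)
    base_dir = os.path.dirname(os.path.abspath(item_path))
    warnings = []
    for key, asset in item.get("assets", {}).items():
        href = asset.get("href", "")
        if "://" in href:
            # Remote hrefs would need a network request; not attempted here.
            continue
        target = os.path.normpath(os.path.join(base_dir, href))
        if not os.path.exists(target):
            warnings.append(
                f"{item_path}: asset '{key}' href does not resolve: {href}"
            )
    return warnings
```

Returning a list of warnings rather than raising keeps the "valid STAC, but the links aren't working" distinction: a caller could print them and still exit successfully.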

cholmes · 2021-07-19T19:48:21Z

Ok, I've got a failure I'm stuck on, that doesn't have enough validation information for me to figure it out:

test-catalog.zip

unzip the catalog, then:

(venv) cholmes@c02y3151jhd3 test-catalog % stac validate collection.json 
FileNotFound error: [Errno 2] No such file or directory: '/collection.json'
Walking children to find location of missing link(s)...

And then if I try a 'copy' I get:

(venv) cholmes@c02y3151jhd3 test-catalog % stac copy collection.json test
Traceback (most recent call last):
  File "/Users/cholmes/Repos/stactools-pete/venv/bin/stac", line 8, in <module>
    sys.exit(run_cli())
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/stactools/cli/cli.py", line 40, in run_cli
    cli(prog_name='stac')
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/stactools/cli/commands/copy.py", line 67, in copy_command
    copy_catalog(source_catalog, dst, catalog_type, copy_assets)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/stactools/core/copy.py", line 198, in copy_catalog
    catalog = source_catalog.full_copy()
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/collection.py", line 764, in full_copy
    return cast(Collection, super().full_copy(root, parent))
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/catalog.py", line 972, in full_copy
    return cast(Catalog, super().full_copy(root, parent))
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/stac_object.py", line 373, in full_copy
    link.resolve_stac_object()
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/link.py", line 219, in resolve_stac_object
    obj = stac_io.read_stac_object(target_href, root=root)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/stac_io.py", line 224, in read_stac_object
    d = self.read_json(source, *args, **kwargs)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/stac_io.py", line 195, in read_json
    txt = self.read_text(source, *args, **kwargs)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/pystac/stac_io.py", line 277, in read_text
    return self.read_text_from_href(href)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/stactools/core/io/__init__.py", line 24, in read_text_from_href
    with fsspec.open(href, "r") as f:
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/fsspec/core.py", line 102, in __enter__
    f = self.fs.open(self.path, mode=mode)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/fsspec/spec.py", line 968, in open
    **kwargs,
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/fsspec/implementations/local.py", line 144, in _open
    return LocalFileOpener(path, mode, fs=self, **kwargs)
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/fsspec/implementations/local.py", line 235, in __init__
    self._open()
  File "/Users/cholmes/Repos/stactools-pete/venv/lib/python3.7/site-packages/fsspec/implementations/local.py", line 240, in _open
    self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/202012_223832_ssc4_u0001/20201211_223832_ssc4_u0001.json'

I'm 95% sure that I got this collection by just using stac planet convert order and then a merge.

gadomski · 2021-07-22T20:56:40Z

> The first suggestion is to add a 'check-links' option or something like that, which will actually follow all the asset hrefs (probably the link ones too) and tell you if they are actually valid locations.

I have the link version of this, but not the asset -- I'll add that.

There's some self link resolution that's funny in PySTAC v1.0.0, so once PySTAC v1.0.1 is released (containing stac-utils/pystac#574) I think this PR will be ready to merge, and then we can add features in subsequent PRs.

However, this isn't doing recursive asset validation, so we'll need to rework the flow.

cuttlefish

Looks good! It worked on a couple files I tried.
I like that the link and asset following is optional.
Can you update the CHANGELOG before you merge?

gadomski · 2021-07-27T13:23:21Z

@cuttlefish can you take a look at the codecov error when you get a chance? I don't really know what's going on there.

cuttlefish · 2021-07-27T16:43:22Z

> @cuttlefish can you take a look at the codecov error when you get a chance? I don't really know what's going on there.

@gadomski It looks like it was just a spurious runtime error. Merge away!

gadomski linked an issue Jun 23, 2021 that may be closed by this pull request

Display the name of the file where a validation error occurred. #51

Closed

gadomski added this to the v0.2.1 milestone Jun 23, 2021

gadomski mentioned this pull request Jun 23, 2021

Test catalogs do not pass validation #152

Closed

gadomski requested a review from cholmes June 24, 2021 15:24

gadomski mentioned this pull request Jun 24, 2021

Issues with moving assets #146

Closed

cholmes approved these changes Jun 24, 2021

gadomski requested a review from matthewhanson June 24, 2021 15:35

Merge branch 'main' into feature/stac-validate

9288c78

Add intelligent missing link checking to validate

07f092f

Check items, and prettify output

59d06df

Check all links, not just item links

7f9f1ec

This returns a lot of "self" link spam when testing on the data-files catalogs, but according to the spec these are bad links so I guess that's ok? Maybe we'll want to add a flag to quiet self flags later.

gadomski mentioned this pull request Jun 29, 2021

Resolving a self link can erase href information stac-utils/pystac#499

Closed

cholmes mentioned this pull request Jul 19, 2021

STAC Copy not working with relative link. #168

Closed

gadomski and others added 4 commits July 20, 2021 06:24

Check all links, not just item links

139ad38

This returns a lot of "self" link spam when testing on the data-files catalogs, but according to the spec these are bad links so I guess that's ok? Maybe we'll want to add a flag to quiet self flags later.

Merge branch 'main' into feature/stac-validate

230c23a

Merge branch 'main' into feature/stac-validate

ec1283b

Merge branch 'main' into feature/stac-validate

74e056b

gadomski added 2 commits July 22, 2021 14:36

Add asset validation

4f75dfe

However, this isn't doing recursive asset validation, so we'll need to rework the flow.

Rework validate to check recused assets

2949161

gadomski added 3 commits July 22, 2021 14:41

Collections can have assets

b6334cc

Update link checks to not resolve to STAC objects

81f6ff7

Lints

df9373a

cholmes mentioned this pull request Jul 22, 2021

stac merge error: 'raise ValueError('asset_href msut be absolute.')' #169

Closed

gadomski added the enhancement New feature or request label Jul 26, 2021

Merge branch 'main' into feature/stac-validate

c8ead94

gadomski requested review from cuttlefish and removed request for matthewhanson July 26, 2021 19:53

cuttlefish approved these changes Jul 27, 2021

Update CHANGELOG for #151

f997d09

Merge branch 'main' into feature/stac-validate

eeec7b4

cuttlefish merged commit c278e73 into stac-utils:main Jul 27, 2021

gadomski deleted the feature/stac-validate branch July 27, 2021 17:29

gadomski mentioned this pull request Feb 14, 2022

Incorporate stac-check #230

Closed
