Skip to content

Conversation

@spinto
Copy link

@spinto spinto commented Dec 9, 2025

Related Issue(s):

Description:
when a STAC Item is loaded from a remote URL, pystac was not resolving correctly relative URL references (e.g. /d/myfile) to absolute paths.

This was due to the utils.is_absolute_href function checking always via os.path.isabs, even if the STAC Item is from a non-filesystem URL.

The patch proposed here applies the same principle of "utils.make_absolute_href", so to use the STAC item self value to determine if this is filesystem or network url and act accordingly (for network urls, the presence of the schema in front is considered absolute URL, instead a relative URL).

PR Checklist:

  • Pre-commit hooks pass (run pre-commit run --all-files)
  • Tests pass (run pytest)
  • Documentation has been updated to reflect changes, if applicable
  • This PR maintains or improves overall codebase code coverage.
  • Changes are added to the CHANGELOG. See the docs for information about adding to the changelog.

@gadomski gadomski self-requested a review December 9, 2025 13:05
@codecov
Copy link

codecov bot commented Dec 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.30%. Comparing base (608972a) to head (5e08cc8).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1599      +/-   ##
==========================================
+ Coverage   92.29%   92.30%   +0.01%     
==========================================
  Files          55       55              
  Lines        8395     8397       +2     
  Branches      971      971              
==========================================
+ Hits         7748     7751       +3     
  Misses        458      458              
+ Partials      189      188       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@spinto spinto marked this pull request as draft December 9, 2025 14:14
@spinto spinto marked this pull request as ready for review December 10, 2025 05:53
Copy link
Member

@gadomski gadomski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks like it only fixes href resolution for assets. Any reason to not also consider href resolution for traversing links, e.g. resolving a "child" or "item" link?

Also, can you add a CHANGELOG entry? Thanks.

@jsignell jsignell linked an issue Dec 10, 2025 that may be closed by this pull request
@spinto
Copy link
Author

spinto commented Dec 10, 2025

This PR looks like it only fixes href resolution for assets. Any reason to not also consider href resolution for traversing links, e.g. resolving a "child" or "item" link?

I was unsure it would be easy to make it work for the 'links', because it uses the get_self_href function to determine if we are loading from the network or from the filesystem, and you may get to parse the 'self' links later then the other links... also STAC API Catalogues I know will generate always absolute values for the links (although they may be relative according to the STAC Item specifications)... if that is ok, I would propose to start with just the asset and see later for the links...

@spinto spinto changed the title Fix for proper handling of absolute http URIs Fix for proper handling of relative http URIs for assets Dec 10, 2025
@spinto spinto changed the title Fix for proper handling of relative http URIs for assets Fix for proper handling of relative links for assets of network-loaded STAC items Dec 10, 2025
@spinto
Copy link
Author

spinto commented Dec 10, 2025

I have added the CHANGELOG

@gadomski gadomski self-requested a review December 10, 2025 19:02
Copy link
Member

@gadomski gadomski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that implementing this for links will be much tricker, so I'm ok with just handling assets for now.

Can you add a test for get_absolute_href, since that's the API change that you're mention in the CHANGELOG? Thanks!

spinto and others added 2 commits December 11, 2025 07:54
Co-authored-by: Pete Gadomski <pete.gadomski@gmail.com>
Co-authored-by: Pete Gadomski <pete.gadomski@gmail.com>
@spinto
Copy link
Author

spinto commented Dec 12, 2025

Can you add a test for get_absolute_href, since that's the API change that you're mention in the CHANGELOG? Thanks!

Well, the issue in get_absolute_href was caused by a problem in is_absolute_href... anyway I totally agree that more testing is better, so I have added also explicit tests for get_absolute_href too in the test_asset (for both windows and linux)

@spinto
Copy link
Author

spinto commented Dec 12, 2025

o the previous get_absolute_href test had some problems in windows. It should work now.

@gadomski gadomski self-requested a review December 12, 2025 11:54
Copy link
Member

@gadomski gadomski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks for the additional test, I think it helps illustrate the desired functionality well.

Just one question about logic to make sure I'm understanding things correctly.

Comment on lines +395 to +400
parsed_start_scheme = "" if start_href is None else safe_urlparse(start_href).scheme

if parsed_start_scheme in ["", "file"]:
return parsed.scheme not in ["", "file"] or os.path.isabs(parsed.path)
else:
return parsed.scheme not in ["", "file"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I struggled to understand this at first. Can it be more clearly expressed as this?

Suggested change
parsed_start_scheme = "" if start_href is None else safe_urlparse(start_href).scheme
if parsed_start_scheme in ["", "file"]:
return parsed.scheme not in ["", "file"] or os.path.isabs(parsed.path)
else:
return parsed.scheme not in ["", "file"]
if parsed.scheme not in ["", "file"]:
return True
else:
parsed_start_scheme = "" if start_href is None else safe_urlparse(start_href).scheme
return parsed_start_scheme in ["", "file"] and os.path.isabs(parsed.path)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Proper handling of root relative hrefs

2 participants