Update Sentinel-5P #167

Merged
merged 18 commits into from
May 1, 2023


Contributor

@pjhartzell pjhartzell commented Apr 5, 2023

Description

Adds the sentinel-5p-l2-netcdf collection to the sentinel-5p dataset.

The Sentinel-5P payload is a single imaging spectrometer, the TROPOMI sensor. The collected data are processed to Level-2 derived products, 13 of which are included in this PR's sentinel-5p-l2-netcdf collection. The products are listed below, each linked to a sample STAC Item in PC Test:

You can check out the dataset landing page in PC Test.

Notes

  • The STAC Items are derived from the existing STAC Items on Azure. The NetCDF files are not touched.
  • The products are still in swath form (not re-gridded to tiles)
  • The products are in NetCDF format
  • There are 14 products in Azure, but we are only indexing 13 of them with STAC. The product that does not have STAC Items in Azure is L2__O3__PR, which is an ozone profile product. The data in Azure for this product starts in November 2021, so perhaps this product became available after the STAC Item creation pipeline was put together.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

The Collection and several hundred Items were ingested into PC Test.

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review
  • Changelog has been updated
  • Documentation has been updated
  • Unit tests pass locally (./scripts/test)
  • Code is linted and styled (./scripts/format)

@pjhartzell pjhartzell marked this pull request as ready for review April 6, 2023 01:29
Resolved review thread on datasets/sentinel-5p/collection/template.json (outdated)
Resolved review thread on datasets/sentinel-5p/collection/template.json (outdated)
@pjhartzell pjhartzell requested a review from gadomski April 6, 2023 13:38
@TomAugspurger
Contributor

@pjhartzell sorry, I caused a merge conflict with #166. That just added the msft:region property to the file collection JSON you moved / updated.

Can you comment on what the summaries JSON files are for? e.g. datasets/sentinel-5p/summaries/2022-09-08-full.json. They probably shouldn't be committed, right?

@pjhartzell
Contributor Author

@TomAugspurger

sorry, I caused a merge conflict with #166. That just added the msft:region property to the file collection JSON you moved / updated.

No worries.

Can you comment on what the summaries JSON files are for? e.g. datasets/sentinel-5p/summaries/2022-09-08-full.json. They probably shouldn't be committed, right?

I suppose not. These files existed in the sentinel-5p folder before I started this PR. I'll move them out.

@TomAugspurger
Contributor

Ah, thanks, I missed that they were a move. I suspect that Rob made those when analyzing the existing STAC items to find good values for summaries.

@pjhartzell
Contributor Author

Yep, and they were valuable in both content and as an example of generating summary info. I'll hold onto them locally.

@pjhartzell pjhartzell marked this pull request as draft April 6, 2023 16:18
@pjhartzell
Contributor Author

Converting to draft while I add anti-meridian support and fix incorrect datetime strings.

Contributor

@TomAugspurger TomAugspurger left a comment

Just to confirm: we need to add a bit of code here to call the new enclose_poles?

FWIW, I tried it out on the asset at blob://sentinel5euwest/sentinel-5p-stac/TROPOMI/L2__AER_AI/2023/01/01/S5P_L2_AER_AI_20230101T000319_20230101T014449_27036.json, and got a ZeroDivisionError when calling it after fix_item.

# import sys
# sys.path.append("datasets/sentinel-5p/")

# from sentinel_5p import *
from pctasks.core.storage import StorageFactory
from pctasks.core.tokens import *
from pctasks.core.models.tokens import *
import planetary_computer
import pystac
import shapely.geometry
from stactools.core.utils.antimeridian import fix_item, enclose_poles, Strategy

# Get a SAS token for the container that holds the existing STAC Items.
token = planetary_computer.sas.get_token("sentinel5euwest", "sentinel-5p-stac").token

# Register the token so pctasks storage can read from that container.
storage_factory = StorageFactory(
    tokens=Tokens(
        {
            "sentinel5euwest": StorageAccountTokens(
                containers={"sentinel-5p-stac": ContainerTokens(token=token)}
            )
        }
    )
)

asset_uri = "blob://sentinel5euwest/sentinel-5p-stac/TROPOMI/L2__AER_AI/2023/01/01/S5P_L2_AER_AI_20230101T000319_20230101T014449_27036.json"

# Read the existing STAC Item JSON and load it as a pystac Item.
storage, json_path = storage_factory.get_storage_for_file(asset_uri)
item_dict = storage.read_json(json_path)

item = pystac.Item.from_dict(item_dict)

# Normalize the antimeridian crossing first, then try to enclose the poles.
fix_item(item, Strategy.NORMALIZE)
poly = shapely.geometry.shape(item.geometry)

# enclose_poles is imported directly above, so call it without a module prefix.
enclose_poles(poly)  # ZeroDivisionError

@gadomski
Contributor

gadomski commented Apr 11, 2023

Just to confirm: we need to add a bit of code here to call the new enclose_poles?

Correct, I'm working this now.

ZeroDivisionError when calling it after fix_item.

Yeah, enclose poles will have to come first, and I'm doing some testing now to see if we need fix_item after it -- I think we will, and that will require some extra twiddling to enclose_poles (fix_item detects crossings by >180° deltas in longitude, and enclose_poles as it is currently written goes straight from meridian to meridian ... we'll need to add some intermediate points to prevent the pole horizontal from getting picked up by fix_item, or maybe look for this exact line and skip it).

tl;dr: Antimeridians and poles are hard
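
The interaction described above can be illustrated with a small sketch. This is not the code from stactools or from this PR; `crosses_antimeridian` and `densify` are hypothetical helper names, and the 90° step is an arbitrary assumption. It only shows why a pole edge that runs straight from one meridian to the other trips a ">180° longitude delta" check, and how intermediate points avoid that.

```python
import math


def crosses_antimeridian(lon_a: float, lon_b: float) -> bool:
    """Flag a segment as a crossing when its longitude jump exceeds 180 degrees."""
    return abs(lon_b - lon_a) > 180.0


def densify(lon_a: float, lon_b: float, lat: float, max_step: float = 90.0):
    """Return points along a constant-latitude edge so that neighboring
    points are at most max_step degrees of longitude apart."""
    span = lon_b - lon_a
    n = max(1, math.ceil(abs(span) / max_step))
    return [(lon_a + span * i / n, lat) for i in range(n + 1)]


# A pole-enclosing edge that goes straight from meridian to meridian.
edge = [(-180.0, 90.0), (180.0, 90.0)]
flagged = crosses_antimeridian(edge[0][0], edge[1][0])

# The same edge with intermediate points added.
dense = densify(-180.0, 180.0, 90.0)
still_flagged = any(
    crosses_antimeridian(a[0], b[0]) for a, b in zip(dense, dense[1:])
)
```

With the two-point edge the check fires, because the delta is 360°. After densifying, every step is 90°, so the same check no longer mistakes the pole edge for a crossing.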

@gadomski
Contributor

@TomAugspurger got the antimeridian stuff fixed up. Some examples below. There are 100 items from each product in pc-test. One product type (L2__O3_TCL) is special: it doesn't cross the antimeridian and is actually just a giant box from -20 to 20 latitude, all the way around the world. We handle it specially in this PR.
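
For reference, a footprint like the L2__O3_TCL one can be written directly as a GeoJSON polygon, with no antimeridian splitting involved. The sketch below is an assumption about what such a special case could look like, not the code in this PR; `global_band` is a hypothetical helper name.

```python
def global_band(min_lat: float, max_lat: float) -> dict:
    """Build a geometry and bbox for a band that spans all longitudes."""
    # Exterior ring in counter-clockwise order, closed on the first point.
    ring = [
        [-180.0, min_lat],
        [180.0, min_lat],
        [180.0, max_lat],
        [-180.0, max_lat],
        [-180.0, min_lat],
    ]
    return {
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "bbox": [-180.0, min_lat, 180.0, max_lat],
    }


# The band described above: -20 to 20 latitude, all the way around the world.
band = global_band(-20.0, 20.0)
```

Because the band runs from -180 to 180 without wrapping, it would presumably have to skip the crossing-detection step, since its long east-west edges would otherwise look like antimeridian crossings.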

Right now this code is in a personal repo ... if you'd like, I can port it over to stactools before we merge, but I thought I'd get this up now before I do that work. LMK.

FYSA the code appears to work well for sentinel3 as well, though there's an issue with wraparounds (for some files we've seen, there's data from the beginning and the end of a swath in the same spot) that requires another special hack to fix -- we'll put that in the sentinel3 pull request, I haven't seen any s5 data that has that issue.

Before for L2__AER_AI_ (image: before)

After for L2__AER_AI_ (image: after)

Before for L2__HCHO__ (image: before)

After for L2__HCHO__ (image: after)

@TomAugspurger
Contributor

Right now this code is in a personal repo ... if you'd like, I can port it over to stactools before we merge, but I thought I'd get this up now before I do that work. LMK.

I'm OK with running off your personal repo, but agreed that longer-term this belongs in stactools (or shapely).


Just for fun, I made a little thing to browse the footprints (gist). I think these look good.

(image: footprints)
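
As a rough idea of the input a footprint browser like this needs, the sketch below collects item footprints into a single GeoJSON FeatureCollection. It is not the linked gist; `to_feature_collection` is a hypothetical helper, and the two sample items are made-up placeholders rather than real Sentinel-5P items.

```python
import json


def to_feature_collection(items: list) -> dict:
    """Wrap each item's footprint in a GeoJSON Feature, keyed by item id."""
    features = [
        {
            "type": "Feature",
            "id": item["id"],
            "geometry": item["geometry"],
            "properties": {"id": item["id"]},
        }
        for item in items
    ]
    return {"type": "FeatureCollection", "features": features}


# Placeholder items for illustration only.
sample_items = [
    {
        "id": "item-a",
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
        },
    },
    {
        "id": "item-b",
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[2, 2], [3, 2], [3, 3], [2, 3], [2, 2]]],
        },
    },
]

collection = to_feature_collection(sample_items)
text = json.dumps(collection)
```

The serialized text can be loaded by any GeoJSON viewer to check the footprints by eye.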

@gadomski
Contributor

longer-term this belongs in stactools (or shapely).

I'm 👎 on either of these options, unfortunately. stactools is too heavy for what the antimeridian package does (antimeridian only needs shapely), and shapely intentionally doesn't care about geographic coordinate systems. Maybe there's another good home for this algorithm, but I don't think it's either of those spots.

Uhhhh that browser is awesome and I'm going to be using it for sentinel3 for sure! Thanks for sharing!

@gadomski
Contributor

gadomski commented Apr 14, 2023

@TomAugspurger released to PyPI and updated PR to point to the released version, so we aren't dependent on my personal repo url anymore.

@TomAugspurger TomAugspurger marked this pull request as ready for review April 18, 2023 13:56
Contributor

@TomAugspurger TomAugspurger left a comment

LGTM overall. Just two small questions.

Can you also copy-paste a Dockerfile to the sentinel-5p directory (I really need to finish off #104 so we can stop all this duplication)?

datasets/sentinel-5p/collection/template.json Show resolved Hide resolved
datasets/sentinel-5p/collection/template.json Outdated Show resolved Hide resolved
pjhartzell and others added 6 commits April 28, 2023 08:38
- change s3:spatial_resolution from string to list of integers
- add s3:product_name
- make asset keys kebab-case
- remove redundant asset title prefixes
- round geometry and bbox coordinates to original precision after
  antimeridian fix
* added split
* removed limit
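
The commit about rounding coordinates to original precision after the antimeridian fix can be pictured with the sketch below. It is an illustration under assumptions, not the code from these commits: `decimal_places`, `flatten`, and `round_coords` are hypothetical helpers, and the coordinate values are made up.

```python
from decimal import Decimal


def decimal_places(value: float) -> int:
    """Count the digits after the decimal point in the value's shortest repr."""
    exponent = Decimal(str(value)).as_tuple().exponent
    return max(0, -exponent)


def flatten(coords):
    """Yield every number in a nested GeoJSON coordinate array."""
    if isinstance(coords, (int, float)):
        yield coords
    else:
        for c in coords:
            yield from flatten(c)


def round_coords(coords, precision: int):
    """Round every number in a nested coordinate array, keeping the nesting."""
    if isinstance(coords, (int, float)):
        return round(coords, precision)
    return [round_coords(c, precision) for c in coords]


# Made-up coordinates standing in for an original footprint.
original = [[[179.12345, 10.5], [-179.5, 10.5], [-179.5, 11.25], [179.12345, 10.5]]]
precision = max(decimal_places(v) for v in flatten(original))

# Made-up output of a geometry fix that introduced floating point noise.
fixed = [[[180.00000000001, 10.500000000003], [179.12345, 10.5]]]
rounded = round_coords(fixed, precision)
```

Taking the precision from the input, rather than hard-coding it, keeps the fixed geometry and bbox from claiming more precision than the source footprint had.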
@TomAugspurger TomAugspurger merged commit 1b0eadb into main May 1, 2023
@TomAugspurger
Copy link
Contributor

Thanks all!

@TomAugspurger TomAugspurger deleted the update/pjh/sentinel-5p branch May 1, 2023 11:37

3 participants