Integrate changes for compatibility with the modular OGC compliant application packaging #5

Merged
merged 5 commits into from
Jan 28, 2025
25 changes: 24 additions & 1 deletion README.md
@@ -43,6 +43,29 @@ The `process.ipynb` notebook file is designed to work either as an independent n

Unity OGC applications rely on [Papermill parameterization](https://papermill.readthedocs.io/en/latest/usage-parameterize.html) of arguments. One of the cells is tagged with the `parameters` tag, which tells Papermill which cell to inspect when inserting values from the command line. See the [app-pack-generator](https://github.com/unity-sds/app-pack-generator) for more information on the formatting of parameters and the use of type hints.
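
For illustration, a cell tagged `parameters` could look like the sketch below. The names and defaults mirror those in this repository's `process.ipynb` and `process.cwl`. The exact grammar of the trailing `# type:` comments is defined by the app-pack-generator, so treat the annotations here as an approximation rather than a specification.

```python
# Sketch of a notebook cell tagged `parameters`.
# Papermill injects a new cell after this one that overrides these defaults.

# Special names: directories handled by the stage-in and stage-out steps.
input = 'test/stage_in/' # type: stage-in
output = 'test/process_results/' # type: stage-out

# Ordinary arguments with their local default values.
summary_table_filename = "summary_table.txt"
example_argument_int = 1
example_argument_float = 1.0
example_argument_string = "string"
example_argument_bool = True
example_argument_empty = None # type: string Allow a null value or a string
```

From the command line, a single value could then be overridden with something like `papermill process.ipynb process_out.ipynb -p example_argument_int 5`.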

### OGC Run
To run this without the stage-in or stage-out parameters, call the `process.cwl` contained in this repo. The example below runs all three steps (stage-in, process, stage-out) in sequence:

```
# Run stage-in

cwltool --outdir stage_in --copy-output stage_in.cwl test/ogc_app_package/stage_in.yml

# For my run, the output ended up in a directory called z7ai3uj8

# For now, the current stage_in creates an invalid 'root' in catalog.json. You'll need to change the root from /<something>/catalog.json to catalog.json
vim z7ai3uj8/catalog.json

# Pass that directory as the input to the process.cwl run
cwltool process.cwl --example_argument_empty '' --input z7ai3uj8/ --output_collection <mycollection>

# The output files and catalog.json from this run were stored in 02kajdto

# Run stage-out
cwltool stage_out.cwl --output_dir 02kajdto/ --staging_bucket <mybucket>

```
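
The manual edit of `catalog.json` could also be scripted. The following is a minimal sketch, assuming the invalid root is stored the way STAC catalogs normally store it: as an entry in the top-level `links` list with `"rel": "root"` and an `href`. The helper name `fix_root_link` and the demo catalog are hypothetical and exist only for illustration.

```python
import json
import pathlib
import tempfile


def fix_root_link(catalog_path):
    """Point the catalog's `root` link at the relative href `catalog.json`."""
    path = pathlib.Path(catalog_path)
    catalog = json.loads(path.read_text())
    for link in catalog.get("links", []):
        # Assumption: the root reference is a link entry with rel == "root".
        if link.get("rel") == "root":
            link["href"] = "catalog.json"
    path.write_text(json.dumps(catalog, indent=2))
    return catalog


# Demonstration on a throwaway catalog with an absolute root href.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = pathlib.Path(tmp) / "catalog.json"
    demo_path.write_text(json.dumps({
        "type": "Catalog",
        "id": "demo",
        "links": [
            {"rel": "root", "href": "/some/absolute/dir/catalog.json"},
            {"rel": "item", "href": "./granule.json"},
        ],
    }))
    fixed = fix_root_link(demo_path)
    reloaded = json.loads(demo_path.read_text())
```

In the walkthrough above, this would be called as `fix_root_link("z7ai3uj8/catalog.json")` in place of the `vim` step.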

### stage-in

This notebook is connected to a Unity stage-in process through the `input_stac_collection_file` variable. This variable contains the location of a STAC feature collection file. That feature collection points to the input files used by the notebook. In our example notebook we use Unity-py to parse the file and obtain the full paths to the input files.
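
To make the lookup concrete, here is a standard-library sketch of resolving input file paths from a staged STAC catalog. It is not the Unity-py implementation. It assumes the stage-in directory contains a `catalog.json` whose `item` links point to item files, and that each item lists its files under `assets` with an `href` relative to the item. The function name `list_data_files` and the demo files are hypothetical.

```python
import json
import os
import tempfile


def list_data_files(input_dir):
    """Return absolute paths of the assets referenced by input_dir/catalog.json."""
    catalog_path = os.path.join(input_dir, "catalog.json")
    with open(catalog_path) as f:
        catalog = json.load(f)

    paths = []
    for link in catalog.get("links", []):
        if link.get("rel") != "item":
            continue  # skip root, self, and other non-item links
        item_path = os.path.join(input_dir, link["href"])
        with open(item_path) as f:
            item = json.load(f)
        item_dir = os.path.dirname(item_path)
        for asset in item.get("assets", {}).values():
            # Asset hrefs are assumed to be relative to the item file.
            paths.append(os.path.abspath(os.path.join(item_dir, asset["href"])))
    return paths


# Demonstration with a tiny, made-up catalog.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "catalog.json"), "w") as f:
        json.dump({"links": [{"rel": "root", "href": "catalog.json"},
                             {"rel": "item", "href": "./granule.json"}]}, f)
    with open(os.path.join(tmp, "granule.json"), "w") as f:
        json.dump({"assets": {"data": {"href": "./granule.nc"}}}, f)
    data_files = list_data_files(tmp)
```

In the notebook itself, the equivalent step is done with Unity-py via `Collection.from_stac(...)` followed by `data_locations()`.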
@@ -67,4 +90,4 @@ See our [releases page](https://github.com/unity-sds/unity-example-application/r

## License

See our: [LICENSE](LICENSE.txt)
See our: [LICENSE](LICENSE.txt)
62 changes: 62 additions & 0 deletions process.cwl
@@ -0,0 +1,62 @@
#!/usr/bin/env cwl-runner
arguments:
- -p
- input
- $(inputs.input.path)
- -p
- output
- $(runtime.outdir)
baseCommand:
- papermill
- /home/jovyan/process.ipynb
- --cwd
- /home/jovyan
- "process_out.ipynb"
- -f
- /tmp/inputs.json
- -k
- python3
- --log-output
class: CommandLineTool
cwlVersion: v1.2
inputs:
input: Directory
example_argument_bool:
default: true
type: boolean
example_argument_empty:
default: null
type: string
example_argument_float:
default: 1.0
type: float
example_argument_int:
default: 1
type: int
example_argument_string:
default: string
type: string
output_collection:
default: example-app-collection___1
type: string
summary_table_filename:
default: summary_table.txt
type: string
outputs:
output:
outputBinding:
glob: $(runtime.outdir)
type: Directory
requirements:
DockerRequirement:
dockerPull: gangl/unity-ogc-example-application:174ee35b
InitialWorkDirRequirement:
listing:
- entry: $(inputs)
entryname: /tmp/inputs.json
InlineJavascriptRequirement: {}
InplaceUpdateRequirement:
inplaceUpdate: true
NetworkAccess:
networkAccess: true
ShellCommandRequirement: {}
84 changes: 63 additions & 21 deletions process.ipynb
@@ -23,19 +23,36 @@
"from unity_sds_client.resources.data_file import DataFile"
]
},
{
"cell_type": "markdown",
"id": "5cb70c6e-a08b-49c4-bbef-13200a18bfda",
"metadata": {},
"source": [
"## Parameters Cell\n",
"\n",
"The below cell is tagged as a 'parameters' cell. This enables us to overwrite the below values at runtime. There are some special values in the below cell.\n",
"\n",
"* `input` is a special name, and it also has the `# type: stage-in` annotation. This should be a directory, and at run time it will be populated with a STAC catalog that contains files that have been staged for your algorithm to reference.\n",
"* `output` is a special name, and it also has the `# type: stage-out` annotation. This should be treated as a directory to which you write ALL of your output files along with a STAC catalog that references files you would like to persist outside of the algorithm run."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "04ac7f2d",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"parameters"
]
},
"outputs": [],
"source": [
"input_stac_collection_file = 'test/stage_in/stage_in_results.json' # type: stage-in\n",
"output_stac_catalog_dir = 'test/process_results/' # type: stage-out\n",
"input = 'test/stage_in/' # type: stage-in\n",
"output = 'test/process_results/' # type: stage-out\n",
"\n",
"# Filename written to the working directory\n",
"summary_table_filename = \"summary_table.txt\"\n",
@@ -52,6 +69,25 @@
"example_argument_empty = None # type: string Allow a null value or a string\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "62471f5d-d898-46c1-89c1-b572851db551",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"reading test/stage_in/catalog.json\n"
]
}
],
"source": [
"input_catalog = os.path.join(input, \"catalog.json\")\n",
"print(\"reading {}\".format(input_catalog))"
]
},
{
"cell_type": "markdown",
"id": "7926d21b",
@@ -64,7 +100,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 10,
"id": "2eeaa5d4",
"metadata": {},
"outputs": [
@@ -88,7 +124,7 @@
"'<table>\\n<thead>\\n<tr><th>argument_name </th><th>type </th><th>value </th></tr>\\n</thead>\\n<tbody>\\n<tr><td>example_argument_int </td><td>&lt;class &#x27;int&#x27;&gt; </td><td>1 </td></tr>\\n<tr><td>example_argument_float </td><td>&lt;class &#x27;float&#x27;&gt; </td><td>1.0 </td></tr>\\n<tr><td>example_argument_string</td><td>&lt;class &#x27;str&#x27;&gt; </td><td>string </td></tr>\\n<tr><td>example_argument_bool </td><td>&lt;class &#x27;bool&#x27;&gt; </td><td>True </td></tr>\\n<tr><td>example_argument_empty </td><td>&lt;class &#x27;NoneType&#x27;&gt;</td><td> </td></tr>\\n</tbody>\\n</table>'"
]
},
"execution_count": 3,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -123,24 +159,24 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 11,
"id": "3a09d57c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['/home/jovyan/unity-example-application/test/stage_in/./SNDR.SS1330.CHIRP.20160822T0005.m06.g001.L1_AQ.std.v02_48.G.200425095850.nc',\n",
" '/home/jovyan/unity-example-application/test/stage_in/./SNDR.SS1330.CHIRP.20160822T0011.m06.g002.L1_AQ.std.v02_48.G.200425095901.nc']"
"['/Users/gangl/dev/unity/unity-OGC-example-application/test/stage_in/./SNDR.SS1330.CHIRP.20160822T0005.m06.g001.L1_AQ.std.v02_48.G.200425095850.nc',\n",
" '/Users/gangl/dev/unity/unity-OGC-example-application/test/stage_in/./SNDR.SS1330.CHIRP.20160822T0011.m06.g002.L1_AQ.std.v02_48.G.200425095901.nc']"
]
},
"execution_count": 4,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"inp_collection = Collection.from_stac(input_stac_collection_file)\n",
"inp_collection = Collection.from_stac(input_catalog)\n",
"data_filenames = inp_collection.data_locations()\n",
"\n",
"data_filenames"
@@ -158,7 +194,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 12,
"id": "9fbac209",
"metadata": {},
"outputs": [],
@@ -178,7 +214,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 13,
"id": "d22c8670",
"metadata": {},
"outputs": [],
@@ -195,7 +231,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 14,
"id": "3344bd15",
"metadata": {},
"outputs": [
@@ -216,7 +252,7 @@
"'<table>\\n<thead>\\n<tr><th>product_name </th><th>product_name_type_id </th><th>shortname </th><th>product_version </th><th>date_created </th><th>time_coverage_start </th><th>time_coverage_end </th><th style=\"text-align: right;\"> geospatial_lat_mid</th><th style=\"text-align: right;\"> geospatial_lon_mid</th></tr>\\n</thead>\\n<tbody>\\n<tr><td>SNDR.SS1330.CHIRP.20160822T0005.m06.g001.L1_AQ.std.v02_48.G.200425095850.nc</td><td>L1_AQ </td><td>SNDR13CHRP1</td><td>v02.48.00 </td><td>2021-04-25T05:59:08Z</td><td>2016-08-22T00:05:22Z </td><td>2016-08-22T00:11:22Z</td><td style=\"text-align: right;\"> -48.6062</td><td style=\"text-align: right;\"> 12.4563 </td></tr>\\n<tr><td>SNDR.SS1330.CHIRP.20160822T0011.m06.g002.L1_AQ.std.v02_48.G.200425095901.nc</td><td>L1_AQ </td><td>SNDR13CHRP1</td><td>v02.48.00 </td><td>2021-04-25T05:59:19Z</td><td>2016-08-22T00:11:22Z </td><td>2016-08-22T00:17:22Z</td><td style=\"text-align: right;\"> -69.3979</td><td style=\"text-align: right;\"> -1.98753</td></tr>\\n</tbody>\\n</table>'"
]
},
"execution_count": 7,
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
@@ -228,14 +264,14 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 16,
"id": "014257f3",
"metadata": {},
"outputs": [],
"source": [
"# Write the table in text format\n",
"pathlib.Path(output_stac_catalog_dir).mkdir(parents=True, exist_ok=True)\n",
"output_filename = os.path.join(output_stac_catalog_dir, summary_table_filename)\n",
"pathlib.Path(output).mkdir(parents=True, exist_ok=True)\n",
"output_filename = os.path.join(output, summary_table_filename)\n",
"with open(output_filename, \"w\") as summary_file:\n",
" summary_file.write(tabulate(table_data, headers=column_names))"
]
@@ -250,9 +286,15 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 18,
"id": "b4aa5d3b",
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": []
},
"outputs": [],
"source": [
"# Create a collection\n",
@@ -271,14 +313,14 @@
"dataset.add_data_file(DataFile(\"csv\", summary_table_filename, [\"data\"]))\n",
"\n",
"#when we run \"to_stac\" below, this file will be generated. this needs to be added to the stac file itself for future reference.\n",
"dataset.add_data_file(DataFile(\"json\", output_stac_catalog_dir + \"/\" + summary_table_filename +'.json', [\"metadata\"] ))\n",
"dataset.add_data_file(DataFile(\"json\", output + \"/\" + summary_table_filename +'.json', [\"metadata\"] ))\n",
"\n",
"\n",
"# Add the dataset to the collection\n",
"#out_collection.add_dataset(dataset)\n",
"out_collection._datasets.append(dataset)\n",
"\n",
"Collection.to_stac(out_collection, output_stac_catalog_dir)"
"Collection.to_stac(out_collection, output)"
]
},
{
@@ -307,7 +349,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.18"
"version": "3.10.8"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion requirements.txt
@@ -1,4 +1,4 @@
papermill
unity-sds-client==0.3.0
unity-sds-client==0.6.1
netCDF4
tabulate