Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update documentation #1455

Merged
merged 12 commits into from Jun 20, 2023
Merged

Update documentation #1455

merged 12 commits into from Jun 20, 2023

Conversation

Ariana-B
Copy link
Contributor

@Ariana-B Ariana-B commented Jun 7, 2023

Reason for this pull request

Datacube-core documentation is woefully out of date and in bad need of a review and rewrite. A common pain point is around the installation and other configuration instructions

Proposed changes

  • Update ubuntu installation instructions

  • Include some more info on env variables

  • Update metadata type example

  • Closes #xxxx

  • Tests added / passed

  • Fully documented, including docs/about/whats_new.rst for all changes


📚 Documentation preview 📚: https://datacube-core--1455.org.readthedocs.build/en/1455/

@codecov
Copy link

codecov bot commented Jun 7, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (ec00a7f) 91.82% compared to head (acac9c0) 91.82%.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #1455   +/-   ##
========================================
  Coverage    91.82%   91.82%           
========================================
  Files          132      132           
  Lines        14516    14516           
========================================
  Hits         13330    13330           
  Misses        1186     1186           

see 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@Ariana-B Ariana-B marked this pull request as ready for review June 13, 2023 06:33
Copy link
Contributor

@SpacemanPaul SpacemanPaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a bunch of inline comments. Some are corrections/requests for changes, others are suggestions, and others are really just seeds for further discussion.

This PR is a great start though, and even as-is (without responding to any of my comments) would be an improvement - thanks so much for slogging through this, Ariana.

README.rst Show resolved Hide resolved
docs/about-core-concepts/metadata-types.rst Outdated Show resolved Hide resolved
docs/installation/data-preparation-scripts.rst Outdated Show resolved Hide resolved
@@ -42,11 +42,11 @@ Download the USGS Collection 1 landsat scenes from any of the links below:

The prepare script for collection 1 - level 1 data is available in
`ls_usgs_prepare.py
<https://github.com/opendatacube/datacube-dataset-config/blob/master/old-prep-scripts/ls_usgs_prepare.py>`_.
<https://github.com/opendatacube/datacube-dataset-config/blob/main/old-prep-scripts/ls_usgs_prepare.py>`_.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we should update all this for USGS Landsat Collection 2.

If Collection 1 is still available, it's probably safe to assume it won't be for much longer - and we should at least state this here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to migrate some of the product-specific docs from over here to replace this: https://github.com/opendatacube/datacube-dataset-config

I don't know that any of the prep scripts have been used for like ... 5+ years!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth keeping this page at all, beyond linking to the datacube-dataset-config repo?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so! The scripts probably don't work...

docs/installation/database/setup.rst Outdated Show resolved Hide resolved
To add or alter metadata types, you can use commands like: ``datacube metadata add <path-to-file>``
and to update: ``datacube metadata update <path-to-file>``. Using ``--allow-unsafe`` will allow
you to update metadata types where the changes may have unexpected consequences.

Note that from version 1.9 onward, only eo3-compatible metadata types will be accepted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From version 2.0 onwards.

The postgres driver (currently aka "default", from 1.9 aka "legacy") will continue to support non-eo3-compatible metadata types until it is retired in 2.0.

The postgis driver (currently aka "experimental") already only supports non-eo3-compatible metadata types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

docs/installation/metadata-types.rst Outdated Show resolved Hide resolved
docs/installation/setup/common_install.rst Show resolved Hide resolved
docs/installation/setup/ubuntu.rst Outdated Show resolved Hide resolved

Now we can create an ``agdcintegration`` database for testing::
Now we can create the ``agdcintegration`` and ``odcintegration`` databases for testing::

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these names, specifically? Could they be changed to something more descriptive to help the user understand what they're for?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are the databases used for integration testing. (agdcintegration for tests run against the old default/legacy/postgres index driver, and odcintegration for tests run against the new experimental/postgis index driver).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These databases are used for "integration tests", hence the name. But I agree not most obvious name. Ideally this should be captured as a cli tool, something like datacube bootsrtap --test-db.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel people won't necessarily recognise "agdc" as the legacy option and "odc" as the new option. Would something like: postgresintegration and postgisintegration be more descriptive? Or legacyintegration and currentintegration?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way that's only a convention, one could use whatever database name, as these things are configured via ~/.datacube_integration.conf (see docs below). And we should be testing with non-default names as well to catch any hard-coded assumptions in test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be fantastic to purge this repo of any remaining references to "AGDC" - that name should no longer appear in any ODC-branded repo IMO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe all references to 'agdc' stem from the postgres driver which is going to be deprecated in v2 anyway. I'm not sure doing a potentially breaking rename at this stage would be worth it.

Copy link
Contributor

@alexgleith alexgleith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Some nice changes and it's great to have someone go through the docs and incrementally improve things.

README.rst Outdated Show resolved Hide resolved
@@ -2,10 +2,16 @@
name: barebone
description: A minimalist metadata type file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not saying we should fix this here, but this is so minimal that it's almost irrelevant.

Practically nobody configures metadata types, I don't think. Possibly something to consider for the future, @SpacemanPaul as the times I change a metadata type is to make a field searchable for a product... allowing arbitrary searches would be great and using metadata to configure indexed searchable fields would be 🚀

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionality provided by the metadata subsystem is really handy, it's a shame it can only be configured at "product creation time". Non-index based searching can be configured purely at runtime as it just about constructing a query, no need to have an index for the query to be useful. It's just that datacube only allows configuration from the metadata document stored in DB and linked to a given product. While "stored metadata" is handy it does not need to be the only way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The postgis driver requires metadata type documents to be "eo3 compatible" although it's not 100% clear at the moment what that means.

The only use for metadata type documents going forwards appears to be to expose user-friendly metadata aliases for various dataset metadata entries for use in searches (and ensuring those searches are fully indexed).

@SpacemanPaul SpacemanPaul self-requested a review June 19, 2023 05:36
Copy link
Contributor

@SpacemanPaul SpacemanPaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor wording quibble, otherwise I think we're ready to merge.


For more information see :ref:`dataset-metadata-doc`.
For third party datasets, see the examples detailed `here <https://github.com/opendatacube/datacube-dataset-config#documented-examples>`__.
For common distribution formations, data can be indexed using one of the tools from `odc-apps-dc-tools <https://github.com/opendatacube/odc-tools/tree/develop/apps/dc_tools>`__.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Distribution formations" sounds wrong - I think you mean "distribution formats"?

@Ariana-B Ariana-B merged commit d8ccecd into develop Jun 20, 2023
22 checks passed
@Ariana-B Ariana-B deleted the doco_improvements branch June 20, 2023 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants