Merged
46 commits
b886e09
docs: :memo: Clarify the installation steps
joelostblom Nov 4, 2025
e090964
docs: :memo: Explain what a data package is and motivate Sprout more
joelostblom Nov 5, 2025
cf74fc3
docs: :memo: Move sidenote to callout
joelostblom Nov 5, 2025
b6d0f75
docs: :memo: Reword intro
joelostblom Nov 5, 2025
287374f
docs: :memo: Note differences when using the template repo
joelostblom Nov 5, 2025
ad1f51d
Syntax highlight the script code in the docs
joelostblom Nov 5, 2025
a53d18c
docs: :memo: Mention what the terminal command does
joelostblom Nov 5, 2025
26d67a2
docs: :memo: Divide guide into clearer sections
joelostblom Nov 5, 2025
264e0de
docs: :memo: Fix capitalization and clarify a bit
joelostblom Nov 6, 2025
702a73f
docs: :memo: Clarify the role of classes
joelostblom Nov 6, 2025
5cf676a
Delete more complex metadata management section
joelostblom Nov 6, 2025
0ca6001
Simplify example and use the actual metadata in the guide for the wri…
joelostblom Nov 6, 2025
611be79
Show example of what datapackage.json looks like
joelostblom Nov 6, 2025
7293957
Add chapter for manaing data package metadata
joelostblom Nov 6, 2025
cf27d6e
Update order to account for new page
joelostblom Nov 6, 2025
b7910a9
Change title and setup main.py file for second section
joelostblom Nov 7, 2025
98e016e
Show script content with syntax highlighting
joelostblom Nov 7, 2025
410fb40
Add old text as is
joelostblom Nov 7, 2025
76ab979
Make final statement more accurate
joelostblom Nov 7, 2025
1ef3a15
Elaborate on the example description
joelostblom Nov 7, 2025
3815970
Evaluate cell so that we can use `package_properties` later
joelostblom Nov 7, 2025
b41dbcf
Add more examples of how to edit the properties file
joelostblom Nov 7, 2025
55c4766
Reformat code blocks
joelostblom Nov 7, 2025
91bb115
Make easier to parse via additional section
joelostblom Nov 7, 2025
e6467c1
Fix typo
joelostblom Nov 7, 2025
7373e75
Remove import since `package_properties` is already defined
joelostblom Nov 7, 2025
a013422
Fix typo
joelostblom Nov 7, 2025
f49f405
Automate create of json output
joelostblom Nov 12, 2025
e1068fe
Merge branch 'main' into docs/needed-vs-recommended
joelostblom Nov 12, 2025
d869270
Improve wording
joelostblom Nov 13, 2025
14d6450
Apply suggestions from code review to improve wording
joelostblom Nov 17, 2025
cceb4ce
Turn note about file deletion into callout
joelostblom Nov 17, 2025
f668b7d
Clarify title
joelostblom Nov 17, 2025
380e2a4
Rephrase for clarity
joelostblom Nov 17, 2025
09179dd
Shorten paragraph
joelostblom Nov 17, 2025
dac7e02
Move heading to have intro paragraph
joelostblom Nov 17, 2025
af50649
Add explicit reference to install section
joelostblom Nov 17, 2025
25f3661
Avoid making it sound like the datapackage file is created here
joelostblom Nov 17, 2025
80d2703
Make heading more appropriate for content
joelostblom Nov 17, 2025
a2e042d
Apply suggestions from code review to improve wording
joelostblom Nov 18, 2025
89efa62
chore(pre-commit): :pencil2: automatic fixes
pre-commit-ci[bot] Nov 18, 2025
fb70906
Apply suggestions from code review to improve wording
joelostblom Nov 19, 2025
818843a
Name files consistently
joelostblom Nov 19, 2025
5fcf102
Merge branch 'main' into docs/needed-vs-recommended
lwjohnst86 Nov 21, 2025
831f7cb
Reflow text
joelostblom Nov 21, 2025
0e6b17c
Update links to match new file names
joelostblom Nov 21, 2025
6 changes: 1 addition & 5 deletions _quarto-pdf.yml
@@ -51,11 +51,7 @@ book:
     - index.qmd
     - part: Guide
       chapters:
-        # - docs/guide/index.qmd
-        - docs/guide/installation.qmd
-        - docs/guide/packages.qmd
-        - docs/guide/resources.qmd
-        - docs/guide/checks.qmd
+        - auto: "docs/guide/*.qmd"
     - part: "Design: Architecture"
       chapters:
         - docs/design/architecture/index.qmd
263 changes: 263 additions & 0 deletions docs/guide/package-metadata.qmd
@@ -0,0 +1,263 @@
---
title: "Adding to the package metadata"
order: 2
jupyter: python3
---

In the [previous guide](/docs/guide/package.qmd), we created a Data
Package and saw how to write a minimal set of metadata properties to
`datapackage.json`. Here, we'll take a closer look at the full
`package_properties.py` script that's created by
`create_properties_script()`.

::: callout-important
Before we get started with this section, it's necessary to delete any
previously created `scripts/package_properties.py` since
`create_properties_script()` does not overwrite this file if it already
exists.
:::
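
If you prefer to do this deletion from Python, one possible approach is
sketched below using the standard library's `pathlib`. The helper name
`remove_properties_script()` is hypothetical and not part of Sprout; it
only assumes the default `scripts/package_properties.py` location
mentioned above, relative to the package root.

```python
from pathlib import Path


def remove_properties_script(package_root: Path = Path(".")) -> bool:
    """Delete `scripts/package_properties.py` under `package_root` if present.

    Hypothetical helper for illustration; not part of Sprout.
    Returns True if a file was deleted, False if there was nothing to delete.
    """
    script = package_root / "scripts" / "package_properties.py"
    existed = script.is_file()
    # `missing_ok=True` makes this a no-op when the file does not exist.
    script.unlink(missing_ok=True)
    return existed
```

Deleting the file discards any edits you made to it, so copy anything
you want to keep first.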

First, change your `main.py` to look like this:

```{python}
#| filename: "main.py"
import seedcase_sprout as sp

def main():
    # Create the properties script in the default location.
    sp.create_properties_script()

if __name__ == "__main__":
    main()
```

Now run the script from the Terminal:

``` {.bash filename="Terminal"}
uv run main.py
```

If you now view the created `package_properties.py` script, you'll see
that it includes a template containing many of the most commonly used
metadata names together with comments indicating which are required and
which are optional:

<!-- Create the script where quarto can find it for building the docs -->

{{< include _python-minimal-package-setup.qmd >}}

```{python}
#| include: false
sp.create_properties_script(package_path.root())
```

```{python}
#| echo: false
#| output: asis
#| filename: "scripts/package_properties.py"
print(
    '```python',
    package_path.properties_script().read_text(),
    '```',
    sep='\n'
)
```

As you can see, there are a lot of available metadata properties.
However, as we saw in the previous section and as the comments in this
template script show, only a few of them are *required*. So you can
quickly get started creating a Data Package and add more metadata later
as needed.

Sometimes it might feel tedious to fill out metadata properties at all
and you might be tempted to skip creating a Data Package for your data.
But it's important to remember just how vital these metadata actually
are. Without them, your data are simply a collection of files without
any context or meaning. The metadata (properties) are **crucially
important** for understanding and actually using the data in your data
package!

## Creating a more complex `datapackage.json` file

Since metadata is so important, Sprout encourages users to include it by
making it easier to manage through Python classes, as we saw in the
previous section. In the template above, you can see a couple of
additional classes: `ContributorProperties` and `SourceProperties`.
Let's create a slightly more complex example using one of these other
classes:

```{python}
#| filename: "scripts/package_properties.py"

import seedcase_sprout as sp

package_properties = sp.PackageProperties(
    name="diabetes-study",
    title="A Study on Diabetes",
    # You can write Markdown below, with the helper `sp.dedent()`.
    description=sp.dedent("""
        # Data from a 2021 study on diabetes prevalence

        This data package contains data from a study conducted in 2021 on the
        *prevalence* of diabetes in various populations. The data includes:

        - demographic information
        - health metrics
        - survey responses about lifestyle
        """),
    contributors=[
        sp.ContributorProperties(
            title="Jamie Jones",
            email="jamie_jones@example.com",
            path="example.com/jamie_jones",
            roles=["creator"],
        ),
        sp.ContributorProperties(
            title="Zdena Ziri",
            email="zdena_ziri@example.com",
            path="example.com/zdena_ziri",
            roles=["creator"],
        )
    ],
    licenses=[
        sp.LicenseProperties(
            name="ODC-BY-1.0",
            path="https://opendatacommons.org/licenses/by",
            title="Open Data Commons Attribution License 1.0",
        )
    ],
    ## Autogenerated:
    id="8f301286-2327-45bf-bbc8-09696d059499",
    version="0.1.0",
    created="2025-11-07T11:12:56+01:00",
)
```

This example includes a longer, Markdown-formatted description of the
package, written with the helper function `dedent()`. It also uses the
`ContributorProperties` class twice, since the `contributors` parameter
is set to a list of the two contributors who co-created this example
Data Package.
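
To see why a dedent helper is useful for indented, triple-quoted
strings, here is a small sketch using the standard library's
`textwrap.dedent()`. This is only an analogy: whether `sp.dedent()`
behaves identically (for example, in how it treats leading and trailing
blank lines) is an assumption, not something this guide states.

```python
from textwrap import dedent

# Indented to match the surrounding code, as in the `description` argument.
raw = """
    # Data from a 2021 study

    - demographic information
    - health metrics
    """

# `textwrap.dedent()` strips the whitespace common to all lines, so the
# Markdown heading and list items start in the first column again.
cleaned = dedent(raw)

print(cleaned)
```

This matters because Markdown treats lines indented by four or more
spaces as a code block, so without dedenting, the heading and list would
not render as intended.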

Now you can edit your `main.py` file to again include the
`write_properties()` function:

```{python}
#| eval: false
#| filename: "main.py"
import seedcase_sprout as sp
from scripts.package_properties import package_properties

def main():
    # Create the metadata properties script in default location.
    sp.create_properties_script()
    # Write metadata properties from properties script to `datapackage.json`.
    sp.write_properties(properties=package_properties)

if __name__ == "__main__":
    main()
```

```{python}
#| include: false
# Only to check that it runs.
sp.write_properties(
    properties=package_properties,
    path=package_path.properties()
)
```

Then, use uv to run the script from the Terminal.

``` {.bash filename="Terminal"}
uv run main.py
```

When you inspect the created `datapackage.json` file, you'll see that it
contains all the metadata from `scripts/package_properties.py`:

```{python}
#| echo: false
#| output: asis
#| filename: datapackage.json
print(
    '```json',
    (package_path.path / 'datapackage.json').read_text(),
    '```',
    sep='\n'
)
```

If you made a mistake and want to update the properties in the current
`datapackage.json`, remember that you never need to edit the JSON file
directly. Instead, edit `scripts/package_properties.py` and then run the
`main.py` script to regenerate `datapackage.json`.
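
To confirm that a regenerated `datapackage.json` contains what you
expect without editing it, you could read it with the standard library's
`json` module. The sketch below is not a Sprout function:
`summarise_datapackage()` is a hypothetical helper, and it assumes the
JSON keys match the property names used above (`name`, `version`, and
`contributors` with a `title`).

```python
import json
from pathlib import Path


def summarise_datapackage(path: Path) -> str:
    """Return a one-line summary of a `datapackage.json` file.

    Hypothetical helper for illustration; not part of Sprout. Assumes the
    JSON keys mirror the property names in `package_properties.py`.
    """
    properties = json.loads(path.read_text(encoding="utf-8"))
    # `.get()` keeps this working when optional properties are absent.
    contributors = [c.get("title", "?") for c in properties.get("contributors", [])]
    return (
        f"{properties.get('name', '?')} "
        f"v{properties.get('version', '?')}: "
        f"{', '.join(contributors) or 'no contributors'}"
    )
```

Calling `summarise_datapackage(Path("datapackage.json"))` from the
package root after running `main.py` would then give a quick check that
your edits to the properties script were picked up.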

## Creating a README of the metadata properties

Having a *human-readable* version of what is contained in the
`datapackage.json` file is useful for others who may be working with or
wanting to learn more about your data package. You can use
`as_readme_text()` to convert the properties into text that can be added
to a README file. Let's create a README file from the properties of the
data package you just created by updating the `main.py` file:

```{python}
#| eval: false
#| filename: "main.py"
import seedcase_sprout as sp
from scripts.package_properties import package_properties

def main():
    # Create the properties script in default location.
    sp.create_properties_script()
    # Save the properties to `datapackage.json`.
    sp.write_properties(properties=package_properties)
    # Create text for a README of the data package.
    readme_text = sp.as_readme_text(package_properties)
    # Write the README text to a `README.md` file.
    sp.write_file(readme_text, sp.PackagePath().readme())

if __name__ == "__main__":
    main()
```

Sprout splits the README creation functionality into two steps: one to
make the text and one to write it to the file. That way, if you want to
add to or manipulate the text, you can do so before writing it to the
file. This is useful if you want to add information to the README that
you don't want included in the `datapackage.json` file. This guide
doesn't go into detail on how or why to do this.
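
As a brief illustration of the idea only, the sketch below appends an
extra section to README text before it is written. It assumes the text
returned by `as_readme_text()` is an ordinary Python string;
`append_section()` is a hypothetical helper, not part of Sprout, and the
README text used here is a stand-in rather than real Sprout output.

```python
def append_section(readme_text: str, heading: str, body: str) -> str:
    """Return `readme_text` with an extra Markdown section at the end.

    Hypothetical helper for illustration; not part of Sprout.
    """
    # Normalise trailing whitespace so exactly one blank line precedes the heading.
    return f"{readme_text.rstrip()}\n\n## {heading}\n\n{body.strip()}\n"


# Stand-in for the string returned by `sp.as_readme_text(package_properties)`.
readme_text = "# A Study on Diabetes\n\nGenerated from the package properties.\n"
readme_text = append_section(
    readme_text,
    heading="Contact",
    body="Questions about these data? Email jamie_jones@example.com.",
)
print(readme_text)
```

In `main.py`, the modified text would then be passed to
`sp.write_file()` in place of the unmodified text.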

Next, run this command in the Terminal to create the README file. Note
that `write_file()` always overwrites any existing README file.

``` {.bash filename="Terminal"}
uv run main.py
```

```{python}
#| include: false
# Only to check that it runs.
readme_text = sp.as_readme_text(package_properties)
sp.write_file(
    string=readme_text,
    path=package_path.readme()
)
```

Now you can see that the `README.md` file has been created in your data
package:

```{python}
#| echo: false
print(file_tree(package_path.root()))
```

Now that you know how to create and manage metadata at the project
level, it's time to learn how to add data to the project and manage its
metadata.

```{python}
#| include: false
temp_path.cleanup()
```