From 6d6820646b9cdafeda30658a00b8ad5d9c7f4fb1 Mon Sep 17 00:00:00 2001 From: Tobias Raabe Date: Fri, 1 May 2026 10:31:56 +0200 Subject: [PATCH 1/7] docs: rewrite article about moving projects --- docs/source/how_to_guides/index.md | 2 +- .../move_project_to_another_machine.md | 77 +++++++++++++++ docs/source/how_to_guides/portability.md | 95 ------------------- mkdocs.yml | 2 +- 4 files changed, 79 insertions(+), 97 deletions(-) create mode 100644 docs/source/how_to_guides/move_project_to_another_machine.md delete mode 100644 docs/source/how_to_guides/portability.md diff --git a/docs/source/how_to_guides/index.md b/docs/source/how_to_guides/index.md index ce4084f1..2e10705a 100644 --- a/docs/source/how_to_guides/index.md +++ b/docs/source/how_to_guides/index.md @@ -9,7 +9,7 @@ specific tasks with pytask. - [Migrating From Scripts To Pytask](migrating_from_scripts_to_pytask.md) - [Interfaces For Dependencies Products](interfaces_for_dependencies_products.md) -- [Portability](portability.md) +- [Move a Project to Another Machine](move_project_to_another_machine.md) - [Remote Files](remote_files.md) - [Functional Interface](functional_interface.md) - [Capture Warnings](capture_warnings.md) diff --git a/docs/source/how_to_guides/move_project_to_another_machine.md b/docs/source/how_to_guides/move_project_to_another_machine.md new file mode 100644 index 00000000..a43fc15d --- /dev/null +++ b/docs/source/how_to_guides/move_project_to_another_machine.md @@ -0,0 +1,77 @@ +# Move a Project to Another Machine + +This guide teaches you how to move a pytask project to another machine or environment +and reuse existing outputs where possible. 
+ +## Update the lockfile on the source machine + +Run a normal build with [`pytask build`](../reference_guides/commands.md#pytask-build) +before moving the project so that `pytask.lock` and all outputs are up to date: + +```console +$ pytask build +``` + +## Move the project files and reusable outputs + +If you have not done it yet, commit `pytask.lock` to your repository and move it with +the project. In practice, move: + +- the project files tracked in version control, including source files, configuration, + data inputs, and `pytask.lock` +- the build artifacts you want to reuse, often in `bld/` if you follow the tutorial + layout +- the `.pytask` folder if you use the data catalog and it manages some of your files + +## Keep external files in the same relative layout + +If tasks use files outside the project root, keep the same relative layout on the target +machine. The project root is the folder with the `pyproject.toml` file. + +For example, if a task reads `../shared/input.csv` on the source machine, the moved +project also needs a readable `../shared/input.csv` next to the project root on the +target machine. + +## Run pytask on the target machine + +After you have moved the project to the target machine, run pytask to build the project: + +```console +$ pytask build +``` + +Assuming that the project was fully built before the move, pytask does not rebuild the +project and skips all tasks. + +## Clean stale lockfile entries + +If you removed, renamed, or moved tasks before transferring the project, clean up stale +lockfile entries on the source machine before you move the project: + +```console +$ pytask build --clean-lockfile +``` + +This rewrites the lockfile after a successful build with only the currently collected +tasks and their current state values. + +## If your project uses custom nodes + +Make sure custom node IDs and state values stay stable across machines: + +- Use project-relative IDs instead of absolute paths.
+- Prefer file content hashes over timestamps. +- Avoid machine-specific paths or timestamps in custom + [`state()`](../api/nodes_and_tasks.md#pytask.PNode.state) implementations. +- Provide a custom hash function for + [`PythonNode`](../api/nodes_and_tasks.md#pytask.PythonNode) values that are not + natively stable. + +Most projects that only use built-in nodes do not need extra work here. + +!!! seealso + + The lockfile format and behavior are documented in the + [reference guide](../reference_guides/lockfile.md). For custom nodes, see + [Writing custom nodes](writing_custom_nodes.md). For hashing guidance, see + [Hashing inputs of tasks](hashing_inputs_of_tasks.md). diff --git a/docs/source/how_to_guides/portability.md b/docs/source/how_to_guides/portability.md deleted file mode 100644 index 6ffb23dd..00000000 --- a/docs/source/how_to_guides/portability.md +++ /dev/null @@ -1,95 +0,0 @@ -# Portability - -This guide explains what you need to do to move a pytask project between machines and -why the lockfile is central to that process. - -!!! seealso - - The lockfile format and behavior are documented in the - [reference guide](../reference_guides/lockfile.md). - -## How to port a project - -Use this checklist when you move a project to another machine or environment. - -1. **Update state once on the source machine.** - -Run a normal build with [`pytask build`](../reference_guides/commands.md#pytask-build) -so `pytask.lock` is up to date: - -```` -```console -$ pytask build -``` - -If you already have a recent lockfile and up-to-date outputs, you can skip this step. -```` - -1. **Ship the right files.** - - Commit `pytask.lock` to your repository and move it with the project. 
In practice, - you should move: - - - the project files tracked in version control (source, configuration, data inputs - and `pytask.lock`) - - the build artifacts you want to reuse (often in `bld/` if you follow the tutorial - layout) - - the `.pytask` folder in case you are using the data catalog and it manages some of - the files - -1. **Files outside the project** - - If you have files outside the project root (the folder with the `pyproject.toml` - file), you need to make sure that the same relative layout exists on the target - machine. - -1. **Run pytask on the target machine.** - - When states match, tasks are skipped. When they differ, tasks run and the lockfile is - updated. - -## What makes a project portable - -There are two things that must stay stable across machines: - -First, task and node IDs must be stable. An ID is the unique identifier that ties a task -or node to an entry in `pytask.lock`. pytask builds these IDs from project-relative -paths anchored at the project root, so most users do not need to do anything. If you -implement custom nodes, make sure their IDs remain project-relative and stable across -machines. - -Second, state values must be portable. The lockfile stores opaque state strings from -[`PNode.state()`](../api/nodes_and_tasks.md#pytask.PNode.state) and -[`PTask.state()`](../api/nodes_and_tasks.md#pytask.PTask.state), and pytask uses them to -decide whether a task is up to date. Content hashes are portable; timestamps or absolute -paths are not. This mostly matters when you define custom nodes or custom hash -functions. - -## Tips for stable state values - -- Prefer file content hashes over timestamps for custom nodes. -- For [`PythonNode`](../api/nodes_and_tasks.md#pytask.PythonNode) values that are not - natively stable, provide a custom hash function. -- Avoid machine-specific paths or timestamps in custom - [`state()`](../api/nodes_and_tasks.md#pytask.PNode.state) implementations. - -!!! 
seealso - - For custom nodes, see [Writing custom nodes](writing_custom_nodes.md). For hashing - guidance, see [Hashing inputs of tasks](hashing_inputs_of_tasks.md). - -## Cleaning up the lockfile - -`pytask.lock` is updated incrementally. Entries are only replaced when the corresponding -tasks run. If tasks are removed or renamed, their old entries remain as stale data and -are ignored. - -To clean up stale entries without deleting the file, run -[`pytask build --clean-lockfile`](../reference_guides/commands.md#pytask-build--clean-lockfile): - -```console -$ pytask build --clean-lockfile -``` - -This rewrites the lockfile after a successful build with only the currently collected -tasks and their current state values. diff --git a/mkdocs.yml b/mkdocs.yml index 4711b71a..94291903 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -37,7 +37,7 @@ nav: - Overview: how_to_guides/index.md - Migrating From Scripts To pytask: how_to_guides/migrating_from_scripts_to_pytask.md - Interfaces For Dependencies And Products: how_to_guides/interfaces_for_dependencies_products.md - - Portability: how_to_guides/portability.md + - Move a Project to Another Machine: how_to_guides/move_project_to_another_machine.md - Remote Files: how_to_guides/remote_files.md - Functional Interface: how_to_guides/functional_interface.md - Capture Warnings: how_to_guides/capture_warnings.md From 819dbfdf2a63ba9ccecab1b36ed50e7922027d99 Mon Sep 17 00:00:00 2001 From: Tobias Raabe Date: Fri, 1 May 2026 10:44:59 +0200 Subject: [PATCH 2/7] Document lockfile in tutorials --- .../tutorials/defining_dependencies_products.md | 6 ++++++ docs/source/tutorials/set_up_a_project.md | 14 ++++++++++++-- docs/source/tutorials/using_a_data_catalog.md | 4 ++++ docs/source/tutorials/write_a_task.md | 5 +++++ 4 files changed, 27 insertions(+), 2 deletions(-) diff --git a/docs/source/tutorials/defining_dependencies_products.md b/docs/source/tutorials/defining_dependencies_products.md index 86cf5da0..f43b5b6f 100644 --- 
a/docs/source/tutorials/defining_dependencies_products.md +++ b/docs/source/tutorials/defining_dependencies_products.md @@ -36,6 +36,8 @@ my_project │ ├────task_data_preparation.py │ └────task_plot_data.py │ +├───pytask.lock +│ └───pyproject.toml ``` @@ -107,6 +109,10 @@ Now, let us execute the two paths. --8<-- "docs/source/_static/md/defining-dependencies-products.md" +The build updates `pytask.lock` with the state of both tasks. When you run the same +tasks again without changing their dependencies, products, or source files, pytask uses +the lockfile to skip them. + ## Relative paths Dependencies and products do not have to be absolute paths. If paths are relative, they diff --git a/docs/source/tutorials/set_up_a_project.md b/docs/source/tutorials/set_up_a_project.md index a8317aa3..bca31778 100644 --- a/docs/source/tutorials/set_up_a_project.md +++ b/docs/source/tutorials/set_up_a_project.md @@ -14,7 +14,8 @@ move to the next section of the tutorials. ## The directory structure -The following directory tree gives an overview of the project's different parts. +The following directory tree gives an overview of the project's different parts after +the first build. ```text my_project @@ -30,13 +31,16 @@ my_project │ ├────config.py │ └────... │ +├───pytask.lock +│ └───pyproject.toml ``` -Replicate this directory structure for your project or start from pytask's +Create the project files and folders for your project or start from pytask's [cookiecutter-pytask-project](https://github.com/pytask-dev/cookiecutter-pytask-project) template or any other [linked template or example project](../how_to_guides/bp_templates_and_projects.md). +pytask creates the `.pytask` folder and `pytask.lock` file during builds. ## The `src` directory @@ -134,6 +138,12 @@ The `[tool.pytask.ini_options]` section tells pytask to look for tasks in The `.pytask` directory is where pytask stores its information. You do not need to interact with it. 
+## The `pytask.lock` file + +The `pytask.lock` file records which tasks and products are up to date. pytask updates +it during builds so later runs can skip unchanged tasks. This file should be kept in +version control. + ## Installation === "uv" diff --git a/docs/source/tutorials/using_a_data_catalog.md b/docs/source/tutorials/using_a_data_catalog.md index 9e50935d..5d7d86d2 100644 --- a/docs/source/tutorials/using_a_data_catalog.md +++ b/docs/source/tutorials/using_a_data_catalog.md @@ -36,6 +36,8 @@ my_project │ ├────task_data_preparation.py │ └────task_plot_data.py │ +├───pytask.lock +│ └───pyproject.toml ``` @@ -148,6 +150,8 @@ my_project │ ├───pyproject.toml │ +├───pytask.lock +│ ├───src │ └───my_project │ ├────config.py diff --git a/docs/source/tutorials/write_a_task.md b/docs/source/tutorials/write_a_task.md index 3b0d7a87..8d41f3c8 100644 --- a/docs/source/tutorials/write_a_task.md +++ b/docs/source/tutorials/write_a_task.md @@ -24,6 +24,8 @@ my_project │ ├────config.py │ └────task_data_preparation.py │ +├───pytask.lock +│ └───pyproject.toml ``` @@ -78,6 +80,9 @@ Now, execute pytask to collect tasks in the current and subsequent directories. --8<-- "docs/source/_static/md/write-a-task.md" +After the task succeeds, pytask writes `pytask.lock` next to `pyproject.toml`. Keep this +file under version control so later builds can detect unchanged tasks. + ## Customize task names From 0aff0415cc53913c91c2f441fb669c6358418054 Mon Sep 17 00:00:00 2001 From: Tobias Raabe Date: Fri, 1 May 2026 10:45:46 +0200 Subject: [PATCH 3/7] Update docs authoring instructions --- docs/AGENTS.md | 5 +++++ docs/CLAUDE.md | 1 - 2 files changed, 5 insertions(+), 1 deletion(-) delete mode 120000 docs/CLAUDE.md diff --git a/docs/AGENTS.md b/docs/AGENTS.md index db21c5e2..a51a353a 100644 --- a/docs/AGENTS.md +++ b/docs/AGENTS.md @@ -2,6 +2,11 @@ ## General +- The structure of the documentation follows https://diataxis.fr/. 
When writing or + editing an article, first read the relevant guidance from the Diataxis framework: + https://diataxis.fr/tutorials, https://diataxis.fr/how-to-guides/, + https://diataxis.fr/explanation/, https://diataxis.fr/reference/. + https://diataxis.fr/compass/ explains what belongs where and how the parts relate. - Document only public APIs and user-facing behavior - exclude internals, framework abstractions, and implementation plumbing - Users need actionable documentation on what they can use, not confusing details about internal mechanics they can't control diff --git a/docs/CLAUDE.md b/docs/CLAUDE.md deleted file mode 120000 index 47dc3e3d..00000000 --- a/docs/CLAUDE.md +++ /dev/null @@ -1 +0,0 @@ -AGENTS.md \ No newline at end of file From f0b01ecb56d6ea26af1b44faec70e98622ffc94a Mon Sep 17 00:00:00 2001 From: Tobias Raabe Date: Fri, 1 May 2026 10:47:09 +0200 Subject: [PATCH 4/7] Link lockfile follow-up docs from tutorial --- docs/source/tutorials/set_up_a_project.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/docs/source/tutorials/set_up_a_project.md b/docs/source/tutorials/set_up_a_project.md index bca31778..a4385cd3 100644 --- a/docs/source/tutorials/set_up_a_project.md +++ b/docs/source/tutorials/set_up_a_project.md @@ -144,6 +144,12 @@ The `pytask.lock` file records which tasks and products are up to date. pytask u it during builds so later runs can skip unchanged tasks. This file should be kept in version control. +When you move a project to another machine, see +[Move a Project to Another Machine](../how_to_guides/move_project_to_another_machine.md). +To update recorded task state manually, use +[`pytask lock`](../reference_guides/commands.md#pytask-lock). For details on what pytask +stores in the file, see [Lockfile](../reference_guides/lockfile.md).
+ ## Installation === "uv" From 3aa4cef51ca9518a7be74af302df328cf515fe54 Mon Sep 17 00:00:00 2001 From: Tobias Raabe Date: Fri, 1 May 2026 10:53:50 +0200 Subject: [PATCH 5/7] clarify some sections --- docs/source/how_to_guides/index.md | 2 +- ...te_the_lockfile_to_match_project_state.md} | 0 docs/source/tutorials/set_up_a_project.md | 22 ++++++++++--------- mkdocs.yml | 2 +- 4 files changed, 14 insertions(+), 12 deletions(-) rename docs/source/how_to_guides/{reconciling_lockfile_state.md => update_the_lockfile_to_match_project_state.md} (100%) diff --git a/docs/source/how_to_guides/index.md b/docs/source/how_to_guides/index.md index 9c49e87a..c27d8584 100644 --- a/docs/source/how_to_guides/index.md +++ b/docs/source/how_to_guides/index.md @@ -10,7 +10,7 @@ specific tasks with pytask. - [Migrating From Scripts To Pytask](migrating_from_scripts_to_pytask.md) - [Interfaces For Dependencies Products](interfaces_for_dependencies_products.md) - [Move a Project to Another Machine](move_project_to_another_machine.md) -- [Update the Lockfile to Match Project State](reconciling_lockfile_state.md) +- [Update the Lockfile to Match Project State](update_the_lockfile_to_match_project_state.md) - [Remote Files](remote_files.md) - [Functional Interface](functional_interface.md) - [Capture Warnings](capture_warnings.md) diff --git a/docs/source/how_to_guides/reconciling_lockfile_state.md b/docs/source/how_to_guides/update_the_lockfile_to_match_project_state.md similarity index 100% rename from docs/source/how_to_guides/reconciling_lockfile_state.md rename to docs/source/how_to_guides/update_the_lockfile_to_match_project_state.md diff --git a/docs/source/tutorials/set_up_a_project.md b/docs/source/tutorials/set_up_a_project.md index a4385cd3..d6168e9e 100644 --- a/docs/source/tutorials/set_up_a_project.md +++ b/docs/source/tutorials/set_up_a_project.md @@ -133,22 +133,24 @@ The `[tool.pytask.ini_options]` section tells pytask to look for tasks in `src/my_project`. 
You will learn more about configuration in the [configuration tutorial](configuration.md). -## The `.pytask` directory - -The `.pytask` directory is where pytask stores its information. You do not need to -interact with it. - ## The `pytask.lock` file The `pytask.lock` file records which tasks and products are up to date. pytask updates it during builds so later runs can skip unchanged tasks. This file should be kept in version control. -When you move a project to another machine, see -[Move a Project to Another Machine](../how_to_guides/move_project_to_another_machine.md). -To update recorded task state manually, use -[`pytask lock`](../reference_guides/commands.md#pytask-lock). For details on what pytask -stores in the file, see [Lockfile](../reference_guides/lockfile.md). +!!! seealso + + You will later learn how to sync the lockfile with the project state using + the [`pytask lock`](../reference_guides/commands.md#pytask-lock) command and how the + lockfile enables you to + [move a project to another machine](../how_to_guides/move_project_to_another_machine.md), + but don't worry about it for now. + +## The `.pytask` directory + +The `.pytask` directory is where pytask stores some of its ephemeral information. You do +not need to interact with it, nor do you need to keep it in version control.
## Installation diff --git a/mkdocs.yml b/mkdocs.yml index 3cfda22a..57f09cdb 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -38,7 +38,7 @@ nav: - Migrating From Scripts To pytask: how_to_guides/migrating_from_scripts_to_pytask.md - Interfaces For Dependencies And Products: how_to_guides/interfaces_for_dependencies_products.md - Move a Project to Another Machine: how_to_guides/move_project_to_another_machine.md - - Reconciling Lockfile State: how_to_guides/reconciling_lockfile_state.md + - Update the Lockfile to Match Project State: how_to_guides/update_the_lockfile_to_match_project_state.md - Remote Files: how_to_guides/remote_files.md - Functional Interface: how_to_guides/functional_interface.md - Capture Warnings: how_to_guides/capture_warnings.md From 4be7d50ff508e65808d14d42d3b72b9de1978ebf Mon Sep 17 00:00:00 2001 From: Tobias Raabe Date: Fri, 1 May 2026 10:55:47 +0200 Subject: [PATCH 6/7] Document docs update in changelog --- CHANGELOG.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 8f08448e..2f28950f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,9 @@ releases are available on [PyPI](https://pypi.org/project/pytask) and ## Unreleased +- [#874](https://github.com/pytask-dev/pytask/pull/874) improves the lockfile + documentation by restructuring related guides around user workflows and introducing + `pytask.lock` in the tutorials. - [#868](https://github.com/pytask-dev/pytask/pull/868) resets the global marker configuration during unconfigure so `--strict-markers` no longer leaks into later marker access in the same process. 
From e57e75faef17d4f62f274c82952a4b1596a9026d Mon Sep 17 00:00:00 2001 From: Tobias Raabe Date: Fri, 1 May 2026 12:29:20 +0200 Subject: [PATCH 7/7] Clarify generated setup files --- docs/source/tutorials/set_up_a_project.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/source/tutorials/set_up_a_project.md b/docs/source/tutorials/set_up_a_project.md index d6168e9e..9d1a62c4 100644 --- a/docs/source/tutorials/set_up_a_project.md +++ b/docs/source/tutorials/set_up_a_project.md @@ -14,13 +14,12 @@ move to the next section of the tutorials. ## The directory structure -The following directory tree gives an overview of the project's different parts after -the first build. +The following directory tree gives an overview of the project's different parts. ```text my_project │ -├───.pytask +├───.pytask # Generated by pytask. │ ├───bld │ └────... @@ -31,7 +30,7 @@ my_project │ ├────config.py │ └────... │ -├───pytask.lock +├───pytask.lock # Generated by pytask. │ └───pyproject.toml ``` @@ -40,7 +39,7 @@ Create the project files and folders for your project or start from pytask's [cookiecutter-pytask-project](https://github.com/pytask-dev/cookiecutter-pytask-project) template or any other [linked template or example project](../how_to_guides/bp_templates_and_projects.md). -pytask creates the `.pytask` folder and `pytask.lock` file during builds. +pytask creates the `.pytask` folder and `pytask.lock` file later when you run tasks. ## The `src` directory