Fix setting the name of PythonNode. (#443)
tobiasraabe committed Oct 8, 2023
1 parent b191559 commit 326e589
Showing 13 changed files with 146 additions and 118 deletions.
13 changes: 9 additions & 4 deletions .readthedocs.yaml
Original file line number Diff line number Diff line change
@@ -1,12 +1,17 @@
version: 2

build:
os: "ubuntu-20.04"
os: ubuntu-22.04
tools:
python: "mambaforge-4.10"
python: "3.10"

sphinx:
configuration: docs/source/conf.py
fail_on_warning: true

conda:
environment: docs/rtd_environment.yml
python:
install:
- method: pip
path: .
extra_requirements:
- docs
36 changes: 0 additions & 36 deletions docs/rtd_environment.yml

This file was deleted.

5 changes: 5 additions & 0 deletions docs/source/changes.md
@@ -5,6 +5,11 @@ chronological order. Releases follow [semantic versioning](https://semver.org/)
releases are available on [PyPI](https://pypi.org/project/pytask) and
[Anaconda.org](https://anaconda.org/conda-forge/pytask).

## 0.4.1 - 2023-10-08

- {pull}`443` ensures that `PythonNode.name` is always unique by only handling it
internally.

## 0.4.0 - 2023-10-07

- {pull}`323` remove Python 3.7 support and use a new Github action to provide mamba.
1 change: 1 addition & 0 deletions docs/source/conf.py
@@ -41,6 +41,7 @@
"sphinx.ext.viewcode",
"sphinx_copybutton",
"sphinx_click",
"sphinx_toolbox.more_autodoc.autoprotocol",
"nbsphinx",
"myst_parser",
"sphinx_design",
23 changes: 12 additions & 11 deletions docs/source/how_to_guides/writing_custom_nodes.md
@@ -1,8 +1,8 @@
# Writing custom nodes

In the previous tutorials and how-to guides, you learned that dependencies and products
can be represented as plain Python objects with {class}`pytask.PythonNode` or as paths
where every {class}`pathlib.Path` is converted to a {class}`pytask.PathNode`.
can be represented as plain Python objects with {class}`~pytask.PythonNode` or as paths
where every {class}`pathlib.Path` is converted to a {class}`~pytask.PathNode`.

In this how-to guide, you will learn about the general concept of nodes and how to write
your own to improve your workflows.
@@ -54,13 +54,13 @@ A custom node needs to follow an interface so that pytask can perform several ac
- Load and save values when tasks are executed.

This interface is defined by protocols [^structural-subtyping]. A custom node must
follow at least the protocol {class}`pytask.PNode` or, even better,
{class}`pytask.PPathNode` if it is based on a path. The common node for paths,
{class}`pytask.PathNode`, follows the protocol {class}`pytask.PPathNode`.
follow at least the protocol {class}`~pytask.PNode` or, even better,
{class}`~pytask.PPathNode` if it is based on a path. The common node for paths,
{class}`~pytask.PathNode`, follows the protocol {class}`~pytask.PPathNode`.
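
The idea of following a protocol without inheriting from it can be sketched with plain `typing.Protocol`. The names `SketchPathNode` and `SketchPickleNode` below are hypothetical and are not pytask's actual definitions; the members (`name`, `path`, `state`, `load`, `save`) and the use of the file's modification time as state are assumptions based on the description above.

```python
from __future__ import annotations

import pickle
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SketchPathNode(Protocol):
    """Hypothetical, simplified stand-in for a path-based node protocol."""

    name: str
    path: Path

    def state(self) -> str | None: ...

    def load(self) -> Any: ...

    def save(self, value: Any) -> None: ...


@dataclass
class SketchPickleNode:
    """Follows the protocol structurally; it does not inherit from it."""

    name: str
    path: Path

    def state(self) -> str | None:
        # Assumption: the modification time signals whether the file changed.
        if self.path.exists():
            return str(self.path.stat().st_mtime)
        return None

    def load(self) -> Any:
        return pickle.loads(self.path.read_bytes())

    def save(self, value: Any) -> None:
        self.path.write_bytes(pickle.dumps(value))
```

Because the protocol is marked `runtime_checkable`, `isinstance(node, SketchPathNode)` is true for any object that provides these members, whether or not it inherits from the protocol.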

## `PickleNode`

Since our {class}`PickleNode` will only vary slightly from {class}`pytask.PathNode`, we
Since our {class}`PickleNode` will only vary slightly from {class}`~pytask.PathNode`, we
use it as a template, and with some minor modifications, we arrive at the following
class.

@@ -85,8 +85,8 @@ class.

Here are some explanations.

- The node does not need to inherit from the protocol {class}`pytask.PPathNode`, but you
can do it to be more explicit.
- The node does not need to inherit from the protocol {class}`~pytask.PPathNode`, but
you can do it to be more explicit.
- The node has two attributes
- `name` identifies the node in the DAG, so the name must be unique.
- `path` holds the path to the file and identifies the node as a path node that is
@@ -107,9 +107,10 @@ Nodes are an important in concept pytask. They allow to pytask to build a DAG an
generate a workflow, and they also allow users to extract IO operations from the task
function into the nodes.

pytask only implements two node types, {class}`PathNode` and {class}`PythonNode`, but
many more are possible. In the future, there should probably be a plugin that implements
nodes for many other data sources like AWS S3 or databases. [^kedro]
pytask only implements two node types, {class}`~pytask.PathNode` and
{class}`~pytask.PythonNode`, but many more are possible. In the future, there should
probably be a plugin that implements nodes for many other data sources like AWS S3 or
databases. [^kedro]

## References

27 changes: 15 additions & 12 deletions docs/source/reference_guides/api.md
@@ -239,25 +239,30 @@ The remaining exceptions convey specific errors.
```

## Nodes
## Protocols

Nodes are the interface for different kinds of dependencies or products. They inherit
from {class}`pytask.MetaNode`.
Protocols define how tasks and nodes for dependencies and products have to be set up.

```{eval-rst}
.. autoclass:: pytask.MetaNode
.. autoprotocol:: pytask.MetaNode
:show-inheritance:
.. autoprotocol:: pytask.PNode
:show-inheritance:
.. autoprotocol:: pytask.PPathNode
:show-inheritance:
.. autoprotocol:: pytask.PTask
:show-inheritance:
.. autoprotocol:: pytask.PTaskWithPath
:show-inheritance:
```

Then, different kinds of nodes can be implemented.
## Nodes

```{eval-rst}
.. autoclass:: pytask.PathNode
:members:
```
Nodes are the interface for different kinds of dependencies or products.

```{eval-rst}
.. autoclass:: pytask.PathNode
.. autoclass:: pytask.PythonNode
:members:
```

To parse dependencies and products from nodes, use the following functions.
@@ -357,8 +362,6 @@ There are some classes to handle different kinds of reports.
An indicator to mark arguments of tasks as products.
Examples
--------
>>> def task_example(path: Annotated[Path, Product]) -> None:
... path.write_text("Hello, World!")
4 changes: 2 additions & 2 deletions docs/source/tutorials/defining_dependencies_products.md
@@ -320,7 +320,7 @@ Secondly, dictionaries use keys instead of positions that are more verbose and
descriptive and do not assume a fixed ordering. Both attributes are especially desirable
in complex projects.

## Multiple decorators
**Multiple decorators**

pytask merges multiple decorators of one kind into a single dictionary. This might help
you to group dependencies and apply them to multiple tasks.
@@ -344,7 +344,7 @@ Inside the task, `depends_on` will be
{"first_text": ... / "text_1.txt", "second_text": "text_2.txt", 0: "text_3.txt"}
```

## Nested dependencies and products
**Nested dependencies and products**

Dependencies and products can be nested containers consisting of tuples, lists, and
dictionaries. It is beneficial if you want more structure and nesting.
1 change: 1 addition & 0 deletions environment.yml
@@ -47,4 +47,5 @@ dependencies:
- sphinxext-opengraph

- pip:
- sphinx-toolbox
- -e .
13 changes: 13 additions & 0 deletions setup.cfg
@@ -54,6 +54,19 @@ where = src
console_scripts =
pytask=_pytask.cli:cli

[options.extras_require]
docs =
furo
ipython
myst-parser
nbsphinx
sphinx
sphinx-click
sphinx-copybutton
sphinx-design>=0.3.0
sphinx-toolbox
sphinxext-opengraph

[check-manifest]
ignore =
src/_pytask/_version.py
40 changes: 17 additions & 23 deletions src/_pytask/collect.py
@@ -325,20 +325,8 @@ def pytask_collect_node(session: Session, path: Path, node_info: NodeInfo) -> PN
node = node_info.value

if isinstance(node, PythonNode):
prefix = (
node_info.task_path.as_posix() + "::" + node_info.task_name
if node_info.task_path
else node_info.task_name
)
if node.name:
node.name = prefix + "::" + node.name
else:
node.name = prefix + "::" + node_info.arg_name

suffix = "-".join(map(str, node_info.path)) if node_info.path else ""
if suffix:
node.name += "::" + suffix

node_name = _create_name_of_python_node(node_info)
node.name = node_name
return node

if isinstance(node, PPathNode) and not node.path.is_absolute():
@@ -366,15 +354,7 @@ def pytask_collect_node(session: Session, path: Path, node_info: NodeInfo) -> PN
)
return PathNode.from_path(node)

prefix = (
node_info.task_path.as_posix() + "::" + node_info.task_name
if node_info.task_path
else node_info.task_name
)
node_name = prefix + "::" + node_info.arg_name
suffix = "-".join(map(str, node_info.path)) if node_info.path else ""
if suffix:
node_name += "::" + suffix
node_name = _create_name_of_python_node(node_info)
return PythonNode(value=node, name=node_name)


@@ -514,3 +494,17 @@ def pytask_collect_log(
)

raise CollectionError


def _create_name_of_python_node(node_info: NodeInfo) -> str:
"""Create name of PythonNode."""
prefix = (
node_info.task_path.as_posix() + "::" + node_info.task_name
if node_info.task_path
else node_info.task_name
)
node_name = prefix + "::" + node_info.arg_name
if node_info.path:
suffix = "-".join(map(str, node_info.path))
node_name += "::" + suffix
return node_name
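
As a rough illustration of the names this helper produces, the sketch below repeats its logic against a stand-in for `NodeInfo`. `FakeNodeInfo` and `create_name_of_python_node_sketch` are hypothetical names introduced only for this example; the stand-in carries just the four attributes the helper reads, and the real `NodeInfo` may contain more.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class FakeNodeInfo:
    """Hypothetical stand-in for NodeInfo with only the fields used below."""

    arg_name: str
    path: tuple[str | int, ...]
    task_path: PurePosixPath | None
    task_name: str


def create_name_of_python_node_sketch(node_info: FakeNodeInfo) -> str:
    """Mirror the naming logic of the helper shown in the diff."""
    prefix = (
        node_info.task_path.as_posix() + "::" + node_info.task_name
        if node_info.task_path
        else node_info.task_name
    )
    node_name = prefix + "::" + node_info.arg_name
    if node_info.path:
        # Keys or positions inside nested containers are joined with "-".
        node_name += "::" + "-".join(map(str, node_info.path))
    return node_name


top_level = FakeNodeInfo(
    arg_name="value",
    path=(),
    task_path=PurePosixPath("src/task_example.py"),
    task_name="task_example",
)
nested = FakeNodeInfo(
    arg_name="data",
    path=("a", 0),
    task_path=None,
    task_name="task_example",
)

print(create_name_of_python_node_sketch(top_level))
# src/task_example.py::task_example::value
print(create_name_of_python_node_sketch(nested))
# task_example::data::a-0
```

The name is built only from the task's location, the task name, the argument name, and the position inside nested containers, so any name set by the user on a `PythonNode` no longer takes part in it.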
2 changes: 1 addition & 1 deletion src/_pytask/node_protocols.py
@@ -22,7 +22,7 @@ class MetaNode(Protocol):
"""Protocol for an intersection between nodes and tasks."""

name: str
"""The name of node that must be unique."""
"""Name of the node that must be unique."""

@abstractmethod
def state(self) -> str | None: