# Restarting nodes

Because the graph checkpoints are stored, we can recover from various errors without redoing the computation from scratch.
If a node has failed for an ephemeral reason, it is enough to invalidate that single node and re-run it.
However, if a node has written outputs that are incorrect for some reason, the workflow will continue and many dependent nodes may run with incorrect inputs.
In these cases the method `storage.restart_task` may be useful.
It finds all of the dependent nodes that have run and invalidates these as well.
This means that the chosen node will be restarted and all the subsequent nodes that depend on it will run again with new inputs.
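
The idea of invalidating a node together with its dependents can be illustrated with a small standalone sketch. It does not use Tierkreis and is not how `storage.restart_task` is implemented. The `dependents` mapping, the `finished` set and the function `nodes_to_invalidate` are hypothetical names chosen for illustration only.

```python
from collections import deque


def nodes_to_invalidate(start, dependents, finished):
    """Return `start` plus every downstream node that has already run.

    `dependents` maps a node to the nodes that consume its outputs
    (a hypothetical structure for this sketch). `finished` is the set
    of nodes that have run.
    """
    to_invalidate = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, []):
            # A node that has not run has nothing to invalidate,
            # and nothing downstream of it can have run either.
            if child in seen or child not in finished:
                continue
            seen.add(child)
            to_invalidate.append(child)
            queue.append(child)
    return to_invalidate


# Toy graph: a feeds b and c, both feed d, d feeds e; x and y are unrelated.
dependents = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "x": ["y"]}
finished = {"a", "b", "c", "d", "x", "y"}  # e has not run yet

result = nodes_to_invalidate("b", dependents, finished)
print(result)
```

In this toy graph, restarting `b` also invalidates `d`, because `d` consumed the output of `b`. The nodes `a`, `c`, `x` and `y` are left untouched, and `e` needs no invalidation because it has not run.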


In [1]:

from uuid import UUID
from tests.controller.typed_graphdata import typed_eval
from tierkreis.consts import WORKERS_DIR
from tierkreis import run_graph
from tierkreis.storage import FileStorage
from tierkreis.executor import UvExecutor

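# Create file-based storage for the workflow and an executor for its workers.
storage = FileStorage(UUID(int=205), "restart_example", do_cleanup=True)
executor = UvExecutor(WORKERS_DIR, storage.logs_path)
# Run the example graph to completion.
run_graph(storage, executor, typed_eval(), {})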

If we open the workflow in the visualiser, we see that all the nodes have run and the workflow has completed.

In [2]:
from tierkreis.controller import resume_graph
from tierkreis.controller.data.location import Loc

# Invalidate the node at this location and the dependent nodes that have already run.
storage.restart_task(Loc().N(3).N(3))

The visualiser will now show that the nodes dependent on `Loc().N(3).N(3)` have been restarted.

In [3]:
# Resume the workflow so that the invalidated nodes run again.
resume_graph(storage, executor)

The graph now shows as completed again.