-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: task graph html repr #99
Conversation
src/sciline/task_graph.py
Outdated
'<p>Task graph computes</p>\n', | ||
_list_items(leafs), | ||
'<p>based on</p>\n', | ||
_list_max_n_then_hide(roots), | ||
f'<p>using the scheduler {scheduler}.</p>', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must say not a fan of the "narrative" formatting. How about:
Output keys: A, B, C
Scheduler: DaskScheduler
Providers:
....
For the dask scheduler, it would be very helpful if we can include which underlying dask scheduler (get
function) is used, see here: https://scipp.github.io/sciline/generated/classes/sciline.scheduler.DaskScheduler.html.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a bit of an experiment ;) I'll change it.
However I don't think the last column should be "Providers" but rather something like "Input parameters"?
Or do you think it would be better to list all providers in the task graph rather than only the inputs?
Once the user has selected what output they are interested in I think the intermediate providers are kind of uninteresting, while the inputs that contribute to a particular output might be interesting for debugging purposes / understanding what is needed to compute a particular quantitiy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the user has selected what output they are interested in I think the intermediate providers are kind of uninteresting
Maybe? We can only list the inputs for now.
) | ||
|
||
|
||
def _list_max_n_then_hide(items: Sequence[str], n: int = 5, header: str = '') -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good function name 👍
src/sciline/task_graph.py
Outdated
{ | ||
escape(keyname(key)) | ||
for key, (_, requires) in self._graph.items() | ||
if len(requires) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a sufficient check for an input param. One can define providers without inputs. We should use the same check as in other places (Pipeline repr, graph visualization).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm I'm not sure I understand. Is there a conceptual difference between a parameter and a provider without inputs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a parameter is typically something a user sets, i.e., it would typically be different every time the pipeline is used (and therefore seeing its value is valuable). A provider is more static and would typically not change (and we cannot get its value without calling it, which may be expensive).
src/sciline/task_graph.py
Outdated
if len(requires) == 0 | ||
} | ||
) | ||
scheduler = escape(self._scheduler.__class__.__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you include the underlying dask scheduler, this is important information. Add a method to the schedulers? Not sure if __str__
or __repr__
would make sense for them, maybe?
src/sciline/utils.py
Outdated
def keyname(key: Key) -> str: | ||
if isinstance(key, Item): | ||
return f'{keyname(key.tp)}({keyname(key.label[0].tp)})' | ||
args = get_args(key) | ||
if len(args): | ||
parameters = ', '.join(map(keyname, args)) | ||
return f'{qualname(key)}[{parameters}]' | ||
return qualname(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this compare to what is used in visualize
? Can we use the same code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think we can.
c699e1c
to
805eeb3
Compare
Fixes #70