Categorise high level layers for display #7919
Comments
This is amazing! I will get started on finding out more about this and share my findings here! 🚀 |
I suggest starting with a concrete list of the current known implementations of Layer. |
List of Layers
General
# High level graph layer
class Layer(collections.abc.Mapping)
# Fully materialized layer of `Layer`
class MaterializedLayer(Layer)
# Tensor Operation
class Blockwise(Layer)
Array Layers
# Specialized Blockwise Layer for array creation routines
class BlockwiseCreateArray(Blockwise)
# Simple HighLevelGraph array overlap layer
class ArrayOverlapLayer(Layer)
Dataframe Layers
# DataFrame-based HighLevelGraph Layer
class DataFrameLayer(Layer)
# High-level graph layer for a simple shuffle operation in which each output partition depends on all input partitions
class SimpleShuffleLayer(DataFrameLayer)
# High-level graph layer corresponding to a single stage of a multi-stage inter-partition shuffle operation.
class ShuffleLayer(SimpleShuffleLayer)
# High-level graph layer for a join operation requiring the smaller collection to be broadcasted to every partition of the larger collection.
class BroadcastJoinLayer(DataFrameLayer)
# DataFrame-based Blockwise Layer with IO
class DataFrameIOLayer(Blockwise, DataFrameLayer) |
Copying over my comment from the Slack thread earlier today:
|
So I think I'm advocating more that the Layer classes have some default annotations, and that's all. That way, we only need to look at these annotations; we don't need to do any
Totally agree that "CPU" should not have any associated decoration, as it will be the default, most common box. |
@martindurant Meanwhile, I have come up with the visualization for the HLGs!
Design
Group the listed layer_types into 4 categories:
IO (DataFrame)
Shuffle (DataFrame)
Blockwise (Array mostly + DataFrame)
Materialized (Array + DataFrame)
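The four categories above could be expressed as a simple lookup from layer class name to display category. This is only a sketch: the assignment of individual classes to categories is an assumption based on the class names listed earlier in the thread, and `LAYER_CATEGORIES`, `categorize`, and the `"Other"` fallback are hypothetical names, not part of Dask.

```python
from collections import defaultdict

# Hypothetical mapping from layer class name to display category.
# The class names come from the list earlier in this thread; which
# category each one belongs to is an assumption made for illustration.
LAYER_CATEGORIES = {
    "DataFrameIOLayer": "IO",
    "SimpleShuffleLayer": "Shuffle",
    "ShuffleLayer": "Shuffle",
    "Blockwise": "Blockwise",
    "BlockwiseCreateArray": "Blockwise",
    "MaterializedLayer": "Materialized",
}


def categorize(layer_type, default="Other"):
    """Return the display category for a layer class name."""
    return LAYER_CATEGORIES.get(layer_type, default)


def group_by_category(layer_types):
    """Group layer class names by display category, keeping input order."""
    groups = defaultdict(list)
    for layer_type in layer_types:
        groups[categorize(layer_type)].append(layer_type)
    return dict(groups)
```

Any layer type missing from the mapping falls into the placeholder `"Other"` group, so layers that are not yet categorised still get displayed.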
The remaining layers:
Visualization
Graphviz (
|
Empty? If we can't think of anything useful, it's better not to complicate the visuals. |
/cc @mrocklin What do you think about the use of colors? |
I hadn't thought of changing the colours in the dots next to the layer titles in the HTML representation, that's a nice touch. Personally, I'm not sure we're gaining much by having multiple shades of green/blue/etc. for different types of layers in the same larger category. I think that adds more confusion. We'll need some effort spent on:
|
I'm sorry, what does this mean? |
I think it means that the outline box is not just a rectangle, but has a little fold in the corner. Pretty subtle.
On July 26, 2021 10:28:04 PM EDT, Genevieve Buckley wrote:
> For example, I can add a little node form **note** to all of the Shuffle and DataFrameIO layers.
I'm sorry, what does this mean?
|
@GenevieveBuckley Yes, Martin is correct. So, in essence, the current node shape is "box." Another node shape is "note," which resembles a paper fold from the corner. Normally, people fold a page to bookmark it so that it can be easily found later. In this case, the fold would indicate that this area warrants further investigation. |
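To make the "box" versus "note" distinction concrete, here is a minimal sketch that writes Graphviz DOT source by hand, giving the Shuffle and DataFrameIO layers the `note` shape and everything else the `box` shape. Both shapes are standard Graphviz node shapes; `FLAGGED_LAYER_TYPES`, `node_shape`, and `to_dot` are hypothetical helpers for illustration and are not how Dask builds its graph.

```python
# Layer class names that would get the folded-corner "note" shape.
# Hypothetical set, following the proposal to mark Shuffle and DataFrameIO layers.
FLAGGED_LAYER_TYPES = {"SimpleShuffleLayer", "ShuffleLayer", "DataFrameIOLayer"}


def node_shape(layer_type):
    """Pick a Graphviz node shape for a layer class name."""
    return "note" if layer_type in FLAGGED_LAYER_TYPES else "box"


def _quote(name):
    """Wrap a name in double quotes for DOT, escaping embedded quotes."""
    return '"' + name.replace('"', '\\"') + '"'


def to_dot(layers, edges):
    """Build DOT source from {layer name: layer class name} and (src, dst) edges."""
    lines = ["digraph hlg {"]
    for name, layer_type in layers.items():
        lines.append(f"    {_quote(name)} [shape={node_shape(layer_type)}];")
    for src, dst in edges:
        lines.append(f"    {_quote(src)} -> {_quote(dst)};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(
        {"read-parquet": "DataFrameIOLayer", "add": "Blockwise"},
        [("read-parquet", "add")],
    ))
```

Rendering the printed source with the `dot` command-line tool shows the two shapes side by side, which is a quick way to judge whether the folded corner is noticeable enough.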
Thanks @freyam, but all the colour design was done by an external designer. See dask/community#135. I've added another comment to that issue to see if we can get the designer involved in choosing more colours. |
That's amazing! 🚀 |
I think that the difference is too subtle to see in your examples, and the meaning of the different shapes will not be obvious to users. |
Just wanna confirm: |
I have opened a Draft PR where I will be working along with this discussion. |
There is currently a concrete set of subclasses of the base highlevelgraph.Layer. Some of these have specific contexts or collection linkage (array vs dataframe), others do not. For the sake of the work being done by @freyam, it would be nice to create some categories for the purpose of being shown in .visualize().
Layers allow for attaching attributes at instantiation. I suggest there might also be class attributes giving information about the layer type, which will be true for all instances.
Example: DataFrameIOLayer is IO by operation type and dataframe by collection. It would be reasonable for these to be among the default annotations of all instances.
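The class-attribute idea could look roughly like the sketch below. It uses stand-in classes that only borrow the names from Dask; the `default_annotations` attribute, the `"operation"` and `"collection"` keys, and the merge-along-the-MRO behaviour are all assumptions made for illustration, not the existing Dask API.

```python
from collections.abc import Mapping


class Layer(Mapping):
    """Stand-in for the real Layer class, for illustration only."""

    # Hypothetical class-level defaults, true for every instance of the class.
    default_annotations = {}

    def __init__(self, tasks=None, annotations=None):
        self._tasks = dict(tasks or {})
        merged = {}
        # Walk the MRO from base to most-derived class so that a subclass
        # overrides the defaults declared by its parents.
        for klass in reversed(type(self).__mro__):
            merged.update(vars(klass).get("default_annotations", {}))
        # Annotations given at instantiation take precedence over defaults.
        merged.update(annotations or {})
        self.annotations = merged

    def __getitem__(self, key):
        return self._tasks[key]

    def __iter__(self):
        return iter(self._tasks)

    def __len__(self):
        return len(self._tasks)


class Blockwise(Layer):
    default_annotations = {"operation": "blockwise"}


class DataFrameLayer(Layer):
    default_annotations = {"collection": "dataframe"}


class DataFrameIOLayer(Blockwise, DataFrameLayer):
    default_annotations = {"operation": "io"}
```

With this arrangement, `DataFrameIOLayer().annotations` carries both the IO operation type and the dataframe collection without any per-instance setup, and a display routine only has to read the annotations rather than inspect the class hierarchy.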