[SoC2020] Automated Layout #1598

Closed
neoddish opened this issue May 21, 2020 · 15 comments

@neoddish
Member

neoddish commented May 21, 2020

Automated Layout

What is SoC2020: #1599

Description

AntV G6 is an open-source graph visualization engine. Layout is the most fundamental function in graph visualization and analysis.

As a graph visualization and analysis engine, G6 already offers many layout algorithms, each with a rich set of configuration options. For users, however, choosing a suitable layout algorithm along with suitable parameters and configuration is difficult. References:

Goal

Design and implement a mechanism or algorithm in G6 that recommends a reasonable built-in layout algorithm and configuration based on factors such as display purpose, analysis purpose, and data characteristics.

  1. Design and implement the recommendation mechanism or algorithm;
  2. Provide documentation describing the recommendation strategy;
  3. Design experiments to evaluate the feasibility and soundness of the solution.

Prerequisite Skills

  • JavaScript, Canvas, and other front-end fundamentals;
  • Basic theoretical knowledge of graph visualization.
@LovelyBuggies

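Hey everyone, I'm Nino, a student from Sun Yat-sen University. I'm very interested in G6's automated layout project and look forward to taking part in this year's ASoC. Since the application deadline is approaching (sorry that I hadn't noticed this issue earlier :/), could you give me some advice on which questions my proposal should address and show an understanding of?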

@Yanyan-Wang
Contributor

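Hey everyone, I'm Nino, a student from Sun Yat-sen University. I'm very interested in G6's automated layout project and look forward to taking part in this year's ASoC. Since the application deadline is approaching (sorry that I hadn't noticed this issue earlier :/), could you give me some advice on which questions my proposal should address and show an understanding of?

We look forward to your participation. You could consider, but are not limited to, the following points:

  1. The factors considered when designing the strategy or algorithm, such as display purpose, analysis purpose, data characteristics, and so on;
  2. Any algorithms, papers, or other evidence you draw on;
  3. The feasibility and soundness of the approach, shown through quantitative measures, case studies, or any other form.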

@LovelyBuggies

LovelyBuggies commented May 27, 2020

According to Wikipedia: https://en.wikipedia.org/wiki/Automatic_layout

Automatic layout generation seems to be a different concept: adjusting edges and nodes to make a drawing look elegant, rather than recommending layouts according to users' demands and purposes. I'm not sure whether our goal is to make a better layout or to make a better recommendation.

If I understand correctly, I think it would be better to use "Recommend Layout for Users" or "Decide Which Layout and Style (== configuration) to Use" or "Choose the Right Layout to Use" as the title for this issue (and ASoC project).

A layout algorithm addresses one or more quality criteria, depending on the type of graph and the features of the algorithm, when laying out a graph. The most common criteria are:

  • Minimizing the number of link crossings;
  • Minimizing the total area of the drawing;
  • Minimizing the number of bends (in orthogonal drawings);
  • Maximizing the smallest angle formed by consecutive incident links;
  • Maximizing the display of symmetries.

Source: https://docs.roguewave.com/en/jviews/current/index.html#page/userman%2Fdiagrammer%2FChapter06ProgrammersDocumentation.22.076.html%23 and yWorks.
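
To make two of the criteria above concrete, here is a minimal sketch that measures link crossings and drawing area for a set of node positions. The types and function names (`Vec2`, `EdgeRef`, `countCrossings`, `boundingArea`) are hypothetical and not part of G6; links are assumed to be straight segments between node centers.

```typescript
// Hypothetical types for this sketch; they are not G6's data model.
interface Vec2 { x: number; y: number; }
interface EdgeRef { source: string; target: string; }
type Layout2D = Record<string, Vec2>;

// Orientation of (a, b, c): positive, negative, or zero when collinear.
function orient(a: Vec2, b: Vec2, c: Vec2): number {
  return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// True when segment ab and segment cd cross at a point interior to both.
function properlyCross(a: Vec2, b: Vec2, c: Vec2, d: Vec2): boolean {
  return orient(a, b, c) * orient(a, b, d) < 0 &&
    orient(c, d, a) * orient(c, d, b) < 0;
}

// Number of link pairs that cross. Links sharing an endpoint are skipped.
function countCrossings(pos: Layout2D, links: EdgeRef[]): number {
  let count = 0;
  for (let i = 0; i < links.length; i++) {
    for (let j = i + 1; j < links.length; j++) {
      const p = links[i];
      const q = links[j];
      const shareNode = p.source === q.source || p.source === q.target ||
        p.target === q.source || p.target === q.target;
      if (shareNode) continue;
      if (properlyCross(pos[p.source], pos[p.target], pos[q.source], pos[q.target])) {
        count++;
      }
    }
  }
  return count;
}

// Area of the axis-aligned bounding box around all node centers.
function boundingArea(pos: Layout2D): number {
  const ids = Object.keys(pos);
  if (ids.length === 0) return 0;
  let minX = Infinity, maxX = -Infinity, minY = Infinity, maxY = -Infinity;
  for (let i = 0; i < ids.length; i++) {
    const p = pos[ids[i]];
    minX = Math.min(minX, p.x);
    maxX = Math.max(maxX, p.x);
    minY = Math.min(minY, p.y);
    maxY = Math.max(maxY, p.y);
  }
  return (maxX - minX) * (maxY - minY);
}

// Usage: four nodes on a unit square.
const square: Layout2D = {
  a: { x: 0, y: 0 }, b: { x: 1, y: 0 }, c: { x: 1, y: 1 }, d: { x: 0, y: 1 },
};
const diagonals: EdgeRef[] = [{ source: "a", target: "c" }, { source: "b", target: "d" }];
const sides: EdgeRef[] = [{ source: "a", target: "b" }, { source: "c", target: "d" }];
const diagonalCrossings = countCrossings(square, diagonals);
const sideCrossings = countCrossings(square, sides);
const squareArea = boundingArea(square);
```

The pairwise check is quadratic in the number of links, which is probably fine for scoring small graphs but would need a sweep-line approach for large ones.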

@LovelyBuggies

LovelyBuggies commented May 27, 2020

If this is the case (our goal is to make a good recommendation strategy), then for layout I think the only deciding factor is whether to express the relationships between nodes (if yes, use normal layouts; otherwise, use tree layouts). For configuration or style, users should be free to express their preferences and choose the style they like, rather than obsess over purposes.

For example, what is the difference between using CompactBox and Dendrogram when using the tree layout? And what factors (purposes, data characteristics ...) lead to this difference?

@Yanyan-Wang
Contributor

If this is the case (our goal is to make a good recommendation strategy), then for layout I think the only deciding factor is whether to express the relationships between nodes (if yes, use normal layouts; otherwise, use tree layouts). For configuration or style, users should be free to express their preferences and choose the style they like, rather than obsess over purposes.

For example, what is the difference between using CompactBox and Dendrogram when using the tree layout? And what factors (purposes, data characteristics ...) lead to this difference?

Yes, the goal is to design a good recommendation strategy. To quote the issue description:

Design and implement a mechanism or algorithm to recommend a reasonable built-in layout algorithm and configuration based on various factors such as display purpose, analysis purpose, and data characteristics.

First of all, the data structures for general graphs and tree graphs are different. The choice between them is not decided by whether relationships between nodes need to be expressed.

Second, general graphs and tree graphs each have many layout algorithms of their own. The analysis purpose might be a factor in finding a proper choice. E.g., if the user wants to explore the graph around a focus node and its related nodes at different shortest-path lengths, the radial layout will be more suitable than the force layout.

Of course users have the right to choose the configurations they prefer, but a reasonable recommendation will be a good start.
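
As a rough illustration of the two points above, the sketch below first checks whether flat node/edge data happens to be tree-shaped and then uses the analysis purpose to pick between `radial` and `force`. The function names, the `Purpose` values, and the choice of `compactBox` as the tree default are assumptions made for this example only; node ids are assumed to be unique.

```typescript
// Hypothetical input shape for this sketch: flat nodes and edges.
interface FlatGraph {
  nodes: { id: string }[];
  edges: { source: string; target: string }[];
}

// A graph with n nodes is a tree when it has n - 1 edges and no cycle.
// Cycles are detected with a small union-find structure.
function isTree(data: FlatGraph): boolean {
  const n = data.nodes.length;
  if (n === 0 || data.edges.length !== n - 1) return false;
  const parent: Record<string, string> = {};
  for (let i = 0; i < n; i++) parent[data.nodes[i].id] = data.nodes[i].id;
  const known = (id: string): boolean =>
    Object.prototype.hasOwnProperty.call(parent, id);
  const find = (x: string): string => {
    while (parent[x] !== x) {
      parent[x] = parent[parent[x]]; // path halving
      x = parent[x];
    }
    return x;
  };
  for (let i = 0; i < data.edges.length; i++) {
    const e = data.edges[i];
    if (!known(e.source) || !known(e.target)) return false;
    const a = find(e.source);
    const b = find(e.target);
    if (a === b) return false; // this edge closes a cycle (or is a self-loop)
    parent[a] = b;
  }
  return true; // n - 1 edges without a cycle implies the graph is connected
}

type Purpose = "focus-exploration" | "overview";
interface Recommendation { graphKind: "tree" | "general"; layout: string; }

// Assumed policy: tree-shaped data gets a tree layout; otherwise the
// analysis purpose decides between radial and force.
function recommendLayout(data: FlatGraph, purpose: Purpose): Recommendation {
  if (isTree(data)) return { graphKind: "tree", layout: "compactBox" };
  const layout = purpose === "focus-exploration" ? "radial" : "force";
  return { graphKind: "general", layout };
}

// Usage
const star: FlatGraph = {
  nodes: [{ id: "a" }, { id: "b" }, { id: "c" }],
  edges: [{ source: "a", target: "b" }, { source: "a", target: "c" }],
};
const triangle: FlatGraph = {
  nodes: [{ id: "a" }, { id: "b" }, { id: "c" }],
  edges: [
    { source: "a", target: "b" },
    { source: "b", target: "c" },
    { source: "c", target: "a" },
  ],
};
const starChoice = recommendLayout(star, "overview");
const triangleChoice = recommendLayout(triangle, "focus-exploration");
```

A real recommender would weigh many more signals than this; the sketch only shows that data structure and analysis purpose can be handled as separate inputs.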

@LovelyBuggies

LovelyBuggies commented May 27, 2020

@Yanyan-Wang Thanks. In fact, I think user preference is the only determinant when selecting among certain layout algorithms (i.e., the layouts for tree graphs). I see your example:

If the user wants to explore the graph with a focus node and its related nodes with different shortest path lengths, the radial layout will be more suitable than force layout, etc.

I totally agree, especially for general graphs (G6 classifies them all as general, though some tools put these graphs into different categories; e.g., grid and circular graphs are separate categories in yWorks). But when it comes to tree graphs, things seem to be different. As I asked above:

For example, what is the difference between using CompactBox and Dendrogram when using the tree graph?

I don't think selecting an appropriate layout algorithm matters much for tree graphs, as the differences between those layouts are small: just the direction, the level positions, and so on. Furthermore, once users have decided to use a tree graph, I think it's intuitive for them to select a suitable tree layout.

For example, when I am sure I want a tree, I would directly select the MindMap layout if I want my edges to extend in different directions. And from a functional point of view, there doesn't seem to be much difference between CompactBox and Dendrogram. Could you please show me an example of how purpose or data characteristics can affect G6's decision between CompactBox and Dendrogram?

@neoddish
Member Author

neoddish commented May 27, 2020

The discussions above are great. Some of my ideas - for reference only:

  1. Of course, it's easy to let the user switch between different layouts. But if we could have a function like autoGraph(container, data) that automatically draws a graph on the container for the given data, there must be a 'default' layout algorithm at the top of the list. Shall we just choose a constant one? Pick one randomly? Or do we have a better idea?

  2. It's interesting to compare CompactBox and Dendrogram. In terms of data, CompactBox is for top-down logic while Dendrogram is for bottom-up. In other words, the leaves of a dendrogram should be more similar or comparable. In fact, dendrograms are often used in clustering problems. Is it possible to raise its priority when the leaves of the dataset exhibit some quantifiable clustering properties? (I can't go further... it's just an idea.)

  3. I believe this issue is about 'how to choose a suitable layout algorithm, suitable parameters and configuration'. And I do think recommending configs for a specific layout has wider application scenarios than recommending layouts. Imagine that once I have decided to use some layout, I have to calculate the (x, y) of each node myself. I even have to consider the different lengths of the text in the nodes, and maybe the intersections of links. This is too much trouble.

If we can solve any problem similar to the above, this will be a great work.
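
One way the idea in point 2 could be quantified, purely as a sketch: if leaves carry a numeric `value`, compare the variance of leaf values within each parent to the variance across all leaves. The `value` field, the score definition (1 minus within-group over total sum of squares), and the 0.5 threshold are all assumptions made for illustration, not anything G6 provides.

```typescript
// Hypothetical nested tree data; `value` is an assumed numeric leaf attribute.
interface TreeItem { id: string; value?: number; children?: TreeItem[]; }

// Collects leaf values grouped by their direct parent.
function leafGroups(root: TreeItem): number[][] {
  const groups: number[][] = [];
  const visit = (node: TreeItem): void => {
    const children = node.children || [];
    const values: number[] = [];
    for (let i = 0; i < children.length; i++) {
      const child = children[i];
      const isLeaf = !child.children || child.children.length === 0;
      if (!isLeaf) {
        visit(child);
      } else if (typeof child.value === "number") {
        values.push(child.value);
      }
    }
    if (values.length > 0) groups.push(values);
  };
  visit(root);
  return groups;
}

function mean(xs: number[]): number {
  let sum = 0;
  for (let i = 0; i < xs.length; i++) sum += xs[i];
  return sum / xs.length;
}

function sumOfSquares(xs: number[], center: number): number {
  let sum = 0;
  for (let i = 0; i < xs.length; i++) sum += (xs[i] - center) * (xs[i] - center);
  return sum;
}

// 1 - (within-group sum of squares / total sum of squares).
// Close to 1 when sibling leaves are much more alike than leaves overall.
function clusteringScore(root: TreeItem): number {
  const groups = leafGroups(root);
  const all: number[] = [];
  groups.forEach((g) => g.forEach((v) => all.push(v)));
  if (all.length < 2) return 0;
  const total = sumOfSquares(all, mean(all));
  if (total === 0) return 0;
  let within = 0;
  groups.forEach((g) => { within += sumOfSquares(g, mean(g)); });
  return 1 - within / total;
}

// Assumed policy: raise Dendrogram's priority above an arbitrary threshold.
function preferTreeLayout(root: TreeItem, threshold: number = 0.5): "dendrogram" | "compactBox" {
  return clusteringScore(root) >= threshold ? "dendrogram" : "compactBox";
}

// Usage: sibling leaves are similar in `clustered`, dissimilar in `mixed`.
const clustered: TreeItem = { id: "root", children: [
  { id: "A", children: [{ id: "a1", value: 1 }, { id: "a2", value: 2 }] },
  { id: "B", children: [{ id: "b1", value: 10 }, { id: "b2", value: 11 }] },
] };
const mixed: TreeItem = { id: "root", children: [
  { id: "A", children: [{ id: "a1", value: 1 }, { id: "a2", value: 11 }] },
  { id: "B", children: [{ id: "b1", value: 2 }, { id: "b2", value: 10 }] },
] };
const clusteredChoice = preferTreeLayout(clustered);
const mixedChoice = preferTreeLayout(mixed);
```

Real datasets would rarely have a single numeric leaf attribute, so a practical version would need a more general similarity measure.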

@LovelyBuggies

LovelyBuggies commented May 29, 2020

@jiazhewang CC: @Yanyan-Wang @baizn

Hey, all. I have drafted a proposal; any suggestions? The basic ideas of my proposal are:

  1. Select Layout: Calculate weighted scores for the layouts according to data characteristics, display purposes, and analysis purposes.
  2. Select Style: First select a style, e.g., an ordered style rather than an unordered one; then work out the configuration, e.g., where to place the nodes.
  3. Evaluation: The evaluation is based on user feedback and "automatic layout" criteria; we will store the error rate and error message for each layout.

If I am heading in the right direction, I think I can go deeper into the codebase.

Please ignore the reference order; I will correct it at the end :)
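
The weighted scoring in step 1 could look roughly like the sketch below. The factor names follow the three factors listed in the issue; the per-layout scores and the weights are made-up placeholder numbers, and the function names are hypothetical rather than taken from the proposal or from G6.

```typescript
// The three factors named in the issue; everything else here is assumed.
type Factor = "dataCharacteristics" | "displayPurpose" | "analysisPurpose";
type FactorScores = Record<Factor, number>;

interface LayoutCandidate { layout: string; scores: FactorScores; }

const FACTORS: Factor[] = ["dataCharacteristics", "displayPurpose", "analysisPurpose"];

// Weighted average of per-factor scores; returns 0 when all weights are 0.
function weightedScore(scores: FactorScores, weights: FactorScores): number {
  let total = 0;
  let weightSum = 0;
  for (let i = 0; i < FACTORS.length; i++) {
    const f = FACTORS[i];
    total += weights[f] * scores[f];
    weightSum += weights[f];
  }
  return weightSum === 0 ? 0 : total / weightSum;
}

// Returns layouts sorted from highest to lowest weighted score.
function rankLayouts(
  candidates: LayoutCandidate[],
  weights: FactorScores
): { layout: string; score: number }[] {
  return candidates
    .map((c) => ({ layout: c.layout, score: weightedScore(c.scores, weights) }))
    .sort((a, b) => b.score - a.score);
}

// Usage with placeholder numbers (scores in [0, 1], higher means a better fit).
const weights: FactorScores = {
  dataCharacteristics: 0.5, displayPurpose: 0.2, analysisPurpose: 0.3,
};
const candidates: LayoutCandidate[] = [
  { layout: "force", scores: { dataCharacteristics: 0.8, displayPurpose: 0.6, analysisPurpose: 0.4 } },
  { layout: "radial", scores: { dataCharacteristics: 0.6, displayPurpose: 0.7, analysisPurpose: 0.9 } },
  { layout: "circular", scores: { dataCharacteristics: 0.5, displayPurpose: 0.8, analysisPurpose: 0.3 } },
];
const ranked = rankLayouts(candidates, weights);
```

How the per-factor scores are actually derived from data and user purposes is the hard part and is left open here.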

@LovelyBuggies

LovelyBuggies commented May 29, 2020

Reply: @jiazhewang

The discussions above are great. Some of my ideas - for reference only:

  1. Of course, it's easy to let the user switch between different layouts. But if we could have a function like autoGraph(container, data) that automatically draws a graph on the container for the given data, there must be a 'default' layout algorithm at the top of the list. Shall we just choose a constant one? Pick one randomly? Or do we have a better idea?
  2. It's interesting to compare CompactBox and Dendrogram. In terms of data, CompactBox is for top-down logic while Dendrogram is for bottom-up. In other words, the leaves of a dendrogram should be more similar or comparable. In fact, dendrograms are often used in clustering problems. Is it possible to raise its priority when the leaves of the dataset exhibit some quantifiable clustering properties? (I can't go further... it's just an idea.)
  3. I believe this issue is about 'how to choose a suitable layout algorithm, suitable parameters and configuration'. And I do think recommending configs for a specific layout has wider application scenarios than recommending layouts. Imagine that once I have decided to use some layout, I have to calculate the (x, y) of each node myself. I even have to consider the different lengths of the text in the nodes, and maybe the intersections of links. This is too much trouble.

If we can solve any problem similar to the above, this will be a great work.

  1. You can use a constant or a random layout, but in my design, G6 would give a recommendation as the default once the user has provided the data path and their purposes.
  2. CompactBox and Dendrogram can both be laid out top to bottom (see https://g6.antv.vision/en/examples/tree/compactBox#tbCompactBox and https://g6.antv.vision/en/examples/tree/dendrogram#tbDendrogram), so there is not much difference between them in my design.
  3. Those are exactly what my Style Selection will use: randomly initialize some candidates, score them with the "what is a good layout" criteria, and finally select the best one.
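
The "randomly initialize, score, select the best" loop in point 3 might be sketched as follows. The cost function is pluggable; the one used here (deviation of link lengths from a target length) is only a placeholder criterion chosen for brevity, and all names are hypothetical. A small seeded generator is used so that runs are reproducible.

```typescript
// Hypothetical types for this sketch; they are not G6's data model.
interface Vec2 { x: number; y: number; }
interface EdgeRef { source: string; target: string; }
type Layout2D = Record<string, Vec2>;
type CostFn = (pos: Layout2D, links: EdgeRef[]) => number;

// Small linear congruential generator so that runs are reproducible.
function makeRandom(seed: number): () => number {
  let state = Math.floor(Math.abs(seed)) % 4294967296;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Places every node uniformly at random inside a size x size square.
function randomPositions(ids: string[], size: number, rand: () => number): Layout2D {
  const pos: Layout2D = {};
  for (let i = 0; i < ids.length; i++) {
    pos[ids[i]] = { x: rand() * size, y: rand() * size };
  }
  return pos;
}

// Placeholder criterion: squared deviation of each link length from `target`.
function lengthDeviation(target: number): CostFn {
  return (pos, links) => {
    let total = 0;
    for (let i = 0; i < links.length; i++) {
      const s = pos[links[i].source];
      const t = pos[links[i].target];
      const len = Math.sqrt((s.x - t.x) * (s.x - t.x) + (s.y - t.y) * (s.y - t.y));
      total += (len - target) * (len - target);
    }
    return total;
  };
}

// Generates `trials` random candidates and keeps the one with the lowest cost.
function pickBest(
  ids: string[], links: EdgeRef[], cost: CostFn,
  trials: number, size: number, rand: () => number
): { positions: Layout2D; cost: number } {
  if (trials < 1) throw new Error("trials must be at least 1");
  let best = randomPositions(ids, size, rand);
  let bestCost = cost(best, links);
  for (let t = 1; t < trials; t++) {
    const candidate = randomPositions(ids, size, rand);
    const c = cost(candidate, links);
    if (c < bestCost) {
      best = candidate;
      bestCost = c;
    }
  }
  return { positions: best, cost: bestCost };
}

// Usage: the same seed gives the same first candidate, so more trials
// can only match or improve on the single-trial result.
const ids = ["a", "b", "c"];
const links: EdgeRef[] = [{ source: "a", target: "b" }, { source: "b", target: "c" }];
const single = pickBest(ids, links, lengthDeviation(100), 1, 500, makeRandom(42));
const many = pickBest(ids, links, lengthDeviation(100), 50, 500, makeRandom(42));
```

Pure random sampling scales poorly with graph size; it is shown only because it matches the description above most directly.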

@Yanyan-Wang
Contributor

@jiazhewang CC: @Yanyan-Wang @baizn

Hey, all. I have drafted a proposal; any suggestions? The basic ideas of my proposal are:

  1. Select Layout: Calculate weighted scores for the layouts according to data characteristics, display purposes, and analysis purposes.
  2. Select Style: First select a style, e.g., an ordered style rather than an unordered one; then work out the configuration, e.g., where to place the nodes.
  3. Evaluation: The evaluation is based on user feedback and "automatic layout" criteria; we will store the error rate and error message for each layout.

If I am heading in the right direction, I think I can go deeper into the codebase.

Please ignore the reference order; I will correct it at the end :)

Such a good job! Please go ahead as you wish.

Some small tips:

2.7.2 Sub-styled Layouts
G6 allows users to use subgraph layout by force-directed layout currently like this.

Actually, G6 supports any built-in layout for subgraphs.

2.7.3 More Layouts

It is a good extension for this topic. You are not limited to the layouts in G6; any potential plans are welcome for discussion. We also welcome PRs from the community for more useful layout algorithms.

@neoddish
Member Author

neoddish commented Jun 1, 2020

@jiazhewang CC: @Yanyan-Wang @baizn

Hey, all. I have drafted a proposal; any suggestions? The basic ideas of my proposal are:

  1. Select Layout: Calculate weighted scores for the layouts according to data characteristics, display purposes, and analysis purposes.
  2. Select Style: First select a style, e.g., an ordered style rather than an unordered one; then work out the configuration, e.g., where to place the nodes.
  3. Evaluation: The evaluation is based on user feedback and "automatic layout" criteria; we will store the error rate and error message for each layout.

If I am heading in the right direction, I think I can go deeper into the codebase.

Please ignore the reference order; I will correct it at the end :)

Impressive! This is a meticulous proposal. In fact, I think it sets a good criterion for other proposers.

@LovelyBuggies

Impressive! This is a meticulous proposal. In fact, I think it sets a good criterion for other proposers.

To be continued. But if you could have a look at your convenience and offer me some suggestions, I would appreciate it! (I am not very confident about my plan for Style Selection.)

@neoddish
Member Author

neoddish commented Jun 2, 2020

Impressive! This is a meticulous proposal. In fact, I think it sets a good criterion for other proposers.

To be continued. But if you could have a look at your convenience and offer me some suggestions, I would appreciate it! (I am not very confident about my plan for Style Selection.)

The Style Selection section is good enough for a proposal. The name 'zen' is interesting. In effect, you are designing some empirical constraint rules. I think a few points should be considered: Must all of these constraints be satisfied? Does this problem generalize to a Constraint Satisfaction Problem or a Constraint Optimization Problem? (The solver could be heavy, so maybe we can find a simpler model.) How do we choose when these rules conflict with each other? What is the relationship between these rules and our objectives (aka the "what is a good layout" criteria)? If these questions are thought through more carefully and described more clearly, your proposal will be even better.
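
One simpler model than a full constraint solver, offered only as a sketch: treat each rule as a soft constraint with a weight, and pick the style option whose satisfied rules have the largest total weight. When two rules conflict, the heavier one wins automatically. The `StyleOption` fields, the rule texts, and the weights below are invented for illustration and do not come from the proposal.

```typescript
// Hypothetical style description; the fields are invented for this example.
interface StyleOption { name: string; ordered: boolean; compact: boolean; }

// A soft constraint: a predicate plus a weight expressing its importance.
interface SoftRule {
  description: string;
  weight: number;
  satisfied: (style: StyleOption) => boolean;
}

// Total weight of the rules that a style satisfies.
function satisfiedWeight(style: StyleOption, rules: SoftRule[]): number {
  let total = 0;
  for (let i = 0; i < rules.length; i++) {
    if (rules[i].satisfied(style)) total += rules[i].weight;
  }
  return total;
}

// Picks the option with the highest satisfied weight (first one wins ties).
function chooseStyle(options: StyleOption[], rules: SoftRule[]): StyleOption {
  if (options.length === 0) throw new Error("at least one style option is required");
  let best = options[0];
  let bestWeight = satisfiedWeight(best, rules);
  for (let i = 1; i < options.length; i++) {
    const w = satisfiedWeight(options[i], rules);
    if (w > bestWeight) {
      best = options[i];
      bestWeight = w;
    }
  }
  return best;
}

// Lists the rules a chosen style had to give up.
function violatedRules(style: StyleOption, rules: SoftRule[]): string[] {
  return rules.filter((r) => !r.satisfied(style)).map((r) => r.description);
}

// Usage: the two rules conflict across the available options.
const rules: SoftRule[] = [
  { description: "Ordered is better than unordered", weight: 3, satisfied: (s) => s.ordered },
  { description: "Compact is better than sparse", weight: 1, satisfied: (s) => s.compact },
];
const options: StyleOption[] = [
  { name: "sparse-ordered", ordered: true, compact: false },
  { name: "compact-unordered", ordered: false, compact: true },
];
const chosen = chooseStyle(options, rules);
const givenUp = violatedRules(chosen, rules);
```

This enumerates options exhaustively, so it only works when the set of candidate styles is small.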

@LovelyBuggies

LovelyBuggies commented Jun 3, 2020

@jiazhewang Reply:

Must all of these constraints be satisfied? Does this problem generalize to a Constraint Satisfaction Problem or a Constraint Optimization Problem?

As I mentioned in my CV, I'm a Python developer, so I used the word "zen" :) Pretty intuitive, huh? Pythonic code is expected to follow the Zen, but doesn't have to. The same goes for us: for style selection, this is a CSP.

(The solver could be heavy, so maybe we can find a simpler model.) How do we choose when these rules conflict with each other?

Yeah, it is heavy to some extent, but we can keep the implementation simple (i.e., have the code capture the ideas of the zen rather than lengthy conditional logic; it shouldn't be extremely complex, I think). And the rules should never conflict with each other, because I considered this when designing them. If they really do conflict, we can just drop the relatively unimportant one :)

What is the relationship between these rules and our objectives (aka the "what is a good layout" criteria)?

I think I need to clarify this point. Style Selection is conducted after Layout Selection and has two parts: a) selecting a suitable style and b) properly arranging the items.

  • a) should follow the "zen". For example, for the radial layout, an ordered style is better than a chaotic one; say, "Restricted is better than unrestricted".
  • b) should follow rules such as minimal link crossings, minimal area, etc. It only makes sense to use these rules to choose the best configuration after you have determined the style to use, e.g., a restricted radial layout.

Thank you so much for your careful reading. I think these are things I need to clarify further. After all, talk is cheap; I'll dig into the codebase over the next few weeks to check the practicality of my plan before the application deadline. So I probably won't change many of the ideas in the proposal. If you have other thoughts, feel free to comment ⬇️ or tell me.

@neoddish
Member Author

neoddish commented Jul 9, 2020

The proposal by @LovelyBuggies has been chosen for this project. Congratulations!

BTW, anyone can still take part in building this project or make suggestions. In addition to being an ASoC project, this task is still a help-wanted issue. Thanks.
