
[Py] Improve embeddings #12

Merged
merged 18 commits into develop on Mar 30, 2022
Conversation

Splines
Member

@Splines Splines commented Feb 22, 2022

No description provided.

We want to allow for more chain scenarios, e.g. adding a random chain where both the source and the target node are themselves in chains. This led to overcomplicated code that has to deal with chains placed on edges instead of nodes. Thus, this commit is a working example of better chain placement (although not perfect) that definitely needs to be redone in the next commits (e.g. by using the already-existing mapping from H to G instead of essentially redundant chains on edges).
This only affects the embedding and solver module.
Previously, we encoded supernodes using edge costs: an edge with cost 0 was not part of a chain, while an edge with cost 1 indicated that its two nodes mapped to the same node_H and thus belong to the same supernode.

However, this was essentially redundant, as we also stored the mapping between input graph H and hardware graph G in the mapping class. We now leverage this class and encode supernodes directly, thus getting rid of edge chains (remnants are still there; we will clean them up later). It might be worth keeping the term "chain" to refer to supernodes that consist of more than one node.
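A minimal sketch of what this node-based supernode encoding could look like. The class and method names below are hypothetical and only illustrate the idea of storing the H-to-G mapping directly instead of edge costs; they are not the actual API of this repository's mapping class:

```python
class Mapping:
    """Maps each node of the input graph H to its supernode in G (sketch)."""

    def __init__(self):
        self._h_to_g = {}  # node_H -> set of nodes_G forming the supernode
        self._g_to_h = {}  # node_G -> node_H (reverse lookup)

    def extend_supernode(self, node_H, node_G):
        """Add node_G to the supernode representing node_H."""
        self._h_to_g.setdefault(node_H, set()).add(node_G)
        self._g_to_h[node_G] = node_H

    def supernode(self, node_H):
        # Return a copy so callers cannot mutate internal state
        return set(self._h_to_g.get(node_H, set()))

    def is_chain(self, node_H):
        # A "chain" is a supernode consisting of more than one node
        return len(self._h_to_g.get(node_H, set())) > 1
```

With this encoding, whether two G-nodes belong to the same supernode is answered by the mapping itself, so no edge costs are needed.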

This change makes the embedding itself worse, e.g. K4 does not work anymore. This is because we now also allow merging random nodes into supernodes even when both source and target are already in chains. We still need to implement the evolutionary strategy where we select the mutation that yields the best result, in our case the one where the most edges can be embedded (to achieve a "real" local maximum).

Also changed:
- Changed lists to sets in various places
- Adapted drawing class to the new changes (to properly draw supernodes)
- Made logging output more precise
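The evolutionary selection step described above can be sketched as a steepest-ascent hill-climbing routine: apply every candidate mutation, score each child, and keep the best one. All names here are illustrative, not the repository's actual evolution API:

```python
def best_mutation_child(parent, mutations, fitness):
    """Try every candidate mutation on `parent` and keep the child with the
    best fitness (for us: the number of embedded edges). This is a
    steepest-ascent step instead of accepting the first random mutation."""
    best_child, best_fitness = None, float("-inf")
    for mutate in mutations:
        child = mutate(parent)  # each mutation yields a new embedding
        f = fitness(child)
        if f > best_fitness:
            best_child, best_fitness = child, f
    return best_child
```

Repeating this step until no mutation improves the fitness yields a "real" local maximum in the sense described above.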
@Splines Splines added the enhancement New feature or request label Feb 22, 2022
@Splines Splines self-assigned this Feb 22, 2022
Split the method into multiple pieces for better readability and fixed several bugs that occurred due to the wrong order of execution. It is still quirky that we can't embed edges freely, since all nodes on that edge must have a supernode (in the current implementation). That's why there are some special constructs, e.g. we need to discard nodes from sets to make sure we don't change the results after having embedded them in supernodes. There is room for improvement here.

Fixed a really annoying bug: sets in Python are returned by reference, so we got weird results because we called discard on them, assuming we were only altering a local variable. Now we copy the set.
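A self-contained illustration of this aliasing bug (the class and attribute names are made up for the example and do not appear in the repository):

```python
class EmbeddingState:
    """Hypothetical stand-in for a class that hands out sets."""

    def __init__(self):
        self._free_neighbors = {1, 2, 3}

    def free_neighbors_buggy(self):
        return self._free_neighbors       # caller receives the internal set

    def free_neighbors_fixed(self):
        return set(self._free_neighbors)  # defensive copy


state = EmbeddingState()
neighbors = state.free_neighbors_buggy()
neighbors.discard(2)                      # silently mutates internal state!
assert state._free_neighbors == {1, 3}

state = EmbeddingState()
neighbors = state.free_neighbors_fixed()
neighbors.discard(2)                      # only the local copy changes
assert state._free_neighbors == {1, 2, 3}
```

Returning `set(...)` (or `s.copy()`) at the boundary keeps callers from mutating shared state by accident.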

Fixed another potential bug where we accidentally added self-loops to the embedding view graph.

Also improved error messages.
Note that at this point we still sometimes use the word "chain"; however, chains are no longer encoded on edges but on the nodes themselves.

TODO: Unify usage of words "chain" and "supernode" throughout the code.
Also slightly shifted labels of Chimera graph for better readability.
After each generation we also try to remove unnecessary edges.
This way, we don't remove nodes whose removal would disconnect the respective subgraph of a supernode.

See https://www.geeksforgeeks.org/articulation-points-or-cut-vertices-in-a-graph/
and https://youtu.be/jFZsDDB0-vo as examples
Previously, we removed all nodes that were not articulation points for every supernode in one go. However, as we remove nodes, the articulation points may change, so we need to recalculate them each time a node is removed.
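The removal loop above could be sketched as follows. This uses a brute-force cut-vertex check instead of the linear-time algorithm from the linked references, and both the function names and the redundancy predicate are illustrative, not the repository's actual code:

```python
def is_cut_vertex(adj, v):
    """Brute-force articulation-point check: v is a cut vertex if removing
    it disconnects the remaining nodes (fine for small supernodes)."""
    nodes = [n for n in adj if n != v]
    if len(nodes) <= 1:
        return False
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:  # DFS over the graph with v removed
        for w in adj[stack.pop()]:
            if w != v and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) != len(nodes)


def prune_supernode(adj, is_redundant):
    """Remove redundant non-cut nodes one at a time, re-checking the cut
    vertices after every removal: deleting a node can turn a previously
    safe node into a cut vertex, so removing all of them in one go could
    disconnect the supernode."""
    adj = {n: set(ns) for n, ns in adj.items()}  # work on a copy
    while True:
        candidates = [n for n in adj
                      if is_redundant(n) and not is_cut_vertex(adj, n)]
        if not candidates:
            return set(adj)
        victim = candidates[0]
        for n in adj.pop(victim):  # remove one node, then recompute
            adj[n].discard(victim)
```

For example, on the path 1-2-3, node 2 is a cut vertex and survives even if the endpoints are marked redundant.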
We don't shift the target node in this strategy.
Also switched to a 5x5 Chimera grid
@Splines Splines marked this pull request as ready for review March 30, 2022 22:47
@Splines Splines merged commit 274c063 into develop Mar 30, 2022
keksklauer4 added a commit that referenced this pull request Apr 9, 2022
* Fix spelling mistake

* [Python] Init docs (#9)

TL;DR:
- Refactor graph python module
- Init Sphinx documentation: https://majorminer.readthedocs.io/

* [Py] Implement BFS and DFS for initialization

BFS: Breadth-first search
DFS: Depth-first search
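For illustration, a BFS traversal that could produce such an initialization order, visiting nodes level by level from a start node (a sketch, not the repository's actual implementation):

```python
from collections import deque


def bfs_order(adj, start):
    """Return the nodes of graph `adj` (dict: node -> neighbors) in
    breadth-first order from `start`; an order like this can seed the
    initial embedding node by node."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order
```

The DFS variant is the same loop with a stack (`pop()` instead of `popleft()`).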

* Refactor undirected Graph and add doc strings.

* Init Sphinx ReadTheDocs documentation

* Add documentation badge

* Add docstrings to graph

* Init Chimera graph docs

* Add ReadTheDocs configuration to fix doc errors

Use Python 3.9 instead of version 3.7 for building.

* Add doc strings to embedding graph

This partially breaks some chain logic, so we will need to adjust some functions one abstraction layer above.

* Fix get_embedded_nodes

* Add overview section to docs including diagrams

* Add more docs

* Add warning for outdated Readme

* [Py] Chimera lattice (#10)

* Extend from Chimera cell to whole Chimera lattice

- Also outsource test graphs
- Do evolution with multiple mutation steps

* Simplify connected test graph generation

* Try to embed endlessly until embedding found

Also save intermediate results as SVG

* Add more chain colors and SVG export option

* Init logging instead of prints

* Save intermediate SVG graphs

* Do not return multiple chains (instead use set)

* Add typing to util function

* Only use from_nodes which are not in a chain

* Tighten error handling

* Allow mutation for to_nodes that are in a chain

* Include max_total as bound for main_loop passes

* Export step-by-step SVG

For this, we outsourced the local maximum technique to not be included
inside a mutation. It now has to be called from outside (evolution.py).

* Change draw direction to horizontal

* [Py] Improve embeddings (#12)

* ⚠ Overcomplicate logic for random chains

We want to allow for more chain scenarios, e.g. adding a random chain where both the source and the target node are themselves in chains. This led to overcomplicated code that has to deal with chains placed on edges instead of nodes. Thus, this commit is a working example of better chain placement (although not perfect) that definitely needs to be redone in the next commits (e.g. by using the already-existing mapping from H to G instead of essentially redundant chains on edges).

* Rename from_node->source, to_node->target

This only affects the embedding and solver module.

* Shift from edge to node supernode encoding

Previously, we encoded supernodes using edge costs: an edge with cost 0 was not part of a chain, while an edge with cost 1 indicated that its two nodes mapped to the same node_H and thus belong to the same supernode.

However, this was essentially redundant, as we also stored the mapping between input graph H and hardware graph G in the mapping class. We now leverage this class and encode supernodes directly, thus getting rid of edge chains (remnants are still there; we will clean them up later). It might be worth keeping the term "chain" to refer to supernodes that consist of more than one node.

This change makes the embedding itself worse, e.g. K4 does not work anymore. This is because we now also allow merging random nodes into supernodes even when both source and target are already in chains. We still need to implement the evolutionary strategy where we select the mutation that yields the best result, in our case the one where the most edges can be embedded (to achieve a "real" local maximum).

Also changed:
- Changed lists to sets in various places
- Adapted drawing class to the new changes (to properly draw supernodes)
- Made logging output more precise

* Rewrite extend_random_supernode and fix bugs

Split the method into multiple pieces for better readability and fixed several bugs that occurred due to the wrong order of execution. It is still quirky that we can't embed edges freely, since all nodes on that edge must have a supernode (in the current implementation). That's why there are some special constructs, e.g. we need to discard nodes from sets to make sure we don't change the results after having embedded them in supernodes. There is room for improvement here.

Fixed a really annoying bug: sets in Python are returned by reference, so we got weird results because we called discard on them, assuming we were only altering a local variable. Now we copy the set.

Fixed another potential bug where we accidentally added self-loops to the embedding view graph.

Also improved error messages.

* Get rid of chain remnants

Note that at this point we still sometimes use the word "chain"; however, chains are no longer encoded on edges but on the nodes themselves.

TODO: Unify usage of words "chain" and "supernode" throughout the code.

* Explicitly use sets instead of lists

* Draw supernode colors on nodes (not only on edges)

Also slightly shifted labels of Chimera graph for better readability.

* Change color brightness instead of transparency

* Fix bug not operating on playground embedding

* Fix bug forgot removing target from prev supernode

* Init basic hill climbing (evolution)

After each generation we also try to remove unnecessary edges.

* Move supernode connectiveness check to embedding

* Remove redundant nodes probabilistically

* Implement articulation point algorithm (cut node)

This way, we don't remove nodes whose removal would disconnect the respective subgraph of a supernode.

See https://www.geeksforgeeks.org/articulation-points-or-cut-vertices-in-a-graph/
and https://youtu.be/jFZsDDB0-vo as examples

* Fix removal of redundant nodes

Previously, we removed all nodes that were not articulation points for every supernode in one go. However, as we remove nodes, the articulation points may change, so we need to recalculate them each time a node is removed.

* Add strategy to extend supernode without shifting

We don't shift the target node in this strategy.

* Fix articulation point (don't include removed nodes)

* Remove redundant nodes as last resort in the end

Also switched to 5x5 Chimera grid

At this point K6 embedding on a 3x3 cell is working. Tried K12 on a
5x5 Chimera grid, which is not yet yielding good valid embeddings.
TODO: Code cleanup, especially of the embedding solver, which has become cluttered.

* [Cpp] Refactoring and mutation class. (#11)

* [Cpp] Implemented visualizer for generic graphs.

* [Cpp] Started reworking parallelization concept for iterative local improvement.

* [Cpp] Adjusted extend operator.

* [Cpp] Continued with mutation manager.

* [Cpp] Fundamental rework of the structure

* [Cpp] Continued refactoring.

* [Cpp] Still refactoring.

* [Cpp] Refactoring...

* [Cpp] Refactored imports.

* [Cpp] Code running again - at least a bit.

* [Cpp] Continued fixing problems

* [Cpp] Refactoring.

* [Cpp] Refactored embedding state and added generic iteration methods.

* [Cpp] Some more loop replacements

* [Cpp] Fixing shifting operator.

* [Cpp] Implemented random gen and started working on shifting.

* [Cpp] Working on shifting.

* [Cpp] Fixed mutations.

* [Cpp] Started implementing annealing-based super vertex reducer.

* [Cpp] Continuing super vertex reducer.

* [Cpp] Fixed embedding invalidating bug.

* [Cpp] Working on extend.

* [Cpp] Implemented reducer as mutation.

* [Cpp] Started implementing evolutionary csc reducer.

* [Cpp] Continuing csc reducer.

* [Cpp] Added K15 for testing.

* [Cpp] Implemented evolutionary CSC reducer.

* [Cpp] Fixed some csc evo bugs.

* [Cpp] Added super vertex replacer.

* [Cpp] Fixed bugs.

* Update README.md

Co-authored-by: Splines <37160523+Splines@users.noreply.github.com>