-
Notifications
You must be signed in to change notification settings - Fork 0
/
graph-coloring.Rmd
214 lines (170 loc) · 6.95 KB
/
graph-coloring.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
---
title: "Graph Coloring"
author: "Forest Fang"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Graph Coloring}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
`graphcoloring` is a collection of graph coloring algorithms for coloring vertices of a graph such that no two adjacent vertices share the same color. The algorithms
are included via the embedded 'GraphColoring' C++ library, <https://github.com/brrcrites/GraphColoring>. The package provide two sets of functions, `color_*` and `graph_coloring_*`, which operate on a `tidygraph` and adjavency lists respectively. Both sets of functions covers all algorithms found in the C++ GraphColoring library.
## Algorithms
Here is the list of algorithms found in `graphcoloring` package:
```{r algorithms, echo=FALSE}
library(graphcoloring)
do.call(
htmltools::tags$ul,
unname(Map(htmltools::tags$li, ls("package:graphcoloring", pattern = "^graph_coloring_")))
)
```
## Coloring a `tidygraph`
[`tidygraph`](https://github.com/thomasp85/tidygraph) is a powerful abstraction for graph datasets. It envisions a graph as two tidy tables, nodes and edges, and provides ways to activate either set and apply `dplyr` verbs for manipulation.
`color_*` functions operate under `tidygraph` family and can be used to color nodes within `mutate` context similar to `group_*` functions in `tidygraph`.
They automatically use the graph that is being computed on, and otherwise passes on its arguments to the relevant coloring function. The return value is always
a integer vector of assigned color index so that neighboring nodes never share the same color.
```{r example, message=FALSE, fig.width=7}
library(graphcoloring)
library(tidygraph)
library(ggraph)
set.seed(42)
play_islands(5, 10, 0.8, 3) %>%
mutate(color = as.factor(color_dsatur())) %>%
ggraph(., layout = 'kk') +
geom_edge_link(aes(alpha = ..index..), show.legend = FALSE) +
geom_node_point(aes(color = color), size = 7) +
theme_graph()
```
## Working with Adjacency List
`graph_coloring_*` functions directly take adjacency lists and returns an integer vector of assigned labels.
Here is a 3-coloring of the famous [Petersen Graph](https://en.wikipedia.org/wiki/Petersen_graph):
```{r adj-example, message=FALSE, out.extra = 'style = "margin:0 auto"'}
library(graphcoloring)
library(igraph)
library(dplyr)
# create graph
petersen_graph <- graph.famous("Petersen")
# get adjacency list
petersen_edges <- as_adj_list(petersen_graph)
# color the graph with 3 colors
set.seed(10737312)
petersen_colors <- graph_coloring_tabucol(petersen_edges, 3)
# arrange vertices for layout
petersen_positions <- data_frame(
theta = (0:4) * 2 * pi / 5 + pi / 2,
r = 2
) %>%
bind_rows(
mutate(., r = r / 2)
) %>%
transmute(
x = r * cos(theta),
y = r * sin(theta)
)
petersen_graph %>%
as_tbl_graph() %>%
mutate(color = as.factor(petersen_colors)) %>%
ggraph(., layout = "manual", x = petersen_positions$x, y = petersen_positions$y) +
geom_edge_link() +
geom_node_point(aes(color = color), size = 7, show.legend = FALSE) +
theme_graph()
```
One common use case for graph coloring is to visualize geographical dataset to color contiguous groupings.
For example, this can be used with `sf::st_intersects()` to color a feature collection for visualization.
Here we look at [Bureau of Economic Analysis regions](https://en.wikipedia.org/wiki/List_of_regions_of_the_United_States#/media/File:BEA_regions.png) which group
US states into 8 regions:
```{r sf-example, message=FALSE}
library(graphcoloring)
library(USAboundaries)
library(sf)
library(ggplot2)
library(rvest)
# retrieve Bureau of Economic Analysis regions
bea_regions <- read_html("https://apps.bea.gov/regional/docs/regions.cfm") %>%
html_node(".table") %>%
html_table()
# 48 states
states_sf <- us_states() %>%
filter(!(name %in% c("Alaska", "District of Columbia", "Hawaii", "Puerto Rico"))) %>%
left_join(bea_regions, c("state_name" = "State or Region name"))
# color regions
set.seed(48)
region_colors <- states_sf %>%
group_by(`Region code`) %>%
summarise() %>% {
colors <- st_intersects(.) %>%
graph_coloring_dsatur() %>%
as.factor()
data_frame(
`Region code` = .$`Region code`,
color = colors
)
}
states_sf %>%
left_join(region_colors, "Region code") %>%
ggplot() +
geom_sf(aes(fill = color), show.legend = FALSE) +
theme_bw()
```
It might be better to choose an 8-color palette in this case but graph coloring can be particularly useful when the number of colors get exceedingly big.
## Other Applications
Graph coloring is commonly used in [Scheduling](https://en.wikipedia.org/wiki/Graph_coloring#Scheduling) and [Register Allocation](https://en.wikipedia.org/wiki/Graph_coloring#Register_allocation). It can also be used to solve [Sudoku](http://www.cs.kent.edu/~dragan/ST-Spring2016/SudokuGC.pdf) puzzles!
A Sudoku puzzle plays on a 9x9 grid where some entries are pre-filled with numbers from 1 to 9. The goal is the fill the entire grid with 1 to 9 such that:
1. Numbers in each row is not repeated
1. Numbers in each columns is not repeated
1. Numbers in each of 3x3 box/block/subgrid is not repeated
![](https://upload.wikimedia.org/wikipedia/commons/thumb/e/e0/Sudoku_Puzzle_by_L2G-20050714_standardized_layout.svg/361px-Sudoku_Puzzle_by_L2G-20050714_standardized_layout.svg.png){width=45%}
![](https://upload.wikimedia.org/wikipedia/commons/1/12/Sudoku_Puzzle_by_L2G-20050714_solution_standardized_layout.svg){width=45%}
A Sudoku puzzle can be converted into a graph by modeling the 9x9 cells into 81 vertices where a pair of vertices are connected if and only if they are on the same row, column, or 3x3 block. Each valid Sudoku solution is therefore a 9-coloring of the Sudoku graph.
```{r sudoku, echo=FALSE}
sudoku_graph <- function(n) {
sudoku_nodes <-
tidyr::crossing(
row = 1:(n * n),
col = 1:(n * n)
) %>%
mutate(
id = row_number(),
block = ceiling(row / n) * n + ceiling(col / n) - n
)
complete_subgraph <- function(nodes, var) {
var <- enquo(var)
nodes %>%
group_by(!!var) %>%
do(tidyr::crossing(from = .$id, to = .$id)) %>%
ungroup() %>%
select(from, to) %>%
filter(from > to)
}
sudoku_edges <-
bind_rows(
complete_subgraph(sudoku_nodes, row),
complete_subgraph(sudoku_nodes, col),
complete_subgraph(sudoku_nodes, block),
)
create_empty(0) %>%
bind_nodes(sudoku_nodes) %>%
bind_edges(sudoku_edges)
}
```
```{r, fig.show='hold'}
generate_sudoku <- function(n, seed) {
set.seed(seed)
sudoku_graph(n) %>%
mutate(color = as.factor(color_tabucol(4))) %>%
ggraph(., layout = "grid") +
geom_edge_link() +
geom_node_point(aes(color = color), size = 7, show.legend = TRUE) +
theme_graph() +
theme(legend.position = "bottom")
}
generate_sudoku(2, 432)
generate_sudoku(2, 45)
```