fortify.igraph messes with vertex/edge attribute types #54

Closed
briatte opened this issue Jul 30, 2019 · 0 comments

briatte commented Jul 30, 2019

See reprex at end of issue.

fortify.network does not have this problem, because it uses dumb for loops to import vertex/edge attributes (only the vertex attributes are shown below):

    # import vertex attributes
    for (y in network::list.vertex.attributes(model)) {
      nodes <- cbind(nodes, network::get.vertex.attribute(model, y), stringsAsFactors = stringsAsFactors)
      names(nodes)[ncol(nodes)] <- y
    }

This preserves attribute (column) types. By contrast, the (smarter) code in fortify.igraph uses sapply, which simplifies its result to a matrix. Since a matrix can hold only one type, all vertex/edge attributes are coerced to character as soon as a single attribute is of that class. Numeric attributes therefore end up as characters (and then as factors, unless stringsAsFactors has been set to FALSE: see #53):

    # import vertex attributes
    if (length(igraph::list.vertex.attributes(model))) {
      nodes <- cbind(
        nodes,
        sapply(
          igraph::list.vertex.attributes(model),
          FUN = igraph::get.vertex.attribute,
          graph = model,
          USE.NAMES = TRUE
        ),
        stringsAsFactors = stringsAsFactors
      )
    }
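
The coercion can be reproduced without igraph. The sketch below uses a made-up list, `attrs`, as a stand-in for what `igraph::get.vertex.attribute` would return for each attribute name (the attribute names and values are illustrative assumptions, not data from the package), and compares the sapply route with the one-column-at-a-time loop:

```r
# Hypothetical vertex attributes: one numeric, one character.
attrs <- list(
  staff   = c(10, 400, 200),
  sponsor = c("City", "County", "State")
)

# sapply() simplifies equal-length results to a matrix. A matrix holds a
# single type, so the numeric attribute is coerced to character.
m <- sapply(names(attrs), FUN = function(y) attrs[[y]], USE.NAMES = TRUE)
class(m[, "staff"])

# Binding one attribute at a time keeps each column's own type.
nodes <- data.frame(x = c(0.1, 0.5, 0.9))
for (y in names(attrs)) {
  nodes <- cbind(nodes, attrs[[y]], stringsAsFactors = FALSE)
  names(nodes)[ncol(nodes)] <- y
}
sapply(nodes, class)
```

Once the character matrix is passed to cbind with stringsAsFactors = TRUE, every attribute column becomes a factor, which matches the str() output in the reprex.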

Solution (1) would be to use purrr::map_dfc to get the sapply method to preserve column types, but that would end up being more complex than a for loop, and would come at the cost of additional dependencies (including dplyr, since purrr::map_dfc relies on it).

Solution (2), which is dumber but better in this context, is to use for loops in fortify.igraph, as in fortify.network.

Solution (3), the best one in my view, is to use igraph::as_data_frame(x, what = "vertices") and igraph::as_data_frame(x, what = "edges") to import vertex and edge attributes, which is what I'll do, unless @jcfisher has a better fix.
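
A minimal sketch of what solution (3) looks like on a toy graph, assuming igraph is installed. The graph, its attribute names (`staff`, `sponsor`, `frequency`) and their values are made up for illustration; this is not the eventual fortify.igraph code:

```r
library(igraph)

# Toy vertex and edge tables with mixed attribute types (hypothetical data).
v <- data.frame(
  name    = c("a", "b", "c"),
  staff   = c(10, 400, 200),
  sponsor = c("City", "County", "State"),
  stringsAsFactors = FALSE
)
e <- data.frame(
  from      = c("a", "b"),
  to        = c("b", "c"),
  frequency = c(1, 3),
  stringsAsFactors = FALSE
)
g <- igraph::graph_from_data_frame(e, directed = TRUE, vertices = v)

# Each attribute comes back as its own data frame column, so numeric
# attributes stay numeric and character attributes stay character.
v_df <- igraph::as_data_frame(g, what = "vertices")
e_df <- igraph::as_data_frame(g, what = "edges")
str(v_df)
str(e_df)
```

Because the attributes are never gathered into a single matrix, there is no shared type to coerce to, and no extra dependency is needed beyond igraph itself.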

library(ggnetwork)
#> Loading required package: ggplot2
library(network)
#> network: Classes for Relational Data
#> Version 1.15 created on 2019-04-01.
#> copyright (c) 2005, Carter T. Butts, University of California-Irvine
#>                     Mark S. Handcock, University of California -- Los Angeles
#>                     David R. Hunter, Penn State University
#>                     Martina Morris, University of Washington
#>                     Skye Bender-deMoll, University of Washington
#>  For citation information, type citation("network").
#>  Type help("network-package") to get started.
library(igraph)
#> 
#> Attaching package: 'igraph'
#> The following objects are masked from 'package:network':
#> 
#>     %c%, %s%, add.edges, add.vertices, delete.edges,
#>     delete.vertices, get.edge.attribute, get.edges,
#>     get.vertex.attribute, is.bipartite, is.directed,
#>     list.edge.attributes, list.vertex.attributes,
#>     set.edge.attribute, set.vertex.attribute
#> The following objects are masked from 'package:stats':
#> 
#>     decompose, spectrum
#> The following object is masked from 'package:base':
#> 
#>     union
library(intergraph)

# network with numeric and character vertex attributes
data(emon, package = "network")
emon[[1]]
#>  Network attributes:
#>   vertices = 14 
#>   directed = TRUE 
#>   hyper = FALSE 
#>   loops = FALSE 
#>   multiple = FALSE 
#>   total edges= 83 
#>     missing edges= 0 
#>     non-missing edges= 83 
#> 
#>  Vertex attribute names: 
#>     Command.Rank.Score Decision.Rank.Score Formalization Location Paid.Staff Sponsorship vertex.names Volunteer.Staff 
#> 
#>  Edge attribute names: 
#>     Frequency
intergraph::asIgraph(emon[[1]])
#> IGRAPH 031e6f4 D--- 14 83 -- 
#> + attr: Command.Rank.Score (v/n), Decision.Rank.Score (v/n),
#> | Formalization (v/n), Location (v/c), na (v/l), Paid.Staff (v/n),
#> | Sponsorship (v/c), vertex.names (v/c), Volunteer.Staff (v/n),
#> | Frequency (e/n), na (e/l)
#> + edges from 031e6f4:
#>  [1]  2->1  3->1  8->1  9->1 14->1  1->2  3->2  4->2  8->2  1->3  2->3
#> [12]  4->3  7->3 12->3 13->3  1->4  3->4  8->4  1->5  3->5  8->5 14->5
#> [23]  3->6  8->6  9->6  1->7  2->7  3->7  4->7  5->7  8->7  9->7 10->7
#> [34] 11->7 12->7 13->7  1->8  2->8  3->8  5->8  7->8  9->8 12->8 13->8
#> [45] 14->8  1->9  2->9  3->9  4->9  8->9 10->9 11->9 12->9 13->9
#> + ... omitted several edges

# all goes well with fortify.network, characters as factors (as per #53)
str(ggnetwork(emon[[1]]))
#> 'data.frame':    97 obs. of  15 variables:
#>  $ x                  : num  0.134 0 0.628 0.111 0.561 ...
#>  $ y                  : num  0.68 0.485 0.447 0.246 1 ...
#>  $ Command.Rank.Score : num  0 10 3 5 0 0 20 40 10 30 ...
#>  $ Decision.Rank.Score: num  20 7 0 5 0 0 20 50 10 20 ...
#>  $ Formalization      : num  2 1 1 1 1 1 1 2 1 3 ...
#>  $ Location           : Factor w/ 1 level "L": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ na.x               : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
#>  $ Paid.Staff         : num  10 400 200 60 1 7 60 7 70 100 ...
#>  $ Sponsorship        : Factor w/ 6 levels "City","County",..: 6 6 6 4 5 5 2 3 1 1 ...
#>  $ vertex.names       : Factor w/ 14 levels "A.1.Ambulance.Service",..: 12 14 13 4 9 10 8 5 2 3 ...
#>  $ Volunteer.Staff    : num  50 2000 0 0 20 80 20 100 0 0 ...
#>  $ xend               : num  0.134 0 0.628 0.111 0.561 ...
#>  $ yend               : num  0.68 0.485 0.447 0.246 1 ...
#>  $ Frequency          : num  NA NA NA NA NA NA NA NA NA NA ...
#>  $ na.y               : logi  NA NA NA NA NA NA ...

# however, fortify.igraph messes with the (vertex) attribute types
str(ggnetwork(intergraph::asIgraph(emon[[1]])))
#> 'data.frame':    97 obs. of  15 variables:
#>  $ x                  : num  0.662 0.724 0.392 0.537 0.424 ...
#>  $ y                  : num  0.414 0.69 0.664 1 0 ...
#>  $ Command.Rank.Score : Factor w/ 8 levels "0","10","2","20",..: 1 2 5 8 1 1 4 7 2 6 ...
#>  $ Decision.Rank.Score: Factor w/ 8 levels "0","10","2","20",..: 4 8 1 6 1 1 4 7 2 4 ...
#>  $ Formalization      : Factor w/ 3 levels "1","2","3": 2 1 1 1 1 1 1 2 1 3 ...
#>  $ Location           : Factor w/ 1 level "L": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ na.x               : Factor w/ 1 level "FALSE": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ Paid.Staff         : Factor w/ 9 levels "0","1","10","100",..: 3 6 5 7 2 8 7 8 9 4 ...
#>  $ Sponsorship        : Factor w/ 6 levels "City","County",..: 6 6 6 4 5 5 2 3 1 1 ...
#>  $ vertex.names       : Factor w/ 14 levels "A.1.Ambulance.Service",..: 12 14 13 4 9 10 8 5 2 3 ...
#>  $ Volunteer.Staff    : Factor w/ 7 levels "0","100","20",..: 5 4 1 1 3 7 3 2 1 1 ...
#>  $ xend               : num  0.662 0.724 0.392 0.537 0.424 ...
#>  $ yend               : num  0.414 0.69 0.664 1 0 ...
#>  $ Frequency          : num  NA NA NA NA NA NA NA NA NA NA ...
#>  $ na.y               : num  NA NA NA NA NA NA NA NA NA NA ...

Created on 2019-07-30 by the reprex package (v0.3.0)
