v0.7.0 write_graphml changes integer node attribute type and value #796
Comments
We are dealing with two separate issues here. One is the fact that igraph converts integer attributes to doubles when the graph is saved to GraphML. Unfortunately this is not easy to deal with because the GraphML writer is implemented deep down in igraph's C core, and on the C level igraph distinguishes between only three types of attributes: numbers (which are stored as doubles), strings and Booleans. This means that by the time the GraphML writer function is called, the Python interface has already converted the attribute values to "regular" C doubles because this is the only way it can pass the values down to the C layer. However, if you are not tied to the GraphML format, you can simply pickle your graphs instead.

The other problem (the fact that it seems that the attributes are not loaded back properly) is more interesting, but unfortunately I can investigate it only if you upload a full, self-contained script (and most likely a corresponding GraphML file) somewhere that reproduces the error on your machine. Please post the URL here once you have such a script so I can check what's going on.
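To illustrate the pickling suggestion above, here is a minimal sketch using only the Python standard library. The `attributes` dict is a hypothetical stand-in for a vertex attribute table, not an igraph object, and the conversion to `float` only mimics what happens when numbers are handed to a C layer that stores them as doubles.

```python
import pickle

# Hypothetical stand-in for a vertex attribute table (NOT an igraph object).
attributes = {"name": [999999, 1000005, 2146334]}

# Mimic the hand-off to a C layer that stores every number as a double.
as_doubles = [float(v) for v in attributes["name"]]

# Integers below 2**53 are exactly representable as doubles, so the
# values themselves survive the conversion; only the type changes.
values_preserved = all(int(d) == v for d, v in zip(as_doubles, attributes["name"]))

# Pickling serializes the Python objects directly, so ints stay ints.
restored = pickle.loads(pickle.dumps(attributes))
types_preserved = all(type(v) is int for v in restored["name"])
```

The point of the sketch is that the conversion to double is not what changes the values for IDs of this size; it only changes the type, which pickling avoids.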
Thank you for your prompt reply. Actually, the only thing you should need in addition to the code I posted in the comment above is the list of integer IDs. I could give this to you, but it turns out you don't actually need it. Interestingly, the attributes only change when the integers are above a certain value. I found the break point in my original graph and then isolated the value above which errors start to occur. It seems any integer value over 1,000,000 gets rounded to the nearest ten, using something similar to the typical decimal rounding procedure. You can replicate this with the following:
This is the output:
Oddly, the rounding at multiples of 5 alternates between rounding down and rounding up. The first rounds down, the next up, and so on. This may have something to do with the floating point rounding protocol.
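The alternating pattern described above is what round-half-to-even ("banker's rounding") produces when a 7-digit integer is cut to 6 significant digits. The sketch below compares Python's own `%g` formatting with an explicit round-half-to-even from the `decimal` module. It assumes the C library on the affected machine rounds ties the same way Python's formatter does; the helper name `six_significant_digits` is made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def six_significant_digits(n: int) -> int:
    """Round a 7-digit integer to the nearest ten, ties going to the even digit."""
    return int(Decimal(n).quantize(Decimal("1E1"), rounding=ROUND_HALF_EVEN))


# Values that sit exactly halfway between two multiples of ten.
ties = [1000005, 1000015, 1000025, 1000035]

# What survives a write with "%g" followed by a read back as a number.
printed = [int(float("%g" % n)) for n in ties]

# The same values rounded explicitly with round-half-to-even.
explicit = [six_significant_digits(n) for n in ties]
```

Both lists come out as `[1000000, 1000020, 1000020, 1000040]`: the first tie rounds down, the next up, and so on.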
Okay, this is due to how the standard C library prints floats into the GraphML file. Up to 999999 we are fine because we write the number exactly into GraphML. From 1000000 onward the standard C library switches to scientific notation with rounding, so we get (Fun fact: your code crashes my machine with
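The switch at 1000000 can be reproduced without igraph, because Python's `%g` follows the same rule as C's `printf`: 6 significant digits by default, and scientific notation once the exponent reaches the precision. The sketch below is only an illustration; the `roundtrip` helper is hypothetical, and `%.17g` is just one format that is wide enough to round-trip any double, not necessarily what the eventual fix uses.

```python
def roundtrip(n: int, fmt: str = "%g") -> int:
    """Format n like a C double would be printed, then parse it back."""
    return int(float(fmt % n))


below = "%g" % 999999        # still printed exactly
at_limit = "%g" % 1000000    # switches to scientific notation
just_above = "%g" % 1000001  # the trailing 1 is lost to rounding

lost = roundtrip(2146334)            # 6 significant digits are not enough
kept = roundtrip(2146334, "%.17g")   # 17 significant digits preserve the value
```

Here `below` is `'999999'`, while `at_limit` and `just_above` are both `'1e+06'`; `lost` is `2146330` and `kept` is `2146334`.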
Note to self: we are probably looking for a solution that strives to
…ing precision for numeric attributes, fixes #796
Fixed in fdcaa14.
I have a graph with 2,146,334 nodes, each with a 'name' attribute which contains unique integer IDs. When I write this graph to a GraphML file using the write_graphml or write_graphmlz methods of the Graph instance, the 'name' attributes are given the type double and many of them change. The code below illustrates:

The same issue occurs if I create a new attribute for the IDs rather than relying on the default 'name' attribute.
For now I am circumventing this problem by converting the integer IDs to strings, which are handled properly. However, it would be nice to have this resolved.
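The string workaround can be sketched with the standard library alone. The example below writes IDs as a GraphML attribute of type `string` and converts them back to integers on load. It is a simplified stand-in, not igraph's writer: the function names and the key id `v_name` are hypothetical, and the GraphML namespace declaration is left out to keep the parsing short.

```python
import xml.etree.ElementTree as ET


def ids_to_graphml(ids):
    """Write integer IDs as a string-typed 'name' attribute (simplified GraphML)."""
    root = ET.Element("graphml")
    ET.SubElement(root, "key", {
        "id": "v_name", "for": "node",
        "attr.name": "name", "attr.type": "string",
    })
    graph = ET.SubElement(root, "graph", {"id": "G", "edgedefault": "undirected"})
    for i, node_id in enumerate(ids):
        node = ET.SubElement(graph, "node", {"id": f"n{i}"})
        data = ET.SubElement(node, "data", {"key": "v_name"})
        data.text = str(node_id)  # stored as text, so no float formatting is involved
    return ET.tostring(root, encoding="unicode")


def ids_from_graphml(text):
    """Read the IDs back and convert them to int explicitly."""
    root = ET.fromstring(text)
    return [int(d.text) for d in root.iter("data")]


ids = [999999, 1000005, 2146334]
document = ids_to_graphml(ids)
recovered = ids_from_graphml(document)
```

Because the values never pass through a floating point format, `recovered` equals `ids`. The cost is that the caller has to convert back to integers after loading.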