fixed indexing issue when vertices are <:Integer #133

sbromberger · 2014-12-19T18:02:51Z

Fixes #131. This was a simple change, but may have memory implications, as it's enumerating the vertices. Tests on graphs with order=1e7 don't show any appreciable slowdown, though.

mlubin · 2014-12-19T18:05:53Z

How precisely is it enumerating the vertices?

sbromberger · 2014-12-19T18:07:48Z

Well, perhaps it's not. I'm passing vertices(g) as an additional argument, and my concern was that doing so might force an evaluation of the vertices which could possibly get expensive.

mlubin · 2014-12-19T18:10:16Z

vertex_index should be a fast O(1) operation when the vertices are integers. As long as we're careful to preserve that, this seems fine.

sbromberger · 2014-12-19T18:13:42Z

I don't think it is, and I don't think it can be. Consider:

h = graph([10,11,12,13],[])

The old (bad) code would simply return 10 for vertex_index(10,h). This is clearly incorrect and has the side effect of corrupting any type <:Integer. It is only correct when the vertices are 1-based sequences.

The new code, which fixes all cases, needs at a minimum one search through an unordered list of vertices, which I believe is O(n). It could be faster depending on how Julia does array (vector) searching.

Also, depending on dispatch method, this is no worse than others (see common.jl for some examples with other types).

mlubin · 2014-12-19T18:17:01Z

Agreed, but we definitely need a "fast case" available when the vertices are 1-based. Maybe @lindahua should comment on how this was intended to be handled in the initial design.

sbromberger · 2014-12-19T18:18:39Z

The only way to do that, I think, is to figure out a way to check to see whether vertices(g) is a range. Even then, if you add a node that is nonsequential afterwards, you're basically hosed, and this will break any attempt to remove a node.

mlubin · 2014-12-19T18:21:12Z

Or the convention could be that by using integers for vertices, you're guaranteeing that that the indices are 1-based and contiguous.

sbromberger · 2014-12-19T18:24:44Z

I think that would be a mistake. This is the reason char breaks in 3.3, and it would continue to break all subtypes of Integer, which include all signed, unsigned, bigint, and bool types.

Consider my use case, which is mapping UNIX UIDs to activity. I'd have to construct a custom (memory-laden) type as a vertex identifier, or use a non-indexing property which would destroy insert performance.

Anyone who does time-based graphing using epoch will also be burned by this.

It would also mean that graphs couldn't have a vertex with integer value <= 0.

mlubin · 2014-12-19T18:31:59Z

Well, there's no memory overhead for creating a custom immutable type, though it certainly is ugly. I have to object to making vertex lookups O(n) in the linear base case since it will silently increase the algorithmic complexity of existing code. Hopefully we can gather a few more opinions here.

sbromberger · 2014-12-19T18:36:28Z

So, I'm not sure it's O(n). I don't know enough about how Julia does vector searching, but IIRC unordered lists are O(n) search.

Another option is to create a new graph type that would have the guarantees you desire. This would have the advantage of requiring the user to explicitly acknowledge the indexing limitations (and would allow us to throw appropriate error messages if constraints are violated).

Regardless, I urge adoption of this PR because as it stands right now, 0.3.3 is broken for char and nonconsecutive/nonpositive ints, and 0.4.x is broken for nonconsecutive/nonpositive ints. This could really burn someone who's expecting proper behavior. IMO, it's better to have a correct result whose derivation MAY BE slower, than a guaranteed incorrect result that's delivered lightning fast. You can optimize speed, but correctness is boolean.

ETA: EVEN FOR contiguous, 1-based sequences of vertices, this code is broken:

julia> a = simple_graph(10)
Directed Graph (10 vertices, 0 edges)

julia> vertex_index(1000,a)
1000

lindahua · 2014-12-20T04:09:53Z

I am not satisfied with the current way of handling vertex/edge indexing either.

In the original design, what we intend is:

When using Int as vertex type, you implicitly assume that vertex_index(i, g) == i always holds.
If you want more sophisticated way of indexing vertices, you have to explicitly introduce a new vertex type, and handle the indexing by overriding vertex_index(v, g) for the customized vertex type.

lindahua · 2014-12-20T04:12:31Z

The following problem can be easily resolved by adding a bound checking in vertex_index:

julia> a = simple_graph(10)
Directed Graph (10 vertices, 0 edges)

julia> vertex_index(1000,a)
1000

However, using an algorithm of complexity O(n) or O(log n) for vertex(i::Int, g::AbstractGraph) will cause huge performance issues in many applications.

sbromberger · 2014-12-20T05:34:50Z

@lindahua

When using Int as vertex type, you implicitly assume that vertex_index(i, g) == i always holds.

I wish I had known of this implicit assumption - I wouldn't have spent the last two days trying to optimize dijkstra and pulling my hair out due to the catastrophic errors introduced by vertex_index().

If you decide to keep this constraint, I would urge copious documentation, and a rewrite of add_vertex! must ensure that the new vertex is sequential.

I really think this is a mistake, though - it's certainly one that will preclude me from using this package for my application - and without documentation and constraint checks, will cause others to rely on expected functionality that will (silently) break their code. Right now, there's nothing magical about Integer vertices that requires them to be positive, 1-based, and sequential. I would again recommend a special graph type that enforces these constraints with the advantage of speed.

It's not obvious at all in the documentation or the implementation that using an Integer for a vertex should result in incorrect indexing. If you decide to enforce the limitation, then you should explicitly disallow all Integer vertex types other than unsigned (including char for 0.3.x), you should ensure that add_vertex! does not require a vertex parameter, and, as a side effect, you will give up any possibility of deleting vertices in any efficient manner.

That last part means that this package will be completely unsuitable for modeling dynamic graphs such as AMI, power systems, and user activity networks.

In any case, if that's the decision, so be it, and I'll close the PR.

lindahua · 2014-12-20T06:24:22Z

@sbromberger Your concerns are valid.

I agree that this implicit assumption is a recipe for errors (and even disasters in some applications).

A stopgap to remedy this:

restrict vertex_index(i, g) = i to only SimpleGraph.
for others, using a safer way.

sbromberger · 2014-12-20T06:30:20Z

I think your suggestion is a good long-term solution as well - this is what I was thinking about when I proposed creating a special constrained graph for fast indexing.

In order to make SimpleGraph safe, it will be necessary to modify add_vertex! so that v is ignored (you can't / don't want to use it), and will need to enforce bounds checking on vertex_index.

For performance, perhaps using a heap or other search-optimized structure for non-SimpleGraph vertices would be a possibility. If memory serves, you'd never get O(1), but you could approach O(log n).

coveralls · 2014-12-20T07:11:37Z

Coverage remained the same when pulling 5c7460e on sbromberger:fix_vertex_index into 6525b82 on JuliaLang:master.

coveralls · 2014-12-20T07:11:38Z

Coverage remained the same when pulling 5c7460e on sbromberger:fix_vertex_index into 6525b82 on JuliaLang:master.

sbromberger · 2014-12-20T07:14:06Z

I've made the following changes:

created a new add_vertex!(g::SimpleGraph) that does not take a vertex, since these graphs must ensure consistent state of the vertex range, and deprecated add_vertex!(g::SimpleGraph, v)
made vertex_index(v::Integer, g::SimpleGraph) do proper bounds checking and still produce an O(1) result (= v) - fixes incorrect behavior of vertex_index (in all versions) #131 (comment) issue 2.
kept the original change for all other graphs to ensure that vertex_index() does the right thing (even if it's O(n)). - fixes incorrect behavior of vertex_index (in all versions) #131 (comment) issue 1.

…or add_vertex\!() deprecated add_vertex\!(g::SimpleGraph, v)

coveralls · 2014-12-20T07:35:53Z

Coverage remained the same when pulling fe2b228 on sbromberger:fix_vertex_index into 6525b82 on JuliaLang:master.

mlubin · 2014-12-20T19:57:58Z

So this change isn't correct because if the vertex type is KeyVertex, which stores its own index, vertex_index will still end up doing a linear search on the list of vertices.

sbromberger · 2014-12-20T21:10:40Z

Are there any other graph types that need to be excluded?

On Dec 20, 2014, at 11:58, Miles Lubin notifications@github.com wrote:

So this change isn't correct because if the vertex type is KeyVertex, which
stores its own index, vertex_index will still end up doing a linear search
on the list of vertices.

—
Reply to this email directly or view it on GitHub
#133 (comment).

mlubin · 2014-12-21T05:57:16Z

Probably the right way is to do something like:

function vertex_index{V<:ProvidedVertexType}(v::V, g::GenericGraph{V})
    if applicable(vertex_index, v)
        return vertex_index(v)
    else
        return vertex_index(v,vertices(g))
    end
end

Then undefine vertex_index(v::Integer) = v if this is consensus.

sbromberger · 2014-12-21T16:38:13Z

@mlubin

I'm not sure I follow:

julia> a = graph([2,4,6,8],[])
Directed Graph (4 vertices, 0 edges)

julia> applicable(vertex_index, 2)
true

...which would put us right back into #131 (comment) issue 1. What am I missing?

mlubin · 2014-12-21T16:39:23Z

Removing the definition vertex_index(v::Integer) = v

sbromberger · 2014-12-21T16:46:09Z

That "breaks" simple_graph (that is, it would force simple_graph to be O(n) which is what @lindahua didn't want), ...unless we keep the second commit (items 1 and 2 from #133 (comment)). Is that what you're suggesting?

mlubin · 2014-12-21T16:47:32Z

Yeah, remove vertex_index(v::Integer) and define vertex_index(v::Integer, g::SimpleGraph).

…ers O(1)

sbromberger · 2014-12-21T16:55:49Z

Yep, I got it now. See latest?

mlubin · 2014-12-21T16:57:37Z

LGTM, but tests are failing and the docs should be updated also.

sbromberger · 2014-12-21T17:05:10Z

Thank goodness for CI. I'm not sure this is going to work, since vertex_index(v) is being called by other areas of the code. (How one can have a vertex_index call without an associated graph is puzzling to me.)

mlubin · 2014-12-21T17:08:02Z

I think the code will need to be updated to use vertex_index(v,g) instead. They're probably a holdover from before the graph argument was introduced.

@sbromberger

Thanks @sbromberger Fixes #131 Fixes #127 Closes #133

@sbromberger

Thanks @sbromberger Fixes JuliaAttic#131 Fixes JuliaAttic#127 Closes JuliaAttic#133

fixed indexing issue when vertices are <:Integer

fdf8807

SimpleGraph now is O(1) vertex indexing, and does not take vertices f…

fe2b228

…or add_vertex\!() deprecated add_vertex\!(g::SimpleGraph, v)

added @mlubin's suggestions for indexing to preserve KeyIndex and oth…

ceb7b4c

…ers O(1)

mlubin added a commit that referenced this pull request Dec 21, 2014

Remove vertex_index(::Integer), make special case for SimpleGraph

6c0b52f

Thanks @sbromberger Fixes #131 Fixes #127 Closes #133

mlubin mentioned this pull request Dec 21, 2014

Remove vertex_index(::Integer), make special case for SimpleGraph #135

Merged

mlubin closed this in #135 Dec 21, 2014

ggggggggg pushed a commit to ggggggggg/Graphs.jl that referenced this pull request Mar 17, 2015

Remove vertex_index(::Integer), make special case for SimpleGraph

b731e8c

Thanks @sbromberger Fixes JuliaAttic#131 Fixes JuliaAttic#127 Closes JuliaAttic#133

StefanKarpinski added OG closed labels Oct 11, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fixed indexing issue when vertices are <:Integer #133

fixed indexing issue when vertices are <:Integer #133

sbromberger commented Dec 19, 2014

mlubin commented Dec 19, 2014

sbromberger commented Dec 19, 2014

mlubin commented Dec 19, 2014

sbromberger commented Dec 19, 2014

mlubin commented Dec 19, 2014

sbromberger commented Dec 19, 2014

mlubin commented Dec 19, 2014

sbromberger commented Dec 19, 2014

mlubin commented Dec 19, 2014

sbromberger commented Dec 19, 2014

lindahua commented Dec 20, 2014

lindahua commented Dec 20, 2014

sbromberger commented Dec 20, 2014

lindahua commented Dec 20, 2014

sbromberger commented Dec 20, 2014

coveralls commented Dec 20, 2014

coveralls commented Dec 20, 2014

sbromberger commented Dec 20, 2014

coveralls commented Dec 20, 2014

mlubin commented Dec 20, 2014

sbromberger commented Dec 20, 2014

mlubin commented Dec 21, 2014

sbromberger commented Dec 21, 2014

mlubin commented Dec 21, 2014

sbromberger commented Dec 21, 2014

mlubin commented Dec 21, 2014

sbromberger commented Dec 21, 2014

mlubin commented Dec 21, 2014

sbromberger commented Dec 21, 2014

mlubin commented Dec 21, 2014

fixed indexing issue when vertices are <:Integer #133

fixed indexing issue when vertices are <:Integer #133

Conversation

sbromberger commented Dec 19, 2014

mlubin commented Dec 19, 2014

sbromberger commented Dec 19, 2014

mlubin commented Dec 19, 2014

sbromberger commented Dec 19, 2014

mlubin commented Dec 19, 2014

sbromberger commented Dec 19, 2014

mlubin commented Dec 19, 2014

sbromberger commented Dec 19, 2014

mlubin commented Dec 19, 2014

sbromberger commented Dec 19, 2014

lindahua commented Dec 20, 2014

lindahua commented Dec 20, 2014

sbromberger commented Dec 20, 2014

lindahua commented Dec 20, 2014

sbromberger commented Dec 20, 2014

coveralls commented Dec 20, 2014

coveralls commented Dec 20, 2014

sbromberger commented Dec 20, 2014

coveralls commented Dec 20, 2014

mlubin commented Dec 20, 2014

sbromberger commented Dec 20, 2014

mlubin commented Dec 21, 2014

sbromberger commented Dec 21, 2014

mlubin commented Dec 21, 2014

sbromberger commented Dec 21, 2014

mlubin commented Dec 21, 2014

sbromberger commented Dec 21, 2014

mlubin commented Dec 21, 2014

sbromberger commented Dec 21, 2014

mlubin commented Dec 21, 2014