Interfaces for network models #47

simonbowly · 2023-04-21T05:38:04Z

Meta issue to make sure we have consistent APIs between network-y mods #39 #36

Networkx is very clean for this, as we can have node and edge attributes and all the data for a mod is bundled into one python object. We probably want to avoid networkx/igraph/others as dependencies, but only if we can find just as clean an API for network models using the existing numpy/scipy/pandas dependencies.

For weighted or unweighted graphs we use scipy sparse arrays (e.g. mwis and matching):

# Graph adjacency matrix (upper triangular) as a sparse matrix.
g = nx.cubical_graph()
adjacency_matrix = sp.triu(nx.to_scipy_sparse_array(g))

# Vertex weights
weights = np.array([2**i for i in range(8)])

# Compute maximum weighted independent set.
mwis = maximum_weighted_independent_set(adjacency_matrix, weights)

however this doesn't quite suit networks which need multiple attributes per edge. We could pass multiple arrays, but this is unweildy as the user has to build two scipy objects with duplicated data for the edge pattern. The alignment between sparse indexes and positions in the demand array is not too nice either, as it's restricted to 0 ... n-1 node labels.

cost = ... # sparse array
capacity = ... # sparse array
demand = ... # dense array
min_cost_flow(cost, capacity, demand)

I also wrote a network flow function for use in bipartite matching, which uses dense arrays to pass all the edge source/target indices, costs & capacities. It's meant as a low-level utility for other mods to use though, it would be fast but not a nice API:

gurobi-optimods/src/gurobi_optimods/network_util.py

Lines 11 to 18 in ae7e1d8

    
           def solve_min_cost_flow( 
        
               env: gp.Env, 
        
               edge_source: np.ndarray, 
        
               edge_target: np.ndarray, 
        
               capacity: np.ndarray, 
        
               cost: np.ndarray, 
        
               demand: np.ndarray, 
        
           ):

So I think pandas is the better alternative here, with input data as below. It's easy to do a direct formulation of a flow problem using gurobipy-pandas as well, see networkflow.md

edge_data = pd.DataFrame([
    {"source": 0, "target": 1, "capacity": 16, "cost": 0},
    {"source": 0, "target": 2, "capacity": 13, "cost": 0},
    {"source": 1, "target": 2, "capacity": 10, "cost": 0},
    {"source": 2, "target": 1, "capacity": 4, "cost": 0},
    {"source": 1, "target": 3, "capacity": 12, "cost": 0},
    {"source": 3, "target": 2, "capacity": 9, "cost": 0},
    {"source": 2, "target": 4, "capacity": 14, "cost": 0},
    {"source": 4, "target": 3, "capacity": 7, "cost": 0},
    {"source": 3, "target": 5, "capacity": 20, "cost": 0},
    {"source": 4, "target": 5, "capacity": 4, "cost": 0},
])

node_data = pd.DataFrame([
    {"node": 0, "demand": -23},
    {"node": 5, "demand": 23}
])

@torressa @stevedwards what do you think (for the final version, ok to continue hacking with networkx for the time being)? It's also fine to use networkx in the docs examples, either to generate networks or produce figures. @maliheha did this for MWIS; her examples generate a networkx graph and convert to scipy using networksx/numpy functions. It's similarly straightforward for pandas, see to_pandas_edgelist which gives the same structure as edge_data above from a graph with edge attributes.

The text was updated successfully, but these errors were encountered:

torressa · 2023-04-21T17:14:52Z

I am happy with using pandas as our main external interface for problems with many edge attributes. It's cleaner than adding a new dep or using a function with many inputs.

To make it super easy for people to use I think we should aim to transform stuff for people with different preferences to use (we can add these as optional deps: i.e. only define the function when the import is available).

For graph-based mods, we could have, for example (thanks to Python's lack of overloading),

graph_mod(data: pd.DataFrame, ...)
graph_mod_sp(G: sp.sparray, ...) (This can be the default one when pandas one is not needed)
graph_mod_nx(G: nx.DiGraph, ...) (optional dep)

This will really help peeps in the graph community try these out. Of course, as @ruthmair pointed out, some of these libraries will already have algorithms for these problems in place, but sometimes they will not!

simonbowly · 2023-04-24T02:38:05Z

Sounds good. You can overload the same function name if you want; just check the argument type. I've done that in bipartite matching. It looks ok but might start to look odd when e.g. networkx requires only one object as an argument but pandas/scipy require two.

simonbowly · 2023-05-31T04:32:44Z

We have a more consistent approach now: use scipy and pandas as much as possible, but also provide some documented way to pass a networkx graph. Closing.

torressa mentioned this issue Apr 25, 2023

Min-Cost Flow Problem #39

Merged

7 tasks

simonbowly added this to the First public release milestone May 8, 2023

simonbowly closed this as completed May 31, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Interfaces for network models #47

Interfaces for network models #47

simonbowly commented Apr 21, 2023 •

edited

Loading

torressa commented Apr 21, 2023 •

edited

Loading

simonbowly commented Apr 24, 2023

simonbowly commented May 31, 2023

Interfaces for network models #47

Interfaces for network models #47

Comments

simonbowly commented Apr 21, 2023 • edited Loading

torressa commented Apr 21, 2023 • edited Loading

simonbowly commented Apr 24, 2023

simonbowly commented May 31, 2023

simonbowly commented Apr 21, 2023 •

edited

Loading

torressa commented Apr 21, 2023 •

edited

Loading