In [1]:
# the copyright author is added, which is notebook specific
# for more information on codeowners and details, check CODEOWNERS
__copyright__ = "Copyright © 2025 Debmalya Pramanik (ZenithClown)"

<div align = "center">

# Basic Concepts & Usage of `NetworkX`

</div>

A *jupyter notebook* exploring the optimization of share of business across plant-vendor combinations under different constraints (MOQ, pack size, min. order, capacity), so that the demand at each plant is met while the overall cost and time are minimized. Typically, this can be done using various optimization libraries, as explained below.

In [2]:
import os   # miscellaneous os interfaces
import sys  # configuring python runtime environment
import json # json, i.e., dict based object manipulation
import math # basic mathematical operations, etc.
import yaml # file for manipulation of yaml file, as dict

In [3]:
import random # generate random numbers for poc development

In [4]:
from typing import Iterable
from uuid import uuid4 as UUID4
from tqdm.auto import tqdm as TQ

### Module for Complex Network

**`NetworkX`** is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. In addition, we use **`gravis`**, which is well suited to graph visualization and has a built-in conversion of a plain `nx.Graph` to a [D3.js](https://d3js.org/) visualization that can be exported or viewed inline in a jupyter notebook.

In [5]:
import gravis as gv
import networkx as nx

### Module for Optimization

In [6]:
import pulp as p

## User Defined Function(s)

It is recommended that UDFs are defined outside the *jupyter notebook*, so that functions can be developed and edited more practically. As per *programming guidelines*, a [`src`](https://fileinfo.com/extension/src) file/directory is beneficial for code development and/or production release. However, by default a *jupyter notebook* requires a *kernel restart* to pick up changes to an imported code file on disk; for this reason, frequently changing functions can be defined in this section.

**Getting Started** with **`PYTHONPATH`**

One must know what [Environment Variables](https://medium.com/chingu/an-introduction-to-environment-variables-and-how-to-use-them-f602f66d15fa) are and how to call/use them in the programming language of choice. Note that environment variable names are *case sensitive* on all operating systems except Windows (since DOS is not case sensitive). Generally, environment variables can be accessed from the terminal/shell/command prompt as:

```shell
# macOS/*nix
echo $VARNAME

# windows
echo %VARNAME%
```

Once your system is set up with [`PYTHONPATH`](https://bic-berkeley.github.io/psych-214-fall-2016/using_pythonpath.html) as per the [*python documentation*](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH), keep in mind that it lists the directories that `import` statements search, in order of priority. If a source file/module is not available, check the necessary environment variables and/or ask the administrator for the source files.
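
As a quick check from within Python, the same variable can be read through `os.environ`. The sketch below is illustrative only: the helper name `pythonpath_entries` is hypothetical, and it simply splits the variable on the platform's path separator.

```python
import os

def pythonpath_entries(environ = os.environ) -> list:
    """Split PYTHONPATH into its directories (empty list if unset)."""
    raw = environ.get("PYTHONPATH", "")

    # os.pathsep is ":" on macOS/*nix and ";" on windows
    return [entry for entry in raw.split(os.pathsep) if entry]

print("PYTHONPATH entries:", pythonpath_entries())
```

An empty list means the variable is not set for the running kernel, in which case only the default search locations and any `sys.path` changes made in the notebook apply.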

Most of the utility functions available in `PYTHONPATH` are tracked and maintained in [GIST.GitHub/ZenithClown](https://gist.github.com/ZenithClown), which provides detailed documentation, code snippets, example use cases, etc. For more information and a category-wise list of modules, check the [github](https://github.com/ZenithClown/ZenithClown) repository.

In [7]:
sys.path.append(os.path.join("..")) # base of the repository

# methods/class objects to define graph components
from routicle.components import edges, nodes

# methods/class objects for core functionalities of scm & logistics applications
import routicle.core.networkx as rnx
import routicle.core.optimizer as ro

## Global Argument(s)

The global arguments are *notebook* specific; however, they may also be extended to external libraries and functions on import. The *boilerplate* provides a basic ML directory structure which contains a directory for `data` and a separate directory for `output`. In addition, a separate directory (`data/processed`) is created to save processed datasets so that preprocessing need not be repeated.

In [8]:
def readConfig(filepath : str, verbose : bool = True, dropkeys : list = None) -> tuple:
    # read the template file; the context manager closes the file handle
    with open(filepath, "r") as f:
        config = yaml.load(f, Loader = yaml.FullLoader)

    _ = config.pop("about", None)
    version = config.pop("version", None)

    if verbose:
        print(f"Template File Version: {version}")
        print("  Types of Node Components:", len(config["components"]))
        print("   >> Component Key(s):", config["components"].keys())

    # create a dictionary of dropped key-value pairs
    dropped = dict(version = version)

    # `dropkeys` defaults to None to avoid a shared mutable default argument
    for key in dropkeys or []:
        dropped[key] = config.pop(key, None)

    # returns the remaining config and the dropped key-value pairs
    return config, dropped

### Config File Information(s)

The config file defines the network components (edges, nodes, etc.) along with their attributes, both for quick reference and so that values can be fetched via an external API call/integration with a source system. The file is written in YAML, which is parsed as a dictionary in Python. The main structure of the config file is detailed below.

#### Component(s) Detail

The nodes are defined under the `components` key, which gives, for each node type, the number of nodes followed by the list of node names. Only two attributes are defined here: `count`, which is used for assertion purposes, and `names`, whose entries are passed as the compulsory node name.

```yaml
...
components:
    node-type:
        count: <int>  # int, default to null
        names: <list> # array, if null then [], else values
    ...
...
```

#### Attribute(s) Detail

Most nodes accept any number of external attributes by default. The `attributes` key maps attributes onto a `routicle` nodes component using a `zip(...)` functionality. The attributes are separated by the node component type defined earlier, and default to a blank dictionary `{}` if `count` is `null` for a node type.

```yaml
...
attributes:
    node-type:
        1:
            attr: <any>
        ...
            ...
        n:
            ...
    ...
...
```

The keys under `attributes/node-type` are positional: for a node type with `count = 3`, the keys are `n = (1, 2, 3)`, and they are unpacked internally.
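
The pairing of names and positional attributes can be sketched as below. This is a minimal illustration under stated assumptions, not the `routicle` implementation: the config values, the helper `pair_attributes`, and the reading of the attribute keys as integers `1..n` are all assumed for the example.

```python
# hypothetical parsed config, following the YAML layout described above
config = {
    "components": {"vendors": {"count": 2, "names": ["VA Ltd.", "VB Co. Ltd."]}},
    "attributes": {"vendors": {1: {"packsize": 5.0}, 2: {"moq": 10.0}}},
}

def pair_attributes(config : dict, nodetype : str) -> dict:
    component = config["components"][nodetype]
    names = component["names"] or []

    # `count` is only used as a consistency check against `names`
    assert component["count"] in (None, len(names)), "count does not match names"

    # keys 1..n are positional (assumed), missing keys default to {}
    attributes = config.get("attributes", {}).get(nodetype) or {}
    ordered = [attributes.get(position, {}) for position in range(1, len(names) + 1)]

    return dict(zip(names, ordered))

print(pair_attributes(config, "vendors"))
```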

In [9]:
config, _ = readConfig(os.path.join("..", "assets", "templates", "network.yaml"))

Template File Version: v1.0.0
  Types of Node Components: 4
   >> Component Key(s): dict_keys(['ports', 'plants', 'vendors', 'warehouses'])


In [10]:
# for simple explanation, the template is defined with some plants and vendors
# get the global values, and then proceed for component definition, attributes, etc.
N_PLANTS, N_VENDORS = config["components"]["plants"]["count"], config["components"]["vendors"]["count"]

print(f"No. of Plants : {N_PLANTS}\t | Plant Names:", config["components"]["plants"]["names"])
print(f"No. of Vendors : {N_VENDORS}\t | Vendor Names:", config["components"]["vendors"]["names"])

No. of Plants : 5	 | Plant Names: ['PA Unit-I', 'PB Unit-II', 'PC Unit-I', 'PD Unit-III', 'PE Unit-I']
No. of Vendors : 3	 | Vendor Names: ['VA Ltd.', 'VB Co. Ltd.', 'VC Industry']


## Share of Business Optimization

In [11]:
plants = [nodes.ManufacturingUnits(name = config["components"]["plants"]["names"][idx], **config["attributes"]["plants"][idx]) for idx in range(N_PLANTS)]
vendors = [nodes.SupplyPoints(name = config["components"]["vendors"]["names"][idx], **config["attributes"]["vendors"][idx]) for idx in range(N_VENDORS)]

# check dnodes definition under routicle networkx/nxGraph for more information
dnodes = {node.name : node for node in plants + vendors}

In [12]:
# get the cost matrix from the underlying key of the template file, or
# alternatively from an underlying excel sheet or a similar source
# the data must have the shape p x v, where p = plants and v = vendors
# lookups always iterate as `for p in plants for v in vendors ...`
costmatrix = config["costmatrix"]["v2p"]

# as of now, the time matrix is not defined, since a proper weight function
# or a multi-objective optimization function is required for it
# each row is built separately so that the rows are independent lists
timematrix = [[1.0] * N_VENDORS for _ in range(N_PLANTS)]
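
A note on building nested lists such as `timematrix`: repeating an inner list with `*` makes every row the same object, which matters as soon as individual cells are updated. The short sketch below, with arbitrary dimensions, shows the difference between the two constructions.

```python
rows, cols = 2, 3

aliased = [[1.0] * cols] * rows                    # every row is the same list
independent = [[1.0] * cols for _ in range(rows)]  # one new list per row

aliased[0][0] = 9.0
independent[0][0] = 9.0

print(aliased)      # [[9.0, 1.0, 1.0], [9.0, 1.0, 1.0]]
print(independent)  # [[9.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
```

While all cells hold the same constant the two forms look identical, so the aliasing only shows up once per-lane times are filled in.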

In [13]:
# dedges maps each (vendor, plant) pair to its edge, and dynamically allocates the cost and rate of production (if any)
dedges = {
    (v.name, p.name) : edges.TimeCostEdge(
        name = f"{v.name} --> {p.name}", unode = v, vnode = p,
        time = timematrix[pidx][vidx], cost = costmatrix[pidx][vidx],
        idgen = lambda un, vn : f"V2P_{un}_{vn}".upper(),
        useselfname = False, idgenargs = [v.cidx, p.cidx],
        indexposition = (pidx, vidx)
    )
    for pidx, p in enumerate(plants) for vidx, v in enumerate(vendors)
    if math.isfinite(costmatrix[pidx][vidx])
}
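
The filtering step in the comprehension above can be seen in isolation with plain dictionaries. In this sketch the names and costs are made up, and a non-finite cost is taken to mean that the vendor does not serve the plant, so no lane (and later no decision variable) is created for that pair.

```python
import math

plants = ["PA", "PB"]
vendors = ["VA", "VB", "VC"]

# shape is p x v; math.inf marks a vendor-plant pair without a lane
costmatrix = [
    [87.24, 257.25, 171.0],
    [91.49, 267.00, math.inf],
]

lanes = {
    (v, p) : dict(cost = costmatrix[pidx][vidx], indexposition = (pidx, vidx))
    for pidx, p in enumerate(plants) for vidx, v in enumerate(vendors)
    if math.isfinite(costmatrix[pidx][vidx])
}

print(len(lanes), ("VC", "PB") in lanes)
```

Storing `indexposition` on each lane is what later allows results to be written back in the original matrix order.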

In [14]:
# create a directed graph from vendors to plants, connected by time-cost edges
network = rnx.nxGraph(G = nx.DiGraph(), dnodes = dnodes, dedges = dedges)

In [15]:
fig = gv.d3(network.G, show_menu = True, show_node_image = True, node_label_size_factor = 0.50)
# fig.display(inline = True) # or view in browser

In [16]:
model = ro.PuLPModel(name = "SOBOptimizer", network = network)

In [17]:
demand, supply, supplyiter, demanditer = model.create_constraints(plants)

In [18]:
for item in supplyiter:
    model += p.lpSum(item["variables"]) >= item["object"].minorder, "S_{name}_MO".format(name = item["object"].cidx) # add min. order quantity

    maxcapacity = item["object"].maxcapacity
    maxcapacity = maxcapacity if maxcapacity != float("inf") else 1e3
    model += p.lpSum(
        [ var * dict(item["object"]).get("packsize", 1.0) for var in item["variables"] ]
    ) <= maxcapacity, "S_{name}_CAPACITY".format(name = item["object"].cidx)

In [19]:
for item in demanditer:
    # add the demand constraint at the plant, subjected to pack size of the vendor (optional, defaults to 1.0)
    model += p.lpSum(
        [ var * dict(model.network.getbycidx(str(var), component = "edge").unode).get("packsize", 1.0) for var in item["variables"] ]
    ) == item["object"].demand, "P_{name}_DEMAND".format(name = item["object"].cidx)
    
    # add the capacity constraint at the plant, if the max. capacity is not defined consider a very large number
    maxcapacity = item["object"].maxcapacity
    maxcapacity = maxcapacity if maxcapacity != float("inf") else 1e3
    model += p.lpSum(item["variables"]) <= maxcapacity, "P_{name}_CAPACITY".format(name = item["object"].cidx)
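
To make the demand, pack size and capacity constraints concrete, here is a small sketch that enumerates every integer pack count for a single plant and keeps the cheapest feasible allocation. It deliberately uses brute-force enumeration from the standard library instead of PuLP, and the vendor figures and the helper `cheapest_allocation` are made up for illustration.

```python
from itertools import product

# hypothetical vendors: cost is per pack, capacity is in units
vendors = {
    "VA" : dict(cost = 4.0, packsize = 5, maxcapacity = 10),
    "VB" : dict(cost = 2.0, packsize = 2, maxcapacity = 12),
}

def cheapest_allocation(demand : int, vendors : dict):
    names = list(vendors)

    # capacity constraint: packs * packsize <= maxcapacity
    ranges = [
        range(int(vendors[n]["maxcapacity"] // vendors[n]["packsize"]) + 1)
        for n in names
    ]

    best = None
    for packs in product(*ranges):
        # demand constraint: units supplied must equal the plant demand
        supplied = sum(q * vendors[n]["packsize"] for q, n in zip(packs, names))
        if supplied != demand:
            continue

        cost = sum(q * vendors[n]["cost"] for q, n in zip(packs, names))
        if best is None or cost < best[0]:
            best = (cost, dict(zip(names, packs)))

    return best # None when no integer allocation meets the demand

print(cheapest_allocation(12, vendors))
```

Enumeration is only practical for a handful of lanes; it is shown here to make the feasible set visible, not as a replacement for the solver.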

In [20]:
# restrict every decision variable to integer values
for variable in model.variables():
    variable.cat = p.LpInteger

In [21]:
# the binary decision variables are of type `p.LpBinary` and are created outside the function scope
# this ensures that the base model and its functionalities remain unaltered
for item in demanditer:
    # all the binary variables are named with a suffix `_DBIN` and the name,
    # is derived from the variable names so defined in the variables iterable
    for variable in item["variables"]:
        edge = model.network.getbycidx(str(variable), component = "edge")
        model += item["object"].demand >= dict(edge.unode).get("moq", 0.0) * p.LpVariable(f"{edge.cidx}_DBIN", cat = p.LpBinary), "LOW_MOQ_{name}".format(name = str(variable))
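
For reference, a common way to let a binary switch enforce an MOQ is to bound the order quantity from both sides, so that a lane is either unused or carries at least the minimum. This is a general modelling pattern and an assumption here, not necessarily what the cell above or `routicle` implements. The helper `satisfies_moq` and the bound `upper` are hypothetical.

```python
def satisfies_moq(quantity : float, use : int, moq : float, upper : float) -> bool:
    """Check the linking rule: moq * use <= quantity <= upper * use."""
    return use in (0, 1) and moq * use <= quantity <= upper * use

print(satisfies_moq(0, 0, moq = 10, upper = 100))  # lane unused
print(satisfies_moq(10, 1, moq = 10, upper = 100)) # lane used at the MOQ
print(satisfies_moq(5, 1, moq = 10, upper = 100))  # below the MOQ
print(satisfies_moq(5, 0, moq = 10, upper = 100))  # quantity without switch
```

The first two cases satisfy the rule and the last two violate it, which is the "zero or at least the MOQ" behaviour the binary variable is meant to capture.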

In [22]:
print(f"Objective Function::\n\t{model.objective}", end = "\n\n")
print(f"No. of Variables = {len(model.nvariables):,} | Variable Names::\n\t{model.nvariables}", end = "\n\n")
print(f"No. of Constraints = {len(model.nconstraints):,} | Constraint Definition::\n{json.dumps(model.nconstraints, default = str, indent = 2)}")

Objective Function::
	87.24*V2P_VALTD_PAUNITI + 87.24*V2P_VALTD_PBUNITII + 87.24*V2P_VALTD_PCUNITI + 87.24*V2P_VALTD_PDUNITIII + 91.49*V2P_VALTD_PEUNITI + 257.25*V2P_VBCOLTD_PAUNITI + 257.25*V2P_VBCOLTD_PBUNITII + 257.25*V2P_VBCOLTD_PCUNITI + 255.0*V2P_VBCOLTD_PDUNITIII + 267.0*V2P_VBCOLTD_PEUNITI + 171.0*V2P_VCINDUSTRY_PAUNITI + 171.0*V2P_VCINDUSTRY_PBUNITII + 171.0*V2P_VCINDUSTRY_PCUNITI + 173.0*V2P_VCINDUSTRY_PDUNITIII

No. of Variables = 19 | Variable Names::
	[V2P_VALTD_PAUNITI, V2P_VALTD_PAUNITI_DBIN, V2P_VALTD_PBUNITII, V2P_VALTD_PBUNITII_DBIN, V2P_VALTD_PCUNITI, V2P_VALTD_PCUNITI_DBIN, V2P_VALTD_PDUNITIII, V2P_VALTD_PDUNITIII_DBIN, V2P_VALTD_PEUNITI, V2P_VALTD_PEUNITI_DBIN, V2P_VBCOLTD_PAUNITI, V2P_VBCOLTD_PBUNITII, V2P_VBCOLTD_PCUNITI, V2P_VBCOLTD_PDUNITIII, V2P_VBCOLTD_PEUNITI, V2P_VCINDUSTRY_PAUNITI, V2P_VCINDUSTRY_PBUNITII, V2P_VCINDUSTRY_PCUNITI, V2P_VCINDUSTRY_PDUNITIII]

No. of Constraints = 30 | Constraint Definition::
{
  "S_VALTD_MO": "V2P_VALTD_PAUNITI + V2P_VALTD_PB

In [23]:
status = model.optimize()

# we've defined a positional index in an attribute, this
# can be used to return the matrix in the original order for representation
output = [[None] * N_VENDORS for _ in range(N_PLANTS)] # https://stackoverflow.com/a/2739564

for var in [variable for variable in model.nvariables if "_DBIN" not in str(variable)]:
    edge = network.getbycidx(str(var), component = "edge")
    output[edge.indexposition[0]][edge.indexposition[1]] = p.value(var) * edge.unode.packsize

print(output)

Solver Status: Optimal
Target Objective: 2,601.89
[[9.0, 3.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 4.0], [1.0, 0.0, 0.0], [1.0, 3.0, None]]
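
The `None` in the last row of the printed matrix corresponds to a vendor-plant pair for which no edge was created (the objective function has no variable for it), so that cell is never written. The write-back step can be sketched on its own as follows. The helper `to_matrix` and the sample values are illustrative only.

```python
def to_matrix(values : dict, shape : tuple) -> list:
    nrows, ncols = shape

    # start from an all-None matrix with independent rows
    matrix = [[None] * ncols for _ in range(nrows)]

    # each key is the (row, column) position stored on the edge
    for (row, col), value in values.items():
        matrix[row][col] = value

    return matrix

solved = {(0, 0) : 9.0, (0, 1) : 3.0, (1, 1) : 9.0}
print(to_matrix(solved, shape = (2, 2)))  # [[9.0, 3.0], [None, 9.0]]
```

Keeping `None` rather than `0.0` for missing lanes distinguishes "no lane exists" from "lane exists but nothing was ordered".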


In [24]:
# model.__reset_constraints__()