
The igraph object format

Kirill Müller edited this page Jun 12, 2023 · 2 revisions

Description

An igraph object is a list of length 10 with a "class" attribute. Its elements, in order:

  1. igraph_t_idx_n: numeric(1), number of vertices

  2. igraph_t_idx_directed: logical(1), is the graph directed?

  3. igraph_t_idx_from: numeric(), "from" vertex ID of each edge in order

  4. igraph_t_idx_to: numeric(), "to" vertex ID of each edge in order

  5. igraph_t_idx_oi: auxiliary indexes, numeric() in <= 1.4.3, NULL (ignored if present) in 1.5.0

  6. igraph_t_idx_ii: auxiliary indexes, numeric() in <= 1.4.3, NULL (ignored if present) in 1.5.0

  7. igraph_t_idx_os: auxiliary indexes, numeric() in <= 1.4.3, NULL (ignored if present) in 1.5.0

  8. igraph_t_idx_is: auxiliary indexes, numeric() in <= 1.4.3, NULL (ignored if present) in 1.5.0

  9. igraph_t_idx_attr: attribute structure

  10. igraph_t_idx_env: environment, gains a new entry "igraph" in 1.5.0
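
The ten-slot layout listed above can be pictured with a plain R list. This is a minimal base-R sketch, not output from igraph. The helper `mock_igraph_layout()`, the class name `"mock_igraph"`, the vertex ID values, and the empty attribute placeholder are all made up for illustration. Only the slot positions and types follow the description above.

```r
# Hypothetical mock of the ten-slot layout in the 1.5.0 format.
# This is NOT a real igraph object and must not be passed to igraph functions.
mock_igraph_layout <- function(n, directed, from, to) {
  stopifnot(length(from) == length(to))
  obj <- list(
    as.numeric(n),                # 1. igraph_t_idx_n: number of vertices
    isTRUE(directed),             # 2. igraph_t_idx_directed
    as.numeric(from),             # 3. igraph_t_idx_from
    as.numeric(to),               # 4. igraph_t_idx_to
    NULL,                         # 5. igraph_t_idx_oi: NULL in 1.5.0
    NULL,                         # 6. igraph_t_idx_ii: NULL in 1.5.0
    NULL,                         # 7. igraph_t_idx_os: NULL in 1.5.0
    NULL,                         # 8. igraph_t_idx_is: NULL in 1.5.0
    list(),                       # 9. igraph_t_idx_attr: placeholder only
    new.env(parent = emptyenv())  # 10. igraph_t_idx_env
  )
  class(obj) <- "mock_igraph"     # made-up class, avoids igraph's S3 methods
  obj
}

# Illustrative vertex IDs; the indexing base is an assumption of this sketch.
g <- mock_igraph_layout(3, TRUE, from = c(0, 1, 2), to = c(1, 2, 0))
length(unclass(g))
```

A made-up class name is used so that printing the mock never dispatches to igraph's own methods if the package happens to be loaded.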

Requirements

  • Old versions should not misbehave with the new format. Examples of bad behaviour: silent wrong result, crash, memory leak. Examples of acceptable behaviour: immediate error (even if unintuitive), correct operation. This applies to each function in old versions; different functions may behave differently.

  • Future versions should support all older format versions, either by doing an automatic upgrade, or by instructing the user to upgrade manually.

    • Now fully tested. Where possible, the user is asked to call upgrade_graph(); this is supported for all inputs from igraph >= 0.2. The in-place upgrade is possible because igraph_t_idx_env is an environment, which can be updated without changing the object that contains it.
  • Proper format versioning and checks: Starting with the next version, igraph should check the format of an object, and should be able to decide whether that format is compatible and act accordingly (accept or reject the object, and advise the user with a clear message).

  • Support future changes: Future format modifications should be feasible and easy. It should be very clear to a future maintainer how they can execute a format change without breaking these requirements.

    • Search for R_IGRAPH_TYPE_VERSION
    • Search for the string it is defined as (currently 1.5.0)
    • Adapt every occurrence as needed
  • Long-term suitability: let’s try to get it right so we can go as long as possible without another format change, even if the C-side format is modified.

    • The C side is never serialized to disk. Exotic risk: unloading the igraph R package and loading an older or newer version of it while igraph objects are still in memory. To mitigate, ABI changes could come with a change of the element name our external pointer is stored under (currently unclass(g)[[10]]$igraph). If this changes, e.g., to igraph2 after an ABI update, the session won't crash even in that rare case.
  • Platform independent: An igraph object saved on one computer should work on another, as much as feasible. This includes 32/64-bit interop.

    • Handled by R as long as we rely on natively supported data types, no action needed.
  • Minimize storage redundancy: Try not to double storage requirements if we can get away with less.

    • The storage requirements are reduced compared to 1.4.3 because the auxiliary index vectors are no longer stored and are actively discarded when upgrading to 1.5.0.
  • Minimize conversion overhead: Since the plan is to have an associated C-side igraph_t, minimize the conversion overhead (both in time and in code complexity) between the R-side and C-side objects.

    • The complexity is contained in the static restore_pointer() function, which requires a single pass, is cache-efficient, and is small and simple enough.
  • Clearly documented format versioning: Any proposal should explain how formats are versioned and how versions are meant to be interpreted (is 0.10 greater than 0.8?)

    • Unclear: do we switch to integer versions?
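
The question "is 0.10 greater than 0.8?" from the versioning requirement can be made concrete in base R. The sketch below compares the three readings of a version string and then shows one possible accept/upgrade/reject decision. The helper `check_format_version()` and its return values are hypothetical and do not describe how igraph performs its check. The only value taken from this page is the current version string 1.5.0.

```r
# Three ways to read "0.10" versus "0.8":
numeric_version("0.10") > numeric_version("0.8")  # TRUE: components compared as integers (10 > 8)
"0.10" > "0.8"                                    # FALSE: character comparison
0.10 > 0.8                                        # FALSE: decimal comparison

# Current format version string, as named on this page.
current_format <- numeric_version("1.5.0")

# Hypothetical helper: decide how to treat an object with a given format version.
check_format_version <- function(found) {
  found <- numeric_version(found)
  if (found == current_format) {
    "accept"
  } else if (found < current_format) {
    "upgrade"  # older format: upgrade, or ask the user to upgrade
  } else {
    stop(
      "Graph format ", as.character(found), " is newer than the supported format ",
      as.character(current_format), "; please update the package.",
      call. = FALSE
    )
  }
}

check_format_version("0.8")
```

With component-wise comparison, 0.10 sorts after 0.8, which is the interpretation a dotted version string usually intends. Integer versions would avoid the ambiguity altogether, at the cost of no longer matching package release numbers.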
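
The in-place upgrade mentioned under the backward-compatibility requirement relies on R's reference semantics for environments. The base-R sketch below illustrates only that language behaviour. The list `obj`, its element names, and the functions `upgrade_in_place()` and `modify_copy()` are hypothetical and are not part of igraph.

```r
# A list that carries an environment, loosely mirroring slot 10 of the layout.
obj <- list(note = "old", env = new.env(parent = emptyenv()))

# Hypothetical in-place upgrade: writes into the environment, returns nothing.
upgrade_in_place <- function(x) {
  assign("igraph", "placeholder for the external pointer", envir = x$env)
  invisible(NULL)
}

# For contrast: changing an ordinary list element only affects the local copy.
modify_copy <- function(x) {
  x$note <- "new"
  invisible(NULL)
}

upgrade_in_place(obj)
modify_copy(obj)

exists("igraph", envir = obj$env, inherits = FALSE)  # TRUE: the caller's object sees the new entry
obj$note                                             # "old": the list itself was not changed
```

Because the environment is shared rather than copied, a function can add the new entry without the caller having to reassign the object, which is what makes an upgrade without replacing the containing list possible.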