jkpr/OdkGraph

ODK graph

Description

Analyze your XlsForms as directed graphs. Survey elements, such as select_one ..., calculate, or note, become nodes in the graph. In other words, the nodes are the individual XlsForm questions (rows in an XlsForm), and the edges are dependencies between questions. If question B depends on question A being answered a specific way (e.g. through ${...} in the relevant column), then an edge points from A to B. A dependency also exists when the label of question D displays the value of survey element C. In that case an edge points from C to D.
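To make the edge rule concrete, here is a minimal sketch that pulls ${...} references out of a few survey rows and lists the resulting edges. It is not how OdkGraph itself is implemented: the row data, the `find_edges` helper, and the choice to scan only the relevant and label columns are assumptions made for illustration.

```python
import re

# Hypothetical survey rows: (name, relevant, label).
ROWS = [
    ("age", "", "How old are you?"),
    ("school", "${age} >= 5", "Do you attend school?"),
    ("summary", "", "You said you are ${age} years old."),
]

# Matches ${name} and captures the referenced element name.
REF = re.compile(r"\$\{([^}]+)\}")


def find_edges(rows):
    """Return sorted (source, target) pairs, where target references source."""
    edges = set()
    for name, relevant, label in rows:
        for text in (relevant, label):
            for ref in REF.findall(text):
                # The edge points from the referenced element to the row using it.
                edges.add((ref, name))
    return sorted(edges)


EDGES = find_edges(ROWS)
print(EDGES)
```

Both 'school' (through its relevant column) and 'summary' (through its label) end up with an edge coming from 'age'.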

Installation

All package dependencies, networkx and xlrd, are on PyPI. To install, a single pip call on the command line suffices:

python3 -m pip install https://github.com/jkpr/OdkGraph/zipball/master

Usage

First, make sure the ODK XlsForm converts cleanly to XML.

Import the OdkGraph class with

from odkgraph import OdkGraph

Next, create an OdkGraph object. The __init__ method accepts a path to the file:

odk_graph = OdkGraph('/path/to/odk/xlsform.xlsx')

Access XlsForm nodes

Nodes can be accessed in several ways:

odk_graph['age']        # Get the ODK survey element (node) named 'age'
odk_graph[0]            # Zero-indexed node access. This example returns the first node
odk_graph.excel_row(2)  # Return the ODK survey element from row 2 in the Excel file

Slicing is also supported.

Learn stuff

Now that we have an OdkGraph object, here are some useful things it can report:

odk_graph.number_edges()            # The number of edges (dependencies)
odk_graph.number_nodes()            # The number of nodes (survey elements)
odk_graph.forward_dependencies()    # The ODK elements that depend on things that are defined after them in the XlsForm
odk_graph.terminal_nodes()          # The ODK elements that depend on other elements, but nothing depends on them
odk_graph.isolates()                # The ODK elements that depend on nothing else, and nothing depends on them
odk_graph.simple_cycles()           # A list of cyclical dependencies
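The categories above can be illustrated on a toy graph without OdkGraph at all. The sketch below is only an assumption about what the terms mean, based on the comments above. The element names, the `summarize` helper, and its return format are hypothetical and may not match what the real methods return.

```python
# Hypothetical form: element names in XlsForm row order.
ORDER = ["consent", "age", "school", "gps", "summary"]

# Edges point from a dependency to the element that depends on it.
EDGES = [("age", "school"), ("summary", "consent"), ("age", "summary")]


def summarize(order, edges):
    """Classify elements as forward-dependent, terminal, or isolated."""
    position = {name: i for i, name in enumerate(order)}
    has_incoming = {target for _, target in edges}
    has_outgoing = {source for source, _ in edges}
    # Forward dependency: the element depends on something defined later.
    forward = [t for s, t in edges if position[s] > position[t]]
    # Terminal: depends on something, but nothing depends on it.
    terminal = [n for n in order if n in has_incoming and n not in has_outgoing]
    # Isolate: takes part in no dependency at all.
    isolates = [n for n in order if n not in has_incoming and n not in has_outgoing]
    return {"forward": forward, "terminal": terminal, "isolates": isolates}


RESULT = summarize(ORDER, EDGES)
print(RESULT)
```

Here 'consent' is a forward dependency because it references 'summary', which appears later in the form. 'gps' is an isolate because no edge touches it.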

With node(s) in hand, we can do

age = odk_graph['age']
odk_graph.predecessors(age)             # All nodes that 'age' directly depends on
odk_graph.successors(age)               # All nodes that directly depend on 'age'
odk_graph.all_dependencies_of([age])    # All nodes that 'age' directly or indirectly depends on
odk_graph.all_nodes_dependent_on([age]) # All nodes that directly or indirectly depend on 'age'
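The difference between direct and indirect dependencies comes down to following one edge versus following edges transitively. The sketch below shows that idea with a breadth-first search over a hypothetical edge list. The `reachable` helper and the element names are assumptions for illustration, not part of OdkGraph.

```python
from collections import deque

# Hypothetical chain: dob -> age -> adult -> consent_note.
EDGES = [("dob", "age"), ("age", "adult"), ("adult", "consent_note")]


def reachable(start, edges, reverse=False):
    """Return all nodes reachable from start; reverse=True walks edges backwards."""
    neighbors = {}
    for source, target in edges:
        if reverse:
            source, target = target, source
        neighbors.setdefault(source, []).append(target)
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Everything that directly or indirectly depends on 'age'.
DEPENDENTS = reachable("age", EDGES)
# Everything 'adult' directly or indirectly depends on.
DEPENDENCIES = reachable("adult", EDGES, reverse=True)
print(sorted(DEPENDENTS), sorted(DEPENDENCIES))
```

Walking edges forward corresponds to the "dependent on" direction, and walking them backwards corresponds to the "dependencies of" direction.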

The underlying networkx network (see the networkx documentation) can be accessed with

odk_graph.network

See all methods and attributes on OdkGraph and their docstrings with

help(OdkGraph)

or by reading the source code.

Bugs

Submit bug reports to James K. Pringle at jpringleBEAR@jhu.edu minus the bear.
