diff --git a/README.md b/README.md index 15bc5e38..816575e7 100644 --- a/README.md +++ b/README.md @@ -5,15 +5,17 @@ # `netrd`: A library for network {reconstruction, distances, dynamics} -NOTE: This library is pre-alpha. **Use at your own risk.** - This library provides a consistent, NetworkX-based interface to various -utilities for graph distances, graph reconstruction from time series data, -and simulated dynamics on networks. For the API reference visit -[this link](https://netrd.readthedocs.io/en/latest/). +utilities for graph distances, graph reconstruction from time series data, and +simulated dynamics on networks. + +Some resources that may be of interest: -To see the library in action, visit the [netrd -explorer](https://netrdexplorer.herokuapp.com/). +* An interactive demonstration: [netrd + explorer](https://netrdexplorer.herokuapp.com) +* A [tutorial](https://netrd.readthedocs.io/en/latest/tutorial.html) on how to use the library +* The API [reference](https://netrd.readthedocs.io/en/latest/) +* A [notebook](https://nbviewer.jupyter.org/github/netsiphd/netrd/blob/master/notebooks/00%20-%20netrd_introduction.ipynb) showing advanced usage # Installation @@ -30,108 +32,52 @@ has dependencies on Cython and [POT](https://github.com/rflamary/POT). ## Reconstructing a graph -All reconstruction algorithms provide a simple interface. First, initialize the -reconstructor object by calling its constructor with no arguments. Then, use the -`fit()` method to obtain the reconstructed network. - -```python -TS = np.loadtxt('data/synth_4clique_N64_simple.csv', - delimiter=',', - encoding='utf8') -# TS is a NumPy array of shape N (number of nodes) x L (observations). +The basic usage of a graph reconstruction algorithm is as follows: -recon = netrd.reconstruction.RandomReconstructor() -G = recon.fit(TS) ``` - -Many reconstruction algorithms store additional metadata in a `results` -dictionary.
- -```python -# Another way to obtain the reconstructed graph -G = recon.results['graph'] - -# A dense matrix of weights -W = recon.results['weights_matrix'] - -# The binarized matrix from which the graph is created -A = recon.results['thresholded_matrix'] +>>> reconstructor = ReconstructionAlgorithm() +>>> G = reconstructor.fit(TS, ) +>>> # or alternately, G = reconstructor.results['graph'] ``` -Many, though not all, reconstruction algorithms work by assigning each potential -edge a weight and then thresholding the matrix to obtain a sparse -representation. This thresholding can be controlled by setting the -`threshold_type` argument to one of four values: - -* `range`: Consider only weights whose values fall within a range. -* `degree`: Consider only the largest weights, targeting a specific average - degree. -* `quantile`: Consider only weights in, e.g., the 0.90 quantile and above. -* `custom`: Pass a custom function for thresholding the matrix yourself. - -Each of these has a specific argument to pass to tune the thresholding: - -* `cutoffs`: A list of 2-tuples specifying the values to keep. For example, to - keep only values whose absolute values are above 0.5, use `cutoffs=[(-np.inf, - -0.5), (0.5, np.inf)]` -* `avg_k`: The desired average degree of the network. -* `quantile`: The appropriate quantile (not percentile). -* `custom_thresholder`: A user-defined function that returns an N x N NumPy - array. - -```python -H = recon.fit(TS, threshold_type='degree', avg_k = 15.125) +Here, `TS` is an N x L numpy array consisting of L +observations for each of N sensors. This constrains the graphs +to have integer-valued nodes. - -print(nx.info(G)) -# This network is a complete graph. - -print(nx.info(H)) -# This network is not. -``` +The `results` dict object, in addition to containing the graph +object, may also contain objects created as a side effect of +reconstructing the network, which may be useful for debugging or +considering goodness of fit. 
What is returned will vary between +reconstruction algorithms. ## Distances between graphs -Distances behave similarly to reconstructors. All distance objects have a -`dist()` method that takes two NetworkX graphs. +The basic usage of a distance algorithm is as follows: -```python -G1 = nx.fast_gnp_random_graph(1000, 0.1) -G2 = nx.fast_gnp_random_graph(1000, 0.1) - -dist = netrd.distance.NetSimile() -D = dist.dist(G1, G2) ``` - -Some distances also store metadata in `results` dictionaries. - -```python -# Another way to get the distance -D = dist.results['dist'] - -# The underlying features used in NetSimile -vecs = dist.results['signature_vectors'] +>>> dist_obj = DistanceAlgorithm() +>>> distance = dist_obj.dist(G1, G2, ) +>>> # or alternatively: distance = dist_obj.results['dist'] ``` +Here, `G1` and `G2` are `nx.Graph` objects (or subclasses such as +`nx.DiGraph`). The results dictionary holds the distance value, as +well as any other values that were computed as a side effect. + ## Dynamics on graphs -As a utility, we also implement various ways to simulate dynamics on a network. -These have a similar interface to reconstructors and distances. Their -`simulate()` method takes an input graph and the desired length of the dynamics, -returning the same N x L array used in the graph reconstruction methods. +The basic usage of a dynamics algorithm is as follows: -```python -model = netrd.dynamics.VoterModel() -TS = model.simulate(G, 1000, noise=.001) +``` +>>> ground_truth = nx.read_edgelist("ground_truth.txt") +>>> dynamics_model = Dynamics() +>>> synthetic_TS = dynamics_model.simulate(ground_truth, ) +>>> # G = Reconstructor().fit(synthetic_TS) +``` -# Another way to get the dynamics -TS = model.results['TS'] +This produces a numpy array of time series data. -# The original graph is stored in results -H = model.results['ground_truth'] -``` # Contributing -Contributing guidelines can be found in -[CONTRIBUTING.md](CONTRIBUTING.md). 
+Contributing guidelines can be found in [CONTRIBUTING.md](CONTRIBUTING.md). diff --git a/doc/source/index.rst b/doc/source/index.rst index ab650184..02feb9da 100644 --- a/doc/source/index.rst +++ b/doc/source/index.rst @@ -1,24 +1,44 @@ -.. netrd documentation master file, created by - sphinx-quickstart on Mon May 20 17:29:14 2019. - You can adapt this file completely to your liking, but it should at least - contain the root `toctree` directive. +``netrd``: A library for network {reconstruction, distances, dynamics} +====================================================================== -netrd -===== +This library provides a consistent, NetworkX-based interface to various +utilities for graph distances, graph reconstruction from time series +data, and simulated dynamics on networks. -`netrd` stands for Network Reconstruction and Distances. It is a repository -of different algorithms for constructing a network from time series data, -as well as for comparing two networks. It is the product of the Network -Science Insitute 2019 Collabathon. +To see the library in action, visit the `netrd +explorer `__. -To see the library in action, visit the `netrd explorer`_. +Installation +============ -.. _netrd explorer: https://netrdexplorer.herokuapp.com/ +:: + + git clone https://github.com/netsiphd/netrd + cd netrd + pip install . + +Aside from NetworkX and the Python scientific computing stack, this +library also has dependencies on Cython and +`POT `__. + +Tutorial +======== + +A tutorial on using the library can be found `here `__. To see +more advanced usage of the library, refer to `this +notebook `__. + +Contributing +============ + +Contributing guidelines can be found in +`CONTRIBUTING.md `__. .. 
toctree:: :maxdepth: 1 - :caption: API Reference + :caption: Contents + tutorial dynamics distance reconstruction diff --git a/doc/source/tutorial.rst b/doc/source/tutorial.rst new file mode 100644 index 00000000..35ff4da5 --- /dev/null +++ b/doc/source/tutorial.rst @@ -0,0 +1,112 @@ +Tutorial +======== + +Reconstructing a graph +---------------------- + +All reconstruction algorithms provide a simple interface. First, +initialize the reconstructor object by calling its constructor with no +arguments. Then, use the ``fit()`` method to obtain the reconstructed +network. + +.. code:: python + + TS = np.loadtxt('data/synth_4clique_N64_simple.csv', + delimiter=',', + encoding='utf8') + # TS is a NumPy array of shape N (number of nodes) x L (observations). + + recon = netrd.reconstruction.RandomReconstructor() + G = recon.fit(TS) + +Many reconstruction algorithms store additional metadata in a +``results`` dictionary. + +.. code:: python + + # Another way to obtain the reconstructed graph + G = recon.results['graph'] + + # A dense matrix of weights + W = recon.results['weights_matrix'] + + # The binarized matrix from which the graph is created + A = recon.results['thresholded_matrix'] + +Many, though not all, reconstruction algorithms work by assigning each +potential edge a weight and then thresholding the matrix to obtain a +sparse representation. This thresholding can be controlled by setting +the ``threshold_type`` argument to one of four values: + +- ``range``: Consider only weights whose values fall within a range. +- ``degree``: Consider only the largest weights, targeting a specific + average degree. +- ``quantile``: Consider only weights in, e.g., the 0.90 quantile and + above. +- ``custom``: Pass a custom function for thresholding the matrix + yourself. + +Each of these has a specific argument to pass to tune the thresholding: + +- ``cutoffs``: A list of 2-tuples specifying the values to keep. 
For + example, to keep only values whose absolute values are above 0.5, use + ``cutoffs=[(-np.inf, -0.5), (0.5, np.inf)]`` +- ``avg_k``: The desired average degree of the network. +- ``quantile``: The appropriate quantile (not percentile). +- ``custom_thresholder``: A user-defined function that returns an N x N + NumPy array. + +.. code:: python + + H = recon.fit(TS, threshold_type='degree', avg_k = 15.125) + + print(nx.info(G)) + # This network is a complete graph. + + print(nx.info(H)) + # This network is not. + +Distances between graphs +------------------------ + +Distances behave similarly to reconstructors. All distance objects have +a ``dist()`` method that takes two NetworkX graphs. + +.. code:: python + + G1 = nx.fast_gnp_random_graph(1000, 0.1) + G2 = nx.fast_gnp_random_graph(1000, 0.1) + + dist = netrd.distance.NetSimile() + D = dist.dist(G1, G2) + +Some distances also store metadata in ``results`` dictionaries. + +.. code:: python + + # Another way to get the distance + D = dist.results['dist'] + + # The underlying features used in NetSimile + vecs = dist.results['signature_vectors'] + +Dynamics on graphs +------------------ + +As a utility, we also implement various ways to simulate dynamics on a +network. These have a similar interface to reconstructors and distances. +Their ``simulate()`` method takes an input graph and the desired length +of the dynamics, returning the same N x L array used in the graph +reconstruction methods. + +.. code:: python + + model = netrd.dynamics.VoterModel() + TS = model.simulate(G, 1000, noise=.001) + + # Another way to get the dynamics + TS = model.results['TS'] + + # The original graph is stored in results + H = model.results['ground_truth'] +
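To make the N x L output shape described in the dynamics section concrete, here is a minimal, standard-library-only sketch of a voter-style simulation. It is an illustration under stated assumptions, not netrd's `VoterModel`: the function name `simulate_voter_model`, the adjacency-dict input, the synchronous update rule, and the handling of `noise` are all hypothetical choices made for this example.

```python
import random


def simulate_voter_model(adjacency, length, noise=0.0, seed=None):
    """Toy voter dynamics returning an N x L list of lists of 0/1 states.

    `adjacency` maps each node index 0..N-1 to a list of neighbor indices.
    Hypothetical sketch only; this is NOT netrd's VoterModel implementation.
    """
    rng = random.Random(seed)
    n = len(adjacency)

    # Each node starts with a random binary opinion.
    state = [rng.choice((0, 1)) for _ in range(n)]
    # Row i of `series` is the time series observed at node i.
    series = [[s] for s in state]

    for _ in range(length - 1):
        new_state = list(state)
        for node in range(n):
            neighbors = adjacency[node]
            if rng.random() < noise:
                # Assumed noise rule: occasionally pick an opinion at random.
                new_state[node] = rng.choice((0, 1))
            elif neighbors:
                # Otherwise copy the current opinion of a random neighbor.
                new_state[node] = state[rng.choice(neighbors)]
            # Isolated nodes without noise keep their opinion.
        state = new_state
        for node in range(n):
            series[node].append(state[node])

    return series


# A ring of 6 nodes observed for 10 time steps.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
TS = simulate_voter_model(ring, 10, noise=0.01, seed=0)
print(len(TS), len(TS[0]))  # 6 10
```

Each row of the result is one node's time series, matching the N (nodes) x L (observations) layout that the reconstruction section expects as input. A nested list is used here only to avoid dependencies; the library itself is described as working with NumPy arrays.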