This repository has been archived by the owner on Jun 5, 2024. It is now read-only.

Commit

add examples
sappelhoff committed Apr 18, 2018
1 parent f77592c commit a2f463b
Showing 3 changed files with 82 additions and 0 deletions.
Binary file added docs/source/example.png
68 changes: 68 additions & 0 deletions docs/source/examples.rst
@@ -0,0 +1,68 @@
Examples
========
A straightforward example:

.. code-block:: python

    import matplotlib.pyplot as plt
    import numpy as np
    from remedian.remedian import Remedian

    # We can have data of any shape, e.g., 3D:
    data_shape = (2, 3, 4)

    # Decide how many observations to hold in memory at a time before
    # computing a first intermediate median from them
    n_obs = 100

    # Pick some example number: assume we have `t` arrays of shape
    # `data_shape` that we want to summarize with Remedian
    t = 500

    # Initialize the object
    r = Remedian(data_shape, n_obs, t)

    # Feed it the data. For now, the data are generated randomly on the fly
    # and also stored for comparison with the true median
    res = []
    for obs_i in range(t):
        obs = np.random.random(data_shape)
        r.add_obs(obs)
        res.append(obs)

    # Now we have the Remedian in `r.remedian`
    # Let's summarize the results
    x = np.median(np.asarray(res).squeeze(), axis=0)
    y = r.remedian
    xydiff = x - y

    # For colorbar scaling: symmetric limits around zero, shared by all panels
    vmin = np.min([x.min(), y.min(), xydiff.min()])
    vmax = np.max([x.max(), y.max(), xydiff.max()])
    vlim = np.max(np.abs([vmin, vmax]))
    vmin, vmax = -vlim, vlim

    # Plot it
    plt.close('all')

    plt.subplot(131)
    plt.imshow(x.reshape(1, -1), aspect='auto', cmap='bwr', vmin=vmin, vmax=vmax)
    plt.axis('off')
    plt.title('True median')

    plt.subplot(132)
    plt.imshow(y.reshape(1, -1), aspect='auto', cmap='bwr', vmin=vmin, vmax=vmax)
    plt.axis('off')
    plt.title('Remedian')

    plt.subplot(133)
    plt.imshow(xydiff.reshape(1, -1), aspect='auto', cmap='bwr', vmin=vmin, vmax=vmax)
    plt.axis('off')
    plt.colorbar()
    plt.title('Difference')

    plt.show()

.. image:: example.png
    :scale: 100 %
    :alt: Plot comparing true median and Remedian
    :align: center
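
The choice of ``n_obs`` in the example above determines how much data is held in memory. The following is a rough sketch of that trade-off, assuming the buffer scheme described by Rousseeuw and Bassett (1990): ``k`` buffers of ``n_obs`` entries each, with ``k`` the smallest integer such that ``n_obs ** k >= t``. The helper ``levels_needed`` is hypothetical and not part of the ``remedian`` package; whether the package allocates its buffers in exactly this way is an assumption.

```python
def levels_needed(n_total, buffer_size):
    """Return the smallest k such that buffer_size ** k >= n_total.

    Hypothetical helper for illustration; not part of the remedian package.
    """
    if buffer_size < 2 or n_total < 1:
        raise ValueError("need buffer_size >= 2 and n_total >= 1")
    k = 1
    capacity = buffer_size
    # Integer arithmetic avoids rounding issues with floating point logarithms
    while capacity < n_total:
        capacity *= buffer_size
        k += 1
    return k


t = 500      # total number of observations, as in the example above
n_obs = 100  # observations per buffer, as in the example above

k = levels_needed(t, n_obs)
held = k * n_obs  # upper bound on observations kept in memory at once

print(f"{k} buffer levels, at most {held} of {t} observations in memory")
```

Under these assumptions, a smaller ``n_obs`` lowers the memory needed per buffer but adds levels of intermediate medians.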
14 changes: 14 additions & 0 deletions docs/source/index.rst
@@ -6,10 +6,24 @@
Welcome to remedian's documentation!
====================================

The Remedian: A Robust Averaging Method for Large Data Sets

This algorithm approximates the median of a data set that arrives in several chunks when those chunks cannot (or should not) all be loaded into memory at once.
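
The idea can be sketched in plain Python for scalar values. This is a simplified illustration, not the implementation in this package: it assumes the number of values is exactly ``base ** exponent``, and the function name ``remedian`` and its parameters are chosen here for illustration only. Each buffer holds ``base`` values; when a buffer fills up, its median is passed to the next buffer and the buffer is emptied.

```python
import random
import statistics


def remedian(values, base, exponent):
    """Approximate the median of exactly base ** exponent values.

    Simplified sketch: only `exponent` buffers of `base` values each are
    kept in memory, never the full data set.
    """
    buffers = [[] for _ in range(exponent)]
    count = 0
    for value in values:
        count += 1
        buffers[0].append(value)
        level = 0
        # When a buffer is full, pass its median one level up and empty it
        while level < exponent - 1 and len(buffers[level]) == base:
            buffers[level + 1].append(statistics.median(buffers[level]))
            buffers[level].clear()
            level += 1
    if count != base ** exponent:
        raise ValueError("this sketch needs exactly base ** exponent values")
    # The median of the last (full) buffer is the estimate
    return statistics.median(buffers[-1])


random.seed(0)
data = [random.random() for _ in range(11 ** 2)]
print("remedian:", remedian(data, base=11, exponent=2))
print("true median:", statistics.median(data))
```

The estimate is a median of medians, so it generally differs somewhat from the true median; in exchange, only ``base * exponent`` values are stored at any time.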


References
==========

1. P.J. Rousseeuw, G.W. Bassett Jr., "The remedian: A robust averaging method for large data sets", Journal of the American Statistical Association, vol. 85 (1990), pp. 97-104
2. M. Chao, G. Lin, "The asymptotic distributions of the remedians", Journal of Statistical Planning and Inference, vol. 37 (1993), pp. 1-11
3. D. Cantone, M. Hofri, "Further analysis of the remedian algorithm", Theoretical Computer Science, vol. 495 (2013), pp. 1-16


.. toctree::
   :maxdepth: 2
   :caption: Contents:

   examples


Indices and tables
