stream - Infrastructure for Data Stream Mining - R package

The package provides support for modeling and simulating data streams as well as an extensible framework for implementing, interfacing and experimenting with algorithms for various data stream mining tasks. The main advantage of stream is that it seamlessly integrates with the large existing infrastructure provided by R. The package currently focuses on data stream clustering and provides implementations of BICO, BIRCH, D-Stream, DBSTREAM, and evoStream.

Additional packages in the stream family are:

  • streamMOA: Interface to clustering algorithms implemented in the MOA framework. Includes implementations of DenStream, ClusTree and CluStream.
  • subspaceMOA: Interface to Subspace MOA and its implementations of HDDStream and PreDeConStream.

The development of the stream package was supported in part by NSF IIS-0948893 and NIH R21HG005912.

Installation

Stable CRAN version: install from within R with

install.packages("stream")

Current development version: Download package from AppVeyor or install from GitHub (needs devtools).

devtools::install_github("mhahsler/stream")

Usage

Load the package and create micro-clusters via sampling.

library("stream")
stream <- DSD_Gaussians(k=3, noise=0)

sample <- DSC_Sample(k=20)
update(sample, stream, 500)
sample
## Reservoir sampling
## Class: DSC_Sample, DSC_Micro, DSC_R, DSC 
## Number of micro-clusters: 20 
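
To illustrate the idea behind sampling-based micro-clusters, here is a minimal base R sketch of reservoir sampling (Algorithm R). It is not the package's implementation: `reservoir_sample` is a hypothetical helper, and the simulated points and cluster centers are assumptions standing in for a real data stream.

```r
# Minimal reservoir sampling (Algorithm R) in base R.
# `reservoir_sample` is a hypothetical helper, not part of the stream package.
reservoir_sample <- function(points, k) {
  n <- nrow(points)
  stopifnot(k >= 1, n >= k)
  # Start with the first k points in the reservoir.
  idx <- seq_len(k)
  # Each later point i replaces a random reservoir slot with probability k / i.
  if (n > k) {
    for (i in (k + 1):n) {
      j <- sample.int(i, 1)
      if (j <= k) idx[j] <- i
    }
  }
  points[idx, , drop = FALSE]
}

set.seed(42)
# Simulated stream (assumed data): 500 two-dimensional points
# drawn around three cluster centers.
centers <- matrix(c(0.2, 0.2, 0.5, 0.8, 0.8, 0.3), ncol = 2, byrow = TRUE)
labels <- sample.int(3, 500, replace = TRUE)
points <- centers[labels, ] + matrix(rnorm(1000, sd = 0.05), ncol = 2)

# Keep a uniform sample of 20 points as micro-cluster centers.
micro <- reservoir_sample(points, k = 20)
dim(micro)
```

Every point seen so far has the same chance of being in the reservoir, so the sample stays representative however long the stream runs, while memory stays fixed at `k` points.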

Recluster the micro-clusters using k-means and plot the results.

kmeans <- DSC_Kmeans(k=3)
recluster(kmeans, sample)
plot(kmeans, stream, type="both")
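
Conceptually, reclustering treats the micro-cluster centers as ordinary data points and groups them into macro-clusters. The sketch below shows that step with `stats::kmeans` from base R on made-up centers. It is only an approximation of the idea: it ignores any micro-cluster weights, and the `true_centers` and `micro` values are assumptions made for illustration.

```r
set.seed(1)
# Hypothetical micro-cluster centers: 20 points scattered around three locations.
true_centers <- matrix(c(0.2, 0.2, 0.5, 0.8, 0.8, 0.3), ncol = 2, byrow = TRUE)
micro <- true_centers[rep(1:3, length.out = 20), ] +
  matrix(rnorm(40, sd = 0.03), ncol = 2)

# Group the micro-clusters into k = 3 macro-clusters.
# nstart = 10 reruns k-means from several random starts and keeps the best fit.
macro <- stats::kmeans(micro, centers = 3, nstart = 10)

macro$centers   # one row per macro-cluster center
macro$cluster   # macro-cluster assignment of each micro-cluster
```

Because only the small set of micro-clusters is clustered, this offline step stays cheap even when the original stream is very long.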

A list of all available clustering methods can be obtained with

DSC_registry$get_entries()

References