wcy1984123/DCDMCS

### Collective Dynamic Modeling & Clustering System

Collective Dynamical Modeling-Clustering (CDMC) is an algorithmic framework for time series dynamical modeling and clustering using probabilistic state-transition models. In our work, an efficient initialization technique based on Itakura slope-constrained Dynamic Time Warping is applied to CDMC. Semi-Markov chains are used as the dynamical models. Experimental evaluation demonstrates the effectiveness of the proposed approach in providing more realistic dynamical modeling of stage dynamics than Markov models, with improved clustering quality and convergence speed as compared with pseudorandom initialization.
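To make the semi-Markov idea concrete, here is a minimal sketch of how such a model could be estimated from symbolic sequences. It is not the repository's implementation: the function names `run_lengths` and `fit_semi_markov` are hypothetical, durations are modeled with plain empirical histograms, and the truncated final run of each sequence is treated like any other run.

```python
from collections import Counter, defaultdict
from itertools import groupby


def run_lengths(seq):
    """Collapse a symbolic sequence into (state, duration) runs."""
    return [(state, sum(1 for _ in group)) for state, group in groupby(seq)]


def _normalize(counter):
    """Turn raw counts into a probability table."""
    total = sum(counter.values())
    return {key: value / total for key, value in counter.items()}


def fit_semi_markov(sequences):
    """Estimate (transition probabilities, duration distributions).

    Unlike a plain Markov chain, self-transitions are not modeled:
    how long the process stays in a state is described by an explicit
    duration distribution per state (an assumption of this sketch:
    an empirical histogram of observed run lengths).
    """
    transitions = defaultdict(Counter)
    durations = defaultdict(Counter)
    for seq in sequences:
        runs = run_lengths(seq)
        for state, length in runs:
            durations[state][length] += 1
        # Count jumps between consecutive, distinct states.
        for (src, _), (dst, _) in zip(runs, runs[1:]):
            transitions[src][dst] += 1
    return (
        {s: _normalize(c) for s, c in transitions.items()},
        {s: _normalize(c) for s, c in durations.items()},
    )


if __name__ == "__main__":
    trans, dur = fit_semi_markov(["AAABBA", "ABBBA"])
    print(trans)
    print(dur)
```

Separating "where to go next" from "how long to stay" is what lets this kind of model represent dwell times that are not geometric, which a first-order Markov chain cannot do.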

GUIIMAGE

### Distributed Collective Dynamic Modeling & Clustering System

One way to process large-scale datasets of discrete time series is to use a compressed representation of instances based on the quartiles of their duration distributions. This representation serves as a basis for discovering duration-related patterns in symbolic time series, and it reduces the input size of the dataset before the learning procedure. An alternative is to distribute existing algorithms over a distributed system. Distributed environments such as Storm, a distributed and fault-tolerant real-time computation system, have been widely used over the past five years. A systematic way of distributing the proposed method is important for extending its capability to process large-scale time series datasets. The following shows the deployment of CDMC on the Storm platform.
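As a brief aside before the deployment figure, the quartile-based compression mentioned at the start of this section could look roughly like the sketch below. This is one plausible reading of the idea, not the authors' code: the function name `duration_quartiles` is hypothetical, and each instance is summarized by the three quartiles of the run lengths of every symbol.

```python
from collections import defaultdict
from itertools import groupby
from statistics import quantiles


def duration_quartiles(seq):
    """Summarize a symbolic sequence by per-symbol duration quartiles.

    Returns {symbol: (Q1, Q2, Q3)}, where the quartiles are taken over
    the lengths of all uninterrupted runs of that symbol.
    """
    runs = defaultdict(list)
    for symbol, group in groupby(seq):
        runs[symbol].append(sum(1 for _ in group))

    summary = {}
    for symbol, lengths in runs.items():
        if len(lengths) == 1:
            # A single run has no spread; repeat its length.
            q = (float(lengths[0]),) * 3
        else:
            q = tuple(quantiles(lengths, n=4, method="inclusive"))
        summary[symbol] = q
    return summary


if __name__ == "__main__":
    print(duration_quartiles("AAABBABBBBAA"))
```

Whatever the length of the original sequence, the summary holds at most three numbers per symbol, which is where the reduction in input size comes from.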

DCDMCS

In Fig. 1, there are three components (the grey, orange, and green parts). The grey component is the original CDMC framework in Algorithm 1; the orange component indicates the work we focus on to improve the CDMC algorithm; the green component is an option for handling large-scale time series in a distributed system. The orange component consists of two parts, namely “Initial Cluster Assignment” and “Dynamical Model”. “Initial Cluster Assignment” aims to provide better data partitions than random initialization before the iterative clustering and modeling steps in CDMC, for example through DTW-based hierarchical clustering; “Dynamical Model” focuses on building state-based models, such as semi-Markov models, that capture the dynamics of events with few occurrences. The green component, with its “Distributed Deployment Architecture” module, tackles the scalability problem of applying CDMC to big datasets. Note that the rectangular area enclosed by a dashed line in Fig. 1 contains the central components of the efficient CDMC framework under the distributed architecture.
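The initial cluster assignment step can be sketched as follows. This is an illustrative simplification rather than the project's code: it uses plain, unconstrained DTW (not the Itakura slope-constrained variant used in this work), average-linkage agglomerative clustering, numeric series with an absolute-difference cost, and the hypothetical function names `dtw` and `initial_assignment`.

```python
def dtw(a, b):
    """Unconstrained DTW distance between two numeric series."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def initial_assignment(series, k):
    """Merge the closest clusters (average linkage) until k remain."""
    n = len(series)
    dist = {(i, j): dtw(series[i], series[j])
            for i in range(n) for j in range(i + 1, n)}

    def linkage(c1, c2):
        total = sum(dist[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        p, q = min(pairs, key=lambda pq: linkage(clusters[pq[0]],
                                                 clusters[pq[1]]))
        clusters[p] += clusters[q]
        del clusters[q]

    # Convert the list of clusters into one label per series.
    labels = [0] * n
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


if __name__ == "__main__":
    data = [[0, 0, 1, 1], [0, 1, 1], [5, 5, 6], [5, 6, 6, 6]]
    print(initial_assignment(data, k=2))
```

For symbolic series, the absolute-difference cost would need to be replaced, for example by a 0/1 mismatch cost. The resulting labels would then seed the iterative clustering and modeling steps in place of a random partition.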

DCDMCS

Figure 2: Distributed CDMC framework. After each clustering step finishes, CDMC determines whether two consecutive clustering results are similar enough to each other. This is the point at which the given data are grouped into data cluster bolts in parallel. Each data cluster bolt serves as input to the corresponding local modeling bolt. The results from the local modeling bolts go to the clustering bolt, which emits the current cluster labels back to the data source bolt if the clustering has not converged. The cycle repeats until convergence.
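The cycle in Figure 2 can be mimicked in a single process to show the control flow. The sketch below is not the Storm topology itself and makes several assumptions: first-order Markov chains with add-one smoothing stand in for the semi-Markov models, convergence is taken to mean that the labels no longer change (the simplest form of "similar enough"), and the names `fit_markov`, `log_likelihood`, and `cdmc` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit_markov(seqs, alphabet, alpha=1.0):
    """Local modeling step: one smoothed Markov chain per cluster."""
    counts = defaultdict(Counter)
    for seq in seqs:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    model = {}
    for a in alphabet:
        total = sum(counts[a].values()) + alpha * len(alphabet)
        model[a] = {b: (counts[a][b] + alpha) / total for b in alphabet}
    return model


def log_likelihood(seq, model):
    """Log-probability of the transitions in seq under a model."""
    return sum(math.log(model[a][b]) for a, b in zip(seq, seq[1:]))


def cdmc(sequences, labels, max_iter=50):
    """Alternate modeling and clustering until the labels stop changing."""
    alphabet = sorted({symbol for seq in sequences for symbol in seq})
    k = max(labels) + 1
    models = []
    for _ in range(max_iter):
        # "Data cluster" + "local modeling": fit one model per cluster.
        models = [
            fit_markov([s for s, l in zip(sequences, labels) if l == c],
                       alphabet)
            for c in range(k)
        ]
        # "Clustering": reassign each sequence to its best-fitting model.
        new_labels = [
            max(range(k), key=lambda c: log_likelihood(seq, models[c]))
            for seq in sequences
        ]
        if new_labels == labels:  # converged: nothing is fed back
            break
        labels = new_labels  # otherwise feed labels back and repeat
    return labels, models


if __name__ == "__main__":
    data = ["ABABABAB", "BABABABA", "AAAABBBB", "AAABBBBB"]
    final_labels, _ = cdmc(data, labels=[0, 1, 0, 1])
    print(final_labels)
```

In the distributed version, the per-cluster model fitting is the part that runs in parallel, since each cluster's model depends only on the sequences currently assigned to it.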

See Link: http://users.wpi.edu/~wangchiying/page2.html
