'find_chains' not returning the longest chain in some datasets #33

Closed
PrakritiAilavadi opened this issue Nov 3, 2018 · 5 comments
Labels: bug (Something isn't working)

@PrakritiAilavadi

PrakritiAilavadi commented Nov 3, 2018

**'find_chains' sometimes does not return the longest chain**
I've tested the function on the datasets provided on the official Time Series Chains page (https://sites.google.com/site/timeserieschain/).
It performs well on some of them but fails on others. Among the chain calibration datasets on the official page, the result for 'chain_test_1' is correct, but for the other 3 datasets the function's best chain output is NULL, even though the full set of chains found clearly shows that a longest chain exists.

To Reproduce
Data:

  1. Go to https://sites.google.com/site/timeserieschain/
  2. Under 'Chain Calibration Datasets', click on 'datasets'.

Code:

library(tsmp)

# Each dataset has about 6000-7000 data points; the subsequence length is 127.
# chain_test_1: find_chains() gives accurate results for the best chain.
# chain_test_2, 3 and 4: the best chain output is NULL.
window_size_chaintest <- 127

chain_test_dataset <- read.delim("chain_test_2.txt", header = FALSE)
mp_chain_test <- tsmp(chain_test_dataset$V1, window_size = window_size_chaintest)
chains_of_chain_test <- find_chains(mp_chain_test)
chains_of_chain_test
length_of_best <- length(chains_of_chain_test$chain$best)
length_of_best

# Repeat the same steps for chain_test_3.txt and chain_test_4.txt.

Screenshots
screenshot 2018-11-03 at 11 35 18 pm

screenshot 2018-11-03 at 11 44 33 pm

System used:

  • OS: macOS Mojave
  • R version: not stated (1.0.143 – © 2009-2016 appears to be the RStudio version)
  • 'tsmp' package version: 0.3.1
@franzbischoff
Member

franzbischoff commented Nov 4, 2018 via email

@PrakritiAilavadi
Author

The matrix profile object for the chain_test_1 dataset also does not show the number of observations, yet it gives the correct best chain.

All datasets are already passed in as a single vertical column.

The sample datasets chain_test_1 through chain_test_4 are attached.

Screenshot of chain_test_1 code and results:
screenshot 2018-11-05 at 12 12 11 am

Datasets:
chain_test_1.txt
chain_test_2.txt
chain_test_3.txt
chain_test_4.txt

All of the datasets have 6k-7k observations.

@franzbischoff
Member

franzbischoff commented Nov 4, 2018 via email

@franzbischoff franzbischoff self-assigned this Nov 5, 2018
@franzbischoff franzbischoff added the bug Something isn't working label Nov 5, 2018
@franzbischoff
Member

Bug confirmed. Looking into it.

@franzbischoff
Member

The bug is on line 55:
n <- mean(.mp$rmp[chain_set[[i]]])

One of the values in that subset is Inf, which throws off the mean.
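
To illustrate why a single Inf matters here, the sketch below uses made-up values in place of `.mp$rmp[chain_set[[i]]]`. The helper `finite_mean()` and the selection routine `pick_best()` are hypothetical and are not taken from the package source; they only show one plausible way an infinite mean could lead to a NULL best chain, and one possible workaround (averaging the finite values only). The actual fix applied in 'tsmp' is not shown in this thread and may differ.

```r
# Hypothetical distances standing in for .mp$rmp[chain_set[[i]]]
chain_values <- c(1, 2, Inf, 3)

raw_mean <- mean(chain_values)   # a single Inf makes the whole mean Inf

# Assumed workaround (not the package's confirmed fix): average finite values only
finite_mean <- function(x) {
  x <- x[is.finite(x)]
  if (length(x) == 0) return(Inf)  # nothing usable; treat as the worst score
  mean(x)
}

safe_mean <- finite_mean(chain_values)

# Assumed selection rule: keep a chain only if its score beats the best so far
pick_best <- function(scores) {
  best <- NULL
  best_score <- Inf
  for (i in seq_along(scores)) {
    if (scores[i] < best_score) {
      best <- i
      best_score <- scores[i]
    }
  }
  best
}

pick_best(c(Inf, Inf))        # NULL: no score is strictly below Inf
pick_best(c(Inf, safe_mean))  # 2: the chain with a finite score is selected
```

Under this assumed selection rule, any chain whose mean evaluates to Inf can never be chosen, so if every candidate is affected the result stays NULL, which would match the reported symptom.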
