'find_chains' not returning the longest chain in some datasets #33
Hello,
It seems the Matrix Profile object is not correct either, as it doesn't say
how many observations it has. Try to force data into a vertical matrix.
I'll look into this if you can provide a sample of your data frame that
reproduces the error.
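
For illustration, forcing a data frame column into a vertical (n x 1) matrix can be sketched in base R as below. The small `chain_test_dataset` built here is a hypothetical stand-in for the file read with `read.delim()`, and whether `tsmp()` actually needs this conversion is an assumption, not something confirmed in this thread.

```r
# Hypothetical stand-in for the data frame read from chain_test_2.txt
chain_test_dataset <- data.frame(V1 = c(0.1, 0.4, 0.3, 0.9, 0.5))

# as.matrix() on a plain numeric vector yields a one-column matrix
series <- as.matrix(chain_test_dataset$V1)

# Rows are observations, the single column is the series
dim(series)
```

Checking `dim(series)` before computing the matrix profile is a quick way to confirm the data really is a single column.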
…--
Francisco Bischoff
*"Mate is to the gaúcho what tea is to the English, coca to the
Bolivians, whisky to the Scots, and coffee... to the Brazilians"--
Eduardo Bueno*
On Sat, Nov 3, 2018 at 6:22 PM PrakritiAilavadi ***@***.***> wrote:
** 'find_chains' not returning the longest chain sometimes **
I've tested the function on the datasets from the official Time Series
Chains page (https://sites.google.com/site/timeserieschain/).
It performs well on some datasets but fails on others. Among the chain
calibration datasets on that page, 'chain_test_1' gives the correct
result, but for the other 3 datasets the function's best-chain output
is NULL, even though the full list of chains found clearly shows that
a longest chain exists.
*To Reproduce*
Data:
1. Go to https://sites.google.com/site/timeserieschain/
2. Under 'Chain Calibration Datasets', click on 'datasets'.
Code:
`library(tsmp)
# Each dataset has about 6000-7000 data points; the subsequence length is 127.
# chain_test_1: gives accurate results for the best chain.
# chain_test_2, 3, 4: give the best chain output as NULL.
chain_test_dataset <- read.delim("chain_test_2.txt", header = FALSE)
window_size_chaintest <- 127
mp_chain_test <- tsmp(chain_test_dataset$V1, window_size = window_size_chaintest)
chains_of_chain_test <- find_chains(mp_chain_test)
chains_of_chain_test
length_of_best <- length(chains_of_chain_test$chain$best)
length_of_best
# Similarly for chain_test_3 and chain_test_4.`
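
To check by hand that a longest chain exists among all the chains found, something like the following base R sketch may help. `longest_chain()` is a hypothetical helper, not part of tsmp, and it uses the simplest possible rule (first chain with the maximum length), which may differ from how `find_chains` selects its best chain. The example list is made up for illustration.

```r
# Hypothetical helper (not part of tsmp): return the longest vector in a list
# of chains, or NULL when the list is empty.
longest_chain <- function(chains) {
  if (length(chains) == 0) {
    return(NULL)
  }
  # lengths() gives the size of each chain; which.max() picks the first maximum
  chains[[which.max(lengths(chains))]]
}

# Made-up chains, each a vector of subsequence start indexes
example_chains <- list(c(10, 250), c(5, 130, 260, 390), c(40, 170, 300))
longest_chain(example_chains)
```

Assuming the full list of chains is stored in `chains_of_chain_test$chain$chains`, calling `longest_chain(chains_of_chain_test$chain$chains)` and comparing it with `chains_of_chain_test$chain$best` would show whether the best chain is missing.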
*Screenshots*
[image: screenshot 2018-11-03 at 11 35 18 pm]
<https://user-images.githubusercontent.com/20098293/47955710-e6489a00-dfc1-11e8-946f-ff040ad0e8d2.png>
[image: screenshot 2018-11-03 at 11 44 33 pm]
<https://user-images.githubusercontent.com/20098293/47955749-9fa76f80-dfc2-11e8-806f-1a6063cae85c.png>
*System used:*
- OS: macOS Mojave
- R version: 1.0.143 – © 2009-2016
- 'tsmp' package version: 0.3.1
Thanks.
Have you tried the latest version from GitHub? I see you are not using the
new progress bar.
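
A minimal sketch of installing the development version, assuming the `remotes` package is available and that the repository path matches the issue URL in this thread (franzbischoff/tsmp):

```r
# install.packages("remotes")  # uncomment if remotes is not installed yet

# Install tsmp from the GitHub repository instead of CRAN
remotes::install_github("franzbischoff/tsmp")

# Confirm which version is now installed
packageVersion("tsmp")
```

Restarting the R session after installing helps make sure the new version is the one that gets loaded.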
On Sun, Nov 4, 2018 at 6:44 PM PrakritiAilavadi ***@***.***> wrote:
The matrix profile object for chain_test_1 dataset also does not show the
number of observations but is appropriately giving the best chain.
All datasets are already fed as a vertical matrix/column.
Sample datasets chain_test1,2,3,4 are attached.
Screenshot of chain_test_1 code and results:
[image: screenshot 2018-11-05 at 12 12 11 am]
<https://user-images.githubusercontent.com/20098293/47968403-7c96c180-e08f-11e8-9336-2c71843c900a.png>
Datasets:
chain_test_1.txt
<https://github.com/franzbischoff/tsmp/files/2546147/chain_test_1.txt>
chain_test_2.txt
<https://github.com/franzbischoff/tsmp/files/2546148/chain_test_2.txt>
chain_test_3.txt
<https://github.com/franzbischoff/tsmp/files/2546149/chain_test_3.txt>
chain_test_4.txt
<https://github.com/franzbischoff/tsmp/files/2546150/chain_test_4.txt>
All of the datasets have 6k-7k observations.
Bug confirmed, looking into it.
Bug is in line 55: one of the values is