Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CrossRecurrencePlot matrix gives seemingly random values #128

Closed
emmanuelasso opened this issue Jul 25, 2019 · 13 comments · Fixed by #192
Closed

CrossRecurrencePlot matrix gives seemingly random values #128

emmanuelasso opened this issue Jul 25, 2019 · 13 comments · Fixed by #192
Assignees
Labels
Milestone

Comments

@emmanuelasso
Copy link

I am generating a crossRecurrence plot and every time I run it I get different recurrence matrices. I have not changed the parameters.

crp = CrossRecurrencePlot(x, y, metric='euclidean', tau = 0, dim = 1, threshold=0.05)

example:

first run:
recurrence_matrix: array([[0, 0, 0, ..., 0, 0, 0],
[0, 1, 1, ..., 1, 1, 1],
[0, 1, 1, ..., 1, 1, 1],
...,
[0, 1, 1, ..., 1, 1, 1],
[0, 1, 1, ..., 1, 1, 1],
[0, 1, 1, ..., 1, 1, 1]], dtype=int8)

second run:

recurrence_matrix: array([[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 1, ..., 1, 1, 1],
...,
[0, 0, 1, ..., 1, 1, 1],
[0, 0, 1, ..., 1, 1, 1],
[0, 0, 1, ..., 1, 1, 1]], dtype=int8)

@jkroenke jkroenke added the bug label Oct 7, 2019
@jkroenke jkroenke self-assigned this Oct 7, 2019
ntfrgl added a commit that referenced this issue Apr 8, 2022
- Single-source Cython/C type definitions: `core._ext.types`
- Introduce semantic type aliases: ADJ, NODE, DEGREE, WEIGHT, FIELD, etc.
- Remove deprecated type aliases: `np.{int|float}`
- Replace C type aliases by Numpy type aliases
- Add missing Numpy type casts before Python/Cython boundary
- Disambiguate `timeseries._ext.numerics._twin_surrogates_{s|r}`
  (blame: 098489b & 2f6a5b9)
- Fix some broken output pointers
- Reduce redundant copying
- Vectorize some simple Python loops
- Eliminate some dead code
- Potentially related issues: #128, #141, #142, #145
@CharlotteGeier
Copy link

Hi, did this issue ever get solved?

I'm facing the same problem when generating an InterSystemRecurrenceNetwork: The blocks of the adjacency matrix related to the CrossRecurrencePlot come out differently every time. Accordingly, running only the CrossRecurrencePlot yields different results with every run.

So far, I've tried working with the stable release pyunicorn v0.6.1, the current development version from pik-copan/pyunicorn that includes the 3dab5bf commit referenced above and the current jkroenke/pyunicorn version with the e45da87 commit that refers to this issue. Unfortunately, none of these seem to solve the problem for me. The testCrossRecurrencePlot() returns different distance matrices on each run. Maybe I'm overlooking something?

Any help would be greatly appreciated! Thanks!

@fkuehlein
Copy link
Collaborator

Hi @CharlotteGeier,

thanks for pointing out! I will look into it and get back to you as soon as possible.

@CharlotteGeier
Copy link

Hi @fkuehlein that sounds good, thanks :)

@fkuehlein
Copy link
Collaborator

Hi @CharlotteGeier,

the above commits are both included in the current development version. However, I can confirm that CrossRecurrencePlot is still not doing what it should. Something goes wrong in the calculation of the distance matrix, which results in incorrect output, changing with every re-calculation. As you experienced, this also affects the InterSystemRecurrenceNetwork class.

We are working on this, but cannot guarantee to resolve it soon. Please excuse this inconvenience and thanks again for pointing out the issue!

@CharlotteGeier
Copy link

Hi @fkuehlein,

thank you for looking into the matter!

For now, I've found a workaround (just in case anyone else comes across this issue): I compute the cross-recurrence plot using PyRQA, assemble the inter-system recurrence network manually and use a slightly adapted version of the InterSystemRecurrenceNetwork class to perform the analysis.

This seems to work quite well - but I'd still be interested in a "cleaner" solution.

@fkuehlein
Copy link
Collaborator

Thanks for sharing your workaround! We'll keep you updated about any advances :)

@fkuehlein
Copy link
Collaborator

fkuehlein commented Jul 10, 2023

I just looked into this again, wasn't able to fix it but narrowed it down a bit:

CrossRecurrencePlot.distance_matrix() gives faulty results. The problem must be in the underlying cython functions _manhattan_distance_matrix_crp(), _euclidean_distance_matrix_crp() and _supremum_distance_matrix_crp(), in the corresponding C functions or maybe somewhere at their interface. The problem actually appears for all three possible metrics.

This eventually leads to a faulty CrossRecurrencePlot.recurrence_matrix(), which is also needed in InterSystemRecurrenceNetwork.

A simple example to reproduce the bug:

import numpy as np
import matplotlib.pyplot as plt

from pyunicorn.timeseries import CrossRecurrencePlot

# create example timeseries
x = np.linspace(0,2*np.pi, 20)
sine = np.sin(x)

# calculate CrossRecurrencePlot
crp = CrossRecurrencePlot(x=sine, y=sine, threshold=0.2)

# show faulty distance matrix
crp_dist = crp.distance_matrix(crp.x_embedded, crp.y_embedded, crp.metric)
plt.matshow(np.log10(crp_dist)) #plotting log10 here, in order to make visible how messed up the result really is
plt.colorbar(fraction=0.0455, pad=0.04)

# show faulty recurrence matrix
crp_rec = crp.recurrence_matrix()
plt.matshow(crp_rec)
plt.colorbar(fraction=0.0455, pad=0.04)

Try RecurrencePlot instead to see what the result should look like:

from pyunicorn.timeseries import RecurrencePlot

rp = RecurrencePlot(sine, threshold=0.2)

# show correct distance matrix
rp_dist = rp._distance_matrix
plt.matshow(rp._distance_matrix)
plt.colorbar(fraction=0.0455, pad=0.04)

# show correct recurrence matrix
rp_rec = rp.recurrence_matrix()
plt.matshow(rp_rec)
plt.colorbar(fraction=0.0455, pad=0.04)

@ntfrgl, could you help me out on this?

Some further notes:

  • The distance matrix is currently being printed after calculation in CrossRecurrencePlot.set_fixed_threshold(). This should be removed.
  • This distance matrix is currently not being cached, as it is neither in the parent class RecurrencePlot. Should this be implemented (as @jdonges once mentioned), or is caching the recurrence matrix enough?
  • Tests should be added to cover CrossRecurrencePlot.

ntfrgl added a commit that referenced this issue Jul 10, 2023
- Incorrect indexing dimensions
- Potentially related issues: #128, #141, #145
ntfrgl added a commit that referenced this issue Jul 10, 2023
- Follow-up to 0ce39ae
- Potentially related issues: #128, #141, #145
@ntfrgl
Copy link
Member

ntfrgl commented Jul 10, 2023

@fkuehlein, thank you for isolating the issue. The array indexing used incorrect dimensions in all of those C functions. In the first commit above I fixed the bug in C, and in the second commit I ported it to Cython, where the boundscheck would have found it. I also removed the print statement.

This bug was simple enough to find by staring, after you had isolated it, but more generally you might want to use a proper debugger for Cython.

Next steps:

  • Please proof-read my commits, verify that the results are as expected, and add tests.
  • This bug, which apparently was never tested against, was introduced in the process of porting C code from the former weave.inline infrastructure to the more recent Cython infrastructure. This suggests that a thorough examination of all Cython/C extensions might be necessary, comparing them against their original implementations.

@fkuehlein
Copy link
Collaborator

fkuehlein commented Jul 11, 2023

Thanks a lot @ntfrgl, I can confirm CrossRecurrencePlot is working as it should now!

I will add tests later this week.

fkuehlein added a commit to fkuehlein/pyunicorn that referenced this issue Jul 18, 2023
- testing against issue pik-copan#128
- for all three possible metrics
  - intitialize CrossRecurrencePlot
  - assert two calculations of distance matrix are np.allclose
- remove redundant tests
@ntfrgl
Copy link
Member

ntfrgl commented Jul 18, 2023

Two comments on the updated tests:

  • Did you verify that those tests actually reliably trigger the original bug when the above solution is reverted? Since the illegal memory access in the faulty version was deterministic, for the current tests to fail the content of a nearby heap location would have to change in between the two identical function calls. Test failure would be much more likely if, e.g., you compared the outcomes of two different CrossRecurrencePlot instances built from two different instances of SmallTestData, and even more likely if the x and y arguments were different timeseries.
  • The three tests are identical up to the metric: str argument. This would be an ideal starting point for using parametric test fixtures.

@fkuehlein
Copy link
Collaborator

fkuehlein commented Jul 19, 2023

I did verify that, and it does trigger the original bug because even though the faulty recurrence matrix stays the same, it gets transposed on every calculation. My idea was to keep it minimal, but I agree that your suggestions would be the better option. I will rework the tests and also add a parametric test fixture.

@ntfrgl
Copy link
Member

ntfrgl commented Jul 19, 2023

I see, thank you.

The added benefit of not making the validity of this test depend on implementation details of its testee, apart from the possibility of the latter changing in the future, is that we might be able to apply the same testing strategy to many more functions, with the hope of finding indexing issues more systematically, by extending tests/test_generic.py::simple_instances().

fkuehlein added a commit to fkuehlein/pyunicorn that referenced this issue Jul 19, 2023
- subsitute old tests by parametrized test for CrossRecurrencePlot
- add parametrized test for InterSystemRecurrenceNetwork
- testing agains issues pik-copan#128 and pik-copan#125
- add fixture params for fixed recurrence rate mode when pik-copan#188 is solved
fkuehlein added a commit to fkuehlein/pyunicorn that referenced this issue Jul 20, 2023
- subsitute old tests by parametrized test for CrossRecurrencePlot
- add parametrized test for InterSystemRecurrenceNetwork
- testing agains issues pik-copan#128 and pik-copan#145
- add fixture params for fixed recurrence rate mode when pik-copan#188 is solved
fkuehlein added a commit to fkuehlein/pyunicorn that referenced this issue Jul 27, 2023
- subsitute old tests by parametrized test for CrossRecurrencePlot
- add parametrized test for InterSystemRecurrenceNetwork
- testing agains issues pik-copan#128 and pik-copan#145
- add fixture params for fixed recurrence rate mode when pik-copan#188 is solved
@CharlotteGeier
Copy link

@fkuehlein and @ntfrgl thanks a lot for fixing this! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants