
Check if DFA needs to be fixed #17

Closed
Tracked by #26
CSchoel opened this issue Sep 4, 2020 · 14 comments · Fixed by #41
@CSchoel
Owner

CSchoel commented Sep 4, 2020

Following the discussion in NeuroKit2 (neuropsychology/NeuroKit#206) it seems that there are multiple ways to implement DFA. => Check that we really use the correct implementation according to Peng 1995 and update the reference list accordingly.

@CSchoel CSchoel self-assigned this Sep 4, 2020
@LRydin

LRydin commented Jan 21, 2021

Hey @CSchoel, I am also working on NeuroKit2 issue #206, and as far as I can judge from your code, there shouldn't be any issues. I have compared it to my MFDFA, and algorithm-wise I can't detect any hiccups.

I will use your code and mine as tests for NeuroKit2's fractal_dfa. I hope this also helps you out. I'll report on the tests later.

@CSchoel
Owner Author

CSchoel commented Jan 21, 2021

Hey, thank you very much for looking into this! 😄 And also thanks for the hint at Lévy motion in your NeuroKit2 comment. I am always happy to have data sets/processes with prescribed theoretical results against which I can test the algorithms in nolds.

@LRydin

LRydin commented Jan 21, 2021

:) Cheers! Happy to help, I'll keep you in the loop on the work with NeuroKit2. I'll add some tests based on your nolds and my MFDFA, so that we have two validating codes for NK2.

@CSchoel
Owner Author

CSchoel commented Jan 5, 2024

I finally came back to this: Retracing my steps from neuropsychology/NeuroKit#206, there is indeed a change required:

We need to apply the square root only after we have summed all of the individual squared differences and taken the mean. This is consistent with the PhysioNet implementation and the (now discontinued) R package fractal.

This does not change the unit tests for fractional Brownian motion, random noise, and random walks. But it does change the output for the Lorenz system quite a bit, which makes me a bit suspicious about the reference paper for that experiment. I'll try to verify this by running the PhysioNet implementation on the Lorenz data to see if the output is closer to 1 (with the fix) or 1.5 (without the fix, as prescribed by González-Salas et al. (2016)).
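To make the distinction concrete, here is a minimal NumPy sketch of the two orderings. This is illustrative only, not the nolds internals: window handling is simplified to non-overlapping windows with linear detrending.

```python
import numpy as np

def detrended_residuals(data, n):
    """Residuals after linearly detrending non-overlapping windows
    of the cumulative sum ("random walk") of the centered series."""
    walk = np.cumsum(data - np.mean(data))
    n_windows = len(walk) // n
    x = np.arange(n)
    res = np.empty((n_windows, n))
    for i in range(n_windows):
        window = walk[i * n:(i + 1) * n]
        trend = np.polyval(np.polyfit(x, window, 1), x)
        res[i] = window - trend
    return res

def fluctuation_sqrt_last(data, n):
    # corrected order (Peng et al. 1995): average the squared residuals
    # over ALL windows first, take the square root last
    res = detrended_residuals(data, n)
    return np.sqrt(np.mean(res ** 2))

def fluctuation_sqrt_first(data, n):
    # previous behavior: per-window RMS first, then the mean of those
    res = detrended_residuals(data, n)
    return np.mean(np.sqrt(np.mean(res ** 2, axis=1)))
```

By Jensen's inequality, the square-root-last value is never smaller than the per-window variant, so the fluctuation at each window size n can shift, and with it the fitted exponent.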

@CSchoel
Owner Author

CSchoel commented Jan 5, 2024

Other To-dos:

  • Use Lévy motion as another unit test.
  • Take a closer look at the actual values of the unit tests that still work. Did they get closer to the expected values or further away?

@CSchoel
Owner Author

CSchoel commented Jan 6, 2024

I looked for another paper reporting the Hurst parameter obtained by DFA from the Lorenz system and found Wallot et al. (2023). The authors actually report values that are closer to 1 than to 1.5.

I trust them more than González-Salas et al. (2016) because Wallot et al. link to the original paper and describe the algorithm in great detail, with the same square-root-last approach as in the fix I now want to apply to nolds. I think it is quite likely that González-Salas et al. simply had the same error in their implementation of DFA that I had in nolds.

@CSchoel
Owner Author

CSchoel commented Jan 6, 2024

I compared the PhysioNet implementation against nolds. Fortunately, the PhysioNet binary outputs all data points of the log-log plot, which allows a very fine-grained comparison of the two implementations. After correcting for the fact that nolds uses the natural logarithm while PhysioNet uses base 10, the plots are identical:

[figure: physionet_vs_nolds — the log-log fluctuation plots of both implementations coincide]
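The base correction only rescales the axes; a small synthetic check (values assumed purely for illustration, not actual PhysioNet output) confirms that the fitted slope, i.e. the reported exponent, does not depend on the logarithm base, since log10(x) = ln(x) / ln(10) scales both axes equally:

```python
import numpy as np

ns = np.array([4.0, 8.0, 16.0, 32.0, 64.0])
fluctuations = ns ** 0.7  # fake data with a known slope of 0.7

# fit the log-log slope once with the natural log, once with base 10
slope_ln = np.polyfit(np.log(ns), np.log(fluctuations), 1)[0]
slope_log10 = np.polyfit(np.log10(ns), np.log10(fluctuations), 1)[0]
# both fits recover the same exponent, 0.7
```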

Looking at the code, I also found this comment:

dfa: Detrended Fluctuation Analysis (translated from C-K Peng's Fortran code)
Copyright (C) 2001-2005 Joe Mietus, C-K Peng, and George B. Moody

I think a 100% agreement with a ported version of C-K Peng's original Fortran implementation is as good as it gets for validating that the implementation in nolds is correct.

I think I want to add this experiment as a regression test.

@CSchoel
Owner Author

CSchoel commented Jan 22, 2024

I added the regression test.

Open TODOs:

  • Fix expected DFA value for Lorenz in unit tests and examples and change reference from González-Salas et al. (2016) to Wallot et al. (2023).
  • Use Lévy motion as another unit test.
  • Adjust explanation in the docstring of DFA to reflect change in formula.

@LRydin

LRydin commented Jan 23, 2024

Hey everyone,

I think there might be some material you can reuse for your tests in the documentation of MFDFA. There you can find some simple code to generate a Lévy motion (see here):

# Imports
# from MFDFA import MFDFA  # not needed here, just copied from my docs
from scipy.stats import levy_stable

# Generate 10000 points from a symmetric alpha-stable distribution
alpha = 1.5
X = levy_stable.rvs(alpha=alpha, beta=0, size=10000)

Since these are purely Markovian processes, you should get a Hurst coefficient of H=0.5 in your DFA implementation :).

Hope it helps!
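The H = 0.5 expectation for a memoryless process can also be sanity-checked without any external packages. The sketch below (illustrative only, not the nolds implementation) estimates the DFA exponent of plain Gaussian white noise, which is likewise memoryless:

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.standard_normal(20000)  # memoryless white noise, H = 0.5
walk = np.cumsum(data - data.mean())

def dfa_fluctuation(walk, n):
    # square-root-last fluctuation over non-overlapping windows
    # of length n, with linear detrending per window
    k = len(walk) // n
    x = np.arange(n)
    sq = np.empty((k, n))
    for i in range(k):
        w = walk[i * n:(i + 1) * n]
        trend = np.polyval(np.polyfit(x, w, 1), x)
        sq[i] = (w - trend) ** 2
    return np.sqrt(sq.mean())

ns = np.array([8, 16, 32, 64, 128, 256])
fs = [dfa_fluctuation(walk, n) for n in ns]
hurst = np.polyfit(np.log(ns), np.log(fs), 1)[0]
# the fitted slope should land near 0.5 for white noise
```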

@CSchoel
Owner Author

CSchoel commented Jan 25, 2024

Thanks a lot for the hint. Much appreciated. ❤️

@CSchoel
Owner Author

CSchoel commented Feb 4, 2024

Update for myself: Unfortunately, I noticed that the error I had in my formula was due to the same error existing in Hardstone et al. (2012), which I based my entire explanation of DFA in the docstring on. This means I might have to re-write the whole docstring as the explanation involving standard deviations of individual detrended windows doesn't make any sense anymore. 🙈

I didn't find any other good explanation, so I might have to resort to explaining the how rather than the why. 🙁

@CSchoel
Owner Author

CSchoel commented Feb 19, 2024

Another update for myself: After a bit of searching, I found Kantelhardt et al. (2001), which seems to include a decent explanation of at least some parts of the algorithm. It is mostly consistent with the original definition by Peng et al. (1995), so I think I can use that and maybe combine it with the parts of Hardstone et al. (2012) that don't involve the aforementioned error. Let's hope I can get a somewhat understandable introduction to DFA out of this.

@CSchoel
Owner Author

CSchoel commented Jun 24, 2024

I revised the explanation. It needs another proofread, and I still need to find a good motivation for why we take the integral / random walk as the first step.

@CSchoel
Owner Author

CSchoel commented Jul 18, 2024

Explanation is finished, and Lévy motion is implemented. We're done here! 🎉

@CSchoel CSchoel linked a pull request Jul 18, 2024 that will close this issue
@CSchoel CSchoel closed this as completed Jul 18, 2024