
[WIP] Distinguish "Hard Chaos" from "Partially Predictable Chaos" #83

Merged 21 commits on May 4, 2019

Conversation

@efosong (Member) commented Mar 27, 2019

Closes #68

Todo:

  • Fix algorithm issues
  • Complete function interface
  • Write Docstring
  • Docs
  • Tests

@efosong (Member Author) commented Mar 27, 2019

A few of the issues I can think of for now. Of course there's docs/tests/etc to do, but I think these things will need resolving first. I may add to this in the future.

Naming

The function name (distinguish_chaos) and file name (chaos_distinction.jl) could perhaps be improved. In particular, chaos_distinction sounds a bit too close to the other file, chaos_detection. Any ideas for alternative names?

Determination of perturbation range, (δmin, δmax)

In the paper, the authors seem to determine a valid range of perturbation sizes δ ∈ (δmin, δmax) simply by increasing δmax until they see the behaviour change on a graph. This is not particularly principled, and does not lend itself well to automation.
My current approach is to let the user input bounds based on intuition. This is also not principled, but may work in practice.
The alternative approaches I can think of require calculating values for a range of δ, which somewhat defeats the purpose of determining an upper bound (why not just calculate for a small range of δ instead --- the slope is all we're interested in).
Any suggestions for routes forward here?

Maximal Lyapunov Exponent

λmax is calculated using the lyapunov function. I have arbitrarily set the T passed to lyapunov to 2500. Is there a principled way to choose this, or should it just be left to the user?

Lyapunov prediction time

In the paper, the authors suggest the Lyapunov prediction time should be set to Tλ = 10/λmax. However, they don't actually use this in their paper, and in practice I have found that for laminar flow λmax is often very small, leading to very long trajectories which take a bit too long to compute. I've defined Tλmax to allow the user to hard-cap the trajectory length. Is there a better way?
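
The hard cap described above can be sketched as follows (the helper name is made up for illustration and is not part of the PR; it only combines the Tλ = 10/λmax rule with a user-supplied ceiling):

```julia
# Hypothetical helper: cap the Lyapunov prediction time so that laminar
# flow (λmax ≈ 0) does not lead to prohibitively long trajectories.
cap_prediction_time(λmax; Tλ_max = 1e4) = min(10 / λmax, Tλ_max)

cap_prediction_time(1.0)    # chaotic case: 10/λmax = 10.0
cap_prediction_time(1e-6)   # near-laminar case: capped at Tλ_max = 1.0e4
```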

Sampling from the trajectory

I'm reasonably happy with the sampling method I have implemented here, but I think I ought to explain it in case I have made a mistake somewhere.
The paper calls for sampling over the 'natural distribution' of the attractor. I could not find a method for doing so, so instead came up with my own sampling method.
The user inputs a desired length of time to evolve the trajectory (the longer the better) and the number of samples they wish to take from the trajectory.
Then, after an initial transient, the function samples approximately the requested number of samples by taking 'inter-sample times' from an exponential distribution such that the sample times come from a Poisson process.
Compared with computing a full trajectory and then sampling uniformly from it, this approach has the big advantage that it does not require storing a potentially very long trajectory. Instead only the samples themselves need be stored.
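
The inter-sample-time idea can be sketched like this (a toy illustration, not the PR's actual code; the function name is made up, and only the sample times are generated here rather than trajectory states):

```julia
using Random
using Distributions

# Draw approximately `n` sample times over a window of length `T` by
# accumulating exponentially distributed inter-sample times, so the
# sample times form a Poisson process.  Only the times are stored.
function poisson_sample_times(T, n; rng = Random.default_rng())
    dist = Exponential(T / n)        # mean inter-sample time T/n
    times = Float64[]
    t = rand(rng, dist)
    while t < T
        push!(times, t)
        t += rand(rng, dist)
    end
    return times
end
```

In the actual function, the integrator would be evolved forward by each inter-sample time and the state recorded, so the full trajectory never has to be held in memory.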

Function Parameters

This function requires quite a lot of parameters. Most of them will be keyword parameters, but having so many feels rather messy. Perhaps some sort of configuration struct is appropriate?

Return type

I've returned a symbol :laminar, :ppc or :chaotic depending on the result of the algorithm (with :indeterminate for cases not covered by the algorithm), as well as the cross-distance scaling parameter ν and the mean cross-correlation C. Perhaps this should be changed?

@hwernecke

A few answers to your questions:

Determination of perturbation range, (δmin, δmax)
You are absolutely right with your statement.
The formal requirements for the perturbations are

  • 0 < δmin, numerically reasonable δmin ~ 1e-10, and
  • δmax << s, where s is the standard deviation of the attractor.

Two possible options:

  1. expect the user to give an educated guess
  2. compute the long-term distance for a wide range of perturbations, e.g. δ ∈ [ 1e-15, 1e2 ], and see where the scaling breaks down when starting from small perturbations

I would go with option (1). Else you can implement (2) and still leave it up to the user to provide a guess.

Maximal Lyapunov exponent // Lyapunov prediction time
Some general words: the largest Lyapunov exponent λmax provides an estimate for the evolution of an arbitrary but vanishingly small perturbation δ → 0 in the limit of long times t → ∞,
i.e. d(t) ~ δ exp(λmax t).

Let's assume we are in the vicinity of a chaotic attractor, viz. λmax > 0. The time it takes for a finite small perturbation δ < s to reach a certain distance d* < s, smaller than the extent of the attractor, can then be estimated by inverting this relation: Tλ ~ log(d*/δ) / λmax.
On average this means that up to time Tλ one can make a prediction for the chaotic system within the tolerance d*. Thus, Tλ is typically called the Lyapunov prediction time.
For chaotic attractors Tλ > 0, while for laminar flow the Lyapunov prediction time diverges, Tλ → ∞.

Note that it depends on the distances δ, d*! (Apparently you fixed that with a constant factor scale_ly.)

Secondly, note that in [Wernecke et al, 2017] we do not propose Tλmax = 10 / λmax; rather, we propose to use Tmax = 10 Tλ as an estimated integration time for a known Lyapunov prediction time Tλ.
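
As a small worked example of the two estimates above (the numbers are illustrative only, not taken from the paper):

```julia
# Illustrative numbers only: estimate the Lyapunov prediction time
# Tλ = log(d*/δ)/λmax and the long integration time Tmax = 10 Tλ.
λmax = 0.9          # largest Lyapunov exponent (Lorenz-like magnitude)
δ    = 1e-9         # initial perturbation size
d    = 1e-3         # target distance d*, still smaller than the attractor
Tλ   = log(d / δ) / λmax   # ≈ 15.35
Tmax = 10 * Tλ             # the "t >> Tλ" long-term time, ≈ 153.5
```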

So here comes the tricky question: what is a sufficiently large Tmax for computing the largest Lyapunov exponent and for the integration of trajectories?

Possible approaches:

  1. As the time to be neglected for transient motion, Ttr, is already specified by the user, choose something like Tmax ~ 10 Ttr to be on the safe side. But: if the user's choice fails, the estimate might fail as well.
  2. Look for the long-term plateau in the time evolution of the distance on a log scale. On every attractor the distance settles close to such a plateau for t > Tλ. Apply a linear fit to the distance on log scale on a time interval [ti, tf], where ti < tf are extremely long times. Decreasing ti to smaller values, you should see the error of the fit increase once the onset of the plateau is reached.

Hope that brings you further. Cheers, hendrik


# Perform regression to check cross-distance scaling
ν = slope(log.(δs), log.(ds))
C = mean(Cs)


Here you average over the whole time evolution of the cross-correlation.
One is usually more interested in the long-term cross-correlation, i.e. neglecting the initial correlation, which is close to unity for any attractor.
My suggestion: neglect the initial time interval [0, 2 Tλ] when averaging the cross-correlation.

@efosong (Member Author):

I think the cross-correlation should be calculated here at a single time point T = Tmax (I'll change the variable names now that I understand a bit better). This averaging operation C = mean(Cs) should actually average over C(Tmax) for the different values of δ.

There is no 'good' reason to perform this averaging, as I think C(Tmax) should be independent of δ, but I did it anyway because:

  • Calculating C is essentially free for each δ
  • It should help filter out any weird statistical variations in the values of C for each δ

Please let me know if I have misunderstood this point.


You are right, the cross-correlation C should be independent of the initial distance δ in the long-term limit, its computation comes at low cost and can be used to cross-check.

About the averaging:
Although a single measurement at time t = Tmax should suffice to determine the residual cross-correlation, the correlation curve typically fluctuates more or less strongly over time, especially when computed from only a single pair of (chaotic) trajectories.
Thus, either

  • one can perform an ensemble average (as done in the paper, note the definition of C) OR
  • one averages over time, e.g. the last 10% of the correlation curve.
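
The second option can be sketched like this (a hypothetical helper, not code from the PR; `Cs` is assumed to hold the correlation curve sampled at uniform time steps):

```julia
# Average the cross-correlation over the trailing `fraction` of the curve,
# e.g. the last 10%, instead of reading off a single noisy value at t = Tmax.
function tail_mean(Cs; fraction = 0.1)
    n  = length(Cs)
    i0 = n - max(1, floor(Int, fraction * n)) + 1
    return sum(Cs[i0:end]) / (n - i0 + 1)
end

tail_mean(collect(1.0:20.0))   # averages the last two points: 19.5
```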

@efosong (Member Author):

Ok, thanks. In that case the ensemble average should be computed (or, more precisely, the ensemble average D₁₂, which is then transformed to C₁₂ using equation 5 [using notation from the paper]).

C = mean(Cs)

# Determine chaotic nature of the system
if ν > ν_thresh_upp


Here you could additionally check whether C > C_thresh_upp.
If so, the additional criterion is consistent with the scaling exponent.
If not, you have a sanity check and find that something did not work as intended.

@efosong (Member Author) commented Apr 18, 2019

Thanks for your help @hwernecke.
Sorry about the slow progress on this: I have exams coming up soon and (in theory) have been/will be focusing on studying for them. I'll get round to finishing this PR within the next few weeks. I just wanted to provide an update to make it clear I haven't abandoned this.

A few thoughts on the direction:

  • I think the user should specify (δmin, δmax).
  • Looking for a plateau in separation seems like a promising approach to determining Tλ (and then Tmax). It might be worth trying to find the plateau on-the-fly so we don't always have to simulate incredibly long trajectories.

@Datseris (Member) commented Apr 18, 2019

I also have not abandoned this; I was just very busy doing my postdoc applications. I will review and reply in detail in the coming week as well. Just a quick note: finding plateaus on the fly is something we already do in ChaosTools for the methods that estimate embedding dimension (unrelated to this PR).

The relevant code comes from this file: https://github.com/JuliaDynamics/DelayEmbeddings.jl/blob/master/test/R_dimension.jl which defines the function:

"""
    saturation_point(x, y; threshold = 0.01, dxi::Int = 1, tol = 0.2)
Decompose the curve `y(x)` into linear regions using `linear_regions(x, y; dxi, tol)`
and then attempt to find a saturation point where the the first slope
of the linear regions become `< threshold`.
Return the `x` value of the saturation point.
"""
function saturation_point(Ds, E1s; threshold = 0.01, kwargs...)
    lrs, slops = ChaosTools.linear_regions(Ds, E1s; kwargs...)
    i = findfirst(x -> x < threshold, slops)
    return i == 0 ? Ds[end] : Ds[lrs[i]]
end

we could do something similar here. I will support this and give more details once I do my "proper" review.


@efosong you don't have to apologize if this takes long. We all do this on our free time :)

(edit: but it is of course always a great move to give a quick update that this isn't forgotten!)

@Datseris (Member) commented Apr 20, 2019

Alright, I've studied both the paper and this PR in detail; time for me to contribute a bit as well. But first things first: @hwernecke thanks very much for taking the time to help us with this implementation, and @efosong once again thanks for working on this! Looking forward to granting you the bounty!

I'll now make comments as appropriate.

Technical things

  • Change file name to partially_predictable.jl and for now let's just name the function predictability. When this is ready to merge we should re-think the name, but for now something concise that doesn't confuse with the GALI method is better.
  • Make sure that DiffEq keyword arguments are propagated towards the solver. Please look at how I do it for eg lyapunovs
  • The return values are fine as is, but I would suggest changing the symbols to :PPC, :SC, :LAM for the three phases. Don't forget that the partially predictable phase is also chaotic.
  • Configuration structs are overcomplication without gain in my eyes. Just keyword arguments by themselves are fine, no need to add one more layer of objects.
  • Please use 4 spaces as the indentation distance, instead of 8. (If you use Atom, I highly recommend installing "indent detective".) If you don't use Atom then I'll do it.
  • Distributions is not part of the dependencies of ChaosTools, please add it.

Algorithm arguments

  • The perturbation range should be a single keyword argument δrange = 10 .^ (-9:-6). The user should specify this, and making it a single argument instead of 3 is nicer from an interface perspective I feel. Plus the default values are very good for normalized systems (i.e. with all variables having values of order 1 and the characteristic timescale also being of order 1).
  • Let's test this with only 3 different δ as the default. By reading the paper this should be enough, since ν is always 0 for chaotic systems and (at least approximately) always 1 for regular ones.
  • I think we should have only a single threshold value for each of ν, C. The values are returned anyway; the user should decide whether the prediction is plausible (i.e. we remove the indeterminate possibility).
  • the single threshold values for ν, C should not be keyword arguments. They are just internal constants with default equal to 0.5.
  • We make the Lyapunov exponent an optional positional argument of the algorithm. E.g. λmax = lyapunov(ds, 5000). (arbitrary choice, doesn't matter that much.)
  • The Lyapunov prediction time should be calculated from the rigorous relationship given in the paper, Tλ = ln(d/δ)/λ, with d = 1e-3, a keyword argument, just as @hwernecke suggested in his reply.
  • We define a keyword argument, multiplier = 10. The long term time is multiplier * Tλ, as Hendrik suggested. In the paper this is the time the authors call t >> Tλ. (which is e.g. the time you check the cross correlation or the cross distance)

(edit: So, I do not recommend checking for the plateau that was discussed above. This is possible though, and if @efosong wants to do it then go ahead, but I am not so sure it is worth it.)
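
The proposed defaults above can be restated as a small sketch (the names and values simply echo the bullet points; nothing here is a released API):

```julia
# Proposed defaults, restated as code (names are proposals only).
δ_range      = 10.0 .^ (-9:-6)   # perturbation sizes, one keyword argument
d_tol        = 1e-3              # target distance d* for the prediction time
T_multiplier = 10                # long-term time = T_multiplier * Tλ
λmax         = 0.9               # would come from λmax = lyapunov(ds, 5000)

# Per-δ Lyapunov prediction times Tλ = ln(d_tol/δ)/λmax and long-term times:
Tλs = [log(d_tol / δ) / λmax for δ in δ_range]
Ts  = T_multiplier .* Tλs
```

Smaller δ gives a longer prediction time, so `Tλs` is decreasing across the range.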

Testing

The code looks good, and will become better, but one needs to test it rigorously. Can you please write tests that use Systems.lorenz and emulate the results of the original paper? In addition, we need more systems to test this on. I suggest that we use the second system tested in the paper, i.e. the Shilnikov attractor, section "PPC in a system without separation of scales". I will add the system to the predefined Systems in the meantime.

@hwernecke Can you maybe suggest other systems, preferably discrete, that you tested this on while developing the method? The neural model discussed in the paper is probably too much for our scope here.

Questions

@hwernecke I can see from the paper that your definition of a "correlation function" is not the one traditionally found in the literature, where one trajectory is delayed with respect to the other, e.g. : https://en.wikipedia.org/wiki/Cross-correlation#Cross-correlation_of_deterministic_signals

Is it really a different function?

@Datseris Datseris self-requested a review April 20, 2019 17:30
@efosong (Member Author) commented Apr 21, 2019

Thanks @Datseris.

I've made a couple of the simpler changes you suggested:

  • changed filename
  • changed symbol names. However, I've left a symbol :INDETERMINATE because the tests as they are don't cover all cases. I think in theory it might not be possible to reach this condition --- perhaps it should throw an error instead?
  • changed indentation from tabs to spaces
  • added Distributions 0.19.1 to REQUIRE
  • default δ_range = 10.0 .^ (-9:-6) (therefore this tests 4 values by default). This seems like a nice way to pass this
  • down to single threshold (ν_thresh=0.5, C_thresh=0.5)
  • Lyapunov exponent 'passed as argument' (when I implement them) --- another nice solution
  • Tλ = ln(d_tol/δ)/λ_max computed in loop
  • T_multiplier multiplies Tλ to give the evolution time for separation, T.
  • min(T_multiplier*Tλ, T_max) is used to prevent incredibly long running times when the Lyapunov exponent is small
  • changed a few variable names. I'm not sure what you think of the 'style' --- I've gone for reasonably descriptive names, with underscores, but it should be easy to change them to something better

Still need to:

  • implement the function interface
  • DiffEq parameters pass through arguments
  • Write tests

@Datseris (Member)

Very good, I'll review again when there are some basics tests!

@hwernecke

Testing

The code looks good, and will become better but one needs to test it rigorously. Can you please write tests that use Systems.lorenz and emulate the results of the original paper? In addition, we need more systems to test this on. I suggest that we use the second system that is tested on the paper, i.e. the Shilnikov attractor, section "PPC in a system without separation of scales". I will add the system to the predefined Systems in the meantime.

@hwernecke Can you maybe suggest other system, preferably discrete, that you have tested this on while developing the method? The neural model discussed in the paper is probably too much for our scope here.

Possible candidates for testing the implementation are

The latter is a nice system to test the robustness of the implementation, as there are two distinct time-scales involved.
The equations of motion: https://iopscience.iop.org/article/10.1088/2399-6528/aac33c/pdf#%FE%FF%00A%003
The example of chaotic motion can be found in Figure 10 on page 12 (including parameters in the caption).

@hwernecke

Questions

@hwernecke I can see from the paper that your definition of a "correlation function" is not the one traditionally found in the literature, where one trajectory is delayed with respect to the other, e.g. : https://en.wikipedia.org/wiki/Cross-correlation#Cross-correlation_of_deterministic_signals

Is it really a different function ?

The definition we used is more commonly called the Pearson correlation, see https://en.wikipedia.org/wiki/Correlation_and_dependence#Definition
We further assume that the means and the variances are the same for both trajectories; the ensemble average is used to compute the expectation value.

The definition of cross-correlation you referenced above is neither centered nor normalized.
Further, we are interested in comparing two trajectories at the same instant of time, which corresponds to zero lag, τ = 0.
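
A minimal illustration of the distinction, using the standard-library Statistics functions (`pearson` is a throwaway name for illustration):

```julia
using Statistics

# Pearson correlation at zero lag: center both signals and normalize by the
# standard deviations, then compare the trajectories at equal times.
pearson(x, y) = cov(x, y) / (std(x) * std(y))

x = [1.0, 2.0, 3.0, 4.0]
pearson(x, 2 .* x .+ 1)   # affine copies are perfectly correlated: ≈ 1.0
```

Note that `Statistics.cor(x, y)` computes exactly this quantity.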

@Datseris (Member)

Thanks @hwernecke , these are some extremely useful comments!

issues with some of the variable names were originally missed due to scoping in the REPL
@efosong (Member Author) commented Apr 24, 2019

The function now has an interface and works when imported via the package.
I think a few of the argument names probably need changing to match the 'standard' of the package (e.g. T_transient -> Ttr) but we can fix this later.

I'll write some tests. I'm not sure if there's a "good" way to test this other than simply trying some 'known' conditions and checking it gets them right. Not very rigorous, but at least a sanity check.

I think discrete systems will need a bit more work on the function itself but shouldn't be too bad, so I'll give that a go at some point.

@Datseris (Member)

the symbol ⋅ is not defined on my computer. How did it end up in the source?

I'll write some tests. I'm not sure if there's a "good" way to test this other than simply trying some 'known' conditions and testing it gets them right? Not very rigorous but at least a sanity check.

I think this is totally fine.

I think discrete systems will need a bit more work on the function itself but shouldn't be too bad, so I'll give that a go at some point.

Shouldn't be that much tweaking necessary. I can see in the source that everything that you do on ds is defined for both continuous and discrete systems.

@efosong (Member Author) commented Apr 24, 2019

symbol ⋅ is not defined on my computer. How did it end up in the source?

The symbol ⋅ (Unicode dot operator, \cdot) is defined by LinearAlgebra. Does it work in the package for you?

Shouldn't be that much tweaking necessary. I can see in the source that everything that you do on ds is defined for both continuous and discrete systems.

I'll at least need to change how sampling works to sample from a geometric distribution rather than an exponential one (so the step is always an integer). I've had a look, and other than that I think it might work.
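
The discrete-time analogue of the exponential inter-sample times could look roughly like this (a hypothetical sketch; the function name and interface are made up):

```julia
using Random
using Distributions

# Integer inter-sample steps drawn from a geometric distribution, the
# discrete-time analogue of exponential inter-sample times.
function discrete_sample_steps(n_total, n_samples; rng = Random.default_rng())
    p = min(1.0, n_samples / n_total)   # per-step success probability
    dist = Geometric(p)                 # counts failures before a success
    steps = Int[]
    k = 0
    while true
        Δ = rand(rng, dist) + 1         # shift so every step is ≥ 1
        k += Δ
        k > n_total && break
        push!(steps, Δ)
    end
    return steps
end
```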

handles both ContinuousDynamicalSystems and DiscreteDynamicalSystems
@efosong (Member Author) commented Apr 24, 2019

Latest commit adds 'support' for discrete-time dynamical systems. I've manually tested the Lorenz system for continuous time, and the Hénon map for discrete time.
I also tried the logistic map, but parallel_integrator has issues with its 1-D output, e.g.

lmap = Systems.logistic(r=3.6)
p_integ = parallel_integrator(lmap, [0.4, 0.5])
step!(p_integ) # --> Error

Worth an issue on DynamicalSystemsBase.jl @Datseris?

If you'd like to try this out on the Lorenz system for yourself, you might need to pass a relatively large maxiters (I'm using 1e9) to predictability. I find the program basically crashes in a pretty nasty way otherwise.

@efosong (Member Author) commented Apr 24, 2019

The stack trace for the error mentioned in the previous comment:

ERROR: MethodError: no method matching *(::StaticArrays.SArray{Tuple{1},Float64,1,1}, ::StaticArrays.SArray{Tuple{1},Float64,1,1})
Closest candidates are:
  *(::Any, ::Any, ::Any, ::Any...) at operators.jl:502
  *(::LinearAlgebra.Adjoint{#s623,#s622} where #s622<:LinearAlgebra.AbstractTriangular where #s623, ::AbstractArray{T,1} where T) at /build/julia/src/julia-1.1.0/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/triangular.jl:1805
  *(::LinearAlgebra.Adjoint{#s623,#s622} where #s622<:(Union{Hermitian{T,S}, Hermitian{Complex{T},S}, Symmetric{T,S}} where S where T<:Real) where #s623, ::AbstractArray{T,1} where T) at /build/julia/src/julia-1.1.0/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/symmetric.jl:426
  ...
Stacktrace:
 [1] *(::Float64, ::StaticArrays.SArray{Tuple{1},Float64,1,1}, ::StaticArrays.SArray{Tuple{1},Float64,1,1}) at ./operators.jl:502
 [2] logistic_eom(::StaticArrays.SArray{Tuple{1},Float64,1,1}, ::Array{Float64,1}, ::Int64) at ~/.julia/packages/DynamicalSystemsBase/o0zd8/src/discrete_famous_systems.jl:268
 [3] #31 at ~/.julia/packages/DynamicalSystemsBase/o0zd8/src/dynamicalsystem.jl:420 [inlined]
 [4] step!(::DynamicalSystemsBase.MinimalDiscreteIntegrator{true,Array{StaticArrays.SArray{Tuple{1},Float64,1,1},1},1,getfield(DynamicalSystemsBase, Symbol("##31#32")){DiscreteDynamicalSystem{false,Float64,1,typeof(DynamicalSystemsBase.Systems.logistic_eom),Array{Float64,1},typeof(DynamicalSystemsBase.Systems.logistic_jacob),Float64,false},Int64},Array{Float64,1}}) at ~/.julia/packages/DynamicalSystemsBase/o0zd8/src/discrete.jl:65
 [5] top-level scope at none:0

@SebastianM-C (Member)

I also tried to implement this about a year ago, but I got distracted by other things and couldn't find the old code. I remember that when I tried to reproduce the plots from Figure 3, I needed parallelization, since the number of initial conditions was quite high.
The parallelization aspect is orthogonal to this PR, but it may be helpful to take it into account.
There are multiple ways one could parallelize here, and imposing one in the implementation is not useful. What I think would be more useful is to split the main function into more functions. For example, if the contents of a for loop are in a function, one can wrap that part with pmap and the code can be reused.
But since I got distracted by the parallelization part, it may be better to first finish the serial version. 😄

@Datseris (Member)

@efosong the logistic map is special because its state is not a vector but just a number. I'll have a look once the PR is ready for review. I need a proper test file in the test folder that I can just run.

@Datseris (Member)

The parallelization aspect is orthogonal with this PR, but it may be helpful to take this into account.
There are multiple ways in which one can parallelize here, and imposing one in the implementation is not useful. What I think would be more useful is to have the main function split into more functions. For example, if the contents of a for loop are in a function, one can wrap that part with pmap and the code can be reused.

Yes making the code more modular is a valid suggestion, but this can wait after this is implemented, merged and tagged.

@Datseris Datseris mentioned this pull request Apr 25, 2019
@Datseris Datseris merged commit 3a1ccf7 into JuliaDynamics:master May 4, 2019
@Datseris (Member) commented May 4, 2019

Oh sorry, I messed up and merged this into master when I actually wanted to merge master into this branch... I'll continue this in #86.

@Datseris Datseris mentioned this pull request May 4, 2019
Datseris added a commit that referenced this pull request May 12, 2019
Successfully merging this pull request may close these issues.

Distinguish "Hard Chaos" from "Partially Predictable Chaos" [$100]