
Multiscale re-thinking #223

Closed
Datseris opened this issue Dec 26, 2022 · 23 comments

Labels
discussion-design Discussion or design matters

@Datseris
Member

Datseris commented Dec 26, 2022

Alright, last thing before I close my V2 review is the multiscale stuff. As I've told you, we have another package that creates window viewers: it wants to apply some computation over views (specifically, rolling windows) of some timeseries. We decided that it doesn't make sense to offer a specialized function for doing this. What makes sense is to offer a type that produces views over a timeseries and just use map to apply the computation.

I propose to do the same thing here, so that we can move the multiscale viewers out into a future package TimeseriesViewers or something like that.

Here is what I propose:

  1. The existing multiscale algorithms, such as Regular, take as input a timeseries, besides their other keywords.
  2. Once instantiated, they are iterators. They iterate over views of the input timeseries, each view at a successive scale.
  3. We define a function applyover(function, ViewerStruct) -> scales, result. This function simply applies a computational function over the views the viewer iterates over.
  4. applyover is identical to map for rolling-window-like viewers that will be in the TimeseriesViewers package. However, for our "coarse-graining" viewers, applyover first applies the computation function at the given down-sampled scale, but then also averages it (or computes the variance, depending on what's given as the aggregator) at the given down-sampled scale.
  5. Functions downsample, multiscale, multiscale_normalized are deleted!!! If you want the multiscale quantity of something you do:
x = rand(100) # input timeseries
viewer = Regular(x) # notice `x` is input; of course `Regular` is an unfit name, `RegularDownsampler` is better
# for multiscale sample entropy
f1 = x -> complexity(SampleEntropy(), x)
# for multiscale permutation entropy
f2 = x -> entropy(Kaniadakis(), SymbolicPermutation(3), x)
# for multiscale normalized wavelet entropy
f3 = x -> entropy_normalized(WaveletOverlap(x), x)
# now apply any of them, e.g. `f1`, to get the multiscale quantity
scales, result = applyover(f1, viewer)

Thoughts?
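To make the proposal concrete, here is a minimal sketch of what such a viewer and applyover could look like. All names (`RegularDownsampler`, `applyover`) and the exact signatures are placeholders from this proposal, not an existing API:

```julia
using Statistics

# Hypothetical sketch: a viewer that iterates over coarse-grained
# versions of the input timeseries, one per scale.
struct RegularDownsampler{T}
    x::Vector{T}          # input timeseries
    scales::UnitRange{Int}
    f::Function           # aggregator applied within each window
end
RegularDownsampler(x; scales = 1:5, f = mean) = RegularDownsampler(x, scales, f)

# Iterator protocol: each iteration yields the coarse-grained series at the next scale.
function Base.iterate(v::RegularDownsampler, state = 1)
    state > length(v.scales) && return nothing
    s = v.scales[state]
    n = fld(length(v.x), s)
    coarse = [v.f(@view v.x[(i-1)*s+1 : i*s]) for i in 1:n]
    return coarse, state + 1
end
Base.length(v::RegularDownsampler) = length(v.scales)

# Apply a computation over the views the viewer iterates over.
applyover(f, v::RegularDownsampler) = collect(v.scales), [f(c) for c in v]

x = rand(100)
viewer = RegularDownsampler(x; scales = 1:4)
scales, result = applyover(std, viewer)  # one value per scale
```

The actual measure (sample entropy, permutation entropy, ...) would just be passed in place of `std` as an anonymous function, exactly as in the f1/f2/f3 examples above.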

@Datseris Datseris added the discussion-design Discussion or design matters label Dec 26, 2022
@Datseris Datseris added this to the 2.0 milestone Dec 26, 2022
@kahaaga
Member

kahaaga commented Dec 26, 2022

The existing multiscale algorithms, such as Regular take as input a timeseries, besides their other keywords.
Once instantiated, they are iterators. They iterate over views of the input timeseries, each view at a successive scale.

  • For the Composite multiscale algorithm, there are multiple possible views at each coarse-graining scale. How would this approach handle that?
  • For performance reasons, with this construction, would it be possible to re-fill e.g. the Regular struct with data in-place?
  • We define a function applyover(function, ViewerStruct) -> scales, result. This function simply applies a computational function over the views the viewer iterates over.

This makes sense.

Functions downsample, multiscale, multiscale_normalized are deleted

I don't agree with this. I think multiscale and multiscale_normalized should remain, for the following reasons:

  • Multiscale methods are widely used in the NLD literature, and totally deserving of their own function.
  • The name multiscale is instantly recognizable by anyone reading the docs

Moreover, the current API is consistent with the rest of the ecosystem, which is "estimate SomeInformationMeasure, using SomeEstimator, on the following data...". multiscale(::MultiscaleAlgorithm, ::SomeInformationMeasure, ::SomeEstimator, data...) is super-intuitive. It says "give me the multiscale version of SomeInformationMeasure, using SomeEstimator, on the following data...".

The following however,

# for multiscale normalized wavelet entropy
f3 = x -> entropy_normalized(WaveletOverlap(x), x)
# now apply to get multiscale
scales, result = applyover(f3, viewer)

is harder to find in the docs, because it is not attached to a well-known name in the literature, cannot be found using auto-completion in the REPL, and requires the user to learn and understand anonymous functions (not a beginner thing at all), to learn this obscure (not for us, but for regular Julia users) applyover function, to learn what iterators are, etc.

The approach you describe is totally fine, because it makes ecosystem compatibility and future extensions much easier. But I think it should remain as an implementation detail, with multiscale/multiscale_normalized as wrappers. The only thing a user should see is multiscale, multiscale_normalized and whatever types these dispatch on (e.g. RegularDownsampler, CompositeDownsampler or whatever).

EDIT: that means that I'm fine with removing downsample in favor of multiscale data viewers, which are documented in such a way that the user gets the same information as now.
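The wrapper idea sketched here could be as thin as the following (hypothetical names and signatures; the real multiscale API in the package differs, this only illustrates the "wrapper over generic machinery" argument):

```julia
using Statistics

# Hypothetical internals: coarse-grain x at scale s with aggregator f,
# then apply the measure g at every scale.
coarse_grain(x, s, f) = [f(view(x, (i-1)*s+1:i*s)) for i in 1:fld(length(x), s)]
applyover(g, x; scales = 1:5, f = mean) =
    collect(scales), [g(coarse_grain(x, s, f)) for s in scales]

# The user-facing wrapper: a recognizable, documentable, auto-completable
# name that hides anonymous functions and iterators from casual users.
multiscale(g, x; kwargs...) = applyover(g, x; kwargs...)

scales, result = multiscale(std, rand(200); scales = 1:3)
```

The point being: `multiscale` stays discoverable, while the generic view/iterator machinery remains an implementation detail that can later move to another package.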

@kahaaga
Member

kahaaga commented Dec 26, 2022

Btw, would there be any significant cost to defining these viewers repeatedly? I guess defining them as iterators completely eliminates existing allocations caused by downsample, so the approach you describe has to be much more performant, right?

Say I have an EEG dataset with 45 channels and 100000 datapoints, and I want to run multiple experiments at 10 different scale levels, with N surrogates at each level, etc.
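On the allocation question, a rough self-contained illustration (not measured numbers from the package): slicing copies each window into a fresh array, while `@view` references the original data, so a view-based iterator only allocates the per-scale result vector:

```julia
using Statistics

x = rand(10_000)
s = 10
nwin = fld(length(x), s)
# Slicing allocates a new array for every window:
m_copy = [mean(x[(i-1)*s+1 : i*s]) for i in 1:nwin]
# A view shares memory with `x`; only the result vector is allocated:
m_view = [mean(@view x[(i-1)*s+1 : i*s]) for i in 1:nwin]
@assert m_copy == m_view
```

For a 45-channel, 100000-point dataset with surrogates at each scale, avoiding the per-window copies is exactly where the savings would come from.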

@Datseris
Member Author

Datseris commented Dec 26, 2022

is harder to find in the docs, because it is not attached to a well-known name in the literature

I don't get this argument with namings... There will be a section named "multiscale" in the docs. Why do you need a function?

@Datseris
Member Author

Generally speaking, books/tutorials on software design advise that having a lot of input arguments is not a good plan, and one should keep them as few as possible. Your multiscale has 4 input arguments, all equally valid; mine has two, because the dispatch complexity of multiscale, in terms of the multiple dispatch paradigm, isn't really 4-fold, but 2-fold. Which entropies and/or entropy-deciding algorithms one uses doesn't actually matter for multiscale.

@Datseris
Member Author

Yes, the viewers have significant performance benefits; we've already measured that in the other package.

@Datseris
Member Author

For the Composite multiscale algorithm, there are multiple possible views at each coarse-graining scale. How would this approach handle that?

I am confused here, and I was in fact already confused by the docs. I thought Regular also has "multiple" possible views at each coarse-graining scale, otherwise why would Regular accept a mean function? You are not averaging across different subsampling scales, you are averaging within the same subsampling scale but with different start points, right?

@Datseris
Member Author

Moreover, the current API is consistent with the rest of the ecosystem, which is "estimate SomeInformationMeasure, using SomeEstimator, on the following data...". multiscale(::MultiscaleAlgorithm, ::SomeInformationMeasure, ::SomeEstimator, data...) is super-intuitive

It isn't in my experience. A function with 4 input arguments, all of which decide the behavior of the function, is not at all easy. Especially because it hides the fact that these input arguments do not, in fact, all decide the behavior of the function. The function is "calculate quantity Y across different subsampling scales of X". The function really doesn't care what quantity Y is. In the scientific context you might care, but we're talking about software design now. We can have a section called multiscale that demonstrates how to estimate a multiscale quantity. That's also why I am arguing that this "apply Y over coarse-grained views of X" shouldn't even be in this package, but in a different package. Entropies.jl only supplies the function that computes Y, and cares about nothing else.

It seems to me that multiscale is something that should be in our scientific scripts, not in a library that is even intended to be used by other libraries. The library should have the minimal methods, not integrated pipelines. That's at least my take on design.

@Datseris
Member Author

I guess we can agree on having a multiscale_... function in the "convenience wrappers" part of the documentation. Like we do for sampled_entropy or so. And the function actually does what I've described here? This is what you're suggesting at the end of the comment, right? #223 (comment)

@kahaaga
Member

kahaaga commented Dec 26, 2022

I don't have a clear mental model of the approach you describe yet. To address all these comments/thoughts properly, I need to have a longer think and test it out on some concrete examples.

I guess we can agree on having a multiscale_... function in the "convenience wrappers" part of the documentation. Like we do for sampled_entropy or so. And the function actually does what I've described here? This is what you're suggesting at the end of the comment, right? #223 (comment)

Yes, my point is that a simple wrapper multiscale/multiscale_normalized abstracts away all the non-beginner stuff (anonymous functions, viewers/iterators). If we keep that, then whatever internal/external machinery we use to actually compute the multiscale stuff doesn't matter for casual users. As you say, subsampling/multiscale averaging/rolling-func-stuff is completely generic, and not something that we should provide in this package.

I might end up totally agreeing with you on removing the wrappers too in the end, but I need some time and experimenting to conclude anything definitively.

Our focus now should be to get a stable release out, and leave internal/external implementation details for future work.

@Datseris
Member Author

OK, I'll write the code in my review PR. I will leave the wrappers. Please reply here though, because I still don't get what the difference between Regular and Composite is:

#223 (comment)

@Datseris
Member Author

BTW, yet another reason I find the function "multiscale" confusing. Its docstring says:

Compute the multi-scale entropy e with estimator est, or the complexity measure c, for timeseries x

Yet the function returns a vector of values. So it doesn't compute an entropy even though the name implies so. It computes several entropies, each one with a different input derived from x.

@kahaaga
Member

kahaaga commented Dec 26, 2022

BTW, yet another reason I find the function "multiscale" confusing. Its docstring says:

Compute the multi-scale entropy e with estimator est, or the complexity measure c, for timeseries x

Yet the function returns a vector of values. So it doesn't compute an entropy even though the name implies so. It computes several entropies, each one with a different input derived from x.

Yes, that is a good point. However, it is how most papers in the literature define it. As you point out, this is not a very clever choice of words. I think I've seen some authors actually define some weighted combination of the entropies at different levels as a multiscale entropy, though. So it varies. Which adds to the arguments for separating the multiscale sampling from the actual computations.

@kahaaga
Member

kahaaga commented Dec 26, 2022

OK, I'll write the code in my review PR. I will leave the wrappers. Please reply here though, because I still don't get what the difference between Regular and Composite is:

Ok, just give me a few moments, and I will try to explain.

@kahaaga
Member

kahaaga commented Dec 26, 2022

I think with the viewers it would basically look like this:

x = rand(1000)
vr = RegularDownsamplingViewer(x)
vc = CompositeDownsamplingViewer(x)

# give me, at scale `s`,  the `i`-th point among `N1 = {1, 2, 3, ..., i, ..., n_1 - 1, n_1}` points
# where n_1 indicates that indexing at a certain scale always starts with the first point
vr[s][i] 

 # give me, at scale `s`,  at start index `j`, the `i`-th point among the points `N_j = {j, j+1, j+2, ...}` 
vc[s][j][i]

So the difference is that for Composite, there is an extra index which controls where the rolling-downsampling-window starts.

@kahaaga
Member

kahaaga commented Dec 26, 2022

@Datseris I just remembered: These things are well-illustrated in the test/multiscale/downsampling.jl file. Just have a look there. The difference should be obvious (EDIT: and if it is not, just tag me here again)

@kahaaga
Member

kahaaga commented Dec 26, 2022

Something like this:

# `Regular` downsampling
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
r = Regular(f = mean)
# The second argument is the scale
downsample(r, 2, x) # gives [f(x[1:2]), f(x[3:4]), f(x[5:6]), f(x[7:8]), f(x[9:10])] = [1.5, 3.5, 5.5, 7.5, 9.5]
downsample(r, 3, x) # gives [f(x[1:3]), f(x[4:6]), f(x[7:9])] = [2.0, 5.0, 8.0]

# Composite downsampling
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
c = Composite(f = mean)
downsample(c, 2, x) # There are two unique ways of downsampling, depending on the start index, so I get two arrays
# [ [f(x[1:2]), f(x[3:4]), f(x[5:6]), f(x[7:8])], [f(x[2:3]), f(x[4:5]), f(x[6:7]), f(x[8:9])] ] = [ [1.5, 3.5, 5.5, 7.5], [2.5, 4.5, 6.5, 8.5] ]
downsample(c, 3, x) # There are three unique ways of downsampling, depending on the start index, so I get three arrays
# [ [f(x[1:3]), f(x[4:6])], [f(x[2:4]), f(x[5:7])], [f(x[3:5]), f(x[6:8])] ] = [ [2.0, 5.0], [3.0, 6.0], [4.0, 7.0] ]
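These numbers can be reproduced with a small standalone sketch (the helper names are hypothetical, not the package's actual functions; following the output above, the composite version truncates so that every start index yields the same number of windows):

```julia
using Statistics

# `Regular`-style coarse-graining: non-overlapping windows, always starting at index 1.
regular_downsample(x, s; f = mean) =
    [f(view(x, i:i+s-1)) for i in 1:s:length(x)-s+1]

# `Composite`-style: one coarse-grained series per start index j = 1..s,
# truncated to the same number of windows for every start.
function composite_downsample(x, s; f = mean)
    k = fld(length(x) - s + 1, s)  # windows per start index
    return [[f(view(x, j+(w-1)*s : j+(w-1)*s+s-1)) for w in 1:k] for j in 1:s]
end

x = collect(1:10)
regular_downsample(x, 2)    # [1.5, 3.5, 5.5, 7.5, 9.5]
regular_downsample(x, 3)    # [2.0, 5.0, 8.0]
composite_downsample(x, 2)  # [[1.5, 3.5, 5.5, 7.5], [2.5, 4.5, 6.5, 8.5]]
composite_downsample(x, 3)  # [[2.0, 5.0], [3.0, 6.0], [4.0, 7.0]]
```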

EDIT: In terms of the viewers, this would be

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
vr = Regular(x; f = mean)
vr[2] =  [1.5, 3.5, 5.5, 7.5, 9.5]
vr[3] = [2.0, 5.0, 8.0]

vc = Composite(x; f = mean)
vc[2][1] = [1.5, 3.5, 5.5, 7.5]
vc[2][2] = [2.5, 4.5, 6.5, 8.5]
vc[3][1] = [2.0, 5.0]
vc[3][2] = [3.0, 6.0]
vc[3][3] = [4.0, 7.0]

EDIT2: The Regular and Composite functions are by no means the only possible ways of multilevel sampling.

To allow completely generic dispatch, perhaps there should be some subtyping, e.g. RegularDownsampleViewer <: SingleViewPerLevelViewer and Composite <: MultipleViewsPerLevelViewer, or something like that, so that we can dispatch on SingleViewPerLevelViewer/MultipleViewsPerLevelViewer and automatically have compatibility with whatever downsampling scheme one would use.

EDIT 3:
Perhaps I'm confusing things, but I think this may even be relevant for #64. Wavelet decomposition is just another downsampling scheme, so we could in principle provide a WaveletDownsamplerViewer or something like that, although its implementation would be a bit more involved.

@Datseris
Member Author

@kahaaga can we please leave the multiscale stuff for v2.1? It needs more time to review and think through the implementation, and there is no obvious need for it in v2. I have finished reviewing everything else, and if we fix the doc examples we can move on to tag the v2 release (without multiscale exported or part of the online docs; we can leave it in the source for now). This would allow us to shift focus to finishing CausalityTools, which for me is a rather pressing matter. I need to work on the Stockholm workshop asap because I will be mostly gone the first week of January.

@Datseris Datseris removed this from the 2.0 milestone Dec 26, 2022
@kahaaga
Member

kahaaga commented Dec 26, 2022

@kahaaga can we please leave the multiscale stuff for v2.1? It needs more time to review and think through the implementation, and there is no obvious need for it in v2. I have finished reviewing everything else, and if we fix the doc examples we can move on to tag the v2 release (without multiscale exported or part of the online docs; we can leave it in the source for now). This would allow us to shift focus to finishing CausalityTools, which for me is a rather pressing matter. I need to work on the Stockholm workshop asap because I will be mostly gone the first week of January.

Yes, let's leave it as is, but not export it. That way, I can still develop upstream methods without checking out git dev branches, which hugely simplifies the workflow.

If you leave your review open, I'll have a final look, then we can rename the package and tag it.

@Datseris
Member Author

I haven't pushed my finished changes yet. Will do so by end of day today.

@kahaaga
Member

kahaaga commented Dec 26, 2022

I haven't pushed my finished changes yet. Will do so by end of day today.

Ok, no worries. Just tag me whenever your review is ready for review.

@kahaaga
Member

kahaaga commented Jul 20, 2023

Any updates on the window viewer package, @Datseris?

If not, can we go with the existing proposed API for now, and leave window viewers for a later minor/major version? I agree that the current approach is not optimal, but I'd very much like to have the multiscale stuff as part of the paper. We can always deprecate it later.

@Datseris
Member Author

Yes, we can go ahead with the viewer here, but I would argue that the API should be applyover(function, ViewerStruct) -> result::Vector, while the scales are obtained separately from the viewer struct, with a different function scales(ViewerStruct) -> scales::Vector. We can port it later to a common timeseries viewer package.
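A sketch of this split API, with hypothetical names and a regular coarse-graining viewer as a stand-in for the real implementation:

```julia
using Statistics

# Hypothetical viewer type: holds the timeseries and the scales to visit.
struct RegularDownsampler{T}
    x::Vector{T}
    scales::Vector{Int}
end

# The scales are queried separately from the results:
scales(v::RegularDownsampler) = v.scales

# `applyover` now returns only the per-scale results of `g`.
function applyover(g, v::RegularDownsampler; f = mean)
    map(v.scales) do s
        coarse = [f(view(v.x, (i-1)*s+1:i*s)) for i in 1:fld(length(v.x), s)]
        g(coarse)
    end
end

v = RegularDownsampler(rand(1000), [1, 2, 4, 8])
result = applyover(std, v)  # Vector with one entry per scale
scales(v)                   # [1, 2, 4, 8]
```

Keeping `scales` as a separate query means `applyover` composes directly with `map`-style pipelines, which is what would make it portable to a common viewer package.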

@kahaaga
Member

kahaaga commented Jun 5, 2024

Since we have now merged #404, I am closing this. If we ever decide to create this viewer package, we should open a new issue to discuss it.

@kahaaga kahaaga closed this as completed Jun 5, 2024