Multiscale re-thinking #223
This makes sense.
I don't agree with this. I think
Moreover, the current API is consistent with the remaining ecosystem, which is "estimate …". The following, however:

```julia
# for multiscale normalized wavelet entropy
f3 = x -> entropy_normalized(WaveletOverlap(x), x)
# now apply to get multiscale
scales, result = applyover(f3, viewer)
```

is harder to find in the docs, because it is not attached to a well-known name in the literature, cannot be found using auto-completion in the REPL, requires the user to learn and understand anonymous functions (not a beginner thing at all), and to learn this obscure (not for us, but for regular Julia users) pattern. The approach you describe is totally fine, because it makes ecosystem compatibility and future extensions much easier. But I think it should remain as an implementation detail. EDIT: that means that I'm fine with removing it. |
Btw, would there be any significant cost to defining these viewers repeatedly? I guess defining them as iterators completely eliminates the existing allocations. Say I have an EEG dataset with 45 channels and 100,000 data points, and I want to run multiple experiments at 10 different scale levels, with N surrogates at each level, and so on. |
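The allocation question above can be made concrete with a small sketch. This is Python for illustration only (the actual viewers under discussion would be Julia types); `coarse_grain` is a hypothetical stand-in showing how an iterator-based viewer avoids materializing each downsampled series:

```python
def coarse_grain(x, s, f=lambda b: sum(b) / len(b)):
    """Lazily yield the scale-`s` coarse-graining of `x`, one aggregated
    point at a time. Only a single length-`s` block is live at any moment,
    so the full downsampled series is never allocated up front."""
    for i in range(len(x) // s):
        yield f(x[i * s:(i + 1) * s])

# e.g. one long channel, viewed at scale 10 without building a new array
channel = [float(i % 7) for i in range(100_000)]
total = sum(coarse_grain(channel, 10))  # consumes the lazy view
```

Re-creating such a generator per experiment is essentially free; the cost that matters is whatever `f` does per block.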
I don't get this argument about naming... There will be a section named "multiscale" in the docs. Why do you need a function?
Generally speaking, books/tutorials on software design advise that having a lot of input arguments is not a good plan, and one should keep them as few as possible. Your
Yes, the viewers have significant performance benefits; we've already measured that in the other package. |
I am confused here, and I was in fact already confused by the docs. I thought
It isn't in my experience. A function with 4 input arguments, all of which decide the behavior of the function, is not at all something easy. Especially because it hides the fact that all these input arguments do not, in fact, decide the behavior of the function. The function is "calculate quantity Y across different subsampling scales of X". The function really doesn't care what quantity Y is. In the scientific context you might care, but we're talking about software design now.

We can have a section called "multiscale" that demonstrates how to estimate a multiscale quantity. That's also why I am arguing that this "apply Y over coarse-grained views of X" shouldn't even be in this package, but in a different package. Entropies.jl only supplies the function that computes Y, and cares about nothing else. It seems to me that
I guess we can agree on having a
I don't have a clear mental model of the approach you describe yet. To address all these comments/thoughts properly, I need to have a longer think and test it out on some concrete examples.
Yes, my point is that a simple wrapper would suffice. I might end up totally agreeing with you on removing the wrappers too in the end, but I need some time and experimenting to conclude anything definitively. Our focus now should be to get a stable release out, and leave internal/external implementation details for future work. |
Ok, I'll write the code in my review PR. I will leave the wrappers. Please reply here though, because I still don't get what the difference between `Regular` and `Composite` is: |
BTW, yet another reason I find the function "multiscale" confusing. Its docstring says:
Yet the function returns a vector of values. So it doesn't compute an entropy, even though the name implies so. It computes several entropies, each one with a different input derived from the input timeseries. |
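The "one value per scale" point can be demonstrated with a small sketch. Python for illustration; `multiscale`, the coarse-graining rule, and the choice of variance as the per-scale quantity are all hypothetical here, just mirroring the behavior under discussion:

```python
import random

def multiscale(quantity, x, scales):
    """Despite the singular name, this returns one value of `quantity`
    per scale, each computed on a regular coarse-graining of `x`
    (hypothetical sketch of the discussed behavior)."""
    def coarse(x, s):
        return [sum(x[i * s:(i + 1) * s]) / s for i in range(len(x) // s)]
    return [quantity(coarse(x, s)) for s in scales]

def variance(v):
    m = sum(v) / len(v)
    return sum((vi - m) ** 2 for vi in v) / len(v)

random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(10_000)]
# a vector of results, not one result: for white noise the variance
# at scale s shrinks roughly like 1/s
vs = multiscale(variance, x, scales=[1, 2, 4, 8])
```

So "multiscale entropy" is really a family of entropies indexed by scale, which is the naming confusion raised above.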
Yes, that is a good point. However, it is how most papers in the literature define it. As you point out, this is not a very clever choice of words. I think I've seen some authors actually define some weighted combination of the entropies at different levels as a multiscale entropy, though. So it varies, which adds to the arguments for separating the multiscale sampling from the actual computations. |
Ok, just give me a few moments, and I will try to explain. |
I think with the viewers it would basically look like this:

```julia
x = rand(1000)
vr = RegularDownsamplingViewer(x)
vc = CompositeDownsamplingViewer(x)

# give me, at scale `s`, the `i`-th point among `N1 = {1, 2, 3, ..., i, ..., n_1 - 1, n_1}` points,
# where n_1 indicates that indexing at a certain scale always starts with the first point
vr[s][i]

# give me, at scale `s`, at start index `j`, the `i`-th point among the points `N_j = {j, j+1, j+2, ...}`
vc[s][j][i]
```

So the difference is that for
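The indexing semantics `vr[s][i]` vs `vc[s][j][i]` can be sketched with lazy view objects. Python for illustration (0-based, whereas the Julia snippets are 1-based); the class names and the default `mean` aggregator are assumptions, not an existing API:

```python
class ScaleView:
    """Lazy view of x at one scale and start offset: v[i] computes the
    i-th coarse-grained point on demand, allocating only one block."""
    def __init__(self, x, f, s, start):
        self.x, self.f, self.s, self.start = x, f, s, start
    def __getitem__(self, i):
        a = self.start + i * self.s
        return self.f(self.x[a:a + self.s])

class RegularViewer:
    """vr[s][i]: i-th point at scale s, always starting from the first point."""
    def __init__(self, x, f=lambda b: sum(b) / len(b)):
        self.x, self.f = x, f
    def __getitem__(self, s):
        return ScaleView(self.x, self.f, s, start=0)

class CompositeViewer:
    """vc[s][j][i]: i-th point at scale s when coarse-graining starts at offset j."""
    def __init__(self, x, f=lambda b: sum(b) / len(b)):
        self.x, self.f = x, f
    def __getitem__(self, s):
        return [ScaleView(self.x, self.f, s, start=j) for j in range(s)]

x = list(range(1, 11))
vr = RegularViewer(x)
vc = CompositeViewer(x)
vr[2][0]     # 1.5, i.e. mean of x[0:2]
vc[3][2][1]  # 7.0  (offset j=2 here corresponds to Julia's start index 3)
```

So `Regular` exposes one series per scale, while `Composite` exposes one series per (scale, start offset) pair.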
@Datseris I just remembered: These things are well-illustrated in the |
Something like this:

```julia
# `Regular` downsampling
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
r = Regular(f = mean)
# The second argument is the scale
downsample(r, 2, x) # gives [f(x[1:2]), f(x[3:4]), f(x[5:6]), f(x[7:8]), f(x[9:10])] = [1.5, 3.5, 5.5, 7.5, 9.5]
downsample(r, 3, x) # gives [f(x[1:3]), f(x[4:6]), f(x[7:9])] = [2.0, 5.0, 8.0]

# `Composite` downsampling
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
c = Composite(f = mean)
downsample(c, 2, x) # There are two unique ways of downsampling, depending on the start index, so I get two arrays:
# [[f(x[1:2]), f(x[3:4]), f(x[5:6]), f(x[7:8])], [f(x[2:3]), f(x[4:5]), f(x[6:7]), f(x[8:9])]] = [[1.5, 3.5, 5.5, 7.5], [2.5, 4.5, 6.5, 8.5]]
downsample(c, 3, x) # There are three unique ways of downsampling, depending on the start index, so I get three arrays:
# [[f(x[1:3]), f(x[4:6])], [f(x[2:4]), f(x[5:7])], [f(x[3:5]), f(x[6:8])]] = [[2.0, 5.0], [3.0, 6.0], [4.0, 7.0]]
```

EDIT: In terms of the viewers, this would be

```julia
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
vr = Regular(x; f = mean)
vr[2] == [1.5, 3.5, 5.5, 7.5, 9.5]
vr[3] == [2.0, 5.0, 8.0]

vc = Composite(x; f = mean)
vc[2][1] == [1.5, 3.5, 5.5, 7.5]
vc[2][2] == [2.5, 4.5, 6.5, 8.5]
vc[3][1] == [2.0, 5.0]
vc[3][2] == [3.0, 6.0]
vc[3][3] == [4.0, 7.0]
```

EDIT2: To allow completely generic dispatch, perhaps there should be some subtyping. EDIT 3: |
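Those worked examples pin down the semantics exactly, so here is an executable translation. Python for illustration only; `downsample_regular`/`downsample_composite` are hypothetical names, and the truncation of composite offsets to an equal number of points is read off the example outputs above, not taken from any existing API:

```python
def downsample_regular(f, scale, x):
    """Aggregate consecutive non-overlapping blocks of length `scale`,
    starting from the first point of `x`."""
    return [f(x[i * scale:(i + 1) * scale]) for i in range(len(x) // scale)]

def downsample_composite(f, scale, x):
    """One regular downsampling per start offset 0..scale-1, truncated so
    every offset contributes the same number of coarse-grained points."""
    per_start = [downsample_regular(f, scale, x[j:]) for j in range(scale)]
    n = min(len(r) for r in per_start)
    return [r[:n] for r in per_start]

mean = lambda b: sum(b) / len(b)
x = list(range(1, 11))
downsample_regular(mean, 2, x)    # [1.5, 3.5, 5.5, 7.5, 9.5]
downsample_composite(mean, 3, x)  # [[2.0, 5.0], [3.0, 6.0], [4.0, 7.0]]
```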
@kahaaga can we please leave the multiscale stuff for v2.1? It needs more time to review and think through the implementation, and there is no obvious need for it in v2. I have finished reviewing everything else, and if we fix the doc examples we can move on to tag the v2 release (without multiscale exported or part of the online docs; we can leave it in the source for now). This would allow us to shift focus to finishing CausalityTools, which for me is a rather pressing matter. I need to work on the Stockholm workshop asap because I will be mostly gone the first week of January. |
Yes, let's leave it as is, but not export it. That way, I can still develop upstream methods without checking out git dev branches, which hugely simplifies the workflow. If you leave your review open, I'll have a final look, then we can rename the package and tag it. |
I haven't pushed my finished changes yet. Will do so by end of day today. |
Ok, no worries. Just tag me whenever your changes are ready for review. |
Any updates on the window viewer package, @Datseris? If not, can we go with the existing proposed API for now, and leave window viewers for a later minor/major version? I agree that the current approach is not optimal, but I'd very much like to have the multiscale stuff as part of the paper. We can always deprecate it later. |
Yes, we can go ahead with the viewer here, but I would argue that the API should be
Since we have now merged #404, I am closing this. If we ever decide to create this viewer package, we should open a new issue to discuss it. |
Alright, the last thing before I close my V2 review is the multiscale stuff. As I've told you, we have another package that creates window viewers: it wants to apply some computation over views (specifically, rolling windows) of some timeseries. We decided that it doesn't make sense to offer a specialized function for doing this. What makes sense is to offer a type that produces views over a timeseries and just use `map` to apply the computation. I propose to do the same thing here, so that we can move the multiscale viewers out into a future package `TimeseriesViewers` or something like that. Here is what I propose:

- Viewer structs like `Regular` take as input a timeseries, besides their other keywords.
- `applyover(function, ViewerStruct) -> scales, result`. This function simply applies a computational function over the views the viewer iterates over.
- `applyover` is identical to `map` for rolling-window-like viewers that will be in the TimeseriesViewers package. However, for our "coarse-graining" viewers, `applyover` first applies the computation function at the given down-sampled scale, but then also averages it (or computes the variance, depending on what's given as the aggregator).
- `downsample`, `multiscale`, `multiscale_normalized` are deleted!!! If you want the multiscale quantity of something, you apply your function over the viewer with `applyover`.

Thoughts?
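The proposed `applyover` semantics (apply the function at each scale, then aggregate across a composite viewer's offsets) could be sketched as follows. Python for illustration; the dict-of-scales stand-in for a viewer struct and the mean default aggregator are assumptions about the proposal, not a real implementation:

```python
def applyover(func, viewer, aggregate=lambda vals: sum(vals) / len(vals)):
    """Hypothetical sketch of `applyover(f, viewer) -> scales, result`.
    `viewer` maps each scale to either one coarse-grained series (regular)
    or a list of per-offset series (composite); in the composite case the
    per-offset values of `func` are aggregated into one number per scale."""
    scales, result = [], []
    for s, views in viewer.items():
        scales.append(s)
        if views and isinstance(views[0], list):   # composite: several offset series
            result.append(aggregate([func(v) for v in views]))
        else:                                      # regular: a single series
            result.append(func(views))
    return scales, result

# toy viewer dict standing in for the viewer struct
viewer = {2: [1.5, 3.5, 5.5, 7.5, 9.5],
          3: [[2.0, 5.0], [3.0, 6.0], [4.0, 7.0]]}
scales, result = applyover(max, viewer)
# scale 2 gives max directly; scale 3 averages the three per-offset maxima
```

This keeps the scale-iteration and aggregation logic in one generic function, so the entropy package only has to supply `func`.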