Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Off-by-one error? #1

Open
robbyblum opened this issue Nov 25, 2019 · 3 comments
Open

Off-by-one error? #1

robbyblum opened this issue Nov 25, 2019 · 3 comments

Comments

@robbyblum
Copy link

Hi,

So I installed this repository in order to check my implementation of sine-gap sampling against the one you defined and used in the 2015 JMR. I already knew that the "black box" versions of both might not be equivalent, because the task of choosing L to get a certain sampling factor is somewhat subtle (as shown by the fact that you do an optimization for that). But even so, I found that my method gives different results for the same scaling factor (I modified /src/seq.c to print the finalL*w factor) when using the suggested sine-gap function syntax. I'm working in 1D for all of this, which should make it simple, but there are still some weird issues.

For instance, if I do gaputil 0.25 128 'L * sin((pi/2) * (x + sum(O)) / sum(N))', it reports a final scaling factor of 6.259887, and nout=33 (which is "correct" given how nerr is defined, even though I would have expected 32). But it also prints 34 values, not 33. Furthermore, when I put that factor of 6.259887 into my sine-gap implementation, I get a sequence that's 41 points long, not 32.

So then I stepped back and tried the really simple uniform sampling case, where the gap function is just L, and still found really weird behavior. For instance, if I do gaputil 0.5 20 'L', I get this:

0
1
3
5
7
9
11
13
15
17
19

First of all, this is 11 values, not the expected 10. This is despite the fact that the code thinks nout = 10 and nerr = 0. Second, this isn't a uniform lattice! If it were just all the even numbers or all the odd numbers, it would be fine, but instead it stutter-steps over to the odd numbers after starting at 0. Similarly, if I do gaputil 0.25 20 'L', I get this:

0
3
7
11
15
19

Again, it's not actually uniform, and has 6 values instead of 5. This shows that it's probably not a case of "ignore the 0 at the start, this is actually 1-indexed", since it is perfectly willing to not choose point 1 when it doesn't want to. But I don't really know what the problem actually is.

If this is present for uniform sampling, I have to assume something similar is present when I use the sine-gap function instead. But I really have no idea what the implications of this would be.

Anyway, hopefully this is useful in some way. Sorry to open a long, overly-verbose issue three years after you last touched this repository! Let me know if you figure out any explanation for this, or if you want to see my version of the sine-gap function (though I do suspect there's a problem upstream of that).

Best,
Robby Blum

@geekysuavo
Copy link
Owner

Hi Robby,

I'm not quite sure what the source of your issue is, and unfortunately I don't have the bandwidth to investigate it in more depth. If you do find a concrete failure, where the sampler is actually doing something incorrect, I am happy to review a merge/pull request.

The logic in seq(), allows for some wiggle-room in the output sequence lengths, so it's very likely operating as expected.

As for the zero at the start of all your sequences, I believe it may be treated specially by the code: sampling the very first point of an indirect-dimension trace was claimed to give you clear skin and a fantastic love life at the time. :)

Finally, as an aside, I wouldn't recommend using gaputil for uniform sampling. pdfutil or rejutil are much better-suited.

To satisfy my own curiosity, why do you need to write your own sine-gap implementation?

Best regards,
~ Brad.

@robbyblum
Copy link
Author

Thanks for the reply! This isn't a very high-priority issue for me either, I just needed to get it all out in one place so that maybe someday it could be resolved. I spent all day down this rabbit hole instead of writing the relevant section of my dissertation :)

Yeah I'm fine with the wiggle-room in the lengths, it makes perfect sense given the implementation.

In all the NUS work that I've done, I've also stuck to the rule of always sampling the first indirect-dimension time point. Nailing down the integral of the spectrum is useful! Which is why I'm confused about it saying "nout = 33" and giving 34 numbers including the zero.

I'll take a look at the other functions at some point; I was just using uniform sampling as a test case for gaputil, since it was pointed out in the readme. In an ideal world I would love them all to be able to achieve certain "special" cases (like in my example where I tried to get it to just do 50% uniform undersampling), but I also probably do not have the bandwidth to untangle this right now (see above re: dissertation). If I sort it out I will definitely submit a PR, though.

As for why I rolled my own sine-gap sampling: for a paper we published this year (https://link.springer.com/article/10.1007/s10858-019-00262-4), where we went in depth to try to squeeze predictable behavior out of quasi-even sampling (i.e., more or less uniform), one of the reviewers asked if we had looked at other deterministic methods such as sine-gap. I probably should have taken a look at these programs to test it out, but for whatever reason instead I just wrote my own function based off of the equations in your 2015 JMR. So, I guess I didn't have to write my own, I just did because I was stressed out enough that I missed the link to your code at the end of the paper! Unless there's some other error, the main difference is that I implemented it without any wiggle room on number of sampled points, because I needed to be able to reliably generate and test the sampling patterns for every possible sampling fraction of a 128-point dense dimension.

@geekysuavo
Copy link
Owner

Very interesting! I'm starting to read through your paper now.

Which is why I'm confused about it saying "nout = 33" and giving 34 numbers including the zero.

It could be as simple as the struct bst_t not counting the root node in its n member variable? Like I said, I'm unsure without a deeper dive into the code and its internal state at runtime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants