
Optimization: Potential speedup for nth_powerfree #112

Closed
danaj opened this issue May 19, 2024 · 0 comments

danaj commented May 19, 2024

Once we have an initial estimate and the associated count, we can make a very good second estimate by applying the asymptotic Qk(n) ~ n / zeta(k) to the error term.

For instance, we make an initial estimate either using n * zeta(k) or using the midpoint (which in most cases ends up being the same). We calculate the count. Suppose it comes back, say, 20000 too low. Then estimate + 20000 * zeta(k) is going to be very close to the correct value.

We could repeat this one or more times until we're very close (in my tests the very first correction reduces the error from, say, 30,000 to about 60, which is close enough). No need to bracket.
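The refinement loop above can be sketched in Python. Everything here (the naive Moebius-sum count, the partial-sum zeta, and the final linear walk to the exact value) is an illustration of the idea under my own assumptions, not the library's implementation:

```python
import math

def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    if n == 1:
        return 1
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:          # repeated prime factor => mu = 0
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def powerfree_count(n, k):
    """Q_k(n) = sum over d <= n^(1/k) of mu(d) * floor(n / d^k)."""
    total, d = 0, 1
    while d ** k <= n:
        total += mobius(d) * (n // d ** k)
        d += 1
    return total

def is_kfree(v, k):
    """True if no k-th power > 1 divides v."""
    p = 2
    while p ** k <= v:
        if v % (p ** k) == 0:
            return False
        p += 1
    return True

def zeta(k, terms=10000):
    """Partial-sum approximation of zeta(k); fine for k >= 2 here."""
    return sum(1.0 / m ** k for m in range(1, terms + 1))

def nth_powerfree(n, k, rounds=3):
    """Estimate, then repeatedly correct by (n - count) * zeta(k);
    finish with a short linear walk to the exact n-th k-free number."""
    zk = zeta(k)
    v = math.ceil(n * zk)            # initial estimate ~ n * zeta(k)
    for _ in range(rounds):
        c = powerfree_count(v, k)
        if c == n:
            break
        v += round((n - c) * zk)     # scale the count error back up
    c = powerfree_count(v, k)
    while c < n:                     # walk up to the n-th k-free number
        v += 1
        c += is_kfree(v, k)
    while c > n or not is_kfree(v, k):
        c -= is_kfree(v, k)
        v -= 1
    return v
```

In practice only the first correction does significant work; the closing walk is a few steps at most for the sizes tried here.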

Alternatively, use this in the binary search. If count < n, mid is our new lo value, as in the current code; but if hi > mid + 2 * zeta(k) * error, then set hi to that bound. Do the same for the count > n case. (1) This means the next midpoint will be the expected approximation. (2) We still want to prove this is actually a proper bound. I believe it is an overestimate (which we need for a bracket), but I haven't sat down to do the math.
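A hedged Python sketch of the bracketed variant. The mid +/- 2 * zeta(k) * error caps follow my reading of the proposal; the extra additive slack is my own safety margin, since (as noted above) the bound is unproven:

```python
import math

def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    if n == 1:
        return 1
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def powerfree_count(n, k):
    """Q_k(n) = sum over d <= n^(1/k) of mu(d) * floor(n / d^k)."""
    total, d = 0, 1
    while d ** k <= n:
        total += mobius(d) * (n // d ** k)
        d += 1
    return total

def zeta(k, terms=10000):
    """Partial-sum approximation of zeta(k); fine for k >= 2 here."""
    return sum(1.0 / m ** k for m in range(1, terms + 1))

def nth_powerfree_bsearch(n, k):
    """Binary search for the n-th k-powerfree number, shrinking the
    bracket toward mid +/- zeta(k) * error.  The 2x factor and the
    +32 slack are heuristic safety margins, not proven bounds."""
    zk = zeta(k)
    lo, hi = n, 2 * math.ceil(n * zk)       # the n-th k-free is >= n
    while lo < hi:
        mid = (lo + hi) // 2
        c = powerfree_count(mid, k)
        if c < n:
            lo = mid + 1
            cap = mid + math.ceil(2 * zk * (n - c)) + 32
            if hi > cap:
                hi = cap                    # tightened upper bound
        else:
            hi = mid
            cap = mid - math.ceil(2 * zk * (c - n)) - 32
            if lo < cap:
                lo = cap                    # tightened lower bound
    return lo
```

The search converges to the smallest x with Q_k(x) >= n, which is the n-th k-powerfree number, while the caps make each new midpoint land near the error-scaled approximation.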

In powerfree_count, we might be able to save some time by pulling small constant values out of the tail. This is a half-way version of the full squarefree_count optimization. I'm not sure it's really worth doing in the C code, but in my Perl code it is a big deal, as the simple bigmath operations are terribly expensive (Sidef beautifully avoids this by using Math::GMPz). The final values can be handled either by summing the Moebius values in their range or by two calls to mertens (a version of mertens that did this in a single call would be nice).
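To illustrate the tail idea, here is a sketch (the cutoff parameter T and the naive Mertens prefix table are my choices) that sums the Moebius values over each run of d where floor(n / d^k) is a small constant t, using Mertens differences instead of per-d terms:

```python
def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    if n == 1:
        return 1
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def integer_nth_root(x, k):
    """Largest r with r^k <= x (x >= 0)."""
    if x < 1:
        return 0
    r = int(round(x ** (1.0 / k)))
    while r ** k > x:
        r -= 1
    while (r + 1) ** k <= x:
        r += 1
    return r

def powerfree_count(n, k):
    """Plain Q_k(n): one Moebius term per d."""
    total, d = 0, 1
    while d ** k <= n:
        total += mobius(d) * (n // d ** k)
        d += 1
    return total

def powerfree_count_tail(n, k, T=8):
    """Q_k(n), but the tail (where floor(n / d^k) == t < T) is summed
    one run at a time via Mertens (Moebius prefix-sum) differences."""
    D = integer_nth_root(n, k)
    M = [0] * (D + 1)                     # M[d] = Mertens(d), naive table
    for d in range(1, D + 1):
        M[d] = M[d - 1] + mobius(d)
    cutoff = integer_nth_root(n // T, k)  # d <= cutoff: floor >= T
    total = sum(mobius(d) * (n // d ** k) for d in range(1, cutoff + 1))
    for t in range(1, T):                 # runs where floor(n/d^k) == t
        hi = integer_nth_root(n // t, k)
        lo = integer_nth_root(n // (t + 1), k) + 1
        if lo <= hi:
            total += t * (M[hi] - M[lo - 1])
    return total
```

With bigmath each avoided division is a win, which is the point of the half-way optimization; the two Mertens evaluations per run replace an entire range of per-d terms.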

@trizen trizen closed this as completed in 94bfebe May 30, 2024
trizen added a commit that referenced this issue May 30, 2024
Inspired by Dana Jacobsen's ideas from: #112

But a bit simpler:

	v = floor(zeta(k)*n)			# initial approximation
	c = k.powerfree_count(v)		# count of k-powerfree numbers <= v
	v += n - c				# adjust v by the difference (n-c)
	c = k.powerfree_count(v)		# recompute the count

For large values of `k`, most numbers are k-powerfree, so the approximation is very good. Most of the time it actually computes the exact value (i.e., c == n).

However, it's only beneficial for very large values of `n`. Otherwise, it's slower to compute `k.powerfree_count(v)` twice.

Example:

	say nth_powerfree(1e50, 10)	# takes 0.3s  (previously it took 45 seconds -- ouch)
	say nth_powerfree(1e50,  7)	# takes 14.6s (previously it took almost forever)

	say nth_powerfree(1e40,  7)	# takes 0.6s  (previously it took 7.4 seconds)
	say nth_powerfree(1e40, 10)	# takes 0.2s  (previously it took 1.2 seconds)

	say nth_powerfree(1e45, 10)	# takes 0.2s  (previously it took 4.1 seconds)
	say nth_powerfree(1e45,  8)	# takes 0.6s  (previously it took 12 seconds)
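The commit's scheme can be mirrored in Python as a rough sketch. The naive Moebius-sum powerfree_count and the trailing fix-up walk are my additions; the real code computes the count far more efficiently and at bignum sizes:

```python
def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    if n == 1:
        return 1
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def powerfree_count(n, k):
    """Q_k(n) = sum over d <= n^(1/k) of mu(d) * floor(n / d^k)."""
    total, d = 0, 1
    while d ** k <= n:
        total += mobius(d) * (n // d ** k)
        d += 1
    return total

def is_kfree(v, k):
    """True if no k-th power > 1 divides v."""
    p = 2
    while p ** k <= v:
        if v % (p ** k) == 0:
            return False
        p += 1
    return True

def zeta(k, terms=10000):
    """Partial-sum approximation of zeta(k); fine for k >= 2 here."""
    return sum(1.0 / m ** k for m in range(1, terms + 1))

def nth_powerfree(n, k):
    v = int(zeta(k) * n)            # initial approximation
    c = powerfree_count(v, k)       # count of k-powerfree numbers <= v
    v += n - c                      # adjust v by the difference (n - c)
    c = powerfree_count(v, k)       # recompute the count
    while c < n:                    # fix-up walk; usually zero or a few steps
        v += 1
        c += is_kfree(v, k)
    while c > n or not is_kfree(v, k):
        c -= is_kfree(v, k)
        v -= 1
    return v
```

As the commit notes, after the single adjustment the recomputed count usually already equals n, so the fix-up loops rarely move v at all.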