The Faure sequence is wrong #2653

mbaudin47 · 2024-05-11T10:35:58Z

What happened?

The Faure sequence generates wrong points in any dimension.

Here are the proofs of this.

In dimension 2, we see that a set of 4 points does not correctly fill the 4 base-2 elementary volumes.
In dimension 3, we see that the first 9 points generated are wrong. The basis is obviously equal to 2 instead of 3. The third dimension is a copy of the first dimension.
The estimated value of the mean of the GSobol' function does not converge to its true value.

Many thanks to Félix Husson (EDF & Institut Galilée) for pointing out this bug.

One reason the bug went undetected is the way the unit test t_FaureSequence_std.cxx is implemented. Since the algorithm produces a set of points, the simplest and most efficient unit test checks the coordinates of these points against a pre-defined reference Sample. This reference sample can be computed with another library, e.g. Matlab, Scilab, Mathematica, etc. Several dimensions, e.g. 2, 3 and 4, should be checked. Instead, the first part of the current unit test checks the library against itself, which cannot detect any bug.

The second part of the unit test estimates the number $\pi$. Unfortunately, the algorithm seems to converge, with a relative accuracy equal to $10^{-4}$, so the test passes. This part of the test is of poor value: the fact that the estimate converges is only weak evidence that the method is correct. What we would expect is a correct rate of convergence (which is not the case). Finally, the sample size is equal to 1000, instead of being equal to a power of the basis, i.e. a power of 2 in this particular case.
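As a rough illustration of the reference-based check described above, here is a minimal pure-Python sketch. The helpers `max_abs_error` and `matches_reference` are hypothetical names, not part of OpenTURNS or of the existing unit test; the reference values are the first three points quoted in Proof #2 and the faulty values are the first three rows printed there.

```python
def max_abs_error(sample, reference):
    """Largest coordinate-wise absolute difference between two point sets."""
    if len(sample) != len(reference):
        raise ValueError("sample and reference must have the same size")
    return max(
        abs(x - r)
        for point, ref_point in zip(sample, reference)
        for x, r in zip(point, ref_point)
    )


def matches_reference(sample, reference, tolerance=1e-12):
    """True when every coordinate agrees with the reference within tolerance."""
    return max_abs_error(sample, reference) <= tolerance


# Reference: first points in dimension 3 quoted in Proof #2 (base 3).
reference = [[1 / 3, 1 / 3, 1 / 3], [2 / 3, 2 / 3, 2 / 3], [1 / 9, 4 / 9, 7 / 9]]
# First rows printed by the library in Proof #2 (base 2 by mistake).
faulty = [[0.5, 0.5, 0.5], [0.25, 0.75, 0.25], [0.75, 0.25, 0.75]]

print(matches_reference(reference, reference))  # a correct generator passes
print(matches_reference(faulty, reference))     # the faulty output is rejected
```

A check of this kind fails as soon as one coordinate is off, which a self-comparison or a loose convergence test does not guarantee.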

How to reproduce the issue?

Proof #1: In dimension 2

import openturns as ot
import openturns.usecases.ishigami_function as ishigami
import openturns.viewer as viewer
import matplotlib.pyplot as plt

basis = 2
dimension = 2
taille = basis**dimension
distribution = ot.ComposedDistribution([ot.Uniform(0.0, 1.0)] * dimension)
sequence = ot.FaureSequence()
experiment_QMC = ot.LowDiscrepancyExperiment(sequence, distribution, taille, False)
sample = experiment_QMC.generate()
fig = viewer.PlotDesign(
    sample,
    subdivisions=[basis] * dimension,
    bounds=ot.Interval([0.0] * dimension, [1.0] * dimension),
)
plt.title(f"n = {taille}")
fig.set_figwidth(3.0)
fig.set_figheight(3.0)

Figure 1. The first 4 points of the Faure sequence in dimension 2. There should be exactly one point in each elementary volume, but the upper left cell contains 2 points.

Proof #2: In dimension 3

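The script below defines the first points of the Faure sequence in dimension 3. This is from ¹. Only 8 points are listed: together with the origin (0, 0, 0), which is skipped here, they form the first 9 points of the sequence.

# First points of the Faure sequence in dimension 3 (origin skipped)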
expected = ot.Sample([
    [1 / 3, 1 / 3, 1 / 3],
    [2 / 3, 2 / 3, 2 / 3],
    [1 / 9, 4 / 9, 7 / 9],
    [4 / 9, 7 / 9, 1 / 9],
    [7 / 9, 1 / 9, 4 / 9],
    [2 / 9, 8 / 9, 5 / 9],
    [5 / 9, 2 / 9, 8 / 9],
    [8 / 9, 5 / 9, 2 / 9],
])

Now compare to the result from OpenTURNS.

dimension = 3
taille = 9
distribution = ot.ComposedDistribution([ot.Uniform(0.0, 1.0)] * dimension)
sequence = ot.FaureSequence()
experiment_QMC = ot.LowDiscrepancyExperiment(sequence, distribution, taille, False)
sample = experiment_QMC.generate()
sample

which prints:

	y0	y1	y2
0	0.5	0.5	0.5
1	0.25	0.75	0.25
2	0.75	0.25	0.75
3	0.125	0.625	0.125
4	0.625	0.125	0.625
5	0.375	0.375	0.375
6	0.875	0.875	0.875
7	0.0625	0.9375	0.0625
8	0.5625	0.4375	0.5625

We see that the basis is equal to 2 instead of 3. We see that the third dimension is a copy of the first one.
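To see how the choice of base alone explains both symptoms, here is a pure-Python sketch of the textbook digit construction of the Faure sequence (coordinate k applies the k-th power of the Pascal matrix, modulo the base, to the digits of the index). This is an illustration under that assumption, not the OpenTURNS implementation; `faure_point` is a hypothetical helper, and index 0 corresponds to the origin.

```python
from fractions import Fraction
from math import comb


def faure_point(index, dimension, base):
    """Return point number `index` of a Faure-type sequence as exact fractions."""
    # Digits of the index in the given base, least significant digit first.
    digits = []
    n = index
    while n > 0:
        n, r = divmod(n, base)
        digits.append(r)
    point = []
    for k in range(dimension):
        x = Fraction(0)
        for i in range(len(digits)):
            # Digit i of coordinate k: row i of the k-th Pascal matrix power, mod base.
            y = sum(
                comb(j, i) * k ** (j - i) * digits[j]
                for j in range(i, len(digits))
            ) % base
            x += Fraction(y, base ** (i + 1))
        point.append(x)
    return point


# Compare base 3 (correct for dimension 3) with base 2 (the faulty choice).
for n in range(1, 10):
    base3 = [str(x) for x in faure_point(n, 3, 3)]
    base2 = [float(x) for x in faure_point(n, 3, 2)]
    print(n, base3, base2)
```

With base 3 the sketch gives the reference points listed above for indices 1 to 8; with base 2 it gives the table printed by the library. In base 2 every off-diagonal term of the third coordinate carries an even factor, so it vanishes modulo 2 and the third coordinate equals the first.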

Proof #3: Convergence on GSobol'

Notice that the lack of convergence of the faulty Faure sequence is not so obvious on the Ishigami function, because the third input variable is not very influential. Only the speed of convergence is affected there.

We consider the GSobol' test function and compute its mean.

a=[0.0, 9.0, 99.0]
def GSobolModel(X):
    X = ot.Point(X)
    d = X.getDimension()
    Y = 1.0
    for i in range(d):
        Y *= (abs(4.0 * X[i] - 2.0) + a[i]) / (1.0 + a[i])
    return ot.Point([Y])

gSobolFunction = ot.PythonFunction(dimension, 1, GSobolModel)
gSobolFunction.setOutputDescription(["Y"])

# Define the distribution
distributionList = [ot.Uniform(0.0, 1.0) for i in range(dimension)]
distribution = ot.ComposedDistribution(distributionList)

name = "GSobol"

# Compute Mean
gSobolMean = 1.0


dimension = 3
basis = 3  # ie the smallest prime number greater or equal to the dimension
sampleSize = basis ** 7  # This is significant!
sequence = ot.FaureSequence()
experiment_QMC = ot.LowDiscrepancyExperiment(sequence, im.distributionX, sampleSize, False)
inputSample = experiment_QMC.generate()
outputSample = gSobolFunction(inputSample)
print(f"n = {sampleSize} , estimate = {outputSample.computeMean()[0]}")
print(f"Expected = {gSobolMean}")

We get:

n = 2187 , estimate = 10.722134899604974
Expected = 1.0
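The expected value 1.0 comes from the fact that, for independent inputs uniform on [0, 1], the mean of the product is the product of the means of the factors, and each factor has mean 1. A minimal numerical check, using a midpoint rule (`factor_mean` is a hypothetical helper written for this note):

```python
def factor_mean(a_i, cells=1000):
    """Midpoint-rule average of (|4x - 2| + a_i) / (1 + a_i) over [0, 1]."""
    total = 0.0
    for k in range(cells):
        x = (k + 0.5) / cells  # midpoint of cell k
        total += (abs(4.0 * x - 2.0) + a_i) / (1.0 + a_i)
    return total / cells


a = [0.0, 9.0, 99.0]
mean = 1.0
for a_i in a:
    # Independent inputs: the mean of the product is the product of the means.
    mean *= factor_mean(a_i)
print(mean)
```

An even number of cells puts the kink at x = 1/2 on a cell boundary, so the midpoint rule is exact here up to rounding.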

Version

1.22

Operating System

unknown

Installation media

unknown

Additional Context

No response

¹ "Low Discrepancy Toolbox Manual", Michael Baudin. Version 0.1. May 2013. https://gitlab.com/scilab/forge/lowdisc

regislebrun · 2024-05-11T20:37:47Z

@mbaudin47 Nice catch, and thanks for the detailed analysis. The faulty part seems to be in LowDiscrepancySequence::GetNextPrimeNumber(n), which is supposed to return the first prime number greater than or equal to n, but returns the first prime number greater than or equal to n-1 when primesieve is used (see https://github.com/kimwalisch/primesieve/blob/master/doc/CPP_API.md#primesieveiteratorjump_to-since-primesieve-110).
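To make the off-by-one concrete, here is a small Python sketch of the behaviour described in this comment. It is not the library's C++ code: `next_prime_at_least` and `shifted_next_prime` are hypothetical helpers, and the second one only mimics the reported symptom (a search that effectively starts from n-1 instead of n).

```python
def is_prime(n):
    """Trial-division primality test, sufficient for small integers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def next_prime_at_least(n):
    """Smallest prime p with p >= n (the intended behaviour)."""
    p = max(n, 2)
    while not is_prime(p):
        p += 1
    return p


def shifted_next_prime(n):
    """Mimics the reported off-by-one: smallest prime p with p >= n - 1."""
    return next_prime_at_least(n - 1)


# Columns: dimension, intended base, base obtained with the shifted search.
for dimension in range(2, 9):
    print(dimension, next_prime_at_least(dimension), shifted_next_prime(dimension))
```

For dimension 3 the intended base is 3 while the shifted search returns 2, which is consistent with the base-2 coordinates reported in Proof #2. For dimension 2 both searches return 2, so the base is not affected in that case.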

Once the bug is fixed, using your script I get (after replacing im.distributionX by distribution):

    [ y0       y1       y2       ]
0 : [ 0.333333 0.333333 0.333333 ]
1 : [ 0.666667 0.666667 0.666667 ]
2 : [ 0.111111 0.444444 0.777778 ]
3 : [ 0.444444 0.777778 0.111111 ]
4 : [ 0.777778 0.111111 0.444444 ]
5 : [ 0.222222 0.888889 0.555556 ]
6 : [ 0.555556 0.222222 0.888889 ]
7 : [ 0.888889 0.555556 0.222222 ]
8 : [ 0.037037 0.592593 0.481481 ]
n = 2187 , estimate = 0.9998313836954922
Expected = 1.0

mbaudin47 · 2024-05-12T16:37:03Z

I am trying to improve some of the low discrepancy unit tests in #2654. I was not able to reproduce the bug with my Linux build of the library: do you know why?

mbaudin47 · 2024-05-13T08:15:40Z

Is there a workaround, like uninstalling PrimeSieve?

jschueller · 2024-05-13T14:52:55Z

no need to uninstall, you can pass -DUSE_PRIMESIEVE=OFF to cmake

mbaudin47 · 2024-05-13T15:31:45Z

What if the installation is based on Conda: is there a way to remove that package/module, or is this integrated within the OT binary?

jschueller · 2024-05-13T15:37:49Z

it's not possible to change it at runtime

mbaudin47 · 2024-06-10T21:22:10Z

@regislebrun I found the issue.

Here are the first 4 points of the Faure sequence in the current master:

	y0	y1
0	0.5	0.5
1	0.25	0.75
2	0.75	0.25
3	0.125	0.625

This is wrong. The point with index 3, $(0.125, 0.625)$, lies on the grid of step 1/8, not on the grid of step 1/4 as expected for 4 points. The reason is that we removed the first point, the infamous $(0, 0)$. This destroys the (t,m,s)-net property of the Faure sequence. When we use the script:

import openturns as ot
import openturns.viewer as viewer
import matplotlib.pyplot as plt

basis = 2
dimension = 2
sample_size = basis**dimension
distribution = ot.ComposedDistribution([ot.Uniform(0.0, 1.0)] * dimension)
sequence = ot.FaureSequence()
experiment_QMC = ot.LowDiscrepancyExperiment(sequence, distribution, sample_size, False)
sample = experiment_QMC.generate()
fig = viewer.PlotDesign(
    sample,
    subdivisions=[basis] * dimension,
    bounds=ot.Interval([0.0] * dimension, [1.0] * dimension),
)
plt.title(f"n = {sample_size}")
plt.xlim(-0.1, 1.1)
plt.ylim(-0.1, 1.1)
fig.set_figwidth(3.0)
fig.set_figheight(3.0)

this produces:

Figure 1. The faulty Faure sequence in OpenTURNS 1.23.

The correct 4 points are :

	y0	y1
0	0	0
1	0.5	0.5
2	0.25	0.75
3	0.75	0.25

This produces:

Figure 2. The correct Faure sequence.
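The difference between the two figures can also be checked without a plot, by counting the points in each cell of the 2 x 2 grid. A minimal sketch, using the two tables above as input (`cell_counts` is a hypothetical helper):

```python
from collections import Counter


def cell_counts(points, base):
    """Count points per cell of the base x base grid on [0, 1)^2."""
    return Counter((int(x * base), int(y * base)) for x, y in points)


# Points without the origin (first table) and with the origin (second table).
faulty = [(0.5, 0.5), (0.25, 0.75), (0.75, 0.25), (0.125, 0.625)]
correct = [(0.0, 0.0), (0.5, 0.5), (0.25, 0.75), (0.75, 0.25)]

print(sorted(cell_counts(faulty, 2).items()))
print(sorted(cell_counts(correct, 2).items()))
```

In the first set the cell (0, 1), i.e. the upper left one, receives two points and the cell (0, 0) is empty; in the second set each of the four cells receives exactly one point.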

A. Owen explained this in ¹. A significant contribution of this paper is that Owen presents examples where removing the first point of the sequence reduces the convergence speed of the method. Please have a look at Figure 2 on page 9. In the paper, $\hat{\mu}_{x,1}$ is the average of the first $n$ points where the first point was skipped and $\hat{\mu}_{x,2}$ is the average of the first $n$ points including zero. For scrambled points, removing the first point degrades the expected convergence rate from $n^{-3/2}$ to $n^{-1}$. One of the tricks to find a good example is to carefully select a function which is not symmetric about $(1/2, ..., 1/2)$.

One solution is: put the 0 back (which restores the convergence rate) and scramble the sequence by default (which solves the 0 issue). Quoting ¹: "With digital nets as with antibiotics, one should take the whole sequence."
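As a rough illustration of why randomizing solves the 0 issue, here is a sketch that applies a random digital shift in base 2 to the 4 correct points. A digital shift is a simpler stand-in chosen for this note, not Owen's scrambling and not an existing OpenTURNS option; `digital_shift` and `BITS` are hypothetical names.

```python
import random

BITS = 16  # number of binary digits kept per coordinate (an arbitrary choice)


def digital_shift(points, shift, bits=BITS):
    """XOR the binary digits of every coordinate with a fixed shift (base 2 only)."""
    scale = 1 << bits
    shifted = []
    for point in points:
        ints = [int(x * scale) ^ s for x, s in zip(point, shift)]
        shifted.append(tuple(v / scale for v in ints))
    return shifted


rng = random.Random(0)
# One nonzero shift per coordinate, so the origin is moved in both directions.
shift = [rng.randrange(1, 1 << BITS) for _ in range(2)]
net = [(0.0, 0.0), (0.5, 0.5), (0.25, 0.75), (0.75, 0.25)]
shifted = digital_shift(net, shift)

# Cell indices on the 2 x 2 grid after the shift.
cells = sorted((int(2 * x), int(2 * y)) for x, y in shifted)
print(shifted)
print(cells)
```

The shift permutes the cells of the grid, so each cell still contains exactly one point, while the origin is moved to the shift itself and no longer sits on the boundary of the domain.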

I will create a new issue on the topic with title: "The zero should be included in some low discrepancy sequences". This issue will impact Sobol', Faure and perhaps other sequences as well.

¹ "On dropping the first Sobol' point", Art B. Owen. 2021. https://arxiv.org/abs/2008.08051

mbaudin47 added the bug label May 11, 2024

mbaudin47 mentioned this issue May 12, 2024: Improve low discrepancy sequences #2654 (Merged)

mbaudin47 added a commit to mbaudin47/openturns that referenced this issue May 12, 2024: 0b6ccd1, Closes openturns#2653

jschueller added this to the 1.23 milestone May 13, 2024

mbaudin47 added a commit to mbaudin47/openturns that referenced this issue May 14, 2024: fae5686, Fixes the Faure sequence when PrimeSieve is used (Closes openturns#2653)

jschueller pushed a commit to mbaudin47/openturns that referenced this issue Jun 5, 2024: 1f2e356, Fixes the Faure sequence when PrimeSieve is used (Closes openturns#2653)

jschueller closed this as completed in #2654 Jun 5, 2024

jschueller pushed a commit that referenced this issue Jun 5, 2024: d0a86b3, Fixes the Faure sequence when PrimeSieve is used (Closes #2653)

jschueller pushed a commit that referenced this issue Jun 5, 2024: 7aad1e9, Fixes the Faure sequence when PrimeSieve is used (Closes #2653)

mbaudin47 added a commit to mbaudin47/openturns that referenced this issue Jun 10, 2024: 56e6a73, Fixes the Faure sequence when PrimeSieve is used (Closes openturns#2653)

mbaudin47 mentioned this issue Jun 10, 2024: The first zero point should be moved back into the low discrepancy sequences #2686 (Open)
