# Discrepancy Example from Zhou et al. (2013)

We compared the discrepancy computations in QMCPy and Scipy against the values reported in Zhou, Y.-D., Fang, K.-T., & Ning, J.-H. (2013). Mixture discrepancy for quasi-random point sets. Journal of Complexity, 29(3–4), 283–301. https://doi.org/10.1016/j.jco.2012.11.006. We found that:

* The computations in Example 1 are correct for QMCPy and incorrect for Scipy. This is because Scipy neglects to take the square root when computing the discrepancy, so it returns the squared discrepancy.
* Scipy matches the computations in Example 3 because the paper shows the squared discrepancy rather than the discrepancy itself. Hence, there is a typo in the paper.
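
To make the difference between a discrepancy and its square concrete, here is a minimal sketch that evaluates the standard closed-form expression for the squared centered $L_2$-discrepancy in plain Python and then takes the square root. The function name `centered_discrepancy_squared` and the small point set are illustrative assumptions; neither is part of QMCPy or Scipy.

```python
import math

def centered_discrepancy_squared(points):
    """Squared centered L2-discrepancy of a list of points in [0, 1]^d.

    Hypothetical helper written for illustration only.
    """
    n = len(points)
    d = len(points[0])
    # Single sum over the points
    single = 0.0
    for x in points:
        term = 1.0
        for xk in x:
            a = abs(xk - 0.5)
            term *= 1 + 0.5 * a - 0.5 * a ** 2
        single += term
    # Double sum over all ordered pairs of points
    double = 0.0
    for x in points:
        for y in points:
            term = 1.0
            for xk, yk in zip(x, y):
                term *= (1 + 0.5 * abs(xk - 0.5) + 0.5 * abs(yk - 0.5)
                         - 0.5 * abs(xk - yk))
            double += term
    return (13 / 12) ** d - 2 / n * single + double / n ** 2

# Illustrative one-dimensional point set: midpoints of three equal cells
pts = [[1 / 6], [1 / 2], [5 / 6]]
cd2 = centered_discrepancy_squared(pts)  # squared discrepancy
cd = math.sqrt(cd2)                      # discrepancy itself
print(cd2, cd)
```

A routine that stops at `cd2` reports the squared discrepancy; the final `math.sqrt` is the step whose omission is discussed above.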

In [3]:
import qmcpy as qp
import numpy as np
import scipy.stats
from tabulate import tabulate
import pandas as pd
import time

## Example 1

In [4]:
# P_1 and P_2 are the point sets given in Example 1 of Zhou et al. (2013)
P_1 = np.array([[1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3],
               [1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 1, 1, 1, 2, 2, 3, 3, 3],
               [1, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 1, 2, 3, 2, 3, 1, 2, 3]]).T / 3 -(1/6)

                
P_2 = np.array([[1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2 ,2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3],
               [1, 1, 1, 2, 2, 3, 3, 3, 1, 1, 2, 2, 2, 2, 3, 3, 1, 1, 1, 2, 2, 3, 3, 3],
               [1, 2, 3, 1, 3, 1, 2, 3, 1, 3, 2, 2, 2, 2, 1, 3, 1, 2, 3, 1, 3, 1, 2, 3]]).T / 3 - (1/6)

# Discrepancies used in Example 1 of the paper: centered (CD) and wrap-around (WD)
Discrepancies = ['CD', 'WD']
# Point sets whose discrepancy we compute
P = [P_1, P_2]

# Header row shared by both output tables
mydata = [[' ', 'P_1', 'P_2']]

# QMC collects the rows computed with QMCPy, Sci the rows computed with Scipy
QMC = mydata
Sci = mydata

# Outer loop over the discrepancy types, inner loop over the point sets
for method in Discrepancies:
    # Each table row starts with the name of the discrepancy
    QMCPy = [method]
    Scipy = [method]
    for points in P:
        # The table in Example 1 lists discrepancy^2, so the QMCPy value is squared.
        # The Scipy value is used as returned, to show that it is already discrepancy^2.
        QMCPy = QMCPy + [round(qp.discrepancy(method, points)**2, 6)]
        Scipy = Scipy + [round(scipy.stats.qmc.discrepancy(points, method=method), 6)]
    # Append the finished rows to the two tables
    QMC = QMC + [QMCPy]
    Sci = Sci + [Scipy]
# Print the QMCPy and Scipy tables
print("QMCPy's calculated discrepancy squared")
print(tabulate(QMC, tablefmt="grid"))
print(' ')
print("Scipy's calculated discrepancy")
print(tabulate(Sci, tablefmt="grid"))

TypeError: list indices must be integers or slices, not tuple

Example 1 of Zhou's paper reports the squared centered and wrap-around discrepancies of $P_1$ and $P_2$.
The tables above match the values given in the paper. They also show that Scipy is actually calculating the squared discrepancy, because squaring the QMCPy values gives exactly the same numbers.
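
The scaling `/ 3 - (1/6)` applied to `P_1` and `P_2` in the code for Example 1 maps the design levels 1, 2, 3 to the midpoints 1/6, 1/2, 5/6 of three equal subintervals of $[0, 1]$. A minimal sketch of that mapping follows; the helper name `center_levels` is a hypothetical one introduced here and is not part of QMCPy or Scipy.

```python
def center_levels(levels, q):
    """Map integer levels 1..q to the midpoints of q equal subintervals of [0, 1].

    Hypothetical helper written for illustration only.
    """
    return [(2 * level - 1) / (2 * q) for level in levels]

# Three-level case, as used for P_1 and P_2
three_level = center_levels([1, 2, 3], 3)

# The same values written the way the notebook code writes them
same_values = [level / 3 - 1 / 6 for level in [1, 2, 3]]

print(three_level)
print(same_values)
```

The same pattern with `q = 7` corresponds to subtracting `1/14` from multiples of `1/7`.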

## Example 3

In [5]:
# P1_star, P2_star, P3_star, and P4_star are the point sets given in Example 3 of the paper
P1_star = np.array([[1/7, 4/7], [2/7, 1/7], [3/7, 5/7], [4/7, 2/7], [5/7, 6/7], [6/7, 3/7], [1, 1]]) - (1/14)
P2_star = np.array([[1/7, 5/7], [2/7, 2/7], [3/7, 1], [4/7, 4/7], [5/7, 1/7], [6/7, 6/7], [1, 3/7]]) - (1/14)
P3_star = np.array([[1/7, 6/7], [2/7, 1/7], [3/7, 3/7], [4/7, 5/7], [5/7, 1], [6/7, 2/7], [1, 4/7]]) - (1/14)
P4_star = np.array([[1/7, 3/7], [2/7, 6/7], [3/7, 1/7], [4/7, 4/7], [5/7, 1], [6/7, 2/7], [1, 5/7]]) - (1/14)

# As in Example 1, list the discrepancies to compute: wrap-around (WD),
# centered (CD), and mixture (MD)
Discrepancies = ['WD', 'CD', 'MD']

# Collect all four point sets
P = [P1_star, P2_star, P3_star, P4_star]


# Header row for the tables we are about to create
mydata = [['No.', 'P_1^*', 'P_2^*', 'P_3^*', 'P_4^*']]

# Start both tables from the header row; one row per discrepancy is added below
QMC = mydata
Sci = mydata

# Outer loop over the discrepancy types, inner loop over the point sets
for method in Discrepancies:
    # The QMCPy row is labeled discrepancy^2 because its values are squared,
    # to show that Example 3 of the paper actually reports squared discrepancies
    QMCPy = [method + '^2']
    Scipy = [method]
    for points in P:
        # Square the QMCPy value; use the Scipy value as returned
        QMCPy = QMCPy + [round(qp.discrepancy(method, points)**2, 6)]
        Scipy = Scipy + [round(scipy.stats.qmc.discrepancy(points, method=method), 6)]
    # Append the finished rows to the two tables
    QMC = QMC + [QMCPy]
    Sci = Sci + [Scipy]

# Show the tables for QMCPy and Scipy
print("QMCPy's calculated squared discrepancy")
print(tabulate(QMC, tablefmt="grid"))
print(' ')
print("Scipy's calculated discrepancy")
print(tabulate(Sci, tablefmt="grid"))


TypeError: list indices must be integers or slices, not tuple

Notice that the values in the two tables are the same, as in Example 1. The QMCPy values had to be squared to obtain this agreement, so the values listed in Example 3 of Zhou's paper are actually $WD^2$, $CD^2$, and $MD^2$; this is the typo in the paper. Another approach would be to take the square root of those values to get $WD$, $CD$, and $MD$ accordingly.
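
As a sketch of the square-root step for the mixture discrepancy, the code below evaluates a closed-form expression for $MD^2$ in plain Python and then takes the square root. The formula is written as we understand it from the definition of the mixture discrepancy in Zhou et al. (2013) and should be checked against the paper; the function name `mixture_discrepancy_squared` is a hypothetical one introduced here, not a QMCPy or Scipy function. The point set is $P_1^*$ from the code for Example 3.

```python
import math

def mixture_discrepancy_squared(points):
    """Squared mixture discrepancy of a list of points in [0, 1]^d.

    Hypothetical helper; formula assumed from Zhou et al. (2013).
    """
    n = len(points)
    d = len(points[0])
    # Single sum over the points
    single = 0.0
    for x in points:
        term = 1.0
        for xk in x:
            a = abs(xk - 0.5)
            term *= 5 / 3 - 0.25 * a - 0.25 * a ** 2
        single += term
    # Double sum over all ordered pairs of points
    double = 0.0
    for x in points:
        for y in points:
            term = 1.0
            for xk, yk in zip(x, y):
                diff = abs(xk - yk)
                term *= (15 / 8 - 0.25 * abs(xk - 0.5) - 0.25 * abs(yk - 0.5)
                         - 0.75 * diff + 0.5 * diff ** 2)
            double += term
    return (19 / 12) ** d - 2 / n * single + double / n ** 2

# P1_star from Example 3: each coordinate k/7 is shifted by -1/14
pairs = [(1, 4), (2, 1), (3, 5), (4, 2), (5, 6), (6, 3), (7, 7)]
P1_star = [[a / 7 - 1 / 14, b / 7 - 1 / 14] for a, b in pairs]

md2 = mixture_discrepancy_squared(P1_star)  # squared mixture discrepancy
md = math.sqrt(md2)                         # mixture discrepancy itself
print(md2, md)
```

Comparing `md2` and `md` with a published table is a quick way to tell which of the two quantities the table reports.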

To conclude, Scipy needs to take the square root for the wrap-around, centered, and mixture discrepancies, and Zhou's paper has a typo in Example 3.