# Adaptive PDE discretizations on cartesian grids 
## Volume : GPU accelerated methods
## Part : Eikonal equations, acceleration and reproducibility
## Chapter : Riemannian metrics

In this notebook, we solve Riemannian eikonal equations on the CPU and the GPU, and check that they produce consistent results.

**GPU performance** GPUs are massively parallel machines, which efficiently exploit cache locality. They are therefore at their best with:
* Large problem instances, which are embarrassingly parallel
* Moderate anisotropy, so that the numerical scheme stencils are not too wide

In [1]:
large_instances = False # True favors the GPU code (CPU times may become a bit long.)
strong_anisotropy = True # True favors the CPU code 
anisotropy_bound = 10. if strong_anisotropy else 4. # Ratio between the fastest and the slowest velocity at any given point

[**Summary**](Summary.ipynb) of the volume GPU accelerated methods, this series of notebooks.

[**Main summary**](../Summary.ipynb) of the Adaptive Grid Discretizations book of notebooks, including the other volumes.

# Table of contents
  * [1. Two dimensions](#1.-Two-dimensions)
    * [1.1 Isotropic metric](#1.1-Isotropic-metric)
    * [1.2 Smooth anisotropic metric](#1.2-Smooth-anisotropic-metric)
  * [2. Three dimensions](#2.-Three-dimensions)
    * [2.1 Smooth anisotropic metric](#2.1-Smooth-anisotropic-metric)



**Acknowledgement.** The experiments presented in these notebooks are part of ongoing research.
The author would like to acknowledge fruitful informal discussions with L. Gayraud on the 
topic of GPU coding and optimization.

Copyright Jean-Marie Mirebeau, University Paris-Sud, CNRS, University Paris-Saclay

## 0. Importing the required libraries

In [2]:
import sys; sys.path.insert(0,"..")
#from Miscellaneous import TocTools; print(TocTools.displayTOC('Riemann_Repro','GPU'))

In [3]:
import cupy as cp
import numpy as np
import itertools
from matplotlib import pyplot as plt
np.set_printoptions(edgeitems=30, linewidth=100000, formatter=dict(float=lambda x: "%5.3g" % x))

In [4]:
from agd import Eikonal
from agd import AutomaticDifferentiation as ad
from agd import Metrics
from agd import FiniteDifferences as fd
from agd import LinearParallel as lp
import agd.AutomaticDifferentiation.cupy_generic as cugen

from agd.ExportedCode.Notebooks_GPU.Isotropic_Repro import RunCompare
Eikonal.dictIn.default_mode = 'gpu'

In [5]:
def ReloadPackages():
    from Miscellaneous.rreload import rreload
    global Eikonal,ad,cugen,RunGPU,RunSmart,Metrics
    Eikonal,ad,cugen,Metrics = rreload([Eikonal,ad,cugen,Metrics],"../..")    
    Eikonal.dictIn.default_mode = 'gpu'

In [6]:
cp = ad.functional.decorate_module_functions(cp,cugen.set_output_dtype32) # Use float32 and int32 types in place of float64 and int64
plt = ad.functional.decorate_module_functions(plt,cugen.cupy_get_args)
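
The decoration above makes the array-creating functions return 32-bit types, which is what the GPU kernels work with. As a rough illustration of the idea only, here is a NumPy sketch of such a decorator. The name `cast_output_to_32bit` is hypothetical, and this is not the implementation of `cugen.set_output_dtype32`.

```python
import functools
import numpy as np

def cast_output_to_32bit(func):
    """Hypothetical decorator: cast float64/int64 array outputs to float32/int32."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        out = func(*args, **kwargs)
        if isinstance(out, np.ndarray):
            if out.dtype == np.float64:
                return out.astype(np.float32)
            if out.dtype == np.int64:
                return out.astype(np.int32)
        return out  # Other outputs are returned unchanged
    return wrapper

# Decorate two array-creating functions (illustration on NumPy rather than CuPy)
linspace32 = cast_output_to_32bit(np.linspace)
arange32 = cast_output_to_32bit(np.arange)

print(linspace32(0, 1, 5).dtype, arange32(5).dtype)
```

Single precision halves the memory traffic, at the price of rounding errors of relative size about 1e-7, which is the scale to keep in mind when reading the GPU/CPU differences reported in this notebook.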

## 1. Two dimensions

### 1.1 Isotropic metric

In [7]:
n=4000 if large_instances else 1000
hfmIn = Eikonal.dictIn({
    'model':'Riemann2',
    'metric':Metrics.Riemann.from_cast(Metrics.Isotropic(cp.array(1.),vdim=2)),
    'seed':[0.5,0.5],
    'exportValues':1,
#    'bound_active_blocks':True,
    'traits':{
        'niter_i':24,'shape_i':(12,12), # Best
    }
})
hfmIn.SetRect([[0,1],[0,1]],dimx=n+1,sampleBoundary=True)

In [8]:
_,cpuOut = RunCompare(hfmIn,check=1e-5)

Setting the kernel traits.
Prepating the domain data (shape,metric,...)


Preparing the problem rhs (cost, seeds,...)
Preparing the GPU kernel


Running the eikonal GPU kernel
GPU kernel eikonal ran for 0.06896042823791504 seconds, and 86 iterations.
Post-Processing
--- gpu done, turning to cpu ---


Field verbosity defaults to 1
Field order defaults to 1
Field seedRadius defaults to 0
Fast marching solver completed in 0.829 s.
Solver time (s). GPU : 0.06896042823791504, CPU : 1.49. Device acceleration : 21.606594362486646
Max |gpuValues-cpuValues| :  2.91457399725914e-06
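
The last line of the log reports the largest pointwise difference between the two solutions, which is compared against the `check` tolerance. The sketch below mimics that comparison with a hypothetical helper `max_abs_difference` (not the actual code of `RunCompare`). It is applied to the Euclidean distance from the seed, which is the exact solution for a constant unit isotropic metric on this convex domain, evaluated once in single and once in double precision.

```python
import numpy as np

def max_abs_difference(gpu_values, cpu_values, check=None):
    """Hypothetical helper: largest pointwise difference, with an optional tolerance."""
    a = np.asarray(gpu_values, dtype=np.float64)
    b = np.asarray(cpu_values, dtype=np.float64)
    diff = float(np.max(np.abs(a - b)))
    if check and diff > check:
        raise ValueError(f"Difference {diff:.3g} exceeds tolerance {check:.3g}")
    return diff

# Exact distance from the seed (0.5,0.5) on the unit square, on a coarse grid
axis = np.linspace(0.0, 1.0, 101)
X = np.array(np.meshgrid(axis, axis, indexing="ij"))
seed = np.array([0.5, 0.5]).reshape(2, 1, 1)
dist64 = np.sqrt(((X - seed) ** 2).sum(axis=0))

# Same formula evaluated in single precision, as a stand-in for the GPU values
X32, seed32 = X.astype(np.float32), seed.astype(np.float32)
dist32 = np.sqrt(((X32 - seed32) ** 2).sum(axis=0))

diff = max_abs_difference(dist32, dist64, check=1e-5)
print("Max |values32-values64| :", diff)
```

The difference is nonzero but far below the tolerance, since it only reflects float32 rounding. In the solvers, rounding errors accumulate along the front propagation, so somewhat larger discrepancies are expected.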


In [9]:
n=200; hfmInS = hfmIn.copy() # Define a small instance for bit-consistency validation
hfmInS.SetRect([[0,1],[0,1]],dimx=n+1,sampleBoundary=True)
X = hfmInS.Grid()
cost = np.prod(np.sin(2*np.pi*X),axis=0)+1.1
hfmInS.update({
    'metric': Metrics.Riemann.from_cast(Metrics.Isotropic(cost,vdim=2)), # Isotropic but non-constant metric
    'verbosity':0,
})

In [10]:
RunCompare(hfmInS,variants='basic')

Solver time (s). GPU : 0.011998414993286133, CPU : 0.054. Device acceleration : 4.500594456035768
Max |gpuValues-cpuValues| :  1.2913288426341651e-06

 --- Variant {'multiprecision': True} ---


Solver time (s). GPU : 0.015989065170288086, CPU : 0.051000000000000004. Device acceleration : 3.1896799129177045
Max |gpuValues-cpuValues| :  4.5230663991979725e-08

 --- Variant {'seedRadius': 2.0} ---


Solver time (s). GPU : 0.01599740982055664, CPU : 0.051000000000000004. Device acceleration : 3.1880160958597874
Max |gpuValues-cpuValues| :  1.2437417652444438e-06

 --- Variant {'seedRadius': 2.0, 'multiprecision': True} ---
Solver time (s). GPU : 0.013969659805297852, CPU : 0.05. Device acceleration : 3.579185226904238
Max |gpuValues-cpuValues| :  5.144490700104143e-08


In [11]:
RunCompare(hfmInS,variants='ext',check=0.004)

Solver time (s). GPU : 0.014998912811279297, CPU : 0.052. Device acceleration : 3.4669179462724524
Max |gpuValues-cpuValues| :  1.2913288426341651e-06

 --- Variant {'multiprecision': True} ---
Solver time (s). GPU : 0.0159759521484375, CPU : 0.051000000000000004. Device acceleration : 3.192297994269341
Max |gpuValues-cpuValues| :  4.5230663991979725e-08

 --- Variant {'seedRadius': 2.0} ---


Solver time (s). GPU : 0.015502452850341797, CPU : 0.052000000000000005. Device acceleration : 3.354307895789118
Max |gpuValues-cpuValues| :  1.2437417652444438e-06

 --- Variant {'seedRadius': 2.0, 'multiprecision': True} ---
Solver time (s). GPU : 0.012996673583984375, CPU : 0.051000000000000004. Device acceleration : 3.924081009685941
Max |gpuValues-cpuValues| :  5.144490700104143e-08

 --- Variant {'factoringRadius': 10.0, 'factoringPointChoice': 'Key'} ---


Solver time (s). GPU : 0.0164797306060791, CPU : 0.051000000000000004. Device acceleration : 3.094710782540762
Max |gpuValues-cpuValues| :  0.00014084236694447694

 --- Variant {'factoringRadius': 10.0, 'factoringPointChoice': 'Key', 'multiprecision': True} ---


Solver time (s). GPU : 0.01598381996154785, CPU : 0.051000000000000004. Device acceleration : 3.1907266299727035
Max |gpuValues-cpuValues| :  0.0001408460922347754

 --- Variant {'order': 2} ---


Solver time (s). GPU : 0.017999887466430664, CPU : 0.066. Device acceleration : 3.666689590314847
Max |gpuValues-cpuValues| :  0.0014198483293996755

 --- Variant {'order': 2, 'multiprecision': True} ---


Solver time (s). GPU : 0.018474817276000977, CPU : 0.067. Device acceleration : 3.6265581953567607
Max |gpuValues-cpuValues| :  0.0014198483293996755

 --- Variant {'order': 2, 'seedRadius': 2.0} ---
Solver time (s). GPU : 0.018002986907958984, CPU : 0.067. Device acceleration : 3.721604661634221


Max |gpuValues-cpuValues| :  0.0025155697393134946

 --- Variant {'order': 2, 'seedRadius': 2.0, 'multiprecision': True} ---


Solver time (s). GPU : 0.01849842071533203, CPU : 0.066. Device acceleration : 3.5678721451770894
Max |gpuValues-cpuValues| :  0.0025155697393134946

 --- Variant {'order': 2, 'factoringRadius': 10.0, 'factoringPointChoice': 'Key'} ---


Solver time (s). GPU : 0.016976118087768555, CPU : 0.066. Device acceleration : 3.8878146145527577
Max |gpuValues-cpuValues| :  0.0014996085138356818

 --- Variant {'order': 2, 'factoringRadius': 10.0, 'factoringPointChoice': 'Key', 'multiprecision': True} ---


Solver time (s). GPU : 0.018482685089111328, CPU : 0.066. Device acceleration : 3.570909728851165
Max |gpuValues-cpuValues| :  0.0014996085138356818
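
The variant headers printed above follow a regular pattern: each scheme order is combined with each seed treatment, with and without multiprecision. The following sketch reproduces that enumeration with `itertools.product`. The function `enumerate_variants` is a hypothetical reconstruction from the printed headers, not the code used by `RunCompare`.

```python
import itertools

def enumerate_variants(kind="basic"):
    """Hypothetical reconstruction of the option combinations listed in the logs."""
    multips = ({}, {"multiprecision": True})
    seeds = [{}, {"seedRadius": 2.0}]
    orders = [{}]
    if kind == "ext":
        seeds.append({"factoringRadius": 10.0, "factoringPointChoice": "Key"})
        orders.append({"order": 2})
    # The rightmost factor varies fastest, as in the printed output
    for order, seed, multip in itertools.product(orders, seeds, multips):
        yield {**order, **seed, **multip}

for variant in enumerate_variants("basic"):
    print(f" --- Variant {variant} ---")
```

With `kind="ext"` the same loop yields twelve combinations, the first being the unmodified problem (empty dictionary).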


### 1.2 Smooth anisotropic metric

In [12]:
n=4000 if large_instances else 1000
hfmIn = Eikonal.dictIn({
    'model':'Riemann2',
    'seed':[0.,0.],
    'exportValues':1,
#    'bound_active_blocks':True,
    'traits':{
        'niter_i':16,'shape_i':(8,8), # Best
    },
})
hfmIn.SetRect([[-np.pi,np.pi],[-np.pi,np.pi]],dimx=n+1,sampleBoundary=True)

In [13]:
def height(x): return np.sin(x[0])*np.sin(x[1])
def surface_metric(x,z,mu):
    ndim,shape = x.ndim-1,x.shape[1:]
    x_ad = ad.Dense.identity(constant=x,shape_free=(ndim,))
    tensors = lp.outer_self( z(x_ad).gradient() ) + mu**-2 * fd.as_field(cp.eye(ndim),shape)
    return Metrics.Riemann(tensors)

In [14]:
hfmIn['metric'] = surface_metric(hfmIn.Grid(),height,mu=anisotropy_bound)
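
To see how `mu` relates to the anisotropy, note that the tensor field built by `surface_metric` has eigenvalues `mu**-2 + |grad z|**2` and `mu**-2`, so the ratio between the fastest and the slowest velocity at a point is `sqrt(1 + mu**2 * |grad z|**2)`. The NumPy sketch below checks this numerically for the `height` function, using an analytic gradient instead of automatic differentiation. The helper names are introduced for illustration only.

```python
import numpy as np

def grad_height(x):
    """Analytic gradient of z(x) = sin(x0)*sin(x1)."""
    return np.array([np.cos(x[0]) * np.sin(x[1]), np.sin(x[0]) * np.cos(x[1])])

def surface_metric_tensors(x, grad_z, mu):
    """Tensor field grad z (x) grad z^T + mu^-2 Id, with shape (ndim, ndim, n0, n1)."""
    g = grad_z(x)
    ndim = len(g)
    eye = np.eye(ndim).reshape((ndim, ndim) + (1,) * (x.ndim - 1))
    return g[:, None] * g[None, :] + mu ** -2 * eye

def anisotropy_ratio(tensors):
    """Pointwise ratio of the fastest to the slowest velocity."""
    m = np.moveaxis(tensors, (0, 1), (-2, -1))  # Matrix axes last, as eigvalsh expects
    eig = np.linalg.eigvalsh(m)                 # Eigenvalues in ascending order
    return np.sqrt(eig[..., -1] / eig[..., 0])

mu = 10.0
axis = np.linspace(-np.pi, np.pi, 201)
x = np.array(np.meshgrid(axis, axis, indexing="ij"))
ratio = anisotropy_ratio(surface_metric_tensors(x, grad_height, mu))
print("Anisotropy ratio, min and max :", ratio.min(), ratio.max())
```

Since `|grad z|` reaches the value one for this height function, the largest ratio is `sqrt(1 + mu**2)`, slightly above `mu` itself. The metric is isotropic at the critical points of the height function, where the ratio equals one.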

In [15]:
gpuOut,cpuOut = RunCompare(hfmIn,check=False)

Setting the kernel traits.
Prepating the domain data (shape,metric,...)
Preparing the problem rhs (cost, seeds,...)
Preparing the GPU kernel
Running the eikonal GPU kernel


GPU kernel eikonal ran for 0.24398493766784668 seconds, and 254 iterations.
Post-Processing
--- gpu done, turning to cpu ---


Field verbosity defaults to 1
Field order defaults to 1
Field seedRadius defaults to 0
Fast marching solver completed in 1.478 s.
Solver time (s). GPU : 0.24398493766784668, CPU : 2.585. Device acceleration : 10.594916328479
Max |gpuValues-cpuValues| :  5.230396603717047e-05


In [16]:
n=200; hfmInS = hfmIn.copy() # Define a small instance for bit-consistency validation
hfmInS.SetRect([[-np.pi,np.pi],[-np.pi,np.pi]],dimx=n+1,sampleBoundary=True)
hfmInS.update({
    'metric' : surface_metric(hfmInS.Grid(),height,mu=anisotropy_bound), 
    'verbosity':0,
})

In [17]:
RunCompare(hfmInS,variants='basic')

Solver time (s). GPU : 0.03699994087219238, CPU : 0.093. Device acceleration : 2.5135175302373236
Max |gpuValues-cpuValues| :  7.824160913716405e-06

 --- Variant {'multiprecision': True} ---


Solver time (s). GPU : 0.0429844856262207, CPU : 0.09. Device acceleration : 2.093778689888513
Max |gpuValues-cpuValues| :  1.7039630373361092e-07

 --- Variant {'seedRadius': 2.0} ---


Solver time (s). GPU : 0.0364992618560791, CPU : 0.094. Device acceleration : 2.5753945482693075
Max |gpuValues-cpuValues| :  7.856240023862426e-06

 --- Variant {'seedRadius': 2.0, 'multiprecision': True} ---


Solver time (s). GPU : 0.04349684715270996, CPU : 0.089. Device acceleration : 2.0461253131183574
Max |gpuValues-cpuValues| :  1.740857225041026e-07


Due to the different switching criteria of the second order scheme, we do not have bit consistency for the `order: 2` variants. The results are nevertheless quite close. Note also that, unlike in the isotropic case, we do not deactivate the `decreasing` trait here, because the scheme often fails to converge without it.

**Bottom line.** Second order accuracy for anisotropic metrics on the GPU is highly experimental, and not yet reliable, at this stage. Further investigation is needed on the matter.

In [18]:
RunCompare(hfmInS,variants='ext',check=0.1)

Solver time (s). GPU : 0.03699851036071777, CPU : 0.091. Device acceleration : 2.459558482565745


Max |gpuValues-cpuValues| :  7.824160913716405e-06

 --- Variant {'multiprecision': True} ---
Solver time (s). GPU : 0.0429835319519043, CPU : 0.091. Device acceleration : 2.117089868320335


Max |gpuValues-cpuValues| :  1.7039630373361092e-07

 --- Variant {'seedRadius': 2.0} ---


Solver time (s). GPU : 0.03599715232849121, CPU : 0.089. Device acceleration : 2.4724177953809368
Max |gpuValues-cpuValues| :  7.856240023862426e-06

 --- Variant {'seedRadius': 2.0, 'multiprecision': True} ---


Solver time (s). GPU : 0.03450369834899902, CPU : 0.09. Device acceleration : 2.6084160338310793
Max |gpuValues-cpuValues| :  1.740857225041026e-07

 --- Variant {'factoringRadius': 10.0, 'factoringPointChoice': 'Key'} ---


Solver time (s). GPU : 0.02699756622314453, CPU : 0.091. Device acceleration : 3.3706742025504255
Max |gpuValues-cpuValues| :  0.0002934934470091993

 --- Variant {'factoringRadius': 10.0, 'factoringPointChoice': 'Key', 'multiprecision': True} ---


Solver time (s). GPU : 0.04349994659423828, CPU : 0.09. Device acceleration : 2.0689680573520706
Max |gpuValues-cpuValues| :  0.0002934934470091993

 --- Variant {'order': 2} ---


Solver time (s). GPU : 0.0444788932800293, CPU : 0.114. Device acceleration : 2.563013411378767
Max |gpuValues-cpuValues| :  0.07954645757978751

 --- Variant {'order': 2, 'multiprecision': True} ---


Solver time (s). GPU : 0.05448126792907715, CPU : 0.113. Device acceleration : 2.074107382139153
Max |gpuValues-cpuValues| :  0.07953489427870108

 --- Variant {'order': 2, 'seedRadius': 2.0} ---


Solver time (s). GPU : 0.03248095512390137, CPU : 0.11699999999999999. Device acceleration : 3.6021108232098946
Max |gpuValues-cpuValues| :  0.07787691205527092

 --- Variant {'order': 2, 'seedRadius': 2.0, 'multiprecision': True} ---


Solver time (s). GPU : 0.052977561950683594, CPU : 0.114. Device acceleration : 2.151854404061133
Max |gpuValues-cpuValues| :  0.07841037362601022

 --- Variant {'order': 2, 'factoringRadius': 10.0, 'factoringPointChoice': 'Key'} ---


Solver time (s). GPU : 0.04397940635681152, CPU : 0.11399999999999999. Device acceleration : 2.592122300949242
Max |gpuValues-cpuValues| :  0.07994939064629736

 --- Variant {'order': 2, 'factoringRadius': 10.0, 'factoringPointChoice': 'Key', 'multiprecision': True} ---


Solver time (s). GPU : 0.05297446250915527, CPU : 0.11499999999999999. Device acceleration : 2.170857325454226
Max |gpuValues-cpuValues| :  0.07999381992090315


Without enforced monotonicity, convergence of the scheme is harder to obtain, and requires setting some other parameters carefully and conservatively.

<!---
hfmInS.update({
    'order2_threshold':0.03,
    'verbosity':1,
    'traits':{'decreasing_macro':0,'order2_threshold_weighted_macro':1},
    'metric' : surface_metric(hfmInS.Grid(),height),
    'multiprecision':False,
    'tol':1e-6
})
--->

In [19]:
hfmInS.update({
    'tol':1e-6, # Tolerance for the convergence of the fixed point solver
    'order2_threshold':0.03, # Use first order scheme if second order difference is too large
    'traits':{'decreasing_macro':0}, # Do not enforce monotonicity
})
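
The comment on `order2_threshold` describes a fallback: the second order finite difference is used only where the solution looks smooth enough. A one-dimensional sketch of such a switch is given below. The precise criterion (second order difference compared with the threshold times the first order difference) is an assumption made for illustration, and may differ from the one implemented in the CPU and GPU solvers.

```python
def upwind_derivative(u0, u1, u2, h, order2_threshold=0.03):
    """Upwind derivative at x from u0=u(x), u1=u(x-h), u2=u(x-2h).

    Returns the derivative estimate and the order of the scheme used.
    Assumed criterion: fall back to first order if the second order
    difference is large relative to the first order difference.
    """
    first = u0 - u1
    second = u0 - 2.0 * u1 + u2
    if abs(second) <= order2_threshold * abs(first):
        return (3.0 * u0 - 4.0 * u1 + u2) / (2.0 * h), 2
    return first / h, 1

h = 0.01
# Smooth profile u(x) = x^2 at x = 1 : the second order difference is small
smooth = upwind_derivative(1.0, (1.0 - h) ** 2, (1.0 - 2 * h) ** 2, h)
# Kink of u(x) = |x|, seen from x = h : the second order difference is large
kink = upwind_derivative(abs(h), abs(0.0), abs(-h), h)
print(smooth, kink)
```

A small threshold is conservative: the second order stencil is rejected near kinks of the solution, where it would otherwise degrade stability.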

In [20]:
RunCompare(hfmInS,variants='ext',check=0.15)

Solver time (s). GPU : 0.03749990463256836, CPU : 0.091. Device acceleration : 2.4266728380148264
Max |gpuValues-cpuValues| :  1.1519648889790624e-05

 --- Variant {'multiprecision': True} ---


Solver time (s). GPU : 0.040499210357666016, CPU : 0.09. Device acceleration : 2.2222655504927413
Max |gpuValues-cpuValues| :  1.99715544513257e-06

 --- Variant {'seedRadius': 2.0} ---


Solver time (s). GPU : 0.036997318267822266, CPU : 0.094. Device acceleration : 2.540724690355592
Max |gpuValues-cpuValues| :  1.1492123355161254e-05

 --- Variant {'seedRadius': 2.0, 'multiprecision': True} ---


Solver time (s). GPU : 0.033978939056396484, CPU : 0.092. Device acceleration : 2.7075595223059543
Max |gpuValues-cpuValues| :  1.9622250657658213e-06

 --- Variant {'factoringRadius': 10.0, 'factoringPointChoice': 'Key'} ---


Solver time (s). GPU : 0.03549909591674805, CPU : 0.091. Device acceleration : 2.5634455653014894
Max |gpuValues-cpuValues| :  0.0002934934470091993

 --- Variant {'factoringRadius': 10.0, 'factoringPointChoice': 'Key', 'multiprecision': True} ---


Solver time (s). GPU : 0.0414738655090332, CPU : 0.091. Device acceleration : 2.1941528450050014
Max |gpuValues-cpuValues| :  0.0002934934470091993

 --- Variant {'order': 2} ---


Solver time (s). GPU : 0.041500091552734375, CPU : 0.11699999999999999. Device acceleration : 2.819270888868462
Max |gpuValues-cpuValues| :  0.13634624391057515

 --- Variant {'order': 2, 'multiprecision': True} ---


Solver time (s). GPU : 0.04545950889587402, CPU : 0.11699999999999999. Device acceleration : 2.573718960932706
Max |gpuValues-cpuValues| :  0.13634135632970357

 --- Variant {'order': 2, 'seedRadius': 2.0} ---


Solver time (s). GPU : 0.032497406005859375, CPU : 0.11599999999999999. Device acceleration : 3.569515670853386
Max |gpuValues-cpuValues| :  0.13520766794541883

 --- Variant {'order': 2, 'seedRadius': 2.0, 'multiprecision': True} ---


Solver time (s). GPU : 0.03850126266479492, CPU : 0.114. Device acceleration : 2.9609418525444933
Max |gpuValues-cpuValues| :  0.1352026611552577

 --- Variant {'order': 2, 'factoringRadius': 10.0, 'factoringPointChoice': 'Key'} ---


Solver time (s). GPU : 0.039980411529541016, CPU : 0.114. Device acceleration : 2.8513963623352616
Max |gpuValues-cpuValues| :  0.13797029715511666

 --- Variant {'order': 2, 'factoringRadius': 10.0, 'factoringPointChoice': 'Key', 'multiprecision': True} ---


Solver time (s). GPU : 0.04647231101989746, CPU : 0.11499999999999999. Device acceleration : 2.4745918048009683
Max |gpuValues-cpuValues| :  0.13797059517834054


In [21]:
# TODO : discontinuous metric

## 2. Three dimensions

### 2.1 Smooth anisotropic metric

We generalize the two dimensional test case, although it no longer makes much geometrical sense: we are computing geodesics in a three dimensional volume viewed as a hypersurface embedded in four dimensional Euclidean space.
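
For reference, the tensor field assembled by `surface_metric`, as read from its code, can be written as follows, where $z$ denotes the height function and $\mu$ the parameter `mu`:

$$
M(x) = \nabla z(x) \nabla z(x)^T + \mu^{-2} \mathrm{Id},
\qquad
\|v\|_{M(x)}^2 = \langle \nabla z(x), v\rangle^2 + \mu^{-2} |v|^2.
$$

Under this reading, $\|v\|_{M(x)}$ is the Euclidean length of the image of $v$ by the differential of the embedding $x \mapsto (x/\mu, z(x))$, which maps a domain of $\mathbb{R}^d$ to a hypersurface of $\mathbb{R}^{d+1}$, here with $d=3$.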

In [22]:
n=200 if large_instances else 100
hfmIn = Eikonal.dictIn({
    'model':'Riemann3',
    'seed':[0.,0.,0.],
    'exportValues':1,
#    'bound_active_blocks':True,
})
hfmIn.SetRect([[-np.pi,np.pi],[-np.pi,np.pi],[-np.pi,np.pi]],dimx=n+1,sampleBoundary=True)

In [23]:
def height3(x): return np.sin(x[0])*np.sin(x[1])*np.sin(x[2])

In [24]:
hfmIn['metric'] = surface_metric(hfmIn.Grid(),height3,mu=anisotropy_bound)

In [25]:
gpuOut,cpuOut = RunCompare(hfmIn,check=1e-4)

Setting the kernel traits.
Prepating the domain data (shape,metric,...)
Preparing the problem rhs (cost, seeds,...)
Preparing the GPU kernel


Running the eikonal GPU kernel
GPU kernel eikonal ran for 0.13898158073425293 seconds, and 60 iterations.
Post-Processing
--- gpu done, turning to cpu ---


Field verbosity defaults to 1
Field order defaults to 1
Field seedRadius defaults to 0
Fast marching solver completed in 4.875 s.
Solver time (s). GPU : 0.13898158073425293, CPU : 8.247. Device acceleration : 59.33879839637967
Max |gpuValues-cpuValues| :  8.175189253556425e-06


In [26]:
n=20; hfmInS = hfmIn.copy() # Define a small instance for bit-consistency validation
hfmInS.SetRect([[-np.pi,np.pi],[-np.pi,np.pi],[-np.pi,np.pi]],dimx=n+1,sampleBoundary=True)
hfmInS.update({
    'metric' : surface_metric(hfmInS.Grid(),height,mu=anisotropy_bound), # Uses the two dimensional `height`, which ignores x[2], rather than `height3`
    'verbosity':0,
})

In [27]:
RunCompare(hfmInS,variants='basic')

Solver time (s). GPU : 0.008501768112182617, CPU : 0.038. Device acceleration : 4.4696584873384
Max |gpuValues-cpuValues| :  2.4000878279251125e-07

 --- Variant {'multiprecision': True} ---


Solver time (s). GPU : 0.009477376937866211, CPU : 0.038. Device acceleration : 4.009548237780182
Max |gpuValues-cpuValues| :  3.35133334750104e-07

 --- Variant {'seedRadius': 2.0} ---
Solver time (s). GPU : 0.006499528884887695, CPU : 0.039. Device acceleration : 6.000434907010014
Max |gpuValues-cpuValues| :  1.7843692967645097e-07

 --- Variant {'seedRadius': 2.0, 'multiprecision': True} ---


Solver time (s). GPU : 0.008997440338134766, CPU : 0.038. Device acceleration : 4.223423392866606
Max |gpuValues-cpuValues| :  4.170939875702828e-07


Due to the different switching criteria of the second order scheme, we do not have bit consistency for the `order: 2` variants. The results are nevertheless quite close.

In [28]:
RunCompare(hfmInS,variants='ext',check=0.1)

Solver time (s). GPU : 0.008493185043334961, CPU : 0.038. Device acceleration : 4.474175448446228
Max |gpuValues-cpuValues| :  2.4000878279251125e-07

 --- Variant {'multiprecision': True} ---


Solver time (s). GPU : 0.009501218795776367, CPU : 0.038. Device acceleration : 3.9994868886602593
Max |gpuValues-cpuValues| :  3.35133334750104e-07

 --- Variant {'seedRadius': 2.0} ---


Solver time (s). GPU : 0.007001638412475586, CPU : 0.04. Device acceleration : 5.712948547689583
Max |gpuValues-cpuValues| :  1.7843692967645097e-07

 --- Variant {'seedRadius': 2.0, 'multiprecision': True} ---


Solver time (s). GPU : 0.008995294570922852, CPU : 0.039. Device acceleration : 4.335600095417318
Max |gpuValues-cpuValues| :  4.170939875702828e-07

 --- Variant {'factoringRadius': 10.0, 'factoringPointChoice': 'Key'} ---


Solver time (s). GPU : 0.009499549865722656, CPU : 0.046. Device acceleration : 4.8423347053508685
Max |gpuValues-cpuValues| :  0.014766020550922188

 --- Variant {'factoringRadius': 10.0, 'factoringPointChoice': 'Key', 'multiprecision': True} ---


Solver time (s). GPU : 0.009458541870117188, CPU : 0.044. Device acceleration : 4.6518798144787255
Max |gpuValues-cpuValues| :  0.014766020550922188

 --- Variant {'order': 2} ---


Solver time (s). GPU : 0.010497808456420898, CPU : 0.053000000000000005. Device acceleration : 5.04867279871
Max |gpuValues-cpuValues| :  0.07377230654143407

 --- Variant {'order': 2, 'multiprecision': True} ---


Solver time (s). GPU : 0.010998725891113281, CPU : 0.052000000000000005. Device acceleration : 4.727820341628371
Max |gpuValues-cpuValues| :  0.07377236614607885

 --- Variant {'order': 2, 'seedRadius': 2.0} ---
Solver time (s). GPU : 0.00796961784362793, CPU : 0.049. Device acceleration : 6.148350016453765
Max |gpuValues-cpuValues| :  0.06863715214899868

 --- Variant {'order': 2, 'seedRadius': 2.0, 'multiprecision': True} ---


Solver time (s). GPU : 0.009482383728027344, CPU : 0.053000000000000005. Device acceleration : 5.589311877702907
Max |gpuValues-cpuValues| :  0.0686370925443539

 --- Variant {'order': 2, 'factoringRadius': 10.0, 'factoringPointChoice': 'Key'} ---


Solver time (s). GPU : 0.011979103088378906, CPU : 0.053000000000000005. Device acceleration : 4.4243713080168785
Max |gpuValues-cpuValues| :  0.04057617740123154

 --- Variant {'order': 2, 'factoringRadius': 10.0, 'factoringPointChoice': 'Key', 'multiprecision': True} ---


Solver time (s). GPU : 0.013480424880981445, CPU : 0.053000000000000005. Device acceleration : 3.931626819476133
Max |gpuValues-cpuValues| :  0.03974621252505661
