
Goal:
- measure how each variability factor affects the associativity failure rate
- identify which factor has the largest impact
- recommend stable settings that remove the variability


# Associativity variability analysis — conclusion



In [2]:
%pip install pandas

Note: you may need to restart the kernel to use updated packages.


In [3]:
import pandas as pd

df = pd.read_csv("associativity/results_associativity.csv")
df.head()


Unnamed: 0,repetitions,op1,op2,dtype,dist,seed,rng,result
0,1000,(x + y) + z,x + (y + z),float64,uniform01,0,uniform01,0.157
1,1000,(x + y) + z,x + (y + z),float64,uniform01,0,uniform_signed,0.157
2,1000,(x + y) + z,x + (y + z),float64,uniform01,0,wide,0.157
3,1000,(x + y) + z,x + (y + z),float64,uniform01,1,uniform01,0.171
4,1000,(x + y) + z,x + (y + z),float64,uniform01,1,uniform_signed,0.171


In [7]:
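# Mean failure rate for each level of each factor, sorted ascending.
summary = {}

for col in ["dtype", "dist", "seed"]:
    summary[col] = (
        df.groupby(col)["result"]
          .mean()
          .round(4)
          .to_frame("mean")
          .sort_values("mean")
    )

# Render one table per factor with four decimal places.
for v in summary.values():
    display(v.style.format("{:.4f}"))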



dtype,mean
decimal50,0.0001
float32,0.1039
float64,0.1331


dist,mean
uniform_signed,0.0384
uniform01,0.0567
wide,0.1420


seed,mean
0,0.0783
1,0.0798


## Conclusion

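From our CSV (mean failure rate per factor level):

**Most important factors:**
1) **distribution (dist)**  
→ `wide` produces the most failures (0.1420, versus 0.0567 for `uniform01` and 0.0384 for `uniform_signed`)

2) **dtype**  
→ `decimal50` drastically reduces failures (0.0001)  
→ float types have higher failure rates (0.1039 for `float32`, 0.1331 for `float64`)

`seed` has almost no influence (0.0783 versus 0.0798).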
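
One rough way to compare the factors is the spread (max minus min) of their group means. The sketch below hard-codes the means printed in the tables above instead of re-reading the CSV. `spread` is a hypothetical helper, and a spread of means ignores any interaction between factors, so treat it as an indication only.

```python
# Group means copied from the tables above (mean failure rate per factor level).
group_means = {
    "dtype": {"decimal50": 0.0001, "float32": 0.1039, "float64": 0.1331},
    "dist": {"uniform_signed": 0.0384, "uniform01": 0.0567, "wide": 0.1420},
    "seed": {0: 0.0783, 1: 0.0798},
}


def spread(means):
    """Return max - min of the group means (hypothetical helper)."""
    values = list(means.values())
    return max(values) - min(values)


# Spread per factor, then factors ordered from largest to smallest spread.
spreads = {factor: round(spread(m), 4) for factor, m in group_means.items()}
ranking = sorted(spreads, key=spreads.get, reverse=True)

for factor in ranking:
    print(f"{factor:>5}: spread of means = {spreads[factor]:.4f}")
```

By this measure `dist` and `dtype` both show a spread above 0.10, roughly 70 to 90 times that of `seed`.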

### Recommended settings for stable results

| factor | recommended |
|---|---|
| dtype | `decimal50` |
| dist  | avoid `wide`; use `uniform_signed` |
| seed  | fix to a constant (0 or 1) |


# Banking problem analysis — conclusion

In [11]:
import pandas as pd

df = pd.read_csv("banking_problem/results_banking.csv")
df.head(100)

Unnamed: 0,precision,terms,method,n,result
0,30,50,iterative,10,-9.008878e-01
1,30,50,iterative,20,-9.501183e-01
2,30,50,iterative,50,-6.892593e+35
3,30,50,iterative,100,-2.115005e+129
4,30,100,iterative,10,-9.008878e-01
...,...,...,...,...,...
91,80,200,closed_form,100,-2.000000e+00
92,80,500,closed_form,10,-9.008878e-01
93,80,500,closed_form,20,-9.501183e-01
94,80,500,closed_form,50,-9.800078e-01


## Conclusion

|Method|Explanation|
|---|---|
|Iterative|Repeated application of a formula|
|Closed form|Uses an equation derived from a mathematical property|

From our CSV:

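**Most important factor: `n`**  
- Same `precision`, `terms`, and `method`, different `n` → different results
- Different `precision`, `terms`, and `method`, same `n` → generally the same result (the exceptions appear at large `n`: for `n = 50` the iterative method at precision 30 gives -6.89e+35, while the closed form at precision 80 gives -0.98)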
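
The dominance of `n` is what one would expect from a rounding error that is multiplied at every step. The following is a minimal sketch, assuming the classic form of the problem (start from `e - 1`, then apply `a_k = k * a_(k-1) - 1`). The notebook's own recurrence, offsets, and `terms` handling are not shown here, so absolute values will differ from the CSV. `iterate_balance` is a hypothetical name.

```python
from decimal import Decimal, localcontext


def iterate_balance(n, precision):
    """Apply a_k = k * a_(k-1) - 1 for k = 1..n, starting from a_0 = e - 1.

    Assumed recurrence (not taken from the notebook). `precision` is the
    number of significant decimal digits used for every operation.
    """
    with localcontext() as ctx:
        ctx.prec = precision
        balance = Decimal(1).exp() - 1  # a_0 = e - 1, rounded to `precision` digits
        for k in range(1, n + 1):
            balance = k * balance - 1  # step k multiplies the inherited error by k
        return balance


for precision in (30, 100):
    for n in (10, 20, 50):
        value = iterate_balance(n, precision)
        print(f"precision={precision:3d} n={n:2d} -> {value:.6e}")
```

With an exact starting value this recurrence stays between `1/(n+1)` and `1/n`, so any huge magnitude is pure rounding error: the initial error in `e` ends up multiplied by `n!`, which is about 3e64 at `n = 50` and therefore more than 30 digits of precision can absorb.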

### Recommended settings for stable results

Since the theoretical result is -1, the settings that come closest to it are:
- `precision`: 80 digits
- `terms`: does not matter
- `method`: does not matter
- `n`: must be 50

# Global conclusions

## Factors impacting the evaluation of mathematical properties on floats

- Limited precision of floats
- Method of comparing floats (`a == b` versus `abs(a - b) <= 0.001`)
- Use of an arbitrary-precision (big number) library
- Processor architecture
- Data types in programming languages
- The interval from which the input numbers are drawn
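
To illustrate how the comparison method changes the verdict, here is a small sketch using only the standard library. The values `0.1`, `0.2`, `0.3` and the tolerances are illustrative choices, not settings taken from the experiments.

```python
import math

# The same three numbers, summed with two different groupings.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

exact_equal = left == right                               # strict equality
abs_tol_equal = abs(left - right) <= 0.001                # fixed absolute tolerance
rel_tol_equal = math.isclose(left, right, rel_tol=1e-9)   # relative tolerance

print(exact_equal, abs_tol_equal, rel_tol_equal)

# A fixed absolute tolerance stops discriminating at small magnitudes:
tiny_a, tiny_b = 1e-6, 2e-6  # these differ by a factor of two
tiny_abs_equal = abs(tiny_a - tiny_b) <= 0.001
tiny_rel_equal = math.isclose(tiny_a, tiny_b, rel_tol=1e-9)

print(tiny_abs_equal, tiny_rel_equal)
```

Strict equality reports an associativity failure for the first pair, while both tolerance-based checks accept it. For the second pair the absolute tolerance accepts two values that differ by a factor of two, which is why the choice of comparison has to match the magnitude of the data.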

## Our recommendations

- Use a containerized environment with pinned software versions and builds (for instance, Python 3.14.0 as a CPython build compiled with one specific GCC version)
- Use multiprecision libraries instead of native floating-point types (`decimal` for Python, GMP for C/C++, etc.)
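
As an illustration of the second recommendation, the sketch below measures an associativity failure rate for native floats and for `decimal.Decimal` at 50 significant digits. It is not the harness that produced the CSV: `failure_rate` is a hypothetical helper, and the inputs are assumed to be decimal strings with 6 fractional digits so that `Decimal` can represent them exactly.

```python
import random
from decimal import Decimal, localcontext


def failure_rate(triples, convert):
    """Fraction of triples where (x + y) + z != x + (y + z) after `convert`."""
    failures = 0
    for x, y, z in triples:
        a, b, c = convert(x), convert(y), convert(z)
        if (a + b) + c != a + (b + c):
            failures += 1
    return failures / len(triples)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Inputs as decimal strings with 6 fractional digits (assumption of this sketch).
triples = [tuple(f"{rng.random():.6f}" for _ in range(3)) for _ in range(1000)]

float_rate = failure_rate(triples, float)
with localcontext() as ctx:
    ctx.prec = 50  # 50 significant digits for every Decimal operation
    decimal_rate = failure_rate(triples, Decimal)

print(f"float64: {float_rate:.3f}  Decimal(prec=50): {decimal_rate:.3f}")
```

The `Decimal` rate is zero here only because every input and every partial sum fits in 50 digits. Inputs that need more digits than the configured precision can still produce occasional failures.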