In [1]:
import numpy as np

### Problem description

Here we define and implement the value iteration algorithm for a variant of the classical bear-honey problem, in which a merchant ship must reach a port while avoiding pirate ships.

We define a grid 4 blocks wide and 3 blocks high (12 blocks in total), which gives us **12 states**.

Specific blocks (states) hold enemies (pirate ships) that prevent the merchant ship (our agent) from reaching the goal (the port). As you will see in the implementation, the reward for these blocks is set to a negative value, so our agent should strive to avoid them, and even to avoid getting close to them.

Our goal is to find the optimal policy, and with it the optimal value of each block the ship may find itself in.
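
To make the layout concrete, here is a minimal sketch that prints the grid as text. It assumes states are `(row, column)` tuples and uses the pirate and port coordinates from the setup cell that follows; the names `ROWS`, `COLS`, `PIRATES`, `PORT`, and `render_grid` are hypothetical helpers, not part of the notebook.

```python
# Hypothetical helper for illustration only; coordinates are (row, column).
ROWS, COLS = 3, 4
PIRATES = {(1, 2), (2, 2)}   # blocks holding pirate ships (reward -1)
PORT = (2, 3)                # goal block (reward +1)


def render_grid():
    """Return the grid as text: 'P' = pirate ship, 'G' = port, '.' = open sea."""
    lines = []
    for r in range(ROWS):
        cells = []
        for c in range(COLS):
            if (r, c) in PIRATES:
                cells.append('P')
            elif (r, c) == PORT:
                cells.append('G')
            else:
                cells.append('.')
        lines.append(' '.join(cells))
    return '\n'.join(lines)


print(render_grid())
# . . . .
# . . P .
# . . P G
```

Row 0 is the top row, so the port sits in the bottom-right corner with one pirate ship directly to its left and another one row above that.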

In [9]:
# grid setup, actions, states, rewards

threshold_change = 0.005
gamma = 0.9
noise = 0.1

# define all states of the grid: 3 rows x 4 columns, stored as (row, column) tuples

all_states = [(i, j) for i in range(3) for j in range(4)]

# populate rewards for each state block

rewards = {}

for x in all_states:
    # if a pirate ship is there, then -1
    if x == (1, 2) or x == (2, 2):
        rewards[x] = -1
    # if ship reached the port, then 1
    elif x == (2, 3):
        rewards[x] = 1
    else:
        rewards[x] = 0
        
# Create the actions available in each state. States (1, 2) and (2, 2) are game over, since our ship has met a pirate ship,
# and (2, 3) is the target state, so none of the three has actions. Possible actions: 'D' (down), 'U' (up), 'R' (right), 'L' (left)

actions = {
    (0, 0): ('D', 'R'),
    (0, 1): ('D', 'R', 'L'),
    (0, 2): ('D', 'R', 'L'),
    (0, 3): ('D', 'L'),
    (1, 0): ('D', 'U', 'R'),
    (1, 1): ('D', 'U', 'R', 'L'),
    (1, 3): ('D', 'L', 'U'),
    (2, 0): ('U', 'R'),
    (2, 1): ('U', 'L', 'R')
}

# Let's set an initial policy randomly for now, this will be updated later

policy = {}

for x in actions.keys():
    # we take a random choice of possible actions
    policy[x] = np.random.choice(actions[x])
    
# initial values 

values = {}

for state in all_states:
    if state in actions.keys():
        values[state] = 0
    elif state == (1, 2) or state == (2, 2):
        values[state] = -1
    else:
        values[state] = 1
print('States list: \n')        
print(all_states)
print('----------------')

print('Rewards list: \n')
print(rewards)
print('----------------')

print('Policy: \n')
print(policy)
print('----------------')

print('Values: \n')
print(values)

States list: 

[(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2), (2, 3)]
----------------
Rewards list: 

{(0, 0): 0, (0, 1): 0, (0, 2): 0, (0, 3): 0, (1, 0): 0, (1, 1): 0, (1, 2): -1, (1, 3): 0, (2, 0): 0, (2, 1): 0, (2, 2): -1, (2, 3): 1}
----------------
Policy: 

{(0, 0): 'D', (0, 1): 'D', (0, 2): 'R', (0, 3): 'D', (1, 0): 'D', (1, 1): 'U', (1, 3): 'U', (2, 0): 'U', (2, 1): 'R'}
----------------
Values: 

{(0, 0): 0, (0, 1): 0, (0, 2): 0, (0, 3): 0, (1, 0): 0, (1, 1): 0, (1, 2): -1, (1, 3): 0, (2, 0): 0, (2, 1): 0, (2, 2): -1, (2, 3): 1}


### Value iteration algorithm implementation

Now we loop and update the values and the policy for as long as the largest change in a sweep is at least as big as the threshold we set. We also keep count of the iterations, to see how many it takes to converge.
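
As a point of reference, the loop described above can be sketched in a deterministic form. This is a minimal sketch under stated assumptions, not the notebook's own implementation: it assumes the noise is handled by averaging over the other legal moves instead of sampling one of them, it starts the search for the best action from negative infinity, and it derives the legal moves from the grid bounds. The names `legal_actions`, `step`, and `value_iteration` are hypothetical.

```python
# Hypothetical, self-contained sketch; constants mirror the notebook's settings.
GAMMA = 0.9
NOISE = 0.1
THRESHOLD = 0.005

ROWS, COLS = 3, 4
PIRATES = {(1, 2), (2, 2)}
PORT = (2, 3)
MOVES = {'U': (-1, 0), 'D': (1, 0), 'L': (0, -1), 'R': (0, 1)}


def legal_actions(state):
    """Moves that keep the ship on the grid."""
    r, c = state
    return [a for a, (dr, dc) in MOVES.items()
            if 0 <= r + dr < ROWS and 0 <= c + dc < COLS]


def step(state, action):
    """Block reached by taking `action` from `state`."""
    dr, dc = MOVES[action]
    return (state[0] + dr, state[1] + dc)


def value_iteration():
    states = [(r, c) for r in range(ROWS) for c in range(COLS)]
    # Terminal blocks keep a fixed value: -1 for pirates, +1 for the port.
    values = {s: 0.0 for s in states}
    for s in PIRATES:
        values[s] = -1.0
    values[PORT] = 1.0
    policy = {}
    sweeps = 0
    while True:
        max_diff = 0.0
        for s in states:
            if s in PIRATES or s == PORT:
                continue
            best_value, best_action = float('-inf'), None
            acts = legal_actions(s)
            for a in acts:
                others = [b for b in acts if b != a]
                # Assumption: a noisy move is equally likely to be any other legal move.
                slip = sum(values[step(s, b)] for b in others) / len(others)
                # Non-terminal blocks have reward 0, so the reward term is omitted.
                q = GAMMA * ((1 - NOISE) * values[step(s, a)] + NOISE * slip)
                if q > best_value:
                    best_value, best_action = q, a
            max_diff = max(max_diff, abs(best_value - values[s]))
            values[s] = best_value
            policy[s] = best_action
        sweeps += 1
        if max_diff < THRESHOLD:
            return values, policy, sweeps


values, policy, sweeps = value_iteration()
print('Converged after {} sweeps'.format(sweeps))
print(policy)
```

Because every sweep computes the same expectation, each update is a contraction with factor `GAMMA`, so the maximum difference shrinks steadily and the threshold test is eventually met.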

In [10]:
iteration = 0

while True:
    max_diff = 0
    # loop through all state blocks
    for state in all_states:
        # skip terminal blocks (pirate ship or port): only non-terminal states have a policy entry
        if state in policy:
            # remember the old value so the change can be measured after the update
            previous_values = values[state]
            best_values = 0
            
            # iterate through all possible actions of that state
            for action in actions[state]:
                # update per action
                if action == 'U':
                    next_state = (state[0] - 1, state[1])
                if action == 'D':
                    next_state = (state[0] + 1, state[1])
                if action == 'R':
                    next_state = (state[0], state[1] + 1)
                if action == 'L':
                    next_state = (state[0], state[1] -1)
                    
                # sample one alternative action to model the noisy move; because it is
                # re-sampled on every sweep, the values can keep fluctuating between sweeps
                random_action = np.random.choice([a for a in actions[state] if a != action])
                
                # state the ship lands in if the noisy move happens instead
                if random_action == 'U':
                    noisy_state = (state[0] - 1, state[1])
                elif random_action == 'D':
                    noisy_state = (state[0] + 1, state[1])
                elif random_action == 'R':
                    noisy_state = (state[0], state[1] + 1)
                elif random_action == 'L':
                    noisy_state = (state[0], state[1] - 1)
                    
                # Update the value now, using the Bellman equation
                calculate_value = rewards[state] + (gamma * ((1 - noise) * values[next_state] + (noise * values[noisy_state])))
                
                # check if these values are the best so far, if so we keep it
                if calculate_value > best_values:
                    best_values = calculate_value
                    policy[state] = action
                    
            # finally we save the best value found for that state
            values[state] = best_values
            
            # update the step difference
            max_diff = max(max_diff, np.abs(previous_values - values[state]))
            
    iteration += 1
    print('iteration: {}'.format(iteration))
    print('Max difference: {}'.format(max_diff))
    # exit the loop if no significant progress has been made on updating the values
    if max_diff < threshold_change:
        print('Converged after {} iterations'.format(iteration))
        break
               

iteration: 1
Max difference: 0.7200000000000001
iteration: 2
Max difference: 0.5832000000000002
iteration: 3
Max difference: 0.47239200000000015
iteration: 4
Max difference: 0.3826375200000002
iteration: 5
Max difference: 0.30993639120000016
iteration: 6
Max difference: 0.15422434826111991
iteration: 7
Max difference: 0.09828496535980397
iteration: 8
Max difference: 0.1476403008030524
iteration: 9
Max difference: 0.11722893299839221
iteration: 10
Max difference: 0.15266263399960317
iteration: 11
Max difference: 0.15869978534609214
iteration: 12
Max difference: 0.14794907320664086
iteration: 13
Max difference: 0.24266490986132816
iteration: 14
Max difference: 0.21046125959655104
iteration: 15
Max difference: 0.17914139211313787
iteration: 16
Max difference: 0.10122360083730336
iteration: 17
Max difference: 0.14611604595325778
iteration: 18
Max difference: 0.14611604595325778
iteration: 19
Max difference: 0.14660697535154887
iteration: 20
Max difference: 0.1224119477414215
iteration: 21


Max difference: 0.218302022992645
iteration: 183
Max difference: 0.17676650282116502
iteration: 184
Max difference: 0.17095607617076736
iteration: 185
Max difference: 0.1120049309757612
iteration: 186
Max difference: 0.13139867880686895
iteration: 187
Max difference: 0.10560923175379278
iteration: 188
Max difference: 0.08531447585064839
iteration: 189
Max difference: 0.10426282583612012
iteration: 190
Max difference: 0.15963519551515626
iteration: 191
Max difference: 0.14800101922679043
iteration: 192
Max difference: 0.14800101922679043
iteration: 193
Max difference: 0.17670652084146132
iteration: 194
Max difference: 0.08921106935399375
iteration: 195
Max difference: 0.10649132668770306
iteration: 196
Max difference: 0.15297943744003456
iteration: 197
Max difference: 0.14247397835479353
iteration: 198
Max difference: 0.14590200185279945
iteration: 199
Max difference: 0.14590200185279945
iteration: 200
Max difference: 0.14667735496471257
iteration: 201
Max difference: 0.1100689265460274

iteration: 349
Max difference: 0.14699326847862315
iteration: 350
Max difference: 0.11912973765319701
iteration: 351
Max difference: 0.1577149448674109
iteration: 352
Max difference: 0.1479524338032766
iteration: 353
Max difference: 0.2430495529651565
iteration: 354
Max difference: 0.21053380195718763
iteration: 355
Max difference: 0.09960673063936032
iteration: 356
Max difference: 0.07285715253337843
iteration: 357
Max difference: 0.1262080845754412
iteration: 358
Max difference: 0.11567137863279342
iteration: 359
Max difference: 0.125910608305296
iteration: 360
Max difference: 0.09665480976499469
iteration: 361
Max difference: 0.15830012494733858
iteration: 362
Max difference: 0.126528385902415
iteration: 363
Max difference: 0.2436447622031354
iteration: 364
Max difference: 0.1459841115258691
iteration: 365
Max difference: 0.21799174472708016
iteration: 366
Max difference: 0.16964639290034922
iteration: 367
Max difference: 0.14701484086587124
iteration: 368
Max difference: 0.11895545

iteration: 571
Max difference: 0.12244704068721524
iteration: 572
Max difference: 0.09654790594197493
iteration: 573
Max difference: 0.11440041163627612
iteration: 574
Max difference: 0.08709204298531292
iteration: 575
Max difference: 0.13073185327716313
iteration: 576
Max difference: 0.14583499900508445
iteration: 577
Max difference: 0.16307670205411
iteration: 578
Max difference: 0.1426267886553081
iteration: 579
Max difference: 0.14588543073643567
iteration: 580
Max difference: 0.12871560992805497
iteration: 581
Max difference: 0.15746983562996064
iteration: 582
Max difference: 0.14672765273834032
iteration: 583
Max difference: 0.14672765273834032
iteration: 584
Max difference: 0.21260593015575913
iteration: 585
Max difference: 0.16717919233027356
iteration: 586
Max difference: 0.13067191827643232
iteration: 587
Max difference: 0.1459093894671375
iteration: 588
Max difference: 0.11735777241301304
iteration: 589
Max difference: 0.2178878907563983
iteration: 590
Max difference: 0.2512

iteration: 762
Max difference: 0.166721756763052
iteration: 763
Max difference: 0.15838477087200986
iteration: 764
Max difference: 0.1267013749792365
iteration: 765
Max difference: 0.24400035397387326
iteration: 766
Max difference: 0.14590462476054178
iteration: 767
Max difference: 0.17343798504777252
iteration: 768
Max difference: 0.08743301874939957
iteration: 769
Max difference: 0.1581579437101276
iteration: 770
Max difference: 0.1518767146556041
iteration: 771
Max difference: 0.14810619051213264
iteration: 772
Max difference: 0.14683315760039684
iteration: 773
Max difference: 0.12214033113775358
iteration: 774
Max difference: 0.14590478337656065
iteration: 775
Max difference: 0.14590478337656065
iteration: 776
Max difference: 0.21794935344111893
iteration: 777
Max difference: 0.2026937675909462
iteration: 778
Max difference: 0.18509304951236402
iteration: 779
Max difference: 0.14221795470953957
iteration: 780
Max difference: 0.14711094156174487
iteration: 781
Max difference: 0.1471

iteration: 936
Max difference: 0.15240576820865565
iteration: 937
Max difference: 0.15065195023185013
iteration: 938
Max difference: 0.1693540811106093
iteration: 939
Max difference: 0.1477731508265222
iteration: 940
Max difference: 0.22252939892207274
iteration: 941
Max difference: 0.21020403421315942
iteration: 942
Max difference: 0.15909990379334793
iteration: 943
Max difference: 0.14197275218319771
iteration: 944
Max difference: 0.10320519414809681
iteration: 945
Max difference: 0.17816896070668198
iteration: 946
Max difference: 0.20438338351229285
iteration: 947
Max difference: 0.14586454506738133
iteration: 948
Max difference: 0.1675154314992329
iteration: 949
Max difference: 0.1466336423896598
iteration: 950
Max difference: 0.1466336423896598
iteration: 951
Max difference: 0.20642075233352286
iteration: 952
Max difference: 0.16117533618498253
iteration: 953
Max difference: 0.14587994326771636
iteration: 954
Max difference: 0.13771642491049352
iteration: 955
Max difference: 0.177

iteration: 1157
Max difference: 0.1477483799955448
iteration: 1158
Max difference: 0.21969420911970527
iteration: 1159
Max difference: 0.2093262763849687
iteration: 1160
Max difference: 0.12543673905019126
iteration: 1161
Max difference: 0.09464364068973347
iteration: 1162
Max difference: 0.1582138403729869
iteration: 1163
Max difference: 0.14818442151414718
iteration: 1164
Max difference: 0.09921618927274645
iteration: 1165
Max difference: 0.14022323604293774
iteration: 1166
Max difference: 0.13323660657863406
iteration: 1167
Max difference: 0.14684583661225947
iteration: 1168
Max difference: 0.11639199052170995
iteration: 1169
Max difference: 0.21188463072723768
iteration: 1170
Max difference: 0.16560679493875696
iteration: 1171
Max difference: 0.16180187964812281
iteration: 1172
Max difference: 0.14781772418084493
iteration: 1173
Max difference: 0.11998654005907516
iteration: 1174
Max difference: 0.1459325154913822
iteration: 1175
Max difference: 0.1459325154913822
iteration: 1176
M

iteration: 1387
Max difference: 0.12512010134171536
iteration: 1388
Max difference: 0.14695761208287406
iteration: 1389
Max difference: 0.14695761208287406
iteration: 1390
Max difference: 0.21136066038190793
iteration: 1391
Max difference: 0.16266889956036434
iteration: 1392
Max difference: 0.14712325143114258
iteration: 1393
Max difference: 0.15918485216863965
iteration: 1394
Max difference: 0.14798668292418948
iteration: 1395
Max difference: 0.11057196083229726
iteration: 1396
Max difference: 0.15793815939909617
iteration: 1397
Max difference: 0.1263139093715866
iteration: 1398
Max difference: 0.10057838170294042
iteration: 1399
Max difference: 0.14702743689944264
iteration: 1400
Max difference: 0.14702743689944264
iteration: 1401
Max difference: 0.2118629074653121
iteration: 1402
Max difference: 0.2064183322563507
iteration: 1403
Max difference: 0.12606848601525977
iteration: 1404
Max difference: 0.09420186674792203
iteration: 1405
Max difference: 0.0816892250132894
iteration: 1406


iteration: 1553
Max difference: 0.12871566295166093
iteration: 1554
Max difference: 0.1574698420324654
iteration: 1555
Max difference: 0.147788533875162
iteration: 1556
Max difference: 0.11220798593691506
iteration: 1557
Max difference: 0.09269370136908073
iteration: 1558
Max difference: 0.1593162807921792
iteration: 1559
Max difference: 0.14157456147421876
iteration: 1560
Max difference: 0.14590196815216483
iteration: 1561
Max difference: 0.19423628025139614
iteration: 1562
Max difference: 0.1728723423336452
iteration: 1563
Max difference: 0.13701902087261142
iteration: 1564
Max difference: 0.11124625574173197
iteration: 1565
Max difference: 0.15956816450383438
iteration: 1566
Max difference: 0.1426774321301415
iteration: 1567
Max difference: 0.1459022346801152
iteration: 1568
Max difference: 0.1459022346801152
iteration: 1569
Max difference: 0.10962047807053432
iteration: 1570
Max difference: 0.07496666603545887
iteration: 1571
Max difference: 0.14698801104300363
iteration: 1572
Max 

iteration: 1740
Max difference: 0.12242212442845568
iteration: 1741
Max difference: 0.1397721030919372
iteration: 1742
Max difference: 0.14683734744475685
iteration: 1743
Max difference: 0.14683734744475685
iteration: 1744
Max difference: 0.1110249258019752
iteration: 1745
Max difference: 0.14588620739708102
iteration: 1746
Max difference: 0.1175777960157044
iteration: 1747
Max difference: 0.21720201351602597
iteration: 1748
Max difference: 0.18328120752950333
iteration: 1749
Max difference: 0.23011214274127842
iteration: 1750
Max difference: 0.16679574270270803
iteration: 1751
Max difference: 0.12847099174222765
iteration: 1752
Max difference: 0.09960905717193708
iteration: 1753
Max difference: 0.08538266115444815
iteration: 1754
Max difference: 0.07001946760559563
iteration: 1755
Max difference: 0.07417437990176359
iteration: 1756
Max difference: 0.12265790117776737
iteration: 1757
Max difference: 0.10466527827467825
iteration: 1758
Max difference: 0.10793051122606107
iteration: 1759

Max difference: 0.11215454210918807
iteration: 1983
Max difference: 0.1539986377853288
iteration: 1984
Max difference: 0.1593168264036724
iteration: 1985
Max difference: 0.12976788020932828
iteration: 1986
Max difference: 0.14599393541322403
iteration: 1987
Max difference: 0.14599393541322403
iteration: 1988
Max difference: 0.21739241805987375
iteration: 1989
Max difference: 0.17415165108248715
iteration: 1990
Max difference: 0.1433411665765622
iteration: 1991
Max difference: 0.21195676478348596
iteration: 1992
Max difference: 0.15905631054659897
iteration: 1993
Max difference: 0.14535732693924652
iteration: 1994
Max difference: 0.14714837436994543
iteration: 1995
Max difference: 0.14714837436994543
iteration: 1996
Max difference: 0.1477867252854851
iteration: 1997
Max difference: 0.22408308792262144
iteration: 1998
Max difference: 0.15674530454662372
iteration: 1999
Max difference: 0.14777609452544016
iteration: 2000
Max difference: 0.222866325251306
iteration: 2001
Max difference: 0.

iteration: 2222
Max difference: 0.19613975032663133
iteration: 2223
Max difference: 0.09894528391598023
iteration: 2224
Max difference: 0.14702651211298734
iteration: 2225
Max difference: 0.11886282291512484
iteration: 2226
Max difference: 0.09595006692397645
iteration: 2227
Max difference: 0.07736848210096187
iteration: 2228
Max difference: 0.1474305302959686
iteration: 2229
Max difference: 0.14267901773540848
iteration: 2230
Max difference: 0.15847095147508694
iteration: 2231
Max difference: 0.1468170493672385
iteration: 2232
Max difference: 0.12071356083879436
iteration: 2233
Max difference: 0.15768126984273
iteration: 2234
Max difference: 0.1273114345393359
iteration: 2235
Max difference: 0.146954652789966
iteration: 2236
Max difference: 0.146954652789966
iteration: 2237
Max difference: 0.12233527152503654
iteration: 2238
Max difference: 0.09909156993527973
iteration: 2239
Max difference: 0.1735540729512479
iteration: 2240
Max difference: 0.10317784798641533
iteration: 2241
Max dif

iteration: 2385
Max difference: 0.1581890665292517
iteration: 2386
Max difference: 0.20937854290351632
iteration: 2387
Max difference: 0.24336821917391016
iteration: 2388
Max difference: 0.20881003374243007
iteration: 2389
Max difference: 0.1667115646006655
iteration: 2390
Max difference: 0.21330236235848504
iteration: 2391
Max difference: 0.1677624897251187
iteration: 2392
Max difference: 0.13157403803122109
iteration: 2393
Max difference: 0.13078271830645272
iteration: 2394
Max difference: 0.11711801015442458
iteration: 2395
Max difference: 0.08523058018105178
iteration: 2396
Max difference: 0.15962354853245764
iteration: 2397
Max difference: 0.14800004304767989
iteration: 2398
Max difference: 0.14800004304767989
iteration: 2399
Max difference: 0.1478154891900928
iteration: 2400
Max difference: 0.22737531211544498
iteration: 2401
Max difference: 0.1567494983239155
iteration: 2402
Max difference: 0.1477771644650543
iteration: 2403
Max difference: 0.22298878710516196
iteration: 2404
Ma

iteration: 2561
Max difference: 0.17510368360359269
iteration: 2562
Max difference: 0.156631207387177
iteration: 2563
Max difference: 0.1330030054495741
iteration: 2564
Max difference: 0.10911321510376665
iteration: 2565
Max difference: 0.1388024695020863
iteration: 2566
Max difference: 0.15846333936706314
iteration: 2567
Max difference: 0.14780615057041824
iteration: 2568
Max difference: 0.14780615057041824
iteration: 2569
Max difference: 0.14670799950240443
iteration: 2570
Max difference: 0.14670799950240443
iteration: 2571
Max difference: 0.1108306578686602
iteration: 2572
Max difference: 0.14693803551641926
iteration: 2573
Max difference: 0.11958089262021987
iteration: 2574
Max difference: 0.1577003158522391
iteration: 2575
Max difference: 0.13473580428573673
iteration: 2576
Max difference: 0.11745504492195352
iteration: 2577
Max difference: 0.146271509782944
iteration: 2578
Max difference: 0.11392084826010207
iteration: 2579
Max difference: 0.09227588709068268
iteration: 2580
Max 

Max difference: 0.14671103541293562
iteration: 2735
Max difference: 0.14662015797922712
iteration: 2736
Max difference: 0.14662015797922712
iteration: 2737
Max difference: 0.10963444554238688
iteration: 2738
Max difference: 0.14710229596231517
iteration: 2739
Max difference: 0.14710229596231517
iteration: 2740
Max difference: 0.14661745802537018
iteration: 2741
Max difference: 0.1106374150525421
iteration: 2742
Max difference: 0.08961630619255911
iteration: 2743
Max difference: 0.14284043676271807
iteration: 2744
Max difference: 0.15928945214348367
iteration: 2745
Max difference: 0.1323337897055215
iteration: 2746
Max difference: 0.1471485435158827
iteration: 2747
Max difference: 0.17798074604374192
iteration: 2748
Max difference: 0.16880389830818204
iteration: 2749
Max difference: 0.21241294165393376
iteration: 2750
Max difference: 0.18158025296905672
iteration: 2751
Max difference: 0.10971357290302586
iteration: 2752
Max difference: 0.14588553086161948
iteration: 2753
Max difference:

Max difference: 0.14777923898409828
iteration: 2945
Max difference: 0.14774354535059597
iteration: 2946
Max difference: 0.14774354535059597
iteration: 2947
Max difference: 0.1352176232319895
iteration: 2948
Max difference: 0.21841898941264942
iteration: 2949
Max difference: 0.12267493656726708
iteration: 2950
Max difference: 0.18344118086017416
iteration: 2951
Max difference: 0.14682338889767288
iteration: 2952
Max difference: 0.11382269716451637
iteration: 2953
Max difference: 0.08803315083246993
iteration: 2954
Max difference: 0.14459186310607328
iteration: 2955
Max difference: 0.1200087573074668
iteration: 2956
Max difference: 0.15842260603608316
iteration: 2957
Max difference: 0.14796026527413997
iteration: 2958
Max difference: 0.11078481189258893
iteration: 2959
Max difference: 0.15793089834447294
iteration: 2960
Max difference: 0.14779665348356397
iteration: 2961
Max difference: 0.20633531083976236
iteration: 2962
Max difference: 0.15788959044021267
iteration: 2963
Max difference

iteration: 3144
Max difference: 0.15812609004068162
iteration: 3145
Max difference: 0.08688675827300629
iteration: 3146
Max difference: 0.15815789009329329
iteration: 3147
Max difference: 0.1477987212477686
iteration: 3148
Max difference: 0.1477987212477686
iteration: 3149
Max difference: 0.14773518657784945
iteration: 3150
Max difference: 0.21818413287953353
iteration: 3151
Max difference: 0.21003502515123884
iteration: 3152
Max difference: 0.21062909510463163
iteration: 3153
Max difference: 0.12841073052187602
iteration: 3154
Max difference: 0.15752004888187154
iteration: 3155
Max difference: 0.13689736921736384
iteration: 3156
Max difference: 0.14711383415802137
iteration: 3157
Max difference: 0.10977636345376274
iteration: 3158
Max difference: 0.1884898146669709
iteration: 3159
Max difference: 0.14344135582799483
iteration: 3160
Max difference: 0.1466600414011361
iteration: 3161
Max difference: 0.1466600414011361
iteration: 3162
Max difference: 0.14777204506766806
iteration: 3163
M

iteration: 3426
Max difference: 0.2002668845389936
iteration: 3427
Max difference: 0.11900725799569745
iteration: 3428
Max difference: 0.0797232601747988
iteration: 3429
Max difference: 0.1580571667760441
iteration: 3430
Max difference: 0.15180653024108764
iteration: 3431
Max difference: 0.14697615275771425
iteration: 3432
Max difference: 0.14697615275771425
iteration: 3433
Max difference: 0.1466951419386675
iteration: 3434
Max difference: 0.10990010839989128
iteration: 3435
Max difference: 0.21295263913532192
iteration: 3436
Max difference: 0.16580308827634532
iteration: 3437
Max difference: 0.14589909610310703
iteration: 3438
Max difference: 0.14589909610310703
iteration: 3439
Max difference: 0.2180102195741881
iteration: 3440
Max difference: 0.18693509972005395
iteration: 3441
Max difference: 0.15783762669453294
iteration: 3442
Max difference: 0.17431700003003392
iteration: 3443
Max difference: 0.14681129585771124
iteration: 3444
Max difference: 0.21255178597217522
iteration: 3445
M

Max difference: 0.21802897351670425
iteration: 3663
Max difference: 0.256611211353714
iteration: 3664
Max difference: 0.1595013908931946
iteration: 3665
Max difference: 0.1467492457735533
iteration: 3666
Max difference: 0.1467492457735533
iteration: 3667
Max difference: 0.12189243509288195
iteration: 3668
Max difference: 0.09873287242523437
iteration: 3669
Max difference: 0.14583351324173532
iteration: 3670
Max difference: 0.11807792209899648
iteration: 3671
Max difference: 0.21832997270542265
iteration: 3672
Max difference: 0.1982423056518547
iteration: 3673
Max difference: 0.14266670493206496
iteration: 3674
Max difference: 0.042984974656476826
iteration: 3675
Max difference: 0.1293860050273204
iteration: 3676
Max difference: 0.10215562813637574
iteration: 3677
Max difference: 0.1723773326776256
iteration: 3678
Max difference: 0.19807789845665977
iteration: 3679
Max difference: 0.15766841695501943
iteration: 3680
Max difference: 0.14779296836804134
iteration: 3681
Max difference: 0.1

iteration: 3822
Max difference: 0.16561904890888932
iteration: 3823
Max difference: 0.1386471514774038
iteration: 3824
Max difference: 0.1460030992056195
iteration: 3825
Max difference: 0.15450661832927784
iteration: 3826
Max difference: 0.21631675050981058
iteration: 3827
Max difference: 0.18367969438493031
iteration: 3828
Max difference: 0.14653321506071626
iteration: 3829
Max difference: 0.2275268114540937
iteration: 3830
Max difference: 0.12994453917637733
iteration: 3831
Max difference: 0.10004580773604105
iteration: 3832
Max difference: 0.15801292786035281
iteration: 3833
Max difference: 0.14779583517997885
iteration: 3834
Max difference: 0.1703274446395815
iteration: 3835
Max difference: 0.1317158995480895
iteration: 3836
Max difference: 0.219033054200873
iteration: 3837
Max difference: 0.1458390989696623
iteration: 3838
Max difference: 0.18466412729689058
iteration: 3839
Max difference: 0.12733469876086365
iteration: 3840
Max difference: 0.272369477240026
iteration: 3841
Max di

iteration: 4030
Max difference: 0.07470890646557066
iteration: 4031
Max difference: 0.014206236944991901
iteration: 4032
Max difference: 0.1459142206147913
iteration: 4033
Max difference: 0.1459142206147913
iteration: 4034
Max difference: 0.21779598836317177
iteration: 4035
Max difference: 0.1760194846535122
iteration: 4036
Max difference: 0.1500401586852838
iteration: 4037
Max difference: 0.1591780701048855
iteration: 4038
Max difference: 0.12955948184666777
iteration: 4039
Max difference: 0.14589576949469618
iteration: 4040
Max difference: 0.14589576949469618
iteration: 4041
Max difference: 0.11995614982702196
iteration: 4042
Max difference: 0.1469440353654683
iteration: 4043
Max difference: 0.1276315046081089
iteration: 4044
Max difference: 0.10079312817928343
iteration: 4045
Max difference: 0.07955449899353084
iteration: 4046
Max difference: 0.15823929185301533
iteration: 4047
Max difference: 0.1263444120229933
iteration: 4048
Max difference: 0.24315652693420226
iteration: 4049
Max

Max difference: 0.11237258184312615
iteration: 4247
Max difference: 0.15788356351518817
iteration: 4248
Max difference: 0.1261230678769718
iteration: 4249
Max difference: 0.24351278421492223
iteration: 4250
Max difference: 0.14598402746623984
iteration: 4251
Max difference: 0.14668293993213266
iteration: 4252
Max difference: 0.12182056118537132
iteration: 4253
Max difference: 0.15764679043881613
iteration: 4254
Max difference: 0.1272596498536388
iteration: 4255
Max difference: 0.14589458299646207
iteration: 4256
Max difference: 0.17329572150819308
iteration: 4257
Max difference: 0.11215750371853472
iteration: 4258
Max difference: 0.1745067646499047
iteration: 4259
Max difference: 0.12963129747007696
iteration: 4260
Max difference: 0.1469630212807682
iteration: 4261
Max difference: 0.1469630212807682
iteration: 4262
Max difference: 0.12232056724597606
iteration: 4263
Max difference: 0.14589203489172686
iteration: 4264
Max difference: 0.14589203489172686
iteration: 4265
Max difference: 0

Max difference: 0.12671458679692182
iteration: 4408
Max difference: 0.15752403440126728
iteration: 4409
Max difference: 0.14672822880732483
iteration: 4410
Max difference: 0.14672822880732483
iteration: 4411
Max difference: 0.1466134821329118
iteration: 4412
Max difference: 0.11067515068962974
iteration: 4413
Max difference: 0.21335388972252306
iteration: 4414
Max difference: 0.16822072207678346
iteration: 4415
Max difference: 0.1583843628619761
iteration: 4416
Max difference: 0.2192139912258466
iteration: 4417
Max difference: 0.24360554324745703
iteration: 4418
Max difference: 0.211333612290382
iteration: 4419
Max difference: 0.14710125318610923
iteration: 4420
Max difference: 0.14710125318610923
iteration: 4421
Max difference: 0.1466174469418896
iteration: 4422
Max difference: 0.12234436948595562
iteration: 4423
Max difference: 0.10224631562568076
iteration: 4424
Max difference: 0.1336988670083747
iteration: 4425
Max difference: 0.19433695499488116
iteration: 4426
Max difference: 0.1

Max difference: 0.12913113729514436
iteration: 4600
Max difference: 0.11295897841511282
iteration: 4601
Max difference: 0.17650648680043707
iteration: 4602
Max difference: 0.12847241816984967
iteration: 4603
Max difference: 0.10691776385969393
iteration: 4604
Max difference: 0.14702847930046603
iteration: 4605
Max difference: 0.11882013375181788
iteration: 4606
Max difference: 0.12297319543934165
iteration: 4607
Max difference: 0.16191346228565728
iteration: 4608
Max difference: 0.14798937707445425
iteration: 4609
Max difference: 0.14798937707445425
iteration: 4610
Max difference: 0.11242736379771101
iteration: 4611
Max difference: 0.22715153750249634
iteration: 4612
Max difference: 0.14586066018123955
iteration: 4613
Max difference: 0.14586066018123955
iteration: 4614
Max difference: 0.11262910834207096
iteration: 4615
Max difference: 0.167392225475351
iteration: 4616
Max difference: 0.14710732698433437
iteration: 4617
Max difference: 0.14664282980134102
iteration: 4618
Max difference

iteration: 4783
Max difference: 0.11006647451033791
iteration: 4784
Max difference: 0.21304564021285777
iteration: 4785
Max difference: 0.16671599835692552
iteration: 4786
Max difference: 0.15950983935646634
iteration: 4787
Max difference: 0.14674933557074876
iteration: 4788
Max difference: 0.14674933557074876
iteration: 4789
Max difference: 0.14777377059933883
iteration: 4790
Max difference: 0.14777377059933883
iteration: 4791
Max difference: 0.13665154388505407
iteration: 4792
Max difference: 0.2171042478113585
iteration: 4793
Max difference: 0.1459672072994176
iteration: 4794
Max difference: 0.1642785906050125
iteration: 4795
Max difference: 0.1469364028013096
iteration: 4796
Max difference: 0.1469364028013096
iteration: 4797
Max difference: 0.21135697737546272
iteration: 4798
Max difference: 0.20518970505863843
iteration: 4799
Max difference: 0.14662297820208336
iteration: 4800
Max difference: 0.16381928977814444
iteration: 4801
Max difference: 0.15763003626095407
iteration: 4802
M

iteration: 4953
Max difference: 0.12136630489055622
iteration: 4954
Max difference: 0.03317247571767157
iteration: 4955
Max difference: 0.13666177720558315
iteration: 4956
Max difference: 0.21010875176869176
iteration: 4957
Max difference: 0.1478238855108266
iteration: 4958
Max difference: 0.1478238855108266
iteration: 4959
Max difference: 0.12830304157523875
iteration: 4960
Max difference: 0.1471087426158051
iteration: 4961
Max difference: 0.11817484971056869
iteration: 4962
Max difference: 0.09459626666482168
iteration: 4963
Max difference: 0.12055903378129695
iteration: 4964
Max difference: 0.12509039817736112
iteration: 4965
Max difference: 0.15937214256154486
iteration: 4966
Max difference: 0.19962301878912037
iteration: 4967
Max difference: 0.14674787201630102
iteration: 4968
Max difference: 0.1466908931838501
iteration: 4969
Max difference: 0.1466908931838501
iteration: 4970
Max difference: 0.2128843049980016
iteration: 4971
Max difference: 0.16708917811341772
iteration: 4972
Ma

iteration: 5180
Max difference: 0.16092346242924332
iteration: 5181
Max difference: 0.14587993262300725
iteration: 5182
Max difference: 0.1777297808625513
iteration: 5183
Max difference: 0.21810944574650015
iteration: 5184
Max difference: 0.17545187218415903
iteration: 5185
Max difference: 0.14328098946383905
iteration: 5186
Max difference: 0.21183147358917975
iteration: 5187
Max difference: 0.14886887057269688
iteration: 5188
Max difference: 0.14792044670698856
iteration: 5189
Max difference: 0.16010662736882425
iteration: 5190
Max difference: 0.16864126387399386
iteration: 5191
Max difference: 0.14589318985584532
iteration: 5192
Max difference: 0.2176290624309754
iteration: 5193
Max difference: 0.17905300909279448
iteration: 5194
Max difference: 0.1438838206556417
iteration: 5195
Max difference: 0.09514138655697302
iteration: 5196
Max difference: 0.17078676208694948
iteration: 5197
Max difference: 0.1286104509946826
iteration: 5198
Max difference: 0.1471430643093995
iteration: 5199

iteration: 5351
Max difference: 0.21260419036342854
iteration: 5352
Max difference: 0.2046523444466014
iteration: 5353
Max difference: 0.21662376866210703
iteration: 5354
Max difference: 0.21596176136746142
iteration: 5355
Max difference: 0.15044447733146082
iteration: 5356
Max difference: 0.14661832322187018
iteration: 5357
Max difference: 0.1268901011940534
iteration: 5358
Max difference: 0.1566309134017253
iteration: 5359
Max difference: 0.1477709562859445
iteration: 5360
Max difference: 0.22227821885176002
iteration: 5361
Max difference: 0.2102297705522389
iteration: 5362
Max difference: 0.1782338283265932
iteration: 5363
Max difference: 0.19039838483546706
iteration: 5364
Max difference: 0.21350634891955556
iteration: 5365
Max difference: 0.147119257962951
iteration: 5366
Max difference: 0.13993168215529272
iteration: 5367
Max difference: 0.22410695190248775
iteration: 5368
Max difference: 0.12025042004352426
iteration: 5369
Max difference: 0.21793283439616606
iteration: 5370

iteration: 5531
Max difference: 0.14777131287171108
iteration: 5532
Max difference: 0.11199414957499876
iteration: 5533
Max difference: 0.14671047968744044
iteration: 5534
Max difference: 0.12154701760245346
iteration: 5535
Max difference: 0.09126116104374793
iteration: 5536
Max difference: 0.15818332790340817
iteration: 5537
Max difference: 0.13901786684710427
iteration: 5538
Max difference: 0.14600366727949754
iteration: 5539
Max difference: 0.12770720474068797
iteration: 5540
Max difference: 0.15749731570615955
iteration: 5541
Max difference: 0.14794504767879624
iteration: 5542
Max difference: 0.24220416086568403
iteration: 5543
Max difference: 0.20987943988037
iteration: 5544
Max difference: 0.14768321743881518
iteration: 5545
Max difference: 0.14579981066686515
iteration: 5546
Max difference: 0.14711860579188607
iteration: 5547
Max difference: 0.14711860579188607
iteration: 5548
Max difference: 0.14669777266011652
iteration: 5549
Max difference: 0.1098751399303155
iteration: 5550


iteration: 5717
Max difference: 0.14780104300719743
iteration: 5718
Max difference: 0.13529402710465266
iteration: 5719
Max difference: 0.20091623846293494
iteration: 5720
Max difference: 0.15815856321225508
iteration: 5721
Max difference: 0.13157707912624805
iteration: 5722
Max difference: 0.11372728325049175
iteration: 5723
Max difference: 0.14704119268076876
iteration: 5724
Max difference: 0.10279597327162415
iteration: 5725
Max difference: 0.1468263286013648
iteration: 5726
Max difference: 0.12050744415779224
iteration: 5727
Max difference: 0.13853883384326177
iteration: 5728
Max difference: 0.17625144913736995
iteration: 5729
Max difference: 0.1467467709408785
iteration: 5730
Max difference: 0.1467467709408785
iteration: 5731
Max difference: 0.11088888876964742
iteration: 5732
Max difference: 0.07166772953313721
iteration: 5733
Max difference: 0.10522774222589043
iteration: 5734
Max difference: 0.14584836881680907
iteration: 5735
Max difference: 0.11793692618535145
iteration: 5736

iteration: 5930
Max difference: 0.11010779228190637
iteration: 5931
Max difference: 0.1684012074467977
iteration: 5932
Max difference: 0.1469497806162694
iteration: 5933
Max difference: 0.10976521903783232
iteration: 5934
Max difference: 0.18963756284737243
iteration: 5935
Max difference: 0.14586343028408905
iteration: 5936
Max difference: 0.14713762641892036
iteration: 5937
Max difference: 0.14978931518285088
iteration: 5938
Max difference: 0.2122640267934096
iteration: 5939
Max difference: 0.21685404206708786
iteration: 5940
Max difference: 0.14596122365839814
iteration: 5941
Max difference: 0.1466053297732448
iteration: 5942
Max difference: 0.12242762533561291
iteration: 5943
Max difference: 0.06089697149372364
iteration: 5944
Max difference: 0.13542553836241866
iteration: 5945
Max difference: 0.1593170124195773
iteration: 5946
Max difference: 0.15332919882146961
iteration: 5947
Max difference: 0.14683007901296163
iteration: 5948
Max difference: 0.12213242557210402
iteration: 5949

Max difference: 0.11134949071289568
iteration: 6179
Max difference: 0.09019308747744548
iteration: 6180
Max difference: 0.1468351468760022
iteration: 6181
Max difference: 0.1468351468760022
iteration: 6182
Max difference: 0.2037017251455383
iteration: 6183
Max difference: 0.2061686408215479
iteration: 6184
Max difference: 0.1459602421752979
iteration: 6185
Max difference: 0.1203562681246737
iteration: 6186
Max difference: 0.10110160755580849
iteration: 6187
Max difference: 0.1167779811110099
iteration: 6188
Max difference: 0.14881897836326743
iteration: 6189
Max difference: 0.14697058870193158
iteration: 6190
Max difference: 0.11122504111260845
iteration: 6191
Max difference: 0.1573292073895794
iteration: 6192
Max difference: 0.12871127290466988
iteration: 6193
Max difference: 0.15747032519928728
iteration: 6194
Max difference: 0.14806517958748244
iteration: 6195
Max difference: 0.1651389488951861
iteration: 6196
Max difference: 0.18095515175518517
iteration: 6197
Max difference: 0.145

iteration: 6383
Max difference: 0.20434379621866822
iteration: 6384
Max difference: 0.13197464757992872
iteration: 6385
Max difference: 0.11927342907221955
iteration: 6386
Max difference: 0.1946812878941538
iteration: 6387
Max difference: 0.11210872045023584
iteration: 6388
Max difference: 0.08517992325469442
iteration: 6389
Max difference: 0.11704298440655331
iteration: 6390
Max difference: 0.14582896867231798
iteration: 6391
Max difference: 0.11812105511228832
iteration: 6392
Max difference: 0.2183212513306082
iteration: 6393
Max difference: 0.17692988397443854
iteration: 6394
Max difference: 0.14678933750898682
iteration: 6395
Max difference: 0.14669234614029858
iteration: 6396
Max difference: 0.1099266436328542
iteration: 6397
Max difference: 0.21296696571616308
iteration: 6398
Max difference: 0.16593616076114165
iteration: 6399
Max difference: 0.12919842927099984
iteration: 6400
Max difference: 0.04746575505334605
iteration: 6401
Max difference: 0.13347956180950826
iteration: 6402

iteration: 6555
Max difference: 0.19391500947650406
iteration: 6556
Max difference: 0.15805171213628577
iteration: 6557
Max difference: 0.1329124522114643
iteration: 6558
Max difference: 0.14667606457997073
iteration: 6559
Max difference: 0.12184268816022814
iteration: 6560
Max difference: 0.15764190651439125
iteration: 6561
Max difference: 0.1468082696953572
iteration: 6562
Max difference: 0.1468082696953572
iteration: 6563
Max difference: 0.11276822958987609
iteration: 6564
Max difference: 0.09134226596779971
iteration: 6565
Max difference: 0.17565369602446756
iteration: 6566
Max difference: 0.14682404441213015
iteration: 6567
Max difference: 0.17665354431355776
iteration: 6568
Max difference: 0.15767113384357334
iteration: 6569
Max difference: 0.14680865964971357
iteration: 6570
Max difference: 0.14680865964971357
iteration: 6571
Max difference: 0.1922581010947924
iteration: 6572
Max difference: 0.15130296689038686
iteration: 6573
Max difference: 0.21277007684825777
iteration: 6574


Max difference: 0.15773760550695237
iteration: 6828
Max difference: 0.1353460310543373
iteration: 6829
Max difference: 0.1471065962028787
iteration: 6830
Max difference: 0.11270297687727665
iteration: 6831
Max difference: 0.08430238676432045
iteration: 6832
Max difference: 0.14075020820528394
iteration: 6833
Max difference: 0.15930116729828503
iteration: 6834
Max difference: 0.15290539881282855
iteration: 6835
Max difference: 0.14686189777627323
iteration: 6836
Max difference: 0.20571411789167715
iteration: 6837
Max difference: 0.15769250995271522
iteration: 6838
Max difference: 0.14673001950363562
iteration: 6839
Max difference: 0.17459081256221437
iteration: 6840
Max difference: 0.10981560023582182
iteration: 6841
Max difference: 0.1466739756180847
iteration: 6842
Max difference: 0.1466739756180847
iteration: 6843
Max difference: 0.14661290548552897
iteration: 6844
Max difference: 0.14661290548552897
iteration: 6845
Max difference: 0.14668889178464406
iteration: 6846

Max difference: 0.21795275939469666
iteration: 6986
Max difference: 0.187441029170208
iteration: 6987
Max difference: 0.14791222227054257
iteration: 6988
Max difference: 0.14776029944349223
iteration: 6989
Max difference: 0.11246304381687455
iteration: 6990
Max difference: 0.15788197338701093
iteration: 6991
Max difference: 0.12761287159389845
iteration: 6992
Max difference: 0.14696008855190357
iteration: 6993
Max difference: 0.12946890079875228
iteration: 6994
Max difference: 0.10208729359584906
iteration: 6995
Max difference: 0.08173798815305033
iteration: 6996
Max difference: 0.15823934160768205
iteration: 6997
Max difference: 0.14780072168526703
iteration: 6998
Max difference: 0.14780072168526703
iteration: 6999
Max difference: 0.1242451680812523
iteration: 7000
Max difference: 0.14598232820445656
iteration: 7001
Max difference: 0.14598232820445656
iteration: 7002
Max difference: 0.10974077069492971
iteration: 7003
Max difference: 0.0968402390264474
iteration: 7004

iteration: 7183
Max difference: 0.15142790282475005
iteration: 7184
Max difference: 0.21255525611884507
iteration: 7185
Max difference: 0.20059087728960845
iteration: 7186
Max difference: 0.12293965493536763
iteration: 7187
Max difference: 0.10240896947357636
iteration: 7188
Max difference: 0.08484078263069894
iteration: 7189
Max difference: 0.1044010830598463
iteration: 7190
Max difference: 0.15962128001897513
iteration: 7191
Max difference: 0.14783077222887575
iteration: 7192
Max difference: 0.2291245596778742
iteration: 7193
Max difference: 0.21012049603687838
iteration: 7194
Max difference: 0.14679668452925787
iteration: 7195
Max difference: 0.14679668452925787
iteration: 7196
Max difference: 0.1118668138744523
iteration: 7197
Max difference: 0.09061211923830625
iteration: 7198
Max difference: 0.14682760956410834
iteration: 7199
Max difference: 0.16377003231956563
iteration: 7200
Max difference: 0.15766576945808675
iteration: 7201
Max difference: 0.19472185457084948
iteration: 7202

iteration: 7367
Max difference: 0.14589086952025165
iteration: 7368
Max difference: 0.10960340871420504
iteration: 7369
Max difference: 0.14693575984510554
iteration: 7370
Max difference: 0.15125016428444693
iteration: 7371
Max difference: 0.14662908268424346
iteration: 7372
Max difference: 0.12224117690870417
iteration: 7373
Max difference: 0.13398314795865707
iteration: 7374
Max difference: 0.11858156722146207
iteration: 7375
Max difference: 0.08885105390379322
iteration: 7376
Max difference: 0.1582953566883103
iteration: 7377
Max difference: 0.14795381508236305
iteration: 7378
Max difference: 0.2432076497356585
iteration: 7379
Max difference: 0.13715891691667842
iteration: 7380
Max difference: 0.11531531220219718
iteration: 7381
Max difference: 0.13298701064022522
iteration: 7382
Max difference: 0.12148266396374063
iteration: 7383
Max difference: 0.15842879194398796
iteration: 7384
Max difference: 0.13058868573615978
iteration: 7385
Max difference: 0.14711652548254717
iteration: 738

iteration: 7542
Max difference: 0.16809795327081106
iteration: 7543
Max difference: 0.21196140243322742
iteration: 7544
Max difference: 0.12250710206074478
iteration: 7545
Max difference: 0.09089916258969111
iteration: 7546
Max difference: 0.1459100707821508
iteration: 7547
Max difference: 0.1459100707821508
iteration: 7548
Max difference: 0.14660478607853078
iteration: 7549
Max difference: 0.12006826291039666
iteration: 7550
Max difference: 0.21339569247515705
iteration: 7551
Max difference: 0.16671734613789652
iteration: 7552
Max difference: 0.1725644441731911
iteration: 7553
Max difference: 0.1566001501311125
iteration: 7554
Max difference: 0.14777050754632592
iteration: 7555
Max difference: 0.14777050754632592
iteration: 7556
Max difference: 0.1466741007981942
iteration: 7557
Max difference: 0.12188345531170974
iteration: 7558
Max difference: 0.15764361177624808
iteration: 7559
Max difference: 0.14795016312673082
iteration: 7560
Max difference: 0.20489876985387845
iteration: 7561

iteration: 7784
Max difference: 0.14591264236026624
iteration: 7785
Max difference: 0.14660481341137166
iteration: 7786
Max difference: 0.14660481341137166
iteration: 7787
Max difference: 0.14775950383005965
iteration: 7788
Max difference: 0.13680132352300745
iteration: 7789
Max difference: 0.15742308093873458
iteration: 7790
Max difference: 0.1477883024986738
iteration: 7791
Max difference: 0.1551003647595801
iteration: 7792
Max difference: 0.1489528324565771
iteration: 7793
Max difference: 0.14589298705907594
iteration: 7794
Max difference: 0.14589298705907594
iteration: 7795
Max difference: 0.14660460449871326
iteration: 7796
Max difference: 0.12242936240398405
iteration: 7797
Max difference: 0.08450413016913805
iteration: 7798
Max difference: 0.08814725562289172
iteration: 7799
Max difference: 0.1582605666428032
iteration: 7800
Max difference: 0.1468136301713847
iteration: 7801
Max difference: 0.12072934276196878
iteration: 7802
Max difference: 0.06727409809359047
iteration: 7803

iteration: 7964
Max difference: 0.146824439376395
iteration: 7965
Max difference: 0.12052111438311475
iteration: 7966
Max difference: 0.09953843630528325
iteration: 7967
Max difference: 0.14311050992002378
iteration: 7968
Max difference: 0.08366379268261981
iteration: 7969
Max difference: 0.062012365864910934
iteration: 7970
Max difference: 0.12364280565671362
iteration: 7971
Max difference: 0.08865146929122136
iteration: 7972
Max difference: 0.13080784938272727
iteration: 7973
Max difference: 0.12127566181760352
iteration: 7974
Max difference: 0.09077734287853306
iteration: 7975
Max difference: 0.15829693299705994
iteration: 7976
Max difference: 0.12823610026215027
iteration: 7977
Max difference: 0.10387124121234176
iteration: 7978
Max difference: 0.18229065113961052
iteration: 7979
Max difference: 0.14683781599500123
iteration: 7980
Max difference: 0.21183488194057765
iteration: 7981
Max difference: 0.1645303294078309
iteration: 7982
Max difference: 0.12778760709318943
iteration: 798

iteration: 8142
Max difference: 0.09071940229560804
iteration: 8143
Max difference: 0.14612873129322967
iteration: 8144
Max difference: 0.12663494966992295
iteration: 8145
Max difference: 0.16472599190115028
iteration: 8146
Max difference: 0.18370461810364996
iteration: 8147
Max difference: 0.1378660731308321
iteration: 8148
Max difference: 0.15846107101149187
iteration: 8149
Max difference: 0.15430231904097041
iteration: 8150
Max difference: 0.24432227115177302
iteration: 8151
Max difference: 0.14598454304594155
iteration: 8152
Max difference: 0.10871968040418478
iteration: 8153
Max difference: 0.15606586833326413
iteration: 8154
Max difference: 0.14694713262968773
iteration: 8155
Max difference: 0.11951808908211581
iteration: 8156
Max difference: 0.15770376064707814
iteration: 8157
Max difference: 0.12582375728886563
iteration: 8158
Max difference: 0.1379764135462389
iteration: 8159
Max difference: 0.20806583315516197
iteration: 8160
Max difference: 0.14607523902527564
iteration: 816

iteration: 8317
Max difference: 0.1293861307751028
iteration: 8318
Max difference: 0.21115651116373463
iteration: 8319
Max difference: 0.16432875726519702
iteration: 8320
Max difference: 0.14679199007092514
iteration: 8321
Max difference: 0.21238200661460283
iteration: 8322
Max difference: 0.16483327778758672
iteration: 8323
Max difference: 0.12732697836714485
iteration: 8324
Max difference: 0.14695836911513516
iteration: 8325
Max difference: 0.17804079161955805
iteration: 8326
Max difference: 0.18207252992030065
iteration: 8327
Max difference: 0.11230311655007252
iteration: 8328
Max difference: 0.1578872926904359
iteration: 8329
Max difference: 0.16262329226320543
iteration: 8330
Max difference: 0.11078898606760257
iteration: 8331
Max difference: 0.19331349318129304
iteration: 8332
Max difference: 0.15023322927193772
iteration: 8333
Max difference: 0.11595427724422858
iteration: 8334
Max difference: 0.10288306777884215
iteration: 8335
Max difference: 0.08859631899651316
iteration: 833

iteration: 8490
Max difference: 0.13380627512208787
iteration: 8491
Max difference: 0.15663030604551142
iteration: 8492
Max difference: 0.1426193846074657
iteration: 8493
Max difference: 0.1595792989197105
iteration: 8494
Max difference: 0.14664790488404666
iteration: 8495
Max difference: 0.10173516961957646
iteration: 8496
Max difference: 0.12226161742687885
iteration: 8497
Max difference: 0.10355339245212586
iteration: 8498
Max difference: 0.16015954741295424
iteration: 8499
Max difference: 0.08128897471067437
iteration: 8500
Max difference: 0.09874917484360812
iteration: 8501
Max difference: 0.19469052029452663
iteration: 8502
Max difference: 0.11731464620072729
iteration: 8503
Max difference: 0.21791456774918605
iteration: 8504
Max difference: 0.17490430248706718
iteration: 8505
Max difference: 0.16215905035654782
iteration: 8506
Max difference: 0.20160944588777308
iteration: 8507
Max difference: 0.2211537985913556
iteration: 8508
Max difference: 0.21022396071898392
iteration: 8509

iteration: 8652
Max difference: 0.09117448684106011
iteration: 8653
Max difference: 0.14684426842176235
iteration: 8654
Max difference: 0.14684426842176235
iteration: 8655
Max difference: 0.2118829659096314
iteration: 8656
Max difference: 0.20490858897916225
iteration: 8657
Max difference: 0.133249254745271
iteration: 8658
Max difference: 0.2178206045547435
iteration: 8659
Max difference: 0.23280542244448227
iteration: 8660
Max difference: 0.13634644321815292
iteration: 8661
Max difference: 0.24630672701329515
iteration: 8662
Max difference: 0.13238785176259393
iteration: 8663
Max difference: 0.1469858460268011
iteration: 8664
Max difference: 0.1469858460268011
iteration: 8665
Max difference: 0.1466334586541853
iteration: 8666
Max difference: 0.1466334586541853
iteration: 8667
Max difference: 0.14668938036204016
iteration: 8668
Max difference: 0.1106228426988829
iteration: 8669
Max difference: 0.20662804231648613
iteration: 8670
Max difference: 0.16185641672885465
iteration: 8671

iteration: 8863
Max difference: 0.14674702140802776
iteration: 8864
Max difference: 0.12188434062867382
iteration: 8865
Max difference: 0.14589180983958705
iteration: 8866
Max difference: 0.14589180983958705
iteration: 8867
Max difference: 0.21802794212535198
iteration: 8868
Max difference: 0.17539276583758767
iteration: 8869
Max difference: 0.1409331008668751
iteration: 8870
Max difference: 0.14583248906704316
iteration: 8871
Max difference: 0.14583248906704316
iteration: 8872
Max difference: 0.21832176624177402
iteration: 8873
Max difference: 0.2024358150092509
iteration: 8874
Max difference: 0.13793931400748388
iteration: 8875
Max difference: 0.14694073767175075
iteration: 8876
Max difference: 0.12725406117039523
iteration: 8877
Max difference: 0.21133685010514908
iteration: 8878
Max difference: 0.16305404048013572
iteration: 8879
Max difference: 0.14713658592031087
iteration: 8880
Max difference: 0.14713658592031087
iteration: 8881
Max difference: 0.1466399207657958
iteration: 8882

### Notes

Results vary from run to run: in my tries, the algorithm needed anywhere from **8000 to 22000** iterations to converge to the optimal policy.

Although this number varies, it seems to be the typical range for the current grid, with the enemy blocks at (1, 2) and (2, 2) and the final state at (2, 3).
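One likely contributor to this spread is randomness in how the noisy move is handled. For comparison, here is a minimal sketch of a different variant, not the loop used above: it averages over the possible slips instead of sampling one, so its iteration count is the same on every run. The transition model (with probability `noise` the ship takes one of the other legal moves, chosen uniformly) and the helper names `legal_moves`, `step` and `expected_value_iteration` are assumptions made for illustration only.

```python
# Sketch: value iteration with an expected (not sampled) noisy backup.
# The transition model below is an assumption, not the notebook's own loop.
GAMMA, NOISE, THRESHOLD = 0.9, 0.1, 0.005
MOVES = {'U': (-1, 0), 'D': (1, 0), 'L': (0, -1), 'R': (0, 1)}
TERMINAL = {(1, 2): -1.0, (2, 2): -1.0, (2, 3): 1.0}  # pirates and port
STATES = [(i, j) for i in range(3) for j in range(4)]


def legal_moves(state):
    """Moves that keep the ship on the 3 x 4 grid (hypothetical helper)."""
    return [a for a, (di, dj) in MOVES.items()
            if 0 <= state[0] + di < 3 and 0 <= state[1] + dj < 4]


def step(state, action):
    """Block reached from `state` when `action` succeeds."""
    di, dj = MOVES[action]
    return (state[0] + di, state[1] + dj)


def expected_value_iteration():
    # Terminal blocks keep their reward as value; all others start at 0.
    V = {s: TERMINAL.get(s, 0.0) for s in STATES}
    iterations = 0
    while True:
        iterations += 1
        biggest_change = 0.0
        for s in STATES:
            if s in TERMINAL:
                continue  # terminal values stay fixed
            moves = legal_moves(s)
            best = float('-inf')
            for a in moves:
                # Average value of the blocks reached by slipping (assumed model).
                others = [V[step(s, b)] for b in moves if b != a]
                slip = sum(others) / len(others)
                # Ordinary blocks carry reward 0, so only the discounted term remains.
                candidate = GAMMA * ((1 - NOISE) * V[step(s, a)] + NOISE * slip)
                best = max(best, candidate)
            biggest_change = max(biggest_change, abs(best - V[s]))
            V[s] = best
        if biggest_change < THRESHOLD:
            return V, iterations


V, n_iter = expected_value_iteration()
print("converged after", n_iter, "iterations")
```

Because nothing in this sketch is random, calling the function twice returns the same count. The values it produces need not match the ones printed further down, since the model is only assumed.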

Below we print the values under the optimal policy, borrowing some code from the interactive activities.

In [11]:
def print_values(V):
    """Print the value of every block as a 3 x 4 grid, one row per line."""
    for i in range(3):
        print("---------------------------")
        for j in range(4):
            # blocks missing from V are shown as 0
            v = V.get((i,j), 0)
            if v >= 0:
                print(" %.2f|" % v, end="")
            else:
                print("%.2f|" % v, end="") # -ve sign takes up an extra space
        print("")

In [12]:
print_values(values)

---------------------------
 0.41| 0.46| 0.52| 0.76|
---------------------------
 0.36| 0.41|-1.00| 0.88|
---------------------------
 0.33| 0.36|-1.00| 1.00|


### Conclusion Note

It is curious that the blocks at (1, 1) and (2, 1) hold values (0.41 and 0.36) no higher than block (0, 1) (0.46) or even the start (0, 0) (0.41), although the latter two are noticeably further from the port.

The reason is that these *dangerous* blocks lie right next to the *pirate ship* blocks: their values drop because an accidental maneuver could carry the ship into the pirate region.
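To put rough numbers on this, here is a small sketch of a single noisy backup for block (1, 1). It uses `gamma = 0.9` and `noise = 0.1` from the setup cell and the printed values; the form of the update and the helper name `noisy_backup` are assumptions for illustration, not necessarily the exact update used in the loop.

```python
GAMMA = 0.9  # discount factor from the setup cell
NOISE = 0.1  # noise level from the setup cell


def noisy_backup(v_intended, v_slip, gamma=GAMMA, noise=NOISE):
    """Hypothetical one-step backup for a block with zero immediate reward."""
    return gamma * ((1 - noise) * v_intended + noise * v_slip)


# Block (1, 1): the intended move is up to (0, 1), whose printed value is 0.46.
slip_to_pirate = noisy_backup(0.46, -1.00)  # accidental move right into (1, 2)
slip_to_safe = noisy_backup(0.46, 0.36)     # accidental move left to (1, 0)

print(round(slip_to_pirate, 3), round(slip_to_safe, 3))
```

Under these assumptions, the same intended move is worth noticeably less when the accidental move lands on a pirate block, which is the effect described above.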

By looking at the printed values, we can read off the best next action at any given state: in general, steer toward the neighbouring block with the highest value.
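As a rough illustration of reading the actions off the table, the sketch below picks, for each block, the allowed move whose neighbouring block has the highest printed value. The names `printed_values`, `allowed_actions`, `MOVES` and `greedy_policy` are hypothetical; the numbers are copied from the printed grid and the action lists from the setup cell. It ignores the noise and works with values rounded to two decimals, so ties (for example at (1, 0) and (2, 0)) are broken simply by the order in which the actions are listed.

```python
# Values copied from the printed grid (rounded to two decimals).
printed_values = {
    (0, 0): 0.41, (0, 1): 0.46, (0, 2): 0.52, (0, 3): 0.76,
    (1, 0): 0.36, (1, 1): 0.41, (1, 2): -1.00, (1, 3): 0.88,
    (2, 0): 0.33, (2, 1): 0.36, (2, 2): -1.00, (2, 3): 1.00,
}

# Allowed actions per non-terminal block, as in the setup cell.
allowed_actions = {
    (0, 0): ('D', 'R'),
    (0, 1): ('D', 'R', 'L'),
    (0, 2): ('D', 'R', 'L'),
    (0, 3): ('D', 'L'),
    (1, 0): ('D', 'U', 'R'),
    (1, 1): ('D', 'U', 'R', 'L'),
    (1, 3): ('D', 'L', 'U'),
    (2, 0): ('U', 'R'),
    (2, 1): ('U', 'L', 'R'),
}

# Row/column offsets: 'D' moves to the next row, 'R' to the next column.
MOVES = {'U': (-1, 0), 'D': (1, 0), 'L': (0, -1), 'R': (0, 1)}


def greedy_policy(values, actions):
    """For each block, pick the action leading to the highest-valued neighbour."""
    policy = {}
    for state, allowed in actions.items():
        policy[state] = max(
            allowed,
            key=lambda a: values[(state[0] + MOVES[a][0], state[1] + MOVES[a][1])],
        )
    return policy


policy_from_values = greedy_policy(printed_values, allowed_actions)
for state in sorted(policy_from_values):
    print(state, policy_from_values[state])
```

A more careful extraction would weigh each action by its noisy outcome rather than by the intended neighbour alone, but for this grid the simple lookup already steers the ship around the pirate blocks and toward the port.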