Commit

Merge pull request #3 from yrevar/patch-2
Fixing policy improvement equation
mimoralea committed Jul 16, 2017
2 parents 34273a9 + 7f074ce commit 863804c
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion notebooks/solutions/03-planning-algorithms.ipynb
@@ -751,7 +751,7 @@
  " Qs = np.zeros(len(A), dtype=float)\n",
  " for a in A:\n",
  " for prob, s_prime, reward, done in P[s][a]:\n",
- " Qs[a] += prob * (reward + gamma * V[s] * (not done))\n",
+ " Qs[a] += prob * (reward + gamma * V[s_prime] * (not done))\n",
  " pi[s] = np.argmax(Qs)\n",
  " V[s] = np.max(Qs)\n",
  " return pi, V"
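For reviewers who want to check the effect of this change, here is a minimal sketch of the backup with the corrected term. The hunk does not show the enclosing function, so the name `greedy_sweep`, the three-state chain MDP, and `gamma=0.9` are all assumptions made for illustration. Only the transition tuple layout `(prob, s_prime, reward, done)` and the inner loop are taken from the diff.

```python
import numpy as np

def greedy_sweep(P, V, gamma=0.9):
    """One in-place sweep of greedy one-step lookahead (hypothetical helper).

    P[s][a] is a list of (prob, s_prime, reward, done) tuples, as in the diff.
    Actions are assumed to be the integers 0..len(P[s])-1.
    """
    pi = np.zeros(len(P), dtype=int)
    for s in range(len(P)):
        A = range(len(P[s]))
        Qs = np.zeros(len(A), dtype=float)
        for a in A:
            for prob, s_prime, reward, done in P[s][a]:
                # Bootstrap from the successor state's value, not V[s].
                Qs[a] += prob * (reward + gamma * V[s_prime] * (not done))
        pi[s] = np.argmax(Qs)
        V[s] = np.max(Qs)
    return pi, V

# Toy chain (assumed): 0 -> 1 -> 2, reward 1.0 on entering terminal state 2.
# Action 0 stays in place, action 1 moves right.
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 0.0, False)]},
    1: {0: [(1.0, 1, 0.0, False)], 1: [(1.0, 2, 1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}

V = np.zeros(len(P))
for _ in range(2):  # two sweeps are enough for the reward to reach state 0
    pi, V = greedy_sweep(P, V, gamma=0.9)
# After two sweeps: pi is [1, 1, 0] and V is [0.9, 1.0, 0.0]
```

With the old `V[s]` term, state 0 would only ever bootstrap from its own value, which starts at zero, so the reward two steps away would never propagate back to it on this chain.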
