Why does the program only use two states? #33
Comments
@guotong1988 I think you should learn the very basic concepts of reinforcement learning first. It is basically dynamic programming: the state changes from step to step. You'd better learn about the Markov Decision Process and the Bellman equation first.
the state changes from time to time

Think of it the other way around: why not use just one state instead of two?
@guotong1988 No, you cannot use only one state. Intuitively, you must interact with the environment by taking actions in order to learn. Once your action is done, you are in another state, and you receive a reward or punishment from the environment; that is what allows you to learn something.
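The act-transition-reward loop described above can be sketched in a few lines. This is a toy illustration, not the repository's code; `step` is a made-up deterministic environment invented here for the example:

```python
# Minimal sketch of the agent-environment loop: act, land in a new
# state, receive a reward. `step` is a hypothetical toy environment.
def step(state, action):
    """Toy transition: moving right (+1) toward state 3 earns a reward."""
    next_state = max(0, min(3, state + action))
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward

state = 0
for _ in range(3):
    action = 1                                # always move right
    next_state, reward = step(state, action)  # act -> end up in another state
    state = next_state                        # the new state becomes current
print(state, reward)  # -> 3 1.0
```

The point is simply that every interaction produces a pair of states, the one before the action and the one after, plus a reward signal in between.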
@guotong1988 For a comprehensive understanding, you should learn MDP theory first.
The key point is that these two states are directly adjacent (consecutive in time).
@guotong1988 Like I said, you really need to learn MDP first. The Markov property means that the current state captures all relevant information from the history, so the future state depends only on the current state. In mathematical form: P[s_{t+1} | s_t] = P[s_{t+1} | s_1, ..., s_t].
The answer: one state contains 4 frames.
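That detail matters because a single game frame is not Markov (it carries no velocity information), so DQN-style agents stack the last few frames into one state. A sketch of the idea, assuming 84x84 preprocessed frames and a stack of 4 (common choices, not taken from this repo):

```python
# Sketch of "one state = 4 frames": the network's state is a stack of
# the 4 most recent frames, restoring enough history (e.g. motion) for
# the Markov assumption to hold. Frame size is an assumption.
from collections import deque

import numpy as np

FRAME_SHAPE = (84, 84)    # typical preprocessed frame size (assumption)
frames = deque(maxlen=4)  # old frames fall off automatically

def push_frame(frame):
    """Append a frame and return the stacked state."""
    frames.append(frame)
    while len(frames) < 4:      # at episode start, pad by repeating
        frames.append(frame)
    return np.stack(frames, axis=-1)  # shape (84, 84, 4)

state = push_frame(np.zeros(FRAME_SHAPE))
print(state.shape)  # -> (84, 84, 4)
```

So "the current state and the next state" are each already 4-frame stacks; the pair still forms a single Markov transition.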
I read it from here.
Why does the program use only the current state and the next state?
Why does using just these two states work?
Thank you @yenchenlin