This presents the DQN algorithm. In DQN_Main we can change the parameters to see how the Agent learns.
The 'egreedy_decay' parameter adjusts the exponential decay of the probability of picking a random action. A slower decay is useful when we want our Agent to explore more, so if we set a higher number of training episodes, we can also choose a higher value for 'egreedy_decay'.
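As a minimal sketch of what such a decay schedule typically looks like (the function name `epsilon_by_step` and the default values for the start, floor, and decay constants are illustrative assumptions, not taken from DQN_Main):

```python
import math

def epsilon_by_step(step,
                    egreedy=0.9,         # assumed initial exploration probability
                    egreedy_final=0.02,  # assumed floor the probability decays toward
                    egreedy_decay=500):  # decay time constant, in steps
    """Exponential decay of the random-action probability (epsilon)."""
    return egreedy_final + (egreedy - egreedy_final) * math.exp(-step / egreedy_decay)

# A larger 'egreedy_decay' keeps epsilon high for longer, so the Agent
# keeps exploring over more episodes before it settles near the floor.
early = epsilon_by_step(0)       # close to the initial value
late = epsilon_by_step(5000)     # close to the floor value
```

With this shape, doubling `egreedy_decay` roughly doubles the number of steps before epsilon approaches its floor, which is why longer training runs pair naturally with larger decay values.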
How it works is shown below.
The results of training are presented here.
The 'train()' method can return the average rewards, which we can store in variables. This lets us train multiple Agents with different parameters and compare how they evolve.
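A minimal sketch of that comparison, assuming 'train()' returns a list of per-episode average rewards; the helper `compare_agents` and the stand-in `fake_train` are hypothetical names introduced here for illustration:

```python
def compare_agents(train_fn, param_grid):
    """Train one Agent per parameter setting and collect its reward curve."""
    return {name: train_fn(**params) for name, params in param_grid.items()}

# Toy stand-in for the real train() so this sketch runs on its own;
# it just returns a short, deterministic "reward curve".
def fake_train(egreedy_decay=500, episodes=3):
    return [episode / egreedy_decay for episode in range(episodes)]

curves = compare_agents(fake_train, {
    "fast_decay": {"egreedy_decay": 100},
    "slow_decay": {"egreedy_decay": 1000},
})
# 'curves' now maps each configuration name to its reward history,
# ready to be plotted side by side.
```

Storing each run under a descriptive key makes it straightforward to plot the curves together and see which parameter setting learns faster.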