A small example of Q learning given a set reward matrix and graph with corresponding weighted edges.
Q-Learning is a reinforcement learning technique used in machine learning. The goal of Q-Learning is to learn a policy, which tells an agent what action to take under what circumstance. It does not require a model of the environment and can handle problems with stochastic transitions and rewards, without requiring adaptations.(1) 1.https://en.wikipedia.org/wiki/Q-learning
Matlab