These JavaScript modules are library to implement Reinforcement Learning, especially for Q-Learning.
These modules are functionally equivalent to Python Scripts in pyqlearning. Considering many variable parts and functional extensions in the Q-learning paradigm, I implemented these scripts for demonstrations of commonality/variability analysis in order to design the models.
The source code is currently hosted on GitHub.
The function of autocompletion is a kind of natural language processing. Load follow JavaScript files in devsample. These scripts are functionally equivalent to Python Scripts in pysummarization.
<script type="text/javascript" src="devsample/nlpbase.js"></script>
<script type="text/javascript" src="devsample/ngram.js"></script>
The modules of autocompletion depend on TinySegmenter (v0.2). Load this JavaScript file.
<script type="text/javascript" src="dependencies/tiny_segmenter-0.2.js"></script>
And Q-Learning
modules are to be included.
<script type="text/javascript" src="jsqlearning/qlearning.js"></script>
<script type="text/javascript" src="jsqlearning/qlearning/boltzmann.js"></script>
<script type="text/javascript" src="jsqlearning/qlearning/boltzmann/autocompletion.js"></script>
If you want to use not Boltzmann-Distribution-Q-Learning
but Epsilon-Greedy-Q-Learning
, include follow files instead.
<script type="text/javascript" src="jsqlearning/qlearning.js"></script>
<script type="text/javascript" src="jsqlearning/qlearning/greedy.js"></script>
<script type="text/javascript" src="jsqlearning/qlearning/greedy/autocompletion.js"></script>
Initialize NLP modules.
// The number of n-gram.
var n = 2;
// The function of n-gram.
var n_gram = new Ngram();
// Base class of NLP for tokenization.
var nlp_base = new NlpBase();
// The function of autocompletion algorithm.
var autocompletion = new Autocompletion(
nlp_base,
n_gram,
n
);
Setup hyperparameters in Boltzmann-Distribution-Q-Learning
.
// Time rate in boltzmann distribution.
taime_rate = 0.001;
// The algorithm of boltzmann distribution.
var strategy = new Boltzmann(
autocompletion,
{
"time_rate": time_rate
}
);
If you want to use Epsilon-Greedy-Q-Learning
, setup the epsilon-greedy-rate instead.
// The epsilon greedy rate.
epsilon_greedy_rate = 0.75;
// The algorithm of epsilon-greedy.
var strategy = new Greedy(
autocompletion,
{
"epsilon_greedy_rate": epsilon_greedy_rate
}
);
And, setup common hyperparameters in Q-Learning
and initialize.
// Alpha value in Q-Learning algorithm.
alpha_value = 0.5;
// Gamma value in Q-Learning algorithm.
gamma_value = 0.5;
// The number of learning.
limit = 10000;
// Base class of Q-Learning.
var q_learning = new QLearning(
strategy,
{
"alpha_value": alpha_value,
"gamma_value": gamma_value
}
);
Set learned data.
// Learned data.
first_learned_data = "hogehogehogefugafuga";
// Pre training for first user's typing.
autocompletion_.pre_training(
q_learning,
first_learned_data
);
Execute recursive learning in loop control structure or recursive call.
// User's typing.
input_document = "hogefuga";
// Extract state in input_document.
var state_key = autocompletion.lap_extract_ngram(
q_learning,
input_document
);
// Learning.
q_learning.learn(state_key, limit);
// Predict next token.
var next_action_list = q_learning.extract_possible_actions(
state_key
);
var action_key = q_learning.select_action(
state_key,
next_action_list
);
// Compute reward value.
var reward_value = q_learning.observe_reward_value(
state_key,
action_key
);
// Compute Q-Value.
var q_value = q_learning.extract_q_dict(
state_key,
action_key
);
// Pre training for next user's typing.
autocompletion_.pre_training(
q_learning,
input_document
);
- 深層強化学習のベイズ主義的な情報探索に駆動された自然言語処理の意味論 (Japanese)
- 「人工の理想」を背景とした「万物照応」のデータモデリング (Japanese)
- 1.0.1
- chimera0(RUM)
- GNU General Public License v2.0