Reinforcement Learning with JavaScript

These JavaScript modules form a library for implementing Reinforcement Learning, especially Q-Learning.

Description

These modules are functionally equivalent to the Python scripts in pyqlearning. Because the Q-Learning paradigm has many variable parts and functional extensions, I implemented these scripts to demonstrate commonality/variability analysis as a basis for designing the models.

Installation

Source code

The source code is currently hosted on GitHub.

Demonstration: Autocompletion

Code sample

Autocompletion is a natural language processing task. Load the following JavaScript files from devsample. These scripts are functionally equivalent to the Python scripts in pysummarization.

<script type="text/javascript" src="devsample/nlpbase.js"></script>
<script type="text/javascript" src="devsample/ngram.js"></script>

The autocompletion modules depend on TinySegmenter (v0.2), so load that JavaScript file as well.

<script type="text/javascript" src="dependencies/tiny_segmenter-0.2.js"></script>

Then include the Q-Learning modules.

<script type="text/javascript" src="jsqlearning/qlearning.js"></script>
<script type="text/javascript" src="jsqlearning/qlearning/boltzmann.js"></script>
<script type="text/javascript" src="jsqlearning/qlearning/boltzmann/autocompletion.js"></script>

If you want to use Epsilon-Greedy-Q-Learning instead of Boltzmann-Distribution-Q-Learning, include the following files instead.

<script type="text/javascript" src="jsqlearning/qlearning.js"></script>
<script type="text/javascript" src="jsqlearning/qlearning/greedy.js"></script>
<script type="text/javascript" src="jsqlearning/qlearning/greedy/autocompletion.js"></script>

Initialize the NLP modules.

// The n in n-gram (2 means bigrams).
var n = 2;
// The n-gram generator.
var n_gram = new Ngram();
// Base class of NLP, used for tokenization.
var nlp_base = new NlpBase();

// The autocompletion algorithm.
var autocompletion = new Autocompletion(
    nlp_base,
    n_gram,
    n
);
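
As an illustration of what n = 2 means here, the following is a minimal sketch of n-gram generation over a list of tokens. The function name generateNgrams is hypothetical and is not part of devsample/ngram.js; the actual Ngram class may work differently.

```javascript
// Hypothetical helper for illustration only; not the API of devsample/ngram.js.
// Returns every run of n consecutive tokens from token_list.
function generateNgrams(token_list, n) {
    var result = [];
    for (var i = 0; i + n <= token_list.length; i++) {
        result.push(token_list.slice(i, i + n));
    }
    return result;
}

// Four tokens yield three bigrams.
var bigram_list = generateNgrams(["ho", "ge", "fu", "ga"], 2);
```

Each n-gram can then serve as a state, with the token that follows it as a candidate action.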

Set up the hyperparameters for Boltzmann-Distribution-Q-Learning.

// Time rate in the Boltzmann distribution.
var time_rate = 0.001;

// The Boltzmann distribution algorithm.
var strategy = new Boltzmann(
    autocompletion,
    {
        "time_rate": time_rate
    }
);
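
For reference, Boltzmann (softmax) action selection in general can be sketched as follows. This is not the code in jsqlearning/qlearning/boltzmann.js: the function names and the temperature parameter are assumptions for illustration, and how the library derives a temperature from time_rate is not shown here.

```javascript
// Hypothetical sketch of Boltzmann (softmax) selection; not the library's code.
// Converts Q-values into selection probabilities for a given temperature.
function boltzmannProbabilities(q_value_list, temperature) {
    // Subtract the maximum before exponentiating for numerical stability.
    var max_q = Math.max.apply(null, q_value_list);
    var weight_list = q_value_list.map(function (q) {
        return Math.exp((q - max_q) / temperature);
    });
    var total = weight_list.reduce(function (a, b) { return a + b; }, 0);
    return weight_list.map(function (w) { return w / total; });
}

// Picks an index given probabilities and a random value in [0, 1).
function sampleIndex(probability_list, random_value) {
    var cumulative = 0;
    for (var i = 0; i < probability_list.length; i++) {
        cumulative += probability_list[i];
        if (random_value < cumulative) {
            return i;
        }
    }
    return probability_list.length - 1;
}

var probability_list = boltzmannProbabilities([1.0, 2.0, 3.0], 1.0);
var chosen_index = sampleIndex(probability_list, Math.random());
```

A high temperature makes the probabilities nearly uniform (more exploration), while a low temperature concentrates them on the action with the highest Q-value.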

If you want to use Epsilon-Greedy-Q-Learning, set up the epsilon-greedy rate instead.

// The epsilon-greedy rate.
var epsilon_greedy_rate = 0.75;

// The epsilon-greedy algorithm.
var strategy = new Greedy(
    autocompletion,
    {
        "epsilon_greedy_rate": epsilon_greedy_rate
    }
);
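
The general idea of epsilon-greedy selection can be sketched as below. This is not the code in jsqlearning/qlearning/greedy.js. The sketch assumes that the rate is the probability of choosing the greedy (highest Q-value) action; whether the library interprets epsilon_greedy_rate this way should be checked against its source. The function name and the injectable random_fn parameter are hypothetical.

```javascript
// Hypothetical sketch of epsilon-greedy selection; not the library's code.
// Assumption: greedy_rate is the probability of exploiting the best action.
function selectEpsilonGreedy(q_value_list, greedy_rate, random_fn) {
    random_fn = random_fn || Math.random;
    if (random_fn() < greedy_rate) {
        // Exploit: return the index of the largest Q-value.
        var best_index = 0;
        for (var i = 1; i < q_value_list.length; i++) {
            if (q_value_list[i] > q_value_list[best_index]) {
                best_index = i;
            }
        }
        return best_index;
    }
    // Explore: return a uniformly random index.
    return Math.floor(random_fn() * q_value_list.length);
}

var selected_index = selectEpsilonGreedy([0.1, 0.9, 0.3], 0.75);
```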

Then set up the hyperparameters common to Q-Learning and initialize it.

// Alpha value (learning rate) in the Q-Learning algorithm.
var alpha_value = 0.5;
// Gamma value (discount rate) in the Q-Learning algorithm.
var gamma_value = 0.5;
// The number of learning iterations.
var limit = 10000;

// Base class of Q-Learning.
var q_learning = new QLearning(
    strategy,
    {
        "alpha_value": alpha_value,
        "gamma_value": gamma_value
    }
);
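
To show where alpha and gamma enter, here is the standard Q-Learning update rule written as a standalone function. It is a textbook sketch, not the code in jsqlearning/qlearning.js, and the function name updateQValue is hypothetical.

```javascript
// Textbook Q-Learning update; hypothetical helper, not the library's code.
// Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max Q(s', a') - Q(s, a))
function updateQValue(q_value, reward_value, max_next_q_value, alpha_value, gamma_value) {
    var td_target = reward_value + gamma_value * max_next_q_value;
    return q_value + alpha_value * (td_target - q_value);
}

// With alpha = 0.5 and gamma = 0.5: 0 + 0.5 * (1 + 0.5 * 2 - 0) = 1
var updated_q_value = updateQValue(0.0, 1.0, 2.0, 0.5, 0.5);
```

A larger alpha makes each update overwrite more of the old estimate, and a larger gamma gives more weight to future rewards.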

Set the initial training data.

// Training data.
var first_learned_data = "hogehogehogefugafuga";

// Pre-training before the user's first typing.
autocompletion.pre_training(
    q_learning,
    first_learned_data
);

Repeat the following learning step for each new input, either in a loop or through recursive calls.

// User's typing.
var input_document = "hogefuga";

// Extract state in input_document.
var state_key = autocompletion.lap_extract_ngram(
    q_learning,
    input_document
);

// Learning.
q_learning.learn(state_key, limit);

// Predict next token.
var next_action_list = q_learning.extract_possible_actions(
    state_key
);
var action_key = q_learning.select_action(
    state_key,
    next_action_list
);

// Compute reward value.
var reward_value = q_learning.observe_reward_value(
    state_key,
    action_key
);

// Extract the Q-Value for the state-action pair.
var q_value = q_learning.extract_q_dict(
    state_key,
    action_key
);

// Pre-training for the user's next typing.
autocompletion.pre_training(
    q_learning,
    input_document
);
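
The steps above can be wrapped in a function that is called once per user input. The sketch below uses only the method names that appear in this README; the wrapper name suggestNext is hypothetical, and the return types of the library methods are assumed rather than confirmed.

```javascript
// Hypothetical wrapper around the calls shown above; suggestNext is not
// part of the library. Call it once for each new user input.
function suggestNext(autocompletion, q_learning, input_document, limit) {
    // Extract the current state from what the user has typed.
    var state_key = autocompletion.lap_extract_ngram(q_learning, input_document);
    // Learn from this state.
    q_learning.learn(state_key, limit);
    // Choose the next token among the possible actions.
    var next_action_list = q_learning.extract_possible_actions(state_key);
    var action_key = q_learning.select_action(state_key, next_action_list);
    // Feed the input back as training data for the next call.
    autocompletion.pre_training(q_learning, input_document);
    return action_key;
}
```

In a page, this could be called from an input event handler, passing the current text as input_document.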

Related PoC

Version

  • 1.0.1

Author

  • chimera0(RUM)

Author URI

License

  • GNU General Public License v2.0