
A concurrent implementation of the candidate elimination algorithm.


dob9601/ccelm


ccelm (Concurrent Candidate ELiMination)

An implementation of the candidate elimination algorithm that can run across multiple cores. Written in Rust for speed and fearless concurrency.

What is Candidate Elimination?

Candidate Elimination is a type of version-space learning, an older approach to machine learning introduced in 1977 by Tom Mitchell. It finds the most specific and the most general hypotheses that are consistent with all of the training examples the algorithm has been shown, where each hypothesis is a logical sentence.
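To make the idea of a specific and a general boundary concrete, here is a minimal single-threaded sketch in Rust. It is not taken from ccelm's source: the names `Constraint`, `Hypothesis`, `candidate_elimination` and `enjoy_sport_examples` are hypothetical, hypotheses are assumed to be simple conjunctions of attribute constraints, and a positive example is assumed to arrive before any negative one. The training data is the well-known EnjoySport example from Mitchell's textbook, used purely for illustration.

```rust
/// One attribute slot of a conjunctive hypothesis (illustrative type, not ccelm's).
#[derive(Clone, Debug, PartialEq)]
enum Constraint {
    Empty,               // accepts no value
    Value(&'static str), // accepts exactly this value
    Any,                 // accepts every value
}

type Hypothesis = Vec<Constraint>;

/// True if `h` classifies the instance `x` as positive.
fn covers(h: &Hypothesis, x: &[&'static str]) -> bool {
    h.iter().zip(x).all(|(c, v)| match c {
        Constraint::Any => true,
        Constraint::Value(s) => s == v,
        Constraint::Empty => false,
    })
}

/// True if every slot of `a` is at least as permissive as the matching slot of `b`.
fn at_least_as_general(a: &Hypothesis, b: &Hypothesis) -> bool {
    a.iter()
        .zip(b)
        .all(|(ca, cb)| *ca == Constraint::Any || *cb == Constraint::Empty || ca == cb)
}

/// Minimally generalise the specific boundary `s` so that it covers `x`.
fn generalize(s: &Hypothesis, x: &[&'static str]) -> Hypothesis {
    s.iter()
        .zip(x)
        .map(|(c, v)| match c {
            Constraint::Empty => Constraint::Value(*v),
            Constraint::Value(old) if old == v => Constraint::Value(*v),
            _ => Constraint::Any,
        })
        .collect()
}

/// Returns (specific boundary, general boundary) after seeing all examples.
/// Assumption: a positive example is seen before any negative one, so the
/// specific boundary can guide how the general boundary is specialised.
fn candidate_elimination(
    examples: &[(Vec<&'static str>, bool)],
) -> (Hypothesis, Vec<Hypothesis>) {
    let n = examples.first().map_or(0, |e| e.0.len());
    let mut s = vec![Constraint::Empty; n];
    let mut g = vec![vec![Constraint::Any; n]];
    for (x, positive) in examples {
        if *positive {
            // Drop general hypotheses that reject the positive example.
            g.retain(|h| covers(h, x));
            s = generalize(&s, x);
        } else {
            let mut next: Vec<Hypothesis> = Vec::new();
            for h in &g {
                if !covers(h, x) {
                    next.push(h.clone());
                    continue;
                }
                // Specialise one `Any` slot at a time using a value from `s`
                // that differs from the negative example.
                for i in 0..n {
                    if let (Constraint::Any, Constraint::Value(v)) = (&h[i], &s[i]) {
                        if *v != x[i] {
                            let mut spec = h.clone();
                            spec[i] = Constraint::Value(*v);
                            if !next.contains(&spec) {
                                next.push(spec);
                            }
                        }
                    }
                }
            }
            // Keep only the maximally general hypotheses.
            g = next
                .iter()
                .filter(|h| !next.iter().any(|o| o != *h && at_least_as_general(o, *h)))
                .cloned()
                .collect();
        }
    }
    (s, g)
}

/// Illustrative training data: Mitchell's EnjoySport example.
fn enjoy_sport_examples() -> Vec<(Vec<&'static str>, bool)> {
    vec![
        (vec!["Sunny", "Warm", "Normal", "Strong", "Warm", "Same"], true),
        (vec!["Sunny", "Warm", "High", "Strong", "Warm", "Same"], true),
        (vec!["Rainy", "Cold", "High", "Strong", "Warm", "Change"], false),
        (vec!["Sunny", "Warm", "High", "Strong", "Cool", "Change"], true),
    ]
}
```

Calling `candidate_elimination(&enjoy_sport_examples())` leaves the specific boundary at (Sunny, Warm, ?, Strong, ?, ?) and the general boundary at (Sunny, ?, ?, ?, ?, ?) and (?, Warm, ?, ?, ?, ?). The sketch does not detect inconsistent data, such as a negative example that the specific boundary already covers.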

The algorithm is rarely seen nowadays, primarily because of its lack of noise resistance. Just one incorrectly labeled training example will cause the algorithm to converge incorrectly, a problem that more modern machine learning methods, such as neural networks, are considerably more robust to.

Why Use Candidate Elimination Over Other Methods (Such as Neural Networks)?

The main advantage candidate elimination has over more modern approaches is that its output is easily interpretable. It's nearly impossible to figure out what concepts a neural network (or another black-box approach) has learned by looking at just its weights. By contrast, looking at the hypotheses in the specific and general boundaries produced by the algorithm, it is quite easy to see what constraints the target concept has.

[Figure: Version space]

Installing the Tool

The tool can be installed using cargo, the package manager installed as part of the Rust toolchain.

Since the tool isn't currently published to crates.io, the easiest way to get it running is to clone the repository, cd into it, and run:

cargo install --path .

Assuming cargo has been configured correctly, this will add the ccelm command to your PATH. Usage instructions are available through ccelm --help. Examples of configured datasets can be found in data/.

Pre-built binaries are not currently available either.

Included Data

This repository includes data adapted from the paper At the Boundaries of Syntactic Prehistory. The original data can be found on GitHub.