Utilities for working with discrete probability distributions and other tools useful for doing NLP work


GNLP

A few structures for doing NLP analysis / experiments.

Basics

  • counter.Counter

A map-like data structure for representing discrete probability distributions. It holds an underlying map of event -> probability, along with a default probability that applies to every event not in the map. It supports some element-wise mathematical operations with other counter.Counter objects.

// Create a counter with 0 probability for unknown events (and with ""
// corresponding to the unknown event)
balls := counter.New(0.0)
	
// Add some observations
balls.Incr("blue")
balls.Incr("blue")
balls.Incr("red")

// Normalize into a discrete distribution
balls.Normalize()

// blue => 0.666666
balls.Get("blue")

// purple => 0.0
balls.Get("purple")

preference := counter.New(0.0)
preference.Set("red", 2.0)
preference.Set("blue", 1.0)
preference.Normalize()

expectedWithPreference := counter.Multiply(balls, preference)
expectedWithPreference.Normalize()

// blue => 0.5
expectedWithPreference.Get("blue")
// red => 0.5
expectedWithPreference.Get("red")

// You can also use log probabilities
balls.LogNormalize()
preference.LogNormalize()

// And do in-place operations
balls.Add(preference)

// Log-normalize expects counters with positive counts, so
// exponentiate-then-normalize
balls.Exp()
balls.LogNormalize()

// blue => -1 (== log2(0.5))
balls.Get("blue")

  • frozencounter.Counter

Similar to counter.Counter, but with a fixed set of keys and no default value. Represented under the hood as an array of doubles, with the order fixed according to the set of keys. Supports element-wise math operations with other frozencounter.Counter objects that share the same set of keys. Some mathematical operations are accelerated by the BLAS library.

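// Convert the counters from the example above into frozen counters
fBalls := frozencounter.Freeze(balls)
fPrefs := frozencounter.Freeze(preference)

// Element-wise product of the two frozen counters
fExpectedWithPreference := frozencounter.Multiply(fBalls, fPrefs)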
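
The fixed-key layout described above can be sketched without the gnlp package. This is only an illustration under stated assumptions: `Frozen`, `sameKeys`, `approx`, and this `Freeze` (which takes a plain map rather than a counter.Counter) are hypothetical names, sorting the keys is just one way to fix an order, and the plain loop stands in for whatever BLAS routine the real library calls.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Frozen is a hypothetical stand-in for frozencounter.Counter: a fixed,
// ordered key set with values stored in a parallel slice.
type Frozen struct {
	keys   []string  // sorted, never modified after Freeze
	values []float64 // values[i] belongs to keys[i]
}

// Freeze copies a map into the fixed layout, ordering keys by sorting.
func Freeze(m map[string]float64) *Frozen {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	values := make([]float64, len(keys))
	for i, k := range keys {
		values[i] = m[k]
	}
	return &Frozen{keys: keys, values: values}
}

// Get looks up k by binary search; there is no default value, so the
// second result reports whether k is part of the key set.
func (f *Frozen) Get(k string) (float64, bool) {
	i := sort.SearchStrings(f.keys, k)
	if i < len(f.keys) && f.keys[i] == k {
		return f.values[i], true
	}
	return 0, false
}

// sameKeys reports whether a and b have identical key sets.
func sameKeys(a, b *Frozen) bool {
	if len(a.keys) != len(b.keys) {
		return false
	}
	for i := range a.keys {
		if a.keys[i] != b.keys[i] {
			return false
		}
	}
	return true
}

// Multiply returns the element-wise product, or an error when the two
// counters do not share the same key set.
func Multiply(a, b *Frozen) (*Frozen, error) {
	if !sameKeys(a, b) {
		return nil, fmt.Errorf("key sets differ: %v vs %v", a.keys, b.keys)
	}
	out := &Frozen{keys: a.keys, values: make([]float64, len(a.values))}
	for i := range a.values {
		out.values[i] = a.values[i] * b.values[i]
	}
	return out, nil
}

// approx reports whether a and b agree to within a small tolerance.
func approx(a, b float64) bool { return math.Abs(a-b) < 1e-9 }

func main() {
	fBalls := Freeze(map[string]float64{"blue": 2.0 / 3.0, "red": 1.0 / 3.0})
	fPrefs := Freeze(map[string]float64{"blue": 1.0 / 3.0, "red": 2.0 / 3.0})

	product, err := Multiply(fBalls, fPrefs)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	blue, _ := product.Get("blue")
	red, _ := product.Get("red")
	fmt.Printf("blue=%.4f red=%.4f\n", blue, red)
}
```

Because both counters store their values in the same key order, the element-wise product reduces to a single pass over two equal-length slices, which is the kind of operation a vectorized library can speed up.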