
sbenthall/bluestocking


bluestocking

An information extraction toolkit.

To discuss the project with us, join our mailing list: http://groups.google.com/forum/?fromgroups#!forum/bluestocking-dev

This project depends on NLTK. You will need to install it before running these scripts.
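A quick way to confirm that NLTK is importable before running the scripts is sketched below. It assumes Python 3; the helper name `has_module` is illustrative and is not part of this project.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("nltk"):
        print("NLTK is available.")
    else:
        # pip is one common way to install NLTK; other installers also work.
        print("NLTK is missing; install it first, e.g. with: pip install nltk")
```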

To run tests:

python tests.py

To run the factchecker demo, try this:

python factchecker.py "The sky is not blue."

or this:

python factchecker.py "People never eat fish. Goldfish are unpopular."

This tests a document against the Simple English Wikipedia articles for each word in the string passed as the argument.

(Warning: documents with long sentences take longer to query)
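To illustrate what "each word in the string" could mean in practice, here is a minimal sketch that collects the distinct words of an input string, which would then serve as article titles to look up. It is an assumption about the general idea only: `lookup_terms` is a hypothetical helper, and factchecker.py may tokenize differently (for example with NLTK).

```python
import re


def lookup_terms(text):
    """Return the distinct lowercase words in `text`, in first-seen order."""
    seen = []
    for word in re.findall(r"[A-Za-z]+", text.lower()):
        if word not in seen:
            seen.append(word)
    return seen


if __name__ == "__main__":
    # Each printed word stands for one article that would be fetched.
    for term in lookup_terms("People never eat fish. Goldfish are unpopular."):
        print(term)
```

Under this sketch, more distinct words mean more articles to fetch, so longer inputs cost more lookups.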

Scripts included:

parse.py

Defines a Document class for wrapping raw text and a Parser class for extracting Relations from a Document.

Documents have a method to turn them into Doxaments (see below).

doxament.py

Defines a Doxament class. A Doxament contains many Relations. A Doxament may be queried for consistency with another Doxament. They may also be merged to form a more complete knowledge base.

Relations encapsulate a semantically significant lexical co-occurrence.
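The ideas of merging and consistency checking can be pictured with a toy model. The sketch below is not the code in doxament.py: `ToyRelation`, `ToyDoxament`, the `positive` flag, and all method names are hypothetical, and it assumes that a contradiction means the same word pair appearing with opposite polarity.

```python
from collections import namedtuple

# Hypothetical relation: two co-occurring words plus a polarity flag
# (False when the source sentence negates the pairing).
ToyRelation = namedtuple("ToyRelation", ["first", "second", "positive"])


class ToyDoxament:
    """A bag of relations that can be merged and checked for consistency."""

    def __init__(self, relations=()):
        self.relations = set(relations)

    def merge(self, other):
        """Return a new ToyDoxament holding the relations of both."""
        return ToyDoxament(self.relations | other.relations)

    def contradictions(self, other):
        """Return relations in `other` whose negation appears in self."""
        return {
            r for r in other.relations
            if r._replace(positive=not r.positive) in self.relations
        }

    def is_consistent_with(self, other):
        """True when no relation in `other` contradicts one in self."""
        return not self.contradictions(other)


if __name__ == "__main__":
    knowledge = ToyDoxament([ToyRelation("sky", "blue", True)])
    claim = ToyDoxament([ToyRelation("sky", "blue", False)])
    print(knowledge.is_consistent_with(claim))
```

In this toy model, the claim "The sky is not blue." contradicts a knowledge base that holds the positive sky/blue relation, so the check reports an inconsistency.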

other

wikipedia.py and wiki2plain.py come from http://stackoverflow.com/questions/4460921/extract-the-first-paragraph-from-a-wikipedia-article-python

About

A bookish botscript eager to provide her considerate opinion
