A bookish botscript eager to provide her considerate opinion

bluestocking

An information extraction toolkit.

To discuss the project with us, join our mailing list: http://groups.google.com/forum/?fromgroups#!forum/bluestocking-dev

This project depends on NLTK. You will need to install it before running these scripts.

To run tests:

python tests.py

To run the factchecker demo, try this:

python factchecker.py "The sky is not blue."

or this:

python factchecker.py "People never eat fish. Goldfish are unpopular."

This tests a document against the Simple English Wikipedia articles for each word in the string passed as the argument.

(Warning: documents with long sentences take longer to query.)

Scripts included:

parse.py

Defines a Document class for wrapping raw text and a Parser class for extracting Relations from a Document.

Documents have a method to turn them into Doxaments (see below).
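
As a rough illustration of the idea only, here is a minimal sketch of wrapping text and extracting relations from it. It is not the code in parse.py: the `sentences` and `parse` methods, the `Relation` fields, the stopword and negation lists, and the notion of polarity are all assumptions made for this example, and plain regular expressions stand in for NLTK so the snippet runs on its own.

```python
import re
from dataclasses import dataclass
from itertools import combinations

# Assumed word lists, chosen only for this illustration.
NEGATIONS = {"not", "never", "no"}
STOPWORDS = {"the", "a", "an", "is", "are"}


@dataclass(frozen=True)
class Relation:
    """Hypothetical relation: an unordered word pair plus a polarity flag."""
    words: frozenset
    positive: bool


class Document:
    """Hypothetical minimal wrapper around raw text."""

    def __init__(self, text):
        self.text = text

    def sentences(self):
        # Naive split on sentence-ending punctuation.
        return [s.strip() for s in re.split(r"[.!?]+", self.text) if s.strip()]


class Parser:
    """Hypothetical parser: pairs up content words that share a sentence."""

    def parse(self, document):
        relations = set()
        for sentence in document.sentences():
            tokens = re.findall(r"[a-z]+", sentence.lower())
            # A sentence containing a negation word yields negative relations.
            positive = not any(t in NEGATIONS for t in tokens)
            content = sorted({t for t in tokens if t not in NEGATIONS | STOPWORDS})
            for a, b in combinations(content, 2):
                relations.add(Relation(frozenset((a, b)), positive))
        return relations


relations = Parser().parse(Document("The sky is not blue."))
```

In this sketch the sentence above yields a single negative relation between "sky" and "blue". The real Parser may extract something quite different.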

doxament.py

Defines a Doxament class. A Doxament contains many Relations. A Doxament may be queried for consistency with another Doxament. They may also be merged to form a more complete knowledge base.

Relations encapsulate a semantically significant lexical co-occurrence.
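
To make the querying and merging described above concrete, here is a small sketch under stated assumptions. It does not reproduce doxament.py: the `merge`, `contradictions`, and `consistency` methods, the polarity flag on `Relation`, and the scoring formula are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical relation: an unordered word pair plus a polarity flag."""
    words: frozenset
    positive: bool


class Doxament:
    """Hypothetical knowledge base holding a set of Relations."""

    def __init__(self, relations=()):
        self.relations = set(relations)

    def merge(self, other):
        # Merging is modelled as a plain set union of both relation sets.
        return Doxament(self.relations | other.relations)

    def contradictions(self, other):
        # Relations in `other` whose word pair appears here with opposite polarity.
        mine = {(r.words, r.positive) for r in self.relations}
        return {r for r in other.relations if (r.words, not r.positive) in mine}

    def consistency(self, other):
        # Assumed score: the fraction of `other` that does not contradict this Doxament.
        if not other.relations:
            return 1.0
        return 1 - len(self.contradictions(other)) / len(other.relations)


knowledge = Doxament([Relation(frozenset({"sky", "blue"}), True)])
claim = Doxament([
    Relation(frozenset({"sky", "blue"}), False),
    Relation(frozenset({"fish", "food"}), True),
])
score = knowledge.consistency(claim)
merged = knowledge.merge(claim)
```

Here one of the two claimed relations contradicts the knowledge base, so `score` comes out at 0.5, and `merged` holds all three relations. A union keeps contradictory relations side by side; a real knowledge base might resolve them instead.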

other

wikipedia.py and wiki2plain.py are taken from http://stackoverflow.com/questions/4460921/extract-the-first-paragraph-from-a-wikipedia-article-python
