NU Infolab News Context Project

context is a suite of Python-based tools for managing contextual knowledge related to web content. This includes resources for article text and metadata extraction from web pages, keyword and named entity extraction, and more.

The primary entry point is a Flask-based web application that serves both HTML and JSON payloads. This application is located in the web directory.

context itself, under the context directory, may also be used directly as a Python library.

About

A number of projects at NU InfoLab involve experiments in the space of evaluating contextual information related to web content in order to enhance user experience. This toolkit brings a number of those explorations into a single project space where they can be further explored and expanded.

Requirements

In order to install lxml, you will need the development packages for libxml2 and libxslt:

sudo apt-get install libxml2-dev libxslt-dev

In order to use the categorizer, you will need liblinear.

Ubuntu/Debian: sudo apt-get install liblinear1

Mac OS: you should be able to use the included liblinear.so.1
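
To check whether a system-wide liblinear is visible before trying the categorizer, something like the following may help. This is a minimal sketch using only the standard library; `locate_liblinear` is a hypothetical helper, not part of context, and the project may load its bundled liblinear.so.1 by its own path, so a result of None does not necessarily mean the categorizer will fail.

```python
from ctypes.util import find_library


def locate_liblinear():
    """Return the name or path of a system liblinear, or None if not found.

    find_library("linear") asks the platform's usual lookup mechanism
    for a shared library named "liblinear".
    """
    return find_library("linear")


if __name__ == "__main__":
    found = locate_liblinear()
    if found is None:
        print("No system liblinear found; see the install notes above.")
    else:
        print("Found liblinear:", found)
```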

NLTK Resource requirements

The following resources should be installed with the NLTK downloader:

wordnet
words
maxent_treebank_pos_tagger
punkt
maxent_ne_chunker
stopwords

To use the downloader:

>>> import nltk
>>> nltk.download()
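
For a non-interactive setup (for example, on a server without a display), the resources listed above can also be fetched by name. The following is a minimal sketch, not part of the project: `NLTK_RESOURCES` and `download_resources` are hypothetical names, and it assumes the identifiers listed above are still available in the NLTK data index for your NLTK version.

```python
# Resource identifiers copied from the list above.
NLTK_RESOURCES = [
    "wordnet",
    "words",
    "maxent_treebank_pos_tagger",
    "punkt",
    "maxent_ne_chunker",
    "stopwords",
]


def download_resources(resources=NLTK_RESOURCES, downloader=None):
    """Download each resource by name and report success per resource.

    `downloader` defaults to nltk.download, which returns a true value
    on success. It can be swapped out, e.g. for a dry run.
    """
    if downloader is None:
        import nltk  # imported lazily so a dry run does not need NLTK
        downloader = nltk.download
    return {name: bool(downloader(name)) for name in resources}


if __name__ == "__main__":
    for name, ok in download_resources().items():
        print(name, "ok" if ok else "FAILED")
```

Any resource reported as FAILED can be retried through the interactive downloader shown above.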
