neocon: a New (neo) Concordance Framework

neocon is an in-memory concordance framework, using Akka, Scala, and Play.

Overview

At startup, an IndexActor is created. It parses and tokenizes every XML file found under the directory named by the neocon.basedir key in conf/application.conf, and holds the result in an in-memory HashMap that maps each word to a sorted TreeSet of index entries. Each client keeps a persistent WebSocket open to the server, which allocates a SocketActor for the lifetime of the session.
The socket actor accepts meta commands: :dir to list the loaded documents, and :freq for a word frequency table. Any other query is interpreted as an index lookup, and the results are streamed out to the client in bulk.
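
The index layout and command handling described above can be sketched roughly as follows. This is a minimal illustration rather than neocon's actual code: the names Concordance, Hit, add, and handle are hypothetical, the tokenizer (lowercase, split on anything that is not a Unicode letter or digit) is an assumption, and XML parsing, the Akka actors, and the WebSocket layer are left out so that the example depends only on the Scala standard library.

```scala
import scala.collection.immutable.TreeSet
import scala.collection.mutable

// Hypothetical index entry: which document, and which token position in it.
final case class Hit(doc: String, position: Int)

object Hit {
  // Sort entries by document name, then by position, so each TreeSet stays ordered.
  implicit val ordering: Ordering[Hit] =
    Ordering.by((h: Hit) => (h.doc, h.position))
}

// Hypothetical stand-in for the state an index actor might hold.
final class Concordance {
  // word -> sorted set of entries, as in the HashMap/TreeSet layout described above
  private val index = mutable.HashMap.empty[String, TreeSet[Hit]]
  private var docs = TreeSet.empty[String]

  // Assumed tokenizer: lowercase, split on runs of non-letter, non-digit characters.
  private def tokenize(text: String): Array[String] =
    text.toLowerCase.split("[^\\p{L}\\p{N}]+").filter(_.nonEmpty)

  // Index one document; `text` is assumed to be already extracted from its XML.
  def add(doc: String, text: String): Unit = {
    docs += doc
    tokenize(text).zipWithIndex.foreach { case (word, pos) =>
      val hits = index.getOrElse(word, TreeSet.empty[Hit])
      index.update(word, hits + Hit(doc, pos))
    }
  }

  // Dispatch one client query: meta commands first, otherwise an index lookup.
  def handle(query: String): List[String] = query.trim match {
    case ":dir" =>
      docs.toList
    case ":freq" =>
      index.toList
        .map { case (word, hits) => (word, hits.size) }
        .sortBy { case (word, count) => (-count, word) }
        .map { case (word, count) => s"$word\t$count" }
    case word =>
      index.getOrElse(word.toLowerCase, TreeSet.empty[Hit])
        .toList
        .map(h => s"${h.doc}:${h.position}")
  }
}

object ConcordanceDemo {
  def main(args: Array[String]): Unit = {
    val c = new Concordance
    c.add("meno.xml", "Virtue is knowledge; virtue can be taught.")
    c.add("phaedrus.xml", "The soul seeks virtue.")
    c.handle("virtue").foreach(println)
  }
}
```

Because each word's entries live in a TreeSet, a lookup returns hits already in document and position order, so they could be sent to the client without a further sort.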

Usage

>sbt run

Then browse to localhost:9000.

TODO

1. Docker
2. evaluate phrase queries
3. parse metadata from XML
4. streaming XPath processor
5. indexed metadata queries
6. tree-walk metadata queries
7. Akka Streams for big output
8. infinite scrolling client for #7
9. Redis backend

About the name

I learned to use concordances when I was reading Plato; specifically, I was using PhiloLogic, Perseus, and the Thesaurus Linguae Graecae while reading Meno, Phaedrus, and the Republic. I took at least one course where I read Plato each term for the first two years of college, and by far the best teacher I had was the late Herman Sinaiko. Sinaiko was a student of Leo Strauss, who taught a great number of conservative and not-so-conservative students, including, notoriously, "architect of the Iraq War" Paul Wolfowitz. Because of this, many of my colleagues teased me, relentlessly, about being a closet conservative, despite my claims of Deleuzianism.

That's basically the joke. Concordances are undoubtedly a "conservative" tool for textual analysis in the current mode of "big data"-inflected digital humanities scholarship, and this tool is particularly conservative architecturally: although there was a smattering of in-memory concordances in the 1970s and 1980s, the trend has been toward on-disk indexes for quite a while. Now, however, the pendulum is swinging back toward high-throughput, in-memory systems, and this project is an attempt to sketch how those techniques can be applied to old-fashioned concordance systems.
