Topic modelling tool. Target userbase: The whole world
Seek relies on a large number of libraries and external tools, so we provide a one-time installer for Linux and OS X.
- Run `./install.sh` and everything should be set up.
- Afterwards, run `source activate.sh` to export all the required paths and create all the required NLTK and Stanford data.
- Once you have downloaded all of the required data, you can comment out any installation or download commands in `activate.sh`, keeping only the exported paths.
- Navigate to `$NLTK_DATA/nltk_trainer` and run the following command: `python train_chunker.py treebank_chunk --classifier=NaiveBayes`
- Copy `seek5.ser.gz` to `$STANFORD_MODELS`.
- Almost there! Navigate to `core` and run `python init.py`. That will export all the necessary objects for your application.
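
If a later step fails, it can help to confirm that the setup steps above actually took effect. The following is a minimal sketch, not part of Seek itself: `check_setup` is a hypothetical helper, and it assumes that `NLTK_DATA` and `STANFORD_MODELS` are among the paths exported by `activate.sh` and that both point to directories.

```python
import os
from pathlib import Path

# Environment variables referenced in the setup steps (assumed to be
# exported by activate.sh).
REQUIRED_VARS = ("NLTK_DATA", "STANFORD_MODELS")
# Model file that the setup steps say to copy into $STANFORD_MODELS.
MODEL_FILE = "seek5.ser.gz"


def check_setup(env=None):
    """Return a list of problems found; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set (did you run 'source activate.sh'?)")
        elif not Path(value).is_dir():
            problems.append(f"{name} points to a missing directory: {value}")
    models = env.get("STANFORD_MODELS")
    if models and Path(models).is_dir() and not (Path(models) / MODEL_FILE).is_file():
        problems.append(f"{MODEL_FILE} not found in {models}")
    return problems


if __name__ == "__main__":
    issues = check_setup()
    for issue in issues:
        print("PROBLEM:", issue)
    if not issues:
        print("Setup looks complete.")
```

Run it in the same shell session in which you sourced `activate.sh`, since the exported paths only exist there.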
Capabilities:
- Named Entity Recognition
- Topic Modelling
- Summarization
- Relationship extraction
The executor is the main program. However, not all of the capabilities have been implemented in it yet. To work around that, run
python executor.py/linguist.py [flag] [path] {Optional arguments}
All of the capabilities will later be added to the executor for simplicity.
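
If you want to call these scripts from your own Python code, the invocation above can be wrapped roughly as follows. This is only a sketch: it reads `executor.py/linguist.py` as "either one of the two scripts" (an assumption), `build_command` and `run` are hypothetical helper names, and the flag is passed through untouched because the available flags are not listed here.

```python
import subprocess
import sys

# The two entry points named in the invocation above.
SCRIPTS = ("executor.py", "linguist.py")


def build_command(script, flag, path, *optional):
    """Build the argv list for 'python <script> [flag] [path] {Optional arguments}'."""
    if script not in SCRIPTS:
        raise ValueError(f"expected one of {SCRIPTS}, got {script!r}")
    # sys.executable keeps the call inside the currently active Python environment.
    return [sys.executable, script, flag, path, *optional]


def run(script, flag, path, *optional):
    """Run the chosen script from the current directory and return its exit code."""
    return subprocess.run(build_command(script, flag, path, *optional)).returncode
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the document path contains spaces.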
The Web App provides many of the same capabilities via natural language interpretation.
- PDF to plain text
- Handwritten stuff to plain text
- Text from images
- Web scraping - Wikipedia (big no no), arXiv (sensitive bastards), Imperial's Spiral repository (say what?)
- Parsing plain text to LDA rules
- Gatherer - script that runs the integration algorithms and parses the plain text
- Analyser - takes plain text and connects to database
- Assistant - Interface for document/query input, RL (terminal only)
- LDA
- Topic linking to database
- Weights and links for semantic analysis, knowledge graph
- Grammar skills for consistent discourse
- Author and agent analysis
- Associating images with context/concepts
- Visual interface
- Speech recognition
- Feed
- Model for agent/person
- Moral model