Browser add-on and web server to support collection and analysis of web browsing data.

This project is deprecated; check out the latest work at https://github.com/Sotera/DatawakeDepot

DataWake

The DataWake project consists of server and database technologies that aggregate user browsing data, collected through a browser plug-in, using domain-specific searches. The captured (extracted) data is organized into browse paths and elements of interest. This information, in the form of Trails, can be shared and expanded among teams. Elements of interest, whether extracted automatically or manually by the user, are given weighted values. Extracted elements that are not of interest, or that might be confused with one that is (e.g. an organization with a similar name that has no meaningful association with the one being researched), can be manually removed from the extracted data list.
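
To make the vocabulary above concrete, here is a minimal Python sketch of how a trail, its browse path, and its weighted elements of interest might relate. It is an illustration only, not DataWake's actual schema or API: the class names `Trail` and `ExtractedElement`, their fields, and the methods are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class ExtractedElement:
    """One element of interest found on a visited page (hypothetical shape)."""
    kind: str              # e.g. "organization" or "email"
    value: str
    weight: float = 1.0    # weighted value assigned to the element
    manual: bool = False   # True if the user extracted it by hand


@dataclass
class Trail:
    """A named browse path plus the elements extracted along it (hypothetical shape)."""
    name: str
    urls: List[str] = field(default_factory=list)
    elements: List[ExtractedElement] = field(default_factory=list)

    def visit(self, url: str, extracted: Iterable[ExtractedElement] = ()) -> None:
        """Record a page on the browse path together with its extracted elements."""
        self.urls.append(url)
        self.elements.extend(extracted)

    def remove_element(self, kind: str, value: str) -> int:
        """Drop elements the user marked as not of interest; return how many were removed."""
        before = len(self.elements)
        self.elements = [
            e for e in self.elements if not (e.kind == kind and e.value == value)
        ]
        return before - len(self.elements)

    def ranked(self) -> List[ExtractedElement]:
        """Return elements ordered from highest to lowest weight."""
        return sorted(self.elements, key=lambda e: e.weight, reverse=True)
```

In this sketch, a similarly named but unrelated organization would be dropped with something like `trail.remove_element("organization", "Acme Bakery")`, leaving the rest of the trail intact for sharing.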

Additionally, the application can be configured to export all page contents and extracted information to RESTful services, Elasticsearch, or Kafka.

Companion projects

Necessary for building

Other projects

  • DataWake Prefetch: Streaming search with scraping and entity extraction of all results.
  • Firmament: Provides a simplified configuration of interconnected Docker containers.

More information, including build instructions, can be found on our GitHub Page.

DataWake is part of the DARPA Memex Open Catalog

For more information, please email memex@soteradefense.com.
