MathWebSearch Implementation
tkw1536 Update README
The PPA is no longer maintained
Latest commit 115be0d Jan 24, 2019

MathWebSearch

About

The MathWebSearch system (MWS) is a content-based search engine for mathematical formulae. It indexes MathML formulae, using a technique derived from automated theorem proving: Substitution Tree Indexing. The software is licensed under the GNU General Public License version 3.

Demos

Demos can be found at http://search.mathweb.org. In particular, the arXiv demo is at http://arxivsearch.mathweb.org

Content

  • analytics/ user-defined analytics source files
  • config/ configuration files
  • data/ data used to run an MWS demo
  • doc/ documentation for MWS users
  • scripts/ utility scripts
  • src/ source code
  • test/ test source code and data
  • third_party/ third-party source code
  • CMakeLists.txt CMake build script
  • LICENSE copy of the license under which this software is distributed
  • Makefile build makefile
  • README.md documentation overview about the project
  • TODO project TODOs which have not materialized into tickets

Compiling and Testing

Compilation of the source tree is automated via CMake. You can build the sources using the following command:

make

Binaries are built in the bin/ directory, and documentation is built in bin/docs. To select or de-select which components to compile, use:

make config

To run the tests, use:

make test

Finally, install the binaries using:

make install

Dependencies

To build this software, one needs:

  • g++ (with pthread) (>= 4.4)
  • cmake (>= 2.6)
  • make
  • pkg-config

The core MathWebSearch executables require:

  • libmicrohttpd (>= 0.4)
  • libxml2
  • libleveldb
  • libsnappy
  • libjson-c
  • libjson0-dev

The crawler executables require:

  • libhtmlcxx-dev
  • libicu-dev
  • libcurl4-gnutls-dev

The documentation target requires:

  • doxygen

The test target requires:

  • netcat
  • curl

The config step requires:

  • cmake-curses-gui
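
Before running make, it can help to confirm that the build tools listed at the start of this section are on the PATH. The following is a minimal sketch, not part of the MWS build system: the function name check_tools is hypothetical, and the script only checks that each tool is present, not that it meets the minimum version.

```shell
#!/bin/sh
# Report which of the given tools are missing from PATH.
# Only presence is checked, not the minimum versions listed above.
check_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Print the missing tools (if any) without the leading space.
    echo "${missing# }"
}

missing_tools=$(check_tools g++ cmake make pkg-config)
if [ -n "$missing_tools" ]; then
    echo "Missing build tools: $missing_tools"
else
    echo "All build tools found"
fi
```

The library dependencies are not covered by this check; pkg-config or the distribution's package manager is better suited for those.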

Makefile targets

  • all builds all the binaries of the project and tests
  • clean cleans the build
  • config brings up the cmake CLI configuration tool
  • doc generates the documentation of the project
  • test runs project tests
  • help displays the complete list of targets
  • install installs mwsd, docs2harvest, mws-config on your system

Usage

To use the crawler, start it with the website to crawl, the number of sites to crawl and, optionally, the starting point of the crawl and the links to be skipped when crawling.

Another way to generate harvests is via docs2harvest. This takes XHTML documents as arguments and crawls them, creating harvest files. To crawl a repository of XHTML documents, use:

find . -name '*.xhtml' | xargs -n 10 bin/docs2harvest -o /path/to/harvests
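
If file names in the repository may contain spaces, the find/xargs pipeline above can be made more robust with null-delimited output. The sketch below assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do). The helper name for_each_xhtml_batch is hypothetical, and echo stands in for the real command so that the example is a dry run.

```shell
#!/bin/sh
# Run a command over every .xhtml file under a directory, 10 files per call.
# -print0 / -0 keep file names containing spaces or newlines intact.
for_each_xhtml_batch() {
    dir=$1
    shift
    find "$dir" -name '*.xhtml' -print0 | xargs -0 -n 10 "$@"
}

# Dry run: "echo" prints each batch instead of invoking bin/docs2harvest.
for_each_xhtml_batch . echo bin/docs2harvest -o /path/to/harvests
```

Dropping the echo from the last line runs docs2harvest on each batch of up to 10 files.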

The executable mwsd starts the main MWS server. It takes as arguments a harvest include path, which is used to load document data, and a port on which the data is served via HTTP. It accepts HTTP POST requests with MWS Queries and returns MWS Answer Sets.

bin/mwsd -I <harvest include dir> -p <port number>
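
Once mwsd is running, a query can be sent with any HTTP client. The following sketch uses curl and rests on several assumptions that are not stated in this README: the server is reachable on localhost, it accepts the POST at the path /, and query.xml is a hypothetical file that already contains a valid MWS Query. The helper post_query and the DRY_RUN variable are illustrative names only.

```shell
#!/bin/sh
# POST an MWS Query file to a running mwsd instance.
# Assumptions: host "localhost" and path "/" are guesses, not documented here.
# With DRY_RUN set, the curl command is printed instead of executed.
post_query() {
    query_file=$1
    port=$2
    ${DRY_RUN:+echo} curl --silent --data-binary "@$query_file" "http://localhost:$port/"
}

# Print the command that would be run for a server started with -p 9090.
DRY_RUN=1 post_query query.xml 9090
```

Calling post_query without DRY_RUN sends the request and prints the response body to standard output.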

For additional options, see:

bin/mwsd --help
bin/docs2harvest --help
bin/mws-config help

To set up or remove mwsd as a global SysV service, use (as root):

mws-config create -p 9090 -i data/zbl zbldemo
mws-config enable zbldemo

This will deploy MathWebSearch to serve the ZBL demo harvests on port 9090. To monitor, start or stop the service, use:

service mwsd_zbldemo [start|stop|status|...]

Output is logged to /var/log/mwsd_zbldemo.log. To serve different harvest paths, create your own configurations and deploy the service.
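
When several configurations are deployed, the service name and log file for each can be derived from the configuration name. The sketch below assumes that every configuration follows the mwsd_<name> pattern shown for zbldemo; this is a generalization from that single example, and the helper names are hypothetical.

```shell
#!/bin/sh
# Derive the service name and log path for an mws-config configuration,
# assuming the mwsd_<name> pattern seen for the zbldemo example.
mwsd_service_name() { echo "mwsd_$1"; }
mwsd_log_path() { echo "/var/log/$(mwsd_service_name "$1").log"; }

name=zbldemo
echo "service: $(mwsd_service_name "$name")"
echo "log:     $(mwsd_log_path "$name")"

# Typical follow-up commands, run as root on the host running the service:
#   service "$(mwsd_service_name "$name")" status
#   tail -n 50 "$(mwsd_log_path "$name")"
```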

Usage with Docker

This repository contains a Dockerfile for using the MWS Daemon. It can be found as the mathwebsearch/mathwebsearch automated build on Docker Hub and used as follows:

docker run -v /path/to/harvests:/data/ -p 8080:8080 mathwebsearch/mathwebsearch

The image is configured to serve harvests from a /data/ volume on port 8080.

Copyright

The software in this project (binaries and sources) is released "as is", under the GNU General Public License version 3. A copy of this license can be found in the root of this project, under the file name LICENSE.

Credits

Most of the code in the core repository was developed by Corneliu-Claudiu Prodescu, under the supervision of Prof. Michael Kohlhase. For a complete list of developers, visit https://github.com/KWARC/mws/graphs/contributors

Contact

The easiest way to contact the developers is using the MathWebSearch mailing list.
