SCala OrderBook REconstructor (SCOBRE)
This software reconstructs the state of the limit order-book from low-level tick-data provided by the London Stock Exchange (LSE). The tick-data can be hosted in either MySQL or Apache HBase, and tools are provided for loading the data into either of these back-ends from the compressed raw files provided by the LSE. Once the data has been loaded, events corresponding to a particular asset and date-range can be replayed through an order-book simulator to reconstruct the state of the book. Variables such as the mid-price can then be recorded as a time-series in CSV format. Alternatively, the simulator can be run directly from a Python client using an Apache Thrift API.
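To make the idea of replaying events concrete, here is a toy illustration in Python. It is not SCOBRE's simulator: the `ToyBook` class, the `replay` function and the `(action, order_id, side, price)` event tuples are all hypothetical, and the real LSE tick-data has its own schema and richer semantics (trades, modifications, auctions). The sketch only adds and deletes limit orders, then reports the mid-price as the average of the best bid and the best ask.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical event layout: (action, order_id, side, price)
Event = Tuple[str, str, str, float]


class ToyBook:
    """Minimal limit order-book that tracks resting orders by id."""

    def __init__(self) -> None:
        # order id -> (side, price); side is "B" (buy) or "S" (sell)
        self.orders: Dict[str, Tuple[str, float]] = {}

    def apply(self, event: Event) -> None:
        """Apply a single add or delete event to the book."""
        action, order_id, side, price = event
        if action == "add":
            self.orders[order_id] = (side, price)
        elif action == "delete":
            self.orders.pop(order_id, None)

    def best_bid(self) -> Optional[float]:
        """Highest resting buy price, or None if there are no bids."""
        bids = [p for s, p in self.orders.values() if s == "B"]
        return max(bids) if bids else None

    def best_ask(self) -> Optional[float]:
        """Lowest resting sell price, or None if there are no asks."""
        asks = [p for s, p in self.orders.values() if s == "S"]
        return min(asks) if asks else None

    def mid_price(self) -> Optional[float]:
        """Average of best bid and best ask, or None if a side is empty."""
        bid, ask = self.best_bid(), self.best_ask()
        if bid is None or ask is None:
            return None
        return (bid + ask) / 2.0


def replay(events: Iterable[Event]) -> Iterator[Optional[float]]:
    """Replay events in order, yielding the mid-price after each one."""
    book = ToyBook()
    for event in events:
        book.apply(event)
        yield book.mid_price()
```

Collecting the values yielded by `replay` gives a univariate time-series of the kind the reconstructor writes to CSV.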
Oracle Java JVM 1.7.0 or higher. Note that the default JVM installed on MacOS or Linux needs to be replaced by the Oracle version in order for the software to work correctly.
If running on Windows you will need to install Cygwin in order to execute the shell scripts.
(Optional) In order to build the software from source, you will need the scala build tool (sbt); see the sbt documentation.
(Optional) In order to host the data, you will need to install Apache HBase version 1.1.2. The software can optionally connect to an existing server which already hosts the data.
1. Configure the HBase host
Open the file hbase-site.xml in the directory etc/ using a text-editor and
check that the hbase.master and hbase.zookeeper.quorum properties point to the
machine running Apache HBase. For example, the configuration below can be
used to connect to the machine with hostname cseesp1.essex.ac.uk.
Alternatively, to connect to your own laptop running HBase in stand-alone mode,
set both values to localhost.
<configuration>
  <property>
    <name>hbase.master</name>
    <value>cseesp1.essex.ac.uk</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>cseesp1.essex.ac.uk</value>
  </property>
</configuration>
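The two properties can also be checked programmatically before starting the software. The following is a minimal sketch using only Python's standard library; the function names `read_hbase_properties` and `hosts_match` are hypothetical helpers and not part of SCOBRE.

```python
import xml.etree.ElementTree as ET

# The two properties the text above asks you to check.
REQUIRED_KEYS = ("hbase.master", "hbase.zookeeper.quorum")


def read_hbase_properties(xml_text):
    """Return a dict of property name -> value parsed from hbase-site.xml text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def hosts_match(xml_text, expected_host):
    """True if both required properties are set to expected_host."""
    props = read_hbase_properties(xml_text)
    return all(props.get(key) == expected_host for key in REQUIRED_KEYS)
```

For example, `hosts_match(open("etc/hbase-site.xml").read(), "localhost")` would confirm a stand-alone configuration, assuming the script is run from the directory that contains etc/.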
2. Compile the code
To compile the source-code to separate .class files, execute the following command:
To create jar files and the script files:
3. Install the shell scripts
Execute the following commands in the shell to install the scripts into the directory
~/local/bin:
cd target/pack/bin
make install
If ~/local/bin is not already in your PATH environment variable, add it to
PATH in your shell start-up file.
Running the reconstructor from the shell
The replay-orders script can then be used to retrieve a univariate time-series of prices.
The following example will replay all recorded events for the asset with given ISIN and provide a GUI visualisation of the order-book.
replay-orders -t GB0009252882 --with-gui
The following will replay a subset of events over a given date-range:
replay-orders -t GB0009252882 --with-gui \
  --start-date 5/6/2007 --end-date 6/6/2007
The following command will log the mid-price to a CSV file called hf.csv and
will not provide a GUI:
replay-orders -t GB0009252882 --property midPrice \
  --start-date 5/6/2007 --end-date 6/6/2007 -o hf.csv
The following command will log transaction prices to a CSV file called hf.csv:
replay-orders -t GB0009252882 --property lastTransactionPrice -o hf.csv
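When several assets or date-ranges need to be replayed, the command line can be assembled from a script. The sketch below is one way to do that from Python. It uses only the options shown in the examples above (-t, --with-gui, --property, --start-date, --end-date, -o); the function names `build_replay_command` and `run_replay` are hypothetical, and `run_replay` assumes replay-orders is on your PATH.

```python
import subprocess


def build_replay_command(isin, prop=None, start_date=None, end_date=None,
                         output=None, with_gui=False):
    """Build the argument list for replay-orders from the documented options."""
    cmd = ["replay-orders", "-t", isin]
    if with_gui:
        cmd.append("--with-gui")
    if prop is not None:
        cmd += ["--property", prop]
    if start_date is not None:
        cmd += ["--start-date", start_date]
    if end_date is not None:
        cmd += ["--end-date", end_date]
    if output is not None:
        cmd += ["-o", output]
    return cmd


def run_replay(**kwargs):
    """Run replay-orders; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_replay_command(**kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting issues, so no backslash line continuations are needed.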
To get the full list of options use the built-in help:
Accessing the simulator from a Python client
The simulator provides an Apache Thrift API which allows clients written in non-JVM languages to call the reconstructor. To start the server, run the following script:
By default the server will listen on TCP port 9090. To see the configuration options, run:
To see an example of using the API from Python see the script tickdata.py.
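Before running a client, it can help to confirm that the server is accepting connections. The sketch below uses only Python's standard library and does not speak the Thrift protocol; it merely checks that something is listening on the port. The function name `server_is_listening` is hypothetical, and the default port 9090 is the one stated above.

```python
import socket


def server_is_listening(host="localhost", port=9090, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is accepting connections at the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A result of True only shows that the port is open; the actual API calls are demonstrated in tickdata.py.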
- The data description provided by the LSE
- The API documentation
Working on the project using an IDE
To import the project as an IntelliJ IDEA project, first install the Scala
plugin, and then directly import the
build.sbt file as a new project.
Importing the raw data into Apache HBase
1. Install Apache HBase 1.1.2 in standalone mode.
2. Modify the file hbase-site.xml in the etc/ directory of the folder where you unpacked the lse-data distribution as follows:
<configuration>
  <property>
    <name>hbase.master</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
</configuration>
3. Create an empty table called events with column family data using the HBase shell:
cd /opt/hbase/bin
./hbase shell
create 'events', 'data'
4. Run the shell script hbase-import.sh specifying the raw files to import:
cd ./scripts
./import-data-lse.sh ../data/lse/*.CSV.gz
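Before starting a long import it can be useful to inspect a raw file without decompressing it to disk. The following sketch reads the first few rows of a gzip-compressed CSV file with Python's standard library. The function name `peek_rows` is hypothetical, and the UTF-8 default encoding is an assumption; the column layout is defined by the LSE data description and is not interpreted here.

```python
import csv
import gzip
import itertools


def peek_rows(path, limit=5, encoding="utf-8"):
    """Return the first `limit` rows of a gzip-compressed CSV file as lists."""
    # "rt" opens the compressed stream in text mode; newline="" lets the
    # csv module handle line endings itself.
    with gzip.open(path, mode="rt", encoding=encoding, newline="") as handle:
        return list(itertools.islice(csv.reader(handle), limit))
```

Each row is returned as a list of strings, so the result can be compared directly against the field list in the data description.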
(C) Steve Phelps 2016