MAQUI is a web application that supports expressive querying and flexible pattern mining for exploring event sequence data. It is a collaborative effort between researchers at Georgia Tech and Adobe Research. For more information about the project, please refer to the paper at IEEE VIS 2018:
MAQUI: Interweaving Queries and Pattern Mining for Recursive Event Sequence Exploration. Po-Ming Law, Zhicheng Liu, Sana Malik, and Rahul C. Basole. IEEE Transactions on Visualization and Computer Graphics (IEEE VIS 2018).
This video walks you through the basic functionality of MAQUI. This video demonstrates how MAQUI can be used for exploring the Foursquare dataset.
MAQUI needs your data to be in two specific files in the data folder, each in a specific format.
data/events.csv is a file with one row per 'event'. In GOV.UK analytics,
'event' is unfortunately an overloaded term. We recommend creating a dataset with
one row per Google Analytics hit, where a hit can be one of two types, PAGE or
EVENT.
The data/events.csv file must be comma-delimited, have a header row, and the
first three columns must be ID,time,Activity. Data in the time column must
be an ISO 8601 datetime stamp of the form %Y-%m-%dT%H:%M:%S.%fZ. For example,
2022-03-02T06:11:23.004Z and 2022-03-02T06:11:23Z will both work, because
subseconds are optional. Data in the ID column must be a unique ID per
user/session, so we recommend combining the fullVisitorId and visitId. A
line of SQL code to do this is CONCAT(fullVisitorId, "-", visitId) AS ID.
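The format rules above can be checked before loading a file into MAQUI. The following is a minimal sketch, not part of MAQUI itself: the names `check_events` and `is_valid_time` are hypothetical, and it assumes that no field contains an embedded newline, so that CSV rows line up with file lines.

```python
import csv
import io
from datetime import datetime

REQUIRED_HEADER = ["ID", "time", "Activity"]
# Subseconds are optional, so two formats are tried in turn.
TIME_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def is_valid_time(value):
    """Return True if value parses under one of the accepted formats."""
    for fmt in TIME_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def check_events(handle):
    """Return a list of problems found in an open events.csv file object."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header is None or header[:3] != REQUIRED_HEADER:
        return ["first three columns must be ID,time,Activity"]
    problems = []
    # The header is line 1, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        if len(row) < 3:
            problems.append(f"line {line_number}: fewer than three columns")
        elif not is_valid_time(row[1]):
            problems.append(f"line {line_number}: bad time {row[1]!r}")
    return problems


# Illustrative data only: the last row uses an unsupported time format.
sample = io.StringIO(
    "ID,time,Activity\n"
    "123-456,2022-03-02T06:11:23.004Z,PAGE\n"
    "123-456,2022-03-02T06:11:23Z,EVENT\n"
    "123-456,02/03/2022 06:11,PAGE\n"
)
print(check_events(sample))
```

To check a real file, pass `open("data/events.csv", newline="")` instead of the in-memory sample.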
data/recordAttributes.csv is a file with one row per 'session'. The file exists to
store attributes about each whole session (rather than about each hit in each
session). It must have at least one column, called ID, containing the unique
set of values from the ID column of the file data/events.csv. You can add
any other columns to hold attributes about each session.
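The requirement that both files share the same set of ID values can be verified with a short script. This is a sketch under assumptions rather than anything shipped with MAQUI: `read_ids` and `compare_ids` are hypothetical names, and the sample rows are made up for illustration.

```python
import csv
import io


def read_ids(handle):
    """Collect the set of values in the ID column of an open CSV file object."""
    return {row["ID"] for row in csv.DictReader(handle)}


def compare_ids(events_handle, attributes_handle):
    """Return (missing, extra).

    missing: IDs present in events.csv but absent from recordAttributes.csv.
    extra:   IDs present in recordAttributes.csv but absent from events.csv.
    """
    event_ids = read_ids(events_handle)
    attribute_ids = read_ids(attributes_handle)
    return sorted(event_ids - attribute_ids), sorted(attribute_ids - event_ids)


# Illustrative data only; "device" is an example of an extra attribute column.
events = io.StringIO(
    "ID,time,Activity\n"
    "a-1,2022-03-02T06:11:23Z,PAGE\n"
    "b-2,2022-03-02T06:12:00Z,EVENT\n"
)
attributes = io.StringIO("ID,device\na-1,mobile\n")
missing, extra = compare_ids(events, attributes)
print("missing from recordAttributes:", missing)
print("not found in events:", extra)
```

Both lists should be empty for a consistent pair of files.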
If you only need to create the ID column, then run the script
make_recordAttributes.sh. It will find the unique set of ID values from the
file data/events.csv, and write them into a new file called
data/recordAttributes.csv.
bash make_recordAttributes.sh
MAQUI should work fine for data sets that contain up to 200,000 events, and a
few hundred different values of Activity and any other columns that you
create.
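For readers who prefer Python to the shell script, the step that make_recordAttributes.sh is described as performing can be approximated as below. This is a sketch of the described behaviour, not the script's actual implementation, and the function name `write_record_attributes` is hypothetical. It also returns the number of events, which can be compared against the 200,000-event guideline.

```python
import csv
import io


def write_record_attributes(events_handle, out_handle):
    """Write one row per distinct ID; return (event_count, session_count)."""
    seen = {}  # a dict keeps IDs in first-seen order
    event_count = 0
    for row in csv.DictReader(events_handle):
        seen.setdefault(row["ID"], None)
        event_count += 1
    writer = csv.writer(out_handle, lineterminator="\n")
    writer.writerow(["ID"])
    for session_id in seen:
        writer.writerow([session_id])
    return event_count, len(seen)


# Illustrative data only: three events spread over two sessions.
events = io.StringIO(
    "ID,time,Activity\n"
    "a-1,2022-03-02T06:11:23Z,PAGE\n"
    "b-2,2022-03-02T06:12:00Z,PAGE\n"
    "a-1,2022-03-02T06:13:10Z,EVENT\n"
)
output = io.StringIO()
event_count, session_count = write_record_attributes(events, output)
print(f"{event_count} events in {session_count} sessions")
print(output.getvalue())
```

With real files, open data/events.csv for reading and data/recordAttributes.csv for writing, both with `newline=""`, and pass the two file objects in place of the in-memory samples.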
You need to install Python 3 (rather than Python 2) and Flask. Java is also needed on your system, because MAQUI uses a Java library called SPMF for pattern mining. The Chrome browser is recommended, because the original developers only tested the software with Chrome.
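A quick way to see whether these prerequisites appear to be in place is sketched below. It is not part of MAQUI, and `check_prerequisites` is a hypothetical name. Note the limits of the check: finding a `java` executable on the PATH does not guarantee a working Java runtime, and Flask is only detected if it is installed for the interpreter running the script.

```python
import importlib.util
import shutil
import sys


def check_prerequisites():
    """Report whether each prerequisite appears to be available."""
    return {
        # True when this script is itself running under Python 3 or later.
        "python3": sys.version_info.major >= 3,
        # True when the flask package can be located by this interpreter.
        "flask": importlib.util.find_spec("flask") is not None,
        # True when an executable named java is found on the PATH.
        "java": shutil.which("java") is not None,
    }


for name, found in check_prerequisites().items():
    print(f"{name}: {'ok' if found else 'missing'}")
```

Run it inside the virtual environment you intend to use for MAQUI so that the Flask result reflects that environment.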
The following are the instructions and commands for running the system using a Mac:
Clone the repository using a terminal.
git clone https://github.com/alphagov/MAQUI.git
Go to the folder named MAQUI.
cd MAQUI
Create a Python virtual environment, activate it, and install Flask in it. There are different ways to do this. For example:
python3 -m venv ~/.virtualenvs/MAQUI
source ~/.virtualenvs/MAQUI/bin/activate
pip install Flask
Put data files called events.csv and recordAttributes.csv into the data
folder. See the section above called "Data" for what form the data should take.
Go to the server folder.
cd server
Start the Python server (you need Python 3 for the system to work properly).
python server.py
This was forked from a project by Terrance Law.
