Working with a mySQL database
When running large-scale hctsa computations, it can be useful to set up a mySQL database for time series, operations, and the computation results, and have many Matlab instances (running on different nodes of a compute cluster, for example) communicate directly with the database.
The hctsa software comes with this (optional) functionality, allowing a more powerful, distributed way to compute and store results of a large-scale computation.
This chapter outlines the steps involved in setting up and running hctsa computations using a linked mySQL database.
Installing the hctsa code package to work with a mySQL database
The hctsa package requires some preliminary setup to work with a mySQL database, described here:
- Installation of mySQL, either locally, or on an accessible server.
- Setting up Matlab with a mySQL java connector (done by running the install_jconnector script in the Database directory, and then restarting Matlab).
After the database is set up, and the packages required by hctsa are installed (by running the install script), linking to a mySQL database can be done by running the install_database script, which:
- Sets up Matlab to be able to communicate with the mySQL server and creates a new database to store Matlab calculations in, described here.
- Populates the database with our default library of master operations and operations, as described here. (NB: a 'master operation' is a set of input arguments to an analysis function, and an 'operation' is a single time-series feature; this terminology is explained here.)
This section contains additional details about each of these steps.
Note that the above steps are one-off installation steps; once the software is installed and compiled, a typical workflow will simply involve opening Matlab, running the startup script (which adds all paths required for the hctsa software), and then working within Matlab from any desired directory.
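As a rough sketch of how these steps fit together at the Matlab prompt, the commands below separate the one-off installation from the per-session start. The script names are those given above; the working directories, the relative order of install and install_jconnector, and the assumption that install_database is on the Matlab path are illustrative guesses rather than documented behaviour. mySQL itself must already be installed.

```matlab
% --- One-off installation (assumed to be run from the hctsa root directory) ---
install              % install the packages required by hctsa
cd Database          % install_jconnector is in the Database directory
install_jconnector   % set up the mySQL java connector
% Restart Matlab here before continuing.

% --- After restarting Matlab (assumes install_database is on the Matlab path) ---
install_database     % link Matlab to the mySQL server; create and populate the database

% --- Every later session ---
startup              % assumed to be run from the hctsa root; adds all required paths
```

Only the final command needs repeating in later sessions; the earlier ones are installation steps that are run once.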
Adding a time-series dataset
Once installed using our default library of operations, the typical next step is to add a dataset of time series to the database using the
Custom master operations and operations can also be added, if required.
Computation, processing, and analysis
After installing the software and importing a time-series dataset to a mySQL database, the process by which data is retrieved from the database to local Matlab files (using SQL_retrieve), feature sets computed within Matlab (using TS_compute), and computed data stored back in the database (SQL_store) is described in detail here.
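A minimal sketch of one retrieve, compute, store cycle is given below. The function names are those mentioned above, but the argument forms are assumptions made for illustration (a vector of time-series IDs, and the string 'all' to select every operation); check each function's help text for the interface in your version of hctsa.

```matlab
% Hypothetical selection: the first five time series in the database
ts_ids = 1:5;

% Retrieve the selected time series (and, by assumption, all operations)
% from the database into local Matlab files
SQL_retrieve(ts_ids, 'all');

% Compute the features for the retrieved data within Matlab
TS_compute;

% Write the computed results back to the database
SQL_store;
```

Because each cycle works on its own subset of IDs, several Matlab instances could in principle run this loop on different ID ranges against the same database, which is the distributed use case described at the start of this chapter.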
After the computation is complete for a time-series dataset, a range of processing, analysis, and plotting functions are also provided with the software, as described here.