AppScale is designed to make it easy to run over different databases. It does so by abstracting away the database itself through two interfaces: one that defines how the database should be started, and one that defines how to save and retrieve data. This document details how these interfaces are defined and how to extend AppScale to support a new database via an example (which we will call "mydb").
ProTip: In addition to these directions, check out how we implement support for Cassandra (appscale/AppDB/cassandra) for best practices when adding support for your database.
When we build an AppScale virtual machine from source, we need to automatically build your new database as well. Open up appscale/debian/appscale_install_functions.sh and add a function named "installmydb", which should define how to install your database. Next, open appscale/debian/appscale_build.sh and change this line:
supported_dbs=(hbase hypertable mysql cassandra)
to:
supported_dbs=(hbase hypertable mysql cassandra mydb)
to tell the build script that your database is officially supported. Continue by opening appscale/debian/appscale_install.sh and after the "core)" section, add in your database:
core)
    ... lots of stuff ...
    ;;
mydb)
    installmydb
    ;;
... lots of stuff ...
all)
    ... lots of stuff ...
    installmydb
    installcassandra
    # and the other DBs
Of course, be sure to actually test that this works! Build a clean Ubuntu Lucid VM, run "bash appscale_build.sh" and make sure your database is installed.
Begin by creating a directory in AppDB named after your database (for our example, appscale/AppDB/mydb, alongside appscale/AppDB/cassandra).
Next, create a file in that directory named "mydb_helper.rb". This Ruby file needs to define the following functions:
setup_db_config_files(master_ip, slave_ips, creds)
Begin by creating an empty "__init__.py" file in your database's directory in AppDB, so that Python treats that directory as a package.
Next, create a file named "mydb_interface.py" and create a class called "MyDBInterface" that subclasses "AppDBInterface". Be sure to import that interface from dbinterface_batch and not the older, slower dbinterface. Next, implement the following methods for your database:
batch_get_entity(self, table_name, row_key, column_names)
batch_put_entity(self, table_name, row_key, column_names, cell_values)
batch_delete(self, table_name, row_keys, column_names=[])
range_query(self, table_name, column_names, start_key, end_key, limit, offset=0, start_inclusive=True, end_inclusive=True, keys_only=False)
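To make the shape of these four methods concrete, here is a minimal sketch that backs "MyDBInterface" with an in-memory dictionary instead of a real database. Treat it as an illustration only: the return formats, the handling of "row_key" as a list of keys, and the layout of "cell_values" are our assumptions rather than something this page specifies, so check the Cassandra implementation for the behavior AppScale actually expects. The fallback "AppDBInterface" class exists only so the sketch runs outside an AppScale checkout.

```python
try:
    # Inside an AppScale checkout, this is the real base class.
    from dbinterface_batch import AppDBInterface
except ImportError:
    # Stand-in so the sketch runs on its own (assumption, not AppScale code).
    class AppDBInterface(object):
        pass


class MyDBInterface(AppDBInterface):
    """Toy in-memory datastore; swap the dict for calls to mydb."""

    def __init__(self):
        # table name -> {row key -> {column name -> value}}
        self.tables = {}

    def batch_get_entity(self, table_name, row_key, column_names):
        # Assumption: row_key is a list of keys. Each key maps to the
        # requested columns that exist ({} when the row is missing).
        table = self.tables.get(table_name, {})
        result = {}
        for key in row_key:
            row = table.get(key, {})
            result[key] = dict((c, row[c]) for c in column_names if c in row)
        return result

    def batch_put_entity(self, table_name, row_key, column_names, cell_values):
        # Assumption: cell_values maps each key to a {column: value} dict.
        table = self.tables.setdefault(table_name, {})
        for key in row_key:
            row = table.setdefault(key, {})
            for column in column_names:
                row[column] = cell_values[key][column]

    def batch_delete(self, table_name, row_keys, column_names=None):
        # column_names is accepted but ignored here: whole rows are removed.
        # None is used as the default to avoid a shared mutable default.
        table = self.tables.get(table_name, {})
        for key in row_keys:
            table.pop(key, None)

    def range_query(self, table_name, column_names, start_key, end_key, limit,
                    offset=0, start_inclusive=True, end_inclusive=True,
                    keys_only=False):
        table = self.tables.get(table_name, {})
        matches = []
        for key in sorted(table):
            if key < start_key or key > end_key:
                continue
            if key == start_key and not start_inclusive:
                continue
            if key == end_key and not end_inclusive:
                continue
            matches.append(key)
        page = matches[offset:offset + limit]
        if keys_only:
            return page
        # Assumption: one {key: {column: value}} dict per matching row.
        return [{key: dict((c, table[key][c]) for c in column_names
                           if c in table[key])} for key in page]


# Usage: store two guestbook entries, then read one back.
db = MyDBInterface()
db.batch_put_entity("guestbook", ["a", "b"], ["msg"],
                    {"a": {"msg": "hello"}, "b": {"msg": "hi"}})
print(db.batch_get_entity("guestbook", ["a"], ["msg"]))
```

A real implementation would replace the dictionary operations with calls to your database's client library, but keeping a toy version like this around can be handy for checking your understanding of the inclusive/exclusive bounds in range_query.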
To prevent users from running with a misspelled database name, the AppScale Tools only let users run databases that match a whitelist. Open up appscale-tools/lib/parse_args.py and change this line:
ALLOWED_DATASTORES = ["hbase", "hypertable", "cassandra"]
to include your database:
ALLOWED_DATASTORES = ["hbase", "hypertable", "cassandra", "mydb"]
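For intuition, a whitelist check of this kind might look like the sketch below. This is not the code in parse_args.py: "validate_datastore" is a hypothetical helper name we made up for illustration, and only the ALLOWED_DATASTORES list comes from the line above.

```python
# The whitelist, as edited above.
ALLOWED_DATASTORES = ["hbase", "hypertable", "cassandra", "mydb"]


def validate_datastore(name):
    """Return name if it is whitelisted, else raise ValueError.

    Hypothetical helper: the real check in the AppScale Tools may differ.
    """
    if name not in ALLOWED_DATASTORES:
        raise ValueError("Unsupported datastore %r; choose one of: %s"
                         % (name, ", ".join(ALLOWED_DATASTORES)))
    return name


# Usage: a correctly spelled name passes through unchanged.
print(validate_datastore("mydb"))
```

The point of failing early like this is that a typo such as "mydbb" is rejected before any virtual machines are started.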
You will then be good to go when you execute "appscale-run-instances" or "appscale up" with your new database name specified! Make sure that you can run AppScale with your database on one node and on four nodes, in both a cluster deployment and a cloud deployment. The guestbook app is an easy way to exercise puts and queries.
Once you have your new database automatically started and installed on AppScale virtual machines, send us a pull request in a branch that details what database you've added support for and why it's awesome, and we'll review it and (assuming it works for us) accept it!