Skip to content

SQL Support

Alessandro Luccaroni edited this page Jul 7, 2020 · 9 revisions

HerdDB is born due to the need to port existing JDBC application to a distributed, shared nothing, embeddable database, and so even if it is a key-value database at the low level the most commonly used API is the JDBC/SQL interface.

This is a brief description of SQL/JDBC supported functions If you need to support any of the SQL constructs just ask to the dev team on the mailing-list, file a GitHub issue or a JIRA issue

HerdDB uses Apache Calcite and jSQLParser in order to parse SQL and create a data access plan Apache Calcite SQL reference is here We are enabling Babel Parser, and MySQL 5.6 dialect. In order to use reserved words you have to use MySQL backticks.

Please note that PrimaryKey access is the best data access pattern

Supported SQL Commands:

  • CREATE/DROP TABLE
  • BEGIN/COMMIT/ROLLBACK TRANSACTION
  • ALTER TABLE ADD/DROP/MODIFY COLUMN
  • INSERT INTO TABLE (....) VALUES (....)
  • UPSERT INTO TABLE (....) VALUES (....)
  • SELECT .... FROM TABLE WHERE PK=?
  • UPDATE TABLE SET .... WHERE PK=?
  • DELETE FROM TABLE .... WHERE PK=?
  • UPDATE TABLE SET .... WHERE ....
  • SELECT .... FROM TABLE WHERE ....
  • DELETE FROM TABLE WHERE ....
  • EXPLAIN QUERY
  • SHOW CREATE TABLE
  • TRUNCATE TABLE
  • JDBC PreparedStatements (which query plan cache, leveraging JDBC Parameters)
  • SUBQUERIES which deal on other tables on the same tablespace
  • SUM, COUNT, MIN, MAX aggregated functions
  • UPPER, LOWER scalar functions
  • AUTO_INCREMENT for integer/long columns (only on single columns PKs)
  • CURRENT_TIMESTAMP MySQL virtual column
  • YEAR, MONTH, DAYOFMONTH, DAYOFWEEK, HOUR, MINUTE, SECOND date functions
  • TIMESTAMPDIFF, TIMESTAMPADD date functions
  • FLOOR(datetime AS DAY) function

Supported types:

  • strings (backed by java UTF encoding)
  • byte (8bit signed integers) (partially)
  • int (32bit signed integers)
  • long (64bit signed integers)
  • float (32bit single precision floating point) (partially)
  • double (64bit double precision floating point) (partially)
  • java.sql.Timestamp (datetimes with millisecond precision)
  • blobs (byte arrays)

The supported list of datatype names is just the mapping from the various names to the list of the previous list raw datatypes, for instance VARCHAR2 maps to 'string'

Other notes:

  • Identifiers are always non case-insensitive
  • JDBC ResultSet are never updatable and forwards-only
  • The same Connection can run multiple statements concurrently and keeps multiple ResultSets/cursors open
  • Transaction isolation is ALWAYS "COMMITTED_READ"
  • When a DML statement spans multiple columns the selection is performed in transaction and then the mutation on each row (READ_COMMITTED semantics, not REPEATABLE_READ or SERIALIZABLE)
  • Transactions hold READ_LOCKs for each record selected and WRITE_LOCKS for each record INSERTED/UPDATED/DELETED
  • Using SELECT .... FOR UPDATE the transaction will hold a WRITE_LOCK for each record
  • Every column is NULLABLE if not denoted as "NOT NULL"
  • DEFAULT is available for all types excluding "byte". For Timestamp columns only CURRENT_STATEMENT is supported
  • NOT NULL constraint is only available for "integer", "long" and "string" types
  • There is no constraint on string length (syntax is supported only for supporting existing scripts)
  • Joins are quite limited. Nested Loop Join, Sort-Merge Join and Hash Join are supported

We will be very pleased to add any requested feature, just file an issue on the GitHub bug tracker