Mimir SQL

Shivang Aggarwal edited this page Dec 10, 2017 · 11 revisions


query := 
   'SELECT'  expression ['AS' alias][, expression ['AS' alias][, ...]]
     'FROM' source
    'WHERE' condition
 'GROUP BY' column[, column[, ...]]
 'ORDER BY' expression [asc | desc][, column [asc | desc][, ...]]
    'LIMIT' number

expression := 
   | '(' expression ')'
   | value
   | expression ('+'|'-'|'*'|'/') expression
   | '-' expression
   | function '(' [expression[, expression[, ...]] ')'
   | 'CASE' 'WHEN' condition 'THEN' expression ['WHEN' condition 'THEN' expression ['WHEN'...]] 'ELSE' expression 'END'
   | 'CASE' expression 'WHEN' expression 'THEN' expression ['WHEN' expression 'THEN' expression ['WHEN'...]] 'ELSE' expression 'END'
   | 'CAST' '(' expression 'AS' type ')'
   | 'EXTRACT' '(' field 'FROM' expression ')'

condition := 
   | '(' condition ')'
   | condition ('+'|'-'|'*'|'/'|'AND'|'OR'|'>'|'>='|'='|'<'|'<='|'<>') condition
   | 'NOT' expression
   | function '(' [expression[, expression[, ...]] ')'

source := 
   | '(' query ')' ['AS'] alias
   | [schema_name '.'] table_name [['AS'] alias]

Mimir generally supports most SQL92 features in SELECT queries. Mimir's support for aggregate and non-aggregate functions is still a work in progress. See the Functions documentation for more details.


'ANALYZE' ['WITH ASSIGNMENTS'] [[column 'IN'] rowid 'OF'] query

Explain an uncertainty annotation for the specified relation, row, or cell.

  • neither rowid or column: List all sources of uncertainty that might affect the table.
  • rowid but not column: List all sources of uncertainty that affect the presence of the specified row in the table.
  • rowid + column: List all sources of uncertainty that might affect the value of the specified cell.

To obtain the rowid for a given row of a query, add the ROWID() function to the query's outermost SELECT. Alternatively, just write a select query that produces a single row and you'll get the same effect.

Add the WITH ASSIGNMENTS option to get a list of prioritized cleaning tasks.


'FEEDBACK' variable_expression 'IS' value

Provide feedback, repairing a given source of uncertainty. To obtain the variable expression needed to repair a source of uncertainty, use ANALYZE.


'LOAD' file_path_string [ AS name ]

Load the specified file into the database and prepare it for use. Depending on the format, one or more lenses and/or adaptive schemas will be created wrapping the data. If a table name is specified, this name will be used for the newly created table or adaptive schema. Otherwise the base name of the file will be used.

Presently, Mimir only supports loading of CSV files. We expect to support loading JSON files soon (tm).


CSV files are loaded using the Apache Commons CSV parser into a single table. Mimir makes every effort to recover from malformed CSV: Under-sized rows are padded, Over-sized rows are trimmed, Lines that break CommonsCSV are dropped. At present, these errors are logged to the console, but not persisted.

The behavior of the CSV loader differs depending on whether a table with the same name already exists. If there is such a table, rows of the CSV are loaded into this table according to its schema.

If there is no table with the same name, a new table is created with the _RAW suffix, contents of the CSV file are loaded into it as string attributes, and a new Type Inference lens with the specified name is created.


  | 'TABLE' tableName '(' attrName type[, attrName type[, ...]][, tableOption] ')'
  | 'VIEW' viewName 'AS' query
  | 'LENS' lensName 'AS' query 'WITH' lensType '(' lensParameters ')'
  | 'ADAPTIVE' 'SCHEMA' schemaName 'AS' query 'WITH' extractor '(' extractorParameters ')'

tableOption :=
  | 'PRIMARY' 'KEY' '(' attrName[, attrName[, ...]] ')'
  | 'INDEX' '(' attrName[, attrName[, ...]] ')'

Create one of several types of database objects. CREATE TABLE and CREATE VIEW function like they do in a classical database. CREATE LENS and CREATE ADAPTIVE SCHEMA are specific to Mimir. See Lens Documentation for more details.


'EXPLAIN' query

Document the Mimir rewriting engine's decisions in executing a specified query. Analogous to EXPLAIN in many database engines.


'PLOT' data_table_name [WITH globalOption][ globalOption][ USING lineSpecification][, lineSpecification];

|'FORMAT' ('line'|'scatter'|'bar')
|'PLOTNAME' value
|'XLABEL' value
|'YLABEL' value
|'XMAX' numeric_value
|'YMAX' numeric_value
|'XMIN' numeric_value
|'YMIN' numeric_value
|'SAVEFORMAT' ('png'|'pdf'|'svg'|'eps')
|'SAVENAME' value
|'BARORIENT' ('vertical'|'horizontal')
|'LEGEND' ('show'|'hide')
|'LEGENDLOC' ('upper left'|'upper right'|'right'|'lower left'|'lower right'|'center'|
              'center left'|'center right'|'lower center'|'upper center'|'best')

lineSpecification := '('x_column_name, y_column_name')' [lineValue][lineValue]

lineValue :=
|'COLOR' ('black'|'blue'|'red'|'yellow'|'green')
|'STYLE' ('^'|'s'|'o') | ('-'|'--'|':')
|'WEIGHT' numeric_value
|'FILTER' column_name'='value
|'LINENAME' value

Plot the given data table with any specified global or line settings. The first set of lineValue STYLE options are for scatter plots, while the second set are for line plots. The FILTER lineValue option is to allow the x and y values noted in the lineSpecification to be filtered to only contain the x,y values where the column denoted by column_name's values are equal to the specified value.

You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.
Press h to open a hovercard with more details.