
Identifiers in the CQL grammar

Ewout Kramer edited this page Oct 27, 2023 · 1 revision

(TODO: make explicit in which uses names are introduced and when they are a reference?)

The CQL syntax diagrams in the current (1.5) CQL spec are woefully incomplete about the parsing rules around identifiers. Below, I will detail everything you need to know to invoke the right rules and visitors in ANTLR.

Note:

The grammar fragments below are taken from the CQL Framework repo, not the official IG. The relevant difference for this discussion is the change to usingDefinition, which now also supports an alias introduced with called.

The syntax of identifiers

Starting at the bottom, let's first look at the syntactic aspect of identifiers.

identifier
    : IDENTIFIER
    | DELIMITEDIDENTIFIER
    | QUOTEDIDENTIFIER
    ;

QUOTEDIDENTIFIER
    : '"' (ESC | .)*? '"'
    ;

DELIMITEDIDENTIFIER
        : '`' (ESC | .)*? '`'
        ;

IDENTIFIER
        : ([A-Za-z] | '_')([A-Za-z0-9] | '_')*            // Added _ to support CQL (FHIR could constrain it out)
        ;

fragment ESC
    : '\\' ([`'"\\/fnrt] | UNICODE)    // allow \`, \', \", \\, \/, \f, etc. and \uXXX
    ;

These are mostly similar to identifiers in other programming languages, except that unquoted identifiers explicitly exclude Unicode characters outside the ASCII letters, digits and underscore. There are two quoting mechanisms, double quotes and backticks, which keeps the grammar compatible with (older versions of) FhirPath. Quoted and delimited identifiers allow you to use the full set of characters, and support a set of escape sequences. Note that inside double quotes you are still allowed to escape backticks, and vice versa.
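To make the three token shapes concrete, here is a minimal sketch in Python that approximates the lexer rules above with regular expressions. It is not how ANTLR matches tokens: the function name classify is made up for this example, the UNICODE fragment is assumed to be a u followed by four hex digits, and the patterns are slightly stricter than the grammar because a backslash is only accepted as the start of a valid escape.

```python
import re

# Escape sequence, mirroring the ESC fragment: a backslash followed by one of
# ` ' " \ / f n r t, or by 'u' plus four hex digits (assumed shape of UNICODE).
ESC = r"\\(?:[`'\"\\/fnrt]|u[0-9A-Fa-f]{4})"

# Unquoted identifier: ASCII letters, digits and underscore only.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Double-quoted identifier: escapes or any character except " and backslash.
QUOTED = re.compile(r'"(?:' + ESC + r'|[^"\\])*"', re.DOTALL)
# Backtick-delimited identifier: escapes or any character except ` and backslash.
DELIMITED = re.compile(r"`(?:" + ESC + r"|[^`\\])*`", re.DOTALL)


def classify(token):
    """Return the name of the lexer rule this token resembles, or None."""
    if IDENTIFIER.fullmatch(token):
        return "IDENTIFIER"
    if QUOTED.fullmatch(token):
        return "QUOTEDIDENTIFIER"
    if DELIMITED.fullmatch(token):
        return "DELIMITEDIDENTIFIER"
    return None


if __name__ == "__main__":
    for sample in ["Encounter_1", '"Inpatient Encounter"', "`Inpatient Encounter`", "1abc"]:
        print(sample, "->", classify(sample))
```

A name containing spaces, or characters outside ASCII, only classifies as an identifier once it is wrapped in double quotes or backticks.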

Alternative names for identifier

Below are the rules that define straight synonyms for identifier.

modelIdentifier
    : identifier
    ;

libraryIdentifier
    : identifier
    ;

qualifier
    : identifier
    ;
  • qualifier is used to construct "qualified identifiers", which we will discuss later.
  • libraryIdentifier is now only used as a prefix in codesystemIdentifier and codeIdentifier, which suggests that it is a remnant from a version of CQL where libraries could not be aliased yet.
  • modelIdentifier is now only used as a prefix in contextDefinition, which suggests that it is a remnant from a version of CQL where used models could not be aliased yet.

(TODO: if qualifier is only used in references, then they are used in references)

To summarize, these rules redefine identifier to be used as a qualifier (or prefix) for other identifiers, so the combination can unambiguously refer to definitions in other libraries and models.

alias
    : identifier
    ;

localIdentifier
    : identifier
    ;

These rules introduce aliases:

  • localIdentifier is used to introduce the alias that can be specified after the called keyword in an include or using.
  • alias is used to introduce the alias name for a query source in a query.

Identifiers and keywords

The next set of rules helps the ANTLR parser to accept reserved keywords as identifiers exactly where that is expected and allowed.

referentialIdentifier
    : identifier
    | keywordIdentifier
    ;

identifierOrFunctionIdentifier
    : identifier
    | functionIdentifier
    ;

referentialOrTypeNameIdentifier
    : referentialIdentifier
    | typeNameIdentifier
    ;

In these rules, keywordIdentifier, functionIdentifier and typeNameIdentifier refer to enumerated subsets of the keywords in CQL.

Examples of their use are:

  • As the name of a tuple element
  • As the parameter name in a function definition
  • As the name of a member you can access on a complex object
  • As the name of a function to be invoked
  • As the name of a function in a function definition
  • As the unqualified name part in a NamedTypeSpecifier

These are in addition to the normal uses of the plain identifier rule:

  • As the name of a parameter/codesystem/valueset/etc in the definition section of the library
  • As the name of an expression when it is defined

To summarize, these rules can be regarded as defining simple, unqualified identifiers plus the possibility for them to be reserved words.
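The layering of these three rules can be sketched as a set of predicates in Python. This is only an illustration under loud assumptions: the keyword sets below are small placeholders chosen for the example, not the actual enumerations from the grammar, and all function names are hypothetical. The point is only that each rule is "a plain identifier, or a member of one specific subset of reserved words".

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# PLACEHOLDER subsets, for illustration only. The real keywordIdentifier,
# functionIdentifier and typeNameIdentifier rules enumerate different lists;
# consult the grammar for the actual members.
KEYWORD_IDENTIFIERS = frozenset({"version", "display"})
FUNCTION_IDENTIFIERS = frozenset({"contains", "start"})
TYPE_NAME_IDENTIFIERS = frozenset({"Code", "Concept"})
# Assumed: every word above is a reserved word, plus some that are in no subset.
RESERVED = KEYWORD_IDENTIFIERS | FUNCTION_IDENTIFIERS | TYPE_NAME_IDENTIFIERS | {"define"}


def is_identifier(word):
    """Plain unquoted identifier: right shape and not a reserved word."""
    return bool(_IDENTIFIER.fullmatch(word)) and word not in RESERVED


def is_referential_identifier(word):
    """referentialIdentifier: identifier | keywordIdentifier"""
    return is_identifier(word) or word in KEYWORD_IDENTIFIERS


def is_identifier_or_function_identifier(word):
    """identifierOrFunctionIdentifier: identifier | functionIdentifier"""
    return is_identifier(word) or word in FUNCTION_IDENTIFIERS


def is_referential_or_type_name_identifier(word):
    """referentialOrTypeNameIdentifier: referentialIdentifier | typeNameIdentifier"""
    return is_referential_identifier(word) or word in TYPE_NAME_IDENTIFIERS


if __name__ == "__main__":
    for word in ["Patient", "version", "contains", "Concept", "define"]:
        print(word, is_identifier(word), is_referential_identifier(word))
```

With these placeholder sets, a word like version is rejected as a plain identifier but accepted wherever a referentialIdentifier is allowed, which is the effect the grammar rules aim for.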

Qualified identifiers

codesystemIdentifier
    : (libraryIdentifier '.')? identifier
    ;

codeIdentifier
    : (libraryIdentifier '.')? identifier
    ;

namedTypeSpecifier
    : (qualifier '.')* referentialOrTypeNameIdentifier
    ;

qualifiedIdentifier
    : (qualifier '.')* identifier
    ;

qualifiedIdentifierExpression
    : (qualifierExpression '.')* referentialIdentifier
    ;

qualifierExpression
    : referentialIdentifier
    ;

contextIdentifier
    : qualifiedIdentifierExpression
    ;
  • namedTypeSpecifier is not really an identifier, but it works as a reference to a model type, and as such is a close cousin to the constructs in this section.
  • qualifiedIdentifier is used to define the name of a library, to refer to a model in the using statement and to refer to a library in the include statement.
  • qualifiedIdentifierExpression, qualifierExpression and contextIdentifier are only used by querySource and the retrieve statement.
  • codesystemIdentifier and codeIdentifier have been discussed before; they seem to qualify a reference to a codesystem or code. Surprisingly, this is only through a simple (non-repeating) qualification construct, so it could never refer to a library with multiple qualifiers. I assume this only works through aliases.
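The difference between the repeating form (qualifier '.')* and the optional form (libraryIdentifier '.')? can be sketched in Python as follows. This is a rough approximation, not the ANTLR parser: the function names are made up for this example, keywords are not treated specially, escapes are matched loosely, and whitespace around the dots is not handled.

```python
import re

# One segment: an unquoted identifier, a "quoted" one, or a `delimited` one.
# Dots inside quotes or backticks belong to the segment, not to the path.
_SEGMENT = re.compile(
    r'''[A-Za-z_][A-Za-z0-9_]*|"(?:\\.|[^"\\])*"|`(?:\\.|[^`\\])*`'''
)


def split_qualified(text):
    """Split text shaped like (qualifier '.')* identifier into (qualifiers, name)."""
    parts = []
    pos = 0
    while True:
        match = _SEGMENT.match(text, pos)
        if match is None:
            raise ValueError(f"expected an identifier at offset {pos}: {text!r}")
        parts.append(match.group())
        pos = match.end()
        if pos == len(text):
            break
        if text[pos] != ".":
            raise ValueError(f"expected '.' at offset {pos}: {text!r}")
        pos += 1  # skip the dot and expect another segment
    return parts[:-1], parts[-1]


def split_code_identifier(text):
    """Split text shaped like (libraryIdentifier '.')? identifier.

    At most one qualifier is allowed, mirroring the non-repeating construct.
    """
    qualifiers, name = split_qualified(text)
    if len(qualifiers) > 1:
        raise ValueError(f"at most one library qualifier is allowed: {text!r}")
    return (qualifiers[0] if qualifiers else None), name


if __name__ == "__main__":
    print(split_qualified("Org.Shared.Common"))
    print(split_qualified('Global."Inpatient.Encounter"'))
    print(split_code_identifier("Codes.SomeCode"))
```

A multi-part library name parses fine as a qualifiedIdentifier, but the same path in front of a code name is rejected by split_code_identifier, which is why such references would have to go through a single-identifier alias.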

Additionally, the expressionTerm rule allows construction of a path through its invocationExpression term, which allows expressions like this to parse:

define "Referring to another library":  OtherLibrary.DefinitionInLibrary

In this case, OtherLibrary.DefinitionInLibrary works like a qualified reference to a definition, which is not immediately obvious from the grammar.

False Friends

Just to confuse you, in the next rules the word "qualified" has nothing to do with the qualified identifiers mentioned above. Instead, it seems to refer to the fact that these rules define terms that can be used after the function/member invocation.

qualifiedInvocation
    : referentialIdentifier             #qualifiedMemberInvocation
    | qualifiedFunction                 #qualifiedFunctionInvocation
    ;

qualifiedFunction
    : identifierOrFunctionIdentifier '(' paramList? ')'
    ;