# Syntax of PartiQL

The syntax of PartiQL is completely provisional, but there are reasons behind each choice. In this notebook, I will explain those reasons.

## Why not SQL?

Although I want to demonstrate the value of relational concepts like surrogate indexes and set operations for particle physics, strict SQL would limit usability in ways that would be too distracting for the demo.

   * PartiQL indexes are not visible, but SQL's are. In PartiQL, we are only using the indexes and index-matching to ensure that particles retain their identity, so there is only one choice for the `ON` clause of a `JOIN`. SQL is more general: SQL users sometimes want to match on surrogate keys, sometimes natural keys, and their choice will depend on their domain. Following this path, PartiQL should at least drop SQL's `ON` clause.
   * SQL's {database, table, row} hierarchy corresponds to the awkward array structure `ListArray(RecordArray(PrimitiveArrays...))`. In particle physics, we want to deal with deeper and more varied structures than this. SQL could emulate them through table normalization, but that would require an `ON` clause to select the right foreign keys to link tables. Arguably, PartiQL manages foreign keys internally, through its implicit `ON` clauses, to provide the appearance of deeply nested data structures. This makes it more high-level than SQL, at the cost of generality: it maintains data in a way that is appropriate for particle physics only.
   * If we apply queries to individual events, we will need new constructs to perform operations across events, such as cutting events and histogramming.
   * SQL seems to have bad design decisions, patched over by decades of practice. For example, [common table expressions](https://www.citusdata.com/blog/2018/08/09/fun-with-sql-common-table-expressions/) are a boilerplate-heavy, out-of-order way to do functional composition, and the evaluation order is visible and [very different](https://sqlbolt.com/lesson/select_queries_order_of_execution) from the order in which queries are written. Adhering to SQL's syntax would put a burden on physicists who are new to it.
   * PartiQL queries are never going to be exact SQL queries, so the value of interoperability is at the level of concepts: for that, we use the same names. Data scientists who know SQL will find familiar ideas in PartiQL and physicists who learn something like PartiQL will recognize those terms when they encounter them in SQL.
   * The `cut/vary/hist` block syntax of my October 2018 language was a good idea and should be replicated here.
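
The first two points can be sketched in plain Python. This is only an illustration of the idea, not how PartiQL is implemented: the event data, the `event_id` and `muon_id` keys, and every function name below are hypothetical. It shows nested events flattened into two SQL-style tables linked by surrogate keys, a general join in which the caller must pick the key (SQL's `ON`), and the two cases where there is only one sensible choice, so the key never needs to be written.

```python
# Hypothetical nested event data: each event holds a list of muons.
events = [
    {"muons": [{"pt": 31.2}, {"pt": 12.5}]},
    {"muons": []},
    {"muons": [{"pt": 48.0}]},
]

def normalize(events):
    """Flatten nested events into two flat tables linked by surrogate keys."""
    event_table, muon_table = [], []
    for event_id, event in enumerate(events):
        event_table.append({"event_id": event_id})
        for muon in event["muons"]:
            muon_table.append(
                {"muon_id": len(muon_table), "event_id": event_id, "pt": muon["pt"]}
            )
    return event_table, muon_table

def join_on(left, right, on):
    """General SQL-style inner join: the caller chooses the key to match on."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]

def denormalize(event_table, muon_table):
    """Rebuild the nested view; the foreign key is fixed, so no ON is needed."""
    return [
        {"muons": [{"pt": m["pt"]} for m in muon_table
                   if m["event_id"] == e["event_id"]]}
        for e in event_table
    ]

def match_by_identity(left, right):
    """Keep particles present in both selections, matched by surrogate index."""
    right_ids = {r["muon_id"] for r in right}
    return [l for l in left if l["muon_id"] in right_ids]

event_table, muon_table = normalize(events)

# SQL style: the user has to say which key links the tables.
joined = join_on(event_table, muon_table, on="event_id")

# Implicit-key style: the nested structure comes back without naming a key.
roundtrip = denormalize(event_table, muon_table)

# Two selections of the same muons keep their identity through the index.
high_pt = [m for m in muon_table if m["pt"] > 20]
first_event = [m for m in muon_table if m["event_id"] == 0]
both = match_by_identity(high_pt, first_event)
```

In this sketch, `join_on` is the general tool and the other two functions are special cases with the key hard-coded. That is the trade described above: less generality in exchange for never exposing the indexes to the user.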