User Drivers

Neil O'Toole edited this page Oct 1, 2017 · 8 revisions

sq provides provides a number of built-in drivers, for SQL-native databases (currently MySQL, Postgres, and SQLite) and common table-like data formats, such as XLSX, CSV, and TSV. Internally, these table-like formats are converted to relational data. However, table-like data can be found in other formats, such as XML and JSON, often in a custom document schema. The User Driver mechanism is sq's solution for accessing these data formats.

Execute sq drivers to get a list of all available drivers, both built-in and user-defined.

What is a User Driver?

User Driver is the sane name for what was once known as a Declarative Data Driver Definition. Essentially, a User Driver maps non-SQL data onto relational tables, which can then be queried as if a normal database. The driver is defined declaratively (typically in ~/.sq/ext/MYDRIVER.yml). Currently it is only possible to define User Drivers for one "genre" of data (XML) but it expected that other genres (such as JSON) will be supported in the future. This is all best explained by example.

Example: People User Driver

Let's say we have an XML document (~/people.xml) that is a list of people and their skills. It might look like this:

<?xml version="1.0"?>
<people>
    <person gender="male">
        <firstName>Neil</firstName>
        <lastName>O'Toole</lastName>
        <email>neilotoole@apache.org</email>
        <skill>Cards</skill>
        <skill>Coding</skill>
    </person>

    <person gender="male">
        <firstName>Nikola</firstName>
        <lastName>Tesla</lastName>
        <email>nikola@tesla.rs</email>
        <skill>Electrifying</skill>
        <skill>Inventing</skill>
    </person>

    <person gender="female">
        <firstName>Margaret</firstName>
        <lastName>Hamilton</lastName>
        <email>mhamilton@nasa.gov</email>
        <skill>Coding</skill>
        <skill>Navigating</skill>
    </person>
</people>

We're going to give this User Driver the short name of ppl. Create a file ~/.sq/ext/ppl.sq.yml, and add the following definition:

user_drivers:
- driver: ppl
  genre: xml
  title: People
  selector: /people
  tables:
  - table: person
    selector: /people/person
    cols:
    - col: person_id
      datatype: int
      primary: true
    - col: first_name
      selector: ./firstName
      datatype: text
    - col: last_name
      selector: ./lastName
      datatype: text
    - col: email
      selector: ./email
      datatype: text
    - col: gender
      datatype: text
      selector: '@gender'
  - table: skill
    selector: /people/person/skill
    cols:
    - col: skill_id
      datatype: int
      primary: true
    - col: person_id
      datatype: int
      foreign: ../person_id
    - col: skill_name
      selector: ./text()
      datatype: text

There's a bunch to unpack in the driver definition, but let's plough ahead. Add people.xml as a data source, giving it the snappy handle @peeps, and specifying the ppl driver.

> sq add /Users/neilotoole/people.xml --driver=ppl @peeps
HANDLE  DRIVER  LOCATION                      OPTIONS
@peeps  ppl     /Users/neilotoole/people.xml

Now, let's inspect the @peeps data source:

> sq inspect @peeps
HANDLE  NAME        SIZE    TABLES  LOCATION
@peeps  people.xml  64.0KB  2       /Users/neilotoole/people.xml

TABLE   ROWS  SIZE    NUM COLS  COL NAMES                                        COL TYPES
person  3     32.0KB  5         person_id, first_name, last_name, email, gender  int(11), text, text, text, text
skill   6     32.0KB  3         skill_id, person_id, skill_name                  int(11), int(11), text

All looking good. Let's run a query:

> sq '@peeps | .person'
person_id  first_name  last_name  email                  gender
1          Neil        O'Toole    neilotoole@apache.org  male
2          Nikola      Tesla      nikola@tesla.rs        male
3          Margaret    Hamilton   mhamilton@nasa.gov     female

And let's take a look at the skill table (using a shorter SLQ syntax):

> sq @peeps.skill
skill_id  person_id  skill_name
1         1          Cards
2         1          Coding
3         2          Electrifying
4         2          Inventing
5         3          Coding
6         3          Navigating

And for good measure, let's run a join, fetching Maggie Hamilton's skills:

> sq '@peeps | .skill, .person | join(.person_id) | .last_name == "Hamilton" | .skill_name'
skill_name
Coding
Navigating
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.
Press h to open a hovercard with more details.