A Brief History of junqi

In 2012, I started to write the objeq Query Library because I saw a need for dynamically filtering data sets in the browser in a way that could be tied, without much headache, to the UI framework of one's choice. It needed to be simple but expressive, for both terseness and readability, and it didn't need a lot of bells and whistles to accomplish its goal.

When I started doing Node.js development full time, some of the original requirements for the objeq library went away. Specifically, I no longer needed data sets to dynamically update themselves because they wouldn't be integrating with UI frameworks. Also, the focus on just filtering data sets gave way to a need for general-purpose data manipulation, such as grouping, subqueries and so on.

In trying to determine whether the objeq library could be extended to support these new requirements, I ran into a problem: much of the library's code is dedicated to generating dynamic results, so supporting the evolution I had in mind would have required a substantial amount of extra work. Considering how lazy I am, that was out of the question.

So I decided to gut it and build a new query engine. junqi was the result. It already supports an evolved (but backward-compatible) version of the objeq query language, and will eventually support a subset of JSONiq.

Why Two Query Languages?

I see objeq and JSONiq addressing two different needs. Both are declarative, but objeq is designed to work like a UNIX pipeline. It doesn't pull data on its own; instead, you push data into it and grab whatever comes out of the other end. As a result, objeq is not useful as a full-blown programming language. JSONiq is more complete by comparison: a JSONiq query grabs data from any number of locations and is able to process that data in a more free-form fashion.

objeq relies on the fact that its host language (JavaScript) can perform pre- or post-processing of the data sets it manipulates, while JSONiq doesn't factor in the existence of a host language and remains completely agnostic of its own implementation details. This makes JSONiq more portable, but possibly less convenient to integrate into a host language. That said, junqi will attempt to address this integration question.

Why Should I Use a JavaScript Query Language?

Ok, let's give a simple example of the type of code you might write three or four times a day. Specifically, you want to find all the users who have listed you as somebody they're following. This might be your first attempt at accomplishing that task:

// First Naive Attempt
function followersForMe(users, me) {
  var results = [];
  for ( var u = 0, ulen = users.length; u < ulen; u++ ) {
    var user = users[u];
    if ( user.follows.indexOf(me) !== -1 ) {
      results.push(user);
    }
  }
  return results;
}

But then you run it and it explodes! Why?! Well, because follows is undefined. JSON can be sparse. There may be no follows property in a User Object, so we need to check for it:

// Second Naive Attempt
function followersForMe(users, me) {
  var results = [];
  for ( var u = 0, ulen = users.length; u < ulen; u++ ) {
    var user = users[u]
      , follows = user.follows;
    if ( !follows ) {
      continue;
    }
    if ( follows.indexOf(me) !== -1 ) {
      results.push(user);
    }
  }
  return results;
}

Wait, WTF?! indexOf is undefined?!? There's a follows property, but it's not an Array. JavaScript doesn't have static type checking, so we need to check the type ourselves:

// Third Naive Attempt
function followersForMe(users, me) {
  var results = [];
  for ( var u = 0, ulen = users.length; u < ulen; u++ ) {
    var user = users[u]
      , follows = user.follows;
    if ( !Array.isArray(follows) ) {
      continue;
    }
    if ( follows.indexOf(me) !== -1 ) {
      results.push(user);
    }
  }
  return results;
}

Ok, this works, but holy shit if that isn't a lot of code just to do something simple. Arrays in Node (and in any ES5 environment) provide a filter() method. Let's use that instead:

// Using Array Filters Instead
function followersForMe(users, me) {
  return users.filter(function(user) {
    var follows = user.follows;
    if ( !Array.isArray(follows) ) {
      return false;
    }
    return follows.indexOf(me) !== -1;
  });
}

This is much better. But we still have to deal with whether or not follows is an Array, and this function is barely shorter than our first attempt.

// Will a Ternary make it shorter?
function followersForMe(users, me) {
  return users.filter(function(user) {
    var follows = user.follows;
    return Array.isArray(follows) ? follows.indexOf(me) !== -1 : false;
  });
}

Ok, that's definitely shorter, but it's even harder to read now. Who the hell invented ternary operators anyway?! What can we do? Well, this is why I started to write the objeq library.

var followersForMe = objeq("%1 in follows");

This is great! Almost perfectly clear. You want to filter data based on some parameter being in a follows property. But what the hell is %1? Can't we add some more semantic meaning to it?

var followersForMe = objeq(function(me) {/*
  %me in follows
*/});

Yes, yes you can.

If follows doesn't exist in the User object then the object will simply be skipped. Also, if follows isn't an Array, objeq will do some extra magic by checking for direct equality between the parameter and property.
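To make those two rules concrete, here is a rough plain-JavaScript sketch of the matching behavior just described. It is only an illustration of the rules, not objeq's actual implementation: matchesFollows and followersOf are hypothetical names, and the use of strict equality (===) for the non-Array case is an assumption.

```javascript
// Plain-JavaScript sketch of the matching rules described above.
// matchesFollows and followersOf are hypothetical helpers, not part of objeq.
function matchesFollows(user, me) {
  if ( user === null || typeof user !== 'object' || !('follows' in user) ) {
    return false; // no follows property: the object is skipped
  }
  var follows = user.follows;
  if ( Array.isArray(follows) ) {
    return follows.indexOf(me) !== -1; // Array: membership test
  }
  return follows === me; // anything else: direct equality (assumed strict)
}

function followersOf(users, me) {
  return users.filter(function(user) {
    return matchesFollows(user, me);
  });
}

var users = [
  { name: 'ann', follows: ['bob', 'cat'] },
  { name: 'bob' },
  { name: 'cat', follows: 'bob' }
];
var names = followersOf(users, 'bob').map(function(user) {
  return user.name;
});
console.log(names); // [ 'ann', 'cat' ]
```

That is roughly twenty lines of hand-written checks standing in for what the one-line query expresses.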

And this is the tip of the iceberg. How about creating a pre-digested set instead of having to call the function for each user?

var allFollowers = objeq(function() {/*
  select {
    me: this as %me,
    followers: [
      %data where %me in follows
    ]
  }
*/});

Or even better, why don't we just add the followers directly to the resulting user?

var allFollowers = objeq(function() {/*
  extend this as %me, {
    followers: [
      %data where %me in follows
    ]
  }
*/});

In both of these examples, a variable called %data was referenced. Where did it come from? Quite simply, junqi provides a few built-in variables, and %data is one of them. Specifically, %data points to the original data set (as in this case) or to the data set that was passed into a subquery.
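For comparison, this is roughly what the extend query above computes when written by hand. Treat it as a sketch of the intended result rather than a description of how junqi evaluates the query: the sample users are made up, follows is assumed to hold references to other user objects, and copying each user instead of mutating it is also an assumption.

```javascript
// Hypothetical plain-JavaScript equivalent of the "extend" query above.
// It illustrates the intended result, not how junqi evaluates the query.
function allFollowers(data) {
  return data.map(function(me) {
    // Shallow-copy the user; whether junqi copies or mutates is an assumption
    var result = {};
    Object.keys(me).forEach(function(key) {
      result[key] = me[key];
    });
    // The subquery filters the original data set, which is what %data refers to
    result.followers = data.filter(function(user) {
      var follows = user.follows;
      if ( Array.isArray(follows) ) {
        return follows.indexOf(me) !== -1;
      }
      return follows === me;
    });
    return result;
  });
}

// Made-up sample data: follows holds references to other user objects
var ann = { name: 'ann' };
var bob = { name: 'bob', follows: [ann] };
var cat = { name: 'cat', follows: ann };
var extended = allFollowers([ann, bob, cat]);
var annFollowers = extended[0].followers.map(function(user) {
  return user.name;
});
console.log(annFollowers); // [ 'bob', 'cat' ]
```

Note that the inner filter runs over the full data array for every user, which is the job %data does inside the query.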

Subqueries, you might have guessed, look like this in objeq:

[ %data where %me in follows ]

First comes the expression that provides the data for the subquery. In this case it's %data, but it could be any valid expression, such as someArray or ['default']. After that comes the actual subquery processing, which can be anything you'd do in a top-level objeq query. Subqueries are always surrounded by Array Literal brackets to reinforce the idea that all data going in and out of an objeq query is an Array.
