Secondary Indexes

There's an extension for that.

A collection/key/value database has limitations. That's why YapDatabase comes with a full suite of extensions.

Multiple options

There's some overlap in functionality between each of these extensions:

A view allows you to specify something akin to a SELECT clause, and gives you a sorted subset of your items. This makes it perfect for acting as the datasource for a table. Or any other kind of SELECT type query you need to access on a regular basis. Further, the view automatically updates as the database changes, and also pumps out change-sets which can be used to animate changes to a tableView.
A secondary index gives you one or more indexes in SQLite, which can then be queried directly to find or enumerate matching items. This is appropriate when you need to regularly search for items based on certain criteria.
And full text search works best when you're doing google-style searching for terms and such.

Your specific use cases will dictate which extension is a better match for your needs. Keep in mind that you can use multiple extensions simultaneously. In fact, you can have multiple views and multiple secondary indexes. So you can make these decisions on a per-need basis.

If all you have is a hammer, everything looks like a nail.

A word to the wise:
If you're wanting to use YapDatabase as a datasource for a tableView/collectionView, then you should likely be using Views. Many people who are new to YapDatabase (but familiar with SQL) are quick to start using secondary indexes for just about everything. But Views were explicitly designed to help you drive a tableView/collectionView. In fact, they even help you drive animations! I know it's a new concept, and there's a lot to learn... but using YapDatabase without understanding Views is like riding a bike without knowing how to shift gears. So take a moment to learn about Views. I promise that YapDatabase is going to seem much more powerful & flexible once you learn how Views work.

Creating secondary indexes

The first step is to identify the properties you'd like to index. These translate into column names and column types.

let setup = YapDatabaseSecondaryIndexSetup()
setup.addColumn("department", with: .text)
setup.addColumn("salary", with: .real)

The column names can be anything you choose. And the column types are used to specify what type of data is being indexed.

The next step is to create a block which will extract the index information for a given row in the database.

let handler: YapDatabaseSecondaryIndexHandler =
  YapDatabaseSecondaryIndexHandler.withObjectBlock {
  (transaction, dict, collection, key, object: Any) in

  // Just add the proper key/vales to the given dictionary.
  
  if let employee = object as? Employee {
    dict["department"] = employee.department
    dict["salary"] = employee.salary
  }
  else {
    // Don't need to index this item.
    // So we simply don't add anything to the dict.
  }
}

In the example above, employee.department is a string, and employee.salary is a float or double. This matches the columns and types we specified.

The other parameters to the block correspond a particular row in the database.

Finally, we create the secondary index instance by feeding it all the configuration from above. And then we plug the extension into the database system.

// The 'versionTag' is a string that tells the SecondaryIndex if you've
// made changes to the handler since last app launch.
// If the versionTag hasn't changed, the SecondaryIndex doesn't need to do anything.
// But if the versionTag HAS changed, the SecondaryIndex will automatically re-populate
// itself. That is, it will:
// - flush its internal tables
// - enumerate the database, and invoke your handler block to see
//   if the row should be indexed
// 
// In other words, if you make changes to your handler code,
// then just change your versionTag, and the extension will do the right thing.
// 
let versionTag = "2019-05-20" // <= change me if you modify handler

let options = YapDatabaseSecondaryIndexOptions()
// You can optionally limit the collections your extension uses.
// For example, if your SecondaryIndex will only contain items from a few collections,
// (say "Foo" & "Bar"), then you can specify a whitelist of collections:
/*
let whitelist = Set(["Foo", "Bar"])
options.allowedCollections = YapWhitelistBlacklist(whitelist: whitelist)
*/
// What you're saying here is:
//   My View will ONLY include items from collections 'Foo' & 'Bar'.
//   So if you see rows in other collections, just act as if you invoked my
//   handler, and it returned an empty dictionary.
//   But don't actually bother invoking my handler.
// 
// This is most helpful when you're:
// - creating a secondary index for the first time
// - OR re-populating a secondary index due to a versionTag change
// - AND the database has a LOT of item in it
// - AND you're indexing a small subset of the entire database
// 
// This allows the extension to enumerate a small subset of the database
// when initializing or re-populating.

let secondaryIndex =
  YapDatabaseSecondaryIndex(setup: setup,
                          handler: handler,
                       versionTag: versionTag,
                          options: options)

let extName = "idx" // <= Name we will use to access this extension
database.asyncRegister(secondaryIndex, withName: extName) {(ready) in
			
  if !ready {
    print("Error registering \(extName) !!!")
  }
}

Once plugged in, the extension will be available to use, and will automatically update itself.

One of the nice things about secondary indexes in YapDatabase is that you can create these indexes on only a subset of your data. So if you have a million items in your database, but only a thousand employee objects, you could create a secondary index as above, and only incur the overhead of an index with 1,000 entries.

Using secondary indexes

Once you've setup your secondary index, you can access it from within any transaction. For example:

dbConnection.read {(transaction) in

  if let idxTransaction = transaction.ext("idx") as? YapDatabaseSecondaryIndexTransaction {
  
    let query = YapDatabaseQuery(string:"WHERE salary >= 10000")
    idxTransaction.iterateKeys(matching: query) {(collection, key, stop) in
      // ...
    }
  }
}

There are two steps to using a secondary index:

creating the query
executing the query

If you're familiar with SQL, you'll notice that the query looks like a subset of a normal query. Rather than shield you from SQL, the door is left open for you. Everything that SQLite supports you can throw into the query. This includes ORDER BY and GROUP BY clauses if you want.

There are 2 types of queries:

standard queries (in which you want to enumerate a set results)
aggregate queries (in which you want to perform a calculation, like a sum, and get a result)

Standard Queries

For standard queries, you skip the SELECT clause. That is, your query will be automatically prepended with the proper SELECT clause:

// You provide this:
let query = YapDatabaseQuery(string: "WHERE salary >= ?", parameters: [min])

// It prepends this:
let prefix = "SELECT * FROM 'properTableNameHere' "

And then the iterate method handles stepping over the results and invoking the given block.

You can iterate over whatever components of the row you want:

dbConnection.read {(transaction) in

  guard let idxTransaction = transaction.ext("idx") as? YapDatabaseSecondaryIndexTransaction else {
    return
  }
  
  let query = YapDatabaseQuery(string: "WHERE department = ? ORDER by salary DESC", parameters: [department])

  // if you only need the (collection, key) tuples:
  idxTransaction.iterateKeys(matching: query) {(collection, key, stop) in
    
  }
  
  // if you want the object:
  idxTransaction.iterateKeysAndObjects(matching: query) {(collection, key, obj, stop) in
    if let employee = obj as? Employe {/*...*/}
  }
  
  // if you want the entire row:
  idxTransaction.iterateRows(matching: query) {(collection, key, obj, metadata, stop) in
    if let employee = obj as? Employe {/*...*/}
  }
}

Aggregate Queries

Aggregate queries allow you to perform functions such as:

avg
max
min
sum
group_concat

For more information on SQLite's support for aggregate functions:
https://www.sqlite.org/lang_aggfunc.html

Creating an aggregate query is almost identical to creating a standard query. The only difference being that you also supply the aggregate function you wish to use. Such as "SUM(salary)".

dbConnection.read {(transaction) in

  guard let idxTransaction = transaction.ext("idx") as? YapDatabaseSecondaryIndexTransaction else {
    return
  }
  
  // Figure out how much the "marketing" department is costing the business

  let query =
    YapDatabaseQuery(aggregateFunction: "SUM(salary)",
                                string: "WHERE department = ?",
                            parameters: ["marketing"])
  
  let cost = idxTransaction.performAggregateQuery(query)
}

Query Parameters

YapDatabaseQuery supports query parameters. This works in almost the exact same way as it does in SQLite:

dbConnection.read {(transaction) in

  guard let idxTransaction = transaction.ext("idx") as? YapDatabaseSecondaryIndexTransaction else {
    return
  }
  
  // Find 28% federal tax bracket employees:

  let minSalary =  87_850
  let maxSalary = 183_250

  var query =
    YapDatabaseQuery(string: "WHERE salary > ? AND salary <= ?",
                 parameters: [minSalary, maxSalary])
    
  idxTransaction.iterateKeys(matching: query) {(collection, key, stop) in
    // ...
  }

  // Find engineers in the 28% federal tax bracket

  let department = @"engineering";

  query =
    YapDatabaseQuery(string: "WHERE salary > ? AND salary <= ? AND department == ?",
                 parameters: [minSalary, maxSalary, department])
  
  idxTransaction.iterateKeys(matching: query) {(collection, key, stop) in
    // ...
  }
}

All the same rules that apply to query syntax in SQLite apply here.

You're highly encouraged to use query parameters. That is, specifying '?' for a value, and then passing the value separately (as above). Doing so means you don't have to worry about escaping strings, and other such caveats. But also because there's an overhead to compile the query string you pass into an executable routine by the sqlite engine. The YapDatabase layer uses some caching optimizations to skip this overhead when you use the same query string multiple times. That is, if the query string doesn't change, then the YapDatabase layer can better optimize automatically for you. Thus using query parameters means the query string remains the same, even when you pass different values.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly