Skip to content

Secondary Indexes

Kirill Serebriakov edited this page Mar 29, 2016 · 23 revisions

There's an extension for that.

A key/value database, in and of itself, is not all that flexible. There are times when you need to find items based on criteria other than the key. And there are times when you may need to perform calculations across a number of items, such as a 'sum' or 'avg'. That's where secondary indexes come in. A secondary index gives you access to the SQL tools you're used to. This allows you to create indexes on important properties, and then query those indexes to find the item(s) you need, or perform aggregate calculations.

Extension options

There are multiple extensions that come with YapDatabase. It's important to understand what's available so that you can pick the best extension for the job. In particular, there is some overlap in functionality between each of these extensions:

For example, you could use a secondary index to index strings, and then perform searches on them. This is fine if you're searching for full & exact matches. But if you're doing google style searching for terms and such, then the FullTextSearch extension is better suited to your needs.

In a similar manner, there is overlap between views and secondary indexes. And this is important to understand:

  • A view allows you to specify something similar to a SELECT clause (but with blocks), and gives you a sorted subset of your items. This makes it perfect for acting as the datasource for a table. Or any other kind of SELECT type query you need to access on a regular basis. Further, the view automatically updates as the database changes, and also pumps out change-sets which can be used to animate changes to a tableView.

  • A secondary index gives you one or more indexes in SQLite, which can then be queried directly to find or enumerate matching items.

Your specific use cases will dictate which extension is a better match for your needs. Keep in mind that you can use both extensions simultaneously. In fact, you can have multiple views and multiple secondary indexes. So you can make these decisions on a per-need basis.

A word to the wise:
If you're wanting to use YapDatabase as a datasource for a tableView/collectionView, then you should likely be using Views. Many people who are new to YapDatabase (but familiar with SQL) are quick to start using secondary indexes for just about everything. But Views were explicitly designed to help you drive a tableView/collectionView. In fact, they even help you drive animations! I know it's a new concept, and the wiki page is long (tl;dr)... but using YapDatabase without understanding Views is like riding a bike without knowing how to shift gears. So take a moment to learn about Views. I promise that YapDatabase is going to seem much more powerful & flexible once you learn how Views work.

Understanding extensions

Extensions work seamlessly behind the scenes. You don't have to change the way you write objects to the database. Extensions automatically take part in the transaction architecture, and automatically update themselves as you make changes to the database.

The end result is that your extensions can come and go with ease. You can change your extensions around every build-and-go as you're developing. But the objects in your database stay the same. And, just as important, the code you have that inserts, deletes & updates objects in the database can stay the same.

For a deeper understanding of the extensions architecture, see the Extensions article.

Creating secondary indexes

The secondary index extension works by creating one or more indexes within SQLite. But one of the nice things about secondary indexes in YapDatabase is that you can create these indexes on only a subset of your data. So if you have a million items in your database, but only a thousand employee objects, you could create a secondary index on employee.salary & employee.department, and only incur the overhead of an index with 1,000 entries.

The first step in creating a secondary index is to identify the properties you'd like to index. These translate into column names and column types.

#import "YapDatabaseSecondaryIndex.h"

YapDatabaseSecondaryIndexSetup *setup = [[YapDatabaseSecondaryIndexSetup alloc] init];
[setup addColumn:@"department" withType:YapDatabaseSecondaryIndexTypeText];
[setup addColumn:@"salary" withType:YapDatabaseSecondaryIndexTypeReal];

The column types are used to specify what type of data is being indexed. And the column names can be anything you choose.

The next step is to create a block which will extract the index information for a given row in the database.

YapDatabaseSecondaryIndexHandler *handler =
  [YapDatabaseSecondaryIndexHandler withObjectBlock:
  ^(YapDatabaseReadTransaction *transaction, NSMutableDictionary *dict, NSString *collection, NSString *key, id object)
{
    if ([object isKindOfClass:[Employee class]])
    {
        Employee *employee = (Employee *)object;
        
        dict[@"department"] = employee.department;
        dict[@"salary"] = @(employee.salary);
    }
    else
    {
        // Don't need to index this item.
        // So we simply don't add anything to the dict.
    }
}];

The first parameter to the block is a mutable dictionary. Just add the proper fields into the dictionary. In the example above, employee.department is a string. This matches the department column and type we specified. And employee.salary is a float or double. This matches the salary column and type we specified. (Of course we wrap it in NSNumber to stuff it in the dictionary.)

The other parameters to the block correspond to the row in the database. In this example we just needed the object. But it's also possible to index information in the metadata as well.

Next we create the secondary index instance by feeding it all the configuration from above.

YapDatabaseSecondaryIndex *secondaryIndex =
    [[YapDatabaseSecondaryIndex alloc] initWithSetup:setup handler:handler];

And lastly we need to plug the extension into the database system.

[database registerExtension:secondaryIndex withName:@"idx"];

Once plugged in, the extension will be available to use, and will automatically update itself. Remember that you can register extensions, and unregister extensions at any time. So extensions that you use on a regular basis should be registered on every app launch, as part of the database setup code. Otherwise you can bring up extensions on-the-fly as you're running the app.

For more information about registering & unregistering extensions, see the Extensions article.

Using secondary indexes

Once you've setup your secondary index, you can access it from within any transaction. For example:

[connection readWithBlock:^(YapDatabaseReadTransaction *transaction) {

    YapDatabaseQuery *query = [YapDatabaseQuery queryWithFormat:@"WHERE salary >= 100000"];
    [[transaction ext:@"idx"] enumerateKeysMatchingQuery:query usingBlock:^(NSString *collection, NSString *key, BOOL *stop) {
        
        // ...
    }];
}];

There are two steps to using a secondary index:

  • creating the query
  • executing the query

If you're familiar with SQL, you'll notice that the query looks like a subset of a normal query. Rather than shield you from SQL, the door is left open for you. Everything that SQLite supports you can throw into the query. This includes ORDER BY and GROUP BY clauses if you want.

There are 2 types of queries:

  • standard queries (in which you want to enumerate a set results)
  • aggregate queries (in which you want to perform a calculation, like a sum, and get a result)

Standard Queries

For standard queries, you skip the SELECT clause. That is, your query will be automatically prepended with the proper SELECT clause that matches the enumerate method you call. In the example above we used enumerateKeysMatchingQuery:, so the full query looks something akin to this:

// You provide "WHERE salary >= 100000"
// You invoke 'enumerateKeysMatchingQuery'
// It automatically prepends "SELECT key FROM database" in order to get:

SELECT key FROM database WHERE salary >= 100000

And then the enumerate method handles stepping over the results and invoking the given block.

You can enumerate over whatever components of the row you want. Such as the keys and objects:

[connection readWithBlock:^(YapDatabaseReadTransaction *transaction) {
		
    YapDatabaseQuery *query =
        [YapDatabaseQuery queryWithFormat:@"WHERE department = ? ORDER by salary DESC", department];

    [[transaction ext:@"idx"] enumerateKeysAndObjectsMatchingQuery:query
                                                        usingBlock:^(NSString *collection, NSString *key, id object, BOOL *stop) {
        Employee *employee = (Employee *)object;
        // ...
    }];
}];

Aggregate Queries

Aggregate queries allow you to perform functions such as:

  • avg
  • max
  • min
  • sum
  • group_concat

For more information on SQLite's support for aggregate functions:
https://www.sqlite.org/lang_aggfunc.html

Creating an aggregate query is almost identical to creating a standard query. The only difference being that you also supply the aggregate function you wish to use. Such as "SUM(salary)".

[connection readWithBlock:^(YapDatabaseReadTransaction *transaction) {

    // Figure out how much the "marketing" department is costing the business

    YapDatabaseQuery *query = [YapDatabaseQuery queryWithAggregateFunction:@"SUM(salary)"
                                                                    format:@"WHERE department = ?", @"marketing"];
    NSNumber *cost = [[transaction ext:@"idx"] performAggregateQuery:query];
}];

Query Parameters

YapDatabaseQuery supports query parameters. This works in almost the exact same way as it does in SQLite:

[connection readWithBlock:^(YapDatabaseReadTransaction *transaction) {

    YapDatabaseQuery *query = nil;

    // Find 28% federal tax bracket employees:

    double minSalary =  87850;
    double maxSalary = 183250;

    query = [YapDatabaseQuery queryWithFormat:@"WHERE salary > ? AND salary <= ?",
                                                    @(minSalary), @(maxSalary)];
    [[transaction ext:@"idx"] enumerateKeysMatchingQuery:query usingBlock:^(NSString *collection, NSString *key, BOOL *stop) {
    
        // ...
    }];

    // Find engineers in the 28% federal tax bracket

    NSString *department = @"engineering";

    query = [YapDatabaseQuery queryWithFormat:@"WHERE salary > ? AND salary <= ? AND department == ?",
                                                    @(minSalary), @(maxSalary), department];
    [[transaction ext:@"idx"] enumerateKeysMatchingQuery:query usingBlock:^(NSString *collection, NSString *key, BOOL *stop) {

        // ...
    }];
}];

All the same rules that apply to query syntax in SQLite apply here. The primary difference is:

  • All query parameters need to be objective-c objects (so just wrap numbers in NSNumber)

You're highly encouraged to use query parameters. That is, specifying '?' for a value, and then passing the value separately (as above). Doing so means you don't have to worry about escaping strings, and other such caveats. But also because there's an overhead to compile the query string you pass into an executable routine by the sqlite engine. The YapDatabase layer uses some caching optimizations to skip this overhead when you use the same query string multiple times. That is, if the query string doesn't change, then the YapDatabase layer can better optimize automatically for you. Thus using query parameters means the query string remains the same, even when you pass different values.

You can’t perform that action at this time.