Bucketed Range Management

isubiker edited this page Dec 13, 2011 · 7 revisions

A bucketed range is very similar to a regular range, except that it allows values in the range to be grouped together into "buckets". A common use case for a bucketed range is grouping the prices of items. For example, suppose the database contains a number of products whose prices range from $1 to $250. A regular range on the price will enumerate every unique price in the database, which is unlikely to make for a good user interface. A bucketed range instead allows these items to be put into larger groups such as "Less than $5", "$5 to $25", "$25 to $100" and "Over $100".

For datatypes like strings and numbers, defining the buckets at creation time is sufficient. Dates present a different problem. It is possible to define the buckets for a date-based range manually, but because time keeps moving forward, a fixed set of buckets eventually goes stale and a smarter approach becomes desirable. This is where automatically created buckets come in. Presently they can only be defined for ranges where the datatype is a JSON date, an XML date or an XML dateTime.

Create a bucketed range

Creates a bucketed range in the database. A bucketed range cannot share its name with a regular range, field or mapping; the name must be unique.

Manually defined buckets

You must specify a JSON key, XML element or XML element/attribute to create the range on, the datatype (see supported range datatypes) and a string describing the buckets.

  • Endpoint: /manage/bucketedrange/<bucketedrange-name>
  • Request type: POST
  • Parameters:
    • key (optional) - A JSON key to build the range on
    • element (optional) - An XML element to build the range on
    • attribute (optional) - An XML attribute to build the range on. Note: must specify the element that the attribute is on.
    • type - The datatype for the bucketed range. Please see supported range datatypes for a list of valid types.
    • collation - The collation to use for string range indexes.
    • buckets - A string defining the buckets. It is a sequence of alternating labels and values, separated by a vertical bar ("|"), in the format: <label>|<value>|<label>|<value>|…|<label>|<value>|<label>. The values define the upper and lower bounds of the buckets; the labels give a human readable description of what is inside those bounds. The string must start and end with a label.
  • Returns:
    • On success a 200 is returned with an empty response body
    • If a range, field or mapping with this name already exists, a 400 is returned
    • If parameters are missing or invalid, a 400 is returned
  • Example:
    • /manage/bucketedrange/price?key=itemPrice&type=number&buckets=Less than $5|5|$5 to $25|25|$25 to $100|100|Over $100
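
The buckets string is easy to get wrong by hand, so the following is a minimal sketch of assembling it and the request URL in Python. The function names and the base URL passed in are hypothetical, the parameter name `type` follows the parameter list above, and because escaping of "|" inside labels is not documented, the sketch rejects such labels rather than guessing.

```python
from urllib.parse import urlencode


def build_buckets_string(labels, values):
    """Interleave labels and boundary values: label|value|label|...|label."""
    if len(labels) != len(values) + 1:
        raise ValueError("need exactly one more label than boundary values")
    # Escaping of "|" inside a label is undocumented, so refuse instead of guessing.
    if any("|" in label for label in labels):
        raise ValueError("labels containing '|' are not handled in this sketch")
    parts = [labels[0]]
    for value, label in zip(values, labels[1:]):
        parts.append(str(value))
        parts.append(label)
    return "|".join(parts)


def manual_bucketedrange_url(base_url, name, key, datatype, labels, values):
    """Build the URL to POST to for a manually bucketed range on a JSON key."""
    query = urlencode({
        "key": key,
        "type": datatype,
        "buckets": build_buckets_string(labels, values),
    })
    return f"{base_url}/manage/bucketedrange/{name}?{query}"
```

Calling `manual_bucketedrange_url` with the labels and values from the example above yields a URL with the buckets string percent-encoded, which avoids problems with the spaces, "$" and "|" characters in the raw form.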

Automatically defined buckets

You must specify a JSON key, XML element or XML element/attribute to create the range on, the datatype, the size of each bucket (bucketInterval), a starting date and optionally an ending date.

  • Endpoint: /manage/bucketedrange/<bucketedrange-name>
  • Request type: POST
  • Parameters:
    • key (optional) - A JSON key to build the range on
    • element (optional) - An XML element to build the range on
    • attribute (optional) - An XML attribute to build the range on. Note: must specify the element that the attribute is on.
    • type - The datatype of the JSON key, XML element or XML element/attribute. For JSON keys the type must be "date"; for XML elements and element/attributes the type can be either "date" or "dateTime".
    • collation - The collation to use for string range indexes.
    • bucketInterval - The size of each bucket. Must be one of: decade, year, quarter, month, week, day, hour or minute.
    • startingAt - An ISO 8601 dateTime (eg: 1970-01-01T00:00:00). This date will be used to construct the upper bound for the first bucket.
    • stoppingAt (optional) - An ISO 8601 dateTime (eg: 1990-12-31T23:59:59). This date will be used to construct the lower bound on the last bucket. If one is not provided, the current date and time will be used.
    • firstFormat - A strftime string that will be used to format the label for the first bucket
    • format - A strftime string that will be used to format the label for all the buckets except the first and last
    • lastFormat - A strftime string that will be used to format the label for the last bucket
  • Returns:
    • On success a 200 is returned with an empty response body
    • If a range, field or mapping with this name already exists, a 400 is returned
    • If parameters are missing or invalid, a 400 is returned
  • Examples:
    • /manage/bucketedrange/publishedIn?key=pub::date&type=date&bucketInterval=year&startingAt=1970-01-01T00:00:00
    • /manage/bucketedrange/publishedIn?key=pub::date&type=date&bucketInterval=year&startingAt=1970-01-01T00:00:00&stoppingAt=1990-12-31T23:59:59
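
As a rough illustration of the parameters above, the sketch below builds the request URL for an automatically bucketed range on a JSON key and checks the interval name before anything is sent. The function name and base URL are hypothetical, the parameter name `type` follows the parameter list, and the label format parameters are left out for brevity.

```python
from urllib.parse import urlencode

# Interval names as listed for the bucketInterval parameter.
VALID_INTERVALS = {
    "decade", "year", "quarter", "month", "week", "day", "hour", "minute",
}


def auto_bucketedrange_url(base_url, name, key, datatype, bucket_interval,
                           starting_at, stopping_at=None):
    """Build the URL to POST to for an automatically bucketed date range."""
    if bucket_interval not in VALID_INTERVALS:
        raise ValueError(f"unsupported bucketInterval: {bucket_interval!r}")
    params = {
        "key": key,
        "type": datatype,
        "bucketInterval": bucket_interval,
        "startingAt": starting_at,
    }
    # stoppingAt is optional; when omitted the server uses the current time.
    if stopping_at is not None:
        params["stoppingAt"] = stopping_at
    return f"{base_url}/manage/bucketedrange/{name}?{urlencode(params)}"
```

Leaving `stopping_at` unset corresponds to the first example, where the range keeps growing with the current date; passing it corresponds to the second.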

Get a list of all configured bucketed ranges

To get a list of all the bucketed ranges that are configured, make a GET request to:

/manage/bucketedranges

Get info about a bucketed range

Returns information about how a bucketed range is configured.

  • Endpoint: /manage/bucketedrange/<bucketedrange-name>
  • Request type: GET
  • Returns:
    • If the bucketed range exists, a 200 is returned with the response body being a JSON document that describes the bucketed range configuration
    • If the bucketed range does not exist, a 404 is returned
  • Example:
    • /manage/bucketedrange/price

Sample response from a bucketed range with manually defined buckets

{
    "name": "price",
    "key": "itemPrice",
    "type": "number",
    "buckets": [
        "Less than $5", 5, "$5 to $25", 25, "$25 to $100", 100, "Over $100"
    ]
}
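
To show how the alternating "buckets" array in this response can be read, here is a small Python sketch that finds the label for a given value. The function names are hypothetical, and the treatment of a value that sits exactly on a boundary is an assumption: the page does not say which bucket such a value belongs to, so the sketch places it in the higher one.

```python
from bisect import bisect_right


def split_buckets(buckets):
    """Split the alternating list into (labels, boundaries)."""
    labels = buckets[0::2]
    boundaries = buckets[1::2]
    if len(labels) != len(boundaries) + 1:
        raise ValueError("bucket list must start and end with a label")
    return labels, boundaries


def label_for(value, buckets):
    """Return the label of the bucket that contains value."""
    labels, boundaries = split_buckets(buckets)
    # Assumption: a value equal to a boundary falls into the higher bucket.
    return labels[bisect_right(boundaries, value)]


buckets = ["Less than $5", 5, "$5 to $25", 25, "$25 to $100", 100, "Over $100"]
print(label_for(3, buckets))    # Less than $5
print(label_for(250, buckets))  # Over $100
```

This only mirrors the grouping on the client side for display or testing; the server performs the actual bucketing.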

Sample response from a bucketed range with automatically created buckets

{
    "name": "publishedIn",
    "key": "pub::date",
    "type": "date",
    "bucketInterval": "year",
    "startingAt": "1970-01-01T00:00:00"
}

Delete a bucketed range

Deletes the bucketed range.

  • Endpoint: /manage/bucketedrange/<bucketedrange-name>
  • Request type: DELETE
  • Returns:
    • If the bucketed range was deleted, a 200 is returned with an empty response body.
    • If the bucketed range does not exist, a 404 is returned
  • Example:
    • /manage/bucketedrange/price
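
The GET and DELETE calls described above differ only in the HTTP method, so a client can share one helper for both. The following is a sketch using Python's standard library; the function names and the base URL are hypothetical, and no authentication is shown because this page does not describe any.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def bucketedrange_request(base_url, name, method):
    """Build a GET or DELETE request for a single bucketed range."""
    if method not in ("GET", "DELETE"):
        raise ValueError("method must be GET or DELETE")
    return Request(f"{base_url}/manage/bucketedrange/{name}", method=method)


def send(request):
    """Send the request and return the HTTP status code.

    urlopen raises HTTPError for a 404, so it is caught and its code
    is returned, letting the caller compare against 200 or 404 directly.
    """
    try:
        with urlopen(request) as response:
            return response.status
    except HTTPError as error:
        return error.code
```

For example, `send(bucketedrange_request(base_url, "price", "DELETE"))` would be expected to return 200 if the range was deleted and 404 if it did not exist.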

Discussion

What if the bucket name includes a vertical bar? We escaped somehow. Should document.

Should this work with xs:date in addition to xs:dateTime?