
Search Keyword: min

William W. Kimball, Jr., MBA, MSIS edited this page May 5, 2021 · 1 revision
  1. Introduction
    1. Syntax
    2. Sample Data
  2. Minimum from Groups of Hashes
  3. Minimum from 1-Dimensional Arrays
  4. Dealing With Bad Data
  5. Bonus: Second Minimum

Introduction

The [min([NAME])] search keyword is the opposite of the [max([NAME])] search keyword; it will match child nodes which have -- or do not have, when inverted with ! -- the minimum value in either:

  • a named child key shared by multiple immediate-child Hashes (maps/dicts); or
  • an entire present single-dimension Array (sequence/list).

Zero to many matches are made depending on how many nodes are evaluated and how many of those contain the minimum value. So, it is possible for this keyword to match more than one node when they share the same minimum value. When you really need only one match, you can use a Collector and Array Element index to filter the result to only the first match, demonstrated below.

Remember to place this keyword so that it operates against the parent node of all children that are being evaluated. Against Hash data, each child must have the named key; child nodes missing the key are ignored. An exception is raised when this keyword is placed so that it would be forced to evaluate each child in isolation. The examples below will demonstrate what this looks like.
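
The matching rules described above can be sketched in plain Python. This is only an illustration of the documented behavior, not yamlpath's implementation; the function name min_matches and the inline data are hypothetical.

```python
from typing import Any, Dict, List


def min_matches(children: List[Dict[str, Any]], name: str) -> List[Dict[str, Any]]:
    """Return every child whose `name` value equals the smallest one found."""
    # Children that lack the named key are ignored.
    candidates = [c for c in children if name in c]
    if not candidates:
        return []
    lowest = min(c[name] for c in candidates)
    # More than one child can share the minimum, so a list is returned.
    return [c for c in candidates if c[name] == lowest]


# Hypothetical data: two children tie for the minimum and one lacks the key.
children = [
    {"product": "doohickey", "price": 4.99},
    {"product": "fob", "price": 0.98},
    {"product": "widget", "price": 0.98},
    {"product": "unknown"},
]

matches = min_matches(children, "price")
print(matches)      # both children priced at 0.98
print(matches[:1])  # only the first match
```

Slicing the result to its first element plays the role that a Collector with an Array Element index plays in a real query.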

Syntax

[min([NAME])] accepts up to one parameter, NAME. This parameter:

  • is mandatory when evaluating Hash (map/dict) peers, specifying the exact -- case-sensitive -- name of the required child key; and
  • must not be present when evaluating Array (sequence/list) elements.

Sample Data

The example commands below use the same sample data as the [max([NAME])] search keyword examples, a file named max-examples.yaml:

---
# Consistent Data Types
prices_aoh:
  - product: doohickey
    price: 4.99
  - product: fob
    price: 4.99
  - product: whatchamacallit
    price: 9.95
  - product: widget
    price: 0.98
  - product: unknown

prices_hash:
  doohickey:
    price: 4.99
  fob:
    price: 4.99
  whatchamacallit:
    price: 9.95
  widget:
    price: 0.98
  unknown:

prices_array:
  - 4.99
  - 4.99
  - 9.95
  - 0.98
  - null

# Inconsistent Data Types
bare: value

bad_prices_aoh:
  - product: doohickey
    price: 4.99
  - product: fob
    price: not set
  - product: whatchamacallit
    price: 9.95
  - product: widget
    price: true
  - product: unknown

bad_prices_hash:
  doohickey:
    price: 4.99
  fob:
    price: not set
  whatchamacallit:
    price: 9.95
  widget:
    price: true
  unknown:

bad_prices_array:
  - 4.99
  - not set
  - 9.95
  - 0.98
  - null

Minimum from Groups of Hashes

There are two groups of Hashes (maps/dicts) in the sample data:

  • Arrays of Hashes (sequence-of-maps / list-of-dicts) at prices_aoh and bad_prices_aoh; and
  • Hashes of Hashes (map-of-maps / dict-of-dicts) at prices_hash and bad_prices_hash.

This section will ignore the bad_* data, which is demonstrated later to show how bad data impacts the outcome of this search keyword.

As data expression strategies, each type of Hash grouping has pros and cons versus the other type. The [min(NAME)] Search Keyword can handle both in the same way.

Any search against either type occurs at the parent node of the grouping. In the sample data above, that would be the prices_aoh or prices_hash nodes, like so:

$ yaml-get --query='/prices_aoh[min(price)]' max-examples.yaml
{"product": "widget", "price": 0.98}

$ yaml-get --query='/prices_hash[min(price)]' max-examples.yaml
{"price": 0.98}

Should you want only the minimum price value or only the name of the matching "product", add another Hash Key Segment to the query. To get the name from the Hash of Hashes, use a [name()] Search Keyword instead, because its children are uniquely identified by their key names rather than by the value of their own child identifier key:

$ yaml-get --query='/prices_aoh[min(price)]/price' max-examples.yaml
0.98

$ yaml-get --query='/prices_hash[min(price)]/price' max-examples.yaml
0.98

$ yaml-get --query='/prices_aoh[min(price)]/product' max-examples.yaml
widget

$ yaml-get --query='/prices_hash[min(price)][name()]' max-examples.yaml
widget
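
To show why the Hash of Hashes needs the key name rather than a child value, here is a minimal plain-Python sketch. It works on an inline copy of the prices_hash sample data; the function name min_key_names is hypothetical and the code does not use or reproduce yamlpath itself.

```python
from typing import Any, Dict, List

# Inline copy of the good Hash-of-Hashes sample data.
prices_hash: Dict[str, Any] = {
    "doohickey": {"price": 4.99},
    "fob": {"price": 4.99},
    "whatchamacallit": {"price": 9.95},
    "widget": {"price": 0.98},
    "unknown": None,
}


def min_key_names(parent: Dict[str, Any], name: str) -> List[str]:
    """Return the key name of every child holding the smallest `name` value."""
    # Only children that are Hashes containing the named key are compared.
    usable = {
        key: child[name]
        for key, child in parent.items()
        if isinstance(child, dict) and name in child
    }
    if not usable:
        return []
    lowest = min(usable.values())
    return [key for key, value in usable.items() if value == lowest]


print(min_key_names(prices_hash, "price"))  # ['widget']
```

The product identity lives in the parent's keys here, which is the information [name()] returns in the query above.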

Minimum from 1-Dimensional Arrays

In the sample data, 1-dimensional Arrays are represented by prices_array and bad_prices_array. For this simple demonstration, we'll look at the good data.

Since 1-dimensional Arrays don't have any keys, searching them uses the empty parameter-list form of the search keyword, [min()]. Such a search must be performed at the parent node of all elements under consideration, like so:

$ yaml-get --query='prices_array[min()]' max-examples.yaml
0.98
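
The same selection can be sketched in plain Python against an inline copy of prices_array. The function name array_min_elements is hypothetical, and skipping null elements is an assumption made here to agree with the output shown above, not a statement about yamlpath internals.

```python
from typing import Any, List

# Inline copy of the good 1-dimensional Array sample data.
prices_array: List[Any] = [4.99, 4.99, 9.95, 0.98, None]


def array_min_elements(elements: List[Any]) -> List[Any]:
    """Return every element equal to the smallest comparable element."""
    # Assumption: null (None) elements are left out of the comparison.
    usable = [e for e in elements if e is not None]
    if not usable:
        return []
    lowest = min(usable)
    return [e for e in usable if e == lowest]


print(array_min_elements(prices_array))  # [0.98]
```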

Dealing With Bad Data

This search keyword will do its best to coalesce incompatible data-types so that they can be compared. No assumptions are made about the data. When all elements under comparison are the same data-type -- numbers with numbers, text with text, and such -- the result is predictable. However, when incompatible data-types are compared together, the results may be unexpected or seem unnatural because such comparisons are performed against their String equivalents. In this case, the character sorting locale of your Python run-time will dictate which String values are considered greater than others.

Take a look at what happens when the minimum of incompatible data is calculated against the bad sample data:

$ yaml-get --query='bad_prices_aoh[min(price)]' max-examples.yaml
{"product": "widget", "price": true}

$ yaml-get --query='bad_prices_hash[min(price)]' max-examples.yaml
{"price": true}

$ yaml-get --query='bad_prices_array[min()]' max-examples.yaml
0.98

In the first two results, the lowest calculated value was the Boolean value, true. This is a Pythonic result, which places Boolean values less than other data-types. The third query was against an Array which didn't have a Boolean value. After comparing all values, the minimum was determined to be 0.98. Such a result against bad data could be a false-positive/negative and should always be taken in context of the quality of the source data.
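
The ordering behaviors mentioned above can be observed in plain Python. This sketch shows only standard Python comparison rules as a way to reason about such results; it does not reproduce yamlpath's own coalescing logic, and the variable names are hypothetical.

```python
# bool is a subclass of int, so True compares as the number 1.
numeric_min = min([4.99, 9.95, True])
print(numeric_min)  # True, because 1 < 4.99

# Numbers and text cannot be ordered directly in Python 3.
raised = False
try:
    min([4.99, "not set"])
except TypeError:
    raised = True
print(raised)  # True

# Comparing String equivalents always works, but it orders values
# character by character, so digits sort before letters.
as_text = [str(v) for v in [4.99, "not set", 9.95, 0.98]]
text_min = min(as_text)
print(text_min)  # 0.98
```

When a result looks surprising, checking the data-types of the compared values in this way is usually the quickest diagnosis.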

It should also be noted that the minimum of any single element is itself and this cannot be inverted:

$ yaml-get --query='bare[min()]' max-examples.yaml
value

$ yaml-get --query='bare[!min()]' max-examples.yaml
CRITICAL:  Required YAML Path does not match any nodes, 'bare[!min()]'.
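
A short plain-Python sketch shows why inverting the search against a single element can never match. The function name split_by_min is hypothetical and the code is an illustration only, not yamlpath's implementation.

```python
from typing import Any, List, Tuple


def split_by_min(elements: List[Any]) -> Tuple[List[Any], List[Any]]:
    """Split elements into those equal to the minimum and all the rest."""
    lowest = min(elements)
    matches = [e for e in elements if e == lowest]
    inverted = [e for e in elements if e != lowest]
    return matches, inverted


# A lone value is its own minimum, so nothing is left for the inverted set.
matches, inverted = split_by_min(["value"])
print(matches)   # ['value']
print(inverted)  # []
```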