/ mendel Public

Mendel breeds the best combinations of N lists without building all possible combinations.

# nathanl/mendel

Switch branches/tags
Nothing to show

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

## Files

Failed to load latest commit information.
Type
Name
Commit time

# Mendel

Mendel breeds the best combinations of sorted lists that you provide.

For example, suppose you have 100 shirts, 200 pairs of pants, and 50 hats, ordered by price. How could you find the 50 cheapest outfits?

A brute force approach would build all 1 million possibilities (100 * 200 * 50), sort by price, and take the best 50. An ideal solution would build the best 50 and stop.

Mendel gets much closer to the ideal by incrementally building candidates for the "next best" combination and using a priority queue to pull the best one at any given moment.

## How it Works

Mendel is easiest to explain for two lists. In that case, we can think of the combinations as a grid, where the X value is from the first list and the Y value is from the second. Inside the grid, we can represent combinations as the sum of the coordinate values.

The lists must be sorted by score. This means that the sums will increase (or remain constant) along one or both axes.

For example, imagine that these grids are landscapes, and the scores in the middle are elevations. Mendel chooses combinations like a tide, rising from the bottom left.

`````` +---+    +---+    +---+    +---+
1|555|   1|567|   3|777|   3|789|
1|555|   1|567|   2|666|   2|678|
1|555|   1|567|   1|555|   1|567|
+---+    +---+    +---+    +---+
444      456      444      456
``````

In every case, we are guaranteed that the bottom left corner - the best item from list Y combined with the best item from list X - has the lowest elevation. Beyond that, the next best combination could be at `0,1` or `1,0`; we don't know. All we can do is check them both and choose the best one. "Check them both" means producing a score, and to "choose the best one", Mendel uses a priority queue.

If we find that we've chosen `0,1`, before we return it, we add `1,1` and `0,2` to the queue. We don't know yet whether either of them is better than `1,0`, but next time we need a value, the priority queue will decide. So the water line continues to move up and to the right. Any coordinate "under water" has been returned, any coordinate above the water line has not yet been scored, and any coordinate the water line is just touching is a combination that's currently in the priority queue,

Run `rake visualize` to see this process in action.

Mendel does the same process for combinations of 3 or more lists, too. Imagining a 6-dimensional graph is beyond the author's cognitive abilities, but in principle, it's the same.

## Usage

Create a combiner class that knows how to score combinations of your items. Then provide lists of items, sorted in ascending value.

For example:

```# Simple lists of numbers. Any combination of these
# can be scored by adding them together
list1 = (1..100).to_a
list2 = (1.0..100.0).to_a

class NumericCombiner
include Mendel::Combiner
# Scores a combination from the two lists by adding them
def score_combination(numbers)
numbers.reduce(0) { |sum, number| sum += number }
end
end

nc = NumericCombiner.new(list1, list2)
nc.take(50) # The 50 best combinations```

Mendel will return two-item arrays of `[combination, score]`.

A combination of items is, by default, an array with one item from each list. However, if you like, you may specify how to build combinations of your items.

```defense_players = [{name: 'Jimmy', age: 10}, {name: 'Susan', age: 12}]
offense_players = [{name: 'Roger', age: 8},  {name: 'Carla',  age: 14}]

class FoosballTeam
attr_accessor :players

def initialize(*players)
self.players = players
end

def average_age
players.reduce(0){ |total, player|
total += player.fetch(:age)
} / 2.0
end
end

class TeamBuilder
include Mendel::Combiner

def build_combination(players)
FoosballTeam.new(*players)
end

def score_combination(team)
team.average_age
end
end

pc = TeamBuilder.new(defense_players, offense_players)
pc.take(2) # The youngest teams```

If you need to apply other criteria besides the score, use lazy enumeration and chain other calls:

`  pc.each.lazy.reject { |team, score| team.contains_siblings? }.take(50).to_a`

## Serialization and deserialization

`Mendel::Combiner` provides the instance methods `#dump` and `#dump_json` and the class methods `.load` and `.load_json`. This allows you to pause enumeration, save the data, and resume enumerating some time later.

## Caveats

1. Single Enumeration. For memory's sake, Mendel does not keep combinations it has returned to you. Combinations are built and flushed as you enumerate, so if you enumerate twice, there will be no data the second time; you will have to build a new combiner. If you need to keep the combinations, it is up to you to do so.
2. Memory. Producing ALL combinations of your lists in inherently expensive. Mendel shines at producing the N best. It will allow you to enumerate all of the combinations, but the more there are, the more memory it will need to queue them up. If you want the top 10,000 combinations, you'll probably be fine. If you want the top 10 billion, I hope you have lots of RAM.

## Installation

``````gem install mendel
``````

Or in Bundler:

``````gem 'mendel'
``````

## Naming

Mendel is named for Gregor Mendel, "the father of modern genetics", a scientist and monk who discovered patterns of inheritance while breeding pea plants. The Mendel gem helps you breed the best possible hybrids of your data.

Mendel breeds the best combinations of N lists without building all possible combinations.

3 tags

## Packages 0

No packages published

•
•