# Stable Pairing Intro

This notebook provides a brief introduction to the `StablePairing` class. It works through the example rankings found in Table 1 of McVitie and Wilson, 1970 (https://link.springer.com/content/pdf/10.1007/BF01934199.pdf).

In [1]:
# Add stablepairing module to path
import sys
from pathlib import Path
sys.path.append(str(Path('../py').absolute()))

In [2]:
# Import the StablePairing class
from stablepairing.pairing import StablePairing
import numpy as np # loading data + arrays

Next we load the tables, which should be in the `../../data` directory relative to the stablepairing code if the repository was cloned directly from GitHub. Otherwise, point `data_dir` at wherever your copy of the data lives.

In [3]:
data_dir = Path('../data')
men_ranks = np.loadtxt(data_dir / 'mcvitie_wilson_1970_table1_men.txt')
men_ranks = men_ranks.astype(int)
women_ranks = np.loadtxt(data_dir / 'mcvitie_wilson_1970_table1_women.txt')
women_ranks = women_ranks.astype(int)
men_ranks

array([[5, 7, 1, 2, 6, 8, 4, 3],
       [2, 3, 7, 5, 4, 1, 8, 6],
       [8, 5, 1, 4, 6, 2, 3, 7],
       [3, 2, 7, 4, 1, 6, 8, 5],
       [7, 2, 5, 1, 3, 6, 8, 4],
       [1, 6, 7, 5, 8, 4, 2, 3],
       [2, 5, 7, 6, 3, 4, 8, 1],
       [3, 8, 4, 5, 7, 2, 6, 1],
       [1, 6, 7, 4, 2, 5, 8, 3],
       [7, 4, 5, 8, 2, 1, 3, 6]])

In [4]:
women_ranks

array([[ 5,  3,  7,  6,  9, 10,  1,  2,  8,  4],
       [ 8,  6,  3,  5,  7,  2,  1, 10,  9,  4],
       [ 1,  5,  6,  2, 10,  4,  9,  8,  7,  3],
       [ 8,  7,  3,  9,  2,  4,  1,  5,  6, 10],
       [ 6,  4,  7,  3,  8,  1, 10,  9,  2,  5],
       [ 2,  8,  5,  4,  6,  3,  9,  7,  1, 10],
       [ 7,  5, 10,  9,  2,  1,  8,  6,  4,  3],
       [ 7,  4,  1,  5,  2,  3,  9, 10,  6,  8]])

These two arrays are what *I* call "rank" matrices: each row corresponds to a member of one group, each column to a preference position (first, second, third, and so on), and each value is a member of the other group. The `StablePairing` class accepts a different format, which I call "choice" matrices. In a choice matrix, each row corresponds to a member of one group, each column to a member of the other group, and each value is the preference that the row member gives the column member (1, 2, 3, and so on). Choice matrices are the accepted input because survey data more naturally comes in that form. The algorithm uses both types internally, so either could have served as the input format.

Fortunately, the `StablePairing` class has two static methods, `rank2choice` and `choice2rank`, that translate between the two formats. These are mostly for the class's own use, but we will borrow them here.
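To make the two formats concrete, here is a minimal NumPy sketch of the conversion for the tie-free case. The helper names `rank_to_choice` and `choice_to_rank` are hypothetical and this is not necessarily how `StablePairing` implements it; the sketch only relies on the fact that each row of a rank matrix is a permutation, so converting it amounts to inverting that permutation.

```python
import numpy as np

def rank_to_choice(ranks):
    """Invert each row of a 1-based rank matrix (hypothetical helper)."""
    ranks = np.asarray(ranks)
    # argsort of a permutation is its inverse: entry j is the 0-based
    # position at which member j + 1 appears in the row.
    return np.argsort(ranks, axis=1) + 1

def choice_to_rank(choice):
    """Inverse conversion, valid when every row has unique preferences."""
    choice = np.asarray(choice)
    return np.argsort(choice, axis=1) + 1

# First row of men_ranks from the table above
ranks = np.array([[5, 7, 1, 2, 6, 8, 4, 3]])
choice = rank_to_choice(ranks)
print(choice)  # [[3 4 8 7 1 5 2 6]]
assert np.array_equal(choice_to_rank(choice), ranks)
```

The man's first pick is woman 5, so column 5 of his choice row holds a 1; his last pick is woman 3, so column 3 holds an 8.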

A subtle point, which we don't have to deal with here but which matters for more general survey data: if the preferences in a row of a choice matrix are not unique, a ranking order is produced by assigning adjacent rank positions, in shuffled order, to the equally preferred choices. For this reason `choice2rank` includes an optional `shuffleseed` kwarg for reproducible results. It also means that `arr == rank2choice(choice2rank(arr))` holds only when the preferences in the choice matrix are unique. Note that a choice matrix converted from a rank matrix always has unique preferences, so once the order is decided the process is invertible.
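The tie-breaking idea can be illustrated with a small sketch. The function `choice_to_rank_with_ties` and its `seed` parameter are hypothetical stand-ins for the behaviour described above, not the actual `choice2rank` code: each row is sorted by preference, and a seeded random key decides the order among equally preferred members.

```python
import numpy as np

def choice_to_rank_with_ties(choice, seed=None):
    """Turn a choice matrix (ties allowed) into a 1-based rank matrix.

    Hypothetical helper; `seed` plays the role described for `shuffleseed`.
    """
    choice = np.asarray(choice)
    rng = np.random.default_rng(seed)
    ranks = np.empty(choice.shape, dtype=int)
    for i, row in enumerate(choice):
        tiebreak = rng.random(row.size)  # random key, only matters among ties
        # lexsort treats the LAST key as primary: sort by preference first,
        # then by the random key within each group of equal preferences.
        ranks[i] = np.lexsort((tiebreak, row)) + 1
    return ranks

tied = np.array([[1, 1, 2],
                 [2, 1, 1]])
ranks = choice_to_rank_with_ties(tied, seed=0)
roundtrip = np.argsort(ranks, axis=1) + 1  # back to a tie-free choice matrix
print(ranks)
print(roundtrip)
```

In the first row, members 1 and 2 are tied, so they occupy the first two rank positions in an order set by the seed, and member 3 always comes last. The round trip yields a choice matrix with unique preferences, which is why it cannot equal the tied input.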

In [5]:
men_choice = StablePairing.rank2choice(men_ranks)
women_choice = StablePairing.rank2choice(women_ranks)
men_choice

array([[3, 4, 8, 7, 1, 5, 2, 6],
       [6, 1, 2, 5, 4, 8, 3, 7],
       [3, 6, 7, 4, 2, 5, 8, 1],
       [5, 2, 1, 4, 8, 6, 3, 7],
       [4, 2, 5, 8, 3, 6, 1, 7],
       [1, 7, 8, 6, 4, 2, 3, 5],
       [8, 1, 5, 6, 2, 4, 3, 7],
       [8, 6, 1, 3, 4, 7, 5, 2],
       [1, 5, 8, 4, 6, 2, 3, 7],
       [6, 5, 7, 2, 3, 8, 1, 4]])

In [6]:
women_choice

array([[ 7,  8,  2, 10,  1,  4,  3,  9,  5,  6],
       [ 7,  6,  3, 10,  4,  2,  5,  1,  9,  8],
       [ 1,  4, 10,  6,  2,  3,  9,  8,  7,  5],
       [ 7,  5,  3,  6,  8,  9,  2,  1,  4, 10],
       [ 6,  9,  4,  2, 10,  1,  3,  5,  8,  7],
       [ 9,  1,  6,  4,  3,  5,  8,  2,  7, 10],
       [ 6,  5, 10,  9,  2,  8,  1,  7,  4,  3],
       [ 3,  5,  6,  2,  4,  9,  1, 10,  7,  8]])

Now we can run the pairing algorithm, which takes only a couple of lines.

In [7]:
sp = StablePairing(men_choice, women_choice)
sp.run()
sp.match

array([6, 7, 2, 8, 1, 4, 5, 3])

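The output `match` array means that member `i` of set B (women) is paired with member `match[i]` of set A (men). The values are 1-based, so the leading `6` pairs the first woman with the sixth man. This is the "male optimal" solution that is starred in Table 1 of McVitie and Wilson.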
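One way to sanity-check a result like this is to look for blocking pairs, that is, a man and a woman who would both rather be with each other than with their assigned partners. The sketch below is an independent check written for this notebook, not part of `StablePairing`; the name `blocking_pairs` is hypothetical, and it assumes the layout shown above (1-based values, one entry per set B member, every set B member matched).

```python
import numpy as np

def blocking_pairs(choice_a, choice_b, match):
    """List 1-based (a, b) pairs that prefer each other to their partners.

    choice_a[a, b]: preference of A member a for B member b (1 = best)
    choice_b[b, a]: preference of B member b for A member a
    match[b]: 1-based A member paired with B member b
    """
    choice_a, choice_b = np.asarray(choice_a), np.asarray(choice_b)
    match = np.asarray(match) - 1  # convert to 0-based A indices
    n_a, n_b = choice_a.shape
    # Preference each A member holds for the current partner; unmatched
    # A members get n_b + 1, meaning they prefer anyone to being unpaired.
    current_a = np.full(n_a, n_b + 1)
    current_a[match] = choice_a[match, np.arange(n_b)]
    found = []
    for a in range(n_a):
        for b in range(n_b):
            a_prefers_b = choice_a[a, b] < current_a[a]
            b_prefers_a = choice_b[b, a] < choice_b[b, match[b]]
            if a_prefers_b and b_prefers_a:
                found.append((a + 1, b + 1))
    return found

# Toy data: three A members, two B members
toy_a = [[1, 2], [1, 2], [2, 1]]
toy_b = [[2, 1, 3], [1, 2, 3]]
print(blocking_pairs(toy_a, toy_b, [2, 1]))  # []
print(blocking_pairs(toy_a, toy_b, [1, 2]))  # [(2, 1)]
```

In this notebook the corresponding call would be `blocking_pairs(men_choice, women_choice, sp.match)`, assuming `sp.match` keeps the layout printed above; an empty list means no blocking pair was found.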

The matches can also be viewed in a few different ways.

In [8]:
# print them (note the names are made up here, pass a pandas.DataFrame if you want actual names)
sp.print_matches()

setB_AA is paired with setA_AF
setB_AB is paired with setA_AG
setB_AC is paired with setA_AB
setB_AD is paired with setA_AH
setB_AE is paired with setA_AA
setB_AF is paired with setA_AD
setB_AG is paired with setA_AE
setB_AH is paired with setA_AC


In [9]:
# Return a pandas Series (can also be written to CSV with Series.to_csv(filename))
sp.matches_as_series(orient='A')

setA_AA    setB_AE
setA_AB    setB_AC
setA_AC    setB_AH
setA_AD    setB_AF
setA_AE    setB_AG
setA_AF    setB_AA
setA_AG    setB_AB
setA_AH    setB_AD
setA_AI       None
setA_AJ       None
dtype: object

In [10]:
sp.matches_as_series(orient='B')

setB_AA    setA_AF
setB_AB    setA_AG
setB_AC    setA_AB
setB_AD    setA_AH
setB_AE    setA_AA
setB_AF    setA_AD
setB_AG    setA_AE
setB_AH    setA_AC
dtype: object

In [11]:
# can also get this as a dict if you so prefer
sp.matches_as_series(orient='A', as_series=False)

{'setA_AA': 'setB_AE',
 'setA_AB': 'setB_AC',
 'setA_AC': 'setB_AH',
 'setA_AD': 'setB_AF',
 'setA_AE': 'setB_AG',
 'setA_AF': 'setB_AA',
 'setA_AG': 'setB_AB',
 'setA_AH': 'setB_AD',
 'setA_AI': 'None',
 'setA_AJ': 'None'}