# Stable Roommates Matching for Pair Research
This notebook analyzes how the [Stable Roommates Matching](http://www.dcs.gla.ac.uk/~pat/jchoco/roommates/papers/Comp_sdarticle.pdf) algorithm performs on previous [Pair Research](http://pairresearch.io/) pairing data.

# Load in Libraries and Stable Roommates Matching Module

In [1]:
%load_ext autoreload
%autoreload 2

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from copy import deepcopy

# load stable roommates code
from stable_roommates import stable_matching_wrapper as sr_matching
from stable_roommates import verify_stability

# Analysis of Stable Roommates Matching on Pair Research Data
Below, we analyze the impact of using the Stable Roommates algorithm on previous pairings. 

We begin by asking two questions:
1. How frequently can we find stable matchings? 
2. When stable matchings are not possible, for what reason do they fail?
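
To make "stable" concrete: a matching is stable when no two people both prefer each other over their assigned partners (such a pair is called a blocking pair). The sketch below is illustrative only. `find_blocking_pairs` is a hypothetical helper written for this example, not the `verify_stability` function imported above, and the four-person preference lists are a textbook-style instance rather than Pair Research data. In this instance every possible matching has a blocking pair, so no stable matching exists.

```python
from itertools import combinations


def find_blocking_pairs(matching, preferences):
    """Return pairs (a, b) who each prefer the other over their assigned partner."""
    # rank[p][q] = position of q in p's preference list (lower is better)
    rank = {p: {q: i for i, q in enumerate(prefs)}
            for p, prefs in preferences.items()}

    blocking = []
    for a, b in combinations(sorted(preferences), 2):
        if matching[a] == b:
            continue  # already partners, cannot block
        a_prefers_b = rank[a][b] < rank[a][matching[a]]
        b_prefers_a = rank[b][a] < rank[b][matching[b]]
        if a_prefers_b and b_prefers_a:
            blocking.append((a, b))
    return blocking


# persons 1-3 prefer each other cyclically; everyone ranks person 4 last
preferences = {1: [2, 3, 4], 2: [3, 1, 4], 3: [1, 2, 4], 4: [1, 2, 3]}

# the only three ways to pair up four people
all_matchings = [
    {1: 2, 2: 1, 3: 4, 4: 3},
    {1: 3, 3: 1, 2: 4, 4: 2},
    {1: 4, 4: 1, 2: 3, 3: 2},
]

for matching in all_matchings:
    print('{} blocked by {}'.format(matching, find_blocking_pairs(matching, preferences)))
```

Whoever is paired with person 4 can always find someone among the other two who would rather switch, which is the kind of failure the second question asks about.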

## Fetch Pairing Data from [pairresearch.io](http://pairresearch.io/)

In [2]:
import multiprocessing as mp
import random
import math

import pandas as pd
from pymongo import MongoClient

import seaborn as sns
%matplotlib inline

In [3]:
uri = 'mongodb://delta:delta@ds011419.mlab.com:11419/pair-research'
dbName = 'pair-research'
client = MongoClient(uri)
db = client[dbName]
db.collection_names()

[u'affinities',
 u'meteor_accounts_loginServiceConfiguration',
 u'tasks_history',
 u'groups',
 u'users',
 u'objectlabs-system.admin.collections',
 u'pairs_history',
 u'tasks',
 u'system.indexes',
 u'pairings',
 u'objectlabs-system',
 u'affinities_history']

In [4]:
users = pd.DataFrame(list(db.users.find({})))

print('Number of Users: {}'.format(len(users)))
users.head()

Number of Users: 877


Unnamed: 0,_id,createdAt,emails,groups,profile,services
0,dibWQsjhkpvC52AFp,2016-08-16 15:54:28.489,"[{u'verified': False, u'address': u'hjlkadfjkl...",[],{u'fullName': u'hihi'},{u'password': {u'bcrypt': u'$2a$10$dkjBKl9Po3A...
1,BPQ7hyoHgghctHPqq,2016-08-29 18:24:50.295,"[{u'verified': True, u'address': u'egerber@nor...","[{u'isPending': False, u'groupName': u'Delta L...",{u'fullName': u'Liz Gerber'},{u'password': {u'bcrypt': u'$2a$10$Q9SnAxCEjS1...
2,bZEjadPH7KrjM9PfD,2016-11-10 19:19:34.147,"[{u'verified': False, u'address': u'ampiper@no...","[{u'isPending': True, u'groupName': u'Segal De...",{u'fullName': u'ampiper@northwestern.edu'},{u'password': {}}
3,8mRni9ixefux6bSz9,2016-12-09 01:55:36.706,"[{u'verified': False, u'address': u'hscho122@k...",[],{u'fullName': u'hscho122@kaist.ac.kr'},{u'password': {}}
4,JXCrPvRJwM5pK4Wk7,2017-01-05 07:05:46.455,"[{u'verified': False, u'address': u'artydevelo...",[],"{u'fullName': u'Deokseong', u'avatar': u'http:...",{u'password': {u'bcrypt': u'$2a$10$Obf8jHjBnkq...


In [5]:
groups = pd.DataFrame(list(db.groups.find({})))

# remove testing groups
group_creator_ignore_list = ['Demo Admin', 'ykykykykykykykykykyk', 'Stella', 'Kevin Northwestern',
                             'Kevin Chen', 'Leesha', 'Jennie']
group_ignore_ids = groups[groups['creatorName'].isin(group_creator_ignore_list)]['_id'].unique()

# subset groups by id
groups_orig_size = len(groups)
groups_new_size = 0

groups = groups[~groups['_id'].isin(group_ignore_ids)]
groups.reset_index(drop=True, inplace=True)

# print change in size
groups_new_size = len(groups)
print('Original size: {} --> New size: {}'.format(groups_orig_size, groups_new_size))

# display groups
groups.head()

Original size: 454 --> New size: 59


Unnamed: 0,_id,active,activePairing,creationDate,creatorId,creatorName,description,groupName,members,roles
0,uPLDbfFqqdHEEkgCT,True,,2016-08-10 18:55:16.164,goGr47HDwtfphJ5xK,Julian Vicens,Rock and Roll Band,Beatles,"[{u'isPending': False, u'role': {u'_id': u'oB3...","[{u'_id': u'oB3qMqXdTJNqR6vbZ', u'title': u'Gu..."
1,Et46F6odTBmiFiDSZ,True,nnN46Abcc78AAtqKf,2016-07-18 21:21:54.117,NtZ9hv3g6eLAwN2nY,Joe Germuska,Knight Lab taking Pair Research for a spin,Knight Lab Testing,"[{u'isPending': False, u'role': {u'_id': u's2J...","[{u'_id': u's2JKkhE9XC6GPW5ev', u'title': u'Ad..."
2,kY7xHo6c5m5tCiQMH,False,,2016-09-28 19:17:10.709,u2GAvznbx7Jbf97Hk,Emily Withrow,Thursdays at 2:30,Knight Lab Pair Research,"[{u'isPending': False, u'role': {u'_id': u'q3P...","[{u'_id': u'q3PJXDZpMMhcZBRzM', u'title': u'Pr..."
3,KEo62WdN5WSkHa9Hh,False,,2016-09-29 15:15:15.184,u2GAvznbx7Jbf97Hk,Emily Withrow,Thursdays at 2:30,Knight Lab Pair Research,"[{u'isPending': False, u'role': {u'_id': u'6L6...","[{u'_id': u'6L6YwxgDwpqgoYfQb', u'title': u'Pr..."
4,qPnf2DHHihugATnxD,True,x5nm2GgMvdjGwyK9Y,2016-11-10 18:38:04.379,PavTL8zD9664wvtfB,Haoqi Zhang,an intellectual community for design faculty a...,Segal Design Cluster,"[{u'isPending': False, u'role': {u'_id': u'sSN...","[{u'_id': u'sSNgzD6So2kz95vjt', u'title': u'Pr..."


In [6]:
tasks_history = pd.DataFrame(list(db.tasks_history.find({})))

# remove bad groups
tasks_history_orig_size = len(tasks_history)
tasks_history_new_size = 0

tasks_history = tasks_history[~tasks_history['groupId'].isin(group_ignore_ids)]
tasks_history.reset_index(drop=True, inplace=True)

# add group_pairing_id
tasks_history['group_pairing_id'] = tasks_history['groupId'] + '-' + tasks_history['pairingId']

# print change in size
tasks_history_new_size = len(tasks_history)
print('Original size: {} --> New size: {}'.format(tasks_history_orig_size, tasks_history_new_size))

# display task history
tasks_history.head()

Original size: 2742 --> New size: 2730


Unnamed: 0,_id,groupId,name,pairingId,task,userId,group_pairing_id
0,k4ewZSgDHsvDFkXpX,9mdkMmj4pY8Q2TwqF,Yongsung Kim,nRAQpsPhsQs4zRvTL,i need to send out a short-survey to interviewees,EDEFWcagLwCfXP5Jg,9mdkMmj4pY8Q2TwqF-nRAQpsPhsQs4zRvTL
1,RZZWR8pABaJBKYNFu,9mdkMmj4pY8Q2TwqF,Julian Vicens,nRAQpsPhsQs4zRvTL,I would like to talk about different ways to m...,goGr47HDwtfphJ5xK,9mdkMmj4pY8Q2TwqF-nRAQpsPhsQs4zRvTL
2,Xr3dvNreiwzq9ixrQ,9mdkMmj4pY8Q2TwqF,Spencer Carlson,nRAQpsPhsQs4zRvTL,Make educated guesses about the quality of my ...,vbsF64nAgoitwrNeB,9mdkMmj4pY8Q2TwqF-nRAQpsPhsQs4zRvTL
3,dFpfXT8szHkp2pYgG,9mdkMmj4pY8Q2TwqF,Leesha,nRAQpsPhsQs4zRvTL,I need help planning a latency handling featur...,aNdSTecskgeAm2St5,9mdkMmj4pY8Q2TwqF-nRAQpsPhsQs4zRvTL
4,zEMk9HQo9azvKzDye,9mdkMmj4pY8Q2TwqF,Eureka Foong,nRAQpsPhsQs4zRvTL,Installing a program using Terminal (I'm bad a...,JaEySKdKKg7LAF3Yg,9mdkMmj4pY8Q2TwqF-nRAQpsPhsQs4zRvTL


In [7]:
pairings = pd.DataFrame(list(db.pairings.find({})))

# remove bad groups
pairings_orig_size = len(pairings)
pairings_new_size = 0

pairings = pairings[~pairings['groupId'].isin(group_ignore_ids)]

# add group_pair id
pairings['group_pairing_id'] = pairings['groupId'] + '-' + pairings['_id']
pairings.reset_index(drop=True, inplace=True)

# print change in size
pairings_new_size = len(pairings)
print('Original size: {} --> New size: {}'.format(pairings_orig_size, pairings_new_size))

# display current pairings
print('Pairing count: {}, Unique group count: {}'.format(len(pairings), len(pairings.groupId.unique())))
pairings.sort_values('timestamp', ascending=True).head()

Original size: 445 --> New size: 377
Pairing count: 377, Unique group count: 38


Unnamed: 0,_id,groupId,pairings,timestamp,group_pairing_id
38,N23iLvjp2GWcsHYd5,9mdkMmj4pY8Q2TwqF,"[{u'secondUserId': u'57MnWENtTDkXRYhcL', u'fir...",2016-08-05 20:14:57.480,9mdkMmj4pY8Q2TwqF-N23iLvjp2GWcsHYd5
0,soiecrpv6CRPTqmkd,9mdkMmj4pY8Q2TwqF,"[{u'firstUserName': u'Haoqi Zhang', u'firstUse...",2016-08-29 18:22:48.499,9mdkMmj4pY8Q2TwqF-soiecrpv6CRPTqmkd
1,e3PQuthB9woF8koC8,9mdkMmj4pY8Q2TwqF,"[{u'firstUserName': u'Haoqi Zhang', u'firstUse...",2016-08-29 18:23:39.896,9mdkMmj4pY8Q2TwqF-e3PQuthB9woF8koC8
9,7BpbSGW9YSvqN3sgx,9mdkMmj4pY8Q2TwqF,"[{u'secondUserId': u'gynuaAvfp3gAd4Gyo', u'fir...",2016-09-02 19:12:46.689,9mdkMmj4pY8Q2TwqF-7BpbSGW9YSvqN3sgx
10,vskS7yWgLPkk7jYq2,9mdkMmj4pY8Q2TwqF,"[{u'secondUserId': u'goGr47HDwtfphJ5xK', u'fir...",2016-09-06 19:19:40.448,9mdkMmj4pY8Q2TwqF-vskS7yWgLPkk7jYq2


In [8]:
pairs_history = pd.DataFrame(list(db.pairs_history.find({})))

# remove bad groups
pairs_history_orig_size = len(pairs_history)
pairs_history_new_size = 0

pairs_history = pairs_history[~pairs_history['groupId'].isin(group_ignore_ids)]

# add group_pairing_id column
pairs_history['group_pairing_id'] = pairs_history['groupId'] + '-' + pairs_history['pairingId']
pairs_history.reset_index(drop=True, inplace=True)

# print change in size
pairs_history_new_size = len(pairs_history)
print('Original size: {} --> New size: {}'.format(pairs_history_orig_size, pairs_history_new_size))

# display current pairs_history
print('Unique group count: {}, Unique pairing count: {}'.format(len(pairs_history.groupId.unique()), 
                                                                len(pairs_history.group_pairing_id.unique())))
pairs_history.sort_values('timestamp', ascending=True).head()

Original size: 1976 --> New size: 1968
Unique group count: 38, Unique pairing count: 377


Unnamed: 0,_id,firstUserId,firstUserName,firstUserRole,groupId,pairingId,secondUserId,secondUserName,secondUserRole,timestamp,group_pairing_id
0,SSL2EMkRW4CHf66KE,xCnLbAobcKwPq7RD5,Rob Miller,Admin,9mdkMmj4pY8Q2TwqF,ctPEz48CJqcA54YeD,5pimyGfESMe3ctdSa,HQ test,PhD Students,2016-08-01 18:55:00.107,9mdkMmj4pY8Q2TwqF-ctPEz48CJqcA54YeD
1,x78xDiybFqDgEvNKY,PavTL8zD9664wvtfB,Haoqi Zhang,Admin,9mdkMmj4pY8Q2TwqF,nRAQpsPhsQs4zRvTL,KYnkykoMwd9fbBbWB,Julie Hui,Admin,2016-08-01 18:55:00.232,9mdkMmj4pY8Q2TwqF-nRAQpsPhsQs4zRvTL
2,2iJAAApLAmipkui2d,gynuaAvfp3gAd4Gyo,eharburg@gmail.com,Admin,9mdkMmj4pY8Q2TwqF,nRAQpsPhsQs4zRvTL,MJkj24zXWKhnZQCc3,Daniel George Rees Lewis,Admin,2016-08-01 18:55:00.298,9mdkMmj4pY8Q2TwqF-nRAQpsPhsQs4zRvTL
3,vP4N8EnXMHPkcCpsH,aNdSTecskgeAm2St5,Leesha,Admin,9mdkMmj4pY8Q2TwqF,nRAQpsPhsQs4zRvTL,EDEFWcagLwCfXP5Jg,Yongsung Kim,Admin,2016-08-01 18:55:00.301,9mdkMmj4pY8Q2TwqF-nRAQpsPhsQs4zRvTL
4,cgPA9iLvkf3bb8Smn,6iR9Z64HEJDcD8qbu,Matt Easterday,Admin,9mdkMmj4pY8Q2TwqF,nRAQpsPhsQs4zRvTL,JaEySKdKKg7LAF3Yg,Eureka Foong,Admin,2016-08-01 18:55:00.305,9mdkMmj4pY8Q2TwqF-nRAQpsPhsQs4zRvTL


In [9]:
tasks = pd.DataFrame(list(db.tasks.find({})))

# remove bad groups
tasks_orig_size = len(tasks)
tasks_new_size = 0

tasks = tasks[~tasks['groupId'].isin(group_ignore_ids)]
tasks.reset_index(drop=True, inplace=True)

# print change in size
tasks_new_size = len(tasks)
print('Original size: {} --> New size: {}'.format(tasks_orig_size, tasks_new_size))

# display current tasks
tasks.head()

Original size: 1050 --> New size: 908


Unnamed: 0,_id,groupId,name,task,userId
0,kcrr49h2nqnd4zthw,Caei5ywbviEaF44TS,kchen,ihih,AX8FFZHzPa8eF8bBE
1,N4MWm7c8tTf9LZrZ5,NRg4vMMoxEAqTHazP,kchen,I need help with testing pair research,AX8FFZHzPa8eF8bBE
2,juYeYQAt5iNm64iJs,NRg4vMMoxEAqTHazP,ryan,Meteor cordova enterprise push notifications,SFg6T8vhT56EeCkRX
3,qFodnk9mikQF2SvHd,NRg4vMMoxEAqTHazP,shannon,Fixing my laptop screen,5FjQBco6MXaSFhap4
4,yYtwJsrNaYwFpBuvw,NRg4vMMoxEAqTHazP,katiegeorge,meatspace help,hkZoyLhrWetKwep3r


In [10]:
affinities = pd.DataFrame(list(db.affinities.find({})))

# remove bad groups
affinities_orig_size = len(affinities)
affinities_new_size = 0

affinities = affinities[~affinities['groupId'].isin(group_ignore_ids)]
affinities.reset_index(drop=True, inplace=True)

# print change in size
affinities_new_size = len(affinities)
print('Original size: {} --> New size: {}'.format(affinities_orig_size, affinities_new_size))

# display current affinities
affinities.head()

Original size: 3295 --> New size: 3253


Unnamed: 0,_id,groupId,helpeeId,helperId,value
0,e6rjGWDrWE5YKxdbh,NRg4vMMoxEAqTHazP,AX8FFZHzPa8eF8bBE,SFg6T8vhT56EeCkRX,5.0
1,mSnrrMX7y26NSQ7iN,NRg4vMMoxEAqTHazP,SFg6T8vhT56EeCkRX,AX8FFZHzPa8eF8bBE,5.0
2,w72kT4Ez7xYkfE8JF,NRg4vMMoxEAqTHazP,5FjQBco6MXaSFhap4,AX8FFZHzPa8eF8bBE,1.0
3,c5xFCfvPimbBsnsGg,NRg4vMMoxEAqTHazP,hkZoyLhrWetKwep3r,AX8FFZHzPa8eF8bBE,4.0
4,bbTEQ3mvL46mTTskJ,NRg4vMMoxEAqTHazP,AX8FFZHzPa8eF8bBE,5FjQBco6MXaSFhap4,5.0


In [11]:
affinities_history = pd.DataFrame(list(db.affinities_history.find({})))

# remove bad groups
affinities_history_orig_size = len(affinities_history)
affinities_history_new_size = 0

affinities_history = affinities_history[~affinities_history['groupId'].isin(group_ignore_ids)]

# add group_pairing_id column
affinities_history['group_pairing_id'] = affinities_history['groupId'] + '-' + affinities_history['pairingId']

# remove duplicate ratings
affinities_history.sort_values(['group_pairing_id', 'helpeeId', 'helperId'], inplace=True)
affinities_history.drop_duplicates(subset=['group_pairing_id', 'helpeeId', 'helperId'], keep='first', inplace=True)
affinities_history.reset_index(drop=True, inplace=True)

# print change in size
affinities_history_new_size = len(affinities_history)
print('Original size: {} --> New size: {}'.format(affinities_history_orig_size, affinities_history_new_size))

# display affinity data
print('Unique Group Pairings: {}'.format(len(affinities_history.group_pairing_id.unique())))
affinities_history.head()

Original size: 32777 --> New size: 32507
Unique Group Pairings: 363


Unnamed: 0,_id,groupId,helpeeId,helperId,pairingId,value,group_pairing_id
0,v3nKkg77Jouf6BZ8G,2rFoGTfRa9LFdpQNA,3si95Pn6NjXTxCWcT,GLTz7m8y7RqZCYzxx,2EPbA6HkydPTdxCWD,0.33,2rFoGTfRa9LFdpQNA-2EPbA6HkydPTdxCWD
1,D2kBQDRftmygv5f4L,2rFoGTfRa9LFdpQNA,3si95Pn6NjXTxCWcT,PWufwHDsbRaw4se4X,2EPbA6HkydPTdxCWD,1.0,2rFoGTfRa9LFdpQNA-2EPbA6HkydPTdxCWD
2,R588B5nqLhmLbC4iW,2rFoGTfRa9LFdpQNA,3si95Pn6NjXTxCWcT,f8wwqTXaifkxxoAc2,2EPbA6HkydPTdxCWD,0.0,2rFoGTfRa9LFdpQNA-2EPbA6HkydPTdxCWD
3,poiynLy2tnCMNzdGf,2rFoGTfRa9LFdpQNA,3si95Pn6NjXTxCWcT,iyRaCwz7QzxPRSi5t,2EPbA6HkydPTdxCWD,1.0,2rFoGTfRa9LFdpQNA-2EPbA6HkydPTdxCWD
4,KmiSFQicDRa263Nfc,2rFoGTfRa9LFdpQNA,3si95Pn6NjXTxCWcT,kEZXdjhfohiGxJWdu,2EPbA6HkydPTdxCWD,-1.0,2rFoGTfRa9LFdpQNA-2EPbA6HkydPTdxCWD


## Run Stable Matching with All Previous Pairs

In [12]:
def create_affinity_matrix(input_affinities, tasks, remap=False): 
    """
    Creates an n^2 affinity matrix.
        
    Input:
        input_affinities (pandas DataFrame): dataframe with helpeeId, helperId, and value columns.
        tasks (pandas DataFrame): current tasks for pairing. used to create superset of users in event some don't rate any others.
        remap (boolean): remap values to their equivalent on the interface
    
    Output:
        (list of list of numbers): matrix of affinities. 0 if no affinity between users.
        (dict): dict where keys are numbers and values are userIds mapping matrix index to users.
    """
    # don't modify original dataframe
    affinities = deepcopy(input_affinities)
    
    # create user superset and user:index mapping
    user_superset = list(set(affinities['helperId'].tolist() + 
                             affinities['helpeeId'].tolist() + 
                             tasks['userId'].tolist()))
    user_count = len(user_superset)
    user_index_dict = {user_superset[x]: x for x in range(user_count)}
    
    # create empty n^2 matrix
    affinity_matrix = [[0 for y in range(user_count)] for x in range(user_count)]
    
    # remap data values to UI values
    if remap:        
        value_mappings = {
            '-1.0': 1,
            '0.0':  2,
            '0.33': 3,
            '0.66': 4,
            '1.0':  5
        }
        affinities['value'] = affinities['value'].astype(str)
        affinities.replace({'value': value_mappings}, inplace=True)

    affinities.drop_duplicates(inplace=True)
    
    # loop through data and populate matrix
    for index, row in affinities.iterrows():
        curr_helper_index = user_index_dict[row['helperId']]
        curr_helpee_index = user_index_dict[row['helpeeId']]
        curr_value = row['value']
        
        affinity_matrix[curr_helper_index][curr_helpee_index] = curr_value
    
    # flip user and index in dict
    index_user_dict = {str(v): k for (k, v) in user_index_dict.items()}
        
    return affinity_matrix, index_user_dict

def create_pairing_dict(user_index_dict, pairing):
    """
    Creates a pairing dictionary based on previous pairing, ignoring unmatched users.
    
    Input:
        user_index_dict (dict): mapping from users in data to indices (1-indexed)
        pairing (list): list of pairings
    
    Output: 
        (dict): dict containing bi-directional pairings from the pairing list with keys and values being numbers
            ex. { '1': '2', '2': '1'}
    """
    pairing_dict = {}
    for pairs in pairing:
        # only include cases where users are matched to one another (i.e. ignore odd people paired with -1)
        if 'secondUserId' in pairs:
            # bi-directional representation in dictionary
            pairing_dict[user_index_dict[pairs['firstUserId']]] = user_index_dict[pairs['secondUserId']]
            pairing_dict[user_index_dict[pairs['secondUserId']]] = user_index_dict[pairs['firstUserId']]
    
    return pairing_dict

def get_recent_pairings(group_pair_id, limit):
    """
    Retrieve the most recent pairings for a group_id, up to limit, before pairing instance is run
        and return as a dictionary. 
    
    Input:
        group_pair_id (string): group-pairing instance to get recent pairing data for.
        limit (number): number of most recent pairings to get.
        
    Output:
        (dict): dict mapping, bi-directionally, each pairing
    """
    group_id = group_pair_id.split('-')[0]
    
    # get timestamp of current pairing instance
    curr_timestamp = pairs_history[pairs_history.group_pairing_id == group_pair_id].iat[0, pairs_history.columns.get_loc('timestamp')]
    
    # get pairs for group_id that occurred before group_pair_id did and also ignore group_pair_id
    relevant_pairings = pairings[(pairings['groupId'] == group_id) & 
                                 (pairings['timestamp'] < curr_timestamp) & 
                                 (pairings['group_pairing_id'] != group_pair_id)]
    # sort newest first so that the slice keeps the `limit` most recent sessions
    pairing_instance_list = relevant_pairings.sort_values('timestamp', ascending=False)[0:limit]['pairings'].tolist()
    output_list = []
    
    # create dictionaries and add to output
    for pairing_instance in pairing_instance_list:
        pairing_dict = {}
        for pairing in pairing_instance:
            # check if the user is paired with someone
            if 'secondUserId' in pairing:
                # bi-directional representation in dictionary
                pairing_dict[pairing['firstUserId']] = pairing['secondUserId']
                pairing_dict[pairing['secondUserId']] = pairing['firstUserId']
            else:
                pairing_dict[pairing['firstUserId']] = ''
        
        # add to output list
        output_list.append(pairing_dict)
    
    return output_list

def create_weighted_matrix(affinity_matrix, index_user_mapping, recent_pairings):
    """
    Converts an affinity matrix into a weighted matrix.
        Weight is calculated based on previous recent pairings and some random perturbation.
    
    Input:
        affinity_matrix (list of list of numbers): matrix of affinities. 0 if no affinity between users.
        index_user_mapping (dict): dict where keys are numbers and values are userIds mapping matrix index to users.
        recent_pairings (list of dict): up to 3 pairing sessions, ordered by recency, 
            with each dict containing helper-helpee pairs
    
    Output: 
        (list of list of numbers): weighted matrix
    """
    # don't modify the original matrix
    weighted_matrix = deepcopy(affinity_matrix)
    
    # iterate over each element and compute weighted value
    matrix_iterator = range(len(affinity_matrix))
    for row in matrix_iterator:
        for col in matrix_iterator:
            # ignore diagonal
            if row == col:
                continue
            
            # scale rating from [-1, 1] to a base weight in [-98, 100]
            weight = 1 + 99 * affinity_matrix[row][col]
            
            # Penalize recent pairings by increasing weight of pairs that have NOT occurred recently for last 3 pairings
            # ex. If A and B have not paired last time, increase their weight by 80 * 0.5^1
            # ex. If they also didn't pair time before, further increase their weight by 80 * 0.5^2 and so on (up to 3)
            # only give extra weight if rating is not -1
            if affinity_matrix[row][col] != -1:
                for index, pairing in enumerate(recent_pairings):
                    helper = index_user_mapping[str(row)]
                    helpee = index_user_mapping[str(col)]

                    # helper-helpee pairing does not exist in the current pairing
                    if helper in pairing and pairing[helper] != helpee:
                        weight += 80 * 0.5 ** (index + 1)
            
            # add some random perturbation, between 0-20, to guarantee strict ordering
            weight += random.random() * 20
            
            # store new edge weight
            weighted_matrix[row][col] = math.floor(weight)
    
    return weighted_matrix

def create_preference_matrix(weighted_matrix):
    """
    Converts an n^2 weighted matrix into a n-by-m preference matrix (where m = n - 1).
    
    Input: 
        weighted_matrix (list of list of numbers): matrix of weighted affinities
    
    Return: 
        (list of list of numbers): preference matrix where each list is ordered list of person indices.
    """
    # create zipped lists of (index, rating)
    preference_matrix = [[(i + 1, value) for i, value in enumerate(x)] for x in weighted_matrix]
    
    # format each row
    for index, curr_person in enumerate(preference_matrix):
        curr_person.sort(key=lambda tup: tup[1], reverse=True)
        
        # add sorted preference list without self
        preference_matrix[index] = [person_rating[0] for person_rating in curr_person if person_rating[0] - 1 != index]
        
    return preference_matrix

def sr_matching_pair_research(group_pair_id, handle_odd_method='remove', remove_all=True):
    """
    Runs stable matching on pair research data, given a group_pair_id to run matching for.
    
    Input: 
        group_pair_id (string): group pairing to run matching on
        handle_odd_method (string): handling odd cases by either adding ('add') or removing ('remove') user
        remove_all (boolean): whether to try again if randomly removing a person fails
        
    Output:
        (dict): output of matching, along with matching metadata
    """
    # create affinity matrix and index-user dict
    curr_affinities = deepcopy(affinities_history[affinities_history['group_pairing_id'] == group_pair_id])
    curr_tasks = deepcopy(tasks_history[tasks_history['group_pairing_id'] == group_pair_id])
    curr_affinity_matrix, curr_index_user_mapping = create_affinity_matrix(curr_affinities[['helperId', 'helpeeId', 'value']],
                                                                           curr_tasks, remap=False)
    
    # transform index-user dict into user-index dict where indices are 1-indexed
    curr_user_index_dict = {str(v): str(int(k) + 1) for (k, v) in curr_index_user_mapping.items()}

    # get recent pairings and create weighted matrix
    curr_recent_pairings = get_recent_pairings(group_pair_id, 3)
    curr_weighted_matrix = create_weighted_matrix(curr_affinity_matrix, curr_index_user_mapping, curr_recent_pairings)
    
    # create preference matrix
    curr_pref_matrix = create_preference_matrix(curr_weighted_matrix)
    
    # run stable roommates
    stable_result, debug = sr_matching(curr_pref_matrix, handle_odd_method=handle_odd_method, remove_all=remove_all)
    
    # determine stability of MWM matching
    mwm_stability = compute_mwm_stability(group_pair_id, curr_user_index_dict, curr_pref_matrix)
    
    # create metadata about the current affinity and add data to pairing_data
    group_id, pairing_id = group_pair_id.split('-')
    user_count = len(curr_affinity_matrix)
    curr_timestamp = pairs_history[pairs_history.group_pairing_id == group_pair_id].iat[0, pairs_history.columns.get_loc('timestamp')]
    
    mwm_stable_text = 'NA'
    if mwm_stability is not None:
        mwm_stable_text = 'stable' if mwm_stability else 'unstable'
    
    # create and return matching data
    matching_data = {
        'group_pair_id': group_pair_id,
        'group_id': group_id,
        'pairing_id': pairing_id,
        'timestamp': curr_timestamp,
        'user_count': user_count,
        'odd_even': 'even' if user_count % 2 == 0 else 'odd',
        'odd_handling': handle_odd_method,
        'stable_result': stable_result,
        'sr_stable_unstable': 'unstable' if stable_result is None else 'stable',
        'mwm_stable_unstable': mwm_stable_text,
        'stable_printout': debug,
        'affinity_matrix': curr_affinity_matrix,
        'weighted_matrix': curr_weighted_matrix,
        'preference_matrix': curr_pref_matrix
    }
    return matching_data

def compute_mwm_stability(group_pair_id, user_index_dict, preference_matrix):
    """
    Computes the stability of a previous MWM matching, given a preference_matrix.
    
    Input:
        group_pair_id (string): pairing to determine stability for.
        user_index_dict (dict): mapping of users to index where indices are 1-indexed strings.
        preference_matrix (list of lists of numbers): n-by-m preference matrix containing preferences for each person.
            m = n - 1, so each person has rated all other people.
            Each row is a 1-indexed ordered ranking of others in the pool.
            Therefore max(preference_matrix[person]) <= number of people and min(preference_matrix[person]) >= 1.
    
    Output: 
        (boolean): whether MWM matching was stable. None if cannot determine.
    """
    # create a preference lookup table
    # person_number : [list of preferences]
    curr_pref_dict = {
        str(x + 1): [str(y) for y in preference_matrix[x]] for x in range(len(preference_matrix))
    }

    # create a dict of dicts holding index of each person ranked
    # person number : {person : rank_index }
    curr_ranks = {index: dict(zip(value, range(len(value)))) for (index, value) in curr_pref_dict.items()}

    # attempt to create pairing dict and determine stability
    try:
        # create pairing dict
        curr_pairings = pairings[pairings['group_pairing_id'] == group_pair_id]['pairings'].tolist()[0]
        curr_pairing_dict = create_pairing_dict(user_index_dict, curr_pairings)
        
        # determine and return stability
        return verify_stability(curr_pairing_dict, curr_ranks)
    except KeyError:
        # stability could not be computed since some data is missing
        return None

def sr_matching_pair_research_wrapper(exec_dicts):
    """
    Wrapper for sr_matching_pair_research that allows for changing optional parameters.
    
    Input:
        exec_dicts (dict): contains group_pair_id, handle_odd_method, and remove_all
    
    Output:
        (dict): output of matching, along with matching metadata
    """
    return sr_matching_pair_research(exec_dicts['group_pair_id'],
                                     exec_dicts['handle_odd_method'],
                                     exec_dicts['remove_all'])

def execute_sr_matching(group_pairing_ids, handle_odd_method='remove', remove_all=True, parallel=False):
    """
    Wrapper for computing pair research matchings that calls sr_matching_pair_research_wrapper. 
    
    Input:
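        group_pairing_ids (list of string): unique group pairing ids to conduct matching on.
        handle_odd_method (string): handling odd cases by either adding ('add') or removing ('remove') user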
        remove_all (boolean): whether to try again if randomly removing a person fails
        parallel (boolean): run matching in parallel across all group_pairing_ids
        
    Output:
        (DataFrame): matchings computed for pair research data
    """
    pairing_data = []
    exec_dicts = [
        {'group_pair_id': group_pair_id, 'handle_odd_method': handle_odd_method, 'remove_all': remove_all} for group_pair_id in group_pairing_ids
    ] 
    
    # compute pairings
    if parallel:
        pool = mp.Pool(processes=mp.cpu_count())
        pairing_data = pool.map(sr_matching_pair_research_wrapper, exec_dicts)
        pool.close()
        pool.join()
    else:
        pairing_data = [sr_matching_pair_research_wrapper(exec_dict) for exec_dict in exec_dicts]
    
    return pd.DataFrame(pairing_data)
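
As a worked example of the weighting and ranking steps defined above, the sketch below reproduces the deterministic part of `create_weighted_matrix` for a single helper-helpee edge and then converts one row of weights into a 1-indexed preference list, as `create_preference_matrix` does. The helper names `weight_without_noise` and `preference_list` are hypothetical and exist only for this illustration. The random 0-20 perturbation and the final `math.floor` are deliberately left out so the numbers are reproducible.

```python
def weight_without_noise(rating, sessions_not_paired):
    """Scaled rating plus the recency bonus, without the random perturbation.

    rating: affinity value in [-1, 1]
    sessions_not_paired: indices into recent_pairings (0 = first entry)
        where the helper took part but was not paired with this helpee
    """
    weight = 1 + 99 * rating
    # a rating of -1 never receives the bonus
    if rating != -1:
        for index in sessions_not_paired:
            weight += 80 * 0.5 ** (index + 1)
    return weight


def preference_list(weights, person_index):
    """Rank everyone except person_index by descending weight (1-indexed)."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return [i + 1 for i in ranked if i != person_index]


# rating 1.0, not paired in any of three sessions: 100 + 40 + 20 + 10
print(weight_without_noise(1.0, [0, 1, 2]))

# rating 0.33, not paired in the first session only: 1 + 32.67 + 40
print(weight_without_noise(0.33, [0]))

# rating -1 stays at the minimum regardless of pairing history
print(weight_without_noise(-1, [0, 1, 2]))

# row of weights for person at index 0 (the diagonal entry is ignored)
print(preference_list([0, 73.67, 170, -98], 0))
```

Because the bonus for three unpaired sessions adds up to 70, an edge with the top rating can reach 170 before noise, which is why some entries in `weighted_matrix` exceed 100.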

### Remove One User Only

In [13]:
# get all pairing instances
group_pairing_ids = affinities_history.group_pairing_id.unique()

# compute pairings and create DataFrame of results
remove_one_pairings_df = execute_sr_matching(group_pairing_ids, handle_odd_method='remove', remove_all=False, parallel=True)

# print stable matching results
sr_stable_count = len(remove_one_pairings_df[remove_one_pairings_df['sr_stable_unstable'] == 'stable'])
sr_unstable_count = len(remove_one_pairings_df[remove_one_pairings_df['sr_stable_unstable'] == 'unstable'])
sr_total = sr_stable_count + sr_unstable_count

output_string = 'Stable Roommates Matching Results\nStable: {} ({:1.2f}%)\nUnstable: {} ({:1.2f}%)\nTotal: {} (100.00%)\n\n'
print(output_string.format(sr_stable_count, 100 * sr_stable_count / sr_total,
                           sr_unstable_count, 100 * sr_unstable_count / sr_total,
                           sr_total))

# print mwm results
mwm_stable_count = len(remove_one_pairings_df[remove_one_pairings_df['mwm_stable_unstable'] == 'stable'])
mwm_unstable_count = len(remove_one_pairings_df[remove_one_pairings_df['mwm_stable_unstable'] == 'unstable'])
mwm_none_count = len(remove_one_pairings_df[remove_one_pairings_df['mwm_stable_unstable'] == 'NA'])
mwm_total = mwm_stable_count + mwm_unstable_count + mwm_none_count

output_string = 'Maximum Weighted Matching Results\nStable: {} ({:1.2f}%)\nUnstable: {} ({:1.2f}%)\nNA (could not determine stability): {} ({:1.2f}%)\nTotal: {} (100.00%)'
print(output_string.format(mwm_stable_count, 100 * mwm_stable_count / mwm_total,
                           mwm_unstable_count, 100 * mwm_unstable_count / mwm_total,
                           mwm_none_count, 100 * mwm_none_count / mwm_total,
                           mwm_total))
remove_one_pairings_df.head()

Stable Roommates Matching Results
Stable: 297 (81.82%)
Unstable: 66 (18.18%)
Total: 363 (100.00%)


Maximum Weighted Matching Results
Stable: 98 (27.00%)
Unstable: 253 (69.70%)
NA (could not determine stability): 12 (3.31%)
Total: 363 (100.00%)


Unnamed: 0,affinity_matrix,group_id,group_pair_id,mwm_stable_unstable,odd_even,odd_handling,pairing_id,preference_matrix,sr_stable_unstable,stable_printout,stable_result,timestamp,user_count,weighted_matrix
0,"[[0, 0.33, 0.66, 0.66, -1.0, 0.33, -1.0, 0.0, ...",2rFoGTfRa9LFdpQNA,2rFoGTfRa9LFdpQNA-2EPbA6HkydPTdxCWD,unstable,even,remove,2EPbA6HkydPTdxCWD,"[[4, 9, 3, 2, 6, 10, 8, 5, 7], [10, 8, 1, 3, 9...",stable,Stable matching found after Phase 2.,"[2, 9, 0, 8, 6, 7, 4, 5, 3, 1]",2017-09-26 21:33:10.196,10,"[[0, 63.0, 77.0, 91.0, -88.0, 53.0, -93.0, 26...."
1,"[[0, 1.0], [0, 0]]",2rFoGTfRa9LFdpQNA,2rFoGTfRa9LFdpQNA-A6d3rQwrRZHEz4qHu,stable,even,remove,A6d3rQwrRZHEz4qHu,"[[2], [1]]",stable,Stable matching found after Phase 1.,"[1, 0]",2017-08-22 17:19:36.847,2,"[[0, 100.0], [5.0, 0]]"
2,"[[0, 0.33, 0.66, 0.66, -1.0, 0.33, -1.0, 0.0, ...",2rFoGTfRa9LFdpQNA,2rFoGTfRa9LFdpQNA-JS2qH6wPAxLfjZtJW,unstable,even,remove,JS2qH6wPAxLfjZtJW,"[[9, 3, 4, 6, 2, 10, 8, 7, 5], [10, 8, 6, 9, 3...",stable,Stable matching found after Phase 1.,"[7, 9, 5, 8, 6, 2, 4, 0, 3, 1]",2017-09-26 21:33:04.597,10,"[[0, 34.0, 71.0, 70.0, -84.0, 47.0, -81.0, 2.0..."
3,"[[0, 0.66, 0.66, 1.0, 0.0, 0.0, 0.0, 0.33], [0...",2rFoGTfRa9LFdpQNA,2rFoGTfRa9LFdpQNA-SpiKfuqCoEZRLfDNK,unstable,even,remove,SpiKfuqCoEZRLfDNK,"[[4, 2, 3, 8, 7, 6, 5], [4, 6, 8, 7, 3, 1, 5],...",stable,Stable matching found after Phase 2.,"[3, 4, 7, 0, 1, 6, 5, 2]",2018-01-16 21:42:19.584,8,"[[0, 76.0, 70.0, 120.0, 11.0, 14.0, 23.0, 54.0..."
4,"[[0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0,...",2rFoGTfRa9LFdpQNA,2rFoGTfRa9LFdpQNA-SwhcfsdjNCZcyzx3t,unstable,even,remove,SwhcfsdjNCZcyzx3t,"[[10, 9, 6, 3, 5, 8, 2, 4, 7], [3, 6, 8, 5, 10...",stable,Stable matching found after Phase 1.,"[9, 2, 1, 6, 8, 7, 3, 5, 4, 0]",2017-11-28 21:48:06.568,10,"[[0, -97.0, -90.0, -98.0, -94.0, -88.0, -98.0,..."


### Remove a Different User (Trying Each User in Turn) if a Stable Matching Isn't Found

In [14]:
# get all pairing instances
group_pairing_ids = affinities_history.group_pairing_id.unique()

# compute pairings and create DataFrame of results
remove_all_pairings_df = execute_sr_matching(group_pairing_ids, handle_odd_method='remove', remove_all=True, parallel=True)

# print stable matching results
sr_stable_count = len(remove_all_pairings_df[remove_all_pairings_df['sr_stable_unstable'] == 'stable'])
sr_unstable_count = len(remove_all_pairings_df[remove_all_pairings_df['sr_stable_unstable'] == 'unstable'])
sr_total = sr_stable_count + sr_unstable_count

output_string = 'Stable Roommates Matching Results\nStable: {} ({:1.2f}%)\nUnstable: {} ({:1.2f}%)\nTotal: {} (100.00%)\n\n'
print(output_string.format(sr_stable_count, 100 * sr_stable_count / sr_total,
                           sr_unstable_count, 100 * sr_unstable_count / sr_total,
                           sr_total))

# print mwm results
mwm_stable_count = len(remove_all_pairings_df[remove_all_pairings_df['mwm_stable_unstable'] == 'stable'])
mwm_unstable_count = len(remove_all_pairings_df[remove_all_pairings_df['mwm_stable_unstable'] == 'unstable'])
mwm_none_count = len(remove_all_pairings_df[remove_all_pairings_df['mwm_stable_unstable'] == 'NA'])
mwm_total = mwm_stable_count + mwm_unstable_count + mwm_none_count

output_string = 'Maximum Weighted Matching Results\nStable: {} ({:1.2f}%)\nUnstable: {} ({:1.2f}%)\nNA (could not determine stability): {} ({:1.2f}%)\nTotal: {} (100.00%)'
print(output_string.format(mwm_stable_count, 100 * mwm_stable_count / mwm_total,
                           mwm_unstable_count, 100 * mwm_unstable_count / mwm_total,
                           mwm_none_count, 100 * mwm_none_count / mwm_total,
                           mwm_total))
remove_all_pairings_df.head()

Stable Roommates Matching Results
Stable: 325 (89.53%)
Unstable: 38 (10.47%)
Total: 363 (100.00%)


Maximum Weighted Matching Results
Stable: 100 (27.55%)
Unstable: 251 (69.15%)
NA (could not determine stability): 12 (3.31%)
Total: 363 (100.00%)


Unnamed: 0,affinity_matrix,group_id,group_pair_id,mwm_stable_unstable,odd_even,odd_handling,pairing_id,preference_matrix,sr_stable_unstable,stable_printout,stable_result,timestamp,user_count,weighted_matrix
0,"[[0, 0.33, 0.66, 0.66, -1.0, 0.33, -1.0, 0.0, ...",2rFoGTfRa9LFdpQNA,2rFoGTfRa9LFdpQNA-2EPbA6HkydPTdxCWD,unstable,even,remove,2EPbA6HkydPTdxCWD,"[[9, 3, 4, 2, 10, 6, 8, 7, 5], [10, 8, 6, 3, 4...",stable,Stable matching found after Phase 1.,"[7, 9, 5, 8, 6, 2, 4, 0, 3, 1]",2017-09-26 21:33:10.196,10,"[[0, 56.0, 86.0, 82.0, -91.0, 43.0, -87.0, 25...."
1,"[[0, 1.0], [0, 0]]",2rFoGTfRa9LFdpQNA,2rFoGTfRa9LFdpQNA-A6d3rQwrRZHEz4qHu,stable,even,remove,A6d3rQwrRZHEz4qHu,"[[2], [1]]",stable,Stable matching found after Phase 1.,"[1, 0]",2017-08-22 17:19:36.847,2,"[[0, 100.0], [5.0, 0]]"
2,"[[0, 0.33, 0.66, 0.66, -1.0, 0.33, -1.0, 0.0, ...",2rFoGTfRa9LFdpQNA,2rFoGTfRa9LFdpQNA-JS2qH6wPAxLfjZtJW,unstable,even,remove,JS2qH6wPAxLfjZtJW,"[[9, 3, 4, 6, 2, 10, 8, 7, 5], [10, 8, 6, 9, 3...",stable,Stable matching found after Phase 1.,"[7, 9, 5, 8, 6, 2, 4, 0, 3, 1]",2017-09-26 21:33:04.597,10,"[[0, 34.0, 71.0, 70.0, -84.0, 47.0, -81.0, 2.0..."
3,"[[0, 0.66, 0.66, 1.0, 0.0, 0.0, 0.0, 0.33], [0...",2rFoGTfRa9LFdpQNA,2rFoGTfRa9LFdpQNA-SpiKfuqCoEZRLfDNK,unstable,even,remove,SpiKfuqCoEZRLfDNK,"[[4, 2, 3, 8, 7, 6, 5], [4, 6, 8, 7, 3, 1, 5],...",stable,Stable matching found after Phase 2.,"[3, 4, 7, 0, 1, 6, 5, 2]",2018-01-16 21:42:19.584,8,"[[0, 76.0, 70.0, 120.0, 11.0, 14.0, 23.0, 54.0..."
4,"[[0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0,...",2rFoGTfRa9LFdpQNA,2rFoGTfRa9LFdpQNA-SwhcfsdjNCZcyzx3t,unstable,even,remove,SwhcfsdjNCZcyzx3t,"[[10, 9, 6, 3, 5, 8, 2, 4, 7], [3, 6, 8, 5, 10...",stable,Stable matching found after Phase 1.,"[9, 2, 1, 6, 8, 7, 3, 5, 4, 0]",2017-11-28 21:48:06.568,10,"[[0, -97.0, -90.0, -98.0, -94.0, -88.0, -98.0,..."
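The retry strategy named in this subsection's heading can be sketched in a few lines. This is a hypothetical, brute-force illustration and not the code behind `execute_sr_matching` (which is not shown here): `matchings`, `is_stable`, `find_stable`, and `remove_until_stable` are names invented for the example, preferences are assumed to be a dict of 0-indexed rankings (most preferred first), and enumerating every perfect matching is only practical for very small groups.

```python
def matchings(people):
    """Yield every perfect matching of an even-sized list as a list of pairs."""
    if not people:
        yield []
        return
    first, rest = people[0], people[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in matchings(remaining):
            yield [(first, partner)] + tail


def is_stable(pairs, prefs):
    """True when no two people prefer each other to their assigned partners."""
    partner = {}
    for a, b in pairs:
        partner[a], partner[b] = b, a
    rank = {p: {q: r for r, q in enumerate(row)} for p, row in prefs.items()}
    people = sorted(partner)
    for i, a in enumerate(people):
        for b in people[i + 1:]:
            if (partner[a] != b
                    and rank[a][b] < rank[a][partner[a]]
                    and rank[b][a] < rank[b][partner[b]]):
                return False  # (a, b) is a blocking pair
    return True


def find_stable(prefs):
    """Return the first stable perfect matching found, or None."""
    for pairs in matchings(sorted(prefs)):
        if is_stable(pairs, prefs):
            return pairs
    return None


def remove_until_stable(prefs):
    """Drop each person in turn; return (removed, matching) for the first success."""
    for removed in sorted(prefs):
        reduced = {p: [q for q in row if q != removed]
                   for p, row in prefs.items() if p != removed}
        pairs = find_stable(reduced)
        if pairs is not None:
            return removed, pairs
    return None, None


# Five people: 1, 2 and 3 prefer each other cyclically, so removing person 0
# leaves no stable matching, while removing person 1 breaks the cycle.
example_prefs = {
    0: [1, 2, 3, 4],
    1: [2, 3, 0, 4],
    2: [3, 1, 0, 4],
    3: [1, 2, 0, 4],
    4: [1, 2, 3, 0],
}
print(remove_until_stable(example_prefs))  # (1, [(0, 4), (2, 3)])
```

In this toy instance the first removal fails and the second succeeds, which is the situation `remove_all=True` appears intended to handle for odd-sized groups.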


## Analyzing Instability

### TODO
- Why wasn't a person proposed to? --> see this

In [15]:
remove_all_pairings_df[['sr_stable_unstable', 'stable_printout', 'group_pair_id']].groupby(['sr_stable_unstable', 'stable_printout']).count()

| sr_stable_unstable | stable_printout | group_pair_id (count) |
|---|---|---|
| stable | Stable matching found after Phase 1. | 216 |
| stable | Stable matching found after Phase 2. | 109 |
| unstable | Failed at Phase 1: not everyone was proposed to. | 19 |
| unstable | Failed at Phase 2: could not find an all-or-nothing cycle len > 3. | 17 |
| unstable | Failed at Verification after Phase 2: matching computed, but not valid. | 2 |


In [16]:
remove_all_pairings_df[['sr_stable_unstable', 'stable_printout', 'odd_even', 'group_pair_id']].groupby(['sr_stable_unstable', 'stable_printout', 'odd_even']).count()

| sr_stable_unstable | stable_printout | odd_even | group_pair_id (count) |
|---|---|---|---|
| stable | Stable matching found after Phase 1. | even | 94 |
| stable | Stable matching found after Phase 1. | odd | 122 |
| stable | Stable matching found after Phase 2. | even | 45 |
| stable | Stable matching found after Phase 2. | odd | 64 |
| unstable | Failed at Phase 1: not everyone was proposed to. | even | 18 |
| unstable | Failed at Phase 1: not everyone was proposed to. | odd | 1 |
| unstable | Failed at Phase 2: could not find an all-or-nothing cycle len > 3. | even | 17 |
| unstable | Failed at Verification after Phase 2: matching computed, but not valid. | even | 2 |


### Unstable Case 1--Failed at Phase 1: not everyone was proposed to.

In [17]:
unstable_cases_1 = remove_all_pairings_df[remove_all_pairings_df['stable_printout'] == 'Failed at Phase 1: not everyone was proposed to.']
unstable_cases_1.head()

Unnamed: 0,affinity_matrix,group_id,group_pair_id,mwm_stable_unstable,odd_even,odd_handling,pairing_id,preference_matrix,sr_stable_unstable,stable_printout,stable_result,timestamp,user_count,weighted_matrix
44,"[[0, 1.0, 0.33, 1.0, 0.66, 1.0, 0.66, 0.33, 0....",9mdkMmj4pY8Q2TwqF,9mdkMmj4pY8Q2TwqF-HWMgfj5wPa8ahibez,unstable,even,remove,HWMgfj5wPa8ahibez,"[[4, 6, 2, 7, 10, 5, 8, 9, 3], [8, 5, 6, 7, 3,...",unstable,Failed at Phase 1: not everyone was proposed to.,,2017-12-04 16:46:13.547,10,"[[0, 140.0, 38.0, 150.0, 106.0, 143.0, 119.0, ..."
46,"[[0, 0.66, 0.33, 0.66, 0.66, -1.0], [0.66, 0, ...",9mdkMmj4pY8Q2TwqF,9mdkMmj4pY8Q2TwqF-HrBWHhbs7X6gswjWs,unstable,even,remove,HrBWHhbs7X6gswjWs,"[[4, 2, 5, 3, 6], [6, 5, 3, 1, 4], [2, 5, 6, 4...",unstable,Failed at Phase 1: not everyone was proposed to.,,2016-10-12 19:06:48.944,6,"[[0, 106.0, 78.0, 116.0, 106.0, -95.0], [119.0..."
54,"[[0, 0.0, 0.33, 1.0, 0.33, 1.0, 0.33, -1.0], [...",9mdkMmj4pY8Q2TwqF,9mdkMmj4pY8Q2TwqF-Q2Qffh22oZwWrRCWt,stable,even,remove,Q2Qffh22oZwWrRCWt,"[[4, 6, 3, 5, 7, 2, 8], [3, 6, 1, 7, 4, 5, 8],...",unstable,Failed at Phase 1: not everyone was proposed to.,,2016-08-01 18:55:00.816,8,"[[0, 15.0, 47.0, 117.0, 35.0, 108.0, 34.0, -94..."
79,"[[0, 0.0, 0.33, 1.0, 0.33, -1.0], [0.66, 0, 1....",9mdkMmj4pY8Q2TwqF,9mdkMmj4pY8Q2TwqF-nRAQpsPhsQs4zRvTL,,even,remove,nRAQpsPhsQs4zRvTL,"[[4, 5, 3, 2, 6], [3, 1, 4, 6, 5], [5, 1, 6, 4...",unstable,Failed at Phase 1: not everyone was proposed to.,,2016-08-01 18:55:00.232,6,"[[0, 1.0, 39.0, 104.0, 48.0, -85.0], [84.0, 0,..."
123,"[[0, 0.33, 1.0, 0.33, 0.33, 0.33, 0.33, 0.33, ...",BibLRuKtNNv7QEDqb,BibLRuKtNNv7QEDqb-QLkDKBZ2jTebCA2eP,unstable,odd,remove,QLkDKBZ2jTebCA2eP,"[[3, 5, 4, 7, 2, 9, 6, 8, 11, 10], [7, 5, 9, 6...",unstable,Failed at Phase 1: not everyone was proposed to.,,2017-01-13 02:29:12.488,11,"[[0, 38.0, 114.0, 47.0, 51.0, 35.0, 42.0, 34.0..."
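To make the "not everyone was proposed to" failure concrete, here is a minimal sketch of the proposal phase of Irving's stable roommates algorithm. It is an illustration under stated assumptions, not the `stable_roommates` module's own code: `phase1_proposals` is a hypothetical name, and preferences are assumed to be 0-indexed lists (the `preference_matrix` column above looks 1-indexed). Each person proposes down their list; a recipient keeps only the best proposal received so far and rejects the rest. If someone runs out of people to propose to, another person is left holding no proposal and no stable matching exists.

```python
def phase1_proposals(prefs):
    """Run the proposal phase on 0-indexed preference lists.

    Returns (holds, unproposed): holds[j] is the proposer whose proposal j
    currently holds (or None), and unproposed lists everyone holding none.
    """
    n = len(prefs)
    rank = [{p: r for r, p in enumerate(row)} for row in prefs]
    holds = [None] * n      # best proposal each person has received so far
    next_choice = [0] * n   # position in prefs[i] that i will propose to next
    free = list(range(n))   # people whose proposal is not currently held

    while free:
        i = free.pop()
        while next_choice[i] < len(prefs[i]):
            j = prefs[i][next_choice[i]]
            next_choice[i] += 1
            current = holds[j]
            if current is None:
                holds[j] = i            # j accepts their first proposal
                break
            if rank[j][i] < rank[j][current]:
                holds[j] = i            # j trades up and rejects `current`
                free.append(current)
                break
        # if the inner loop ends without a break, i was rejected by everyone

    unproposed = [j for j in range(n) if holds[j] is None]
    return holds, unproposed


# Persons 0, 1 and 2 prefer each other cyclically and all rank person 3 last,
# so nobody ever proposes to person 3.
cyclic_prefs = [[1, 2, 3], [2, 0, 3], [0, 1, 3], [0, 1, 2]]
print(phase1_proposals(cyclic_prefs))  # ([2, 0, 1, None], [3])
```

Applying a check like this to the rows in `unstable_cases_1` (after converting to 0-indexed lists) would show which participant ends up without a proposal, which is the question raised in the TODO above.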


### Unstable Case 2--Failed at Phase 2: could not find an all-or-nothing cycle len > 3.

In [18]:
# select the Phase 2 failures; the string must match stable_printout exactly
unstable_cases_2 = remove_all_pairings_df[remove_all_pairings_df['stable_printout'] == 'Failed at Phase 2: could not find an all-or-nothing cycle len > 3.']
unstable_cases_2.head()


### Unstable Case 3--Failed at Verification after Phase 2: matching computed, but not valid.

In [19]:
instability_cases_3 = remove_all_pairings_df[remove_all_pairings_df['stable_printout'] == 'Failed at Verification after Phase 2: matching computed, but not valid.']
instability_cases_3

Unnamed: 0,affinity_matrix,group_id,group_pair_id,mwm_stable_unstable,odd_even,odd_handling,pairing_id,preference_matrix,sr_stable_unstable,stable_printout,stable_result,timestamp,user_count,weighted_matrix
307,"[[0, -1.0, 0, 1.0, -1.0, 0.33, 0.33, 0.33, 0.0...",sM3z5FkZfsABqcj3g,sM3z5FkZfsABqcj3g-4u2gDDfdjvSzK9RHa,unstable,even,remove,4u2gDDfdjvSzK9RHa,"[[4, 14, 10, 8, 11, 7, 6, 9, 3, 12, 5, 2, 13],...",unstable,Failed at Verification after Phase 2: matching...,,2018-04-27 20:30:10.250,14,"[[0, -90.0, 61.0, 164.0, -88.0, 94.0, 97.0, 10..."
327,"[[0, 0.66, 0.66, 0.0, -1.0, 0.0, 0.33, 0.0, -1...",sM3z5FkZfsABqcj3g,sM3z5FkZfsABqcj3g-a9t7Jwo3EALvZgFg2,unstable,even,remove,a9t7Jwo3EALvZgFg2,"[[13, 23, 18, 16, 2, 3, 10, 20, 7, 12, 11, 15,...",unstable,Failed at Verification after Phase 2: matching...,,2017-09-27 17:15:59.815,24,"[[0, 77.0, 70.0, 12.0, -82.0, 1.0, 49.0, 14.0,..."
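One plausible reading of "matching computed, but not valid" is that the partner list produced after Phase 2 is not a proper pairing. The sketch below checks that property for a list in the same shape as the `stable_result` column (entry `i` holds the index of `i`'s partner). The function name `is_valid_matching` and the exact validity criteria are assumptions made for illustration; the module's actual verification step is not shown here and may check more, such as stability.

```python
def is_valid_matching(result):
    """Return True if every person has exactly one distinct, mutual partner."""
    n = len(result)
    for person, partner in enumerate(result):
        if not isinstance(partner, int) or not 0 <= partner < n:
            return False  # partner index is missing or out of range
        if partner == person:
            return False  # nobody may be paired with themselves
        if result[partner] != person:
            return False  # pairing must be mutual
    return True


# stable_result of row 3 in the tables above
print(is_valid_matching([3, 4, 7, 0, 1, 6, 5, 2]))  # True
# persons 0 and 1 both claim person 1
print(is_valid_matching([1, 1]))                    # False
```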


## Analyze Specific Pairing Instances