## <font color=green>1. Gale Shapley Algorithm [Stable Matching]
### <font color=black>Introduction


**Gale Shapley Algorithm**
    
The Gale-Shapley (GS) algorithm, also known as the deferred acceptance algorithm, is a constructive algorithm for the "Stable Matching" (SM) problem: given two groups of candidates who rank each other, it pairs them so that no two candidates would both prefer each other to the partners they were assigned. It is an efficient algorithm on a bipartite graph of candidates, with a time complexity of O(N^2), where N is the number of people on each side.
    
The classic stable marriage (SM) problem can be stated as below: Given n men and n women, where each person has ranked all members of the opposite sex with a unique number between 1 and n in order of preference, marry the men and women together such that there are no two people of opposite sex who would both rather have each other than their current partners. If there are no such people, all the marriages are **“stable”**. Gale and Shapley gave their famous
deferred acceptance algorithm in 1962 which always finds a stable matching.

When the preference lists are incomplete and/or contain ties, the problem becomes more challenging (it is called Stable Marriage with Incomplete List and Ties, SMTI). For SMTI, the GS algorithm still finds a stable matching. However, stable matchings can then differ in size, so we are usually interested in finding a stable matching of maximum size. While the matching found by the GS algorithm is at least half the size of the optimal matching, a modified version of the GS algorithm was proposed that guarantees 2/3 of the optimal matching size.

Algorithms for the stable marriage problem have applications in a variety of real-world situations. A well-known example is the assignment of graduating medical students to their first hospital appointments. In 2012, the Nobel Prize in Economics was awarded to Lloyd S. Shapley and Alvin E. Roth for the theory of stable allocations and the practice of market design.

**PROBLEM STATEMENT:**
The Stable Matching problem is classically phrased as choosing suitable marriage partners. The pairing must be such that no two people would both prefer each other to the partners they were given. The algorithm takes two sets of people as input (male and female):

- The first set contains females
- The second set contains males.

    
 Each person in each set has a list of preferences which includes all the people from the opposite set. That is, every woman in the set has a preference list that contains all the men in the other set. Similarly, every man has a preference list that contains all the women in the other set.
    

- We assume that females can be matched only to males and vice versa.
- The two sets are disjoint: no person belongs to both.
- Several people can rank the same person in the same position, e.g. a woman named 'A' may be the first preference of two men, '1' and '2'.
- Preferences need not be reciprocated, i.e. 'A' may be the first preference of '1' without '1' being the first preference of 'A'.

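Under these conditions, we need to find a matching in which no two people would both prefer each other to their assigned partners. Not everyone can receive their first choice, so stability, rather than first choices, is the goal.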

Before solving this problem, we need to choose the proposers' side, i.e. the set of people who will propose to the other set. The resulting matching may depend on this choice, but both results are stable matchings even when they differ.
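
To illustrate how the choice of proposing side can change the result, here is a minimal sketch. The function `gale_shapley` and the two-person instance are hypothetical and are not part of the dataset used later; the sketch assumes complete, strict preference lists and equally sized sets.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return {proposer: receiver}; assumes complete, strict preference lists."""
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next list position to try
    holds = {}                                    # receiver -> proposer she holds
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = holds.get(r)
        if current is None:
            holds[r] = p                  # r was free and accepts tentatively
        elif rank[r][p] < rank[r][current]:
            holds[r] = p                  # r trades up, her old partner is free again
            free.append(current)
        else:
            free.append(p)                # r rejects p, he will try his next choice
    return {p: r for r, p in holds.items()}


# Hypothetical instance: each man's first choice ranks him last, and vice versa.
men = {"1": ["A", "B"], "2": ["B", "A"]}
women = {"A": ["2", "1"], "B": ["1", "2"]}

men_propose = gale_shapley(men, women)    # keyed by man
women_propose = gale_shapley(women, men)  # keyed by woman
print(men_propose, women_propose)
```

In this instance each side obtains its first choices when it proposes, so the two runs pair the same people differently, yet neither result contains a pair who would both rather be together.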

### <font color=black>Algorithm
![alt text](Algo.png "GALE AND SHAPLEY'S ALGORITHM")

### <font color=black>Example

The GS problem can be illustrated with the following example. Consider the image below, in which 5 boys (1,2,3,4,5) are the proposing candidates and 5 girls (A,B,C,D,E) are the candidates who receive the proposals (the roles could also be swapped). Each boy has a list of the girls he wants to partner/dance/play with, ranked from most desirable to least desirable, and each girl has such a list of boys. A girl is acceptable to a boy if she is on his preference list; similarly, a boy is acceptable to a girl if he is on her preference list. Everyone's preferences depend only on their own opinions; there is no jealousy (in economic terms, there are no externalities). No one is forced to dance with anyone who is not on their preference list.

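**A matching:** An outcome that tells everyone in the room who their dance partner is. Formally, a matching is a function μ that maps the set of boys and girls onto itself with μ(μ(x)) = x: each boy is mapped to a girl or to himself (he stays single), and each girl to a boy or to herself.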
    
**Stability:** Suppose a matching has been made but there is a boy and a girl who would both rather dance with each other than with the partners the matching gave them. Then they are said to form a blocking pair. A matching is stable if it has no blocking pairs.
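
The definition of a blocking pair translates almost directly into code. The sketch below is an illustration only: `blocking_pairs` and the small instance are hypothetical names and data, and a matching is assumed to be stored as a `{boy: girl}` dictionary.

```python
def blocking_pairs(matching, boy_prefs, girl_prefs):
    """List every (boy, girl) pair that blocks `matching` ({boy: girl})."""
    partner_of_girl = {g: b for b, g in matching.items()}

    def prefers(prefs, new, current):
        # Someone who is single (current is None) prefers any acceptable partner.
        if new not in prefs:
            return False
        return current is None or prefs.index(new) < prefs.index(current)

    pairs = []
    for boy, prefs in boy_prefs.items():
        for girl in prefs:
            if matching.get(boy) == girl:
                continue  # already partners, cannot block
            if (prefers(prefs, girl, matching.get(boy))
                    and prefers(girl_prefs[girl], boy, partner_of_girl.get(girl))):
                pairs.append((boy, girl))
    return pairs


# Hypothetical instance: everybody ranks '1' and 'A' first.
boy_prefs = {"1": ["A", "B"], "2": ["A", "B"]}
girl_prefs = {"A": ["1", "2"], "B": ["1", "2"]}

unstable = blocking_pairs({"1": "B", "2": "A"}, boy_prefs, girl_prefs)
stable = blocking_pairs({"1": "A", "2": "B"}, boy_prefs, girl_prefs)
print(unstable, stable)
```

In the first matching, boy '1' and girl 'A' prefer each other to their partners, so they form a blocking pair; the second matching has none and is therefore stable.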
    
![alt text](Stable.png "GALE AND SHAPLEY'S ALGORITHM")

    
### <font color=black>Algorithm Walkthrough
    
Step 1. Each boy proposes to his first acceptable choice (if he has any names on his preference list). Each girl who receives an offer rejects all offers except the best acceptable proposal (according to her preference list), which she holds on to.

Step k. Any boy who was rejected at step k − 1 makes a new proposal to his most preferred girl on his list who has not yet rejected him. (If he ran through the names on his list, he makes no more proposals.) Each girl holds her most preferred acceptable proposal to date, and rejects the rest.

The algorithm terminates when there are no more rejections; each girl is then matched with the boy whose proposal she is holding after the last step. Any girl who is not holding an offer and any boy who was rejected by all of his acceptable girls remains single.

Note: In the explanation above, we had boys propose for demonstration purposes, but of course, girls can propose too, and in that case, we would reach a stable matching as well.
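
The steps above can be sketched round by round as follows. This is a minimal illustration under stated assumptions, not the implementation used later in this notebook: `deferred_acceptance` and the three-boy instance are hypothetical, and preference lists may be incomplete.

```python
def deferred_acceptance(boy_prefs, girl_prefs):
    """Round-based deferred acceptance with possibly incomplete lists.

    Returns ({boy: girl}, rounds); anyone missing from the dict stays single.
    """
    next_idx = {boy: 0 for boy in boy_prefs}  # next position to try on each list
    held = {}                                 # girl -> boy whose offer she holds
    proposing = set(boy_prefs)                # boys who propose in the coming round
    rounds = 0
    while True:
        offers = {}                           # girl -> boys proposing this round
        for boy in sorted(proposing):
            if next_idx[boy] < len(boy_prefs[boy]):
                girl = boy_prefs[boy][next_idx[boy]]
                next_idx[boy] += 1
                offers.setdefault(girl, []).append(boy)
        if not offers:                        # nobody has anyone left to ask
            break
        rounds += 1
        proposing = set()
        for girl, boys in offers.items():
            prefs = girl_prefs.get(girl, [])
            candidates = boys + ([held[girl]] if girl in held else [])
            acceptable = [b for b in candidates if b in prefs]
            if acceptable:
                held[girl] = min(acceptable, key=prefs.index)
            # Everyone she does not hold is rejected and proposes again next round.
            proposing.update(b for b in candidates if held.get(girl) != b)
    return {boy: girl for girl, boy in held.items()}, rounds


# Hypothetical instance with three boys and two girls.
boy_prefs = {"1": ["A", "B"], "2": ["A"], "3": ["A", "B"]}
girl_prefs = {"A": ["2", "1", "3"], "B": ["1", "3"]}

matching, rounds = deferred_acceptance(boy_prefs, girl_prefs)
print(matching, rounds)
```

Here all three boys first propose to 'A', who holds '2'. In the second round '1' and '3' propose to 'B', who holds '1'. Boy '3' has then run through his list and remains single.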

### <font color=black>Dataset
For the Stable Matching algorithm, the male candidates are labelled with letters (A, B, etc.) and their potential matches, the female candidates, are labelled with numbers (1, 2, 3, etc.). Each row of the tables lists a candidate's preferences from most preferred (P1) to least preferred (P10). The dataset can be extended with more candidates and more preferences, including cases where a candidate is not able to find his/her **stable match**.
    
**Male Preferences**
    
| Male Candidate | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 |
| :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
| A | 10 | 8  | 9  | 4 | 5 | 1  | 7 | 3  | 2  | 6  |
| B | 6  | 5  | 3  | 8 | 9 | 7  | 1 | 4  | 10 | 2  |
| C | 2  | 1  | 8  | 9 | 5 | 7  | 4 | 10 | 6  | 3  |
| D | 1  | 2  | 7  | 3 | 6 | 8  | 9 | 10 | 4  | 5  |
| E | 4  | 1  | 2  | 3 | 5 | 6  | 9 | 8  | 7  | 10 |
| F | 1  | 2  | 3  | 8 | 4 | 9  | 5 | 7  | 6  | 10 |
| G | 5  | 10 | 4  | 3 | 8 | 1  | 9 | 7  | 6  | 2  |
| H | 9  | 1  | 2  | 6 | 3 | 10 | 4 | 8  | 7  | 5  |
| I | 9  | 7  | 10 | 1 | 4 | 3  | 6 | 5  | 8  | 2  |
| J | 1  | 4  | 10 | 3 | 5 | 8  | 6 | 7  | 9  | 2  |
   

**Female Preferences**

| Female Candidate | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 |
| :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
| 1  | B | E  | J  | C | D | G  | F | A  | H  | I  |
| 2  | J | A  | B  | C | E | G  | F | D  | H  | I  |
| 3  | A | C  | E  | F | I | D  | G | H  | J  | B  |
| 4  | H | E  | I  | F | D | C  | B | J  | G  | A  |    
| 5  | C | I  | B  | D | E | A  | H | J  | G  | F  |
| 6  | J | G  | D  | F | H | B  | A | E  | C  | I  |
| 7  | E | B  | D  | H | G | F  | J | A  | I  | C  |
| 8  | D | F  | E  | H | G | B  | C | J  | A  | I  |
| 9  | G | A  | I  | C | J | B  | E | F  | H  | D  |
| 10 | B | C  | A  | D | E | I  | H | J  | F  | G  |
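
Because each row of these tables must rank every member of the opposite set exactly once, it can help to validate a transcription before running the algorithm. The helper below is a hypothetical sketch (`validate_preferences` is not part of any package used here), and the two-row input is made up to show a duplicated entry.

```python
def validate_preferences(prefs, other_side):
    """Return a list of problems; an empty list means every row is a full ranking."""
    expected = set(other_side)
    problems = []
    for person, ranking in prefs.items():
        duplicated = sorted({x for x in ranking if ranking.count(x) > 1})
        missing = sorted(expected - set(ranking))
        unknown = sorted(set(ranking) - expected)
        if duplicated:
            problems.append(f"{person}: duplicated {duplicated}")
        if missing:
            problems.append(f"{person}: missing {missing}")
        if unknown:
            problems.append(f"{person}: unknown {unknown}")
    return problems


# Hypothetical rows: 'B' lists '1' twice and therefore never ranks '2'.
sample = {"A": ["3", "1", "2"], "B": ["1", "1", "3"]}
problems = validate_preferences(sample, ["1", "2", "3"])
print(problems)
```

A row with a repeated entry would make a proposer ask the same person twice and never reach the person who was left out, so catching it early avoids a silently wrong matching.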


### <font color=black>Program Code

We show two different approaches to coding the GS algorithm, both run on the dataset above. The first is our own Python implementation, based on basic data structures and the GS algorithm discussed above. The second wraps the algorithm in a class inside a reusable package called **'GSAlgo'**, which lets us run the Stable Matching algorithm on a custom dataset.

**Method 1**: Using traditional data structures and algorithms

In [2]:
import collections
from datetime import datetime

MalePrefers = {
    'A': ['10', '8', '9', '4','5','1','7','3','2','6'],
    'B': ['6', '5', '3', '8','9','7','1','4','10','2'],
    'C': ['2', '1', '8', '9','5','7','4','10','6','3'],
    'D': ['1', '2', '7', '3','6','8','9','10','4','5'],
    'E': ['4', '1', '2', '3','5','6','9','8','7','10'],
    'F': ['1', '2', '3', '8','4','9','5','7','6','10'],
    'G': ['5', '10', '4', '3','8','1','9','7','6','2'],
    'H': ['9', '1', '2', '6','3','10','4','8','7','5'],
    'I': ['9', '7', '10', '1','4','3','6','5','8','2'],
    'J': ['1', '4', '10', '3','5','8','6','7','9','2']
}

FemalePrefers = {
    '1':  ['B', 'E', 'J', 'C', 'D', 'G', 'F', 'A', 'H', 'I'],
    '2':  ['J', 'A', 'B', 'C', 'E', 'G', 'F', 'D', 'H', 'I'],
    '3':  ['A', 'C', 'E', 'F', 'I', 'D', 'G', 'H', 'J', 'B'],
    '4':  ['H', 'E', 'I', 'F', 'D', 'C', 'B', 'J', 'G', 'A'],
    '5':  ['C', 'I', 'B', 'D', 'E', 'A', 'H', 'J', 'G', 'F'],
    '6':  ['J', 'G', 'D', 'F', 'H', 'B', 'A', 'E', 'C', 'I'],
    '7':  ['E', 'B', 'D', 'H', 'G', 'F', 'J', 'A', 'I', 'C'],
    '8':  ['D', 'F', 'E', 'H', 'G', 'B', 'C', 'J', 'A', 'I'],
    '9':  ['G', 'A', 'I', 'C', 'J', 'B', 'E', 'F', 'H', 'D'],
    '10': ['B', 'C', 'A', 'D', 'E', 'I', 'H', 'J', 'F', 'G']
}

In [3]:
tentative_matchings = [] #Tentative [male, female] pairs; a pair can still be broken by a better proposal
unmatched_men = [] #Male candidates who currently have no tentative partner

def init_unmatched_men():
    #Initially every male candidate is unmatched
    for male_candidate in MalePrefers:
        unmatched_men.append(male_candidate)

def Stablematching(male_candidate): #Let one unmatched man propose down his list until a woman accepts him
    for female_candidate in MalePrefers[male_candidate]:
        #taken holds the tentative pair that contains this woman (an empty list if she is free)
        taken = [matched for matched in tentative_matchings if female_candidate in matched]

        if (len(taken) == 0): #She is free: match the male and female candidates tentatively
            tentative_matchings.append([male_candidate, female_candidate])
            unmatched_men.remove(male_candidate)
            break

        #She already holds a proposal, so compare both men by their position in her preference list
        else:
            current_match = FemalePrefers[female_candidate].index(taken[0][0])
            stable_match = FemalePrefers[female_candidate].index(male_candidate)

            #A lower index means more preferred
            #If she prefers her current partner, she rejects the proposer and he tries the next woman on his list
            if (current_match > stable_match):
                #She prefers the new proposer, so he is no longer unmatched
                unmatched_men.remove(male_candidate)
                #Her previous partner is left without a partner, so add him back to the unmatched list
                unmatched_men.append(taken[0][0])
                #Update her tentative pair in place
                taken[0][0] = male_candidate
                break

def stable_matching():
    '''Run proposal passes until no man is left unmatched'''
    start_time = datetime.now()
    cnt = 0
    while (len(unmatched_men) > 0):
        cnt += 1
        #Note: Stablematching() edits unmatched_men during this loop, so a pass can skip some men;
        #the enclosing while loop picks them up in a later pass
        for male_candidate in unmatched_men:
            Stablematching(male_candidate)
    end_time = datetime.now()
    print('Duration: {}'.format(end_time - start_time))
    print('Total Iterations: ' + str(cnt))

In [4]:
def main():
    init_unmatched_men()
    stable_matching()
    print(tentative_matchings)

main()

Duration: 0:00:00.000061
Total Iterations: 6
[['A', '10'], ['C', '2'], ['E', '4'], ['B', '5'], ['I', '9'], ['H', '6'], ['J', '1'], ['D', '7'], ['F', '3'], ['G', '8']]
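
Method 1 rescans `tentative_matchings` and calls `list.index` for every proposal. As a point of comparison, the sketch below precomputes each woman's ranking of the men and keeps the free men in a queue, so that each proposal takes constant time and the O(N^2) bound on the number of proposals carries over to the running time. It is a hedged alternative, not the code that produced the output above; `gale_shapley_ranked`, `MALE_TABLE` and `FEMALE_TABLE` are hypothetical names, and the rows are copied from the tables in the Dataset section.

```python
from collections import deque

# Preference tables copied from the Dataset section (most preferred first).
MALE_TABLE = {
    "A": "10 8 9 4 5 1 7 3 2 6", "B": "6 5 3 8 9 7 1 4 10 2",
    "C": "2 1 8 9 5 7 4 10 6 3", "D": "1 2 7 3 6 8 9 10 4 5",
    "E": "4 1 2 3 5 6 9 8 7 10", "F": "1 2 3 8 4 9 5 7 6 10",
    "G": "5 10 4 3 8 1 9 7 6 2", "H": "9 1 2 6 3 10 4 8 7 5",
    "I": "9 7 10 1 4 3 6 5 8 2", "J": "1 4 10 3 5 8 6 7 9 2",
}
FEMALE_TABLE = {
    "1": "B E J C D G F A H I", "2": "J A B C E G F D H I",
    "3": "A C E F I D G H J B", "4": "H E I F D C B J G A",
    "5": "C I B D E A H J G F", "6": "J G D F H B A E C I",
    "7": "E B D H G F J A I C", "8": "D F E H G B C J A I",
    "9": "G A I C J B E F H D", "10": "B C A D E I H J F G",
}
male_prefs = {m: row.split() for m, row in MALE_TABLE.items()}
female_prefs = {w: row.split() for w, row in FEMALE_TABLE.items()}


def gale_shapley_ranked(proposer_prefs, receiver_prefs):
    """Return ({proposer: receiver}, number_of_proposals) for complete lists."""
    # rank[w][m] = position of m in w's list, so comparing two men is O(1)
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in receiver_prefs.items()}
    next_choice = dict.fromkeys(proposer_prefs, 0)
    engaged_to = {}                      # woman -> man she currently holds
    free = deque(proposer_prefs)
    proposals = 0
    while free:
        man = free.popleft()
        woman = proposer_prefs[man][next_choice[man]]
        next_choice[man] += 1
        proposals += 1
        rival = engaged_to.get(woman)
        if rival is None:
            engaged_to[woman] = man
        elif rank[woman][man] < rank[woman][rival]:
            engaged_to[woman] = man      # she trades up, the rival is free again
            free.append(rival)
        else:
            free.append(man)             # rejected, he will try his next choice
    return {m: w for w, m in engaged_to.items()}, proposals


matching, proposals = gale_shapley_ranked(male_prefs, female_prefs)
print(sorted(matching.items()), proposals)
```

On this dataset the sketch pairs the same people as the output of Method 1, which is expected because the men-proposing result does not depend on the order in which free men propose.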


**Method 2**: Using a user-defined package 'GSAlgo'

In [5]:
from GSAlgo import StableMatchingModel


In [7]:
start_time = datetime.now()
model = StableMatchingModel(MalePrefers, FemalePrefers)
mu = model.Deferred_Acceptance()
end_time = datetime.now()
print('Duration: {}'.format(end_time - start_time))
mu


This user-defined package is useful for visualizing the GS algorithm round by round. It can print every round the algorithm went through to reach a stable matching between the candidates:

In [8]:
model.Deferred_Acceptance(print_rounds=True);


Similarly, we can use the package to print the tentative matching after each round:

In [8]:
model.Deferred_Acceptance(print_tentative_matchings=True);

Tentative matching after Round 1:
{'A': '10', 'B': '6', 'C': '2', 'J': '1', 'E': '4', 'G': '5', 'I': '9'}
Tentative matching after Round 2:
{'J': '1', 'C': '2', 'A': '10', 'B': '6', 'E': '4', 'G': '5', 'I': '9'}
Tentative matching after Round 3:
{'C': '2', 'F': '3', 'D': '7', 'J': '1', 'A': '10', 'B': '6', 'E': '4', 'G': '5', 'I': '9'}
Tentative matching after Round 4:
{'H': '6', 'C': '2', 'F': '3', 'D': '7', 'J': '1', 'A': '10', 'E': '4', 'G': '5', 'I': '9'}
Tentative matching after Round 5:
{'B': '5', 'H': '6', 'C': '2', 'F': '3', 'D': '7', 'J': '1', 'A': '10', 'E': '4', 'I': '9'}
Tentative matching after Round 6:
{'A': '10', 'B': '5', 'H': '6', 'C': '2', 'F': '3', 'D': '7', 'J': '1', 'E': '4', 'I': '9'}
Tentative matching after Round 7:
{'E': '4', 'A': '10', 'B': '5', 'H': '6', 'C': '2', 'F': '3', 'D': '7', 'J': '1', 'I': '9'}
Tentative matching after Round 8:
{'F': '3', 'E': '4', 'A': '10', 'B': '5', 'H': '6', 'C': '2', 'D': '7', 'J': '1', 'I': '9'}


The algorithm can also handle incomplete preference lists, i.e. examples in which someone may have to end up unmatched in a stable matching.

In [9]:
FemalePrefers['3'] =  ['I', 'E']

model = StableMatchingModel(MalePrefers, FemalePrefers)
mu_2 = model.Deferred_Acceptance()
mu_2

{'I': '3',
 'E': '4',
 'J': '1',
 'A': '10',
 'D': '7',
 'G': '9',
 'F': '8',
 'B': '5',
 'H': '6',
 'C': '2'}

### <font color=black>Observation 

**Random paths to stability**

The Gale-Shapley algorithm assumes a centralized authority that collects everyone's preference lists and uses them to output a stable matching.

Question: Can a group of men and women reach a stable outcome if they match up in a decentralized way? In other words, if men and women date each other, break up, date another, break up, etc. out on their own, can they reach a stable matching eventually?

Roth and Vande Vate (1990) answered this question in the affirmative. They proved that, starting from any unstable matching, there exists a path to a stable matching. This result suggests that stable matchings are a natural converging point for two-sided matching problems. Note that we cannot tell in advance which stable matching will be reached in this decentralized setting, whereas the Gale-Shapley algorithm always reaches the same stable matching for a given proposing side.
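
The decentralized process can be simulated by repeatedly letting one randomly chosen blocking pair get together, which leaves their former partners single. The sketch below is an illustration under stated assumptions: `blocking_pairs`, `random_path` and the three-couple instance are hypothetical, preference lists are complete, and the step limit is an arbitrary safeguard.

```python
import random


def blocking_pairs(matching, men, women):
    """All (man, woman) pairs who both prefer each other to their situation."""
    husband = {w: m for m, w in matching.items()}

    def better(prefs, new, current):
        # Being single (current is None) is worse than any listed partner.
        return current is None or prefs.index(new) < prefs.index(current)

    return [(m, w) for m in men for w in men[m]
            if matching.get(m) != w
            and better(men[m], w, matching.get(m))
            and better(women[w], m, husband.get(w))]


def random_path(matching, men, women, rng, max_steps=10_000):
    """Satisfy randomly chosen blocking pairs until none is left."""
    matching = dict(matching)
    for step in range(max_steps):
        blocks = blocking_pairs(matching, men, women)
        if not blocks:
            return matching, step
        m, w = rng.choice(blocks)
        # w leaves her current partner (if any); m leaves his by being reassigned.
        matching = {a: b for a, b in matching.items() if b != w}
        matching[m] = w
    return None, max_steps


# Hypothetical instance and an unstable starting point ('m2' and 'w3' block it).
men = {"m1": ["w1", "w2", "w3"], "m2": ["w2", "w3", "w1"], "m3": ["w3", "w1", "w2"]}
women = {"w1": ["m2", "m3", "m1"], "w2": ["m3", "m1", "m2"], "w3": ["m1", "m2", "m3"]}
start = {"m1": "w2", "m2": "w1", "m3": "w3"}

final, steps = random_path(start, men, women, random.Random(0))
print(final, steps)
```

Which stable matching the simulation ends in depends on the random choices, which mirrors the remark that the outcome of the decentralized process is not known in advance.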

### <font color=black>Conclusion

- Among all stable matchings, the GS algorithm finds the one that is best for the people on the proposing side and worst for the people on the receiving side.
- It is strategy-proof for the proposers, i.e. no proposer benefits by unilaterally submitting a false preference list.
- It is therefore always in the best interest of a proposer to submit their true preference list.
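
The first point can be checked by brute force on a small instance: enumerate every perfect matching, keep the stable ones, and compare them with the men-proposing result. The sketch below is hypothetical (the instance and all function names are made up for illustration) and only practical for very small inputs, since it tries every permutation.

```python
from itertools import permutations

men = {"m1": ["w1", "w2", "w3"], "m2": ["w2", "w3", "w1"], "m3": ["w3", "w1", "w2"]}
women = {"w1": ["m2", "m3", "m1"], "w2": ["m3", "m1", "m2"], "w3": ["m1", "m2", "m3"]}


def is_stable(matching):
    """True if no man and woman both prefer each other to their partners."""
    husband = {w: m for m, w in matching.items()}
    for m, prefs in men.items():
        for w in prefs[:prefs.index(matching[m])]:   # women m prefers to his partner
            if women[w].index(m) < women[w].index(husband[w]):
                return False
    return True


def gale_shapley(proposer_prefs, receiver_prefs):
    """Proposer-side deferred acceptance for complete, strict lists."""
    next_choice = dict.fromkeys(proposer_prefs, 0)
    holds, free = {}, list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = holds.get(r)
        if current is None:
            holds[r] = p
        elif receiver_prefs[r].index(p) < receiver_prefs[r].index(current):
            holds[r] = p
            free.append(current)
        else:
            free.append(p)
    return {p: r for r, p in holds.items()}


# Every perfect matching is a permutation of the women assigned to the men.
candidates = [dict(zip(men, perm)) for perm in permutations(women)]
stable = [c for c in candidates if is_stable(c)]
gs = gale_shapley(men, women)

best_for_men = {m: min((s[m] for s in stable), key=men[m].index) for m in men}
worst_for_women = {}
for w in women:
    partners = [m for s in stable for m in s if s[m] == w]
    worst_for_women[w] = max(partners, key=women[w].index)

print(len(stable), gs == best_for_men)
```

In this instance there are three stable matchings, and the men-proposing run gives every man his most preferred stable partner while every woman receives her least preferred one.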

### <font color=black>References
1. SNAP - Stanford Network Analysis Project [link](https://snap.stanford.edu/papers.html)
2. John P Dickerson (CMU) - Stable Matching [link](http://www.cs.cmu.edu/~arielpro/15896s16/slides/896s16-16.pdf)
3. Yilong Geng & Mingyu Gao (Stanford) - Stable Marriage using Spark [link](https://stanford.edu/~rezab/classes/cme323/S15/projects/stable_marriage_spark_report.pdf)
4. Kevin Wayne (Princeton University) - Stable Matching lecture slides for *Algorithm Design* by Jon Kleinberg and Éva Tardos [link](https://www.cs.princeton.edu/~wayne/kleinberg-tardos/pdf/01StableMatching.pdf)
5. Asynchronous Self-Stabilizing Stable Marriage (Current Research in the GS Algo) [link](https://tel.archives-ouvertes.fr/tel-03068501/document)

#### 500 Node Dataset

#### <div align="center">-X-X- 

### <font color=black>Appendix

In [10]:
#A simple randomize method to generate the above dataset (Female Prefers)
from random import randint

def randomize (arr, n):
    for i in range(n-1,0,-1):
        j = randint(0,i) #randint includes both ends, so i+1 could index past the end of the list
        arr[i],arr[j] = arr[j],arr[i]
    return arr

arr = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
n = len(arr)
print(randomize(arr, n))

['J', 'A', 'D', 'B', 'E', 'G', 'H', 'C', 'I', 'F']


In [11]:
#A simple randomize method to generate the above dataset (Male Prefers)
from random import randint

def randomize (arr, n):
    for i in range(n-1,0,-1):
        j = randint(0,i) #randint includes both ends, so i+1 could index past the end of the list
        arr[i],arr[j] = arr[j],arr[i]
    return arr

arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
n = len(arr)
print(randomize(arr, n))

[10, 5, 9, 2, 6, 7, 4, 8, 3, 1]


In [59]:
#Code to Generate a Dataframe to get a dataset with node sizes 500, 1k, 1.5k, 2k, etc.
import numpy as np
import pandas as pd

#Row labels F1..F500 for the female candidates
F_cand = ['F'+str(i) for i in range(1, 501)]

#Column labels M1..M500 for the male candidates
M_cand = ['M'+str(i) for i in range(1, 501)]

df = pd.DataFrame(index = F_cand,columns = M_cand)
dataset_name = str(len(M_cand))+'_nodes.csv'
df.to_csv(dataset_name)

print(df)

       M1   M2   M3   M4   M5   M6   M7   M8   M9  M10  ... M491 M492 M493  \
F1    NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN   
F2    NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN   
F3    NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN   
F4    NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN   
F5    NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN   
...   ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...   
F496  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN   
F497  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN   
F498  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN   
F499  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN   
F500  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  ...  NaN  NaN  NaN   

     M494 M495 M496 M497 M498 M499 M500  
F1    NaN  NaN  NaN  
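
The frame above only contains the row and column labels. One possible way to fill a dataset of this size with actual rankings is sketched below, assuming that uniformly random preference lists are acceptable for testing. `random_preferences` is a hypothetical helper, and the seeds are arbitrary values chosen only to make the run repeatable.

```python
import random


def random_preferences(own_side, other_side, seed=None):
    """Give every person in own_side a random full ranking of other_side."""
    rng = random.Random(seed)
    # rng.sample with k equal to the list length returns a shuffled copy.
    return {person: rng.sample(other_side, k=len(other_side))
            for person in own_side}


males = [f"M{i}" for i in range(1, 501)]
females = [f"F{i}" for i in range(1, 501)]

male_prefs = random_preferences(males, females, seed=1)
female_prefs = random_preferences(females, males, seed=2)

print(len(male_prefs), len(male_prefs["M1"]), male_prefs["M1"][:5])
```

The resulting dictionaries have the same shape as `MalePrefers` and `FemalePrefers`, so they could be passed to either method shown earlier.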

In [1]:
from random import randint

def randomize (arr, n):
    for i in range(n-1,0,-1):
        j = randint(0,i) #randint includes both ends, so i+1 could index past the end of the list
        arr[i],arr[j] = arr[j],arr[i]
    return arr

n = len(M_cand)

to_append = randomize(M_cand, n)
df_length = len(df)
df.loc[df_length] = to_append


In [None]:
#Additional different attempt to generate a dataset 
#not used for the testing
import numpy as np
import pandas as pd


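Alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
#Iterating over the string '12345678910' would split '10' into '1' and '0', so build the labels explicitly
Numeric = [str(i) for i in range(1, 11)]
#Labels 'A1', 'A2', ..., 'Z10' built with a list comprehension
squares = [r+c for r in Alpha for c in Numeric]
print(len(squares))
print(squares)


variables = np.arange(1, 500, 1).tolist()
#Empty frame with one row per label and columns 1..499 (a flat list cannot fill 499 columns)
pd.DataFrame(index=squares, columns=variables)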


#Additional different attempt to generate a dataset 
#not used for the testing
#Map 'A'..'Z' to 1..26; the outer loop in the earlier version rebuilt the same dict on every pass
dict_test = {chr(i+64):i for i in range(1,27)}

print(dict_test)