# Jaccard Coefficient Calculation for Pathological Test Results

This notebook calculates the Jaccard coefficient for the following pairs of individuals based on their pathological test results: (Jack, Mary), (Jack, Jim), (Jim, Mary).

In [None]:
import pandas as pd

# Data
data = {
    'Name': ['Jack', 'Mary', 'Jim'],
    'Gender': ['M', 'F', 'M'],
    'Fever': ['Y', 'Y', 'Y'],
    'Cough': ['N', 'N', 'P'],
    'Test-1': ['P', 'P', 'N'],
    'Test-2': ['N', 'A', 'N'],
    'Test-3': ['N', 'P', 'N'],
    'Test-4': ['A', 'N', 'A']
}

# Use Name as the row index so individuals can be looked up with df.loc[name]
df = pd.DataFrame(data).set_index('Name')
df

## Jaccard Coefficient Function
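The Jaccard coefficient of two sets is the size of their intersection divided by the size of their union. The function below compares two individuals feature by feature; Name is the index and Gender is excluded. The numerator counts the features on which both individuals have the same value, and the denominator is the total number of features compared. Because every compared feature counts toward the denominator, the result is the fraction of features on which the two individuals agree, which is also known as the simple matching coefficient.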

In [None]:
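def jaccard_coefficient(row1, row2):
    """Return the fraction of non-Gender features on which two rows agree."""
    # Gender is the first column once Name is the index, so skip it by position.
    features1 = row1.iloc[1:]
    features2 = row2.iloc[1:]
    # Numerator: features where both individuals have the same value.
    intersection = sum(f1 == f2 for f1, f2 in zip(features1, features2))
    # Denominator: every compared feature.
    union = len(features1)
    return intersection / union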
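
For comparison, here is a minimal sketch of a Jaccard coefficient that counts only positive findings: the number of features positive for both individuals divided by the number positive for at least one. It rests on an assumption the notebook does not state: that `Y` and `P` mark a positive finding and that `N` and `A` are negatives. The names `POSITIVE`, `jaccard_asymmetric`, and `records` are hypothetical and used only for illustration.

```python
import pandas as pd

# ASSUMPTION (not stated in the notebook): 'Y' and 'P' are positive findings;
# every other code ('N', 'A') is treated as negative.
POSITIVE = {'Y', 'P'}


def jaccard_asymmetric(row1, row2, positive=POSITIVE):
    """Jaccard over positive findings: |both positive| / |either positive|."""
    a = row1.isin(positive)
    b = row2.isin(positive)
    both = (a & b).sum()     # features positive for both individuals
    either = (a | b).sum()   # features positive for at least one individual
    # Return 0.0 when neither individual has any positive finding.
    return float(both / either) if either else 0.0


# Same feature values as the notebook's table, with Gender left out.
records = pd.DataFrame(
    {
        'Fever': ['Y', 'Y', 'Y'],
        'Cough': ['N', 'N', 'P'],
        'Test-1': ['P', 'P', 'N'],
        'Test-2': ['N', 'A', 'N'],
        'Test-3': ['N', 'P', 'N'],
        'Test-4': ['A', 'N', 'A'],
    },
    index=['Jack', 'Mary', 'Jim'],
)

for a, b in [('Jack', 'Mary'), ('Jack', 'Jim'), ('Jim', 'Mary')]:
    score = jaccard_asymmetric(records.loc[a], records.loc[b])
    print(f'({a}, {b}): {score:.2f}')
```

Under the stated assumption, shared negatives no longer raise the score, so (Jack, Mary) comes out at 2/3, (Jack, Jim) at 1/3, and (Jim, Mary) at 1/4.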


## Calculate Jaccard Coefficients for the Pairs


In [None]:
pairs = [('Jack', 'Mary'), ('Jack', 'Jim'), ('Jim', 'Mary')]
for a, b in pairs:
    coef = jaccard_coefficient(df.loc[a], df.loc[b])
    print(f'Jaccard coefficient for ({a}, {b}): {coef:.2f}')
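
The pairwise loop above can also be laid out as a symmetric score table, which is easier to scan when there are more than three individuals. The sketch below recomputes the same matching fraction with an element-wise comparison; the names `features`, `matching_fraction`, and `scores` are hypothetical, and the table repeats the notebook's data with Gender already dropped.

```python
from itertools import combinations

import pandas as pd

# Same feature values as the notebook's table, with Gender left out.
features = pd.DataFrame(
    {
        'Fever': ['Y', 'Y', 'Y'],
        'Cough': ['N', 'N', 'P'],
        'Test-1': ['P', 'P', 'N'],
        'Test-2': ['N', 'A', 'N'],
        'Test-3': ['N', 'P', 'N'],
        'Test-4': ['A', 'N', 'A'],
    },
    index=['Jack', 'Mary', 'Jim'],
)


def matching_fraction(row1, row2):
    """Fraction of features on which the two rows hold the same value."""
    return float((row1 == row2).mean())


names = list(features.index)
# Start from 1.0 everywhere: each individual matches itself on every feature.
scores = pd.DataFrame(1.0, index=names, columns=names)
for a, b in combinations(names, 2):
    value = matching_fraction(features.loc[a], features.loc[b])
    scores.loc[a, b] = value
    scores.loc[b, a] = value  # the score is symmetric

print(scores.round(2))
```

Each off-diagonal entry equals the value printed by the loop for that pair: 3/6 for (Jack, Mary), 4/6 for (Jack, Jim), and 1/6 for (Jim, Mary).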
