# Choosing the Deployment Candidate

### Background
A small analytics startup has been benchmarking several prediction models.
For each candidate, they recorded overall accuracy, average response time, a fairness penalty score (lower is better), and how many training samples were used.
They now need to decide which models are actually acceptable for production, and among those, which ones are most attractive to deploy.
The team has some “hard rules” and then a softer scoring idea, but nothing is coded yet.
They’ve asked you to build a small decision helper on top of their benchmark summary.

### Tasks

1. From the list of candidates, filter out any model that fails any of these hard requirements:

    * accuracy must be at least 0.85

    * latency must be at most 120 ms

    * fairness penalty must be at most 0.10

    * training samples must be at least 10,000

2. For the remaining models, compute a simple “deployment score” that rewards accuracy and penalizes latency and fairness issues.
You can design the exact formula yourself, but it should:

    * increase when accuracy is higher

    * decrease when latency or fairness penalty are higher

3. Rank the acceptable models by this score, from best to worst, and print a small summary of the top three.

4. Identify any dominated models among the acceptable ones: a model A is dominated by B if B is at least as good in all metrics (accuracy higher or equal, latency lower or equal, fairness penalty lower or equal, training samples higher or equal) and strictly better in at least one.
Print any such dominated pairs.
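
One way to encode the four hard requirements from task 1 is a small table of rules, so that every failed rule is reported for a model rather than only one. The sketch below is an illustration, not a required design: `HARD_RULES` and `failed_rules` are hypothetical names, and the record `X1` is made up to show a model failing two rules at once (M4 and M6 are taken from the benchmark summary).

```python
# Hypothetical rule table: (label, metric key, predicate the value must satisfy)
HARD_RULES = [
    ("accuracy", "accuracy", lambda v: v >= 0.85),
    ("latency", "latency_ms", lambda v: v <= 120),
    ("fairness", "fairness_penalty", lambda v: v <= 0.10),
    ("training samples", "train_samples", lambda v: v >= 10_000),
]


def failed_rules(model):
    """Return the labels of every hard rule the model fails (empty list = acceptable)."""
    return [label for label, key, ok in HARD_RULES if not ok(model[key])]


# M4 and M6 come from the benchmark summary; X1 is a made-up record
# included only to demonstrate multi-failure reporting.
sample = [
    {"name": "M4", "accuracy": 0.91, "latency_ms": 118,
     "fairness_penalty": 0.09, "train_samples": 9000},
    {"name": "M6", "accuracy": 0.92, "latency_ms": 100,
     "fairness_penalty": 0.04, "train_samples": 20000},
    {"name": "X1", "accuracy": 0.80, "latency_ms": 150,
     "fairness_penalty": 0.05, "train_samples": 30000},
]

for model in sample:
    failures = failed_rules(model)
    status = "accepted" if not failures else "rejected: " + ", ".join(failures)
    print(f"{model['name']}: {status}")
```

Keeping the thresholds in one table means a changed requirement is a one-line edit, at the cost of slightly less explicit control flow than a chain of `if`/`elif` checks.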

In [78]:
CANDIDATES = [
    {
        "name": "M1",
        "accuracy": 0.86,
        "latency_ms": 110,
        "fairness_penalty": 0.05,
        "train_samples": 12000,
    },
    {
        "name": "M2",
        "accuracy": 0.83,
        "latency_ms": 95,
        "fairness_penalty": 0.03,
        "train_samples": 18000,
    },
    {
        "name": "M3",
        "accuracy": 0.89,
        "latency_ms": 140,
        "fairness_penalty": 0.02,
        "train_samples": 25000,
    },
    {
        "name": "M4",
        "accuracy": 0.91,
        "latency_ms": 118,
        "fairness_penalty": 0.09,
        "train_samples": 9000,
    },
    {
        "name": "M5",
        "accuracy": 0.88,
        "latency_ms": 105,
        "fairness_penalty": 0.11,
        "train_samples": 16000,
    },
    {
        "name": "M6",
        "accuracy": 0.92,
        "latency_ms": 100,
        "fairness_penalty": 0.04,
        "train_samples": 20000,
    },
    {
        "name": "M7",
        "accuracy": 0.85,
        "latency_ms": 115,
        "fairness_penalty": 0.08,
        "train_samples": 15000,
    },
]

In [112]:
def model_selection(candidates):
    """Filter candidates by the hard requirements, then score, rank, and compare the survivors."""
    requirements_passed = []

    print('Benchmark tests')
    print('---------------\n')

    # Each rejected model is reported under the first requirement it fails.
    # Double-quoted f-strings keep the inner ['key'] lookups valid before Python 3.12.
    for i in candidates:
        # Accuracy test
        if i['accuracy'] < 0.85:
            print(f"Model {i['name']} did not pass the accuracy test.\nAccuracy = {i['accuracy']}\n")

        # Latency test
        elif i['latency_ms'] > 120:
            print(f"Model {i['name']} did not pass the latency test.\nLatency = {i['latency_ms']} ms\n")

        # Fairness test
        elif i['fairness_penalty'] > 0.1:
            print(f"Model {i['name']} did not pass the fairness test.\nFairness penalty = {i['fairness_penalty']}\n")

        # Training test
        elif i['train_samples'] < 10000:
            print(f"Model {i['name']} did not pass the training sample test.\nTraining samples = {i['train_samples']}\n")

        else:
            requirements_passed.append(i)

    print('\nComputing model scores')
    print('----------------------\n')
    
    # Nothing to score or compare if every candidate failed a hard requirement
    if not requirements_passed:
        print('No models passed the hard requirements.')
        return

    # Averages over the accepted models, used to normalize each metric
    n_passed = len(requirements_passed)
    accuracy_avg = sum(m['accuracy'] for m in requirements_passed) / n_passed
    latency_avg = sum(m['latency_ms'] for m in requirements_passed) / n_passed
    fairness_avg = sum(m['fairness_penalty'] for m in requirements_passed) / n_passed
    sample_avg = sum(m['train_samples'] for m in requirements_passed) / n_passed

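    # Weights, applied as exponents to the normalized metrics:
    # accuracy and fairness carry most of the influence, latency and sample count less
    w_acc = 0.8
    w_samples = 0.2
    w_lat = 0.2
    w_fair = 0.8

    model_scores = {}

    for i in requirements_passed:

        # Each metric relative to the mean of the accepted models (1.0 = average)
        acc_ratio      = i['accuracy'] / accuracy_avg
        lat_ratio      = i['latency_ms'] / latency_avg
        fair_ratio     = i['fairness_penalty'] / fairness_avg
        samples_ratio  = i['train_samples'] / sample_avg

        # Weighted multiplicative score: higher accuracy and sample count raise it,
        # higher latency and fairness penalty lower it.
        # Assumes latency and fairness penalty are strictly positive (no division by zero).
        score = (
            (acc_ratio ** w_acc)
            * ((1 / lat_ratio) ** w_lat)
            * ((1 / fair_ratio) ** w_fair)
            * (samples_ratio ** w_samples)
        )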

        model_scores[i['name']] = score

        print(f"Model: {i['name']} | Score: {score:.4f}")

    sorted_scores = sorted(model_scores.items(), key=lambda x: x[1], reverse=True)
    req_dict = {m['name']: m for m in requirements_passed}

    

    print('\nModel report')
    print('------------\n')

    print('Top 3 models:')
    # Slicing to [:3] keeps the report working when fewer than three models pass
    for rank, (name, score) in enumerate(sorted_scores[:3], start=1):
        model = req_dict[name]
        print(f'\t {rank}. {name}')
        print(f'\t\t Score = {score}')
        print(f"\t\t Accuracy = {model['accuracy']}")
        print(f"\t\t Latency = {model['latency_ms']}")
        print(f"\t\t Fairness penalty = {model['fairness_penalty']}")
        print(f"\t\t Training samples = {model['train_samples']}")



    def dominates(b, a):
        """
        Return True if model b dominates model a.
        b is better or equal in all metrics,
        and strictly better in at least one.
        """
        better_or_equal = (
            b['accuracy']        >= a['accuracy'] and
            b['train_samples']   >= a['train_samples'] and
            b['latency_ms']      <= a['latency_ms'] and
            b['fairness_penalty']<= a['fairness_penalty']
        )

        strictly_better = (
            b['accuracy']        > a['accuracy'] or
            b['train_samples']   > a['train_samples'] or
            b['latency_ms']      < a['latency_ms'] or
            b['fairness_penalty']< a['fairness_penalty']
        )

        return better_or_equal and strictly_better


    dominated_pairs = []  # list of tuples (dominated_model, dominating_model)

    for a in requirements_passed:
        for b in requirements_passed:
            if a['name'] == b['name']:
                continue
            if dominates(b, a):
                dominated_pairs.append((a['name'], b['name']))

    print("\nDominance analysis")
    print("------------------\n")

    if dominated_pairs:
        for dom, dom_by in dominated_pairs:
            print(f"Model {dom} is dominated by model {dom_by}")
    else:
        print("No dominated models found.")



In [113]:
model_selection(CANDIDATES)

Benchmark tests
---------------

Model M2 did not pass the accuracy test.
Accuracy = 0.83

Model M3 did not pass the latency test.
Latency = 140 ms

Model M4 did not pass the training sample test.
Training samples = 9000

Model M5 did not pass the fairness test.
Fairness penalty = 0.11


Computing model scores
----------------------

Model: M1 | Score: 1.0288
Model: M6 | Score: 1.4653
Model: M7 | Score: 0.7253

Model report
------------

Top 3 models:
	 1. M6
		 Score = 1.4653450677842041
		 Accuracy = 0.92
		 Latency = 100
		 Fairness penalty = 0.04
		 Training samples = 20000
	 2. M1
		 Score = 1.0288027314471755
		 Accuracy = 0.86
		 Latency = 110
		 Fairness penalty = 0.05
		 Training samples = 12000
	 3. M7
		 Score = 0.7252595791677968
		 Accuracy = 0.85
		 Latency = 115
		 Fairness penalty = 0.08
		 Training samples = 15000

Dominance analysis
------------------

Model M1 is dominated by model M6
Model M7 is dominated by model M6
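
As a sanity check on the scores printed above, the same numbers can be recomputed outside the function. This is a minimal sketch: the `accepted` dictionary and `score` helper are names introduced here for illustration, and the penalized metrics use negative exponents, which is algebraically the same as raising the inverted ratio to a positive weight.

```python
import math

# Metrics of the three accepted models, in the order:
# (accuracy, latency_ms, fairness_penalty, train_samples)
accepted = {
    "M1": (0.86, 110, 0.05, 12000),
    "M6": (0.92, 100, 0.04, 20000),
    "M7": (0.85, 115, 0.08, 15000),
}

# Column-wise means over the accepted models, used to normalize each metric
means = [sum(column) / len(accepted) for column in zip(*accepted.values())]

# Same weights as in model_selection; a negative exponent penalizes the metric
exponents = [0.8, -0.2, -0.8, 0.2]


def score(metrics):
    """Product of (value / mean) ** exponent over the four metrics."""
    return math.prod(
        (value / mean) ** exponent
        for value, mean, exponent in zip(metrics, means, exponents)
    )


scores = {name: score(metrics) for name, metrics in accepted.items()}
for name, value in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.4f}")
```

If the recomputed values agree with the report to four decimal places, the ranking M6, M1, M7 follows directly from the formula rather than from anything specific to the printing code.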


This challenge brought together several core Python and data-science fundamentals in a practical model-evaluation workflow. I used a loop with conditional checks to filter out models that failed the hard requirements, then ranked the acceptable models with a ratio-based score: each metric is divided by its mean over the accepted models, and the ratios are combined in a weighted multiplicative formula. Along the way I practiced computing means, looking up records by name in a dictionary, sorting with sorted() and a custom key function, and formatting a readable report. The final task introduced a classic multi-criteria comparison, dominance analysis, which required a helper function, a pairwise loop over the models, and logical conditions that flag a model when another is at least as good on every metric and strictly better on at least one. Altogether, the challenge reinforced clean code structure, clear logic, and evaluation patterns that mirror real-world model selection.
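
Written out, the weighted multiplicative score computed in `model_selection` takes the form below. The symbols are notation introduced here, not taken from the code: for a model $m$, $a_m$ is accuracy, $\ell_m$ latency, $f_m$ fairness penalty, and $n_m$ the number of training samples, with bars denoting means over the accepted models.

$$
\mathrm{score}(m) =
\left(\frac{a_m}{\bar{a}}\right)^{0.8}
\left(\frac{\bar{\ell}}{\ell_m}\right)^{0.2}
\left(\frac{\bar{f}}{f_m}\right)^{0.8}
\left(\frac{n_m}{\bar{n}}\right)^{0.2}
$$

A model that is exactly average on every metric scores 1. The score grows with accuracy and training samples and shrinks with latency and fairness penalty, and it is undefined when $f_m = 0$ or $\ell_m = 0$.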

# === End of Challenge! ===