# Simple method for calculating decision criteria weights
Improving human decision intelligence with Julia code  
[Tomás Aragón, MD, DrPH](https://substack.com/@tomasaragon)    
Jul 14, 2024

This is a technical post using the Julia language. It begins at [TEAM Public Health](https://teampublichealth.substack.com/p/simple-method-for-calculating-decision) and concludes here as a [JupyterLab](https://jupyter.org/) notebook on GitHub.

### 3.1 Ratio ordinal method in Julia

I will demonstrate this method using the Julia language. This method
can also be implemented using R, Python, or Microsoft Excel.

For rating the cars we have six criteria (attributes) for which we
need to calculate weights:

1. Color (C)
2. Mileage (M)
3. Price (P)
4. Reliability (R)
5. Safety (S)
6. Value, resale (V)

We have five evaluators who will rank the criteria based on their
knowledge, experience, expertise, and wisdom. They should rank the
criteria independently and without thinking about specific cars;
otherwise they may game (bias) the weighting.

Here are the steps:

1. Select a ranking method to calculate weights for a specific number
of criteria (six in this case). We will write and use a Julia function
that implements the SR method from [Danielson, 2017].
2. Have each evaluator independently rank the criteria.
3. Use Julia to calculate the final criteria weights. We will use the
split-apply-combine workflow that I introduced in a previous blog post
and book review [Kamiński, 2023].

#### 3.1.1 Step 1: The SR method for generating criteria weights

Here is the formula, where $N$ is the number of criteria and $w_i^{SR}$ is the weight for the $i^{th}$ criterion [Danielson, 2017]:

$$
w_i^{SR} = \frac{\frac{1}{i} + \frac{N+1-i}{N}} {\sum_{j=1}^N
                        \left(\frac{1}{j} + \frac{N+1-j}{N}\right)}
$$
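
As a worked example for $N = 6$ (the case used in this post), the numerators $1/i + (N+1-i)/N$ and their sum can be computed by hand. Each weight is its numerator divided by the sum of all six numerators, and the values agree with the Julia output further down.

$$
\begin{aligned}
\text{numerators } (i = 1, \dots, 6) &: \quad 2,\ \tfrac{4}{3},\ 1,\ \tfrac{3}{4},\ \tfrac{8}{15},\ \tfrac{1}{3} \\
\text{sum of numerators} &= \tfrac{357}{60} = 5.95 \\
w_1^{SR} &= \frac{2}{5.95} \approx 0.3361, \qquad
w_6^{SR} = \frac{1/3}{5.95} \approx 0.0560
\end{aligned}
$$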

For this calculation I use the [Julia
Language](https://julialang.org/). Julia aims to combine syntax as
approachable as Python's with performance close to that of compiled
languages such as C++. These calculations can also be completed in R.

---
[Danielson, 2017] Danielson, Mats, and Love Ekenberg. “Trade-Offs for Ordinal Ranking Methods in Multi-Criteria Decisions.” In Lecture Notes in Business Information Processing. Springer International Publishing, 2017. https://doi.org/10.1007/978-3-319-52624-9_2.

[Kamiński, 2023] Kamiński, Bogumił. Julia for Data Analysis. Manning Publications Co, 2023.

In [20]:
using DataFrames
using StatsBase
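# SR weights for n ranked criteria (rank 1 = most important)
function calculate_rank_weights(n::Int64)
    num = zeros(n)
    for i in 1:n
        # numerator for rank i: reciprocal term plus rank-sum term
        num[i] = (1/i) + ( (n + 1 - i) / n )
    end
    wi = num / sum(num)  # normalize so the weights sum to 1
    return wi
end
wi = calculate_rank_weights(6)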

6-element Vector{Float64}:
 0.3361344537815126
 0.22408963585434175
 0.1680672268907563
 0.12605042016806722
 0.0896358543417367
 0.05602240896358543

The weights sum to 1, as expected.

In [21]:
round(sum(wi))

1.0
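
Rounding is a coarse check, since `round` also returns `1.0` for a sum of, say, 0.6. A stricter sketch compares the sum to 1 within floating-point tolerance using `isapprox`. The helper `sr_weights` below is a hypothetical compact restatement of `calculate_rank_weights`, included only so the block runs on its own.

```julia
# Hypothetical compact restatement of calculate_rank_weights (SR method),
# repeated here only so this block is self-contained.
function sr_weights(n::Integer)
    num = [1 / i + (n + 1 - i) / n for i in 1:n]
    return num ./ sum(num)
end

w = sr_weights(6)

# Compare the sum to 1 within the default tolerance of isapprox
sums_to_one = isapprox(sum(w), 1.0)

# The weights should not increase from rank 1 to rank n
is_decreasing = issorted(w, rev = true)

println((sums_to_one, is_decreasing))
```

The same two checks apply unchanged to any other number of criteria.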

#### 3.1.2 Step 2: Evaluators rank the criteria

Five evaluators rank the criteria based on their expertise.

In [22]:
eval1 = ["Mileage", "Color", "Price", "Safety", "Reliability", "Value"]
eval2 = ["Mileage", "Color", "Safety", "Reliability", "Value", "Price"]
eval3 = ["Color", "Value", "Price", "Mileage", "Reliability", "Safety"]
eval4 = ["Mileage", "Value", "Color", "Safety", "Reliability", "Price"]
eval5 = ["Safety", "Price", "Color", "Reliability", "Mileage", "Value"]

6-element Vector{String}:
 "Safety"
 "Price"
 "Color"
 "Reliability"
 "Mileage"
 "Value"

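The weights only make sense if each ranking lists every criterion exactly once. The sketch below shows one way to check this before going further; `is_valid_ranking` and the `bad` vector are hypothetical names introduced for illustration and are not part of the original notebook.

```julia
# The six criteria named in this post
criteria = ["Color", "Mileage", "Price", "Reliability", "Safety", "Value"]

# Hypothetical helper: true when `ranking` lists every criterion exactly once
function is_valid_ranking(ranking, criteria)
    return length(ranking) == length(criteria) && Set(ranking) == Set(criteria)
end

eval1 = ["Mileage", "Color", "Price", "Safety", "Reliability", "Value"]
bad   = ["Mileage", "Color", "Price", "Safety", "Safety", "Value"]  # made-up invalid input

println(is_valid_ranking(eval1, criteria))  # a complete ranking
println(is_valid_ranking(bad, criteria))    # "Safety" repeated, "Reliability" missing
```
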
Next, we organize the evaluator criteria rankings and the SR method criteria weights into a data frame with three columns:

- evaluators (optional)
- evaluator_rankings (required)
- weights (required)

In [23]:
ne = 5; # number of evaluators
nc = 6; # number of criteria
evaluators = repeat("eval" .* string.(1:ne), inner = nc)
evaluator_rankings = vcat( eval1, eval2, eval3, eval4, eval5 )
weights = repeat( wi, ne )
df = DataFrame(
    hcat(evaluators, evaluator_rankings, weights), 
    ["evaluator", "criteria", "weight"]
    )

| Row | evaluator | criteria | weight |
|-----|-----------|----------|--------|
|  | Any | Any | Any |
| 1 | eval1 | Mileage | 0.336134 |
| 2 | eval1 | Color | 0.22409 |
| 3 | eval1 | Price | 0.168067 |
| 4 | eval1 | Safety | 0.12605 |
| 5 | eval1 | Reliability | 0.0896359 |
| 6 | eval1 | Value | 0.0560224 |
| 7 | eval2 | Mileage | 0.336134 |
| 8 | eval2 | Color | 0.22409 |
| 9 | eval2 | Safety | 0.168067 |
| 10 | eval2 | Reliability | 0.12605 |
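
The `Any` column types in this output arise because `hcat` on string and numeric vectors produces a `Matrix{Any}`. A possible alternative, sketched here on a small made-up example rather than the notebook's data, is to pass each column to `DataFrame` by keyword so that it keeps its own element type. The name `df_typed` and the illustrative weights are assumptions.

```julia
using DataFrames

# Small made-up example: two evaluators, three criteria
evaluators = repeat(["eval1", "eval2"], inner = 3)
rankings   = ["Mileage", "Color", "Price", "Color", "Price", "Mileage"]
weights    = repeat([0.5, 0.3, 0.2], 2)   # illustrative weights, not SR weights

# Passing each column by keyword keeps its own element type
df_typed = DataFrame(evaluator = evaluators, criteria = rankings, weight = weights)

println(eltype(df_typed.criteria))  # String
println(eltype(df_typed.weight))    # Float64
```

Typed columns are not required for the grouping and averaging that follow, but they make later numeric operations on `weight` less surprising.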


#### 3.1.3 Step 3: Calculate mean criteria weights

Calculate the mean criteria weights using the **split-apply-combine** workflow (Figure 1). In other words, after grouping the rows by criterion (attribute),
what is the mean weight within each group?

<figure>
  <img src="images/2024-07-14-1-phds_split_apply_combine_workflow.png" alt="split-apply-combine diagram" width="100%">
      <figcaption>Figure 1: The split-apply-combine workflow.</figcaption>
</figure>

Split-apply-combine is a common workflow in data science.
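
To make the three stages concrete, here is a minimal sketch in plain Julia that uses a `Dict` instead of a data frame. The data are made up for illustration, and the variable names `groups` and `mean_weights` are assumptions. The notebook itself uses `groupby` and `combine` from DataFrames.jl for the same job.

```julia
using Statistics  # standard library; provides mean

# Made-up long-format data: one (criterion, weight) pair per evaluator ranking
criteria = ["Mileage", "Color", "Mileage", "Color"]
weights  = [0.7, 0.3, 0.5, 0.5]

# Split: collect the weights that belong to each criterion
groups = Dict{String, Vector{Float64}}()
for (c, w) in zip(criteria, weights)
    push!(get!(groups, c, Float64[]), w)
end

# Apply and combine: one mean per group, gathered into a single Dict
mean_weights = Dict(c => mean(ws) for (c, ws) in groups)

println(mean_weights)
```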


In [24]:
gdf = groupby(df, :criteria)

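| Row | evaluator | criteria | weight |
|-----|-----------|----------|--------|
|  | Any | Any | Any |
| 1 | eval1 | Mileage | 0.336134 |
| 2 | eval2 | Mileage | 0.336134 |
| 3 | eval3 | Mileage | 0.12605 |
| 4 | eval4 | Mileage | 0.336134 |
| 5 | eval5 | Mileage | 0.0896359 |

| Row | evaluator | criteria | weight |
|-----|-----------|----------|--------|
|  | Any | Any | Any |
| 1 | eval1 | Value | 0.0560224 |
| 2 | eval2 | Value | 0.0896359 |
| 3 | eval3 | Value | 0.22409 |
| 4 | eval4 | Value | 0.22409 |
| 5 | eval5 | Value | 0.0560224 |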


In [25]:
# apply `mean` to each group, then combine the results
crit_weights = combine(gdf, :weight => mean)
sort!(crit_weights, :weight_mean, rev = true)

| Row | criteria | weight_mean |
|-----|----------|-------------|
|  | Any | Float64 |
| 1 | Mileage | 0.244818 |
| 2 | Color | 0.22409 |
| 3 | Safety | 0.162465 |
| 4 | Price | 0.134454 |
| 5 | Value | 0.129972 |
| 6 | Reliability | 0.104202 |


These are the final criteria weights and, as expected, they sum to 1.

In [26]:
round(sum(crit_weights.weight_mean))

1.0

## Appendix

### Function to automate calculating mean weights

We created the `calculate_rank_weights` function to calculate the
 weights for $n$ criteria using the SR method. Then we used Julia to
 calculate the mean weights for five evaluators.

Now we create a final function to automate calculating the mean
weights. This function will handle an arbitrary number of vectors with
rankings, or a matrix created from those vectors beforehand. Notice
that this new function will call our previous `calculate_rank_weights`
function.

In [27]:
function calculate_mean_weights(x...)
    x = hcat(x...)
    ncrit, neval = size(x, 1), size(x, 2)
    wts_i = calculate_rank_weights(ncrit)
    weights = repeat( wts_i, neval )
    evaluators = repeat("eval" .* string.(1:neval), inner = ncrit)
    # flatten column by column: eval1's rankings first, then eval2's, and so on
    evaluator_rankings = reshape(x, ncrit * neval)
    df = DataFrame( 
        hcat(evaluators, evaluator_rankings, weights), 
        ["evaluator", "criteria", "weight"] 
    )
    gdf = groupby(df, :criteria)
    mean_wts = sort!(
        combine(gdf, :weight => mean), 
        :weight_mean, rev = true
    ) 
    return (
        weights = mean_wts,
        data = df
    )
end

calculate_mean_weights (generic function with 1 method)

Here we test passing vectors as arguments to the
`calculate_mean_weights` function.

In [28]:
r1 = calculate_mean_weights(eval1, eval2, eval3, eval4, eval5);

The results are saved in `r1`, and the semicolon (`;`) suppresses the
output. We use the `typeof` function to check the type of `r1`.

In [29]:
typeof(r1)

@NamedTuple{weights::DataFrame, data::DataFrame}

We see that `r1` is a `NamedTuple` containing two data frames named
`weights` and `data`. We can access each one by name.

In [30]:
r1.weights

| Row | criteria | weight_mean |
|-----|----------|-------------|
|  | Any | Float64 |
| 1 | Mileage | 0.244818 |
| 2 | Color | 0.22409 |
| 3 | Safety | 0.162465 |
| 4 | Price | 0.134454 |
| 5 | Value | 0.129972 |
| 6 | Reliability | 0.104202 |


In [31]:
r1.data

| Row | evaluator | criteria | weight |
|-----|-----------|----------|--------|
|  | Any | Any | Any |
| 1 | eval1 | Mileage | 0.336134 |
| 2 | eval1 | Color | 0.22409 |
| 3 | eval1 | Price | 0.168067 |
| 4 | eval1 | Safety | 0.12605 |
| 5 | eval1 | Reliability | 0.0896359 |
| 6 | eval1 | Value | 0.0560224 |
| 7 | eval2 | Mileage | 0.336134 |
| 8 | eval2 | Color | 0.22409 |
| 9 | eval2 | Safety | 0.168067 |
| 10 | eval2 | Reliability | 0.12605 |


Next, we create the matrix of evaluator data and then pass it to the
`calculate_mean_weights` function.

In [32]:
eval_rankings_tab = hcat(eval1, eval2, eval3, eval4, eval5)
r2 = calculate_mean_weights(eval_rankings_tab);

The results are saved in `r2`, and the semicolon (`;`) suppresses the
output. We use the `typeof` function to check the type of `r2`.

In [33]:
typeof(r2)

@NamedTuple{weights::DataFrame, data::DataFrame}

We see that `r2` is a `NamedTuple` containing two data frames named
`weights` and `data`. We can access each one by name.

In [34]:
r2.weights

| Row | criteria | weight_mean |
|-----|----------|-------------|
|  | Any | Float64 |
| 1 | Mileage | 0.244818 |
| 2 | Color | 0.22409 |
| 3 | Safety | 0.162465 |
| 4 | Price | 0.134454 |
| 5 | Value | 0.129972 |
| 6 | Reliability | 0.104202 |


In [35]:
r2.data

| Row | evaluator | criteria | weight |
|-----|-----------|----------|--------|
|  | Any | Any | Any |
| 1 | eval1 | Mileage | 0.336134 |
| 2 | eval1 | Color | 0.22409 |
| 3 | eval1 | Price | 0.168067 |
| 4 | eval1 | Safety | 0.12605 |
| 5 | eval1 | Reliability | 0.0896359 |
| 6 | eval1 | Value | 0.0560224 |
| 7 | eval2 | Mileage | 0.336134 |
| 8 | eval2 | Color | 0.22409 |
| 9 | eval2 | Safety | 0.168067 |
| 10 | eval2 | Reliability | 0.12605 |


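The `calculate_mean_weights` function worked for both separate vectors
and a single matrix. This is possible because of the splat operator
(`...`) in the two lines below. In the signature, `x...` collects all
arguments into a tuple; in the body, `hcat(x...)` passes the elements
of that tuple to `hcat` as separate arguments.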

```
function calculate_mean_weights(x...)
    x = hcat(x...)
```
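
The behaviour can be seen in isolation with a small sketch. The helper `to_matrix` is a hypothetical name that mirrors only the first line of `calculate_mean_weights`, and the two short vectors are made up for illustration.

```julia
# Hypothetical helper that mirrors the first line of calculate_mean_weights
to_matrix(x...) = hcat(x...)

a = ["Mileage", "Color", "Price"]
b = ["Color", "Price", "Mileage"]

from_vectors = to_matrix(a, b)        # x is the tuple (a, b), so this calls hcat(a, b)
from_matrix  = to_matrix(hcat(a, b))  # x is a one-element tuple holding the matrix

println(size(from_vectors))
println(from_vectors == from_matrix)
```

Both calls yield the same 3×2 matrix, with one column per evaluator.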