# Students Case Study
### Short story
The end of the school year is close, and Martin, a very ambitious student, wants to know how he is performing in his class. He has somehow obtained the average grades of all his classmates in every subject. Now he needs to rank the students and check his own position. Luckily for him, the TOPSIS method exists, so he can use it. To make life simpler, he decides to use the MSDTransformer library, a tool (also created by a group of very ambitious students) that builds a ranking based on the TOPSIS method and the MSD space.

### Read the data
The first step is to read the data into a pandas DataFrame.

In [1]:
import pandas as pd

students = pd.read_csv("../data/students.csv", sep = ';', index_col = 0)
display(students)

| StudentID | Math | Bio | Art |
|---|---|---|---|
| U1 | 0.0 | 1.0 | 1.0 |
| U2 | 33.33 | 2.67 | 2.67 |
| U3 | 50.0 | 3.5 | 3.5 |
| U4 | 66.67 | 4.33 | 4.33 |
| U5 | 100.0 | 6.0 | 6.0 |
| U6 | 20.0 | 2.25 | 3.75 |
| U7 | 67.64 | 2.5 | 3.62 |
| U8 | 45.0 | 4.75 | 5.0 |
| U9 | 0.0 | 2.25 | 4.75 |
| U10 | 62.99 | 5.0 | 1.35 |


There are 19 students in Martin's class; the output above shows only the first ten. Each student has an average grade in three subjects: Mathematics, Biology and Art. Math grades range from 0 to 100, while Biology and Art grades range from 1 to 6. In Martin's school, a higher grade is better.

### Define objectives, weights and expert_range
Now Martin can define an objective for each subject, that is, state which criteria are of the cost type (lower is better) and which are of the gain type (higher is better). In his school a higher grade is better, so every subject is a gain criterion.

In [2]:
objectives = {
    "Math": 'gain',
    "Bio" : 'gain',
    "Art" : 'gain'
}

All subjects have the same weight, so there is no need to define weights.

By default, the MSDTransformer library uses the minimal and maximal values found in the data as the range of each criterion. However, there is also an option to provide custom ranges. Martin decides to do so and defines his own.

In [3]:
expert_range = {
    "Math": [0, 100],
    "Bio" : [1, 6],
    "Art" : [1, 6]
}
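To see what a custom range changes, here is a minimal sketch of min-max scaling for gain criteria. It assumes the library maps each grade to [0, 1] as (value - low) / (high - low); the helper `normalize_gain` is hypothetical and is not part of MSDTransformer. It uses plain Python so that it runs without the library.

```python
# Expert ranges copied from the cell above.
expert_range = {"Math": [0, 100], "Bio": [1, 6], "Art": [1, 6]}


def normalize_gain(grades, ranges):
    """Hypothetical helper: min-max scale each gain criterion to [0, 1]."""
    scaled = {}
    for subject, value in grades.items():
        low, high = ranges[subject]
        scaled[subject] = (value - low) / (high - low)
    return scaled


# Grades of student U4, taken from the data table.
u4 = {"Math": 66.67, "Bio": 4.33, "Art": 4.33}
u4_scaled = normalize_gain(u4, expert_range)
print({subject: round(value, 4) for subject, value in u4_scaled.items()})
```

With these ranges the scaled grades of U4 agree with the U4 row of the ranking table shown later in this notebook (0.6667, 0.666, 0.666), which suggests, but does not prove, that the library scales the data this way.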

### Create a ranking
Finally, with everything defined, Martin can create the ranking. First, he needs to create an object of the MSDTransformer class. He also needs to decide which aggregation function the library should use to differentiate between students whose grades have the same mean.

In [11]:
import MSDTransformer as msdt

agg_function = msdt.ATOPSIS
students_obj = msdt.MSDTransformer(agg_function)

Then he needs to call the fit() method on the created object to fit the data from the students DataFrame. Only then can he call the transform() method, which calculates the ranking score of each student.

In [12]:
students_obj.fit(data = students, objectives = objectives,  expert_range = expert_range)
students_obj.transform(None)

To see the resulting ranking, Martin uses the show_ranking() method.

In [16]:
students_obj.show_ranking()

| StudentID | Rank | Math | Bio | Art |
|---|---|---|---|---|
| U5 | 1 | 1.0 | 1.0 | 1.0 |
| U16 | 2 | 1.0 | 1.0 | 0.15 |
| U18 | 3 | 1.0 | 1.0 | 0.0 |
| U15 | 4 | 0.09 | 0.91 | 1.0 |
| U11 | 5 | 0.25 | 0.75 | 1.0 |
| U8 | 6 | 0.45 | 0.75 | 0.8 |
| U4 | 7 | 0.6667 | 0.666 | 0.666 |
| U14 | 8 | 0.0 | 0.5 | 1.0 |
| U10 | 9 | 0.6299 | 0.8 | 0.07 |
| U17 | 10 | 0.0 | 0.0 | 1.0 |
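The ordering above can be checked by hand. The sketch below assumes that each student is mapped to a point (mean, standard deviation) of their scaled grades, and that the ATOPSIS aggregation scores a student by the distance of that point from (0, 0), i.e. sqrt(mean² + sd²) with the population standard deviation. This is a reading of the MSD space idea and not code taken from the library; the helpers `msd_point` and `a_score` are hypothetical.

```python
import math

# Scaled grades (Math, Bio, Art) copied from the ranking output above.
normalized = {
    "U4": (0.6667, 0.666, 0.666),
    "U5": (1.0, 1.0, 1.0),
    "U8": (0.45, 0.75, 0.8),
    "U10": (0.6299, 0.8, 0.07),
    "U11": (0.25, 0.75, 1.0),
    "U14": (0.0, 0.5, 1.0),
    "U15": (0.09, 0.91, 1.0),
    "U16": (1.0, 1.0, 0.15),
    "U17": (0.0, 0.0, 1.0),
    "U18": (1.0, 1.0, 0.0),
}


def msd_point(values):
    """Return (mean, population standard deviation) of the scaled grades."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def a_score(values):
    """Assumed 'A' aggregation: distance of the MSD point from (0, 0)."""
    mean, sd = msd_point(values)
    return math.sqrt(mean ** 2 + sd ** 2)


# Sort students from the highest score to the lowest.
order = sorted(normalized, key=lambda sid: a_score(normalized[sid]), reverse=True)
for rank, sid in enumerate(order, start=1):
    print(rank, sid, round(a_score(normalized[sid]), 4))
```

For the ten rows shown, this assumed score gives the same order as the library's ranking. It also shows why the aggregation matters: U14 (0.0, 0.5, 1.0) and U17 (0.0, 0.0, 1.0) differ in both mean and spread, and the score combines the two into a single number.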


Martin can also view the ranking as a 2D chart showing the position of each student in MSD space. To do so, he needs to call the plot() method.

In [17]:
students_obj.plot()

[Figure: position of each student in MSD space]


### Show potential improvement actions
Martin is not satisfied with his current position. His student ID is U4, so he can read from the ranking that he is ranked 7th. He wants to be ranked at least above his class enemy Claudia, whose student ID is U18 and who occupies 3rd position. Martin doesn't have much time, so he needs to know how to improve his position with as little work as possible. Moreover, Martin knows that there is no chance of improving his Art grade. The improvement_features() method is the best choice here.

In [20]:
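import numpy as np

# ranked_alternatives is ordered best-first, so rank r sits at index r - 1.
# Despite the "_id" suffix, both variables hold a copy of the student's data row.
martin_id = students_obj.data.loc[students_obj.ranked_alternatives[7-1]].copy()
claudia_id = students_obj.data.loc[students_obj.ranked_alternatives[3-1]].copy()
grades_to_change = ['Math', 'Bio']  # Art is left out: Martin cannot improve it
improvement_ratio = 0.01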

students_obj.agg_fn.improvement_features(
    alternative_to_improve=martin_id,
    alternative_to_overcome=claudia_id,
    improvement_ratio=improvement_ratio,
    w = students_obj.weights,
    features_to_change=grades_to_change,
    value_range=students_obj.value_range,
    objectives=students_obj.objectives
)

| | Improvement rate |
|---|---|
| Math | 33.33 |
| Bio | 0.42 |
| Art | 0.0 |


Now Martin knows that, to beat Claudia, he needs to improve his Math grade by 33.33 and his Biology grade by 0.42.
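As a sanity check, the suggested improvement can be replayed by hand. The sketch below relies on two assumptions that are not confirmed by the library: grades are min-max scaled with the expert ranges, and the score is sqrt(mean² + sd²) of the scaled grades. The helper `a_score` is hypothetical, and Claudia's raw grades are reconstructed from her scaled row (1.0, 1.0, 0.0) in the ranking output.

```python
import math

# Expert ranges defined earlier in the notebook.
expert_range = {"Math": (0, 100), "Bio": (1, 6), "Art": (1, 6)}


def a_score(grades):
    """Assumed score: sqrt(mean^2 + sd^2) of the min-max scaled grades."""
    scaled = [
        (grades[subject] - low) / (high - low)
        for subject, (low, high) in expert_range.items()
    ]
    mean = sum(scaled) / len(scaled)
    variance = sum((v - mean) ** 2 for v in scaled) / len(scaled)
    return math.sqrt(mean ** 2 + variance)


martin_before = {"Math": 66.67, "Bio": 4.33, "Art": 4.33}
# Apply the improvement rates reported above; Art stays unchanged.
martin_after = {"Math": 66.67 + 33.33, "Bio": 4.33 + 0.42, "Art": 4.33}
# Reconstructed from Claudia's scaled row, so possibly affected by rounding.
claudia = {"Math": 100.0, "Bio": 6.0, "Art": 1.0}

for name, grades in [
    ("Martin before", martin_before),
    ("Claudia", claudia),
    ("Martin after", martin_after),
]:
    print(name, round(a_score(grades), 4))
```

Under these assumptions, Martin's score starts below Claudia's and ends slightly above it once both improvements are applied, which is consistent with the result returned by improvement_features().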