# TEAM 2

This notebook walks through all the steps of the TEAM 2 algorithm for (Privacy) Threat Modelling.

### Setup

Run the following cells before starting with your analysis.

*Install the dependencies only once, then comment out that cell. Module imports, on the other hand, must be run in every session.*

In [None]:
#!pip install -r requirements.txt

In [None]:
import pandas as pd
from ipywidgets import widgets
from itables import init_notebook_mode

from embracing_utils import filter_dataframe_by_threshold, find_synset_relations, synset_relations

init_notebook_mode(all_interactive=True)

### Preliminary Threats Upload

Please upload the preliminary threats list in CSV format (suggested path: `./data/TEAM 2/<file>.csv`).

Repeat this operation at the start of each new round by setting the `preliminary_threat_list_path` variable to the path of the threat list.

In [None]:
preliminary_threat_list_path = './data/TEAM 2/preliminary_threats.csv'  # CHANGE ME!

### Semantic Similarity Computation

You can now run the `ss_scores_by_groups.py` script in a terminal, after setting the desired cardinality in the main function (line 115). The script computes the semantic similarity scores and stores them as `./data/TEAM 2/{preliminary_threats_filename}_ss_scores_with_cardinality_{k}.csv`.

Once the script terminates, you can load the semantic similarity scores as follows:

In [None]:
ss_scores_df = pd.read_csv(f'./data/TEAM 2/preliminary_threats_ss_scores_with_cardinality_3.csv')    # CHANGE ME!
print(f'Total number of preliminary threats submitted: {len(ss_scores_df.index)}')
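
The scores file name follows the pattern stated above, so the path can be derived from the threat list path and the chosen cardinality instead of being edited by hand. The sketch below assumes the pattern holds exactly as described and that the scores file sits next to the input file; `ss_scores_path` is a hypothetical helper, not part of `embracing_utils`.

```python
from pathlib import Path


def ss_scores_path(threat_list_path: str, k: int) -> Path:
    """Build the expected scores path for a threat list and cardinality k.

    Hypothetical helper: it assumes the script writes
    `<stem>_ss_scores_with_cardinality_<k>.csv` next to the input file.
    """
    source = Path(threat_list_path)
    return source.with_name(f'{source.stem}_ss_scores_with_cardinality_{k}.csv')


scores_path = ss_scores_path('./data/TEAM 2/preliminary_threats.csv', 3)
print(scores_path.as_posix())
```

The resulting `Path` can be passed directly to `pd.read_csv`.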

### Embraceable Candidates Elicitation

Set the desired semantic similarity score threshold by running the following cell and adjusting the slider according to your target number of final threats.

In [None]:
ss_threshold = widgets.FloatSlider(value=0.66, min=-1, max=1, step=0.01, description='threshold:', readout_format='.2f')
ss_threshold

In [None]:
embraceable_candidates = filter_dataframe_by_threshold(ss_scores_df, 'mean', ss_threshold.value)    # NOTE: You can change the mean column if you want to filter by max or min.
print(f'The list of embraceable candidates above the threshold {round(ss_threshold.value, 2)} contains {len(embraceable_candidates)} pairs.')
embraceable_candidates
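
To pick a threshold that matches a target number of final threats, it can help to see how many pairs survive at several candidate values. The sketch below assumes that `filter_dataframe_by_threshold` keeps the rows whose score is greater than or equal to the threshold (its implementation is not shown in this notebook); `count_pairs_by_threshold` and the toy scores are hypothetical.

```python
import pandas as pd


def count_pairs_by_threshold(df: pd.DataFrame, column: str, thresholds) -> pd.Series:
    """Count the rows with df[column] >= t for each threshold t.

    Assumption: this mirrors a 'greater than or equal' filter; the real
    filter_dataframe_by_threshold may behave differently.
    """
    counts = {t: int((df[column] >= t).sum()) for t in thresholds}
    return pd.Series(counts, name='pairs')


# Toy data for illustration only, not real similarity scores.
toy_scores = pd.DataFrame({'mean': [0.91, 0.72, 0.66, 0.40, -0.10]})
pair_counts = count_pairs_by_threshold(toy_scores, 'mean', [0.5, 0.66, 0.8])
print(pair_counts)
```

On real data, `ss_scores_df` would take the place of `toy_scores`, and the `'mean'` column could be swapped for the max or min column.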

### Threat Embracing

You can now investigate all the threat pairs whose score is greater than or equal to the threshold. Run the following cell to display the table, then go through each candidate pair and annotate the embracing in an external (Excel-like) sheet.

In [None]:
embraceable_candidates

For each row of interest, specify its index and run the following cell to focus the analysis on that threat pair.

In [None]:
# Please specify the index of the pair you want to embrace.
index_to_embrace = 94801  # CHANGE ME!
embraceable_candidates.iloc[embraceable_candidates.index==index_to_embrace]
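
Because `index_to_embrace` must match an index label shown in the table, a small guard can give a clearer message when the label is missing, for example after the threshold has changed. This is only a sketch: `get_pair` is a hypothetical helper, the toy rows are made up, and the `sentence1`/`sentence2` column names are the ones used by the cells in this notebook.

```python
import pandas as pd


def get_pair(candidates: pd.DataFrame, index_label) -> tuple:
    """Return (sentence1, sentence2) for the row with the given index label.

    Assumes index labels are unique, so that .loc returns a single row.
    """
    if index_label not in candidates.index:
        raise KeyError(f'Index {index_label} is not among the embraceable candidates.')
    row = candidates.loc[index_label]
    return row['sentence1'], row['sentence2']


# Toy candidates for illustration only.
toy_candidates = pd.DataFrame(
    {'sentence1': ['data leak', 'identity theft'],
     'sentence2': ['information disclosure', 'impersonation'],
     'mean': [0.81, 0.74]},
    index=[12, 94801],
)
pair = get_pair(toy_candidates, 94801)
print(pair)
```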

Now that the analysis is focused on a specific threat pair, run the following cell to obtain the automatically identified synset relations, if any.

In [None]:
s1 = embraceable_candidates.iloc[embraceable_candidates.index==index_to_embrace]['sentence1'].values[0]
s2 = embraceable_candidates.iloc[embraceable_candidates.index==index_to_embrace]['sentence2'].values[0]
is_partof, is_typeof = find_synset_relations(s1, s2)
print(f'\nPart of relation(s) found: {is_partof}\tType of relation(s) found: {is_typeof}')

To support the choice of the most appropriate wording and level of detail, run the last cell for an overview of the synset relations of the nouns identified in both threat labels.

In [None]:
synset_dict1 = synset_relations(embraceable_candidates.iloc[embraceable_candidates.index==index_to_embrace]['sentence1'].values[0])
synset_dict2 = synset_relations(embraceable_candidates.iloc[embraceable_candidates.index==index_to_embrace]['sentence2'].values[0])

focus_df = pd.concat([pd.DataFrame.from_dict(synset_dict1['terms']), pd.DataFrame.from_dict(synset_dict2['terms'])])
if not focus_df.empty:
    focus_df['synonyms'] = focus_df.get('synonyms').str.slice(0,3)
    focus_df['hypernyms [L1]'] = focus_df.get('hypernyms [L1]').str.slice(0,1)
    focus_df['hyponyms [L1]'] = focus_df.get('hyponyms [L1]').str.slice(0,1)
    focus_df['meronyms'] = focus_df.get('meronyms').str.slice(0,1)
    focus_df['holonyms'] = focus_df.get('holonyms').str.slice(0,1)
# Display at top level: Jupyter does not render expressions nested inside an `if` block.
focus_df