# Semi-Automated Embracing Approach - PoC

Proof of concept (PoC) illustrating the core of the semi-automated evolution of the embracing approach for (privacy) threat modelling.

### Dependencies

Install the dependencies listed in `requirements.txt`.

In [None]:
!pip3 install -r requirements.txt

### Imports and Setup

The following snippet imports the required modules and sets up some utility functions.

In [1]:
import itertools
import pandas
import spacy
from nltk.corpus import wordnet as wn
from sentence_transformers import SentenceTransformer, util


# Semantic similarity model
model = SentenceTransformer('stsb-roberta-large')
# Spacy setup
nlp = spacy.load('en_core_web_lg')
nlp.add_pipe('merge_entities')

# Synset utility functions
hyper = lambda s: s.hypernyms()
hypo = lambda s: s.hyponyms()
part_mero = lambda s: s.part_meronyms()
part_holo = lambda s: s.part_holonyms()
synsets = lambda s: wn.synsets(s)

## Semantic Similarity

We leverage semantic similarity to decide whether two threats are similar, and hence embraceable, or not.

A straightforward method is to encode each sentence with a pretrained model (e.g., a transformer) to obtain its embedding, and then compare the two embeddings with a similarity metric (e.g., cosine similarity) to obtain a similarity score.
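To make the metric concrete, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors are made up for illustration and are not real sentence embeddings; the helper name `cosine_similarity` is our own and is not part of `sentence_transformers`.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy "embeddings" (assumption: invented values, far smaller than real embeddings)
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
opposite = cosine_similarity([1.0, 0.0, 0.0], [-1.0, 0.0, 0.0])
print(same_direction, orthogonal, opposite)
```

The score depends only on the direction of the vectors, not on their length: parallel vectors score close to 1, orthogonal vectors score 0, and opposite vectors score -1.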

In [18]:
def semantic_similarity(sentence1: str, sentence2: str) -> float:
    '''
    Calculate the semantic similarity between two sentences.
    '''
    # Encode threats to get their embeddings
    embedding1 = model.encode(sentence1, convert_to_tensor=True)
    embedding2 = model.encode(sentence2, convert_to_tensor=True)
    # Compute similarity scores of two embeddings
    cosine_scores = util.pytorch_cos_sim(embedding1, embedding2)
    return cosine_scores.item()

#### Example

In [19]:
# Get and print semantic similarity between threat1 and threat2
threat1 = "An adversary relates pseudonymous positions to specific vehicles."
threat2 = "Possibility to discover and control the behaviour and profile of the driver."
print(f"Threat 1: {threat1}\nThreat 2: {threat2}\nSimilarity score: {semantic_similarity(threat1, threat2)}")

Threat 1: An adversary relates pseudonymous positions to specific vehicles.
Threat 2: Possibility to discover and control the behaviour and profile of the driver.
Similarity score: 0.4503408968448639
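With more than two threats, the same score can be computed for every pair to shortlist candidates for the analyst. The following is only a sketch under stated assumptions: `embraceable_pairs`, the `threshold` value and the toy threats are hypothetical, and `jaccard_similarity` is a simple word-overlap stand-in so that the snippet runs without downloading a model. In this notebook, a sentence-level scorer such as `semantic_similarity` could be passed as `scorer` instead.

```python
import itertools


def jaccard_similarity(sentence1: str, sentence2: str) -> float:
    """Stand-in scorer: word-overlap ratio, NOT a semantic model."""
    words1 = set(sentence1.lower().split())
    words2 = set(sentence2.lower().split())
    if not words1 and not words2:
        return 0.0
    return len(words1 & words2) / len(words1 | words2)


def embraceable_pairs(threats, scorer, threshold):
    """Return (threat_a, threat_b, score) for each pair scoring at or above the threshold."""
    pairs = []
    for a, b in itertools.combinations(threats, 2):
        score = scorer(a, b)
        if score >= threshold:
            pairs.append((a, b, score))
    # Highest-scoring candidates first, so the analyst can review them in order
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Toy threats (assumption: invented for illustration)
toy_threats = [
    "adversary tracks vehicle position",
    "adversary tracks driver position",
    "firmware update is tampered",
]
# The threshold of 0.5 is an arbitrary illustrative value, not a recommendation
candidates = embraceable_pairs(toy_threats, jaccard_similarity, threshold=0.5)
for a, b, score in candidates:
    print(f"{a!r} <-> {b!r}: {score:.2f}")
```

A suitable threshold would need to be calibrated for the chosen scorer; the final decision on whether two threats are embraceable stays with the analyst.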


## Synset Relations

We also support the analyst in choosing the term with the most pertinent wording and level of detail by leveraging synset (synonym set) relations.

In particular, we are interested in two types of relationships: the "type of" synset relation, which regards the hypernyms/hyponyms, and the "part of" synset relation, which regards part meronyms/part holonyms.
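To illustrate what following a relation up to a given depth means, here is a small sketch on a hand-written toy taxonomy. The `TOY_HYPERNYMS` data and the `bounded_closure` helper are assumptions made for illustration; they are neither WordNet data nor the NLTK API.

```python
from collections import deque

# Hand-written toy "type of" links (child -> parents); NOT actual WordNet data
TOY_HYPERNYMS = {
    "sedan": ["car"],
    "car": ["motor_vehicle"],
    "motor_vehicle": ["vehicle"],
    "vehicle": ["artifact"],
    "artifact": [],
}


def bounded_closure(term, relation, depth):
    """Breadth-first traversal of `relation`, following at most `depth` links."""
    found = []
    seen = {term}
    queue = deque([(term, 0)])
    while queue:
        current, level = queue.popleft()
        if level == depth:
            # Do not expand terms that are already `depth` links away
            continue
        for neighbour in relation.get(current, []):
            if neighbour not in seen:
                seen.add(neighbour)
                found.append(neighbour)
                queue.append((neighbour, level + 1))
    return found


level1 = bounded_closure("sedan", TOY_HYPERNYMS, depth=1)
level3 = bounded_closure("sedan", TOY_HYPERNYMS, depth=3)
print(level1)
print(level3)
```

Each additional level adds more general terms on top of the previous ones, which is why deeper levels always contain the shallower ones.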

In [20]:
def synset_relations(sentence: str) -> dict:
    '''
    Get synset relations for each term within a sentence.
    '''
    result = {'sentence': sentence, 'terms': []}

    doc = nlp(sentence)
    # Get synset relations for nouns only
    nouns = [token.lemma_ for token in doc if token.pos_ == 'NOUN']
    for noun in nouns:
        # By default, take the first synset that WordNet returns for the noun.
        noun_synsets = synsets(noun)
        term = noun_synsets[0] if noun_synsets else None
        if term:
            # If the term is found, then retrieve the synset relations.
            relations = {
                'term': term,
                'synonyms': synsets(noun),
                'meronyms': list(term.closure(part_mero)),
                'holonyms': list(term.closure(part_holo)),
                'hypernyms [L1]': list(term.closure(hyper, depth=1)),
                'hypernyms [L2]': list(term.closure(hyper, depth=2)),
                'hypernyms [L3]': list(term.closure(hyper, depth=3)),
                'hyponyms [L1]': list(term.closure(hypo, depth=1)),
                'hyponyms [L2]': list(term.closure(hypo, depth=2)),
                'hyponyms [L3]': list(term.closure(hypo, depth=3))
            }
            result['terms'].append(relations)
        else:
            print(f'Term "{noun}" not found in corpus!')

    return result

#### Example

In [21]:
# Get and print synset relations for terms in threat1 and threat2
threat1 = "An adversary relates pseudonymous positions to specific vehicles."
threat2 = "Possibility to discover and control the behaviour and profile of the driver."

sr1 = synset_relations(threat1)
print(f'Synset relations for threat: "{threat1}"\n')
for term in sr1.get('terms'):
    print('Term: ', term.get('term'))
    print('Synonyms: ', term.get('synonyms'))
    print('Meronyms: ', term.get('meronyms'))
    print('Holonyms: ', term.get('holonyms'))
    print('Hypernyms [L1]: ', term.get('hypernyms [L1]'))
    print('Hypernyms [L2]: ', term.get('hypernyms [L2]'))
    print('Hypernyms [L3]: ', term.get('hypernyms [L3]'))
    print('Hyponyms [L1]: ', term.get('hyponyms [L1]'))
    print('Hyponyms [L2]: ', term.get('hyponyms [L2]'))
    print('Hyponyms [L3]: ', term.get('hyponyms [L3]'))
    print('\n\n\n---\n\n\n')

print('\n\n\n------------------------------------------------------------\n\n\n')

sr2 = synset_relations(threat2)
print(f'Synset relations for threat: "{threat2}"\n')
for term in sr2.get('terms'):
    print('Term: ', term.get('term'))
    print('Synonyms: ', term.get('synonyms'))
    print('Meronyms: ', term.get('meronyms'))
    print('Holonyms: ', term.get('holonyms'))
    print('Hypernyms [L1]: ', term.get('hypernyms [L1]'))
    print('Hypernyms [L2]: ', term.get('hypernyms [L2]'))
    print('Hypernyms [L3]: ', term.get('hypernyms [L3]'))
    print('Hyponyms [L1]: ', term.get('hyponyms [L1]'))
    print('Hyponyms [L2]: ', term.get('hyponyms [L2]'))
    print('Hyponyms [L3]: ', term.get('hyponyms [L3]'))
    print('\n\n\n---\n\n\n')

Synset relations for threat: "An adversary relates pseudonymous positions to specific vehicles."

Term:  Synset('adversary.n.01')
Synonyms:  [Synset('adversary.n.01')]
Meronyms:  []
Holonyms:  []
Hypernyms [L1]:  [Synset('person.n.01')]
Hypernyms [L2]:  [Synset('person.n.01'), Synset('causal_agent.n.01'), Synset('organism.n.01')]
Hypernyms [L3]:  [Synset('person.n.01'), Synset('causal_agent.n.01'), Synset('organism.n.01'), Synset('physical_entity.n.01'), Synset('living_thing.n.01')]
Hyponyms [L1]:  [Synset('dueler.n.01'), Synset('enemy.n.02'), Synset('luddite.n.01'), Synset('withstander.n.01')]
Hyponyms [L2]:  [Synset('dueler.n.01'), Synset('enemy.n.02'), Synset('luddite.n.01'), Synset('withstander.n.01'), Synset('besieger.n.01')]
Hyponyms [L3]:  [Synset('dueler.n.01'), Synset('enemy.n.02'), Synset('luddite.n.01'), Synset('withstander.n.01'), Synset('besieger.n.01')]



---



Term:  Synset('position.n.01')
Synonyms:  [Synset('position.n.01'), Synset('military_position.n.01'), Synset('

### Check Relationship Between Two Terms

We can check if two terms have a "type of" synset relation with the following code.

In [62]:
def check_typeof_synset_relationship(term1: str, term2: str) -> bool:
    '''
    Check whether two terms have a "type of" synset relation.
    '''
    common_hypernyms = set()
    result = False

    synsets1 = synsets(term1)
    synsets2 = synsets(term2)
    if not synsets1 or not synsets2:
        print('One or both terms do not have synsets in WordNet.')
        return False

    for synset1 in synsets1:
        for synset2 in synsets2:
            if synset1 in list(synset2.closure(hypo, depth=3)):
                print(f'{term1} is a hyponym of {term2}')
                result = True
            elif synset1 in list(synset2.closure(hyper, depth=3)):
                print(f'{term1} is a hypernym of {term2}')
                result = True
            elif synset2 in list(synset1.closure(hypo, depth=3)):
                print(f'{term2} is a hyponym of {term1}')
                result = True
            elif synset2 in list(synset1.closure(hyper, depth=3)):
                print(f'{term2} is a hypernym of {term1}')
                result = True
            common_hypernyms.update(set(synset1.lowest_common_hypernyms(synset2)))

    if common_hypernyms:
        common_hypernym_names = list(set(hypernym.name().split('.')[0] for hypernym in common_hypernyms))
        print(f'{term1} and {term2} have common hypernyms: {", ".join(common_hypernym_names)}')
        result = True

    if not result:
        print(f'{term1} and {term2} are not related ("type of")')
    return result

#### Example

In [67]:
# Get and print "type of" synset relationship between term1 and term2
term1 = 'car'
term2 = 'vehicle'
check_typeof_synset_relationship(term1, term2)

car is a hyponym of vehicle
car and vehicle have common hypernyms: object, instrumentality, physical_entity, vehicle, artifact




True
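One caveat: the common-hypernym check can match very generic concepts (the output above lists `object` and `physical_entity`), so it may report a relation even for loosely related nouns. A possible refinement, sketched here on a hand-written toy taxonomy rather than on WordNet, is to accept a shared ancestor only when it lies deep enough in the hierarchy. `TOY_PARENT`, the helper names and the `min_depth` value are all hypothetical.

```python
# Hand-written toy taxonomy (child -> single parent); NOT actual WordNet data
TOY_PARENT = {
    "car": "vehicle",
    "truck": "vehicle",
    "vehicle": "artifact",
    "window": "artifact",
    "artifact": "entity",
    "idea": "entity",
    "entity": None,
}


def path_to_root(term):
    """List the term followed by each of its ancestors up to the root."""
    path = []
    while term is not None:
        path.append(term)
        term = TOY_PARENT[term]
    return path


def lowest_common_ancestor(term1, term2):
    """Nearest node on term1's path to the root that is also on term2's path."""
    ancestors2 = set(path_to_root(term2))
    for candidate in path_to_root(term1):
        if candidate in ancestors2:
            return candidate
    return None


def related_by_specific_ancestor(term1, term2, min_depth):
    """True only if the shared ancestor is at least `min_depth` links below the root."""
    ancestor = lowest_common_ancestor(term1, term2)
    if ancestor is None:
        return False
    depth = len(path_to_root(ancestor)) - 1  # the root has depth 0
    return depth >= min_depth


# min_depth=2 is an arbitrary illustrative cut-off
print(related_by_specific_ancestor("car", "truck", min_depth=2))
print(related_by_specific_ancestor("car", "idea", min_depth=2))
```

In this toy data, `car` and `truck` share the fairly specific ancestor `vehicle`, whereas `car` and `idea` only meet at the root, so a depth cut-off separates the two cases.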

We can check if two terms have a "part of" synset relation with the following code.

In [24]:
def check_partof_synset_relationship(term1: str, term2: str) -> bool:
    '''
    Check whether two terms have a "part of" synset relation.
    '''
    result = False

    synsets1 = synsets(term1)
    synsets2 = synsets(term2)
    if not synsets1 or not synsets2:
        print('One or both terms do not have synsets in WordNet.')
        return False

    for synset1 in synsets1:
        for synset2 in synsets2:
            if synset1 in list(synset2.closure(part_holo, depth=3)):
                print(f'{term1} is a holonym of {term2}')
                result = True
            elif synset1 in list(synset2.closure(part_mero, depth=3)):
                print(f'{term1} is a meronym of {term2}')
                result = True
            elif synset2 in list(synset1.closure(part_holo, depth=3)):
                print(f'{term2} is a holonym of {term1}')
                result = True
            elif synset2 in list(synset1.closure(part_mero, depth=3)):
                print(f'{term2} is a meronym of {term1}')
                result = True

    if not result:
        print(f'{term1} and {term2} are not related ("part of")')
    return result

#### Example

In [25]:
# Get and print "part of" synset relationship between term1 and term2
term1 = 'car'
term2 = 'window'
check_partof_synset_relationship(term1, term2)

car is a holonym of window


True