# ASN3 - Automated Requirements Interaction

In order to analyze requirements interaction, we first need a list of requirements.
Our requirements will be in the following format:

```
USX: User story title

As a: target role
I want to: perform an action
So that: a goal is met
```

I'll use a class to store a story. There won't be many actions here; it'll mainly act as a named structure type.

In [2]:
class Story:
    """
    Story contains all known data about a given user story.
    """
    def __init__(self, number, title, role, action, goal):
        self.number = number
        self.title = title
        self.role = role
        self.action = action
        self.goal = goal
        
    def __repr__(self):
        return self.__str__()
        
    def __str__(self):
        return ("Story " + str(self.number) + ": " + self.title +
               " As a " + self.role + ", I would like to " + self.action +
               " so that: " + self.goal)

Next, I set up the regular expressions used to parse the file, which follows the format above.

In [3]:
# These regular expressions match the format above.
# Each capturing group extracts one element:
# 1. The number (X above)
# 2. The story's title
# 3. The user role, action, or goal
# Raw strings keep backslashes such as \d from being read as string escapes.
re_num = r"(?:US)(\d+)"
re_title = r"(?:\W*)(.+)"
re_detail = r"(?:[\w\s]+:\s*)(.+)"
# searches[i] is the pattern for the i-th non-blank line of a story.
searches = [re_num + re_title, re_detail, re_detail, re_detail]
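
As a quick sanity check, here is a small sketch of what these patterns capture. The sample lines are hypothetical, written only to follow the USX format; they are not read from `stories.txt`.

```python
import re

# Same patterns as above, written as raw strings.
re_num = r"(?:US)(\d+)"
re_title = r"(?:\W*)(.+)"
re_detail = r"(?:[\w\s]+:\s*)(.+)"

# Hypothetical sample lines in the USX format (not taken from the real file).
sample = [
    "US1: User abilities - super-duper-user",
    "As a: system user",
    "I want to: be a super-duper-user",
    "So that: I have access to all system abilities",
]

# The header line yields two groups: the number and the title.
header = re.search(re_num + re_title, sample[0])
number, title = header.group(1), header.group(2)

# Each detail line yields one group: the text after the "label:" prefix.
details = [re.search(re_detail, line).group(1) for line in sample[1:]]

print(number, "|", title)
print(details)
```

The number comes back as a string, which is why the parsing loop converts it with `int()` before building a `Story`.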

Finally, we set up the file for parsing and get an array of stories.

In [4]:
import re
import sys
stories = []

curr_story = []
index = 0


with open("stories.txt") as s:
    for line in s:
        # blank line: nothing to capture
        if not line.strip():
            pass
        # try to parse the line
        else:
            r = re.search(searches[index], line)
            if r is None:
                sys.exit("Failed to parse line: " + line.strip())
            # group 1 is the number (header line) or the detail text
            curr_story.append(r.group(1))
            if index == 0:
                # the header line also carries the title in group 2
                curr_story.append(r.group(2))
            index += 1
            if index >= 4:
                stories.append(Story(int(curr_story[0]), *curr_story[1:]))
                index = 0
                curr_story = []
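
The same loop can also be written as a function over any iterable of lines, which makes it easy to try on an in-memory string. This is only a sketch of an alternative: `parse_stories`, `HEADER`, and `DETAIL` are names I made up here, it yields plain tuples instead of `Story` objects so the block stays self-contained, and it raises `ValueError` rather than calling `sys.exit`.

```python
import re

# Hypothetical names; the patterns are equivalent to the ones defined earlier.
HEADER = re.compile(r"US(\d+)\W*(.+)")
DETAIL = re.compile(r"[\w\s]+:\s*(.+)")


def parse_stories(lines):
    """Yield (number, title, role, action, goal) tuples from USX-formatted lines."""
    fields = []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines carry no data
        # The first non-blank line of a story is the header; the rest are details.
        pattern = HEADER if not fields else DETAIL
        match = pattern.search(line)
        if match is None:
            raise ValueError("Failed to parse line {}: {!r}".format(line_no, line))
        fields.extend(match.groups())
        if len(fields) == 5:
            number, *rest = fields
            yield (int(number), *rest)
            fields = []


# Hypothetical input, not the contents of stories.txt.
text = """US1: Title one

As a: system user
I want to: do a thing
So that: a goal is met

US2: Title two

As a: admin
I want to: hide objects
So that: they stay private
"""

parsed_stories = list(parse_stories(text.splitlines()))
print(parsed_stories)
```

Each tuple can be passed straight to the constructor, for example `Story(*parsed_stories[0])`.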

Now we have an array of stories. Here's an example of the result:

In [5]:
print(stories[0])

Story 1: User abilities - super-duper-user As a system user, I would like to be a super-duper-user so that: I have access to all system abilities, including the ability to view restricted content, and to assign roles


That's a lot of text, but it doesn't tell us much.

---

## Natural Language Processing
For this task (and lots of others), I'll be employing a library called [spaCy](https://spacy.io). spaCy is better described as an NLP toolkit. It is written in Cython, which lets it combine the simplicity and expressiveness of Python with the speed of compiled C.

spaCy has pre-trained neural networks and predefined rulesets for the English language (and others) that can analyze a body of text or small snippets, including computing similarity between two strings.

This similarity score will be my main tool for detecting interaction.

Let's take a look at the action for User Story 1 compared with a few of the others (including itself!):

In [6]:
import spacy

nlp = spacy.load('en_core_web_lg') # using this model for better precomputed vectors

parsed = [nlp(s.action) for s in stories]

print("US" + str(stories[0].number) + ":", parsed[0], "compared with:")

for i in range(10):
    print(parsed[0].similarity(parsed[i]), "(US" + str(stories[i].number) + "):", parsed[i], "\n")

US1: be a super-duper-user compared with:
0.999999987867 (US1): be a super-duper-user 

0.360403269612 (US2): be able to designate roles to user accounts 

0.903838203828 (US3): be a super-user 

0.428586439198 (US4): add users to a workgroup 

0.361153475569 (US5): hide objects 

0.372441500027 (US6): set different levels of editing rights based on individual user(s) 

0.311382344599 (US7): embargo content 

0.425561429019 (US8): have access to the master files 

0.442165713863 (US9): designate public access levels for workgroup content once it is published 

0.216929744028 (US10): control which users can send submissions to me 



The [similarity](https://spacy.io/api/doc#similarity) method in spaCy defaults to the [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity) between the two documents' vectors, where each document vector is the average of its word vectors.
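
To make the measure concrete, here is a minimal pure-Python sketch of cosine similarity over averaged vectors, which is how I understand spaCy's default from its documentation. The helper names and the two-dimensional toy vectors are my own; real model vectors have hundreds of dimensions, and this is not spaCy's implementation.

```python
import math


def average_vector(vectors):
    """Component-wise mean of equal-length vectors (one per word)."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy "word vectors" for two short phrases (made-up numbers).
phrase_a = average_vector([[1.0, 0.0], [0.0, 1.0]])
phrase_b = average_vector([[2.0, 0.0], [0.0, 2.0]])

print(cosine_similarity(phrase_a, phrase_b))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Because only the angle matters, vectors that point in the same direction score close to 1 regardless of their length, and orthogonal vectors score 0.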

spaCy's statistical components (such as the tagger and parser) use a pretrained 4-layer convolutional neural network. The similarity score, however, comes from the model's pretrained word vectors, which are of the kind produced by [word2vec](https://en.wikipedia.org/wiki/Word2vec)-style training on a large English corpus. Because every token's vector contributes to the document average, removing stop words would change the scores, so I will leave them in to preserve the phrase structure of each story.

Further reading on the systems employed by spaCy in this situation is available [here](https://spacy.io/usage/vectors-similarity).

---

## Process of Determining Interaction

I will be relying heavily on the similarity score provided by spaCy to determine whether two stories interact.
Based on the answer set for the test data, **I will determine an appropriate weight for the similarity scores of the goal and action sections of the stories, then combine the two into a binary result: interaction or no interaction.** The title and number will have no bearing on interaction.
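
One way this combination could look is a weighted average followed by a cutoff. The sketch below is only an illustration: the function names, the default weight of 0.5, and the threshold of 0.6 are placeholders I chose, not values tuned against the answer set.

```python
def combined_score(action_sim, goal_sim, action_weight=0.5):
    """Weighted average of the action and goal similarity scores.

    action_weight is a placeholder; the goal gets the remaining weight.
    """
    return action_weight * action_sim + (1.0 - action_weight) * goal_sim


def interacts(action_sim, goal_sim, action_weight=0.5, threshold=0.6):
    """Binary decision: True if the combined score reaches the threshold."""
    return combined_score(action_sim, goal_sim, action_weight) >= threshold


# Made-up similarity values, for illustration only.
print(combined_score(0.9, 0.5, action_weight=0.75))
print(interacts(0.9, 0.8))
print(interacts(0.3, 0.2))
```

Tuning then amounts to choosing `action_weight` and `threshold` so that the binary results agree with the answer set as closely as possible.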

## Determining Relevancy to the Resource

Not all interactions, however, are relevant to the resource being selected. Therefore, I will also compute a similarity score between the resource word or phrase and the same story components as before (action and goal), hand-adjusting the sensitivity against the solution data to balance including relevant stories with excluding irrelevant ones.
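
A rough sketch of such a filter follows. To keep it runnable without a spaCy model, it uses a stand-in similarity (token overlap) that is not spaCy's measure; in practice the `similarity` argument would wrap spaCy's score. The function names, the choice to take the better of the two component scores, and the default sensitivity are all assumptions of mine.

```python
def token_overlap(a, b):
    """Stand-in similarity: Jaccard overlap of lowercase tokens (NOT spaCy's measure)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_relevant(resource, action, goal, similarity=token_overlap, sensitivity=0.1):
    """True if the resource is similar enough to the story's action or goal.

    Taking the better of the two scores is one possible choice; a weighted
    combination would work as well. sensitivity is a hand-tuned placeholder.
    """
    best = max(similarity(resource, action), similarity(resource, goal))
    return best >= sensitivity


# Hypothetical resource phrase and story components.
print(is_relevant("workgroup content", "add users to a workgroup", "members can collaborate"))
print(is_relevant("workgroup content", "hide objects", "they are not visible"))
```

Raising `sensitivity` excludes more stories; lowering it includes more, which is the balance described above.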

## Results

After the above is completed, it is trivial to output a list of pairs for checking against an answer set.
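
For illustration, pair generation and a comparison against an answer set could be sketched as below. The scores, the `US1, US3` output format, and the helper names are made up for the example; the real answer-set format may differ.

```python
from itertools import combinations


def interacting_pairs(numbers, score, threshold):
    """Return every unordered pair of story numbers whose score meets the threshold."""
    return [(a, b) for a, b in combinations(sorted(numbers), 2)
            if score(a, b) >= threshold]


def format_pairs(pairs):
    """Render pairs as text lines (the format here is an assumption)."""
    return ["US{}, US{}".format(a, b) for a, b in pairs]


# Made-up combined scores for three stories.
toy_scores = {(1, 2): 0.2, (1, 3): 0.9, (2, 3): 0.7}
pairs = interacting_pairs([1, 2, 3], lambda a, b: toy_scores[(a, b)], threshold=0.6)
lines = format_pairs(pairs)

# Checking against a hypothetical answer set with plain set arithmetic.
answer_set = {(1, 3)}
false_positives = set(pairs) - answer_set
missed = answer_set - set(pairs)

print(lines)
print(false_positives, missed)
```

Sorting the numbers first keeps each pair in a canonical order, so `(1, 3)` and `(3, 1)` are never both reported.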
