# Engineering Insights


# 1. Setup

To prepare your environment, you need to install some packages and enter credentials for the Watson services.

# 1.1 Install the necessary packages

You need the latest versions of these packages:<br>
** Watson Developer Cloud:** a client library for Watson services.<br>
** NLTK: **leading platform for building Python programs to work with human language data.<br>
** stop_words: **List of common stop words.<br>
** python-swiftclient:** is a python client for the Swift API.<br>
** websocket-client: ** is a python client for the Websockets.<br>
** pyorient: ** is a python client for the Orient DB.<br><br>

** Install the Watson Developer Cloud package: **


In [1]:
!pip install --upgrade watson-developer-cloud

Requirement already up-to-date: watson-developer-cloud in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s96f-564ffbeb843c8a-f1d344ff0b95/.local/lib/python2.7/site-packages
Requirement already up-to-date: pyOpenSSL>=16.2.0 in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s96f-564ffbeb843c8a-f1d344ff0b95/.local/lib/python2.7/site-packages (from watson-developer-cloud)
Requirement already up-to-date: python-dateutil>=2.5.3 in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s96f-564ffbeb843c8a-f1d344ff0b95/.local/lib/python2.7/site-packages (from watson-developer-cloud)
Requirement already up-to-date: requests<3.0,>=2.0 in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s96f-564ffbeb843c8a-f1d344ff0b95/.local/lib/python2.7/site-packages (from watson-developer-cloud)
Requirement already up-to-date: pysolr<4.0,>=3.3 in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s96f-564ffbeb843c8a-f1d344ff0b95/.local/lib/python2.7/site-packages (from watson-developer-cloud)
Requirement already up-to-date: cry

** Install NLTK: **

In [2]:
!pip install --upgrade nltk

Requirement already up-to-date: nltk in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s96f-564ffbeb843c8a-f1d344ff0b95/.local/lib/python2.7/site-packages
Requirement already up-to-date: six in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s96f-564ffbeb843c8a-f1d344ff0b95/.local/lib/python2.7/site-packages (from nltk)


** Install stop_words **

In [3]:
!pip install stop-words



** Install websocket client: **

In [4]:
!pip install websocket-client



** Install pyorient: **

In [5]:
! pip install pyorient --user



# 1.2 Import packages and libraries 

Import the packages and libraries that you'll use:

In [6]:
import json
import watson_developer_cloud
from watson_developer_cloud import NaturalLanguageUnderstandingV1
from watson_developer_cloud.natural_language_understanding_v1 \
  import Features, EntitiesOptions, KeywordsOptions
    
import swiftclient
import re
import nltk
from nltk.cluster.util import cosine_distance
from stop_words import get_stop_words
import numpy

import websocket
import thread
import time

from io import BytesIO
import pandas as pd
import pyorient, json
import sys

# 2. Configuration

Add configurable items of the notebook below
## 2.1 Add your service credentials from Bluemix for the Watson services

You must create a Watson Natural Language Understanding service on Bluemix. Create a service for Natural Language Understanding (NLU). Insert the username and password values for your NLU in the following cell. Do not change the values of the version fields.

Run the cell.

In [7]:
# @hidden_cell
natural_language_understanding = NaturalLanguageUnderstandingV1(
    version='2017-02-27',
    username="2e405d6f-a079-4aaa-9fad-6ce3d80b34da",
    password="AL7vLlEulkVv")

## 2.2 Add your service credentials for Object Storage

You must create Object Storage service on Bluemix. To access data in a file in Object Storage, you need the Object Storage authentication credentials. Insert the Object Storage authentication credentials as credentials_1 in the following cell after removing the current contents in the cell.

In [8]:
# @hidden_cell
# credentials_1 = {
#   'auth_url':'https://identity.open.softlayer.com',
#   'project':'object_storage_5fb1ce0f_6a6f_431b_8c75_652f4cfdc454',
#   'project_id':'63d358127185445ea99ea6f6ba08ef0d',
#   'region':'dallas',
#   'user_id':'38f0cfce8c7446f599a03562266e426d',
#   'domain_id':'14638fcddb2b424ebfaaa0cc0676944d',
#   'domain_name':'1143539',
#   'username':'member_6dcebb0bdc7cb0a438ff89f426cd626db89da7f2',
#   'password':"""ZL,Dlea!]2b]Ba8*""",
#   'container':'CompositeJourney',
#   'tenantId':'undefined',
#   'filename':'SampleData_CompositeJourney.xlsx'
# }
# @hidden_cell
credentials_1 = {
  'auth_url':'https://identity.open.softlayer.com',
  'project':'object_storage_5cb15816_2f10_4d12_8209_0a937d89e342',
  'project_id':'683b6ba01c904c529487c93a6947adb4',
  'region':'dallas',
  'user_id':'667a89019b8e46f0a3b571e5e591e814',
  'domain_id':'ff2831ecc08041d0a849a338d28a0f89',
  'domain_name':'897161',
  'username':'member_5b96dc07f770de882973767e2b4d0a7ef7e848dd',
  'password':"""H{Sf3Mo(4G.)oP4V""",
  'container':'EICompositeJourney',
  'filename':'SampleData_CompositeJourney.xlsx'
}


# 3.  Watson Text Classification

Write the classification related utility functions in a modularalized form.

## 3.1 Watson NLU Classification

In [9]:
def analyze_using_NLU(analysistext):
    """ Call Watson Natural Language Understanding service to obtain analysis results.
    """
    response = natural_language_understanding.analyze( 
        text=analysistext,
        features=Features(entities=EntitiesOptions(), 
                          keywords=KeywordsOptions()))
    return response

## 3.2 Augumented Classification

Custom classification utlity fucntions for augumenting the results of Watson NLU API call

In [10]:
def split_sentences(text):
    """ Split text into sentences.
    """
    sentence_delimiters = re.compile(u'[\\[\\]\n.!?]')
    sentences = sentence_delimiters.split(text)
    return sentences

def split_into_tokens(text):
    """ Split text into tokens.
    """
    tokens = nltk.word_tokenize(text)
    return tokens
    
def POS_tagging(text):
    """ Generate Part of speech tagging of the text.
    """
    POSofText = nltk.tag.pos_tag(text)
    return POSofText

def keyword_tagging(tag,tagtext,text):
    """ Tag the text matching keywords.
    """
    if (text.lower().find(tagtext.lower()) != -1):
        return text[text.lower().find(tagtext.lower()):text.lower().find(tagtext.lower())+len(tagtext)]
    else:
        return 'UNKNOWN'
    
def regex_tagging(tag,regex,text):
    """ Tag the text matching REGEX.
    """    
    p = re.compile(regex, re.IGNORECASE)
    matchtext = p.findall(text)
    regex_list=[]    
    if (len(matchtext)>0):
        for regword in matchtext:
            regex_list.append(regword)
    return regex_list

def chunk_tagging(tag,chunk,text):
    """ Tag the text using chunking.
    """
    parsed_cp = nltk.RegexpParser(chunk)
    pos_cp = parsed_cp.parse(text)
    chunk_list=[]
    for root in pos_cp:
        if isinstance(root, nltk.tree.Tree):               
            if root.label() == tag:
                chunk_word = ''
                for child_root in root:
                    chunk_word = chunk_word +' '+ child_root[0]
                chunk_list.append(chunk_word)
    return chunk_list
    
def augument_NLUResponse(responsejson,updateType,text,tag):
    """ Update the NLU response JSON with augumented classifications.
    """
    if(updateType == 'keyword'):
        if not any(d.get('text', None) == text for d in responsejson['keywords']):
            responsejson['keywords'].append({"text":text,"relevance":0.5})
    else:
        if not any(d.get('text', None) == text for d in responsejson['entities']):
            responsejson['entities'].append({"type":tag,"text":text,"relevance":0.5,"count":1})        
    

def classify_text(text, config):
    """ Perform augumented classification of the text.
    """
    
    response = analyze_using_NLU(text)
    responsejson = response
    
    sentenceList = split_sentences(text)
    
    tokens = split_into_tokens(text)
    
    postags = POS_tagging(tokens)
    
    configjson = json.loads(config)
    
    for stages in configjson['configuration']['classification']['stages']:
        # print('Stage - Performing ' + stages['name']+':')
        for steps in stages['steps']:
            # print('    Step - ' + steps['type']+':')
            if (steps['type'] == 'keywords'):
                for keyword in steps['keywords']:
                    for word in sentenceList:
                        wordtag = keyword_tagging(keyword['tag'],keyword['text'],word)
                        if(wordtag != 'UNKNOWN'):
                            # print('      '+keyword['tag']+':'+wordtag)
                            augument_NLUResponse(responsejson,'entities',wordtag,keyword['tag'])
            elif(steps['type'] == 'd_regex'):
                for regex in steps['d_regex']:
                    for word in sentenceList:
                        regextags = regex_tagging(regex['tag'],regex['pattern'],word)
                        if (len(regextags)>0):
                            for words in regextags:
                                # print('      '+regex['tag']+':'+words)
                                augument_NLUResponse(responsejson,'entities',words,regex['tag'])
            elif(steps['type'] == 'chunking'):
                for chunk in steps['chunk']:
                    chunktags = chunk_tagging(chunk['tag'],chunk['pattern'],postags)
                    if (len(chunktags)>0):
                        for words in chunktags:
                            # print('      '+chunk['tag']+':'+words)
                            augument_NLUResponse(responsejson,'entities',words,chunk['tag'])
            else:
                print('UNKNOWN STEP')
    
    return responsejson

def replace_unicode_strings(response):
    """ Convert dict with unicode strings to strings.
    """
    if isinstance(response, dict):
        return {replace_unicode_strings(key): replace_unicode_strings(value) for key, value in response.iteritems()}
    elif isinstance(response, list):
        return [replace_unicode_strings(element) for element in response]
    elif isinstance(response, unicode):
        return response.encode('utf-8')
    else:
        return response

# 4. Correlate text content

In [11]:
stopWords = get_stop_words('english')
# List of words to be ignored for text similarity
stopWords.extend(["The","This","That",".","!","?"])

def compute_text_similarity(text1, text2, text1tags, text2tags):
    """ Compute text similarity using cosine
    """
    sentences_text1 = split_sentences(text1)
    sentences_text2 = split_sentences(text2)
    tokens_text1 = []
    tokens_text2 = []
    
    for sentence in sentences_text1:
        tokenstemp = split_into_tokens(sentence.lower())
        tokens_text1.extend(tokenstemp)
    
    for sentence in sentences_text2:
        tokenstemp = split_into_tokens(sentence.lower())
        tokens_text2.extend(tokenstemp)
    if (len(text1tags) > 0):  
        tokens_text1.extend(text1tags)
    if (len(text2tags) > 0):    
        tokens_text2.extend(text2tags)
    
    tokens1Filtered = [x for x in tokens_text1 if x not in stopWords]
    
    tokens2Filtered = [x for x in tokens_text2 if x not in stopWords]
    
    #  remove duplicate tokens
    tokens1Filtered = set(tokens1Filtered)
    tokens2Filtered = set(tokens2Filtered)
   
    tokensList=[]

    text1vector = []
    text2vector = []
    
    if len(tokens1Filtered) < len(tokens2Filtered):
        tokensList = tokens1Filtered
    else:
        tokensList = tokens2Filtered

    for token in tokensList:
        if token in tokens1Filtered:
            text1vector.append(1)
        else:
            text1vector.append(0)
        if token in tokens2Filtered:
            text2vector.append(1)
        else:
            text2vector.append(0)  

    cosine_similarity = 1-cosine_distance(text1vector,text2vector)
    if numpy.isnan(cosine_similarity):
        cosine_similarity = 0
    
    return cosine_similarity

# 5. Persistence and Storage
## 5.1 Configure Object Storage Client

In [12]:
auth_url = credentials_1['auth_url']+"/v3"
container = credentials_1["container"]

IBM_Objectstorage_Connection = swiftclient.Connection(
    key=credentials_1['password'], authurl=auth_url, auth_version='3', os_options={
        "project_id": credentials_1['project_id'], "user_id": credentials_1['user_id'], "region_name": credentials_1['region']})

def create_container(container_name):
    """ Create a container on Object Storage.
    """
    x = IBM_Objectstorage_Connection.put_container(container_name)
    return x

def put_object(container_name, fname, contents, content_type):
    """ Write contents to Object Storage.
    """
    x = IBM_Objectstorage_Connection.put_object(
        container_name,
        fname,
        contents,
        content_type)
    return x

def get_object(container_name, fname):
    """ Retrieve contents from Object Storage.
    """
    Object_Store_file_details = IBM_Objectstorage_Connection.get_object(
        container_name, fname)
    return Object_Store_file_details[1]

def get_object_BytesIO(container_name, fname):
    """ Retrieve contents as BytesIO from Object Storage
    """
    Object_Store_file_details = IBM_Objectstorage_Connection.get_object(
        container_name, fname)
    return BytesIO(Object_Store_file_details[1])

## 5.2 OrientDB client - functions to connect, store and retrieve data

** Connect to OrientDB **

In [13]:
client = pyorient.OrientDB(host="184.172.242.125", port=32092)
user = "root"
passw = "XNclkac21nx"
session_id = client.connect(user, passw)

** OrientDB Core functions **

In [14]:
def create_database(dbname, username, password):
    """ Create a database
    """
    client.db_create( dbname, pyorient.DB_TYPE_GRAPH, pyorient.STORAGE_TYPE_MEMORY )
    print dbname  + " created and opened successfully"
        
def drop_database(dbname):
    """ Drop a database
    """
    if client.db_exists( dbname, pyorient.STORAGE_TYPE_MEMORY ):
        client.db_drop(dbname)
    
def create_class(classname):
    """ Create a class
    """
    command = "create class "+classname + " extends V"
    client.command(command)
    
def create_record(classname, entityname, attributes):
    """ Create a record
    """
    command = "insert into " + classname + " set " 
    attrstring = ""
    for index,key in enumerate(attributes):
        attrstring = attrstring + key + " = '"+ attributes[key] + "'"
        if index != len(attributes) -1:
            attrstring = attrstring +","
    command = command + attrstring
    client.command(command)
    
def create_defect_testcase_edge(defectid, testcaseid, attributes):
    """ Create an edge between a defect and a testcase
    """
    command = "create edge linkedtestcases from (select from Defect where ID = " + "'" + defectid + "') to (select from Testcase where ID = " + "'" + testcaseid + "')" 
    if len(attributes) > 0:
        command = command + " set "
    attrstring = ""
    for index,key in enumerate(attributes):
        val = attributes[key]
        if not isinstance(val, str):
            val = str(val)
        attrstring = attrstring + key + " = '"+ val + "'"
        if index != len(attributes) -1:
            attrstring = attrstring +","
    command = command + attrstring
    client.command(command)    
    
def create_testcase_requirement_edge(testcaseid, reqid, attributes):
    """ Create an edge between a testcase and a requirement
    """
    command = "create edge linkedrequirements from (select from Testcase where ID = "+ "'" + testcaseid+"') to (select from Requirement where ID = "+"'"+reqid+"')" 
    if len(attributes) > 0:
        command = command + " set "
    attrstring = ""
    for index,key in enumerate(attributes):
        val = attributes[key]
        if not isinstance(val, str):
            val = str(val)
        attrstring = attrstring + key + " = '"+ val + "'"
        if index != len(attributes) -1:
            attrstring = attrstring +","
    command = command + attrstring
    client.command(command)  

    
def create_requirement_defect_edge(reqid, defectid, attributes):
    """ Create an edge between a requirement and a defect
    """
    command = "create edge linkeddefects from (select from Requirement where ID = "+ "'" + reqid+"') to (select from Defect where ID = "+"'"+defectid+"')" 
    
    if len(attributes) > 0:
         command = command + " set "
    attrstring = ""
    for index,key in enumerate(attributes):
        val = attributes[key]
        if not isinstance(val, str):
            val = str(val)
        attrstring = attrstring + key + " = '"+ val + "'"
        if index != len(attributes) -1:
            attrstring = attrstring +","
    command = command + attrstring
    client.command(command) 
    
def execute_query(query):
    """ Execute a query
    """
    return client.query(query)

** OrientDB Insights **

In [15]:
def get_related_testcases(defectid):
    """ Get the related testcases for a defect
    """
    testcasesQuery = "select * from ( select expand( out('linkedtestcases')) from Defect where ID = '" + defectid +"' )"
    testcases = execute_query(testcasesQuery)
    scoresQuery = "select expand(out_linkedtestcases) from Defect where ID = '"+defectid+"'"
    scores = execute_query(scoresQuery)
    testcaseList =[]
    scoresList= []
    for testcase in testcases:
        testcaseList.append(testcase.ID)
    for score in scores:
        scoresList.append(score.score)
    result = {}
    length = len(testcaseList)
    for i in range(0, length):
        result[testcaseList[i]] = scoresList[i]
    return result

def get_related_requirements(testcaseid):
    """ Get the related requirements for a testcase
    """
    requirementsQuery = "select * from ( select expand( out('linkedrequirements') ) from Testcase where ID = '" + testcaseid +"' )"
    requirements = execute_query(requirementsQuery)
    print requirements
    scoresQuery = "select expand(out_linkedrequirements) from Testcase where ID = '"+testcaseid+"'"
    scores = execute_query(scoresQuery)
    requirementsList =[]
    scoresList= []
    for requirement in requirements:
        requirementsList.append(requirement.ID)
    for score in scores:
        scoresList.append(score.score)
    result = {}
    length = len(requirementsList)
    print requirementsList, scoresList
    for i in range(0, length):
        result[requirementsList[i]] = scoresList[i]
    return result

def get_related_defects(reqid):
    """ Get the related defects for a requirement
    """
    defectsQuery = "select * from ( select expand( out('linkeddefects')) from Requirement where ID = '" + reqid +"' )"
    defects = execute_query(defectsQuery)
    scoresQuery = "select expand(out_linkeddefects) from Requirement where ID = '"+reqid+"'"
    scores = execute_query(scoresQuery)
    defectsList =[]
    scoresList= []
    for defect in defects:
        defectsList.append(defect.ID)
    for score in scores:
        scoresList.append(score.score)
    result = {}
    length = len(defectsList)
    for i in range(0, length):
        result[defectsList[i]] = scoresList[i]
    return result

def build_format_defects_list(defectsResult):
    """ Build and format the OrientDB query results for defects
    """
    defects = []
    for defect in defectsResult:
        detail = {}
        detail['ID'] = defect.ID
        detail['Severity'] = defect.Severity
        detail['Description'] = defect.Description
        defects.append(detail)
    return defects

def build_format_testcases_list(testcasesResult):
    """ Build and format the OrientDB query results for testcases
    """
    testcases = []
    for testcase in testcasesResult:
        detail = {}
        detail['ID'] = testcase.ID
        detail['Category'] = testcase.Category
        detail['Description'] = testcase.Description
        testcases.append(detail)
    return testcases  

def build_format_requirements_list(requirementsResult):
    """ Build and format the OrientDB query results for requirements
    """
    requirements = []
    for requirement in requirementsResult:
        detail = {}
        detail['ID'] =requirement.ID
        detail['Description'] = requirement.Description
        detail['Priority'] = requirement.Priority
        requirements.append(detail)
    return requirements  

def get_defects():
    """ Get all defects
    """
    defectsQuery = "select * from Defect"
    defectsResult = execute_query(defectsQuery)
    defects = build_format_defects_list(defectsResult)
    return defects

def get_testcases():
    """ Get all testcases
    """
    testcasesQuery = "select * from Testcase"
    testcasesResult = execute_query(testcasesQuery)
    testcases = build_format_testcases_list(testcasesResult)
    return testcases

def get_requirements():
    """ Get all requirements
    """
    requirementsQuery = "select * from Requirement"
    requirementsResult =  execute_query(requirementsQuery)
    requirements = build_format_requirements_list(requirementsResult)
    return requirements  

def get_defects_severity(severity):
    """ Get defects of a given severity
    """
    query = "select * from Defect where Severity = " + str(severity)
    queryResult =  execute_query(query)
    defects = build_format_defects_list(queryResult)    
    return defects

def get_testcases_category(category):
    """ Get testcases of a given category
    """
    testcasesQuery = "select * from Testcase where Category = '"+str(category)+"'"
    testcasesResult = execute_query(testcasesQuery)
    testcases = build_format_testcases_list(testcasesResult)
    return testcases

def get_testcases_zero_defects():
    """ Get testcases that did not generate any defects
    """
    testcasesQuery = "Select * from Testcase where in('linkedtestcases').size() = 0"
    testcasesResult = execute_query(testcasesQuery)
    testcases = build_format_testcases_list(testcasesResult)
    return testcases

def get_defects_zero_testcases():
    """ Get defects that have no associated testcases
    """
    query = "Select * from Defect where out('linkedtestcases').size() = 0"
    queryResult =  execute_query(query)
    defects = build_format_defects_list(queryResult)    
    return defects

def get_requirements_zero_defect():
    """ Get requirements that have no defects
    """
    query = "Select * from Requirement where out('linkeddefects').size() = 0"
    requirementsResult =  execute_query(query)
    requirements = build_format_requirements_list(requirementsResult)
    return requirements  

def get_requirements_zero_testcases():
    """ Get requirements that have no associated testcases
    """
    query = "Select * from Requirement where in('linkedrequirements').size() = 0"
    requirementsResult =  execute_query(query)
    requirements = build_format_requirements_list(requirementsResult)
    return requirements  
    
def get_requirement_max_defects():
    """ Get requirement that has maximum defects
    """
    query = "select ID,Description,Priority from Requirement LET $a = (select max(out) from (select out('linkeddefects').size() as out from Requirement)) where out('linkeddefects').size()=first($a).max"
    requirementsResult =  execute_query(query)
    requirements = build_format_requirements_list(requirementsResult)
    for requirement in requirements:
        num = len(get_related_defects(requirement['ID']))
        requirement['defectcount'] = num
    return requirements  

def get_requirement_defects(numdefects):
    """ Get requirements that have more than a given number of defects
    """
    query = "select ID,Description,Priority from Requirement where out('linkeddefects').size() > " + str(numdefects)
    requirementsResult =  execute_query(query)
    requirements = build_format_requirements_list(requirementsResult)
    for requirement in requirements:
        num = len(get_related_defects(requirement['ID']))
        requirement['defectcount'] = num
    return requirements  

# 6. Data Preparation

## 6.1 Global variables and functions

In [16]:
# Name of the excel file with data in Object Storage
dataFileName = "SampleData_CompositeJourney_Final.xlsx"

# Name of the config file in Object Storage
configFileName = "config.txt"

# Config contents
config = None;

# Data file
datafile = None

# Requirements dataframe
requirements_sheet_name = "Requirements"
requirements_df = None

# Defects dataframe
defects_sheet_name = "Defects"
defects_df = None

# Testcases dataframe
testcases_sheet_name ="TestCases"
testcases_df = None

def load_artifacts():
    """ Load the artifacts into a pandas dataframe
    """
    global requirements_df 
    global defects_df 
    global testcases_df 
    global config
    global datafile
    config = get_object(container, configFileName)
    datafile = get_object_BytesIO(container, dataFileName)
    excel_file = pd.ExcelFile(datafile)
    requirements_df = excel_file.parse(requirements_sheet_name)
    defects_df = excel_file.parse(defects_sheet_name)
    testcases_df = excel_file.parse(testcases_sheet_name)
    
def prepare_artifact_dataframes():
    """ Prepare artifact dataframes by creating necessary output columns
    """
    global requirements_df 
    global defects_df 
    global testcases_df 
    req_cols_len = len(requirements_df.columns)
    def_cols_len = len(defects_df.columns)
    tcs_cols_len = len(testcases_df.columns)
    requirements_df.insert(req_cols_len, "ClassifiedText","")
    requirements_df.insert(req_cols_len+1, "Keywords","")
    requirements_df.insert(req_cols_len+2, "DefectsMatchScore","")
    requirements_df.insert(req_cols_len+3, "CosineMatch","")
    defects_df.insert(def_cols_len, "ClassifiedText","")
    defects_df.insert(def_cols_len+1, "Keywords","")
    defects_df.insert(def_cols_len+2, "TestCasesMatchScore","")
    defects_df.insert(def_cols_len+3, "CosineMatch","")
    testcases_df.insert(tcs_cols_len, "ClassifiedText","")
    testcases_df.insert(tcs_cols_len+1, "Keywords","")
    testcases_df.insert(tcs_cols_len+2, "RequirementsMatchScore","")
    testcases_df.insert(tcs_cols_len+3, "CosineMatch","")

## 6.2 Utility functions for Engineering Insights

In [17]:
def add_text_classifier_output(artifact_df, config, output_column_name):
    """ Add Watson text classifier output to the artifact dataframe
    """
    for index, row in artifact_df.iterrows():
        summary = row["Description"]
        classifier_journey_output = classify_text(summary, config)
        artifact_df.set_value(index, output_column_name, classifier_journey_output)
    return artifact_df 
           
def add_keywords_entities(artifact_df, classify_text_column_name, output_column_name):
    """ Add keywords and entities to the artifact dataframe"""
    for index, artifact in artifact_df.iterrows():
        keywords_array = []
        for row in artifact[classify_text_column_name]['keywords']:
            if not row['text'] in keywords_array:
                keywords_array.append(row['text'])
                
        for entities in artifact[classify_text_column_name]['entities']:
            if not entities['text'] in keywords_array:
                keywords_array.append(entities['text'])
            if not entities['type'] in keywords_array:
                keywords_array.append(entities['type'])
        artifact_df.set_value(index, output_column_name, keywords_array)
    return artifact_df 

def populate_text_similarity_score(artifact_df1, artifact_df2, keywords_column_name, output_column_name):
    """ Populate text similarity score to the artifact dataframes
    """
#     print "Subject ID", " ", "Related ID", " ", "Text similarity score"
#     print "==============================================="
    for index1, artifact1 in artifact_df1.iterrows():
        matches = []
        top_matches = []
        for index2, artifact2 in artifact_df2.iterrows():
            matches.append({'ID': artifact2['ID'], 
                            'cosine_score': 0, 
                            'SubjectID':artifact1['ID']})
            cosine_score = compute_text_similarity(
                artifact1['Description'], 
                artifact2['Description'], 
                artifact1['Keywords'], 
                artifact2['Keywords'])
            matches[index2]["cosine_score"] = cosine_score
#             print artifact1['ID'],"        ",artifact2['ID'],"        ",cosine_score
        
       
        sorted_obj = sorted(matches, key=lambda x : x['cosine_score'], reverse=True)
      
        for obj in sorted_obj:
            if obj['cosine_score'] > 0.4:
                top_matches.append(obj)
               
        artifact_df1.set_value(index1, output_column_name, top_matches)
    return artifact_df1

## 6.3 Process flow

** Prepare data **
* Load artifacts from object storage and create pandas dataframes
* Prepare the pandas dataframes. Add additional columns required for further processing.

In [18]:
load_artifacts()
prepare_artifact_dataframes()

** Run Watson Text Classifier on data **
* Add the text classification output to the artifact dataframes

In [19]:
output_column_name = "ClassifiedText"
defects_df = add_text_classifier_output(defects_df,config, output_column_name)
testcases_df = add_text_classifier_output(testcases_df,config, output_column_name)
requirements_df = add_text_classifier_output(requirements_df,config, output_column_name)

** Populate keywords and entities **
* Add the keywords and entities extracted from the unstructured text to the artifact dataframes

In [20]:
classify_text_column_name = "ClassifiedText"
output_column_name = "Keywords"
defects_df = add_keywords_entities(defects_df, classify_text_column_name, output_column_name)
testcases_df = add_keywords_entities(testcases_df, classify_text_column_name, output_column_name)
requirements_df = add_keywords_entities(requirements_df, classify_text_column_name, output_column_name)

** Correlate keywords between artifacts **
* Add the text similarity score of associated artifacts to the dataframe

In [21]:
keywords_column_name = "Keywords"
output_column_name = "TestCasesMatchScore"
defects_df = populate_text_similarity_score(defects_df, testcases_df, keywords_column_name, output_column_name)

output_column_name = "RequirementsMatchScore"
testcases_df = populate_text_similarity_score(testcases_df, requirements_df, keywords_column_name, output_column_name)

output_column_name = "DefectsMatchScore"
requirements_df = populate_text_similarity_score(requirements_df, defects_df, keywords_column_name, output_column_name)

** Utility functions to store entities and relations in Orient DB **

In [22]:
def store_requirements(requirements_df):
    """ Store requirements into the database
    """
    for index, row in requirements_df.iterrows():
        attrs = {}
        reqid = row["ID"]
        attrs["Description"] = row["Description"].replace('\n', ' ').replace('\r', '')
        attrs["ID"] = reqid
        attrs["Priority"]= str(row["Priority"])
        create_record(requirement_classname, reqid, attrs)    
        
def store_testcases(testcases_df):  
    """ Store testcases into the database
    """
    for index, row in testcases_df.iterrows():
        attrs = {}
        tcaseid = row["ID"]
        attrs["Description"] = row["Description"].replace('\n', ' ').replace('\r', '')
        attrs["ID"] = tcaseid
        attrs["Category"] = row["Category"]
        create_record(testcase_classname, tcaseid, attrs)
        
def store_defects(defects_df):
    """ Store defects into the database
    """
    for index, row in defects_df.iterrows():
        attrs = {}
        defid = row["ID"]
        attrs["Description"] = row["Description"].replace('\n', ' ').replace('\r', '')
        attrs["ID"] = defid
        attrs["Severity"] = str(row["Severity"])
        create_record(defect_classname, defid, attrs)
        
def store_testcases_requirement_mapping(testcases_df):
    """ Store the related requirements for testcases into the database
    """
    for index, row in testcases_df.iterrows():
        tcaseid = row["ID"]
        requirements = row["RequirementsMatchScore"]
        for requirement in requirements:
            reqid = requirement["ID"]
            attributes = {}
            attributes['score'] = requirement['cosine_score']
            create_testcase_requirement_edge(tcaseid,reqid, attributes)
            
def store_defect_testcase_mapping(defects_df):
    """ Store the related testcases for the defects into the database
    """
    for index, row in defects_df.iterrows():
        defid = row["ID"]
        testcases = row["TestCasesMatchScore"]
        for testcase in testcases:
            testcaseid = testcase["ID"]
            attributes = {}
            attributes['score'] = testcase["cosine_score"]
            create_defect_testcase_edge(defid,testcaseid, attributes)
            
def store_requirement_defect_mapping(requirements_df):
    """ Store the related defects for the requirements in the database
    """
    for index, row in requirements_df.iterrows():
        reqid = row["ID"]
        defects = row["DefectsMatchScore"]
        for defect in defects:
            defectid = defect["ID"]
            cosine_score =  defect["cosine_score"]
            attributes = {}
            attributes['score'] = cosine_score
            create_requirement_defect_edge(reqid,defectid, attributes)

** Store artifacts data and relations into OrientDB **
* Drop and create a database
* Create classes for each category of artifact
* Store artifact data
* Store artifact relations data

In [23]:
drop_database("EInsights")
create_database("EInsights", "admin", "admin")

requirement_classname = "Requirement"
defect_classname = "Defect"
testcase_classname = "Testcase"

create_class(requirement_classname)
create_class(defect_classname)
create_class(testcase_classname)

store_requirements(requirements_df)
store_defects(defects_df)
store_testcases(testcases_df)

store_testcases_requirement_mapping(testcases_df)
store_defect_testcase_mapping(defects_df)
store_requirement_defect_mapping(requirements_df)

EInsights created and opened successfully


# 7. Transform results for Visualization

In [24]:
def get_artifacts_mapping_d3_tree(defectId):
    """ Create an artifacts mapping json for display by d3js tree widget
    """
    depTree = {}
    depTree['ID'] = defectId
    testcases = get_related_testcases(defectId)
    
    depTree['children'] = []
    i=1
    for key in testcases:
        print key,testcases[key]
        testcaseChildren = {}
        testcaseChildren['ID'] = key
        testcaseChildren['Score'] = testcases[key]
        testcaseChildren['children'] = []
        depTree['children'].append(testcaseChildren)
        requirements = get_related_requirements(key)
        
        for key in requirements:
            requirementChildren = {}
            requirementChildren['ID']=key
            requirementChildren['Score']=requirements[key]
            testcaseChildren['children'].append(requirementChildren)
    return depTree 

def get_artifacts_mapping_d3_network(defectid):
    """ Create an artifacts mapping json for display by d3js network widget
    """
    nodes =[]
    links =[] 
    defect = {}
    defect['id'] = defectid
    defect['group'] = 1
    nodes.append(defect)
    
    testcases = get_related_testcases(defectid)
    
    for key in testcases:
        testcase ={}
        testcaseid = key
        testcase['id'] = testcaseid
        testcase['group'] = 2
        if testcase not in nodes:
            nodes.append(testcase)
        
        link = {}
        link['source'] = defectid
        link['target']=testcaseid
        link['value']=testcases[testcaseid]
        links.append(link)
        
        requirements = get_related_requirements(key)
        for key in requirements:
            requirement ={}
            requirement['id'] = key
            requirement['group'] = 3
            if requirement not in nodes:
                nodes.append(requirement)
            
            link = {}
            link['source'] = testcaseid
            link['target'] = key
            link['value'] = requirements[key]
            links.append(link)
    result ={}
    result["nodes"] = nodes
    result["links"] = links
    return result

def get_tc_req_mapping_d3_network(testcaseid):
    """ Create a testcases to requirement mapping json for display by d3js network widget
    """
    nodes =[]
    links =[] 
    testcase = {}
    testcase['id'] = testcaseid
    testcase['group'] = 2
    nodes.append(testcase)
    requirements = get_related_requirements(testcaseid)
    for key in requirements:            
        requirement ={}
        requirement['id'] = key
        requirement['group'] = 3
        nodes.append(requirement)
            
        link = {}
        link['source'] = testcaseid
        link['target'] = key
        link['value'] = requirements[key]
        links.append(link)
    result ={}
    result["nodes"] = nodes
    result["links"] = links
    return result

def transform_defects_d3_bubble(defects):
    """ Transform the defects list output to a json for display by d3js bubble chart"""
    defectsList = {}
    defectsList['name'] = "defect"
    children = []
    for defect in defects:
        detail = {}
        sizeList = [400,230,130]
        detail["ID"] = defect['ID']
        severity = int(defect['Severity'])
        detail["group"] = str(severity)
        detail["size"] = sizeList[severity-1]
        children.append(detail)
    defectsList['children'] = children 
    return defectsList

def transform_testcases_d3_bubble(testcases):
    """ Transform the testcases list output to a json for display by d3js bubble chart"""
    testcasesList = {}
    testcasesList['name'] = "test"
    sizeList = {}
    sizeList["FVT"]=200
    sizeList["TVT"]=110
    sizeList["SVT"]=400
    children = []
    for testcase in testcases:
        detail = {}
        detail["ID"] = testcase['ID']
        detail["group"] = testcase['Category']
        detail["size"]= sizeList[testcase['Category']]
        children.append(detail)
    testcasesList['children'] = children 
    return testcasesList

def transform_requirements_d3_bubble(requirements):
    """ Transform the requirements list output to a json for display by d3js bubble chart"""
    requirementsList = {}
    requirementsList['name'] = "requirement"
    sizeList = {}
    sizeList[1]=400
    sizeList[2]=200
    sizeList[3]=110
    children = []
    for requirement in requirements:
        detail = {}
        detail["ID"] = requirement['ID']
        detail["group"] = requirement['Priority']
        detail["size"]= sizeList[int(requirement['Priority'])]
        if 'defectcount' in requirement:
            detail['defectcount'] = requirement['defectcount']
        children.append(detail)
    requirementsList['children'] = children 
    return requirementsList

def merge_apply_filters_d3_bubble(mainList, filterList):
    """ Add a filter attribute to the list elements for processing on UI
    """
    mainListChildren = mainList['children']
    filterListChildren = filterList['children']
    for child in mainListChildren:
        if child in filterListChildren:
            child['filter'] = 1
        else:
            child['filter'] = 0
    return mainList  

def getArtifactsListForUI(artifact_df):
    artifactsList = artifact_df.ID
    artifactsList = artifactsList.to_json(orient='records')
    artifactsList = json.loads(artifactsList)
    return artifactsList

# 8. Expose integration point with a websocket client

In [None]:
def on_message(ws, message):
    print(message)
    msg = json.loads(message)
    print "message",msg
    cmd = msg['cmd']
    
    print "Command", cmd

    if cmd == 'DefectList':
        wsresponse = {}
        wsresponse["forCmd"] = "DefectList" 
        defects = get_defects()
        wsresponse["response"] = transform_defects_d3_bubble(defects)
        ws.send(json.dumps(wsresponse))

    if cmd == 'TestcaseList':
        wsresponse = {}
        wsresponse["forCmd"] = "TestcaseList"
        testcases = get_testcases()
        wsresponse["response"] = transform_testcases_d3_bubble(testcases)
        ws.send(json.dumps(wsresponse))

    if cmd == 'ReqsList':
        wsresponse = {}
        wsresponse["forCmd"] = "ReqsList"
        requirements = get_requirements()
        wsresponse["response"] = transform_requirements_d3_bubble(requirements)
        ws.send(json.dumps(wsresponse))

    if cmd == 'DefectRelation':
        defect_id = msg['ID']
        wsresponse = {}
        wsresponse["forCmd"] = "DefectRelation" 
        wsresponse["response"] = get_artifacts_mapping_d3_network(defect_id)
        ws.send(json.dumps(wsresponse))

    if cmd == 'TestcaseRelation':
        testcase_id = msg['ID']
        wsresponse = {}
        wsresponse["forCmd"] = "TestcaseRelation" 
        wsresponse["response"] = get_tc_req_mapping_d3_network(testcase_id)
        ws.send(json.dumps(wsresponse))

    if cmd == 'DefectInsight':
        insight_id = msg['ID']
        defects = get_defects()
        defects = transform_defects_d3_bubble(defects)
        if (insight_id.find('Insight1') != -1):
            defectsSev1 = get_defects_severity(1)
            defectsSev1 = transform_defects_d3_bubble(defectsSev1)
            response = merge_apply_filters_d3_bubble(defects, defectsSev1)
        if (insight_id.find('Insight2') != -1):
            defectsSev2 = get_defects_severity(2)
            defectsSev2 = transform_defects_d3_bubble(defectsSev2)
            response = merge_apply_filters_d3_bubble(defects, defectsSev2)
        if (insight_id.find('Insight3') != -1):
            defectsSev3 = get_defects_severity(3)
            defectsSev3 = transform_defects_d3_bubble(defectsSev3)
            response = merge_apply_filters_d3_bubble(defects, defectsSev3)
        if (insight_id.find('Insight4') != -1):
            defects_zero_tc = get_defects_zero_testcases()
            defects_zero_tc = transform_defects_d3_bubble(defects_zero_tc)
            response = merge_apply_filters_d3_bubble(defects, defects_zero_tc)
        wsresponse = {}
        wsresponse["forCmd"] = "Insight" 
        wsresponse["response"] = response
        ws.send(json.dumps(wsresponse))

    if cmd == 'TestInsight':
        insight_id = msg['ID']
        testcases = get_testcases()
        testcases = transform_testcases_d3_bubble(testcases)
        if (insight_id.find('Insight1') != -1):
            fvtTests = get_testcases_category('FVT')
            fvtTests = transform_testcases_d3_bubble(fvtTests)
            response = merge_apply_filters_d3_bubble(testcases, fvtTests)
        if (insight_id.find('Insight2') != -1):
            svtTests = get_testcases_category('SVT')
            svtTests = transform_testcases_d3_bubble(svtTests)
            response = merge_apply_filters_d3_bubble(testcases, svtTests)
        if (insight_id.find('Insight3') != -1):
            tvtTests = get_testcases_category('TVT')
            tvtTests = transform_testcases_d3_bubble(tvtTests)
            response = merge_apply_filters_d3_bubble(testcases, tvtTests)
        if (insight_id.find('Insight4') != -1):
            testcase_zero_defect = get_testcases_zero_defects()
            testcase_zero_defect = transform_testcases_d3_bubble(testcase_zero_defect)
            response = merge_apply_filters_d3_bubble(testcases, testcase_zero_defect)
        wsresponse = {}
        wsresponse["forCmd"] = "Insight" 
        wsresponse["response"] = response
        ws.send(json.dumps(wsresponse))

    if cmd == 'ReqInsight':
        insight_id = msg['ID']
        requirements = get_requirements()
        requirements = transform_requirements_d3_bubble(requirements)
        if (insight_id.find('Insight1') != -1):
            req = get_requirements_zero_defect()
            req = transform_requirements_d3_bubble(req)
            response = merge_apply_filters_d3_bubble(requirements, req)
        if (insight_id.find('Insight2') != -1):
            req = get_requirements_zero_testcases()
            req = transform_requirements_d3_bubble(req)
            response = merge_apply_filters_d3_bubble(requirements, req)
        if (insight_id.find('Insight3') != -1):
            req = get_requirement_max_defects()
            req = transform_requirements_d3_bubble(req)
            response = merge_apply_filters_d3_bubble(requirements, req)
        wsresponse = {}
        wsresponse["forCmd"] = "Insight" 
        wsresponse["response"] = response
        ws.send(json.dumps(wsresponse)) 

def on_error(ws, error):
    print(error)

def on_close(ws):
    print ("DSX Listen End")
    ws.send("DSX Listen End")

def on_open(ws):
    def run(*args):
        for i in range(10000):
            hbeat = '{"cmd":"EI DSX HeartBeat"}'
            ws.send(hbeat)
            time.sleep(100)
            
    thread.start_new_thread(run, ())


def start_websocket_listener():
    websocket.enableTrace(True)
    ws = websocket.WebSocketApp("ws://ui-journey1.mybluemix.net/ws/EI-socket",
                              on_message = on_message,
                              on_error = on_error,
                              on_close = on_close)
    ws.on_open = on_open
    ws.run_forever()

## 8.1 Start websocket client

In [None]:
start_websocket_listener()

--- request header ---
GET /ws/EI-socket HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Host: ui-journey1.mybluemix.net
Origin: http://ui-journey1.mybluemix.net
Sec-WebSocket-Key: HN7GJ0bxMLWebQ7vy6h42w==
Sec-WebSocket-Version: 13


-----------------------
--- response header ---
HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Sec-WebSocket-Accept: 2G6MW/NcIvFSCvum6V5WEtgkfd0=
Date: Sun, 26 Nov 2017 03:58:56 GMT
X-Global-Transaction-ID: 71085855
Upgrade: websocket
-----------------------
send: '\x81\x9amV\xe1m\x16t\x82\x00\tt\xdbO(\x1f\xc1)>\x0e\xc1%\x087\x93\x19/3\x80\x19O+'


{"cmd":"EI DSX HeartBeat","_msgid":"6378a91.814fd58"}
message {u'_msgid': u'6378a91.814fd58', u'cmd': u'EI DSX HeartBeat'}
Command EI DSX HeartBeat
{"cmd":"Client connected","_msgid":"5382822f.6a4a2c"}
message {u'_msgid': u'5382822f.6a4a2c', u'cmd': u'Client connected'}
Command Client connected


send: '\x81\xfe\x03\x8eA\xe0\x7f\x9c:\xc2\x19\xf33\xa3\x12\xf8c\xda_\xbe\x05\x85\x19\xf9"\x943\xf52\x94]\xb0a\xc2\r\xf92\x90\x10\xf22\x85]\xa6a\x9b]\xf2 \x8d\x1a\xbe{\xc0]\xf8$\x86\x1a\xff5\xc2S\xbcc\x83\x17\xf5-\x84\r\xf9/\xc2E\xbc\x1a\x9b]\xfb3\x8f\n\xecc\xda_\xbes\xc2S\xbcc\xa9;\xbe{\xc0]\xd8p\xd0]\xb0a\xc2\x0c\xf5;\x85]\xa6a\xd2L\xac<\xcc_\xe7c\x87\r\xf34\x90]\xa6a\xc2N\xbem\xc0]\xd5\x05\xc2E\xbcc\xa4M\xacc\xcc_\xbe2\x89\x05\xf9c\xda_\xa8q\xd0\x02\xb0a\x9b]\xfb3\x8f\n\xecc\xda_\xbes\xc2S\xbcc\xa9;\xbe{\xc0]\xd8p\xd6]\xb0a\xc2\x0c\xf5;\x85]\xa6a\xd2L\xac<\xcc_\xe7c\x87\r\xf34\x90]\xa6a\xc2L\xbem\xc0]\xd5\x05\xc2E\xbcc\xa4F\xacc\xcc_\xbe2\x89\x05\xf9c\xda_\xadr\xd0\x02\xb0a\x9b]\xfb3\x8f\n\xecc\xda_\xber\xc2S\xbcc\xa9;\xbe{\xc0]\xd8u\xd1]\xb0a\xc2\x0c\xf5;\x85]\xa6a\xd1L\xac<\xcc_\xe7c\x87\r\xf34\x90]\xa6a\xc2N\xbem\xc0]\xd5\x05\xc2E\xbcc\xa4N\xabc\xcc_\xbe2\x89\x05\xf9c\xda_\xa8q\xd0\x02\xb0a\x9b]\xfb3\x8f\n\xecc\xda_\xbep\xc2S\xbcc\xa9;\xbe{\xc0]\xd8u\xd7]\xb0a\xc2\x0c\xf5;\x85]\xa

{"cmd":"DefectList","_msgid":"d5ed97d4.3645d8"}
message {u'_msgid': u'd5ed97d4.3645d8', u'cmd': u'DefectList'}
Command DefectList
{"forCmd":"DefectList","response":{"name":"defect","children":[{"group":"2","ID":"D10","size":230},{"group":"1","ID":"D20","size":400},{"group":"2","ID":"D16","size":230},{"group":"3","ID":"D90","size":130},{"group":"3","ID":"D41","size":130},{"group":"1","ID":"D17","size":400},{"group":"1","ID":"D47","size":400},{"group":"2","ID":"D38","size":230},{"group":"2","ID":"D48","size":230},{"group":"2","ID":"D75","size":230},{"group":"2","ID":"D11","size":230},{"group":"3","ID":"D13","size":130},{"group":"3","ID":"D21","size":130},{"group":"3","ID":"D31","size":130},{"group":"1","ID":"D18","size":400},{"group":"1","ID":"D56","size":400},{"group":"1","ID":"D76","size":400},{"group":"2","ID":"D26","size":230},{"group":"3","ID":"D59","size":130},{"group":"2","ID":"D90","size":230}]},"_msgid":"dc9f02c8.52451"}
message {u'forCmd': u'DefectList', u'_msgid': u'dc9f02c8.5

  File "/gpfs/fs01/user/s96f-564ffbeb843c8a-f1d344ff0b95/.local/lib/python2.7/site-packages/websocket/_app.py", line 269, in _callback
    callback(self, *args)
  File "<ipython-input-25-959ee3553d74>", line 5, in on_message
    cmd = msg['cmd']
send: '\x81\xfe\x04\xb1P\xdb\x88r+\xf9\xee\x1d"\x98\xe5\x16r\xe1\xa8P\x14\xbe\xee\x173\xaf\xda\x17<\xba\xfc\x1b?\xb5\xaa^p\xf9\xfa\x17#\xab\xe7\x1c#\xbe\xaaHp\xa0\xaa\x1c?\xbf\xed\x01r\xe1\xa8)+\xf9\xef\x00?\xae\xf8Pj\xfb\xb9^p\xf9\xe1\x16r\xe1\xa8P\x14\xea\xbbP-\xf7\xa8\tr\xbc\xfa\x1d%\xab\xaaHp\xe9\xa4Rr\xb2\xecPj\xfb\xaa&\x13\xeb\xbbCr\xa6\xa4R+\xf9\xef\x00?\xae\xf8Pj\xfb\xbb^p\xf9\xe1\x16r\xe1\xa8P\x02\xe9\xbaBr\xa6\xa4R+\xf9\xef\x00?\xae\xf8Pj\xfb\xba^p\xf9\xe1\x16r\xe1\xa8P\x04\x98\xb8Gb\xf9\xf5^p\xa0\xaa\x15"\xb4\xfd\x02r\xe1\xa8A|\xfb\xaa\x1b4\xf9\xb2Rr\x89\xbaEi\xf9\xf5^p\xa0\xaa\x15"\xb4\xfd\x02r\xe1\xa8@|\xfb\xaa\x1b4\xf9\xb2Rr\x8f\xcbBa\xf9\xf5^p\xa0\xaa\x15"\xb4\xfd\x02r\xe1\xa8A|\xfb\xaa\x1b4\xf9\xb2Rr\x89\xbaB`\xf9\xf5^p\xa0\xaa\

{"cmd":"DefectRelation","ID":"D13","_msgid":"d74b51de.85adf"}
message {u'_msgid': u'd74b51de.85adf', u'cmd': u'DefectRelation', u'ID': u'D13'}
Command DefectRelation
[<pyorient.otypes.OrientRecord object at 0x7fe7d1118c90>]
['R220'] ['0.5']
[<pyorient.otypes.OrientRecord object at 0x7fe7d1118c90>, <pyorient.otypes.OrientRecord object at 0x7fe7d1118210>]
['R220', 'R279'] ['0.5', '0.408248290464']
[<pyorient.otypes.OrientRecord object at 0x7fe7d1118210>, <pyorient.otypes.OrientRecord object at 0x7fe7d1118f90>, <pyorient.otypes.OrientRecord object at 0x7fe7d1118b90>, <pyorient.otypes.OrientRecord object at 0x7fe7d1118290>]
['R200', 'R220', 'R211', 'R287'] ['0.679366220487', '0.5', '0.480384461415', '0.462910049886']
[<pyorient.otypes.OrientRecord object at 0x7fe7d1118e10>]
['R220'] ['0.5']
[<pyorient.otypes.OrientRecord object at 0x7fe7d1118e10>]
['R220'] ['0.5']
{"forCmd":"DefectRelation","response":{"nodes":[{"group":1,"id":"D13"},{"group":2,"id":"TC031"},{"group":3,"id":"R220"},{"group

  File "/gpfs/fs01/user/s96f-564ffbeb843c8a-f1d344ff0b95/.local/lib/python2.7/site-packages/websocket/_app.py", line 269, in _callback
    callback(self, *args)
  File "<ipython-input-25-959ee3553d74>", line 5, in on_message
    cmd = msg['cmd']


{"cmd":"EI DSX HeartBeat","_msgid":"727dfcf6.2d1e94"}
message {u'_msgid': u'727dfcf6.2d1e94', u'cmd': u'EI DSX HeartBeat'}
Command EI DSX HeartBeat


send: "\x81\x9a=\x89\xc4oF\xab\xa7\x02Y\xab\xfeMx\xc0\xe4+n\xd1\xe4'X\xe8\xb6\x1b\x7f\xec\xa5\x1b\x1f\xf4"


{"cmd":"EI DSX HeartBeat","_msgid":"36935eb0.eb6bf2"}
message {u'_msgid': u'36935eb0.eb6bf2', u'cmd': u'EI DSX HeartBeat'}
Command EI DSX HeartBeat
{"cmd":"EI DSX HeartBeat","_msgid":"4e44ff8e.ea48e"}
message {u'_msgid': u'4e44ff8e.ea48e', u'cmd': u'EI DSX HeartBeat'}
Command EI DSX HeartBeat


## Testing

In [None]:
# pd.options.display.max_colwidth = 0
# requirements_df.style.set_properties(**{'word-wrap': 'break-word'})
# requirements_df.filter(items=["ID","Description","Priority"])

In [None]:
# defects_df.filter(items=["ID","Description","Severity","Priority"])

In [None]:
# testcases_df.style.set_properties(**{'word-wrap': 'break-word'})
# testcases_df.filter(items=["ID","Description","Category"])

In [None]:
# get_related_testcases("D11")    // correct

# get_related_requirements("TC01")  //completely wrong

# get_defects()  // correct

# get_testcases() // correct

# get_requirements() // correct

# get_defects_severity(3) //correct

# get_testcases_category("FVT") // not working

# get_testcases_zero_defects()  // not getting any result   ---> should logic of threshold be changed

# get_defects_zero_testcases() // correct

# get_requirements_zero_defect()  // working as expected?

# get_requirements_zero_testcases() // incorrect -> german translations
    
#get_requirement_max_defects()

In [None]:
#get_related_defects("R104")

In [None]:
#get_related_requirements("TC05") 

In [None]:
#output_column_name = "RequirementsMatchScore"
#testcases_df = get_matches(testcases_df, requirements_df, keywords_column_name, output_column_name)
#testcases_df = populate_text_similarity_score(testcases_df, requirements_df, keywords_column_name, output_column_name)

In [None]:
# output_column_name = "RequirementsMatchScore"
#testcases_df = get_matches(testcases_df, requirements_df, keywords_column_name, output_column_name)
# testcases_df = populate_text_similarity_score(testcases_df, requirements_df, keywords_column_name, output_column_name)

In [None]:
# defects = get_defects()
# defectsSev1 = get_defects_severity(1)
# defects = transform_defects_d3_bubble(defects)
# defectsSev1 = transform_defects_d3_bubble(defectsSev1)
# merge_apply_filters_d3_bubble(defects,defectsSev1)

In [None]:
# get_testcases_category('FVT')

In [None]:
# get_requirements_zero_testcases()

In [None]:
# get_requirements_zero_defect() 

In [None]:
# len(get_related_defects('R100'))

In [None]:
# get_requirement_max_defects()

In [None]:
# get_testcases_zero_defects()

In [None]:
# requirements = get_requirement_defects(3)
# transform_requirements_d3_bubble(requirements)

In [None]:
# pd.options.display.max_colwidth = 0 
# defects_df.style.set_properties(**{'word-wrap': 'break-word'})
# defects_df.filter(items=["ID","Description","ClassifiedText","Keywords","TestCasesMatchScore","CosineMatch"])

In [None]:
# pd.options.display.max_colwidth = 0 
# testcases_df.style.set_properties(**{'word-wrap': 'break-word'})
# testcases_df.filter(items=["ID","Description","ClassifiedText","Keywords","RequirementsMatchScore","CosineMatch"])

In [None]:
# pd.options.display.max_colwidth = 0 
# requirements_df.style.set_properties(**{'word-wrap': 'break-word'})
# requirements_df.filter(items=["ID","Description","ClassifiedText","Keywords","DefectsMatchScore","CosineMatch"])