# Statementextraction
Author: Edward Schmuhl

---

This notebook shows the implementation of the Statementextraction and how it is used in combination with the Conditionidentification.

In [1]:
import spacy
# Conditionidentification
path_to_ner = "./condition_identification_model"
conditionidentification = spacy.load(path_to_ner)

# spaCy pipeline used for part-of-speech tags and the dependency parse
nlp = spacy.load("en_core_web_trf")

## Statementextraction function

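The Conditionidentification detects conditions and their token index positions. For each condition, the head of the condition token in the dependency parse is extracted. This step is repeated on every extracted head until the ROOT is reached. Words that are part of a condition are skipped. The remaining words, ordered from the ROOT down and followed by the placeholder word "condition", form the statement.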
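
The head-walking step can be illustrated without loading any model. The following is a minimal sketch, not the notebook's implementation: the helper names `ancestors` and `build_statement` are hypothetical, and the `heads` list is a hand-written dependency parse assumed for illustration (the ROOT points to itself, as in spaCy). In the real function, spaCy's `Token.ancestors` provides this chain.

```python
def ancestors(heads, index):
    """Yield the indices of all heads above `index`, nearest first, up to the ROOT."""
    # The ROOT is the token whose head is itself
    while heads[index] != index:
        index = heads[index]
        yield index


def build_statement(words, heads, index, condition_words):
    """Join the ancestors of the condition token, ROOT first, and append a placeholder."""
    chain = [words[i] for i in ancestors(heads, index) if words[i] not in condition_words]
    if not chain:
        return ""
    return " ".join(reversed(chain)) + " condition"


# Hand-written toy parse (an assumption, not model output)
words = ["perfect", "for", "people", "with", "diabetes", "."]
heads = [0, 0, 1, 2, 3, 0]

print(build_statement(words, heads, 4, {"diabetes"}))
# perfect for people with condition
```

A condition that is itself the ROOT has no ancestors, so the sketch returns an empty string, which mirrors how the function skips conditions without a statement.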


In [2]:
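def extract_Statement(text):
    """Return a list of (statement, condition) tuples found in text."""
    returnList = []
    condition_Doc = conditionidentification(text)
    doc = nlp(text)

    # Words belonging to detected conditions; they are excluded from statements
    conditionList = []
    for i in range(len(doc)):
        try:
            if condition_Doc[i].ent_type_ == "CONDITION" and condition_Doc[i].ent_iob_ == "B":
                # Token whose ancestors are collected (first condition token by default)
                index = i

                # Conditionidentification: merge the tokens of a multi-word condition
                condition = condition_Doc[i].text
                conditionList.append(condition)
                for k in range(i + 1, len(doc)):
                    # Stop at the first token that no longer continues this condition
                    if condition_Doc[k].ent_iob_ != "I":
                        break
                    condition += " " + condition_Doc[k].text
                    conditionList.append(condition_Doc[k].text)
                    conditionList.append(condition)
                    # Start from a noun inside the condition if the first token is not one
                    if doc[k].pos_ == "NOUN" and doc[i].pos_ != "NOUN":
                        index = k

                # Statementextraction: collect all heads up to the ROOT
                ancestorList = []
                for ancestor in doc[index].ancestors:
                    if ancestor.text not in conditionList:
                        ancestorList.append(ancestor.text)
                if ancestorList:
                    # Order the words from the ROOT down and append the placeholder
                    statement = " ".join(reversed(ancestorList)) + " condition"
                    returnList.append((statement, condition))
        except IndexError:
            # Raised when the two pipelines produce a different number of tokens
            print(i, doc.text)
            continue
    return returnList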
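
The merging of multi-word conditions relies on the IOB scheme: a token tagged "B" begins a condition and the directly following "I" tokens continue it. The sketch below shows that grouping in isolation, assuming hand-written tags instead of model predictions. The helper name `group_bio` and the example tags are hypothetical.

```python
def group_bio(tokens, iob_tags):
    """Merge tokens tagged B/I into spans; returns (start_index, text) tuples."""
    spans = []
    start, parts = None, []
    for i, (token, tag) in enumerate(zip(tokens, iob_tags)):
        if tag == "B":
            # A new span begins; close the previous one if it is still open
            if parts:
                spans.append((start, " ".join(parts)))
            start, parts = i, [token]
        elif tag == "I" and parts:
            # Continue the currently open span
            parts.append(token)
        else:
            # Any other tag ends the open span
            if parts:
                spans.append((start, " ".join(parts)))
            start, parts = None, []
    if parts:
        spans.append((start, " ".join(parts)))
    return spans


# Hand-written example tags (an assumption, not model output)
tokens = ["I", "have", "digestion", "problems", "and", "migraine", "attacks"]
tags = ["O", "O", "B", "I", "O", "B", "I"]

print(group_bio(tokens, tags))
# [(2, 'digestion problems'), (5, 'migraine attacks')]
```

Closing a span at the first token that is not tagged "I" keeps the tokens of a later condition from being attached to an earlier one.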

## Evaluation

In [5]:
test_data = []
test_data.append("I have digestion problems and wanted to try this product.")
test_data.append("My arthritic knee is getting better because i take this supplement.")
test_data.append("Depression is gone... Love this!!!")
test_data.append("No Migraines, Headaches or other pain")
test_data.append("Helps anxiety, depression and restlessness")
test_data.append("perfect for people with diabetes.")

for test_text in test_data:
    print(extract_Statement(test_text))

[('have condition', 'digestion problems')]
[('getting condition', 'arthritic knee')]
[('Love gone condition', 'Depression')]
[]
[('Helps condition', 'anxiety'), ('Helps condition', 'depression')]
[('perfect for people with condition', 'diabetes')]
