# Overview

`cycontext` comes with a default knowledge base that is loaded automatically. While these rules cover a large number of use cases, users will often want to customize or extend the modifiers included in a knowledge base. In cycontext, users can define their own modifiers and control their behavior through the `ConTextItem` class.

In this notebook, we'll dive deeper into the `ConTextItem` and `TagObject` classes and show how to use them to add and customize new rules.

In [1]:
import spacy

from cycontext import ConTextItem, ConTextComponent

from medspacy.visualization import visualize_dep, visualize_ent

In [2]:
nlp = spacy.load("en_core_web_sm", disable=["ner"])

# Modifiers
## ConTextItem
The knowledge base of cycontext is defined by ConTextItem objects. A ConTextItem is instantiated with the following parameters:

- **literal** (str): The actual string of a concept. If pattern is None,
    this string will be lower-cased and matched to the lower-case string.
- **category** (str): The semantic class of the item.
- **pattern** (list or None): A spaCy pattern to match using token attributes.
    See https://spacy.io/usage/rule-based-matching.
- **rule** (str): The directionality or action of a modifier.
    One of ("forward", "backward", "bidirectional", or "terminate").
- **allowed_types** (set or None): A set of target labels to allow a modifier to modify.
    If None, will apply to any type not specifically excluded in excluded_types.
    Only one of allowed_types and excluded_types can be used. An error will be thrown
    if both are not None.
- **excluded_types** (set or None): A set of target labels which this modifier cannot modify.
    If None, will apply to all target types unless allowed_types is not None.
- **max_targets** (int or None): The maximum number of targets which a modifier can modify.
    If None, will modify all targets in its scope.
- **max_scope** (int or None): A number to explicitly limit the size of the modifier's scope, in tokens.

## TagObject

When a ConTextItem is matched to a string of text, it generates a `TagObject` which is stored in `doc._.context_graph.modifiers`. If it modifies any targets, these relationships can be found as tuples in `doc._.context_graph.edges`. The TagObject also contains a reference to the original ConTextItem.

In addition to the attributes of the original ConTextItem such as **literal** and **category**, a TagObject contains the following attributes:
- **span**: The spaCy Span of the matched text
- **scope**: The spaCy Span of the Doc which is within the TagObject's scope. Any targets in this scope will be modified by the TagObject
- **start**: Start index
- **end**: End index (non-inclusive)

# Examples

## Example 1: Default Rules
When you instantiate `ConTextComponent`, a default list of `ConTextItem`s is loaded and included in the `context.item_data` attribute.

In [3]:
context = ConTextComponent(nlp, rules="default")

In [4]:
context.item_data[:5]

[ConTextItem(literal='absence of', category='NEGATED_EXISTENCE', pattern=None, rule='FORWARD'),
 ConTextItem(literal='adequate to rule out', category='NEGATED_EXISTENCE', pattern=[{'LOWER': {'IN': ['adequate', 'sufficient']}}, {'LOWER': 'to'}, {'LOWER': 'rule'}, {'LOWER': {'IN': ['him', 'her', 'them', 'patient', 'pt']}, 'OP': '?'}, {'LOWER': 'out'}, {'LOWER': {'IN': ['against', 'for']}, 'OP': '?'}], rule='FORWARD'),
 ConTextItem(literal='adequate to rule the patient out', category='NEGATED_EXISTENCE', pattern=[{'LOWER': {'IN': ['adequate', 'sufficient']}}, {'LOWER': 'to'}, {'LOWER': 'rule'}, {'LOWER': 'the'}, {'LOWER': {'IN': ['patient', 'pt']}}, {'LOWER': 'out'}, {'LOWER': {'IN': ['against', 'for']}, 'OP': '?'}], rule='FORWARD'),
 ConTextItem(literal='any other', category='NEGATED_EXISTENCE', pattern=None, rule='FORWARD'),
 ConTextItem(literal='apart from', category='NEGATED_EXISTENCE', pattern=[{'LOWER': 'apart'}, {'LOWER': {'IN': ['for', 'from']}}], rule='TERMINATE')]

In [5]:
print(context.item_data[0])

ConTextItem(literal='absence of', category='NEGATED_EXISTENCE', pattern=None, rule='FORWARD')


In [6]:
print(type(context.item_data[0]))

<class 'cycontext.context_item.ConTextItem'>


In [7]:
len(context.item_data)

95

We can also see the unique categories in the knowledge base by checking `context.categories`:

In [8]:
context.categories

{'FAMILY',
 'HISTORICAL',
 'HYPOTHETICAL',
 'NEGATED_EXISTENCE',
 'POSSIBLE_EXISTENCE'}

## Example 2: Basic Usage
Here, we'll load a blank context component and define our own item data. We'll use an example we've seen earlier, where we need to negate **"pneumonia"**:

In [9]:
doc = nlp("There is no evidence of pneumonia.")

First, we instantiate context and pass in `rules=None`:

In [10]:
context = ConTextComponent(nlp, rules=None)

Next, we'll define a ConTextItem with the following arguments:
- `literal=`**"no evidence of"**: This is the string of text which ConText will look for in the text (case insensitive)
- `category=`**"NEGATED_EXISTENCE"**: The semantic class assigned to our modifier
- `rule=`**"forward"**: This defines the *directionality* of the rule. A later example covers the other directions

We'll leave the other arguments blank. Next, we instantiate our ConTextItem as `item` and put it in a list called `item_data`.

In [11]:
item = ConTextItem(literal="no evidence of", category="NEGATED_EXISTENCE", rule="FORWARD")
item_data = [item]

We then add the modifiers to ConText with the `context.add()` method:

In [12]:
context.add(item_data)

In [13]:
context.item_data

[ConTextItem(literal='no evidence of', category='NEGATED_EXISTENCE', pattern=None, rule='FORWARD')]

Now we can call context on our doc. This will typically happen under the hood as part of the nlp pipeline, but you can call it manually on a doc as well:

In [14]:
context(doc)

There is no evidence of pneumonia.

We can see if any modifiers were created by context by looking at the `doc._.context_graph` attribute, which stores all of the information generated on a doc by context. `modifiers` stores the `TagObjects` created by context, and `edges` stores the relationships between the modifiers and targets. Here, we match a modifier with the custom `item_data` that we created, but there are no edges because there are no target concepts in doc.ents yet.

In [15]:
print(doc._.context_graph)
print(doc._.context_graph.modifiers)
print(doc._.context_graph.edges)
print(doc.ents)

<ConTextGraph> with 0 targets and 1 modifiers
[<TagObject> [no evidence of, NEGATED_EXISTENCE]]
[]
()


Each element of `context_graph.modifiers` is a `TagObject`. Let's look at the tag object in this doc and see some of the attributes which are available:

In [16]:
tag_object = doc._.context_graph.modifiers[0]

`tag_object.span` is the spaCy Span of the Doc which was matched, and has a `start` and `end` index:

In [17]:
print(tag_object.span)
print(tag_object.start, tag_object.end)

no evidence of
2 5


`tag_object.scope` shows what part of the sentence could be modified by the modifier. Any targets in this span of text will be modified:

In [18]:
print(tag_object.scope)

pneumonia.


We can also see the original `ConTextItem` object and attributes:

In [19]:
print(tag_object.category, ",", tag_object.rule)

NEGATED_EXISTENCE , FORWARD


In [20]:
# The reference to the original ConTextItem
print(tag_object.context_item)
assert tag_object.context_item is item_data[0]

ConTextItem(literal='no evidence of', category='NEGATED_EXISTENCE', pattern=None, rule='FORWARD')


## Example 3: Pattern-matching
In this example, we'll use a matching pattern to define more flexible matching criteria, so that a single ConTextItem can match multiple texts. If only `literal` is supplied, the exact phrase is matched in lower case. spaCy offers powerful rule-based matching which operates on each token in a Doc. Matching patterns can use the text, regular expression patterns, linguistic attributes such as part of speech, and operators such as **"?"** (0 or 1) or **"*"** (0 or more) to match sequences of text.

For more detailed information, see spaCy's documentation on rule-based matching: https://spacy.io/usage/rule-based-matching.

The ConTextItem below has the same literal, category, and rule as our previous example, but it also includes a pattern which allows the tokens "evidence" and "of" to be optional. This will then match both "no evidence of" and "no" and assign both spans of text to be negation modifiers.

In [21]:
item_data = [ConTextItem(literal="no evidence of", 
                         category="NEGATED_EXISTENCE", 
                         rule="forward", 
                         pattern=[{"LOWER": "no"}, 
                                  {"LOWER": "evidence", "OP": "?"},
                                  {"LOWER": "of", "OP": "?"},
                                 ]
                        )]

In [22]:
context = ConTextComponent(nlp, rules=None)
context.add(item_data)

In [23]:
texts = ["THERE IS NO EVIDENCE OF PNEUMONIA.",
        "There is no CHF."]

In [24]:
docs = list(nlp.pipe(texts))
for doc in docs:
    context(doc)

In [25]:
for doc in docs:
    print(doc._.context_graph.modifiers)

[<TagObject> [NO EVIDENCE OF, NEGATED_EXISTENCE]]
[<TagObject> [no, NEGATED_EXISTENCE]]
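To make the effect of the optional **"?"** operator concrete, here is a minimal pure-Python sketch of token-level matching. It is not how spaCy's `Matcher` is implemented; the helper names `match_end` and `find_first` are hypothetical, whitespace splitting stands in for spaCy tokenization, and only the `LOWER` attribute and the `"?"` operator are supported.

```python
def match_end(tokens, pattern, ti=0, pi=0):
    """Return the end index of the longest match of pattern[pi:] at tokens[ti:], or None."""
    if pi == len(pattern):
        return ti
    spec = pattern[pi]
    ends = []
    # Option 1: consume the current token if its lower-cased text matches.
    if ti < len(tokens) and tokens[ti].lower() == spec["LOWER"]:
        ends.append(match_end(tokens, pattern, ti + 1, pi + 1))
    # Option 2: skip this pattern element if it is optional ("?" = 0 or 1).
    if spec.get("OP") == "?":
        ends.append(match_end(tokens, pattern, ti, pi + 1))
    ends = [e for e in ends if e is not None]
    return max(ends) if ends else None


def find_first(tokens, pattern):
    """Return the first non-empty matching token slice, or None if nothing matches."""
    for start in range(len(tokens)):
        end = match_end(tokens, pattern, start, 0)
        if end is not None and end > start:
            return tokens[start:end]
    return None


pattern = [{"LOWER": "no"},
           {"LOWER": "evidence", "OP": "?"},
           {"LOWER": "of", "OP": "?"}]

print(find_first("THERE IS NO EVIDENCE OF PNEUMONIA .".split(), pattern))
print(find_first("There is no CHF .".split(), pattern))
```

Because the two optional elements can each be skipped, the same pattern covers both the long phrase and the bare "no".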


Under the hood, these matches are generated using two of spaCy's rule-based matching classes: 
- **[PhraseMatcher](https://spacy.io/api/phrasematcher)** for literals
- **[Matcher](https://spacy.io/api/matcher)** for patterns

In [26]:
context.matcher

<spacy.matcher.matcher.Matcher at 0x125bd3320>

In [27]:
context.phrase_matcher

<spacy.matcher.phrasematcher.PhraseMatcher at 0x1252906d0>

## Example 3: Direction
The `rule` attribute defines the direction in which a modifier operates. You can imagine an arrow starting at the modifier in a phrase and moving *towards* the target. If the rule is **"forward"**, the arrow moves **forward** in the sentence, and all targets in the sentence *after* the TagObject will be modified. If **"backward"**, it moves **backward** in the sentence and modifies all targets *before* the TagObject. If **"bidirectional"**, it looks both ahead and behind.

The scope of a modifier is bounded to be within the same sentence, so no modifier will affect targets in other sentences. This can be problematic in poorly split documents, but it prevents all targets in a document from being incorrectly modified by a ConText item. A scope is also defined by any termination points, which will be shown in the next example.

In [28]:
item_data = [ConTextItem("no evidence of", "NEGATED_EXISTENCE", "FORWARD"),
            ConTextItem("is ruled out", "NEGATED_EXISTENCE", "BACKWARD"),
             ConTextItem("unlikely", "POSSIBLE_EXISTENCE", "BIDIRECTIONAL"),
            ]

In [29]:
texts = ["No evidence of pneumonia.",
        "PE is ruled out",
        "unlikely to be malignant", 
        "malignancy unlikely"]

In [30]:
docs = nlp.pipe(texts)

In [31]:
context = ConTextComponent(nlp, rules=None)
context.add(item_data)

In [32]:
for doc in docs:
    context(doc)
    modifier = doc._.context_graph.modifiers[0]
    print(doc)
    print("[{0}:{1}] will modify {2}".format(modifier.span, modifier.category,
                                                      modifier.scope))
    print(modifier.rule)
    
    print()

No evidence of pneumonia.
[No evidence of:NEGATED_EXISTENCE] will modify pneumonia.
FORWARD

PE is ruled out
[is ruled out:NEGATED_EXISTENCE] will modify PE
BACKWARD

unlikely to be malignant
[unlikely:POSSIBLE_EXISTENCE] will modify unlikely to be malignant
BIDIRECTIONAL

malignancy unlikely
[unlikely:POSSIBLE_EXISTENCE] will modify malignancy unlikely
BIDIRECTIONAL
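The relationship between `rule` and scope seen in the output above can be sketched with plain token offsets. This is a simplified illustration rather than cycontext's own code; `compute_scope` is a hypothetical helper, and end offsets are non-inclusive as in spaCy.

```python
def compute_scope(sent_start, sent_end, mod_start, mod_end, rule):
    """Return the (start, end) token offsets covered by a modifier's scope."""
    rule = rule.upper()
    if rule == "FORWARD":
        # Everything after the modifier, up to the end of the sentence.
        return (mod_end, sent_end)
    if rule == "BACKWARD":
        # Everything from the start of the sentence up to the modifier.
        return (sent_start, mod_start)
    if rule == "BIDIRECTIONAL":
        # The whole sentence, as in the printed scopes above.
        return (sent_start, sent_end)
    raise ValueError(f"No directional scope for rule {rule!r}")


tokens = ["PE", "is", "ruled", "out"]
start, end = compute_scope(0, len(tokens), 1, 4, "backward")
print(tokens[start:end])
```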



## Example 4: Termination points
As mentioned above, the scope of a modifier is initially set to the entire sentence either before or after a TagObject, as defined by the ConTextItem's `rule` attribute. However, the scope can be limited by **termination points**, which are other TagObjects with the rule **"TERMINATE"**. For example, in "There is no evidence of pneumonia but there is CHF", the negation modifier should modify "pneumonia" but not "CHF". This can be achieved by defining a ConTextItem to terminate at the word "but".

In [33]:
text = "There is no evidence of pneumonia but there is CHF"

In [34]:
item_data1 = [ConTextItem("no evidence of", "NEGATED_EXISTENCE", "FORWARD")]

In [35]:
context = ConTextComponent(nlp, rules=None)
context.add(item_data1)

In [36]:
doc = nlp(text)
context(doc)

There is no evidence of pneumonia but there is CHF

In [37]:
tag_object = doc._.context_graph.modifiers[0]
tag_object

<TagObject> [no evidence of, NEGATED_EXISTENCE]

In [38]:
# The scope includes both "pneumonia" and "CHF", so both would be negated
tag_object.scope

pneumonia but there is CHF

In [39]:
# Now add an additional ConTextItem with "TERMINATE"
item_data2 = [ConTextItem("but", "CONJ", "TERMINATE")]

In [40]:
context.add(item_data2)

In [41]:
doc = nlp(text)
context(doc)

There is no evidence of pneumonia but there is CHF

In [42]:
tag_object = doc._.context_graph.modifiers[0]

In [43]:
# The scope now only encompasses "pneumonia"
tag_object.scope

pneumonia
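The way a termination point shortens a scope can be sketched as follows. This is an illustrative approximation under the assumption that a scope stops at the nearest terminating modifier; `apply_termination` is a hypothetical helper, not part of cycontext.

```python
def apply_termination(scope, terminators, rule):
    """Shrink a (start, end) scope so it stops at the nearest terminator.

    `terminators` is a list of (start, end) token spans of TERMINATE modifiers.
    """
    start, end = scope
    rule = rule.upper()
    if rule == "FORWARD":
        # Stop at the first terminator that begins inside the scope.
        hits = [t_start for t_start, _ in terminators if start <= t_start < end]
        if hits:
            end = min(hits)
    elif rule == "BACKWARD":
        # Start just after the last terminator that ends inside the scope.
        hits = [t_end for _, t_end in terminators if start < t_end <= end]
        if hits:
            start = max(hits)
    return (start, end)


tokens = "There is no evidence of pneumonia but there is CHF".split()
# "no evidence of" covers tokens 2..5, so its forward scope is tokens 5..10.
# "but" is token 6 and acts as the terminator.
new_start, new_end = apply_termination((5, 10), [(6, 7)], "FORWARD")
print(tokens[new_start:new_end])
```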

## Example 5: Pruned modifiers
If two ConTextItems result in TagObjects where one is a substring of another, the modifiers will be pruned to keep **only** the larger span. For example, **"no history of"** is a negation modifier, while **"history"** is a historical modifier. Both match the text "no history of afib", but only "no history of" should ultimately modify "afib".

By default, `prune` is set to `True`, but it can be set to `False` when instantiating the context component, as shown below.

In [44]:
item_data = [ConTextItem("no history of", "DEFINITE_NEGATED_EXISTENCE", "FORWARD"),
            ConTextItem("history", "HISTORICAL", "FORWARD"),
            ]

In [45]:
text = "no history of"

In [46]:
context = ConTextComponent(nlp, rules=None, prune=False)
context.add(item_data)

In [47]:
doc = nlp(text)
context(doc)

no history of

In [48]:
# Two overlapping modifiers
doc._.context_graph.modifiers

[<TagObject> [no history of, DEFINITE_NEGATED_EXISTENCE],
 <TagObject> [history, HISTORICAL]]

In [49]:
# Now set prune to True
context = ConTextComponent(nlp, rules=None, prune=True)
context.add(item_data)

In [50]:
doc = nlp(text)
context(doc)

no history of

In [51]:
# Only one modifier is left
doc._.context_graph.modifiers

[<TagObject> [no history of, DEFINITE_NEGATED_EXISTENCE]]
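The pruning idea can be illustrated with a short sketch that drops any span fully contained in a strictly larger one. The `prune_spans` helper and the `(start, end, category)` tuples are assumptions made for this example and do not reflect cycontext's internal data structures.

```python
def prune_spans(spans):
    """Keep only spans that are not contained in a strictly larger span.

    Each span is a (start, end, category) tuple with a non-inclusive end.
    """
    kept = []
    for i, (start, end, category) in enumerate(spans):
        contained = any(
            j != i
            and o_start <= start
            and end <= o_end
            and (o_end - o_start) > (end - start)
            for j, (o_start, o_end, _) in enumerate(spans)
        )
        if not contained:
            kept.append((start, end, category))
    return kept


# Token offsets in "no history of": "no history of" is 0..3, "history" is 1..2.
modifiers = [(0, 3, "DEFINITE_NEGATED_EXISTENCE"), (1, 2, "HISTORICAL")]
print(prune_spans(modifiers))
```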

## Example 6: Manually limiting scope
By default, the scope of a modifier is the **entire sentence** in the direction of the rule up until a termination point (see above). However, sometimes this is too much. In long sentences, this can cause a modifier to extend far beyond its location in the sentence. Some modifiers are really meant to be attached to a single concept, but they are instead distributed to all targets.

To fix this, cycontext allows optional attributes in `ConTextItem` to limit the scope: `max_scope` and `max_targets`. Both attributes are explained below.

### max_targets
Some modifiers should really only attach to a single target. For example, in the sentence below:

**"Pt presents with diabetes, pneumonia vs COPD"**

**"vs"** indicates uncertainty, but *only* between **"pneumonia"** and **"COPD"**. **"Diabetes"** should not be affected. We can achieve this by creating a bidirectional rule with a `max_targets` of **1**. This will limit the number of targets to 1 *on each side* of the tag object.

Let's first see what this looks like *without* defining `max_targets`:

In [52]:
text = "Pt presents with diabetes, pneumonia vs COPD"

In [53]:
doc = nlp(text)
doc.ents = (doc[3:4], doc[5:6], doc[7:8])
doc.ents

(diabetes, pneumonia, COPD)

In [54]:
item = ConTextItem("vs", category="UNCERTAIN",
                   rule="BIDIRECTIONAL",
                   max_targets=None)

In [55]:
context = ConTextComponent(nlp, rules=None)
context.add([item])

In [56]:
context(doc)

Pt presents with diabetes, pneumonia vs COPD

In [57]:
visualize_dep(doc)

Now, let's start over and set `max_targets` to **1**:

In [58]:
doc = nlp(text)
doc.ents = (doc[3:4], doc[5:6], doc[7:8])

In [59]:
item = ConTextItem("vs", category="UNCERTAIN",
                           rule="BIDIRECTIONAL", 
                   max_targets=1)

In [60]:
context = ConTextComponent(nlp, rules=None)
context.add([item])

In [61]:
context(doc)

Pt presents with diabetes, pneumonia vs COPD

In [62]:
visualize_dep(doc)
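Since the visualizations are not reproduced here, the following sketch shows the intended effect of `max_targets` on plain token offsets. It assumes that the closest targets on each side of the modifier are the ones kept; `limit_targets` is a hypothetical helper, not a cycontext function.

```python
def limit_targets(mod_start, mod_end, targets, max_targets=None):
    """Keep at most `max_targets` of the closest targets on each side of a modifier.

    `targets` is a list of (start, end) token spans with non-inclusive ends.
    """
    before = sorted((t for t in targets if t[1] <= mod_start),
                    key=lambda t: mod_start - t[1])
    after = sorted((t for t in targets if t[0] >= mod_end),
                   key=lambda t: t[0] - mod_end)
    if max_targets is not None:
        before, after = before[:max_targets], after[:max_targets]
    return sorted(before + after)


tokens = ["Pt", "presents", "with", "diabetes", ",", "pneumonia", "vs", "COPD"]
targets = [(3, 4), (5, 6), (7, 8)]  # diabetes, pneumonia, COPD
# "vs" is token 6.
for limit in (None, 1):
    kept = limit_targets(6, 7, targets, max_targets=limit)
    print(limit, [tokens[s:e] for s, e in kept])
```

With no limit all three targets are kept, while a limit of 1 keeps only "pneumonia" and "COPD".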

### max_scope
One limitation of using `max_targets` is that in a sentence like the example above, each concept has to be extracted as an entity in order for it to reduce the scope. If **"pneumonia"** were not extracted, then **"vs"** would still extend as far back as **"diabetes"**.

We can address this by explicitly setting the scope to be no greater than a certain number of tokens using `max_scope`. For example, lab results may show up in a text document with many individual results:

---
Adenovirus DETECTED<br>
SARS NOT DETECTED<br>
...
Cov HKU1 NOT DETECTED<br>

---

Texts like this are often difficult to parse and they are often not ConText-friendly because many lines can be extracted as a single sentence. By default, a modifier like **"NOT DETECTED"** could extend far back to a concept such as **"Adenovirus"**, which we see returned positive. We may also not explicitly extract every virus tested in the lab, so `max_targets` won't work. 

With text formats like this, we can be fairly certain that **"NOT DETECTED"** will only modify the single concept right before it. We can set `max_scope` to **1** so that **only** a single concept will be modified.

In [63]:
text = """Adenovirus DETECTED Sars NOT DETECTED Pneumonia NOT DETECTED"""

In [64]:
doc = nlp(text)
doc.ents = (doc[0:1], doc[2:3], doc[5:6])
doc.ents

(Adenovirus, Sars, Pneumonia)

In [65]:
print([sent for sent in doc.sents])

[Adenovirus DETECTED Sars, NOT DETECTED Pneumonia NOT DETECTED]


In [66]:
#assert len(list(doc.sents)) == 1

In [67]:
item_data = [ConTextItem("DETECTED", category="POSITIVE_EXISTENCE",
                           rule="BACKWARD", 
                   max_scope=None),
             ConTextItem("NOT DETECTED", category="DEFINITE_NEGATED_EXISTENCE",
                           rule="BACKWARD", 
                   max_scope=None),
            ]

In [68]:
context = ConTextComponent(nlp, rules=None)
context.add(item_data)

In [69]:
context(doc)

Adenovirus DETECTED Sars NOT DETECTED Pneumonia NOT DETECTED

In [70]:
visualize_dep(doc)

Let's now set `max_scope` to 1, and we'll find that only **"Pneumonia"** and **"Sars"** are modified by **"NOT DETECTED"**:

In [71]:
doc = nlp(text)
doc.ents = (doc[0:1], doc[2:3], doc[5:6])
doc.ents

(Adenovirus, Sars, Pneumonia)

In [72]:
item_data = [ConTextItem("DETECTED", category="POSITIVE_EXISTENCE",
                           rule="BACKWARD", 
                   max_scope=1),
             ConTextItem("NOT DETECTED", category="DEFINITE_NEGATED_EXISTENCE",
                           rule="BACKWARD", 
                   max_scope=1),
            ]

In [73]:
context = ConTextComponent(nlp, rules=None)
context.add(item_data)

In [74]:
context(doc)

Adenovirus DETECTED Sars NOT DETECTED Pneumonia NOT DETECTED

In [75]:
visualize_dep(doc)
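The effect of `max_scope` can likewise be sketched on token offsets. This is a simplified illustration that treats the text as a single sentence; `clip_scope` is a hypothetical helper, and the exact clipping logic inside cycontext may differ.

```python
def clip_scope(scope, mod_start, mod_end, rule, max_scope=None):
    """Limit a (start, end) scope to `max_scope` tokens on each side of the modifier."""
    if max_scope is None:
        return scope
    start, end = scope
    rule = rule.upper()
    if rule in ("FORWARD", "BIDIRECTIONAL"):
        end = min(end, mod_end + max_scope)
    if rule in ("BACKWARD", "BIDIRECTIONAL"):
        start = max(start, mod_start - max_scope)
    return (start, end)


tokens = "Adenovirus DETECTED Sars NOT DETECTED Pneumonia NOT DETECTED".split()
# The final "NOT DETECTED" covers tokens 6..8; its backward scope is tokens 0..6.
for limit in (None, 1):
    start, end = clip_scope((0, 6), 6, 8, "BACKWARD", max_scope=limit)
    print(limit, tokens[start:end])
```

Without a limit the scope reaches back to "Adenovirus"; with a limit of 1 it covers only "Pneumonia".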

## Example 7: Filtering target types
You may want modifiers to modify only targets of certain semantic classes. You can specify which target types may or may not be modified through the `allowed_types` and `excluded_types` arguments.

For example, in the sentence:

---
"She is not prescribed any beta blockers for her hypertension."

---

**"Beta blockers"** is negated by the phrase **"not prescribed"**, but **"hypertension"** should not be negated. By default, a modifier will modify all concepts in its scope, regardless of semantic type:

In [76]:
from spacy.tokens import Span

In [77]:
# Let's write a function to create this manual example
def create_medication_example():
    doc = nlp("She is not prescribed any beta blockers for her hypertension.")
    # Manually define entities
    medication_ent = Span(doc, 5, 7, "MEDICATION")
    condition_ent = Span(doc, 9, 10, "CONDITION")
    doc.ents = (medication_ent, condition_ent)
    return doc

In [78]:
doc = create_medication_example()
doc

She is not prescribed any beta blockers for her hypertension.

In [79]:
# Define our item data without any type restrictions
item_data = [ConTextItem("not prescribed", "NEGATED_EXISTENCE", "FORWARD")]

In [80]:
context = ConTextComponent(nlp, rules="other", rule_list=item_data)

In [81]:
context(doc)

She is not prescribed any beta blockers for her hypertension.

In [82]:
# Visualize the modifiers
visualize_dep(doc)

To change this, we can make sure that **"not prescribed"** only modifies **MEDICATION** entities by setting `allowed_types` to `{"MEDICATION"}`:

In [83]:
item_data2 = [ConTextItem("not prescribed", "NEGATED_EXISTENCE", "FORWARD", allowed_types={"MEDICATION"})]

In [84]:
context = ConTextComponent(nlp, rules="other", rule_list=item_data2)

In [85]:
doc = create_medication_example()
context(doc)

She is not prescribed any beta blockers for her hypertension.

Now, only **"beta blockers"** will be negated:

In [86]:
visualize_dep(doc)

The same can be achieved by setting `excluded_types` to `{"CONDITION"}`.

In [87]:
item_data3 = [ConTextItem("not prescribed", "NEGATED_EXISTENCE", "FORWARD", excluded_types={"CONDITION"})]
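The equivalence of the two filters for this sentence can be checked with a small stand-alone sketch. `filter_targets` is a hypothetical helper that mimics the documented behavior of `allowed_types` and `excluded_types`, including the rule that only one of them may be set; targets are represented here as simple `(text, label)` tuples.

```python
def filter_targets(targets, allowed_types=None, excluded_types=None):
    """Return the (text, label) targets a modifier is permitted to modify."""
    if allowed_types is not None and excluded_types is not None:
        raise ValueError("Use allowed_types or excluded_types, not both.")
    if allowed_types is not None:
        return [t for t in targets if t[1] in allowed_types]
    if excluded_types is not None:
        return [t for t in targets if t[1] not in excluded_types]
    # No restriction: every target in scope may be modified.
    return list(targets)


targets = [("beta blockers", "MEDICATION"), ("hypertension", "CONDITION")]
print(filter_targets(targets))
print(filter_targets(targets, allowed_types={"MEDICATION"}))
print(filter_targets(targets, excluded_types={"CONDITION"}))
```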

# Setting additional Span attributes
As seen in an earlier notebook, cycontext registers new attributes for target Spans, such as `is_negated` and `is_historical`. These attributes default to False and are set to True if a target is modified by the corresponding modifier category. This logic is set in the variable `DEFAULT_ATTRS`. This is a dictionary which maps modifier category names to the attribute name/value pair which should be set if a target is modified by that modifier type.

In [88]:
from cycontext.context_component import DEFAULT_ATTRS

In [89]:
DEFAULT_ATTRS

{'NEGATED_EXISTENCE': {'is_negated': True},
 'POSSIBLE_EXISTENCE': {'is_uncertain': True},
 'HISTORICAL': {'is_historical': True},
 'HYPOTHETICAL': {'is_hypothetical': True},
 'FAMILY': {'is_family': True}}

## Defining custom attributes
Rather than using the logic shown above, you can set your own attributes by creating a dictionary with the same structure as DEFAULT_ATTRS and passing that in as the `add_attrs` parameter. If setting your own extensions, you must first call `Span.set_extension` on each of the extensions. 

If more complex logic is required, custom attributes can also be set manually outside of the ConTextComponent, for example as a post-processing step.

Below, we'll create our own attribute mapping and have it override the default cycontext attributes. We'll define `is_experienced` and `is_family_history`. Because both a negated concept and a family history concept are not actually experienced by a patient, we'll specify both to set `is_experienced` to False. We'll also set the family history modifier to add a new attribute called `is_family_history`.

In [90]:
from spacy.tokens import Span

In [91]:
# Define modifiers and Span attributes
custom_attrs = {
    'NEGATED_EXISTENCE': {'is_experienced': False},
    'FAMILY_HISTORY': {'is_family_history': True,
                      'is_experienced': False},
}

In [92]:
# Register extensions - is_experienced should be True by default, `is_family_history` False
Span.set_extension("is_experienced", default=True)
Span.set_extension("is_family_history", default=False)

In [93]:
context = ConTextComponent(nlp, rules=None, add_attrs=custom_attrs)
context.context_attributes_mapping

{'NEGATED_EXISTENCE': {'is_experienced': False},
 'FAMILY_HISTORY': {'is_family_history': True, 'is_experienced': False}}

In [94]:
item_data = [ConTextItem("no evidence of", "NEGATED_EXISTENCE", "FORWARD"),
            ConTextItem("family history", "FAMILY_HISTORY", "FORWARD"),
            ]

context.add(item_data)

In [95]:
doc = nlp("There is no evidence of pneumonia. Family history of diabetes.")

doc.ents = doc[5:6], doc[-2:-1]

doc.ents

(pneumonia, diabetes)

In [96]:
context(doc)

There is no evidence of pneumonia. Family history of diabetes.

The new attributes are now available in `ent._`:

In [97]:
for ent in doc.ents:
    print(ent)
    print("is_experienced: ", ent._.is_experienced)
    print("is_family_history: ", ent._.is_family_history)
    print()

pneumonia
is_experienced:  False
is_family_history:  False

diabetes
is_experienced:  False
is_family_history:  True
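How a mapping like `custom_attrs` turns modifier categories into attribute values can be sketched with plain dictionaries. This is only an illustration of the lookup logic; `apply_attrs` and the `(target, category)` edge tuples are assumptions for this example and are not cycontext's implementation.

```python
def apply_attrs(targets, edges, mapping, defaults):
    """Compute attribute values for each target from its modifier categories.

    `edges` is a list of (target, modifier_category) pairs.
    """
    # Every target starts with the default value of each attribute.
    attrs = {target: dict(defaults) for target in targets}
    for target, category in edges:
        # Categories missing from the mapping leave the target unchanged.
        attrs[target].update(mapping.get(category, {}))
    return attrs


custom_attrs = {
    "NEGATED_EXISTENCE": {"is_experienced": False},
    "FAMILY_HISTORY": {"is_family_history": True, "is_experienced": False},
}
defaults = {"is_experienced": True, "is_family_history": False}
edges = [("pneumonia", "NEGATED_EXISTENCE"), ("diabetes", "FAMILY_HISTORY")]

result = apply_attrs(["pneumonia", "diabetes"], edges, custom_attrs, defaults)
for target, values in result.items():
    print(target, values)
```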



# Reading and Writing a Knowledge Base
ConTextItems can be saved as JSON and read in, which allows a knowledge base to be reused and scaled. When you install `cycontext` with pip or `python setup.py install`, it includes a JSON file of default modifier rules. That file is also included in the GitHub repo:

https://github.com/medspacy/cycontext/blob/master/kb/default_rules.json

The filepath on your local machine can be accessed in the constant `DEFAULT_RULES_FILEPATH`. Let's look at the first 500 characters of this file:

In [98]:
from cycontext import DEFAULT_RULES_FILEPATH
DEFAULT_RULES_FILEPATH

'/Users/alecchapman/opt/anaconda3/envs/medspacy-37/lib/python3.7/site-packages/cycontext-1.0.3.1-py3.7.egg/kb/default_rules.json'

In [99]:
with open(DEFAULT_RULES_FILEPATH) as f:
    print(f.read()[:500])

{
  "item_data": [
    {
      "category": "NEGATED_EXISTENCE",
      "literal": "absence of",
      "pattern": null,
      "rule": "FORWARD"
    },
    {
      "category": "NEGATED_EXISTENCE",
      "literal": "adequate to rule out",
      "pattern": [
        {
          "LOWER": {
            "IN": ["adequate", "sufficient"]
          }
        },
        {
          "LOWER": "to"
        },
        {
          "LOWER": "rule"
        },
        {
          "LOWER": {
            "IN": ["him"


A JSON file of item data can be loaded with the `ConTextItem.from_json` method:

In [100]:
item_data = ConTextItem.from_json(DEFAULT_RULES_FILEPATH)

In [101]:
for item in item_data[:5]:
    print(item)

ConTextItem(literal='absence of', category='NEGATED_EXISTENCE', pattern=None, rule='FORWARD')
ConTextItem(literal='adequate to rule out', category='NEGATED_EXISTENCE', pattern=[{'LOWER': {'IN': ['adequate', 'sufficient']}}, {'LOWER': 'to'}, {'LOWER': 'rule'}, {'LOWER': {'IN': ['him', 'her', 'them', 'patient', 'pt']}, 'OP': '?'}, {'LOWER': 'out'}, {'LOWER': {'IN': ['against', 'for']}, 'OP': '?'}], rule='FORWARD')
ConTextItem(literal='adequate to rule the patient out', category='NEGATED_EXISTENCE', pattern=[{'LOWER': {'IN': ['adequate', 'sufficient']}}, {'LOWER': 'to'}, {'LOWER': 'rule'}, {'LOWER': 'the'}, {'LOWER': {'IN': ['patient', 'pt']}}, {'LOWER': 'out'}, {'LOWER': {'IN': ['against', 'for']}, 'OP': '?'}], rule='FORWARD')
ConTextItem(literal='any other', category='NEGATED_EXISTENCE', pattern=None, rule='FORWARD')
ConTextItem(literal='apart from', category='NEGATED_EXISTENCE', pattern=[{'LOWER': 'apart'}, {'LOWER': {'IN': ['for', 'from']}}], rule='TERMINATE')


The items can also be saved as JSON by using the `ConTextItem.to_json` method:

In [102]:
ConTextItem.to_json(item_data[:2], "2_modifiers.json")

In [103]:
import json
with open("2_modifiers.json") as f:
    print(json.load(f))

{'item_data': [{'allowed_types': None, 'excluded_types': None, 'max_scope': None, 'max_targets': None, 'pattern': None, 'category': 'NEGATED_EXISTENCE', 'literal': 'absence of', 'rule': 'FORWARD', 'metadata': None}, {'allowed_types': None, 'excluded_types': None, 'max_scope': None, 'max_targets': None, 'pattern': [{'LOWER': {'IN': ['adequate', 'sufficient']}}, {'LOWER': 'to'}, {'LOWER': 'rule'}, {'LOWER': {'IN': ['him', 'her', 'them', 'patient', 'pt']}, 'OP': '?'}, {'LOWER': 'out'}, {'LOWER': {'IN': ['against', 'for']}, 'OP': '?'}], 'category': 'NEGATED_EXISTENCE', 'literal': 'adequate to rule out', 'rule': 'FORWARD', 'metadata': None}]}
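For readers who want to author a rules file by hand, the layout shown above (a top-level `"item_data"` list of rule dictionaries) can be written and read back with the standard library alone. This sketch does not call cycontext; the `save_rules` and `load_rules` helpers and the file name `my_rules.json` are hypothetical.

```python
import json
import os
import tempfile


def save_rules(rules, path):
    """Write a list of rule dictionaries under the top-level "item_data" key."""
    with open(path, "w") as f:
        json.dump({"item_data": rules}, f, indent=2)


def load_rules(path):
    """Read the list of rule dictionaries back from a JSON file."""
    with open(path) as f:
        return json.load(f)["item_data"]


rules = [
    {"category": "NEGATED_EXISTENCE", "literal": "no evidence of",
     "pattern": None, "rule": "FORWARD"},
    {"category": "CONJ", "literal": "but",
     "pattern": None, "rule": "TERMINATE"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_rules.json")
    save_rules(rules, path)
    reloaded = load_rules(path)

print(reloaded == rules)
```

Python's `None` is written as JSON `null` and read back as `None`, which matches the `"pattern": null` entries in the default rules file.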
