# Predicates in VerbNet 3.4

A crucial component of the extracted action model is the predicate, particularly the state predicates that are integral to defining the semantic roles and relationships within VerbNet. This section provides a quantitative overview of the predicates present in the VerbNet 3.4 annotation.

In [1]:
data_dir = '../src/data'
with open(f'{data_dir}/activity_predicates.txt', 'r', encoding='utf-8') as f:
    activity_pred_list = [line.strip() for line in f if line.strip()]
with open(f'{data_dir}/temporal_predicates.txt', 'r', encoding='utf-8') as f:
    temporal_pred_list = [line.strip() for line in f if line.strip()]

### Defining predicates

In the context of VerbNet, a *predicate* is a fundamental semantic unit composed of a name, a set of arguments, and a Boolean value indicating its truth condition. Predicates with the same name can differ in their arguments or their Boolean value. To account for this variation, all forms of a predicate are collected under a single predicate name. The following code shows that there are 163 predicate names, which together cover 515 distinct forms that differ in their arguments or in whether they are negated.

In [2]:
from vn2am.parser import get_VN_entries, get_frames, get_semantics
from vn2am.utils import formatted_predicate

"""
Collect predicates from VerbNet entries
Each predicate is stored in a dictionary with its positive and negative forms

The dictionary structure is:
{
    'predicate_name': {
        'positive': [list of positive predicates],
        'negative': [list of negative predicates]
    }
}
"""

# Extract all predicates from VerbNet
predicate_dict = {}
vndata = get_VN_entries(f'{data_dir}/verbnet3.4.json')
for entry in vndata:
    for frame in entry.get('frames', []):
        for semantic in get_semantics(frame):
            # Each semantic entry unpacks to (_, name, args, bool_value)
            _, predicate_name, args, bool_value = semantic
            formatted_pred = formatted_predicate(list(semantic))
            # Create an entry the first time a predicate name is seen
            if predicate_name not in predicate_dict:
                predicate_dict[predicate_name] = {
                    'positive': set(),
                    'negative': set()
                }
            # A missing bool_value marks a positive form, '!' a negated one
            if bool_value is None:
                predicate_dict[predicate_name]['positive'].add(formatted_pred)
            elif bool_value == '!':
                predicate_dict[predicate_name]['negative'].add(formatted_pred)
            else:
                raise ValueError(f"Unexpected bool_value: {bool_value} for predicate {predicate_name}")

# Sort all positive and negative predicates (this turns each set into a list)
for predicate in predicate_dict:
    predicate_dict[predicate]['positive'] = sorted(predicate_dict[predicate]['positive'], key=str)
    predicate_dict[predicate]['negative'] = sorted(predicate_dict[predicate]['negative'], key=str)
all_predicate_count = sum(len(v['positive']) + len(v['negative']) for v in predicate_dict.values())
print(f"total number of predicates: {len(predicate_dict)}")
print(f"total number of predicates with arguments: {all_predicate_count}")

total number of predicates: 163
total number of predicates with arguments: 515
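
The grouping of forms under a predicate name can be illustrated with a minimal, self-contained sketch on made-up data. The tuple layout `(name, args, negated)`, the helper `group_forms`, and the sample predicates are assumptions chosen for illustration; they are not the structures returned by `vn2am`.

```python
from collections import defaultdict

# Hypothetical predicate forms: (name, argument tuple, negated?).
# Illustrative only; these are not read from VerbNet.
toy_forms = [
    ("has_location", ("e1", "Theme", "Initial_Location"), False),
    ("has_location", ("e2", "Theme", "Initial_Location"), True),
    ("has_location", ("e3", "Theme", "Destination"), False),
    ("motion", ("e2", "Theme"), False),
    ("motion", ("e2", "Theme"), False),  # exact duplicate, collapsed by the set
]


def group_forms(forms):
    """Group predicate forms by name, split into positive and negative sets."""
    grouped = defaultdict(lambda: {"positive": set(), "negative": set()})
    for name, args, negated in forms:
        polarity = "negative" if negated else "positive"
        grouped[name][polarity].add((name, args))
    return dict(grouped)


toy_dict = group_forms(toy_forms)
name_count = len(toy_dict)
form_count = sum(len(v["positive"]) + len(v["negative"]) for v in toy_dict.values())
print(f"names: {name_count}, forms: {form_count}")
```

On this toy input there are two names but four forms, which mirrors the gap between the two counts reported by the notebook cell.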


### Type of predicates

Predicates in VerbNet 3.4 are classified in the paper into three classes: fluent, static, and temporal. Counting all predicates yields the following distribution across these classes.

The totals for each predicate class are as follows:
- Total Fluent Predicates: 475
- Total Static Predicates: 19
- Total Temporal Predicates: 9


In [3]:
from vn2am.utils import fst_pred_count

fluent_pred, static_pred, temporal_pred, special = fst_pred_count(predicate_dict)

print(f"Total number of predicates name: {len(predicate_dict)}")
print(f"  total number of fluent predicates: {len(fluent_pred)}")
print(f"  total number of static predicates: {len(static_pred)}")
print(f"  total number of temporal predicates: {len(temporal_pred)}")
assert len(temporal_pred) == len(temporal_pred_list), \
    "The number of temporal predicates in VerbNet does not match the number of predefined predicates"

print("\nSpecial Cases:")
for case in special:
    predicate, has_fluent, has_static, has_temporal = case
    print(f"  predicate <{predicate}> is", end=" ")
    if has_fluent:
        print("fluent", end=" and ")
    if has_static:
        print("static", end=" ")
    if has_temporal:
        print("temporal", end=" ")
    print()

Total number of predicates name: 163
  total number of fluent predicates: 153
  total number of static predicates: 5
  total number of temporal predicates: 8

Special Cases:
  predicate <cause> is fluent and temporal 
  predicate <part_of> is fluent and static 
  predicate <cost> is fluent and static 
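
The per-class name counts add up to 166 rather than 163. A short arithmetic check, assuming each special case is counted once in every class it belongs to, accounts for the difference. The dictionaries below simply restate the figures printed above; the variable names are illustrative.

```python
# Per-class name counts and the overall total, copied from the output above.
class_counts = {"fluent": 153, "static": 5, "temporal": 8}
total_names = 163

# Special cases from the output, each with the two classes it falls into.
special_cases = {
    "cause": {"fluent", "temporal"},
    "part_of": {"fluent", "static"},
    "cost": {"fluent", "static"},
}

# A name that belongs to k classes is counted k times,
# so it contributes k - 1 counts beyond the first.
extra_counts = sum(len(classes) - 1 for classes in special_cases.values())
class_sum = sum(class_counts.values())

# Assumption being checked: the overlap alone explains the surplus.
assert class_sum - extra_counts == total_names
print(f"class sum: {class_sum}, overlap: {extra_counts}")
```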


Additionally, the paper categorizes fluent predicates into activity predicates and state predicates. Based on the results, there are 68 activity predicates and 85 state predicates, which together account for all 153 fluent predicates. The state predicates form the basis for constructing the extracted action models.

In [4]:
state_predicate_dict = {}
for predicate, value in predicate_dict.items():
    is_fluent = False
    if predicate.lower() in activity_pred_list:
        continue
    for pred in list(value['positive']) + list(value['negative']):
        event_count = sum(1 for arg in pred[2] if arg == 'Event')
        # Simple rule: a form with exactly one Event argument plus at least one other argument marks the predicate as fluent
        if event_count == 1 and len(pred[2]) > 1:
            is_fluent = True
    if is_fluent:
        state_predicate_dict[predicate] = predicate_dict[predicate]

print(f"Total number of activity predicates: {len(activity_pred_list)}")
print(f"Total number of state predicates: {len(state_predicate_dict)}")
assert len(activity_pred_list) + len(state_predicate_dict) == len(fluent_pred), \
    "The sum of activity predicates and state predicates is inconsistent with the number of fluent predicates"

Total number of activity predicates: 68
Total number of state predicates: 85
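
The rule applied to each form in the cell above (exactly one `Event` argument and more than one argument overall) can be tried in isolation. The sketch below is a standalone restatement under assumptions: the helper name `has_fluent_form` and the placeholder label `"Role"` are hypothetical, and the argument lists are invented rather than taken from VerbNet.

```python
def has_fluent_form(arg_types):
    """Return True for exactly one 'Event' argument plus at least one other argument."""
    event_count = sum(1 for arg in arg_types if arg == "Event")
    return event_count == 1 and len(arg_types) > 1


# Invented argument-type lists; "Role" is a placeholder label, not a VerbNet type.
examples = {
    "one event, one participant": ["Event", "Role"],
    "event only": ["Event"],
    "two events": ["Event", "Event"],
    "no event": ["Role", "Role"],
}

results = {label: has_fluent_form(args) for label, args in examples.items()}
for label, verdict in results.items():
    print(f"{label}: {verdict}")
```

Only the first example satisfies the rule. In the notebook cell, names listed in `activity_pred_list` are skipped before this rule is applied, so what remains is treated as a state predicate.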


In [5]:
"""
Export the predicate dictionary to a JSON file. Disabled by default; set IS_EXPORT to True to enable.
"""
import json

IS_EXPORT = False
OUTPUT_JSON_PATH = '../output/verbnet_predicates.json'

if IS_EXPORT:
    # Make sure every collection is a list, since sets cannot be serialized to JSON
    for key in predicate_dict:
        predicate_dict[key]['positive'] = list(predicate_dict[key]['positive'])
        predicate_dict[key]['negative'] = list(predicate_dict[key]['negative'])

    # Write the collected data to a JSON file
    with open(OUTPUT_JSON_PATH, 'w', encoding='utf-8') as json_file:
        json.dump(predicate_dict, json_file, indent=2)
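
The list conversion matters because Python's `json` module cannot serialize sets. The following sketch shows this on made-up data and offers a non-mutating alternative; the helper `to_jsonable` and the toy dictionary are assumptions for illustration and are not part of `vn2am`.

```python
import json

# Toy dictionary in the same shape as predicate_dict, with sets still in place.
toy_dict = {
    "motion": {
        "positive": {("motion", ("e1", "Theme"))},
        "negative": set(),
    }
}


def to_jsonable(pred_dict):
    """Return a copy in which every set is replaced by a sorted list."""
    return {
        name: {polarity: sorted(forms, key=str) for polarity, forms in groups.items()}
        for name, groups in pred_dict.items()
    }


# Dumping the raw dictionary fails because of the sets.
try:
    json.dumps(toy_dict)
    raw_dump_failed = False
except TypeError:
    raw_dump_failed = True

# The converted copy serializes; tuples come back as lists after a round trip.
serialized = json.dumps(to_jsonable(toy_dict), indent=2)
restored = json.loads(serialized)
print(serialized)
```

Building a copy leaves the original dictionary untouched, which can be convenient if later cells still rely on it.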