# Demo Notebook for Passive/Active Switching

## Setup/Loading Data

In [1]:
import conllu
from conllu import Token, parse, parse_tree
from pyinflect import getAllInflections, getInflection
from src.utils import *
from src.units.word import *
from src.units.sentence import *

In [2]:
with open("data/en_gum-ud-train.conllu", "r", encoding="utf-8") as f:
    text = f.read()
sentences = parse(text)

In [3]:
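i = 100  # arbitrary example sentence
token = sentences[i][4]
print(sentences[i].metadata['text'])  # raw sentence text from the CoNLL-U metadata
list(sentences[i])  # token dictionaries, including multiword-token ranges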

The country’s unequal economic growth originated in the colonial era and reflects how the Spanish metropolis influenced the establishment of extractive institutions [7].


[{'id': 1,
  'form': 'The',
  'lemma': 'the',
  'upos': 'DET',
  'xpos': 'DT',
  'feats': {'Definite': 'Def', 'PronType': 'Art'},
  'head': 2,
  'deprel': 'det',
  'deps': [('det', 2)],
  'misc': {'Discourse': 'context-background:105->104:0:sem-synym-803-804,813-815-_',
   'Entity': '(159-abstract-new-nnnnn-cf2-6-sgl(157-place-giv:act-sssss-cf1*-2-coref-Argentina',
   'PDTB': 'Implicit:Expansion.Conjunction:and:_:801-812:813-839'}},
 {'id': (2, '-', 3),
  'form': 'country’s',
  'lemma': '_',
  'upos': '_',
  'xpos': None,
  'feats': None,
  'head': None,
  'deprel': '_',
  'deps': None,
  'misc': None},
 {'id': 2,
  'form': 'country',
  'lemma': 'country',
  'upos': 'NOUN',
  'xpos': 'NN',
  'feats': {'Number': 'Sing'},
  'head': 6,
  'deprel': 'nmod:poss',
  'deps': [('nmod:poss', 6)],
  'misc': None},
 {'id': 3,
  'form': '’s',
  'lemma': "'s",
  'upos': 'PART',
  'xpos': 'POS',
  'feats': None,
  'head': 2,
  'deprel': 'case',
  'deps': [('case', 2)],
  'misc': {'Entity': '157)'}},
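
The token list above includes a multiword-token range (`'id': (2, '-', 3)` for `country’s`) alongside the two syntactic words it spans. A minimal sketch of skipping such range entries, assuming tokens behave like plain dictionaries and that ordinary words carry an integer `id`. `syntactic_words` is a hypothetical helper, not part of `conllu` or `src`:

```python
def syntactic_words(tokens):
    """Keep only tokens whose id is a plain integer (drops ranges like (2, '-', 3))."""
    return [tok for tok in tokens if isinstance(tok["id"], int)]


# Toy tokens mirroring the shape of the output above (most fields omitted).
tokens = [
    {"id": 1, "form": "The"},
    {"id": (2, "-", 3), "form": "country’s"},
    {"id": 2, "form": "country"},
    {"id": 3, "form": "’s"},
]
print([tok["form"] for tok in syntactic_words(tokens)])
```

Filtering on the type of `id` rather than on `upos == '_'` keeps the check independent of how the treebank fills the other columns of a range line.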


## Testing Sentence and PassiveSentence

In [4]:
# Wrap every parsed CoNLL-U sentence in the project's Sentence class.
new_sentences = [Sentence(sent) for sent in sentences]

In [5]:
passive_sentences = [s for s in new_sentences if s.is_passive]

In [6]:
passive_sentences[0].text

'The direction of saccades is determined by an interaction between the goals of the observer and the physical properties of the different elements of the scene (e.g. colour, texture, brightness etc).'
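
The implementation of `Sentence.is_passive` is not shown in this notebook. One plausible check, given that the treebank labels passive subjects with `nsubj:pass`, is to look for Universal Dependencies passive relations. The sketch below is an assumption along those lines; `looks_passive` and `PASSIVE_DEPRELS` are hypothetical names:

```python
# Universal Dependencies relations that mark passive constructions.
PASSIVE_DEPRELS = {"nsubj:pass", "csubj:pass", "aux:pass"}


def looks_passive(tokens):
    """Return True if any token carries a passive dependency relation."""
    return any(tok.get("deprel") in PASSIVE_DEPRELS for tok in tokens)


# Toy tokens with only the fields the check needs.
passive = [
    {"form": "direction", "deprel": "nsubj:pass"},
    {"form": "is", "deprel": "aux:pass"},
    {"form": "determined", "deprel": "root"},
]
active = [
    {"form": "interaction", "deprel": "nsubj"},
    {"form": "determines", "deprel": "root"},
]
print(looks_passive(passive), looks_passive(active))
```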

In [7]:
build_subtree(passive_sentences[0][1])

[{'id': 1,
  'form': 'The',
  'lemma': 'the',
  'upos': 'DET',
  'xpos': 'DT',
  'feats': {'Definite': 'Def', 'PronType': 'Art'},
  'head': 2,
  'deprel': 'det',
  'deps': [('det', 2)],
  'misc': {'Discourse': 'elaboration-additional:39->36:2:sem-synym-369-370,380-386,391,401-402-_',
   'Entity': '(95-abstract-new-nnnnn-cf7-2-sgl',
   'PDTB': 'Implicit:Expansion.Level-of-detail.Arg2-as-detail:in particular:_:364-387:388-423'},
  'inflection': 'DT',
  'children': []},
 {'id': 2,
  'form': 'direction',
  'lemma': 'direction',
  'upos': 'NOUN',
  'xpos': 'NN',
  'feats': {'Number': 'Sing'},
  'head': 6,
  'deprel': 'nsubj:pass',
  'deps': [('nsubj:pass', 6)],
  'misc': {'MSeg': 'direct-ion'},
  'inflection': 'NN',
  'children': [{'id': 1,
    'form': 'The',
    'lemma': 'the',
    'upos': 'DET',
    'xpos': 'DT',
    'feats': {'Definite': 'Def', 'PronType': 'Art'},
    'head': 2,
    'deprel': 'det',
    'deps': [('det', 2)],
    'misc': {'Discourse': 'elaboration-additional:39->36:2:sem-
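
The output above shows each token carrying a `children` list of its dependents. The source of `build_subtree` is not included here, and its return value appears to be a list, so the following is only a sketch of the general idea: nest tokens under their heads using the `head` field. `build_subtree_sketch` is a hypothetical name and returns a single nested dictionary instead:

```python
def build_subtree_sketch(tokens, root_id):
    """Return a copy of the token with id `root_id`, dependents nested under 'children'."""
    # Index ordinary words by the id of their head (multiword ranges are skipped).
    by_head = {}
    for tok in tokens:
        if isinstance(tok["id"], int):
            by_head.setdefault(tok["head"], []).append(tok)

    def attach(tok):
        node = dict(tok)  # shallow copy so the input tokens stay untouched
        node["children"] = [attach(child) for child in by_head.get(tok["id"], [])]
        return node

    root = next(tok for tok in tokens if tok["id"] == root_id)
    return attach(root)


# Toy fragment: "The direction of saccades" with 'direction' attached to token 6.
tokens = [
    {"id": 1, "form": "The", "head": 2},
    {"id": 2, "form": "direction", "head": 6},
    {"id": 3, "form": "of", "head": 4},
    {"id": 4, "form": "saccades", "head": 2},
]
subtree = build_subtree_sketch(tokens, 2)
print([child["form"] for child in subtree["children"]])
```

The recursion assumes a well-formed tree; a cyclic `head` chain would not terminate.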

In [8]:
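passive_sentences_test = []
i = 0  # counts successful conversions only
for sent in passive_sentences:
    try:
        new_sent = PassiveSentence(sent)
        # Appended before depassivize() runs, so this list may also hold
        # sentences whose conversion fails below.
        passive_sentences_test.append(new_sent)
        active_sent = new_sent.depassivize()
        print(i)
        print(sent.text)
        # passive subject | verb group | agent phrase
        print(f'{new_sent.passive_subject.text} | {new_sent.verb.text} | {new_sent.agent.text}')
        print(active_sent.text)
        i += 1
    except Exception as e:
        # Report the failure and continue so one bad sentence does not stop the run.
        print("Failed:", sent.text)
        print("Error:", e, type(e))
    print()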

auxp: VBZ | agent_infl: NN | verb_infl: VBZ | verb: determines
interaction determines
0
The direction of saccades is determined by an interaction between the goals of the observer and the physical properties of the different elements of the scene (e.g. colour, texture, brightness etc).
The direction of saccades | is determined | by an interaction between the goals of the observer and the physical properties of the different elements of the scene ( e.g. colour , texture , brightness etc )
an interaction between the goals of the observer and the physical properties of the different elements of the scene ( e.g. colour , texture , brightness etc ) determines The direction of saccades .

auxp: VB | agent_infl: NN | verb_infl: MD | verb: wash
averaging will wash
1
For example, differences in recruitment and retention strategies across departments will be washed out by averaging, thereby masking any insights into the efficacy of individual strategies and policies.
differences in recruitment a

In [9]:
getAllInflections('be')

{'VB': ('be',),
 'VBD': ('was', 'were'),
 'VBG': ('being',),
 'VBN': ('been',),
 'VBP': ('am', 'are'),
 'VBZ': ('is',)}

In [10]:
getAllInflections('do')

{'NN': ('do',),
 'NNS': ('dos', 'dos'),
 'VB': ('do',),
 'VBP': ('do',),
 'VBD': ('did', 'didst'),
 'VBN': ('done',),
 'VBG': ('doing',),
 'VBZ': ('does', 'dost', 'doth')}

In [11]:
getAllInflections('appreciate')

{'VB': ('appreciate',),
 'VBP': ('appreciate',),
 'VBD': ('appreciated',),
 'VBN': ('appreciated',),
 'VBG': ('appreciating',),
 'VBZ': ('appreciates',)}
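
The debug lines printed by `depassivize()` (`auxp | agent_infl | verb_infl`) suggest how the active verb's tag is chosen from the passive auxiliary's tag and the agent's tag. The rules inside `PassiveSentence` are not shown, so the sketch below is inferred from those printed lines only; `active_verb_tag` is a hypothetical name, and treating `NNP` as singular is an added assumption:

```python
def active_verb_tag(aux_tag, agent_tag):
    """Pick a Penn Treebank tag for the active verb (inferred from the debug output)."""
    if aux_tag == "VBD":
        return "VBD"  # past tense has no number agreement outside of 'be'
    if aux_tag in ("VBZ", "VBP"):
        # Present tense: agree with the new subject (the former agent).
        return "VBZ" if agent_tag in ("NN", "NNP") else "VBP"
    if aux_tag == "VB":
        return "MD"  # 'will be washed' style: keep the modal, use the base form
    raise ValueError(f"unhandled auxiliary tag: {aux_tag}")


# Tag combinations that appear in the notebook's debug output.
for aux, agent in [("VBZ", "NN"), ("VBZ", "NNS"), ("VBP", "NN"), ("VB", "NN"), ("VBD", "NNS")]:
    print(aux, agent, "->", active_verb_tag(aux, agent))
```

The resulting tag could then be handed to an inflection lookup such as `getInflection`, which the notebook imports, to obtain the surface form.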

In [12]:
passive_sentences_test[57]

[{'id': 1,
  'form': 'Moments',
  'lemma': 'moment',
  'upos': 'NOUN',
  'xpos': 'NNS',
  'feats': {'Number': 'Plur'},
  'head': 5,
  'deprel': 'nsubj',
  'deps': [('nsubj', 5)],
  'misc': {'Discourse': 'attribution-positive:154->155:1:sem-atsrc-1038-_+lex-indwd-1039-_',
   'MSeg': 'Moment-s'},
  'inflection': 'NNS',
  'children': [{'id': 3,
    'form': 'that',
    'lemma': 'that',
    'upos': 'PRON',
    'xpos': 'DT',
    'feats': {'Number': 'Sing', 'PronType': 'Dem'},
    'head': 1,
    'deprel': 'nmod',
    'deps': [('nmod:like', 1)],
    'misc': {'Entity': '(135-event-giv:act-nnnnn-cf1*-1-ana)'},
    'inflection': 'DT',
    'children': [{'id': 2,
      'form': 'like',
      'lemma': 'like',
      'upos': 'ADP',
      'xpos': 'IN',
      'feats': None,
      'head': 3,
      'deprel': 'case',
      'deps': [('case', 3)],
      'misc': None,
      'inflection': 'IN',
      'children': []}]}]},
 {'id': 2,
  'form': 'like',
  'lemma': 'like',
  'upos': 'ADP',
  'xpos': 'IN',
  'feats':

In [13]:
passive_sentences_test[69].depassivize().text

auxp: VBD | agent_infl: NNS | verb_infl: VBD | verb: caused
winds caused


'strong winds and heavy rain that led the crane to fall caused The accident , which occurred yesterday afternoon , .'
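
The converted text keeps token-level spacing (` , ` and ` .`) and a lowercase first word. A small post-processing sketch, assuming only whitespace and the initial capital need repair; `tidy_surface` is a hypothetical helper and deliberately leaves mid-sentence capitals such as `The` and leftover comma sequences alone:

```python
import re


def tidy_surface(text):
    """Remove tokenization spaces around punctuation and capitalize the first character."""
    text = re.sub(r"\s+([,.;:!?)])", r"\1", text)  # no space before closing punctuation
    text = re.sub(r"\(\s+", "(", text)  # no space after an opening parenthesis
    return text[:1].upper() + text[1:]


sample = "an interaction ( e.g. colour , texture ) determines The direction of saccades ."
print(tidy_surface(sample))
```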

In [14]:
type(passive_sentences_test[69].depassivize())

auxp: VBD | agent_infl: NNS | verb_infl: VBD | verb: caused
winds caused


src.units.sentence.Sentence

## Testing Document Conversion

In [15]:
from src.units.document import Document

In [16]:
type(sentences)

conllu.models.SentenceList

In [17]:
sentences[28].metadata

{'newdoc id': 'GUM_academic_census',
 'global.Entity': 'GRP-etype-infstat-salience-centering-minspan-link-identity',
 'meta::author': 'Allison C. Morgan , Samuel F. Way, Aaron Clauset',
 'meta::dateCollected': '2018-09-10',
 'meta::dateCreated': '2018-08-29',
 'meta::dateModified': '2018-08-29',
 'meta::genre': 'academic',
 'meta::salientEntities': '77 (5*), 163 (5*), 172 (4*), 2 (3*), 42 (3), 51 (3*), 206 (3*), 13 (2), 14 (2), 47 (2*), 66 (2), 121 (2), 165 (2), 177 (2), 212 (2), 43 (1), 44 (1), 48 (1), 89 (1), 134 (1), 155 (1), 162 (1), 179 (1), 184 (1), 191 (1)',
 'meta::sourceURL': 'https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0202223',
 'meta::speakerCount': '0',
 'meta::summary1': '(human) This paper presents a Web crawler-based computer system for automatically generating census data for an academic field, which is applied to the compilation of a census of faculty in Computer Science.',
 'meta::summary2': '(claude-3-5-sonnet-20241022) This paper introduces an

In [18]:
doc = Document(sentences[:28])
print(doc)

Aesthetic Appreciation and Spanish Art: Insights from Eye-Tracking 
Claire Bailey-Ross claire.bailey-ross@port.ac.uk University of Portsmouth, United Kingdom 
Andrew Beresford a.m.beresford@durham.ac.uk Durham University, United Kingdom 
Daniel Smith daniel.smith2@durham.ac.uk Durham University, United Kingdom 
Claire Warwick c.l.h.warwick@durham.ac.uk Durham University, United Kingdom 
How do people look at and experience art? Which elements of specific artworks do they focus on? Do museum labels have an impact on how people look at artworks? The viewing experience of art is a complex one, involving issues of perception, attention, memory, decision-making, affect, and emotion. Thus, the time it takes and the ways of visually exploring an artwork can inform about its relevance, interestingness, and even its aesthetic appeal. This paper describes a collaborative pilot project focusing on a unique collection of 17th Century Zurbarán paintings. The Jacob cycle at Auckland Castle is the on

In [19]:
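documents = []
curr_doc = []  # sentences collected for the document currently being read
for s in sentences:
    # A 'newdoc id' key in the metadata marks the first sentence of a new document.
    if "newdoc id" in s.metadata:
        if curr_doc:
            curr_doc = Document(curr_doc)
            # Keep only documents that contain at least one passive sentence.
            if curr_doc.has_passive():
                documents.append(curr_doc)
        curr_doc = [s]
    else:
        curr_doc.append(s)
# Note: the sentences collected after the last 'newdoc id' are never wrapped
# in a Document here, so the final document is not included in `documents`.
len(documents)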

91
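
The loop above starts a new document whenever a sentence's metadata contains `newdoc id`. A self-contained sketch of that grouping step on toy data, assuming sentences expose a `metadata` mapping; `group_by_newdoc` is a hypothetical helper. Unlike the loop above, it also emits the last group once the input is exhausted, which may or may not be what the count above is meant to include:

```python
def group_by_newdoc(sentences):
    """Split a flat sentence list into per-document lists at each 'newdoc id' marker."""
    groups, current = [], []
    for sent in sentences:
        if "newdoc id" in sent["metadata"] and current:
            groups.append(current)  # close the previous document
            current = []
        current.append(sent)
    if current:
        groups.append(current)  # flush the final document
    return groups


# Toy sentences: three documents of 2, 1 and 2 sentences.
toy = [
    {"metadata": {"newdoc id": "a"}},
    {"metadata": {}},
    {"metadata": {"newdoc id": "b"}},
    {"metadata": {"newdoc id": "c"}},
    {"metadata": {}},
]
print([len(group) for group in group_by_newdoc(toy)])
```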

In [20]:
converted_documents = []
for d in documents:
    converted_documents += d.convert_all()

auxp: VBZ | agent_infl: NN | verb_infl: VBZ | verb: determines
interaction determines
auxp: VB | agent_infl: NN | verb_infl: MD | verb: wash
averaging will wash
auxp: VB | agent_infl: NNS | verb_infl: MD | verb: take
actors will take
auxp: VBZ | agent_infl: NNS | verb_infl: VBP | verb: pollute
agents heavily deliberately pollute
auxp: VBZ | agent_infl: NN | verb_infl: VBZ | verb: connects
accident Thus connects
auxp: VBZ | agent_infl: NNS | verb_infl: VBP | verb: constrain
pressures constrain
auxp: VBZ | agent_infl: NN | verb_infl: VBZ | verb: follows
repetition follows
auxp: VBZ | agent_infl: NN | verb_infl: VBZ | verb: initiates
speaker initiates
auxp: VBP | agent_infl: NN | verb_infl: VBZ | verb: compares
efficiency where compares
auxp: VBP | agent_infl: NN | verb_infl: VBZ | verb: generates
application generates
auxp: VBZ | agent_infl: NN | verb_infl: VBZ | verb: determines
location determines
auxp: VBP | agent_infl: JJ | verb_infl: VBP | verb: consider
many consider
auxp: VBD | ag

In [21]:
doc, pass_idx = converted_documents[0]
doc[pass_idx].text

'an interaction between the goals of the observer and the physical properties of the different elements of the scene ( e.g. colour , texture , brightness etc ) determines The direction of saccades .'