# Penman API Demo

This notebook demonstrates the basic usage of the Penman API. For an overview of what Penman does, see the [project page](https://github.com/goodmami/penman). For API documentation, see the [`penman` module reference](https://penman.readthedocs.io/en/latest/api/penman.html).

## Basic Decoding and Encoding

To start, the simplest way to parse a PENMAN string into a graph data structure is with [penman.decode()](https://penman.readthedocs.io/en/latest/api/penman.html#penman.decode):

In [1]:
from penman import decode
g = decode('''
  # ::snt The dog didn't bark
  (b / bark-01
     :polarity -
     :ARG0 (d / dog))''')
g

<Graph object (top=b) at 139688313335184>

The [penman.encode()](https://penman.readthedocs.io/en/latest/api/penman.html#penman.encode) function can serialize a graph back to PENMAN notation (note that the metadata is also printed):

In [2]:
from penman import encode
print(encode(g))

# ::snt The dog didn't bark
(b / bark-01
   :polarity -
   :ARG0 (d / dog))


You may customize things like indentation:

In [3]:
print(encode(g, indent=None))  # single-line

# ::snt The dog didn't bark
(b / bark-01 :polarity - :ARG0 (d / dog))


In [4]:
print(encode(g, indent=6, compact=True))  # attributes following concepts printed on same line

# ::snt The dog didn't bark
(b / bark-01 :polarity -
      :ARG0 (d / dog))


## Graph Introspection and Manipulation

The [Graph](https://penman.readthedocs.io/en/latest/api/penman.graph.html#penman.graph.Graph) object returned by `decode()` has methods for inspecting its variables, instances, attributes, and edges; its metadata is available as a plain attribute.

In [5]:
g.variables()

{'b', 'd'}

In [6]:
g.instances()

[Attribute(source='b', role=':instance', target='bark-01'),
 Attribute(source='d', role=':instance', target='dog')]

In [7]:
g.attributes()

[Attribute(source='b', role=':polarity', target='-')]

In [8]:
g.edges()

[Edge(source='b', role=':ARG0', target='d')]

In [9]:
g.metadata

{'snt': "The dog didn't bark"}

You may also view and modify the full list of triples and the metadata directly:

In [10]:
g.triples

[('b', ':instance', 'bark-01'),
 ('b', ':polarity', '-'),
 ('b', ':ARG0', 'd'),
 ('d', ':instance', 'dog')]
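
The views returned by `instances()`, `attributes()`, and `edges()` are all drawn from this one list of triples. As a rough illustration in plain Python (not Penman's actual implementation), the split could be sketched as follows. The `partition` helper is hypothetical, and it assumes that a variable is anything that appears as the source of a triple.

```python
# Triples for "The dog didn't bark", copied from the output above.
triples = [
    ('b', ':instance', 'bark-01'),
    ('b', ':polarity', '-'),
    ('b', ':ARG0', 'd'),
    ('d', ':instance', 'dog'),
]


def partition(triples):
    """Split triples into (variables, instances, attributes, edges)."""
    # Assumption for this sketch: every triple source is a variable.
    variables = {source for source, _, _ in triples}
    instances = [t for t in triples if t[1] == ':instance']
    others = [t for t in triples if t[1] != ':instance']
    # An edge points at another variable; an attribute points at a constant.
    edges = [t for t in others if t[2] in variables]
    attributes = [t for t in others if t[2] not in variables]
    return variables, instances, attributes, edges


variables, instances, attributes, edges = partition(triples)
```

Under these assumptions the four results line up with the outputs of cells 5 through 8 above, except that they are bare tuples rather than `Attribute` and `Edge` objects.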

In [11]:
g.triples.extend([('b', ':location', 'g'), ('g', ':instance', 'garden')])
g.triples

[('b', ':instance', 'bark-01'),
 ('b', ':polarity', '-'),
 ('b', ':ARG0', 'd'),
 ('d', ':instance', 'dog'),
 ('b', ':location', 'g'),
 ('g', ':instance', 'garden')]

In [12]:
g.metadata['snt'] = "The dog didn't bark in the garden."

In [13]:
print(encode(g))

# ::snt The dog didn't bark in the garden.
(b / bark-01
   :polarity -
   :ARG0 (d / dog)
   :location (g / garden))


## Advanced Decoding and Encoding

Penman's decoding works across three representations, managed by the [PENMANCodec](https://penman.readthedocs.io/en/latest/api/penman.codec.html#penman.codec.PENMANCodec) class: it starts with a PENMAN string, parses it to a tree structure, and then interprets the tree to produce a pure graph. When we called the [decode()](https://penman.readthedocs.io/en/latest/api/penman.html#penman.decode) function earlier, it was calling the [PENMANCodec.decode()](https://penman.readthedocs.io/en/latest/api/penman.codec.html#penman.codec.PENMANCodec.decode) method. The codec object also lets us parse to trees (with [PENMANCodec.parse()](https://penman.readthedocs.io/en/latest/api/penman.codec.html#penman.codec.PENMANCodec.parse)) without interpreting them as graphs. This is useful if you prefer to work with AMR data as trees rather than as pure graphs, or if you wish to use some of Penman's tree [transformations](#Transformations).

In [14]:
from penman import PENMANCodec
codec = PENMANCodec()
t = codec.parse('''
  # ::snt The dog didn't bark
  (b / bark-01
     :polarity -
     :ARG0 (d / dog))''')
t

Tree(('b', [('/', 'bark-01'), (':polarity', '-'), (':ARG0', ('d', [('/', 'dog')]))]))

The [penman.layout](https://penman.readthedocs.io/en/latest/api/penman.layout.html) module defines the interface between trees and graphs. Getting the graph from the tree then requires a separate call to [penman.layout.interpret()](https://penman.readthedocs.io/en/latest/api/penman.layout.html#penman.layout.interpret):

In [15]:
from penman import layout
g = layout.interpret(t)
g

<Graph object (top=b) at 139688313043408>
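
To make the tree-to-graph step more concrete, here is a simplified sketch in plain Python of how a tree node could be flattened into triples. The `to_triples` function is hypothetical and is not how `layout.interpret()` is implemented; in particular it ignores the model, so it does not normalize inverted roles.

```python
def to_triples(node):
    """Flatten a (var, branches) node into a list of triples."""
    var, branches = node
    triples = []
    for label, target in branches:
        # Concept branches use '/' in the tree and ':instance' in the graph.
        role = ':instance' if label == '/' else label
        if isinstance(target, tuple):
            # Nested node: link to its variable, then descend into it.
            triples.append((var, role, target[0]))
            triples.extend(to_triples(target))
        else:
            triples.append((var, role, target))
    return triples


node = ('b', [('/', 'bark-01'),
              (':polarity', '-'),
              (':ARG0', ('d', [('/', 'dog')]))])
triples = to_triples(node)
```

For this tree the sketch yields the same four triples that `g.triples` showed earlier for the same sentence.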

We can also go the other way; call [penman.layout.configure()](https://penman.readthedocs.io/en/latest/api/penman.layout.html#penman.layout.configure) to get a tree from a graph, and finally [PENMANCodec.format()](https://penman.readthedocs.io/en/latest/api/penman.codec.html#penman.codec.PENMANCodec.format) to get a string again:

In [16]:
t2 = layout.configure(g)
print(codec.format(t2))

# ::snt The dog didn't bark
(b / bark-01
   :polarity -
   :ARG0 (d / dog))
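
The last step, from tree to string, is mostly a recursive walk over the `(var, branches)` structure. The following is a minimal sketch in plain Python, assuming single-line output with no metadata and no indentation; `format_node` is a hypothetical helper, not the codec's own formatter.

```python
def format_node(node):
    """Serialize a (var, branches) node as a single-line PENMAN string."""
    var, branches = node
    parts = [var]
    for label, target in branches:
        if isinstance(target, tuple):
            # Nested nodes are serialized recursively in parentheses.
            target = format_node(target)
        parts.append(f'{label} {target}')
    return '(' + ' '.join(parts) + ')'


node = ('b', [('/', 'bark-01'),
              (':polarity', '-'),
              (':ARG0', ('d', [('/', 'dog')]))])
text = format_node(node)
```

The resulting string has the same shape as the single-line output of `encode(g, indent=None)` shown earlier, minus the metadata comment.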


## Tree Inspection and Manipulation

[Tree](https://penman.readthedocs.io/en/latest/api/penman.tree.html#penman.tree.Tree) objects are simple structures whose `node` attribute holds a `(var, branches)` pair, where `var` is the node's variable and `branches` is a list of `(branch_label, target)` pairs. A `branch_label` is like a graph role, but it is not normalized for inversion, and concept branches use the `/` label instead of the `:instance` role. A `target` is either an atomic value (e.g., a string) or, recursively, another node. Tree objects also contain metadata.

In [17]:
t.node

('b', [('/', 'bark-01'), (':polarity', '-'), (':ARG0', ('d', [('/', 'dog')]))])

In [18]:
t.metadata

{'snt': "The dog didn't bark"}

[Tree.nodes()](https://penman.readthedocs.io/en/latest/api/penman.tree.html#penman.tree.Tree.nodes) traverses the tree and returns a flat list of the nodes in the tree (but the nodes themselves are not flat):

In [19]:
t.nodes()

[('b',
  [('/', 'bark-01'), (':polarity', '-'), (':ARG0', ('d', [('/', 'dog')]))]),
 ('d', [('/', 'dog')])]
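
The traversal itself is easy to reproduce on the raw `(var, branches)` tuples. Below is a small sketch in plain Python; `walk` is a hypothetical helper written for this illustration and makes no claim about how `Tree.nodes()` is implemented.

```python
def walk(node):
    """Yield node and every node nested beneath it, depth-first."""
    var, branches = node
    yield node
    for label, target in branches:
        # Nested nodes are tuples; atomic targets are plain strings.
        if isinstance(target, tuple):
            yield from walk(target)


node = ('b', [('/', 'bark-01'),
              (':polarity', '-'),
              (':ARG0', ('d', [('/', 'dog')]))])
found = list(walk(node))
variables = [var for var, _ in found]
```

As in the output above, the list is flat but each entry still carries its full nested branches.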

[Tree.reset_variables()](https://penman.readthedocs.io/en/latest/api/penman.tree.html#penman.tree.Tree.reset_variables) reassigns the node variables based on the order in which they appear in the tree. It takes a format string with a few replacement fields, such as `{prefix}`, `{i}`, and `{j}` (see the documentation for details):

In [20]:
t.reset_variables('a{i}')
t

Tree(('a0', [('/', 'bark-01'), (':polarity', '-'), (':ARG0', ('a1', [('/', 'dog')]))]))

In [21]:
t.reset_variables('{prefix}{j}')
t

Tree(('b', [('/', 'bark-01'), (':polarity', '-'), (':ARG0', ('d', [('/', 'dog')]))]))
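
To give a feel for how such a format string could be expanded, here is a rough sketch in plain Python. It is built on assumptions rather than on the library's source: `{prefix}` is taken to be the first character of the concept, `{i}` the running count of variables assigned so far, and `{j}` an empty string for the first use of a prefix and then `2`, `3`, and so on, in the `c`, `c2` style common in AMR data. The `new_variables` helper is hypothetical.

```python
def new_variables(concepts, fmt='{prefix}{j}'):
    """Assign a unique variable to each concept, in order of appearance."""
    used = []
    for concept in concepts:
        prefix = concept[0].lower()  # assumed rule: first character
        # Retry with a larger j until the variable is unique; the bound
        # stops formats that can never produce a fresh variable.
        for attempt in range(len(concepts) + 1):
            j = '' if attempt == 0 else attempt + 1
            var = fmt.format(prefix=prefix, i=len(used), j=j)
            if var not in used:
                break
        else:
            raise ValueError(f'format {fmt!r} cannot produce unique variables')
        used.append(var)
    return used


indexed = new_variables(['bark-01', 'dog'], 'a{i}')
prefixed = new_variables(['bark-01', 'dog'], '{prefix}{j}')
clashing = new_variables(['try-01', 'chase-01', 'cat', 'dog'])
```

With these assumptions the first two calls reproduce the variables seen in the two cells above (`a0`, `a1` and `b`, `d`).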

## Using Models

In Penman, the interpretation of a graph from a tree relies on a [Model](https://penman.readthedocs.io/en/latest/api/penman.model.html) to determine things like whether a role is inverted. By default, a basic model with no special roles defined is used, and this is often enough:

In [22]:
g = decode('''
  # ::snt The dog that barked slept.
  (s / sleep-01
     :ARG0 (d / dog
              :ARG0-of (b / bark)))''')
g.edges()  # note that edge directions are normalized

[Edge(source='s', role=':ARG0', target='d'),
 Edge(source='b', role=':ARG0', target='d')]

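AMR, however, has some roles that use `-of` in their primary, or non-inverted, form. The default model treats these as inverted, so it flips the edge and truncates the role (below, `:consist-of` becomes `:consist`), which can lead to invalid graphs: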

In [23]:
g = decode('''
  # ::snt I bought a ceramic knife
  (b / buy-01
     :ARG0 (i / i)
     :ARG1 (k / knife
              :consist-of (c / ceramic)))''')
g.edges()

[Edge(source='b', role=':ARG0', target='i'),
 Edge(source='i', role=':instance', target='i'),
 Edge(source='b', role=':ARG1', target='k'),
 Edge(source='c', role=':consist', target='k')]

Instead, by using the AMR-specific model, these edges are correctly interpreted:

In [24]:
from penman.models import amr
g = decode('''
  # ::snt I bought a ceramic knife
  (b / buy-01
     :ARG0 (i / i)
     :ARG1 (k / knife
              :consist-of (c / ceramic)))''',
          model=amr.model)
g.edges()

[Edge(source='b', role=':ARG0', target='i'),
 Edge(source='i', role=':instance', target='i'),
 Edge(source='b', role=':ARG1', target='k'),
 Edge(source='k', role=':consist-of', target='c')]

You can also create a codec with a model:

In [25]:
amrcodec = PENMANCodec(model=amr.model)
amrcodec.decode('(k / knife :consist-of (c / ceramic))').edges()

[Edge(source='k', role=':consist-of', target='c')]
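
The difference between the two behaviors comes down to which roles the model considers inverted. A minimal sketch in plain Python, assuming a model can be reduced to a set of roles whose primary form ends in `-of`; the `deinvert` helper and the one-item role set are hypothetical and only illustrate the idea.

```python
def deinvert(triple, primary_of_roles=frozenset()):
    """Normalize an inverted triple unless its role is a primary -of role."""
    source, role, target = triple
    if role.endswith('-of') and role not in primary_of_roles:
        # Inverted role: swap the endpoints and drop the '-of' suffix.
        return (target, role[:-3], source)
    return triple


# Hypothetical stand-in for the role knowledge an AMR model provides.
amr_like_roles = frozenset({':consist-of'})

normal = deinvert(('d', ':ARG0-of', 'b'))
wrong = deinvert(('k', ':consist-of', 'c'))
right = deinvert(('k', ':consist-of', 'c'), amr_like_roles)
```

Here `wrong` shows the flipped, truncated edge produced without model knowledge, while `right` leaves `:consist-of` intact.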

Models are also useful as a source of information for transformations, as shown in the next section.

## Transformations

Penman's transformations sometimes modify the content of the graph and other times only restructure how the graph is displayed. They rely on a [Model](https://penman.readthedocs.io/en/latest/api/penman.model.html#penman.model.Model) for information on how to apply the transformations.

In [26]:
from penman import transform

Consider the following erroneous graph:

In [27]:
g = decode('''
  (c / chapter
     :domain-of 7)''')  # this will log a warning



In [28]:
g.attributes()  # note that it is not normalized

[Attribute(source='c', role=':domain-of', target='7')]

We can use [transform.canonicalize_roles()](https://penman.readthedocs.io/en/latest/api/penman.transform.html#penman.transform.canonicalize_roles) to fix the error using the AMR model. It works on the tree structure, so we first reparse it as a tree:

In [29]:
t = codec.parse('''
  (c / chapter
     :domain-of 7)''')
t2 = transform.canonicalize_roles(t, model=amr.model)
print(codec.format(t2))

(c / chapter
   :mod 7)
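
Conceptually, this kind of role canonicalization is a lookup applied to every branch label in the tree. The sketch below is plain Python with a hypothetical `canonicalize` helper and a one-entry mapping taken from the example above; the real function gets its normalizations from the model.

```python
def canonicalize(node, normalizations):
    """Return a copy of the node with branch labels replaced via a mapping."""
    var, branches = node
    new_branches = []
    for label, target in branches:
        # Replace the label if the mapping knows a canonical form for it.
        label = normalizations.get(label, label)
        if isinstance(target, tuple):
            target = canonicalize(target, normalizations)
        new_branches.append((label, target))
    return (var, new_branches)


# Assumed mapping for illustration, based on the output shown above.
normalizations = {':domain-of': ':mod'}

node = ('c', [('/', 'chapter'), (':domain-of', '7')])
fixed = canonicalize(node, normalizations)
```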


Reification is another kind of transformation. It works on graphs. There are two kinds of reification in Penman, and the first is [transform.reify_edges()](https://penman.readthedocs.io/en/latest/api/penman.transform.html#penman.transform.reify_edges), which does reification as defined by the AMR guidelines:

In [30]:
g = layout.interpret(t2)  # get a graph from the tree
g2 = transform.reify_edges(g, model=amr.model)  # :mod -> have-mod-91 is defined by the AMR model
print(encode(g2))

(c / chapter
   :ARG1-of (_ / have-mod-91
               :ARG2 7))
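
At the level of triples, edge reification replaces one triple with a new node and two triples that connect it to the original source and target. The following plain-Python sketch assumes a reification table shaped like `{role: (concept, source_role, target_role)}` and a `_`, `_2`, `_3` naming scheme for the new variables; both are assumptions made for this illustration, as is the `reify` helper.

```python
def reify(triples, reifications):
    """Replace triples whose role is reifiable with a new node."""
    result = []
    count = 0
    for source, role, target in triples:
        if role not in reifications:
            result.append((source, role, target))
            continue
        concept, source_role, target_role = reifications[role]
        # Assumed naming scheme for the new node's variable.
        var = '_' if count == 0 else f'_{count + 1}'
        count += 1
        result.append((var, ':instance', concept))
        result.append((var, source_role, source))
        result.append((var, target_role, target))
    return result


# Assumed table entry, matching the :mod example above.
reifications = {':mod': ('have-mod-91', ':ARG1', ':ARG2')}

triples = [('c', ':instance', 'chapter'), ('c', ':mod', '7')]
reified = reify(triples, reifications)
```

The `(_, :ARG1, c)` triple is what the encoder prints as `:ARG1-of` when the new node is nested under `c`.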


There is also [transform.reify_attributes()](https://penman.readthedocs.io/en/latest/api/penman.transform.html#penman.transform.reify_attributes), which replaces attribute values with nodes. This is another way to deal with the warning above about interpretation being unable to deinvert an attribute. As this procedure is not defined by a model, the function does not take one:

In [31]:
g3 = transform.reify_attributes(g)
print(encode(g3))

(c / chapter
   :mod (_ / 7))
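
In terms of triples, this turns each constant attribute value into the concept of a fresh node. A short plain-Python sketch, assuming that any non-instance triple whose target is not a variable counts as an attribute and that new variables are named `_`, `_2`, and so on; `reify_attribute_values` is a hypothetical helper, not the library function.

```python
def reify_attribute_values(triples):
    """Replace each attribute value with a new node carrying that value."""
    variables = {source for source, _, _ in triples}
    result = []
    count = 0
    for source, role, target in triples:
        if role != ':instance' and target not in variables:
            # Assumed naming scheme for the new node's variable.
            var = '_' if count == 0 else f'_{count + 1}'
            count += 1
            result.append((source, role, var))
            result.append((var, ':instance', target))
        else:
            result.append((source, role, target))
    return result


triples = [('c', ':instance', 'chapter'), ('c', ':mod', '7')]
reified = reify_attribute_values(triples)
```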


Finally, there are some transformations defined by other parts of Penman. We've already seen [Tree.reset_variables()](https://penman.readthedocs.io/en/latest/api/penman.tree.html#penman.tree.Tree.reset_variables). Two others are defined in the [penman.layout](https://penman.readthedocs.io/en/latest/api/penman.layout.html) module.

First, [layout.rearrange()](https://penman.readthedocs.io/en/latest/api/penman.layout.html#penman.layout.rearrange) will reorder branches in the tree without otherwise changing its structure. For example:

In [32]:
t = codec.parse('''
  (t / try-01
     :ARG1 (c / chase-01
              :ARG1 (c2 / cat)
              :ARG0 (d / dog))
     :ARG0 d)''')
layout.rearrange(t, key=amr.model.canonical_order)
print(codec.format(t))

(t / try-01
   :ARG0 d
   :ARG1 (c / chase-01
            :ARG0 (d / dog)
            :ARG1 (c2 / cat)))
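
The effect of rearranging can be imitated on the raw tuples by sorting each node's branches while keeping the concept branch first. This is a simplified sketch in plain Python: `sort_branches` is hypothetical, it returns a new structure instead of modifying the tree in place, and it sorts by the role string itself as a stand-in for `amr.model.canonical_order`.

```python
def sort_branches(node, key=None):
    """Return a copy of the node with non-concept branches sorted by role."""
    var, branches = node
    concept = [b for b in branches if b[0] == '/']
    others = [b for b in branches if b[0] != '/']
    # Default to plain string order when no key function is given.
    others.sort(key=lambda b: key(b[0]) if key else b[0])
    result = []
    for label, target in concept + others:
        if isinstance(target, tuple):
            target = sort_branches(target, key)
        result.append((label, target))
    return (var, result)


node = ('t', [('/', 'try-01'),
              (':ARG1', ('c', [('/', 'chase-01'),
                               (':ARG1', ('c2', [('/', 'cat')])),
                               (':ARG0', ('d', [('/', 'dog')]))])),
              (':ARG0', 'd')])
ordered = sort_branches(node)
```

Only the order of branches changes; which node is nested under which stays the same, just as in the output above.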


Next, [layout.reconfigure()](https://penman.readthedocs.io/en/latest/api/penman.layout.html#penman.layout.reconfigure) makes more significant structural changes: it takes a graph and lays it out as a new tree, which can change where nodes are nested:

In [33]:
g = layout.interpret(t)
t2 = layout.reconfigure(g, key=amr.model.canonical_order)
print(codec.format(t2))

(t / try-01
   :ARG0 (d / dog
            :ARG0-of (c / chase-01
                        :ARG1 (c2 / cat)))
   :ARG1 c)


## Command-line Utility

Many of the operations described above are available via the command-line `penman` utility. For more information, see [the documentation](https://penman.readthedocs.io/en/latest/basic.html#using-penman-as-a-tool).