Position of a node #74

rafaelanchieta · 2020-04-19T04:58:18Z

Hi Goodman.

Does the penman library have a method to return the position of each node in the graph? Ex.
(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 : ARG0 b))
want-01 - 0
boy - 0.0
go-01 - 0.1

The text was updated successfully, but these errors were encountered:

goodmami · 2020-04-19T05:42:37Z

There isn't currently a function to do this, but it wouldn't be hard to do, and it would probably make a good addition to the library so it could work with JAMR-style alignment metadata. But you would want to target the tree and not the graph (e.g., from PENMANCodec.parse(), also see the penman.tree module), as those are tree positions. The trouble is that if the tree gets restructured from the pure graph, those alignments no longer make sense.

Depending on what you want, it would probably be as simple as the following (not tested):

def positions(node):
    variable, branches = node
    pos = []
    for i, (role, target) in enumerate(branches):
        pos.append((i, target))
        if not tree.is_atomic(target):
            pos.extend((f'{i}.{j}', tgt) for j, tgt in positions(target))
    return pos

I would welcome a PR with a Tree method that does something similar to the above, or maybe I'll have a chance to do it for the next release.

rafaelanchieta · 2020-04-19T20:41:54Z

Thank you for the quick reply. I'm trying to fix the code.

goodmami · 2020-04-20T01:32:37Z

Sounds good. There's some useful documentation of the numbering here, and #19 has further investigation and discussion of the numbering and alignment formats. I'm on the fence as to whether we should have the position start at 0 or 1 or make it a parameter. I kinda like 0-based and in depth-first order so they can be zipped with the output of Tree.nodes(). Users who want 1-based can always add 1 when creating alignment metadata, I suppose?

rafaelanchieta · 2020-04-20T13:53:29Z

I believe that the position should start at 0 to keep the compatibility with the JARM-alignment and with parsers based on it. I'll see the documentation.

goodmami · 2020-04-20T14:02:17Z

Also see this document from the 3.0 release of the AMR data. It uses 1-based indices.

https://catalog.ldc.upenn.edu/docs/LDC2020T02/AMR-alignment-format.txt

rafaelanchieta · 2020-04-20T17:35:50Z

Interesting. Are there more examples of this alignment format?

goodmami · 2020-04-21T00:44:10Z

I'm not sure. In this comment of #19 I investigated some sources and found 3 different in-situ formats (~N, ~eN, ~e.N) and two metadata formats (M-N|i+i.j and M-i.j). Penman already supports all the in-situ formats (shallowly, it just stores whatever comes between ~ and the first index as a prefix, and there's no interpretation of the indices yet, so 0-based and 1-based are both ok).

As mentioned, I'm happy if as a first pass the Tree.positions() method (or whatever it's called) only returns 0-based indices because we can add 1 when creating metadata. If we start interpreting the ::alignment metadata lines or in-situ alignments (e.g., for extracting token sequences from the ::tok line) we'll need to make up a story for interpreting the values.

goodmami · 2020-04-27T08:14:37Z

@rafaelanchieta I don't see a fork of Penman on your GitHub page so am I correct in assuming you're not working on this feature (at least not for submitting a PR)?

Resolves #74

goodmami · 2020-04-27T09:54:34Z

I went ahead and pushed a new branch with the Tree.positions() method. I went with the starting number of 1 to follow the AMR annotation guidelines, but it takes an argument to adjust this. Here's how it works:

>>> from penman.codec import PENMANCodec
>>> c = PENMANCodec()
>>> t = c.parse('(w / want-01 :polarity - :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))')
>>> print(c.format(t))
(w / want-01
   :polarity -
   :ARG0 (b / boy)
   :ARG1 (g / go-01
            :ARG0 b))
>>> t.positions()
['1', '1.2', '1.3']
>>> for node, pos in zip(t.nodes(), t.positions()):
...   print(pos, node[0])
... 
1 w
1.2 b
1.3 g
>>> t.positions(0)
['0', '0.1', '0.2']

Note that the positions are only returned for actual nodes, so the (w :polarity -) triple increments the counter but does not end up in the list of positions. This is so the result of Tree.positions() can be zipped with Tree.nodes().

@rafaelanchieta, does this suit your needs?

rafaelanchieta · 2020-04-28T12:01:48Z

Hi @goodmami .

I downloaded the Penman project and was working on it.
For that example, the (w :polarity -) triple should be counted.
The method should returns ['0', '0.0, '0.1', '0.2']'

goodmami · 2020-04-28T13:38:16Z

Thanks for the feedback. If you want to submit a PR I can still merge it in (even if it gets some further adjustments) so your contribution is recognized by GitHub.

Regarding the (w :polarity -) triple, I chose to count but not return a position for those triples because if the list of positions could not zip with the nodes they would be rather useless. You would have to do something complicated to figure out which node each position went with, and that defeats the purpose of computing them in the first place.

But I just now realized that tree nodes and positions won't help with the main use case of computing the positions, which is to create or interpret alignment metadata, because when a tree is configured the alignment data is already embedded in the tree and thus not easily inspected:

>>> from penman.codec import PENMANCodec
>>> from penman import layout
>>> from penman.surface import alignments, role_alignments
>>> c = PENMANCodec()
>>> g = c.decode('''
... # ::tok I swam in the lake .
... (s / swim~e.1
...    :ARG0 (i / i~e.0)
...    :location~e.2 (l / lake~e.4))''')
>>> alignments(g)
{('s', ':instance', 'swim'): Alignment((1,), prefix='e.'), ('i', ':instance', 'i'): Alignment((0,), prefix='e.'), ('l', ':instance', 'lake'): Alignment((4,), prefix='e.')}
>>> role_alignments(g)
{('s', ':location', 'l'): RoleAlignment((2,), prefix='e.')}
>>> layout.configure(g)
Tree(('s', [('/', 'swim~e.1'), (':ARG0', ('i', [('/', 'i~e.0')])), (':location~e.2', ('l', [('/', 'lake~e.4')]))]))

So perhaps what is needed instead is a function that maps triples to tree positions. I'm not sure if this would be a Tree method or a function of penman.layout, since it is strongly related to penman.layout.configure().

rafaelanchieta · 2020-04-29T15:10:32Z

I'm not sure but, checking if the attribute-is in the attribute list could solve this issue.

goodmami · 2020-04-30T02:15:47Z

Thanks, but I don't think that's the issue. It's that alignments are associated with graph triples, but the positions are based on the tree structure. The graph, tree, and PENMAN string are all separate representations of an AMR (see here if you find that confusing). Associating alignments with tree positions thus requires some function that straddles both the graph and tree.

Now the function yields pairs of (path, branch) where path is a sequence of branch indices to get to the paired branch. Resolves #74, probably

goodmami · 2020-05-06T10:16:41Z

Ok, I've reconsidered this and changed Tree.positions() to Tree.walk() which yields paths to branches. That is, it should yield one pair for every branch in the tree structure. This is still not exactly the same as the alignment annotations, but this is more general and should be able to be applied to alignment annotation conversion. Here's an example:

>>> import penman
>>> t = penman.parse('''
... # ::tok I swam in the lake .
... (s / swim-01~e.1
...    :ARG0 (i / i~e.0)
...    :location~e.2 (l / lake~e.4))''')
>>> for path, branch in t.walk():
...     print(path, branch)
... 
(0,) ('/', 'swim-01~e.1')
(1,) (':ARG0', ('i', [('/', 'i~e.0')]))
(1, 0) ('/', 'i~e.0')
(2,) (':location~e.2', ('l', [('/', 'lake~e.4')]))
(2, 0) ('/', 'lake~e.4')

Note that the top node doesn't have an index, because these are indices of branches. They are 0-based but the concept is included as a branch, so it looks like 1-based indices. Here's an example of how you could get most of the way to the alignment metadata format:

>>> from penman.tree import is_atomic
>>> for path, branch in t.walk():
...     if is_atomic(branch[1]):
...         if path[-1] == 0:
...             path = path[:-1]
...         path = '.'.join(map(str, (1,) + path))
...         print(path, branch[1])
... 
1 swim-01~e.1
1.1 i~e.0
1.2 lake~e.4

To get the rest of the way would require parsing the alignment info from the roles or concepts. The AlignmentMarker.from_string() function would help here. Also one would need to consider role alignments on branches whose target is not atomic.

@rafaelanchieta does this look better?

rafaelanchieta · 2020-05-08T15:08:34Z

It's great. I'll close the issue.

goodmami added a commit that referenced this issue Apr 27, 2020

Add Tree.positions() method

e2552e9

Resolves #74

goodmami added a commit that referenced this issue May 6, 2020

Change Tree.positions() to Tree.walk()

1707cf4

Now the function yields pairs of (path, branch) where path is a sequence of branch indices to get to the paired branch. Resolves #74, probably

goodmami added the enhancement New feature or request label May 7, 2020

rafaelanchieta closed this as completed May 8, 2020

bjascob mentioned this issue Sep 20, 2020

Gorn Address of a Graph Triple #91

Closed

goodmami mentioned this issue Dec 3, 2021

Get node variables in tree.walk() #104

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Position of a node #74

Position of a node #74

rafaelanchieta commented Apr 19, 2020

goodmami commented Apr 19, 2020

rafaelanchieta commented Apr 19, 2020

goodmami commented Apr 20, 2020

rafaelanchieta commented Apr 20, 2020

goodmami commented Apr 20, 2020

rafaelanchieta commented Apr 20, 2020

goodmami commented Apr 21, 2020

goodmami commented Apr 27, 2020

goodmami commented Apr 27, 2020

rafaelanchieta commented Apr 28, 2020

goodmami commented Apr 28, 2020

rafaelanchieta commented Apr 29, 2020

goodmami commented Apr 30, 2020

goodmami commented May 6, 2020

rafaelanchieta commented May 8, 2020

Position of a node #74

Position of a node #74

Comments

rafaelanchieta commented Apr 19, 2020

goodmami commented Apr 19, 2020

rafaelanchieta commented Apr 19, 2020

goodmami commented Apr 20, 2020

rafaelanchieta commented Apr 20, 2020

goodmami commented Apr 20, 2020

rafaelanchieta commented Apr 20, 2020

goodmami commented Apr 21, 2020

goodmami commented Apr 27, 2020

goodmami commented Apr 27, 2020

rafaelanchieta commented Apr 28, 2020

goodmami commented Apr 28, 2020

rafaelanchieta commented Apr 29, 2020

goodmami commented Apr 30, 2020

goodmami commented May 6, 2020

rafaelanchieta commented May 8, 2020