sgf-parsing: implement exercise #1359

Merged
merged 5 commits into from
Apr 3, 2018
11 changes: 11 additions & 0 deletions config.json
@@ -1305,6 +1305,17 @@
        "lists"
      ]
    },
    {
      "uuid": "0d6325d1-c0a3-456e-9a92-cea0559e82ed",
      "slug": "sgf-parsing",
      "core": false,
      "unlocked_by": null,
      "difficulty": 7,
      "topics": [
        "parsing",
        "trees"
      ]
    },
    {
      "uuid": "e7351e8e-d3ff-4621-b818-cd55cf05bffd",
      "slug": "accumulate",
110 changes: 110 additions & 0 deletions exercises/sgf-parsing/README.md
@@ -0,0 +1,110 @@
# SGF Parsing

Parsing a Smart Game Format string.

[SGF](https://en.wikipedia.org/wiki/Smart_Game_Format) is a standard format for
storing board game files, in particular Go.

SGF is a fairly simple format. An SGF file usually contains a single
tree of nodes where each node is a property list. The property list
contains key-value pairs; each key can occur only once but may have
multiple values.

An SGF file may look like this:

```text
(;FF[4]C[root]SZ[19];B[aa];W[ab])
```

This is a tree with three nodes:

- The top level node has three properties: FF\[4\] (key = "FF", value =
  "4"), C\[root\] (key = "C", value = "root") and SZ\[19\] (key = "SZ",
  value = "19"). (FF indicates the version of SGF, C is a comment and
  SZ is the size of a square board.)
- The top level node has a single child which has a single property:
  B\[aa\]. (Black plays on the point encoded as "aa", which is the
  1-1 point, a stupid place to play.)
- The B\[aa\] node has a single child which has a single property:
  W\[ab\].
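
The nesting described above can be pictured as plain data. This is only a sketch: the `node` helper and the dictionary layout are hypothetical and are not the `SgfTree` class the exercise provides.

```python
# Hypothetical helper: a node is its properties plus a list of child nodes.
def node(properties, children=None):
    return {"properties": properties, "children": children or []}


# (;FF[4]C[root]SZ[19];B[aa];W[ab]) as a chain of single-child nodes.
tree = node(
    {"FF": ["4"], "C": ["root"], "SZ": ["19"]},
    [node({"B": ["aa"]}, [node({"W": ["ab"]})])],
)


def depth(n):
    """Number of nodes on the longest path from n down to a leaf."""
    return 1 + max([depth(child) for child in n["children"]] or [0])
```

Every value is a list, even when a key has only one value, so single and multiple values are handled the same way.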

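As you can imagine, an SGF file contains a lot of nodes with a single
child, which is why there's a shorthand for it: consecutive nodes
separated by `;` form a chain in which each node is the only child of
the node before it.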

SGF can encode variations of play. Go players do a lot of backtracking
in their reviews (let's try this, doesn't work, let's try that) and SGF
supports variations of play sequences. For example:

```text
(;FF[4](;B[aa];W[ab])(;B[dd];W[ee]))
```

Here the root node has two variations. The first (which by convention
indicates what's actually played) is where black plays on 1-1. Black was
sent this file by his teacher who pointed out a more sensible play in
the second child of the root node: `B[dd]` (4-4 point, a very standard
opening to take the corner).
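
As a rough illustration, the variations above could be held in memory as a root with two children. The `node` and `main_line` helpers are hypothetical names used only for this sketch.

```python
# Hypothetical helper: a node is its properties plus a list of child nodes.
def node(properties, children=None):
    return {"properties": properties, "children": children or []}


# (;FF[4](;B[aa];W[ab])(;B[dd];W[ee])): one child of the root per variation.
tree = node(
    {"FF": ["4"]},
    [
        node({"B": ["aa"]}, [node({"W": ["ab"]})]),
        node({"B": ["dd"]}, [node({"W": ["ee"]})]),
    ],
)


def main_line(n):
    """Follow the first child at every step (by convention, the game as played)."""
    line = [n["properties"]]
    while n["children"]:
        n = n["children"][0]
        line.append(n["properties"])
    return line
```

Calling `main_line(tree)` walks the first variation only; the teacher's suggestion stays available as `tree["children"][1]`.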

A key can have multiple values associated with it. For example:

```text
(;FF[4];AB[aa][ab][ba])
```

Here `AB` (add black) is used to add three black stones to the board.
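
One way to picture this is a key mapped to a list of values. The sketch below splits a single property with regular expressions; `split_property` is a hypothetical helper, and it deliberately ignores escaped `\]` characters, which a full solution has to handle.

```python
import re


def split_property(text):
    """Split e.g. 'AB[aa][ab][ba]' into ('AB', ['aa', 'ab', 'ba']).

    Simplification for illustration: values must not contain ']'.
    """
    match = re.match(r"([A-Z]+)((?:\[[^\]]*\])+)$", text)
    if match is None:
        raise ValueError("not a valid property: {!r}".format(text))
    key, raw_values = match.groups()
    # Each bracketed group is one value of the same key.
    return key, re.findall(r"\[([^\]]*)\]", raw_values)
```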

There are a few more complexities to SGF (and parsing in general), which
you can mostly ignore. You should assume that the input is encoded in
UTF-8. The tests won't contain a charset property, so don't worry about
that. Furthermore, you may assume that all newlines are Unix style (`\n`;
no `\r` or `\r\n` will be in the tests) and that the tests contain no
optional whitespace between properties, nodes, etc.

The exercise will have you parse an SGF string and return a tree
structure of properties. You do not need to encode knowledge about the
data types of properties, just use the rules for the
[text](http://www.red-bean.com/sgf/sgf4.html#text) type everywhere.
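
As a minimal sketch of what that means for this exercise, the hypothetical `unescape_text` helper below follows the behaviour the accompanying tests check: a backslash makes the next character literal, and newlines and tabs become spaces. It is an assumption based on those tests, not a full implementation of the SGF text type.

```python
def unescape_text(raw):
    """Decode a raw property value the way this exercise's tests expect."""
    out = []
    chars = iter(raw)
    for ch in chars:
        if ch == "\\":
            # A backslash makes the following character literal.
            ch = next(chars, "")
        # Newlines and tabs are flattened to a single space each.
        out.append(" " if ch in ("\n", "\t") else ch)
    return "".join(out)
```

For instance, `unescape_text("a\\]b")` keeps the closing bracket as part of the value instead of treating it as the end of the property.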

## Exception messages

Sometimes it is necessary to raise an exception. When you do this, you should include a meaningful error message to
indicate what the source of the error is. This makes your code more readable and helps significantly with debugging. Not
every exercise will require you to raise an exception, but for those that do, the tests will only pass if you include
a message.

To raise a message with an exception, just write it as an argument to the exception type. For example, instead of
`raise Exception`, you should write:

```python
raise Exception("Meaningful message indicating the source of the error")
```

## Running the tests

To run the tests, run the appropriate command below ([why they are different](https://github.com/pytest-dev/pytest/issues/1629#issue-161422224)):

- Python 2.7: `py.test sgf_parsing_test.py`
- Python 3.3+: `pytest sgf_parsing_test.py`

Alternatively, you can tell Python to run the pytest module (allowing the same command to be used regardless of Python version):
`python -m pytest sgf_parsing_test.py`

### Common `pytest` options

- `-v` : enable verbose output
- `-x` : stop running tests on first failure
- `--ff` : run failures from previous test before running other test cases

For other options, see `python -m pytest -h`

## Submitting Exercises

When submitting an exercise, make sure the solution is in the `$EXERCISM_WORKSPACE/python/sgf-parsing` directory.

You can find your Exercism workspace by running `exercism debug` and looking for the line that starts with `Workspace`.

For more detailed information about running tests, code style and linting,
please see the [help page](http://exercism.io/languages/python).

## Submitting Incomplete Solutions

It's possible to submit an incomplete solution so you can see how others have completed the exercise.
100 changes: 100 additions & 0 deletions exercises/sgf-parsing/example.py
@@ -0,0 +1,100 @@
class SgfTree(object):
    def __init__(self, properties=None, children=None):
        self.properties = properties or {}
        self.children = children or []

    def __eq__(self, other):
        if not isinstance(other, SgfTree):
            return False
        for k, v in self.properties.items():
            if k not in other.properties:
                return False
            if other.properties[k] != v:
                return False
        for k in other.properties.keys():
            if k not in self.properties:
                return False
        if len(self.children) != len(other.children):
            return False
        for a, b in zip(self.children, other.children):
            if not (a == b):
                return False
        return True

    def __repr__(self):
        """Ironically, encoding to SGF is much easier"""
        rep = '(;'
        for k, vs in self.properties.items():
            rep += k
            for v in vs:
                rep += '[{}]'.format(v)
        if self.children:
            if len(self.children) > 1:
                rep += '('
            for c in self.children:
                rep += repr(c)[1:-1]
            if len(self.children) > 1:
                rep += ')'
        return rep + ')'


def is_upper(s):
    a, z = map(ord, 'AZ')
    return all(
        a <= o and o <= z
        for o in map(ord, s)
    )


def parse(input_string):
    root = None
    current = None
    stack = list(input_string)

    def assert_that(condition):
        if not condition:
            raise ValueError(
                'invalid format at {}:{}: {}'.format(
                    repr(input_string),
                    len(input_string) - len(stack),
                    repr(''.join(stack))
                )
            )
    assert_that(stack)

    def pop():
        if stack[0] == '\\':
            stack.pop(0)
        ch = stack.pop(0)
        return ' ' if ch in '\n\t' else ch

    def peek():
        return stack[0]

    def pop_until(ch):
        v = ''
        while peek() != ch:
            v += pop()
        return v
    while stack:
        assert_that(pop() == '(' and peek() == ';')
        while pop() == ';':
            properties = {}
            while is_upper(peek()):
                key = pop_until('[')
                assert_that(is_upper(key))
                values = []
                while peek() == '[':
                    pop()
                    values.append(pop_until(']'))
                    pop()
                properties[key] = values
            if root is None:
                current = root = SgfTree(properties)
            else:
                current.children.append(SgfTree(properties))
                current = current.children[-1]  # descend into the new node
            while peek() == '(':
                child_input = pop() + pop_until(')') + pop()
                current.children.append(parse(child_input))
    return root
26 changes: 26 additions & 0 deletions exercises/sgf-parsing/sgf_parsing.py
@@ -0,0 +1,26 @@
class SgfTree(object):
    def __init__(self, properties=None, children=None):
        self.properties = properties or {}
        self.children = children or []

    def __eq__(self, other):
        if not isinstance(other, SgfTree):
            return False
        for k, v in self.properties.items():
            if k not in other.properties:
                return False
            if other.properties[k] != v:
                return False
        for k in other.properties.keys():
            if k not in self.properties:
                return False
        if len(self.children) != len(other.children):
            return False
        for a, b in zip(self.children, other.children):
            if not (a == b):  # Python 2 does not derive != from __eq__
                return False
        return True


def parse(input_string):
    pass
94 changes: 94 additions & 0 deletions exercises/sgf-parsing/sgf_parsing_test.py
@@ -0,0 +1,94 @@
import unittest

from sgf_parsing import parse, SgfTree


class SgfParsingTest(unittest.TestCase):
    def test_empty_input(self):
        input_string = ''
        with self.assertRaisesWithMessage(ValueError):
            parse(input_string)

    def test_tree_with_no_nodes(self):
        input_string = '()'
        with self.assertRaisesWithMessage(ValueError):
            parse(input_string)

    def test_node_without_tree(self):
        input_string = ';'
        with self.assertRaisesWithMessage(ValueError):
            parse(input_string)

    def test_node_without_properties(self):
        input_string = '(;)'
        expected = SgfTree()
        self.assertEqual(parse(input_string), expected)

    def test_single_node_tree(self):
        input_string = '(;A[B])'
        expected = SgfTree(properties={'A': ['B']})
        self.assertEqual(parse(input_string), expected)

    def test_properties_without_delimiter(self):
        input_string = '(;a)'
        with self.assertRaisesWithMessage(ValueError):
            parse(input_string)

    def test_all_lowercase_property(self):
        input_string = '(;a[b])'
        with self.assertRaisesWithMessage(ValueError):
            parse(input_string)

    def test_upper_and_lowercase_property(self):
        input_string = '(;Aa[b])'
        with self.assertRaisesWithMessage(ValueError):
            parse(input_string)

    def test_two_nodes(self):
        input_string = '(;A[B];B[C])'
        expected = SgfTree(
            properties={'A': ['B']},
            children=[
                SgfTree({'B': ['C']})
            ]
        )
        self.assertEqual(parse(input_string), expected)

    def test_two_child_trees(self):
        input_string = '(;A[B](;B[C])(;C[D]))'
        expected = SgfTree(
            properties={'A': ['B']},
            children=[
                SgfTree({'B': ['C']}),
                SgfTree({'C': ['D']}),
            ]
        )
        self.assertEqual(parse(input_string), expected)

    def test_multiple_property_values(self):
        input_string = '(;A[b][c][d])'
        expected = SgfTree(
            properties={'A': ['b', 'c', 'd']}
        )
        self.assertEqual(parse(input_string), expected)

    def test_escaped_property(self):
        input_string = '(;A[\]b\nc\nd\t\te \n\]])'
        expected = SgfTree(
            properties={'A': [']b c d  e  ]']}
        )
        self.assertEqual(parse(input_string), expected)

    # Utility functions
    def setUp(self):
        try:
            self.assertRaisesRegex
        except AttributeError:
            self.assertRaisesRegex = self.assertRaisesRegexp

    def assertRaisesWithMessage(self, exception):
        return self.assertRaisesRegex(exception, r".+")


if __name__ == '__main__':
    unittest.main()