# The Reader
This section walks through building a reader for the __E__xtensible __D__ata 
__N__otation (EDN). Most of the descriptions on this page are taken directly from 
the [edn-format][1] spec. Although the spec itself is fairly informal, it appears to
provide enough detail to implement a satisfactory reader.

> __edn__ is an extensible data notation. A superset of edn is used by Clojure to 
> represent programs, and it is used by Datomic and other applications as a data 
> transfer format. This spec describes edn in isolation from those and other specific
> use cases, to help facilitate implementation of readers and writers in other 
> languages, and for other uses.
> 
> edn supports a rich set of built-in elements, and the definition of extension 
> elements in terms of the others. Users of data formats without such facilities
> must rely on either convention or context to convey elements not included in the 
> base set. This greatly complicates application logic, betraying the apparent 
> simplicity of the format. edn is simple, yet powerful enough to meet the demands of
> applications without convention or complex context-sensitive logic.
> 
> edn is a system for the conveyance of values. It is not a type system, and has no 
> schemas. Nor is it a system for representing objects - there are no reference types, 
> nor should a consumer have an expectation that two equivalent elements in some body of
> edn will yield distinct object identities when read, unless a reader implementation
> goes out of its way to make such a promise. Thus the resulting values should be 
> considered immutable, and a reader implementation should yield values that ensure this,
> to the extent possible.
> 
> edn is a set of definitions for acceptable elements. A use of edn might be a stream or
> file containing elements, but it could be as small as the conveyance of a single element
> in e.g. an HTTP query param.
> 
> There is no enclosing element at the top level. Thus edn is suitable for streaming and
> interactive applications.
> 
> The base set of elements in edn is meant to cover the basic set of data structures 
> common to most programming languages. While edn specifies how those elements are formatted
> in text, it does not dictate the representation that results on the consumer side. A well 
> behaved reader library should endeavor to map the elements to programming language types 
> with similar semantics.

[1]: https://github.com/edn-format/edn

## Notes
The reader we are going to build is a recursive descent parser. It accepts all valid forms
of EDN, and it also recognizes many common errors so that it can give the user helpful error messages.
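To make the term concrete, here is a minimal sketch of recursive descent on a toy grammar (nested, parenthesized integers). It is separate from the reader built in this section, and the names `parse_form` and `parse_list` are hypothetical. Each grammar rule becomes one function, and nesting is handled by the functions calling each other.

```python
def parse_form(tokens, pos=0):
    """Parse one form starting at tokens[pos]; return (value, next_pos)."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok == '(':
        # A list is parsed by its own rule, which recurses back into parse_form
        return parse_list(tokens, pos + 1)
    if tok == ')':
        raise ValueError(f"unmatched ')' at token {pos}")
    return int(tok), pos + 1


def parse_list(tokens, pos):
    """Parse forms until the closing ')'; return (tuple_of_values, next_pos)."""
    items = []
    while pos < len(tokens) and tokens[pos] != ')':
        value, pos = parse_form(tokens, pos)
        items.append(value)
    if pos >= len(tokens):
        raise ValueError("EOF in middle of list")
    return tuple(items), pos + 1


# Toy input: tokens are pre-split on spaces to keep the sketch short
result, _ = parse_form('( 1 ( 2 3 ) 4 )'.split())
```

Raising a specific error at the point where the grammar is violated is what lets a parser of this shape produce targeted messages such as "EOF in middle of list".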

## Character Stream
First we need a way to read the character stream while keeping track of our position in the source. We also
need to be able to push characters back onto the stream.

In [536]:
#export
from collections import namedtuple
from collections.abc import Iterator
from nbdev import patch
from functools import reduce

import re

LineInfo = namedtuple('LineInfo', ['line', 'col'])
LineRange = namedtuple('LineRange', ['start', 'end'])

class Character(str):
    def __new__(cls, val, *args, **kwargs):
        return super().__new__(cls, val)

    def __init__(self, val, line: int, col: int):
        self.lineinfos = []
        for v in val:
            self.lineinfos.append(LineInfo(line, col))
            if v == '\n':
                line += 1
                col = 0
            else:
                col += 1
                
    @property
    def col(self):
        return self.lineinfo().col
    
    @property
    def line(self):
        return self.lineinfo().line
        
    def __getitem__(self, idx):
        if isinstance(idx, slice):
            chars = [Character(str.__getitem__(self,i), *self.lineinfos[i])
                     for i in range(*idx.indices(str.__len__(self)))]
            # An empty slice has no position info, so return a plain empty string
            if not chars:
                return ''
            return reduce(lambda s, a: s + a, chars[1:], chars[0])
        return Character(super().__getitem__(idx), *self.lineinfos[idx])
    
    def lineinfo(self, idx=0):
        return self.lineinfos[idx]
    
    def __add__(self, ch):
        return Character(super().__add__(ch), *self.lineinfo())


class PushBackCharStream:
    def __init__(self, chars):
        self.iterator = chars if isinstance(chars, Iterator) else iter(chars)
        self.pushed_back = []
        self.line = 0
        self.col = 0
        self.line_history = [0]
        self.eof_info = LineInfo(0, 0)
        self.__reached_end = False

    def __iter__(self):
        return self

    def __next__(self):
        if self.__reached_end:
            raise StopIteration
            
        if self.pushed_back:
            char = self.pushed_back.pop()
        else:
            try:
                char = next(self.iterator)
            except StopIteration:
                self.__reached_end = True
                return None

            self.eof_info = LineInfo(self.line, self.col)
        char = Character(char, self.line, self.col)

        # advance character position
        if char == '\n':
            self.line += 1
            self.col = 0
        else:
            self.col += 1

        # line_history[n] holds the last column reached on line n. When a
        # newline is pushed back, push_back_single uses it to restore the
        # column on the previous line.
        if len(self.line_history) <= self.line:
            self.line_history.append(0)
        self.line_history[self.line] = self.col

        return char

    def push_back_single(self, char: str):
        if char is None:
            return
        
        self.pushed_back.append(char)

        # Reverse the position
        if self.col == 0:
            self.line -= 1
            self.col = self.line_history[self.line]
        else:
            self.col -= 1
        
        # Reset EOF since we know we are not there anymore
        self.__reached_end = False
            
    def push_back(self, chars: str):
        if chars is None:
            self.__reached_end = False
            return
        
        for c in reversed(chars):
            self.push_back_single(c)

    @property
    def empty(self):
        return len(self.pushed_back) == 0 and self.__reached_end

In [537]:
# test
stream = PushBackCharStream('123456789')
assert list(stream) == list(iter('123456789')) + [None], "Pushback should mimic string Iterator with None at end"

stream = PushBackCharStream('123\n456789')
assert next(stream) == '1', "Pushback should return chars in order"

assert next(stream).col == 1, "Characters should provide column info"

assert next(stream).line == 0, "Characters should provide line info"
assert next(stream).line == 0, "Characters should provide line info"
assert next(stream).line == 1, "Characters should provide line info"

stream.push_back('4')
assert next(stream) == '4', "Should be able to push back characters"

stream.push_back('4'), stream.push_back('\n')
assert next(stream).col == 3, "Should be able to push back characters and keep column info"

assert next(stream).line == 1, "Should be able to push back characters and keep line info"
stream.push_back('4'), stream.push_back('\n')
assert next(stream).line == 0, "Should be able to push back characters and keep line info"


stream = PushBackCharStream('123')
s = next(stream) + next(stream) + next(stream)
stream.push_back(s)
assert list(stream) == list(iter('123')) + [None], "Should be able to pushback multiple"

stream = PushBackCharStream('123456789')
char_str = next(stream) + next(stream) + next(stream) + next(stream)
assert list(char_str[1:2]) == list(iter('2')), "Can slice and dice"

The reader in the next section won't work until we implement at least the first rule: reading symbols.

## EDN Spec
This section builds up the reader. The design is simple: iterate through all
characters, ignore whitespace, and dispatch to a reader function.

In [612]:
# export

# reader states to run based on character
dispatch = {}

READ_EOF = 'READ_EOF'
READ_FINISHED = 'READ_FINISHED'

def read(stream_or_str, sentinel=None):
    
    if isinstance(stream_or_str, str):
        stream = PushBackCharStream(stream_or_str)
    else:
        stream = stream_or_str
    
    for ch in stream:
        # Check for EOF first: isWhitespace(None) would raise a TypeError
        if ch is None: return READ_EOF
        if Character.isWhitespace(ch): continue
        if ch == sentinel: return READ_FINISHED
        
        lookahead = next(stream)
        
        stream.push_back(lookahead)
        stream.push_back(ch)

        if Character.isNumberLiteral(ch, lookahead): 
            return read_number(stream)
        elif ch in dispatch: 
            return dispatch[ch](stream)
        else: 
            return read_symbol(stream)

### General considerations
__edn__ elements, streams and files should be encoded using UTF-8.

Elements are generally separated by whitespace. Whitespace, other than within strings, is not otherwise significant, 
nor need redundant whitespace be preserved during transmissions. Commas `,` are also considered whitespace, other than within strings.

The delimiters `{ } ( ) [ ]` need not be separated from adjacent elements by whitespace.
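As a rough illustration of these two rules, the sketch below splits text into elements, treating commas as whitespace and emitting each delimiter as its own token even when nothing separates it from its neighbours. The name `split_elements` is hypothetical, and the sketch deliberately ignores strings (where whitespace and commas are significant), so it is not a substitute for the reader.

```python
WHITESPACE = set(' \t\r\n,')   # commas count as whitespace
DELIMITERS = set('{}()[]')


def split_elements(text):
    """Split text on whitespace and delimiters; delimiters become tokens."""
    tokens, current = [], ''
    for ch in text:
        if ch in WHITESPACE or ch in DELIMITERS:
            # Either kind of character ends the token being collected
            if current:
                tokens.append(current)
                current = ''
            if ch in DELIMITERS:
                tokens.append(ch)
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


compact = split_elements('(a(b)c)')
spaced = split_elements('[1,2 ,,3]')
```

Both `[1,2 ,,3]` and `[1 2 3]` produce the same tokens, which is why redundant whitespace does not need to be preserved.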



### symbols
Symbols are used to represent identifiers, and should map to something other than strings, if possible.

Symbols begin with a non-numeric character and can contain alphanumeric characters and `. * + ! - _ ? $ % & = < >`. If `-`, `+` or `.` are the first character, the second character (if any) must be non-numeric. Additionally, `: #` are allowed as constituent characters in symbols other than as the first character.

`/` has special meaning in symbols. It can be used once only in the middle of a symbol to separate the prefix (often a namespace) from the name, e.g. `my-namespace/foo`. `/` by itself is a legal symbol, but otherwise neither the prefix nor the name part can be empty when the symbol contains `/`.

If a symbol has a prefix and `/`, the following name component should follow the first-character restrictions for symbols as a whole. This is to avoid ambiguity in reading contexts where prefixes might be presumed as implicitly included namespaces and elided thereafter.
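The rules above can be sketched as a standalone validator. This is one strict reading of the text, not the check the reader uses: `is_valid_symbol` and `is_valid_part` are hypothetical names, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import re

# First character: non-numeric, and neither ':' nor '#'
FIRST = r"[A-Za-z.*+!\-_?$%&=<>]"
# Later characters: digits, ':' and '#' are allowed as well
REST = r"[A-Za-z0-9.*+!\-_?$%&=<>:#]"
PART = re.compile(rf"{FIRST}{REST}*\Z")


def is_valid_part(part):
    """Check a prefix or name against the first-character restrictions."""
    if not PART.match(part):
        return False
    # After a leading '-', '+' or '.', the second character must be non-numeric
    if part[0] in '-+.' and len(part) > 1 and part[1].isdigit():
        return False
    return True


def is_valid_symbol(token):
    if token == '/':
        return True                      # '/' by itself is legal
    prefix, slash, name = token.partition('/')
    if not slash:
        return is_valid_part(token)
    # '/' may appear once, with a non-empty prefix and name on either side
    return is_valid_part(prefix) and is_valid_part(name)


examples = {t: is_valid_symbol(t) for t in ['my-namespace/foo', '+1', 'a/b/c']}
```

Under this reading `ns//` is rejected because `/` may appear only once, whereas `parse_symbol` in this notebook accepts it.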

### Builtin Elements
#### nil  
nil represents nil, null or nothing. It should be read as an object with similar meaning on the target platform.

#### booleans
`true` and `false` should be mapped to booleans.

If a platform has canonic values for `true` and `false`, it is a further semantic of booleans that all instances of `true` yield that (identical) value, and similarly for `false`.

In [707]:
#export 

Character.isWhitespace = lambda ch: ch in ' \t\n,'
Character.isEnding     = lambda ch: ch in '";@^`()[]{}\\'
Character.isSpecial    = lambda ch: ch in '-+.'
Character.isNonNumeric = lambda ch: ch.isalpha() or ch in '.*+!-_?$%&=<>@:#'
Character.isStart      = lambda ch: (Character.isNonNumeric(ch) and not ch in ':#')

def isNumeric(ch, base=None):
    if base is None:
        return ch.isdigit()
    try:
        int(ch, base)
        return True
    except (TypeError, ValueError):
        # ValueError: not a digit in this base; TypeError: ch is None (EOF)
        return False    
Character.isNumeric = isNumeric

def isNumberLiteral(ch, lookahead):
    if Character.isNumeric(ch):
        return True
    elif ch == '-' or ch == '+':
        return Character.isNumeric(lookahead)
Character.isNumberLiteral = isNumberLiteral            

def info_location_str(info):
    start = f"{info.start.line}:{info.start.col}"
    end = f"{info.end.line}:{info.end.col}"
    return f"@{start}-{end}"

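class BaseNode:
    def __init__(self, linerange=None):
        self.info = linerange
        
    def __repr__(self):
        # Only show a location when the node was created with line info
        if getattr(self, 'info', None) is not None:
            return f"<{self.__class__.__name__}: '{str(self)}' {info_location_str(self.info)}>"
        return f"<{self.__class__.__name__}: '{str(self)}'>"
    
    def __hash__(self):
        return super().__hash__()

class ConstNode(BaseNode):
    value = None
       
    def __bool__(self):
        # __bool__ must return a bool, and NoneNode.value is None
        return bool(self.value)
    
    def __eq__(self, val):
        return self.value == val    
    
    def __hash__(self):
        # Defining __eq__ disables the inherited __hash__, so restore it
        return hash(self.value)
    
    def __str__(self):
        # Without this, str() falls back to __repr__, which calls str() again
        return str(self.value)
    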
    
class TrueNode(ConstNode):
    value = True
    
class FalseNode(ConstNode):
    value = False
        
class NoneNode(ConstNode):
    value = None

class StringNode(BaseNode, str):
    def __new__(cls, val, *args, **kwargs):
        return str.__new__(cls, val)

    def __init__(self, val, linerange=None):
        super().__init__(linerange)
        
    def __eq__(self, val):
        return super().__eq__(val)
    
    def __hash__(self):
        return str.__hash__(self)
    
class SymbolNode(StringNode):
    def __init__(self, val, namespace=None, linerange=None):
        # StringNode.__init__ takes (val, linerange); passing only linerange
        # would bind it to val and drop the location
        super().__init__(val, linerange)
        self.namespace = namespace
        
    def __eq__(self, val):
        if isinstance(val, SymbolNode):
            return super().__eq__(val) and self.namespace == val.namespace
        elif self.namespace is None:
            return super().__eq__(val)
        return False

    def __repr__(self):
        # Only show a location when the node was created with line info
        if self.info is not None:
            return f"<{self.__class__.__name__}: '{str(self)}' {info_location_str(self.info)}>"
        return f"<{self.__class__.__name__}: '{str(self)}'>"
    
    def __hash__(self):
        return str.__hash__(self)
        
def read_symbol(stream):
    
    token = read_token(stream)
    if token is None:
        # read_token found a delimiter before any symbol character
        raise Exception("Unexpected character: '{}'".format(next(stream)))
    linerange = LineRange(token.lineinfo(0), token.lineinfo(-1))
    
    # Special Symbols
    if token == 'nil': return NoneNode(linerange)
    elif token == 'true': return TrueNode(linerange)
    elif token == 'false': return FalseNode(linerange)
    elif token == '/': return SymbolNode('/', linerange=linerange)
    
    ns, sym = parse_symbol(token)
    return SymbolNode(sym, ns, linerange)
        
invalid_token = re.compile(r"(^::|.*:$)")
invalid_namespace = re.compile(r".*:$")
def parse_symbol(token):
    "Parses a string into a tuple of the namespace and symbol"
    if not token or invalid_token.match(token):
        raise Exception("Invalid symbol: '{}'".format(token))
    
    # If no namespace just return None as ns and token as symbol
    if token == '/' or '/' not in token:
        return None, token
    
    ns, sym = token.split('/', 1)
    
    if (sym and 
        not Character.isNumeric(sym[0]) and
        not invalid_namespace.match(ns) and
        (sym == '/' or 
         '/' not in sym)):
        return ns, sym
    raise Exception("Invalid symbol: '{}'".format(token))
    

def read_token(stream):
    token = None
    
    for ch in stream:
        if ch is None or Character.isWhitespace(ch):
            return token
        elif Character.isEnding(ch):
            stream.push_back(ch)
            return token
        elif token is None:
            token = ch
        else:
            token += ch
    return token

In [708]:
assert read("abc") == SymbolNode('abc')
assert read('true') == TrueNode()
assert read('false') == FalseNode()
assert read('nil') == NoneNode()
assert read('/') == SymbolNode('/')
assert read('ns/my-name') == SymbolNode('my-name', namespace='ns')
assert read('ns//') == SymbolNode('/', namespace='ns')

# Use try/except/else so the test fails when no exception is raised
try:
    read('invalid:')
except Exception as e:
    assert str(e) == "Invalid symbol: 'invalid:'"
else:
    assert False, "Should be invalid"
    
try:
    read('::invalid')
except Exception as e:
    assert str(e) == "Invalid symbol: '::invalid'"
else:
    assert False, "Should be invalid"
    
try:
    read('ns/double/slash')
except Exception as e:
    assert str(e) == "Invalid symbol: 'ns/double/slash'"    
else:
    assert False, "Should be invalid"

#### strings
Strings are enclosed in `"double quotes"`. May span multiple lines. Standard C/Java escape characters `\t`, `\r`, `\n`, `\\` and `\"` are supported.

In [709]:
def read_string(stream):
    first_char = next(stream) 
    assert first_char == '"', 'String must start with "'
    
    s = ''
    for ch in stream:
        if ch == '"':
            break
        elif ch == '\\':
            s += escape_char(stream)
        elif ch is None:
            raise Exception("EOF in middle of string")
        else:
            s += ch        
    return StringNode(s, LineRange(first_char.lineinfo(), ch.lineinfo()))

dispatch['"'] = read_string

escape_chars = {'t':'\t', 'r':'\r', 'n':'\n', '\\':'\\', '\"':'\"', 'b':'\b', 'f':'\f'}
def escape_char(stream):
    ch = next(stream)
    
    if ch is None:
        raise Exception("EOF in middle of string")
    
    # Normal character
    if ch in escape_chars:
        ch = escape_chars[ch]
    elif ch == 'u':                                    # Hex unicode escape
        # The result must be assigned, otherwise a literal 'u' is returned
        ch = read_unicode_char(stream, base=16, length=4)
    elif Character.isNumeric(ch):                      # Octal Unicode escape
        stream.push_back(ch)
        ch = read_unicode_char(stream, base=8, length=3)
    else:
        raise Exception("Invalid escape '\\{}'".format(ch))
    return ch

def read_unicode_char(stream, base, length):
    unicode_bytes = b'\\'
    
    if base == 16:
        unicode_bytes += b'u'

    for _ in range(length):
        ch = next(stream)
        if not Character.isNumeric(ch, base=base):
            raise Exception("Invalid unicode escape '{}'".format(ch))
        unicode_bytes += bytes(ch, 'utf-8')

    try:
        return unicode_bytes.decode('unicode-escape')
    except:
        raise Exception("Invalid unicode escape '{}'".format(unicode_bytes))

In [710]:
assert read('""') == '', "Can have empty string"

assert read('"abc"') == 'abc', 'Simple String'

assert read('"\\t\\r\\n\\\\\\\"\\b\\f"') == '\t\r\n\\"\b\f', 'Simple escapes'

# The backslash is doubled so the reader, not Python, decodes the escape
assert read('"\\u0021"') == '!', 'Hex unicode'

assert read('"\\041"') == '!', 'Octal unicode'

# Use try/except/else so the test fails when no exception is raised
try:
    read('"abc')
except Exception as e:
    assert str(e) == "EOF in middle of string"
else:
    assert False, "Should be invalid"
    
try:
    read('"\\ "')
except Exception as e:
    assert str(e) == "Invalid escape '\\ '"
else:
    assert False, "Should be invalid"
    
try:
    read('"\\u123Z"')
except Exception as e:
    assert str(e) == "Invalid unicode escape 'Z'"
else:
    assert False, "Should be invalid"
    
try:
    read('"\\128"')
except Exception as e:
    assert str(e) == "Invalid unicode escape '8'"
else:
    assert False, "Should be invalid"


#### characters
Characters are preceded by a backslash: `\c`, `\newline`, `\return`, `\space` and `\tab` yield the corresponding characters. 
Unicode characters are represented with `\uNNNN` as in Java. Backslash cannot be followed by whitespace.

In [711]:
class CharacterNode(StringNode):
    pass

def read_char(stream):
    first_char = next(stream) 
    assert first_char == '\\', 'Character must start with \\'
    
    ch = next(stream)
    
    if ch is None:
        raise Exception("EOF in character")
        
    if Character.isWhitespace(ch):
        raise Exception("Backslash cannot be followed by whitespace")
    
    if Character.isEnding(ch):
        token = ch
    else:
        stream.push_back(ch)
        token = read_token(stream)
    
    if len(token) == 1:        ch = token
    elif token == "newline":   ch = '\n'
    elif token == 'space':     ch = ' '
    elif token == 'tab':       ch = '\t'
    elif token == 'backspace': ch = '\b'
    elif token == 'formfeed':  ch = '\f'
    elif token == 'return':    ch = '\r'
    elif token.startswith('u'):
        stream.push_back(token[1:])
        ch = read_unicode_char(stream, base=16, length=4)
    elif token.startswith('o'):
        stream.push_back(token[1:])
        ch = read_unicode_char(stream, base=8, length=len(token)-1)
    else:
        raise Exception("Invalid character escape '{}'".format(token))
    
    linerange = LineRange(first_char.lineinfo(0), token.lineinfo(-1))
    return CharacterNode(ch, linerange)

dispatch['\\'] = read_char

In [712]:
assert read('\\c') == 'c', 'Simple character'

assert read('\\newline')   == '\n', 'Newline character'
assert read('\\space')     == ' ',  'Space character'
assert read('\\tab')       == '\t', 'Tab character'
assert read('\\backspace') == '\b', 'Backspace character'
assert read('\\formfeed')  == '\f', 'Formfeed character'
assert read('\\return')    == '\r', 'Carriage Return character'

assert read('\\u0021')     == '!',  'Hex unicode'
assert read('\\o41')       == '!',  'Octal unicode'

#### keywords
Keywords are identifiers that typically designate themselves. They are semantically akin to enumeration values. Keywords follow the rules of symbols, except they can (and must) begin with `:`, e.g. `:fred` or `:my/fred`. If the target platform does not have a keyword type distinct from a symbol type, the same type can be used without conflict, since the mandatory leading `:` of keywords is disallowed for symbols. Per the symbol rules above, `:/` and `:/anything` are not legal keywords. A keyword cannot begin with `::`

If the target platform supports some notion of interning, it is a further semantic of keywords that all instances of the same keyword yield the identical object.
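Python has no built-in keyword type, but the interning semantic can be approximated with a cache in `__new__`. The `Keyword` class below is a hypothetical sketch, separate from the `KeywordNode` used by this reader, and it is not thread-safe.

```python
class Keyword:
    """Interned keyword: equal (namespace, name) pairs share one object."""
    _interned = {}

    def __new__(cls, name, namespace=None):
        key = (namespace, name)
        existing = cls._interned.get(key)
        if existing is None:
            # First time this keyword is seen: create and remember it
            existing = super().__new__(cls)
            existing.name = name
            existing.namespace = namespace
            cls._interned[key] = existing
        return existing

    def __repr__(self):
        if self.namespace:
            return f":{self.namespace}/{self.name}"
        return f":{self.name}"


a = Keyword('fred')
b = Keyword('fred')
c = Keyword('fred', namespace='my')
```

Here `a is b` holds, while `c` is a different object. A node that records source positions cannot be interned this way, because each occurrence has its own line information; that is a trade-off between identity semantics and error reporting.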

In [713]:
class KeywordNode(SymbolNode):
    pass

def read_keyword(stream):
    initch = next(stream)
    assert initch == ':', 'Keyword must start with a colon'
    
    ch = next(stream)
    # A colon at EOF arrives as None, which isWhitespace cannot handle
    if ch is None or Character.isWhitespace(ch):
        raise Exception('Single colon not allowed')
    
    stream.push_back(ch)
    token = read_token(stream)
    ns, kw = parse_symbol(token)
    
    # The token excludes the leading colon, so a remaining ':' means '::'
    if token.startswith(':'):
        raise Exception('Namespace alias not supported')
    
    # Start the range at the colon so it covers the whole keyword
    linerange = LineRange(initch.lineinfo(), token.lineinfo(-1))
    return KeywordNode(kw, ns, linerange)
        
dispatch[':'] = read_keyword

In [714]:
assert read(':abc') == KeywordNode('abc'), 'Simple'
assert read(':abc') == 'abc', 'Compares with strings'

assert read(':ns/abc') == KeywordNode('abc', namespace='ns'), 'Keywords can have namespace'

# Use try/except/else so the test fails when no exception is raised
try:
    read(': ')
except Exception as e:
    assert str(e) == 'Single colon not allowed'
else:
    assert False, "Should throw exception"

try:
    read('::abc/my-symbol')
except Exception as e:
    assert str(e) == 'Namespace alias not supported'    
else:
    assert False, "Should throw exception"

#### integers
Integers consist of the digits 0 - 9, optionally prefixed by - to indicate a negative number, or (redundantly) by +. No integer other than 0 may begin with 0. 64-bit (signed integer) precision is expected. An integer can have the suffix N to indicate that arbitrary precision is desired. -0 is a valid integer not distinct from 0
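Read literally, this paragraph describes a much narrower grammar than the `int_pattern` used by the reader in this section, which also accepts hex, octal and radix forms. The sketch below is a strict interpretation under two assumptions: the hypothetical `parse_strict_int` rejects values outside the signed 64-bit range unless the `N` suffix is present (the spec only says that precision is "expected"), and Python's unbounded `int` stands in for arbitrary precision.

```python
import re

# sign, then either a lone 0 or a number without a leading 0, then optional N
STRICT_INT = re.compile(r"([-+]?)(0|[1-9][0-9]*)(N?)\Z")


def parse_strict_int(text):
    m = STRICT_INT.match(text)
    if m is None:
        raise ValueError(f"Invalid integer: {text!r}")
    sign, digits, big = m.groups()
    value = int(digits)
    if sign == '-':
        value = -value
    # Assumption: without the N suffix the value must fit in 64 signed bits
    if not big and not -2**63 <= value < 2**63:
        raise ValueError(f"Integer out of 64-bit range: {text!r}")
    return value


values = [parse_strict_int(t) for t in ['42', '+7', '-0', '9223372036854775808N']]
```

With this pattern `052` is an error rather than octal 42, because no integer other than 0 may begin with 0.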

In [715]:
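class IntegerNode(BaseNode, int):
    def __new__(cls, val, *args, **kwargs):
        return int.__new__(cls, val)

    def __init__(self, val, linerange=None):
        super().__init__(linerange)
        
    def __eq__(self, val):
        return super().__eq__(val)

    def __hash__(self):
        # Defining __eq__ disables the inherited __hash__ (needed for map keys)
        return int.__hash__(self)

    def __str__(self):
        # Python 3.8+ int has no __str__, so str() would recurse via __repr__
        return int.__repr__(self)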

int_pattern = re.compile(r"^([-+]?)(?:(0)|([1-9][0-9]*)|0[xX]([0-9A-Fa-f]+)|0([0-7]+)|([1-9][0-9]?)[rR]([0-9A-Za-z]+)|0[0-9]+)(N)?$")

def read_number(stream):
    s = next(stream)
    
    for ch in stream:
        if ch is None or Character.isWhitespace(ch) or Character.isEnding(ch):
            stream.push_back(ch)
            return match_number(s)
        else:
            s += ch

def match_number(s):
    lineinfo = LineRange(s.lineinfo(), s.lineinfo(-1))
    if int_pattern.match(s):
        return IntegerNode(match_int(s), lineinfo)
    elif float_pattern.match(s):
        return FloatNode(match_float(s), lineinfo)
    elif ratio_pattern.match(s):
        return FloatNode(match_ratio(s), lineinfo)
    # Report unrecognized literals instead of silently returning None
    raise Exception("Invalid number: '{}'".format(s))
    
def match_int(s):
    m = int_pattern.match(s).groups()
    if m[1] is not None:
        return 0
    
    negate = m[0] == '-'
    
    if m[2] is not None:
        base = 10
        n = m[2]
    elif m[3] is not None:
        base = 16
        n = m[3]
    elif m[4] is not None:
        base = 8
        n = m[4]
    elif m[6] is not None:
        base = int(m[5])
        n = m[6]
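    else:
        # Only the bare 0[0-9]+ alternative matches without a capture group,
        # e.g. '09', which is not a valid octal literal
        raise Exception("Invalid number: '{}'".format(s))
        
    number = int(n, base)
    number = -1 * number if negate else number
    return number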

In [716]:
assert read('0') == 0
assert read('+0') == 0
assert read('-0') == 0

assert read('42') == 42
assert read('052') == 42
assert read('8r52') == 42
assert read('0x2a') == 42
assert read('36r16') == 42
assert read('2r101010') == 42

#### floating point numbers
64-bit (double) precision is expected.

In [717]:
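class FloatNode(BaseNode, float):
    def __new__(cls, val, *args, **kwargs):
        return float.__new__(cls, val)

    def __init__(self, val, linerange=None):
        super().__init__(linerange)
        
    def __eq__(self, val):
        return super().__eq__(val)

    def __hash__(self):
        # Defining __eq__ disables the inherited __hash__ (needed for map keys)
        return float.__hash__(self)

    def __str__(self):
        # Python 3.8+ float has no __str__, so str() would recurse via __repr__
        return float.__repr__(self)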

float_pattern = re.compile(r"^([-+]?[0-9]+(\.[0-9]*)?([eE][-+]?[0-9]+)?)(M)?$")
ratio_pattern = re.compile(r"^([-+]?[0-9]+)/([0-9]+)$")

def match_float(s):
    m = float_pattern.match(s).groups()
    if m[3] is not None:
        return float(m[0])   # TODO: Should we support exact precision? This would be decimal.Decimal
    else:
        return float(s)
    
def match_ratio(s):
    m = ratio_pattern.match(s).groups()
    numerator = m[0][1:] if m[0].startswith('+') else m[0]
    denominator = m[1]    
    
    # Ratios are read as floats here; fractions.Fraction would keep them exact
    return int(numerator) / int(denominator)


In [718]:
assert read('34.1') == 34.1
assert read('3e5')  == 3e5
assert read('1/2')  == 0.5
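The TODO in `match_float` and the float result of `match_ratio` both lose exactness. As a sketch of the alternative, the standard library's `decimal.Decimal` and `fractions.Fraction` can carry the `M` suffix and ratios without rounding. The function `parse_exact` is hypothetical and is not wired into `match_number`; the patterns are copies of the ones above with non-capturing groups. As far as I can tell, ratios come from the Clojure reader rather than from the edn spec.

```python
import re
from decimal import Decimal
from fractions import Fraction

FLOAT = re.compile(r"([-+]?[0-9]+(?:\.[0-9]*)?(?:[eE][-+]?[0-9]+)?)(M)?\Z")
RATIO = re.compile(r"([-+]?[0-9]+)/([0-9]+)\Z")


def parse_exact(text):
    m = RATIO.match(text)
    if m:
        # Fraction normalizes the ratio and keeps it exact
        return Fraction(int(m.group(1)), int(m.group(2)))
    m = FLOAT.match(text)
    if m:
        # An M suffix asks for exact precision; otherwise use a 64-bit float
        return Decimal(m.group(1)) if m.group(2) else float(m.group(1))
    raise ValueError(f"Invalid number: {text!r}")


half = parse_exact('1/2')
tenth = parse_exact('0.1M')
plain = parse_exact('34.1')
```

Note that `FLOAT` also matches plain integers such as `42`, so a caller would need to try the integer pattern first, as `match_number` does.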

#### lists
A list is a sequence of values. Lists are represented by zero or more elements enclosed in parentheses `()`. Note that lists can be heterogeneous.

In [719]:
class ListNode(BaseNode, tuple):
    def __new__(cls, val, *args, **kwargs):
        return tuple.__new__(cls, val)
    
    def __init__(self, vals, linerange=None):
        self.info = linerange
        
    def __repr__(self):
        items = '['+ ', '.join(repr(child) for child in self) + ']'
        return f"<ListNode {items} {info_location_str(self.info)}>"

def read_list(stream):
    forms, linerange = read_delimited(stream, sentinel=')')
    return ListNode(forms, linerange)

def read_delimited(stream, sentinel):
    initch = next(stream)
    forms = []
    while True:
        form = read(stream, sentinel)
        # Compare by identity: a symbol named READ_EOF would compare equal
        # to the marker string
        if form is READ_EOF:
            raise Exception("EOF in middle of list")
        elif form is READ_FINISHED:
            return forms, LineRange(initch.lineinfo(), stream.eof_info)
        else:
            forms.append(form)

dispatch['('] = read_list
        

In [720]:
assert read('(1 2 3)') == (1, 2, 3)

read('(1 "abc", 3, (1, 2 3))')

<ListNode [<IntegerNode: '1' @0:1-0:1>, <StringNode: 'abc' @0:3-0:7>, <IntegerNode: '3' @0:10-0:10>, <ListNode [<IntegerNode: '1' @0:14-0:14>, <IntegerNode: '2' @0:17-0:17>, <IntegerNode: '3' @0:19-0:19>] @0:13-0:20>] @0:0-0:21>

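#### vectors
A vector is a sequence of values that supports random access. Vectors are represented by zero or more elements enclosed in square brackets `[]`. Note that vectors can be heterogeneous.

In [721]: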
class VectorNode(BaseNode, list):    
    def __init__(self, vals, linerange=None):
        self.info = linerange
        list.__init__(self, vals)
        
    def __repr__(self):
        items = '['+ ', '.join(repr(child) for child in self) + ']'
        return f"<VectorNode {items} {info_location_str(self.info)}>"

def read_vector(stream):
    forms, linerange = read_delimited(stream, sentinel=']')
    return VectorNode(forms, linerange)
dispatch['['] = read_vector


In [722]:
read('[1 2 3 4 ]')

<VectorNode [<IntegerNode: '1' @0:1-0:1>, <IntegerNode: '2' @0:3-0:3>, <IntegerNode: '3' @0:5-0:5>, <IntegerNode: '4' @0:7-0:7>] @0:0-0:9>

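#### maps
A map is a collection of associations between keys and values. Maps are represented by zero or more key and value elements enclosed in curly braces `{}`. Each key should appear at most once. No semantics should be associated with the order in which the pairs appear.

In [731]: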
class MapNode(BaseNode, dict):
    def __init__(self, vals, linerange=None):
        self.info = linerange
        dict.__init__(self, vals)
        
    def __repr__(self):
        return "MAP NODE"
        #items = '{'+ ', '.join(f"{repr(k)}:{repr(v)}" for k,v in self.items()) + '}'
        #return f"<MapNode {items} {info_location_str(self.info)}>" 
    
def read_map(stream):
    forms, linerange = read_delimited(stream, sentinel='}')
    if len(forms) % 2 != 0:
        raise Exception("Map must have value for every key")
    # Group the flat list of forms into (key, value) pairs
    pairs = [forms[i:i+2] for i in range(0, len(forms), 2)]
    return MapNode(pairs, linerange)
dispatch['{'] = read_map

In [732]:
read('{a 1 b 3}')

MAP NODE

#### # dispatch character
Tokens beginning with `#` are reserved. The character following `#` determines the behavior. The dispatches `#{` (sets), `#_` (discard), `#alphabetic-char` (tag) are defined below. `#` is not a delimiter.
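The reader in this section does not implement `#` dispatch yet. As a starting point, the decision described above can be sketched as a function of the character that follows `#`. The name `classify_dispatch` and its return values are hypothetical, and the sketch only classifies a token; it does not read the set, discard the next element, or apply a tag.

```python
def classify_dispatch(token):
    """Return which '#' dispatch a token starts, based on its second character."""
    if not token.startswith('#') or len(token) < 2:
        raise ValueError(f"Not a dispatch token: {token!r}")
    nxt = token[1]
    if nxt == '{':
        return 'set'
    if nxt == '_':
        return 'discard'
    if nxt.isalpha():
        return 'tag'
    # Everything else beginning with '#' is reserved
    raise ValueError(f"Reserved dispatch: {token!r}")


kinds = [classify_dispatch(t) for t in ['#{1 2}', '#_foo', '#inst']]
```

In the reader itself this would most likely become a `dispatch['#']` entry that peeks at the next character of the stream and hands off to a set, discard or tag reader.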
