## DSL Job2

Grammar structure:

```
{'toks': set(token), 'vars': dict(var: definition), 'hvar': var}
# 'hvar' is the head (start) nonterminal
token : (class, value)
class : str               # token class name, e.g. 'type'
value : str
var : str                 # nonterminal name
definition : list(rule)
rule : list(var | token)  # right-hand side of the rule; [] is the empty rule
```

**1. Getting rid of useless nonterminals**

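To get rid of useless nonterminals, we keep only the useful ones. A nonterminal is useful if at least one of its rules consists only of terminals and other useful nonterminals (an empty rule qualifies trivially). Because this definition is recursive, the set is built iteratively: start from the empty set and keep adding nonterminals until a full pass adds nothing new.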

In [1]:
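def getUseful(grammar):
  """Return the set of nonterminals that have at least one usable rule."""
  tokens = grammar['toks']
  variables = grammar['vars']  # named to avoid shadowing the built-in vars()
  useful = set()

  # Fixed-point iteration: repeat until a full pass adds nothing new.
  while True:
    added = False

    for var, rules in variables.items():
      if var not in useful:
        for rule in rules:
          # A rule qualifies if every symbol is a token or an already-useful nonterminal.
          if all(s in tokens or s in useful for s in rule):
            useful.add(var)
            added = True
            break

    if not added:
      return useful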

Then, to remove every rule that mentions a useless nonterminal, we keep only the rules that consist solely of terminals and useful nonterminals. The useless nonterminals themselves are dropped together with all of their rules.

In [2]:
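def removeUseless(grammar):
  """Return a copy of the grammar without useless nonterminals and the rules that use them."""
  tokens = grammar['toks']
  variables = grammar['vars']  # named to avoid shadowing the built-in vars()
  useful = getUseful(grammar)

  fixedVars = {}

  for var, rules in variables.items():
    if var in useful:
      fixedVars[var] = []

      for rule in rules:
        # Keep the rule only if it mentions no useless nonterminal.
        if all(s in tokens or s in useful for s in rule):
          fixedVars[var].append(rule)

  return {'toks': tokens, 'vars': fixedVars, 'hvar': grammar['hvar']}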
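
The test grammar further down contains no useless nonterminal, so the removal itself is never visible there. The following is a minimal sketch of a case where it is. The grammar `G`, the token `('t', 'a')`, and the snake_case names are made up for illustration; the two functions are compact restatements of the ones above so that the block runs on its own.

```python
def get_useful(grammar):
    """Fixed-point computation of nonterminals with at least one usable rule."""
    tokens, rules_by_var = grammar['toks'], grammar['vars']
    useful = set()
    changed = True
    while changed:
        changed = False
        for var, rules in rules_by_var.items():
            if var not in useful and any(
                all(s in tokens or s in useful for s in rule) for rule in rules
            ):
                useful.add(var)
                changed = True
    return useful


def remove_useless(grammar):
    """Drop useless nonterminals and every rule that mentions one."""
    tokens = grammar['toks']
    useful = get_useful(grammar)
    fixed = {
        var: [r for r in rules if all(s in tokens or s in useful for s in r)]
        for var, rules in grammar['vars'].items()
        if var in useful
    }
    return {'toks': tokens, 'vars': fixed, 'hvar': grammar['hvar']}


# Hypothetical grammar: 'C' only refers to itself, so it never derives a terminal string.
a = ('t', 'a')
G = {
    'toks': {a},
    'vars': {
        'S': [[a], ['C', a]],
        'C': [['C', a]],
    },
    'hvar': 'S',
}

cleaned = remove_useless(G)
print(cleaned['vars'])  # {'S': [[('t', 'a')]]}
```

Both `C` and the rule `S -> C a` disappear, while `S -> a` survives.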

**2. Identifying disappearing nonterminals**

A nonterminal is disappearing (it can derive the empty string) if at least one of its rules is empty or consists only of other disappearing nonterminals. As before, the set is computed iteratively until it stops growing.

In [7]:
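def getDisappearing(grammar):
  """Return the set of nonterminals that can derive the empty string."""
  variables = grammar['vars']  # named to avoid shadowing the built-in vars()
  disappearing = set()

  # Fixed-point iteration: repeat until a full pass adds nothing new.
  while True:
    added = False

    for var, rules in variables.items():
      if var not in disappearing:
        for rule in rules:
          # all() over an empty rule is True, so an empty rule qualifies immediately.
          if all(s in disappearing for s in rule):
            disappearing.add(var)
            added = True
            break

    if not added:
      return disappearing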
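
To illustrate how the property propagates through several rules, here is a small self-contained sketch. The grammar `G` and the token `('t', 'b')` are invented for the example, and `get_disappearing` is a compact restatement of the function above, not a different algorithm.

```python
def get_disappearing(grammar):
    """Fixed-point computation of nonterminals that can derive the empty string."""
    disappearing = set()
    changed = True
    while changed:
        changed = False
        for var, rules in grammar['vars'].items():
            if var not in disappearing and any(
                all(s in disappearing for s in rule) for rule in rules
            ):
                disappearing.add(var)
                changed = True
    return disappearing


# Hypothetical grammar used only for this illustration.
b = ('t', 'b')
G = {
    'toks': {b},
    'vars': {
        'S': [['A', 'B'], [b]],  # disappears once both A and B do
        'A': [[]],               # empty rule: disappears directly
        'B': [['A', 'A']],       # consists only of disappearing nonterminals
        'C': [['A', b]],         # always produces a terminal, never disappears
    },
    'hvar': 'S',
}

print(sorted(get_disappearing(G)))  # ['A', 'B', 'S']
```

`S` is only picked up on the second pass, after `A` and `B` have been added, which is why the loop has to run until nothing changes.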

Test:

In [9]:
GRAMMAR = {
  'toks': set([
      ('type', 'a1'),
      ('type', 'a2'),
      ('type2', 'b1'),
      ('type3', 'c1'),
  ]),
  'vars': {
      'S': [
        ['A', ('type2', 'b1')],
        ['B', ('type3', 'c1')],
        [('type', 'a1'), ('type', 'a2')],
      ],
      'A': [
        ['B']
      ],
      'B': [
        [('type2', 'b1')]
      ],
      'E': [[]],
      'K': [['E']]
  },
  'hvar': 'S'
}

print(removeUseless(GRAMMAR))
print(getDisappearing(GRAMMAR))

{'toks': {('type', 'a1'), ('type', 'a2'), ('type3', 'c1'), ('type2', 'b1')}, 'vars': {'S': [['A', ('type2', 'b1')], ['B', ('type3', 'c1')], [('type', 'a1'), ('type', 'a2')]], 'A': [['B']], 'B': [[('type2', 'b1')]], 'E': [[]], 'K': [['E']]}, 'hvar': 'S'}
{'K', 'E'}
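
Note that the `removeUseless` output above still contains `E` and `K`: both have usable rules, but neither can be reached from the head nonterminal `S`. If such symbols should be dropped as well, one possible extra pass is a reachability search from `hvar`. This is a sketch of an addition, not something the notebook implements; `get_reachable` and `remove_unreachable` are hypothetical names, and the code assumes that nonterminals are strings and tokens are tuples, as in the grammar structure at the top.

```python
def get_reachable(grammar):
    """Collect nonterminals reachable from the head nonterminal (depth-first)."""
    rules_by_var = grammar['vars']
    reachable = {grammar['hvar']}
    stack = [grammar['hvar']]
    while stack:
        var = stack.pop()
        for rule in rules_by_var.get(var, []):
            for sym in rule:
                # Assumption: nonterminals are str, tokens are (class, value) tuples.
                if isinstance(sym, str) and sym not in reachable:
                    reachable.add(sym)
                    stack.append(sym)
    return reachable


def remove_unreachable(grammar):
    """Keep only the nonterminals that can be reached from the head nonterminal."""
    reachable = get_reachable(grammar)
    kept = {v: rules for v, rules in grammar['vars'].items() if v in reachable}
    return {'toks': grammar['toks'], 'vars': kept, 'hvar': grammar['hvar']}


# Same shape as the test grammar above, reduced to what matters here.
b1, c1 = ('type2', 'b1'), ('type3', 'c1')
G = {
    'toks': {b1, c1},
    'vars': {
        'S': [['A', b1], ['B', c1]],
        'A': [['B']],
        'B': [[b1]],
        'E': [[]],
        'K': [['E']],
    },
    'hvar': 'S',
}

print(sorted(remove_unreachable(G)['vars']))  # ['A', 'B', 'S']
```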
