# Exercise 1 - CKY Recognizer
(6 points)

Finish the `CKYRecognizer` class given below. It takes a grammar and a sentence, and generates a table of the non-terminal symbols that could be used to generate the sentence (as described in the slides). The structure of the grammar does not need to be checked, since all rules already conform to the CNF described in the slides.

The two methods are as follows:
- `Constructor`
  - takes the non-terminal to non-terminal rules in CNF as a map (the key is the left side of a rule; the value is an array of its possible right sides, each of which is a pair of Strings)
  - takes the lexical rules as a map (the key is the non-terminal on the left side; the value is an array of the terminals, i.e., tokens, it can be replaced with)
- `getParseTable`
  - takes a sentence as a single string
    - the tokens in the given sentence are lowercased
    - the tokens are separated by whitespace
    - the sentence does not contain any punctuation
  - returns the parsing table as a serialized string in which
    - each table cell is enclosed in `[` and `]`
    - an empty cell is serialized as `[]`
    - the non-terminals in a cell are
      - separated by a comma and a space (i.e., `, `) and
      - sorted alphabetically (i.e., `[A, B]` but not `[B, A]`)
  - returns `null` if the sentence contains an unknown terminal symbol
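
The serialization rules above can be sketched as follows. This is only an illustration, not part of the required interface: the class `CellFormatSketch` and its methods `formatCell` and `formatRow` are hypothetical names, and "sorted alphabetically" is assumed to mean Java's natural `String` ordering.

```java
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

public class CellFormatSketch {

    // Serializes one cell: the non-terminals are sorted in natural String order
    // (an assumption about what "alphabetically" means), joined by ", " and
    // wrapped in square brackets. An empty cell yields "[]".
    static String formatCell(Collection<String> nonTerminals) {
        return "[" + String.join(", ", new TreeSet<>(nonTerminals)) + "]";
    }

    // Serializes one table row; every row is terminated by a newline.
    static String formatRow(List<List<String>> cells) {
        StringBuilder row = new StringBuilder();
        for (List<String> cell : cells) {
            row.append(formatCell(cell));
        }
        return row.append('\n').toString();
    }
}
```

Using a `TreeSet` handles both sorting and duplicate removal in one step, which is convenient when several rules add the same non-terminal to a cell.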

#### Example

Imagine a very simple grammar with the non-terminals $A, B, C, S$ and the terminals $a, b$. $S$ is the start symbol and we have the following rules:

<table>
    <tr><td><p align="left">$S$</p></td><td>$\rightarrow$</td><td><p align="left">$A\;\; B$</p></td></tr>
    <tr><td></td><td>$|$</td><td><p align="left">$B\;\; C$</p></td></tr>
    <tr><td><p align="left">$A$</p></td><td>$\rightarrow$</td><td><p align="left">$a$</p></td></tr>
    <tr><td><p align="left">$B$</p></td><td>$\rightarrow$</td><td><p align="left">$b$</p></td></tr>
    <tr><td><p align="left">$C$</p></td><td>$\rightarrow$</td><td><p align="left">$b$</p></td></tr>
</table>

For the sentence `"a b"`, the `getParseTable` method should return the following table:
```
[][A][S]
[][][B, C]
[][][]

```
Please note that
* each line of the table ends with a `\n` character (including the last line)
* the first column and the last line are printed although they are typically empty
* you do not have to check whether the table contains a complete parse tree, i.e., whether the sentence can be parsed using the given grammar. In this first exercise, you only have to apply the CKY recognizer algorithm.
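
To see how the cells of the toy example come about, here is a minimal sketch of the table-filling step. It is an illustration under stated assumptions, not the reference solution: the class `ToyCkySketch` and its methods are hypothetical, the table is kept as nested lists of sorted sets instead of strings, and cell `(i, j)` is assumed to cover the tokens `i` to `j - 1`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ToyCkySketch {

    // Fills the upper-triangular CKY table for the given tokens.
    static List<List<Set<String>>> fill(Map<String, String[][]> grammar,
                                        Map<String, String[]> lexicon,
                                        String[] tokens) {
        int n = tokens.length;
        List<List<Set<String>>> table = new ArrayList<>();
        for (int i = 0; i <= n; i++) {
            List<Set<String>> row = new ArrayList<>();
            for (int j = 0; j <= n; j++) {
                row.add(new TreeSet<>());
            }
            table.add(row);
        }
        for (int j = 1; j <= n; j++) {
            // Lexical step: non-terminals that produce the j-th token.
            for (Map.Entry<String, String[]> rule : lexicon.entrySet()) {
                for (String terminal : rule.getValue()) {
                    if (terminal.equals(tokens[j - 1])) {
                        table.get(j - 1).get(j).add(rule.getKey());
                    }
                }
            }
            // Binary step: combine cells (i, k) and (k, j) for every split point k.
            for (int i = j - 2; i >= 0; i--) {
                for (int k = i + 1; k < j; k++) {
                    for (Map.Entry<String, String[][]> rule : grammar.entrySet()) {
                        for (String[] rhs : rule.getValue()) {
                            if (rhs.length == 2
                                    && table.get(i).get(k).contains(rhs[0])
                                    && table.get(k).get(j).contains(rhs[1])) {
                                table.get(i).get(j).add(rule.getKey());
                            }
                        }
                    }
                }
            }
        }
        return table;
    }

    // Builds the toy grammar from the example and fills the table for "a b".
    static List<List<Set<String>>> toyTable() {
        Map<String, String[][]> grammar = new HashMap<>();
        grammar.put("S", new String[][] { { "A", "B" }, { "B", "C" } });
        Map<String, String[]> lexicon = new HashMap<>();
        lexicon.put("A", new String[] { "a" });
        lexicon.put("B", new String[] { "b" });
        lexicon.put("C", new String[] { "b" });
        return fill(grammar, lexicon, new String[] { "a", "b" });
    }
}
```

In this sketch, the cell for the whole sentence contains `S` because `A` covers the first token and `B` covers the second, which matches the rule $S \rightarrow A\;\; B$. Scanning all rules for every split point is simple but slow; indexing the rules by their right side is a common alternative.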

#### Grammar

An extended version of the flight example grammar will be used in this exercise. All rules can be found in the following two tables:

<table>
    <tr>
        <th>Left side non-terminal</th>
        <th></th>
        <th><p align="left">Right side non-terminals</p></th>
    </tr>
    <tr><td><p align="left">$S$</p></td><td>$\rightarrow$</td><td><p align="left">$NP\;\; VP$</p></td></tr>
    <tr><td></td><td>$|$</td><td><p align="left">$X1\;\; VP$</p></td></tr>
    <tr><td></td><td>$|$</td><td><p align="left">$Verb\;\; NP$</p></td></tr>
    <tr><td></td><td>$|$</td><td><p align="left">$X2\;\; PP$</p></td></tr>
    <tr><td></td><td>$|$</td><td><p align="left">$Verb\;\; PP$</p></td></tr>
    <tr><td></td><td>$|$</td><td><p align="left">$VP\;\; PP$</p></td></tr>
    <tr><td><p align="left">$Nominal$</p></td><td>$\rightarrow$</td><td><p align="left">$Nominal\;\; Noun$</p></td></tr>
    <tr><td></td><td>$|$</td><td><p align="left">$Nominal\;\; PP$</p></td></tr>
    <tr><td><p align="left">$NP$</p></td><td>$\rightarrow$</td><td><p align="left">$Det\;\; Nominal$</p></td></tr>
    <tr><td><p align="left">$PP$</p></td><td>$\rightarrow$</td><td><p align="left">$Preposition\;\; NP$</p></td></tr>
    <tr><td><p align="left">$VP$</p></td><td>$\rightarrow$</td><td><p align="left">$Verb\;\; NP$</p></td></tr>
    <tr><td></td><td>$|$</td><td><p align="left">$X2\;\; PP$</p></td></tr>
    <tr><td></td><td>$|$</td><td><p align="left">$Verb\;\; PP$</p></td></tr>
    <tr><td></td><td>$|$</td><td><p align="left">$VP \;\; PP$</p></td></tr>
    <tr><td><p align="left">$X1$</p></td><td>$\rightarrow$</td><td><p align="left">$Aux\;\; NP$</p></td></tr>
    <tr><td><p align="left">$X2$</p></td><td>$\rightarrow$</td><td><p align="left">$Verb\;\; NP$</p></td></tr>
</table>

<table>
    <tr>
        <th><p align="left">Non-terminal</p></th>
        <th></th>
        <th><p align="left">Terminal</p></th>
    </tr>
    <tr><td><p align="left">$S$</p></td><td>$\rightarrow$</td><td><p align="left">$book$ $|$ $include$ $|$ $prefer$ $|$ $shot$</p></td></tr>
    <tr><td><p align="left">$Aux$</p></td><td>$\rightarrow$</td><td><p align="left">$does$</p></td></tr>
    <tr><td><p align="left">$Det$</p></td><td>$\rightarrow$</td><td><p align="left">$a$ $|$ $an$ $|$ $that$ $|$ $the$ $|$ $this$</p></td></tr>
    <tr><td><p align="left">$Nominal$</p></td><td>$\rightarrow$</td><td><p align="left">$book$ $|$ $elephant$ $|$ $flight$ $|$ $meal$ $|$ $money$ $|$ $pajamas$</p></td></tr>
    <tr><td><p align="left">$Noun$</p></td><td>$\rightarrow$</td><td><p align="left">$book$ $|$ $flight$ $|$ $elephant$ $|$ $meal$ $|$ $money$ $|$ $pajamas$</p></td></tr>
    <tr><td><p align="left">$NP$</p></td><td>$\rightarrow$</td><td><p align="left">$i$ $|$ $hustoun$ $|$ $me$ $|$ $my$ $|$ $nwa$ $|$ $she$</p></td></tr>
    <tr><td><p align="left">$Preposition$</p></td><td>$\rightarrow$</td><td><p align="left">$from$ $|$ $in$ $|$ $near$ $|$ $on$ $|$ $through$ $|$ $to$</p></td></tr>
    <tr><td><p align="left">$Verb$</p></td><td>$\rightarrow$</td><td><p align="left">$book$ $|$ $include$ $|$ $prefer$ $|$ $shot$</p></td></tr>
    <tr><td><p align="left">$VP$</p></td><td>$\rightarrow$</td><td><p align="left">$book$ $|$ $include$ $|$ $prefer$ $|$ $shot$</p></td></tr>
</table>
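
Because the lexicon maps non-terminals to terminals, while the recognizer needs the opposite direction, one possible preprocessing step is to invert it once. The following is a sketch under assumptions: `LexiconIndexSketch`, `invert`, and `lexiconExcerpt` are hypothetical names, and only an excerpt of the table above is included.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class LexiconIndexSketch {

    // Maps every terminal to the sorted set of non-terminals that can produce it.
    static Map<String, Set<String>> invert(Map<String, String[]> lexicon) {
        Map<String, Set<String>> byTerminal = new HashMap<>();
        for (Map.Entry<String, String[]> rule : lexicon.entrySet()) {
            for (String terminal : rule.getValue()) {
                byTerminal.computeIfAbsent(terminal, t -> new TreeSet<>()).add(rule.getKey());
            }
        }
        return byTerminal;
    }

    // An excerpt of the lexical rules listed in the table (not the full lexicon).
    static Map<String, String[]> lexiconExcerpt() {
        Map<String, String[]> lexicon = new HashMap<>();
        lexicon.put("S", new String[] { "book", "include", "prefer", "shot" });
        lexicon.put("Det", new String[] { "a", "an", "that", "the", "this" });
        lexicon.put("Nominal", new String[] { "book", "elephant", "flight", "meal", "money", "pajamas" });
        lexicon.put("Noun", new String[] { "book", "flight", "elephant", "meal", "money", "pajamas" });
        lexicon.put("Verb", new String[] { "book", "include", "prefer", "shot" });
        lexicon.put("VP", new String[] { "book", "include", "prefer", "shot" });
        return lexicon;
    }
}
```

With this index, looking up `book` yields `Nominal, Noun, S, VP, Verb`. Note that natural `String` ordering places `VP` before `Verb`, since uppercase letters sort before lowercase ones. A terminal that is missing from the index (the lookup returns `null`) is an unknown word.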

#### Notes

- Do not add additional external libraries.
- Interface
  - You can use _[TAB]_ for autocompletion and _[SHIFT]_+_[TAB]_ for code inspection.
  - Use _Menu_ -> _View_ -> _Toggle Line Numbers_ for debugging.
  - Check _Menu_ -> _Help_ -> _Keyboard Shortcuts_.
- Finish
  - Save your solution by clicking on the _disk icon_.
  - Finally, choose _Menu_ -> _File_ -> _Close and Halt_.
  - Do not forget to _Submit_ your solution in the _Assignments_ view.

In [2]:
// YOUR CODE HERE
import java.util.*;
import java.util.Map.Entry;


public class CKYRecognizer {

	// YOUR CODE HERE
	Map<String, String[][]> grammar;
	Map<String, String[]> lexicon;

	/**
	 * Constructor.
	 * 
	 * @param grammar the non-terminal to non-terminal rules
	 * @param lexicon the non-terminal to terminal rules
	 */
	public CKYRecognizer(Map<String, String[][]> grammar, Map<String, String[]> lexicon) {
		// YOUR CODE HERE
		this.grammar = grammar;
		this.lexicon = lexicon;
	}

	/**
	 * Takes the sentence and returns the generated table serialized as a single
	 * String.
	 * 
	 * @param sentence the sentence that should be parsed
	 * @return the parsing table serialized as a single String, or null if the
	 *         sentence contains an unknown word
	 */
	public String getParseTable(String sentence) {
		String parseTable = "";
		// Lowercase the sentence and split it into tokens on any run of whitespace.
		String[] words = sentence.toLowerCase().trim().split("\\s+");
		// table[i][j] holds the space-separated non-terminals covering tokens i..j-1.
		String[][] table = new String[words.length + 1][words.length + 1];

		for (int i = 0; i < words.length + 1; i++) {
			for (int j = 0; j < words.length + 1; j++) {
				table[i][j] = "";
			}
		}

		Map<String, List<String>> ownGrammar = new HashMap<>();
		Map<String, List<String>> ownLexiconMap = new HashMap<>();

		// Invert the lexicon: map each terminal to the non-terminals that can produce it.
		for (Entry<String, String[]> pair : lexicon.entrySet()) {
			for (String terminal : pair.getValue()) {
				ownLexiconMap.computeIfAbsent(terminal, t -> new ArrayList<>()).add(pair.getKey());
			}
		}
		
		// An unknown terminal cannot be handled; signal this by returning null.
		for (String word : words) {
			if (!ownLexiconMap.containsKey(word)) {
				return null;
			}
		}

		// Invert the grammar: map each right side "B C" to the left sides that produce it.
		for (Entry<String, String[][]> rule : grammar.entrySet()) {
			for (String[] rightSide : rule.getValue()) {
				List<String> leftSides = ownGrammar.computeIfAbsent(String.join(" ", rightSide),
						r -> new ArrayList<>());
				if (!leftSides.contains(rule.getKey())) {
					leftSides.add(rule.getKey());
				}
			}
		}

		// CKY recognizer: process the table column by column, from left to right.
		for (int j = 1; j <= words.length; j++) {
			// Diagonal cell: all non-terminals that produce the j-th word (sorted via TreeSet).
			table[j - 1][j] = String.join(" ", new TreeSet<String>(ownLexiconMap.get(words[j - 1])));

			// Fill column j bottom-up, trying every split point k of the span (i, j).
			for (int i = j - 2; i >= 0; i--) {
				Set<String> cell = new TreeSet<String>();
				for (int k = i + 1; k <= j - 1; k++) {
					// An empty cell cannot contribute to any binary rule.
					if (table[i][k].isEmpty() || table[k][j].isEmpty()) {
						continue;
					}
					for (String rowNonTerminal : table[i][k].split(" ")) {
						for (String colNonTerminal : table[k][j].split(" ")) {
							List<String> leftSides = ownGrammar.get(rowNonTerminal + " " + colNonTerminal);
							if (leftSides != null) {
								cell.addAll(leftSides);
							}
						}
					}
				}
				// Only overwrite the cell if at least one rule matched.
				if (!cell.isEmpty()) {
					table[i][j] = String.join(" ", cell);
				}
			}
		}
		
		// Serialize the table row by row; every row ends with a newline.
		for (int i = 0; i < table.length; i++) {
			for (int j = 0; j < table[i].length; j++) {
				// Compare contents with isEmpty(); == would only compare references.
				if (table[i][j].isEmpty()) {
					parseTable += "[]";
				} else {
					parseTable += "[" + String.join(", ", table[i][j].trim().split(" ")) + "]";
				}
			}
			parseTable += '\n';
		}

		return parseTable;
	}
}

// This line should make sure that compile errors are directly identified when executing this cell
// (the line itself does not produce any meaningful result)
new CKYRecognizer(new HashMap<>(), new HashMap<>());
System.out.println("compiled");

compiled


# Evaluation

- Run the following cell to test your implementation.
- You can ignore the cells afterwards.

In [3]:
%maven org.junit.jupiter:junit-jupiter-api:5.3.1
import org.junit.jupiter.api.Assertions;
import org.opentest4j.AssertionFailedError;

public void checkParsingTable(CKYRecognizer recognizer, String sentence, String expectedTable) {
    try {
        long time1 = System.currentTimeMillis();
        String result = recognizer.getParseTable(sentence);
        time1 = System.currentTimeMillis() - time1;
        if (expectedTable == null) {
            Assertions.assertNull(result,
                    "The result was expected to be null. However, the result of your solution is \"" + result
                            + "\".");
        } else {
            Assertions.assertEquals(expectedTable, result);
        }
        System.out.println("Test successful. Calculation took " + time1 + "ms");
    } catch (AssertionFailedError e) {
        throw e;
    } catch (Throwable e) {
        System.err.println("Your solution caused an unexpected error:");
        throw e;
    }
}

Map<String, String[][]> grammar = new HashMap<String, String[][]>();
Map<String, String[]> lexicon = new HashMap<>();
String expectedTable;
CKYRecognizer recognizer;

// Test the very simple ABC example from the description
grammar.put("S", new String[][] { { "A", "B" }, { "B", "C" } });
lexicon.put("A", new String[] { "a" });
lexicon.put("B", new String[] { "b" });
lexicon.put("C", new String[] { "b" });
recognizer = new CKYRecognizer(grammar, lexicon);
expectedTable = "[][A][S]\n"
             + "[][][B, C]\n"
             + "[][][]\n";
checkParsingTable(recognizer, "a b", expectedTable);

// Define the flight grammar
grammar.clear();
grammar.put("Nominal", new String[][] { { "Nominal", "Noun" }, { "Nominal", "PP" } });
grammar.put("NP", new String[][] { { "Det", "Nominal" } });
grammar.put("PP", new String[][] { { "Preposition", "NP" } });
grammar.put("S", new String[][] { { "NP", "VP" }, { "X1", "VP" }, { "Verb", "NP" }, { "X2", "PP" },
        { "Verb", "PP" }, { "VP", "PP" } });
grammar.put("VP", new String[][] { { "Verb", "NP" }, { "X2", "PP" },
        { "Verb", "PP" }, { "VP", "PP" } });
grammar.put("X1", new String[][] { { "Aux", "NP" } });
grammar.put("X2", new String[][] { { "Verb", "NP" } });

lexicon.clear();
lexicon.put("Aux", new String[] { "does" });
lexicon.put("Det", new String[] { "a", "an", "that", "this", "the" });
lexicon.put("Nominal", new String[] { "book", "elephant", "flight", "meal", "money", "pajamas" });
lexicon.put("Noun", new String[] { "book", "flight", "elephant", "meal", "money", "pajamas" });
lexicon.put("NP", new String[] { "i", "hustoun", "me", "my", "nwa", "she" });
lexicon.put("Preposition", new String[] { "from", "in", "near", "on", "through", "to" });
lexicon.put("S", new String[] { "book", "include", "prefer", "shot" });
lexicon.put("Verb", new String[] { "book", "include", "prefer", "shot" });
lexicon.put("VP", new String[] { "book", "include", "prefer", "shot" });

// Test the example from the book
recognizer = new CKYRecognizer(grammar, lexicon);
expectedTable = "[][Nominal, Noun, S, VP, Verb][][S, VP, X2][][S, VP, X2]\n"
             + "[][][Det][NP][][NP]\n"
             + "[][][][Nominal, Noun][][Nominal]\n"
             + "[][][][][Preposition][PP]\n"
             + "[][][][][][NP]\n"
             + "[][][][][][]\n";
checkParsingTable(recognizer, "book the flight through hustoun", expectedTable);

// Test a sentence with an unknown word
checkParsingTable(recognizer, "my flight to paris", null);

Test successful. Calculation took 8ms
Test successful. Calculation took 1ms
Test successful. Calculation took 0ms


In [None]:
// Ignore this cell