# Finite State Automata
(15 points)

You should finish the class `Automata`. It should be able to
1. read a given grammar and
2. check whether a given text is accepted by the grammar or not.

In the second task of this exercise, the extension `ExtendedAutomata` should be able to extract all matching Strings from a given text (described in more detail later).

## Task 1
(10 points)

This task focuses on the basic functionality of the `Automata` class.

#### Example

We use the `baa!` sheep example from the slides, but your implementation has to accept other grammars as well. The transition matrix is given as a tab-separated String:
```
State	a	b	!
0	5	1	5
1	2	5	5
2	3	5	5
3	3	5	4
4:	5	5	5
5	5	5	5
```
(Note that this notebook renders the example above with spaces, but the data in the tests will use tabs.) Lines end with the `\n` character (this might be important for Windows users!).

The transition matrix contains a header line, followed by one state per line.

The header line starts with the String `State`. After that, the individual input characters are listed.

Each state line shows the transitions from the state on the left to the other states for the input characters defined in the header line. For example, there is a transition from state `0` to `5` if `a` or `!` is read and a transition to `1` if `b` is read.

The end state(s) are marked with `:`. In the example, state `4` is the only end state.
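To make the table format concrete, here is a minimal sketch of how the tab-separated text could be taken apart. The class `TransitionTableSketch` and its method names are hypothetical and not part of the exercise template; the sketch only assumes the format described above (header line starting with `State`, `\n` line endings, end states marked with a trailing `:`).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the template) that only inspects the table text. */
class TransitionTableSketch {

    /** Returns the input characters listed in the header line, in column order. */
    static List<String> inputCharacters(String grammarDef) {
        String[] header = grammarDef.split("\n")[0].split("\t");
        List<String> characters = new ArrayList<>();
        // header[0] is the literal "State"; the remaining cells are input characters.
        for (int i = 1; i < header.length; i++) {
            characters.add(header[i]);
        }
        return characters;
    }

    /** Returns the IDs of all states whose first cell carries the ':' end marker. */
    static List<Integer> endStates(String grammarDef) {
        String[] lines = grammarDef.split("\n");
        List<Integer> ends = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            String firstCell = lines[i].split("\t")[0];
            if (firstCell.endsWith(":")) {
                ends.add(Integer.parseInt(firstCell.substring(0, firstCell.length() - 1)));
            }
        }
        return ends;
    }

    /** Looks up the target state for (fromState, character); assumes both exist in the table. */
    static int target(String grammarDef, int fromState, String character) {
        int column = inputCharacters(grammarDef).indexOf(character) + 1;
        for (String line : grammarDef.split("\n")) {
            String[] cells = line.split("\t");
            // Strip the end marker before comparing IDs.
            String id = cells[0].endsWith(":")
                    ? cells[0].substring(0, cells[0].length() - 1)
                    : cells[0];
            if (id.equals(String.valueOf(fromState))) {
                return Integer.parseInt(cells[column]);
            }
        }
        throw new IllegalArgumentException("Unknown state " + fromState);
    }
}
```

For the sheep grammar, `inputCharacters` yields `a`, `b`, `!`, `endStates` yields only `4`, and `target(grammar, 0, "b")` is `1`, matching the description above. The lookup rescans the text on every call, which keeps the sketch short; a real implementation would parse once and store the result.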

#### Hints

- `0` is always the start state.
- There can be multiple end states.
- The state with the highest ID in the grammar is always the error state (state `5` in the example). The automaton enters this state if it reads a character that does not fit the grammar at that position. As the example shows, the automaton cannot leave the error state.
- All characters that are not listed in the header line should lead directly to the error state.
- The lines below the header line don't have to be ordered (i.e., the states can be defined in any order).
- The class `Automata` has two methods, `parseGrammar` and `acceptsString`. The first method is called when the Automata is constructed and should parse the grammar. The second method is called later and should compute a result based on the grammar and the input String. You may want to add class attributes so that the information parsed in `parseGrammar` (i.e., the transitions) is still available in `acceptsString`.
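The hints can be illustrated with a small, self-contained sketch. It is not the required `Automata` class: the name `MapBasedFsaSketch`, its fields, and the map-based storage are assumptions made for illustration only. It shows start state `0`, several end states, state lines in arbitrary order, the highest ID as error state, and unlisted characters leading to the error state.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: a map-based transition table, not the exercise's Automata class. */
class MapBasedFsaSketch {
    private final Map<Integer, Map<Character, Integer>> transitions = new HashMap<>();
    private final Set<Integer> endStates = new HashSet<>();
    private int errorState = 0;

    MapBasedFsaSketch(String grammarDef) {
        String[] lines = grammarDef.split("\n");
        String[] header = lines[0].split("\t");
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split("\t");
            boolean isEnd = cells[0].endsWith(":");
            int id = Integer.parseInt(
                    isEnd ? cells[0].substring(0, cells[0].length() - 1) : cells[0]);
            if (isEnd) {
                endStates.add(id); // there can be multiple end states
            }
            errorState = Math.max(errorState, id); // highest ID is the error state
            Map<Character, Integer> row = new HashMap<>();
            for (int c = 1; c < header.length; c++) {
                row.put(header[c].charAt(0), Integer.parseInt(cells[c]));
            }
            transitions.put(id, row); // keyed by ID, so line order does not matter
        }
    }

    boolean accepts(String text) {
        int current = 0; // 0 is always the start state
        for (char ch : text.toCharArray()) {
            // Characters missing from the header line go straight to the error state.
            current = transitions.get(current).getOrDefault(ch, errorState);
        }
        return endStates.contains(current);
    }
}
```

With the unordered grammar `"State\ta\tb\n3\t3\t3\n1:\t3\t2\n0\t1\t3\n2:\t3\t3"` (end states `1` and `2`, error state `3`), this sketch accepts `a` and `ab` and rejects `abb`, `b`, `ax`, and the empty String.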

#### Notes

- You are free to use a different IDE to develop your solution. However, you have to copy the solution into this notebook to submit it.
- Do not add additional external libraries.
- Interface
  - You can use _[TAB]_ for autocompletion and _[SHIFT]_+_[TAB]_ for code inspection.
  - Use _Menu_ -> _View_ -> _Toggle Line Numbers_ for debugging.
  - Check _Menu_ -> _Help_ -> _Keyboard Shortcuts_.
- Known issues
  - All global variables will be set to void after an import.
  - Missing spaces around `%` (modulo) can cause unexpected errors, so please make sure that every `%` character has a space on both sides.
- Finish
  - Save your solution by clicking on the _disk icon_.
  - Make sure that all necessary imports are listed at the beginning of your cell.
  - Run a final check of your solution:
    - Click on _restart the kernel, then re-run the whole notebook_ (the fast-forward arrow in the toolbar).
    - Wait for the kernel to restart and execute all cells (all executable cells should have numbers in front of them instead of a `[*]`).
    - Check all executed cells for errors. If an exception is thrown, please check your code. Although the error message might look cryptic, so far we have never encountered an exception that did not have a valid cause inside the submitted code. A good way to check the code is to copy the solution into a new class in your favorite IDE and check
      - errors reported by the IDE
      - imports the IDE adds to your code which might be missing in your submission.
  - Finally, choose _Menu_ -> _File_ -> _Close and Halt_.
  - Do not forget to _Submit_ your solution in the _Assignments_ view.

In [1]:
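import java.util.ArrayList;
import java.util.List;

class Automata {

	// YOUR CODE HERE
	// One row per state, as read from the grammar: row[0] is the state ID
	// (with a trailing ':' for end states), row[1..] are the target states.
	String[][] state = null;
	// Input characters from the header line, in column order.
	List<String> charList = new ArrayList<>();
	// ID of an end state found in the grammar.
	int goal = 0;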

	/**
	 * Constructor taking the grammar String which defines the behavior of this
	 * automata.
	 */
	public Automata(String grammarDef) {
		parseGrammar(grammarDef);
	}

	/**
	 * An internal method that parses the given grammar String.
	 */
	protected void parseGrammar(String grammarDef) {
		// YOUR CODE HERE
		String[] lines = grammarDef.split("\n");

		// The header line is "State" followed by the input characters.
		String[] header = lines[0].split("\t");
		for (int i = 1; i < header.length; i++) {
			charList.add(header[i]);
		}

		// Store each row at the index of its state ID, so the state lines
		// may appear in any order. This assumes the IDs are 0..n-1.
		state = new String[lines.length - 1][];
		for (int i = 1; i < lines.length; i++) {
			String[] row = lines[i].split("\t");
			boolean isEndState = row[0].endsWith(":");
			String id = isEndState ? row[0].substring(0, row[0].length() - 1) : row[0];
			int stateId = Integer.parseInt(id);
			state[stateId] = row;
			if (isEndState) {
				goal = stateId;
			}
		}
	}

	/**
	 * This method should return true if the complete given text is accepted by
	 * the FSA. If this is not the case, false should be returned.
	 */
	public boolean acceptsString(String text) {
		// YOUR CODE HERE
		int currentState = 0; // 0 is always the start state
		for (int i = 0; i < text.length(); i++) {
			int column = charList.indexOf(String.valueOf(text.charAt(i)));
			if (column < 0) {
				// Characters missing from the header line lead to the error
				// state, which can never be left again.
				return false;
			}
			currentState = Integer.parseInt(state[currentState][column + 1]);
		}
		// Accept only if the whole text was consumed and we stopped in a state
		// marked with ':' (this also rejects the error state).
		return state[currentState][0].endsWith(":");
	}
}

#### Evaluation Task 1

- Run the following cell to test your implementation.
- You can ignore the cells afterwards.

In [2]:
%maven org.junit.jupiter:junit-jupiter-api:5.3.1
import org.junit.jupiter.api.Assertions;
import org.opentest4j.AssertionFailedError;

/**
 * A simple check whether an instance of the Automata class initialized with 
 * the given grammar would accept the given input.
 */
public static void checkAutomata(String grammar, String input, boolean expectedResult) {
    try {
        Automata automata = new Automata(grammar);
        if (expectedResult) {
            Assertions.assertTrue(automata.acceptsString(input),
                    "Your automata rejected \"" + input + "\" but it should have accepted it.");
        } else {
            Assertions.assertFalse(automata.acceptsString(input),
                    "Your automata accepted \"" + input + "\" but it should have rejected it.");
        }
        System.out.println("Test successfully completed.");
    } catch (AssertionFailedError e) {
        System.err.println(e);
        throw e;
    } catch (Throwable e) {
        System.err.println("Your solution caused an unexpected error:");
        throw e;
    }
}

// The single checks for the example
String grammar1 = "State\ta\tb\t!\n0\t5\t1\t5\n1\t2\t5\t5\n2\t3\t5\t5\n3\t3\t5\t4\n4:\t5\t5\t5\n5\t5\t5\t5";
checkAutomata(grammar1, "baa!", true);
checkAutomata(grammar1, "baaa!", true);
checkAutomata(grammar1, "baaa!!!", false);
checkAutomata(grammar1, "!aab", false);
checkAutomata(grammar1, "xyz", false);

Test successfully completed.
Test successfully completed.
Test successfully completed.
Test successfully completed.
Test successfully completed.


In [3]:
// Ignore this cell

In [4]:
// Ignore this cell

## Task 2
(5 points)

In the second task of this exercise, the extension `ExtendedAutomata` should be able to extract all matching Strings from a given text. The class already inherits the functionality of your `Automata` class.

#### Hints
- The automata does not have to support very complex grammars (e.g., not all features of regular expressions).
- Make sure that you have executed the cell with your implementation of the `Automata` class so that the kernel is aware of your class implementation.
- If two matches overlap, make sure that the longest match is extracted. This is not an issue with the given test grammar. However, for the grammar of the regular expression `ab*`, an automaton should extract from the given string `"a abbb b"` the complete second term `abbb` instead of `ab` or `abb`.
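The longest-match rule can be sketched on its own for the `ab*` example. The class `LongestMatchSketch` and its hand-written table are assumptions made for illustration (state `0` is the start state, `1` the end state, `2` the error state); it does not use the `Automata` class. The idea is to keep running after reaching an end state and to remember the last position at which an end state was seen.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: longest-match scanning over a hand-written table for ab*. */
class LongestMatchSketch {
    // Columns: 'a', 'b'. Rows: state 0 (start), state 1 (end), state 2 (error).
    private static final int[][] TABLE = { {1, 2}, {2, 1}, {2, 2} };
    private static final String ALPHABET = "ab";
    private static final int END_STATE = 1;
    private static final int ERROR_STATE = 2;

    /** Returns the exclusive end index of the longest match starting at start, or -1. */
    static int longestMatchEnd(String text, int start) {
        int current = 0;
        int lastAcceptedEnd = -1;
        for (int i = start; i < text.length(); i++) {
            int column = ALPHABET.indexOf(text.charAt(i));
            current = (column < 0) ? ERROR_STATE : TABLE[current][column];
            if (current == ERROR_STATE) {
                break; // the error state can never be left, so stop scanning
            }
            if (current == END_STATE) {
                lastAcceptedEnd = i + 1; // keep going: a longer match may follow
            }
        }
        return lastAcceptedEnd;
    }

    /** Collects non-overlapping longest matches from left to right. */
    static List<String> findMatches(String text) {
        List<String> matches = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = longestMatchEnd(text, start);
            if (end > start) {
                matches.add(text.substring(start, end));
                start = end; // continue after the match
            } else {
                start++; // nothing matches here, try the next position
            }
        }
        return matches;
    }
}
```

For `"a abbb b"` this sketch returns the two matches `a` and `abbb`; the trailing `b` is not a match because `ab*` has to start with `a`.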

In [5]:

import java.util.ArrayList;
import java.util.List;

/**
 * A simple implementation of an FSA that should be finalized.
 */

public class ExtendedAutomata extends Automata {

	public ExtendedAutomata(String grammarDef) {
		super(grammarDef);
	}

	/**
	 * Runs the FSA from the start state on the text beginning at index start
	 * and returns the exclusive end index of the longest accepted prefix, or
	 * -1 if there is none.
	 */
	private int longestMatchEnd(String text, int start) {
		int currentState = 0; // 0 is always the start state
		int lastAcceptedEnd = -1;
		for (int i = start; i < text.length(); i++) {
			int column = charList.indexOf(String.valueOf(text.charAt(i)));
			if (column < 0) {
				break; // unlisted character: error state, no longer match possible
			}
			currentState = Integer.parseInt(state[currentState][column + 1]);
			if (state[currentState][0].endsWith(":")) {
				lastAcceptedEnd = i + 1; // remember the longest match so far
			}
		}
		return lastAcceptedEnd;
	}

	public String[] findMatches(String text) {
		// YOUR CODE HERE
		// Local list, so repeated calls do not accumulate earlier matches.
		List<String> matches = new ArrayList<>();
		int start = 0;
		while (start < text.length()) {
			int end = longestMatchEnd(text, start);
			if (end > start) {
				matches.add(text.substring(start, end));
				start = end; // continue after the match
			} else {
				start++; // no match here, try the next position
			}
		}
		return matches.toArray(new String[0]);
	}
}


#### Evaluation Task 2

- Run the following cell to test your implementation.
- You can ignore the empty cells afterwards.
- Make sure that you have executed the test cell of Task 1 (the one with the `maven` line). Otherwise, you may get compiler errors for the test.

In [6]:
import java.util.Arrays;
import org.junit.jupiter.api.Assertions;
import org.opentest4j.AssertionFailedError;

/**
 * A simple check whether an instance of the ExtendedAutomata class initialized with 
 * the given grammar would return the expected sub strings.
 */
public static void checkExtendedAutomata(String grammar, String input, String[] expectedResult) {
    try {
        ExtendedAutomata automata = new ExtendedAutomata(grammar);
        String[] result = automata.findMatches(input);
        Assertions.assertArrayEquals(expectedResult, result, "Your solution returned " + Arrays.toString(result)
                + " but " + Arrays.toString(expectedResult) + " was expected.");
        System.out.println("Test successfully completed.");
    } catch (AssertionFailedError e) {
        System.err.println(e);
        throw e;
    } catch (Throwable e) {
        System.err.println("Your solution caused an unexpected error:");
        throw e;
    }
}

// The single checks for the example
String grammar1 = "State\ta\tb\t!\n0\t5\t1\t5\n1\t2\t5\t5\n2\t3\t5\t5\n3\t3\t5\t4\n4:\t5\t5\t5\n5\t5\t5\t5";
checkExtendedAutomata(grammar1, 
        "baa! He said baaaa! babaa!! baaaa!", 
        new String[] { "baa!", "baaaa!", "baa!", "baaaa!" });

Test successfully completed.


In [7]:
// Ignore this cell