WFST:
    This class implements a Weighted Finite State Transducer (WFST). Every WFST has a start state, one or more final states, and intermediate states. The class provides the following methods:
    * set_start_state: Sets the start state of the WFST and adds it to the states dictionary.
    * add_state: Adds a new state to the WFST if it is not already present.
    * add_final_state: Marks a state as final and adds it to the states dictionary.
    * add_transition:
            - Adds a transition (edge/arc) from one state to another based on an input symbol, output symbol, and weight.
            - Ensures that both from_state and to_state are added to the states dictionary.
            - Transitions are stored as a list of tuples (to_state, output_symbol, weight) under the corresponding input symbol.
    * add_epsilon_transition: Adds a transition whose input symbol is the empty string (an epsilon transition).
    * process: 
            - Processes an input sequence to produce output sequences and weights.
            - Starts from the start state and transitions through states based on the input symbols.
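            - Follows epsilon transitions: during each step an epsilon arc is taken alongside arcs that match the current symbol, and after the last symbol one further epsilon arc into a final state is tried if no path has reached a final state yet.
            - Returns a list of (state, output, weight) tuples for the paths that end in a final state.
    * compose: Creates a new WFST by laying the transitions of two WFSTs end to end in a single chain with renumbered states. This is closer to concatenation than to composition in the usual WFST sense, since outputs of the first machine are not matched against inputs of the second.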
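
The storage layout described for add_transition can be sketched without the class. This is a minimal stand-alone illustration, assuming the same nested-dictionary layout; the free function below is a hypothetical stand-in for the method, not the class code itself.

```python
# Stand-alone sketch of the layout: state -> input symbol -> list of arcs.
# The free function add_transition is a hypothetical stand-in for the method.
states = {}

def add_transition(states, from_state, to_state, input_symbol, output_symbol, weight=0):
    # Make sure both endpoints exist, even if they have no outgoing arcs.
    states.setdefault(from_state, {})
    states.setdefault(to_state, {})
    # Arcs sharing an input symbol are kept together in one list.
    states[from_state].setdefault(input_symbol, []).append((to_state, output_symbol, weight))

add_transition(states, '0', '1', 'twenty', '2')
add_transition(states, '1', '2', '', '0')  # empty input symbol = epsilon arc

print(states)
```

Keying the inner dictionary by input symbol means a step of process only has to check two entries per state: the current symbol and the empty string.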

CompositeWFST:
    Given several WFSTs, this class holds them in a collection keyed by label (e.g. one WFST for each of the numbers 1-10). To process an input sequence, its process method looks up the WFST registered for each symbol in the CompositeWFST, chains them together with compose, and runs the input through the resulting WFST.
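
The lookup-and-chain idea can be sketched on plain arc lists. This is a simplified illustration rather than the class code: `chain` and `arcs_by_label` are hypothetical names, and each arc is reduced to (input_symbol, output_symbol, weight).

```python
def chain(arc_lists):
    """Lay the arcs of several transducers end to end as one linear chain.

    States are renumbered '0', '1', '2', ... and the last state is
    returned as the final state.
    """
    arcs = []
    i = 0
    for word_arcs in arc_lists:
        for (input_symbol, output_symbol, weight) in word_arcs:
            arcs.append((str(i), str(i + 1), input_symbol, output_symbol, weight))
            i += 1
    return arcs, str(i)

# Hypothetical per-word arcs, looked up by label.
arcs_by_label = {
    'twenty': [('twenty', '2', 0)],
    'one': [('one', '1', 0)],
}
sequence = ['twenty', 'one']
arcs, final_state = chain([arcs_by_label[label] for label in sequence])

print(arcs)
print(final_state)
```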

In [13]:
class WFST:
    def __init__(self):
        self.states = {}
        self.start_state = None
        self.final_state = set()

    def set_start_state(self, state):
        self.start_state = state
        self.add_state(state)

    def add_state(self, state):
        if state not in self.states:
            self.states[state] = {}

    def add_final_state(self, state):
        self.final_state.add(state)
        self.add_state(state)

    def add_transition(self, from_state, to_state, input_symbol, output_symbol, weight=0):
        self.add_state(from_state)
        self.add_state(to_state)
        if input_symbol not in self.states[from_state]:
            self.states[from_state][input_symbol] = []
        self.states[from_state][input_symbol].append((to_state, output_symbol, weight))

    def add_epsilon_transition(self, from_state, to_state, output_symbol, weight=0):
        self.add_transition(from_state, to_state, '', output_symbol, weight)

    def process(self, input_sequence):
        current_states = [(self.start_state, '', 0)]
        for symbol in input_sequence:
            next_states = []
            for (state, current_output, current_weight) in current_states:
                if symbol in self.states[state]:
                    for (next_state, output_symbol, weight) in self.states[state][symbol]:
                        next_states.append((next_state, current_output + output_symbol, current_weight + weight))
                if '' in self.states[state]:
                    for (next_state, output_symbol, weight) in self.states[state]['']:
                        next_states.append((next_state, current_output + output_symbol, current_weight + weight))
            current_states = next_states
        
        final_state = [(state, output, weight) for (state, output, weight) in current_states if state in self.final_state]
        if not final_state:
            for (state, current_output, current_weight) in current_states:
                if '' in self.states[state]:
                    for (next_state, output_symbol, weight) in self.states[state]['']:
                        if next_state in self.final_state:
                            final_state.append((next_state, current_output + output_symbol, current_weight + weight))
        return final_state

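    def compose(self, other):
        # Build a linear chain: the arcs of self followed by the arcs of other,
        # with states renumbered '0', '1', '2', ...
        result = WFST()
        result.set_start_state('0')
        i = 0
        for s1 in self.states:
            if s1 not in self.final_state:
                for symbol1 in self.states[s1]:
                    for (n1, o1, w1) in self.states[s1][symbol1]:
                        # Skip an epsilon arc into a final state so that its
                        # output is not emitted in the middle of the chain.
                        if symbol1 == '' and n1 in self.final_state:
                            break
                        # str(i) keeps state names readable beyond '9'.
                        result.add_transition(str(i), str(i + 1), symbol1, o1, w1)
                        i += 1

        for s1 in other.states:
            if s1 not in other.final_state:
                for symbol1 in other.states[s1]:
                    for (n1, o1, w1) in other.states[s1][symbol1]:
                        result.add_transition(str(i), str(i + 1), symbol1, o1, w1)
                        i += 1

        # The last state of the chain is the only final state.
        result.add_final_state(str(i))
        return result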

class CompositeWFST:
    def __init__(self):
        self.wfsts = {}
    
    def add_wfst(self, key, wfst):
        self.wfsts[key] = wfst
    
    def process(self, input_sequence):
        if not input_sequence:
            return []

        composed_wfst = self.wfsts.get(input_sequence[0])
        if not composed_wfst:
            return []

        for symbol in input_sequence[1:]:
            next_wfst = self.wfsts.get(symbol)
            if next_wfst:
                composed_wfst = composed_wfst.compose(next_wfst)
            else:
                return []

        return composed_wfst.process(input_sequence)

Example: creating WFSTs for "twenty" and "one"

In [None]:
wfst1 = WFST()
wfst1.set_start_state('0')
wfst1.add_final_state('2')
wfst1.add_transition('0', '1', 'twenty', '2')
wfst1.add_epsilon_transition('1', '2', '0')

wfst2 = WFST()
wfst2.set_start_state('0')
wfst2.add_final_state('1')
wfst2.add_transition('0', '1', 'one', '1')

composite_wfst = CompositeWFST()
composite_wfst.add_wfst('twenty', wfst1)
composite_wfst.add_wfst('one', wfst2)

In [20]:
input_sequence1 = ['twenty', 'one']
input_sequence2 = ['twenty']
result1 = composite_wfst.process(input_sequence1)
result2 = composite_wfst.process(input_sequence2)

print(result1[0][1])
print(result2[0][1])

21
20
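
The example above leaves every weight at its default of 0. As a rough illustration of how weights add up along a path, here is a stand-alone sketch that uses a plain dictionary in the same layout instead of the WFST class. The weights 0.5 and 0.25 are made up for illustration, `walk` is a hypothetical helper, and epsilon arcs are left out to keep it short.

```python
# Same nested-dict layout as WFST.states, with made-up weights.
states = {
    '0': {'twenty': [('1', '2', 0.5)]},
    '1': {'one': [('2', '1', 0.25)]},
    '2': {},
}
final_states = {'2'}

def walk(states, final_states, start, input_sequence):
    """Follow matching arcs, concatenating outputs and summing weights."""
    paths = [(start, '', 0)]
    for symbol in input_sequence:
        paths = [
            (to_state, output + out_sym, weight + w)
            for (state, output, weight) in paths
            for (to_state, out_sym, w) in states[state].get(symbol, [])
        ]
    # Keep only the paths that stopped in a final state.
    return [path for path in paths if path[0] in final_states]

print(walk(states, final_states, '0', ['twenty', 'one']))  # [('2', '21', 0.75)]
```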
