In [1]:
from viterbi import viterbi
from find_best_parameters import find_best_parameters

In [2]:
emissions = tuple('01000000100000000000001000000010001010010001001001')

Write a program (which you will use in the next exercise) that computes the probability parameters from a known sequence of transitions and emissions. Input data:

    as a check, the data set from the lecture

        01000000100000000000001000000010001010010001001001
        PPPMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP 

    parameters for the most probable result..

        01000000100000000000001000000010001010010001001001
        MMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP 

    ..compare with the given ones from the figure

Try to forget that the model already has some parameters, and propose your own initial values for the probabilities of transitions (goalkeeper swaps) and emissions (saving or conceding a goal). Then apply the iterative Viterbi learning algorithm to this modified model and compare whether the resulting probabilities approach those from the figure.
PS: The probability parameters must of course satisfy the relevant constraints (such as summing to one and so on).
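
The first part of the assignment (parameters from a known sequence of inner states and emissions) amounts to counting and normalizing. The block below is a minimal sketch of that idea, not the implementation inside `find_best_parameters`; the function name `estimate_parameters` and its return shape are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def estimate_parameters(emissions, states):
    """Estimate transition and emission probabilities by relative frequency.

    Hypothetical helper: it only illustrates the counting step and is not
    the notebook's find_best_parameters.
    """
    transition_counts = defaultdict(Counter)
    emission_counts = defaultdict(Counter)

    # Count every observed state -> next-state pair.
    for current, following in zip(states, states[1:]):
        transition_counts[current][following] += 1

    # Count every observed state -> emission pair.
    for state, emission in zip(states, emissions):
        emission_counts[state][emission] += 1

    def normalize(counts):
        # Turn each row of counts into a distribution that sums to one.
        return {
            row: {key: value / sum(columns.values()) for key, value in columns.items()}
            for row, columns in counts.items()
        }

    return normalize(transition_counts), normalize(emission_counts)


emissions = "01000000100000000000001000000010001010010001001001"
states = "PPPMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP"
transitions, emission_probs = estimate_parameters(emissions, states)
print(transitions)
print(emission_probs)
```

On the lecture data this counting gives P -> P = 22/23 and M -> M = 25/26, which agrees with the first estimate printed in the run below.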


In [3]:
inner_states_presentation = tuple('PPPMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP')
inner_states_most_probable = tuple('MMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP')

In [4]:
def format_output(output):
    """Print start, transition and emission probabilities as percentages."""
    start_probabilities, transition_probabilities, emission_probabilities = output
    print("Starting probabilities:")
    for state, probability in start_probabilities.items():
        print(f"{state} : {round(probability * 100, 2)}%")

    print("Transition probabilities:")
    for from_state, targets in transition_probabilities.items():
        for to_state, probability in targets.items():
            print(f"{from_state} -> {to_state} : {round(probability * 100, 2)}%")

    print("Emission probabilities:")
    for state, symbols in emission_probabilities.items():
        for emission, probability in symbols.items():
            print(f"{state} -> {emission} : {round(probability * 100, 2)}%")

Predicted probabilities:

In [5]:
params = find_best_parameters(emissions, inner_states_presentation, max_iterations=100)
format_output(params)

Initial inner states: PPPMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP
#0: 
Probabilities estimates: 
{'P': 0.5, 'M': 0.5} {'P': {'P': 0.9565217391304348, 'M': 0.043478260869565216}, 'M': {'P': 0.038461538461538464, 'M': 0.9615384615384616}} {'P': {'1': 0.3333333333333333, '0': 0.6666666666666666}, 'M': {'1': 0.07692307692307693, '0': 0.9230769230769231}}
Most probable path: PMMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPP
#1: 
Probabilities estimates: 
{'P': 0.5, 'M': 0.5} {'P': {'P': 0.95, 'M': 0.05}, 'M': {'P': 0.034482758620689655, 'M': 0.9655172413793104}} {'P': {'1': 0.3333333333333333, '0': 0.6666666666666666}, 'M': {'1': 0.10344827586206896, '0': 0.896551724137931}}
Most probable path: PMMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPP
Total viterbi iterations: 2
Starting probabilities:
P : 50.0%
M : 50.0%
Transition probabilities:
P -> P : 95.0%
P -> M : 5.0%
M -> P : 3.45%
M -> M : 96.55%
Emission probabilities:
P -> 1 : 33.33%
P -> 0 : 66.67%
M -> 1 : 10.34%
M -> 0 : 89.66%

The algorithm converged very quickly, after only two iterations.
The immediate issue is that the second estimate of the inner states
still contains a P -> M transition at the very start of the path, which locks in
the predicted inner states and stabilizes the output parameters.
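
The loop that stabilizes here alternates two steps: re-estimate the parameters from the current path, then decode a new most probable path, and stop once the path no longer changes. Below is a self-contained sketch of that loop. It is not the code of the imported `viterbi` or `find_best_parameters` modules; all function names are hypothetical, the start distribution is assumed to stay uniform, and tie-breaking may differ from the notebook's implementation, so the decoded paths need not match the output above.

```python
import math
from collections import Counter


def estimate(emissions, path, states, symbols):
    """Relative-frequency estimates from one labelled path."""
    trans = {s: Counter() for s in states}
    emit = {s: Counter() for s in states}
    for current, following in zip(path, path[1:]):
        trans[current][following] += 1
    for state, symbol in zip(path, emissions):
        emit[state][symbol] += 1

    def normalize(counts, keys):
        rows = {}
        for s in states:
            total = sum(counts[s].values())
            # A state that never occurs keeps an all-zero row.
            rows[s] = {k: counts[s][k] / total if total else 0.0 for k in keys}
        return rows

    return normalize(trans, states), normalize(emit, symbols)


def safe_log(p):
    # Zero probabilities become -inf so they can never win the maximum.
    return math.log(p) if p > 0 else -math.inf


def viterbi_path(emissions, states, start, trans, emit):
    """Most probable state path, computed in log space."""
    score = {s: safe_log(start[s]) + safe_log(emit[s][emissions[0]]) for s in states}
    back = []
    for symbol in emissions[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + safe_log(trans[r][s]))
            score[s] = prev[best] + safe_log(trans[best][s]) + safe_log(emit[s][symbol])
            pointers[s] = best
        back.append(pointers)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))


def viterbi_training(emissions, path, max_iterations=100):
    """Alternate estimation and decoding until the path is a fixed point."""
    states = sorted(set(path))
    symbols = sorted(set(emissions))
    start = {s: 1 / len(states) for s in states}  # assumed uniform start
    for iteration in range(1, max_iterations + 1):
        trans, emit = estimate(emissions, path, states, symbols)
        new_path = viterbi_path(emissions, states, start, trans, emit)
        if new_path == path:
            break
        path = new_path
    return path, trans, emit, iteration


emissions = "01000000100000000000001000000010001010010001001001"
initial = "PPPMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP"
path, trans, emit, iterations = viterbi_training(emissions, initial)
print(iterations, path)
```

Because each decoded path fully determines the next parameter estimate, the first repeated path ends the procedure, which is why so few iterations are needed.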

In [6]:
optimal_path = viterbi(emissions, set(params[0].keys()), params[0], params[1], params[2])
print("O: " + "".join(emissions))
print("Final viterbi")
print("I: " + "".join(optimal_path))
print("Starting inner state sequence")
print("I: " + "".join(inner_states_presentation))

O: 01000000100000000000001000000010001010010001001001
Final viterbi
I: PMMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPP
Starting inner state sequence
I: PPPMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP


In [7]:
params_most_prob = find_best_parameters(emissions, inner_states_most_probable, max_iterations=100)
format_output(params_most_prob)

Initial inner states: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP
#0: 
Probabilities estimates: 
{'M': 0.5, 'P': 0.5} {'M': {'M': 0.9655172413793104, 'P': 0.034482758620689655}, 'P': {'M': 0.0, 'P': 1.0}} {'M': {'1': 0.10344827586206896, '0': 0.896551724137931}, 'P': {'1': 0.3333333333333333, '0': 0.6666666666666666}}
Most probable path: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP
#1: 
Probabilities estimates: 
{'M': 0.5, 'P': 0.5} {'M': {'M': 0.9655172413793104, 'P': 0.034482758620689655}, 'P': {'M': 0.0, 'P': 1.0}} {'M': {'1': 0.10344827586206896, '0': 0.896551724137931}, 'P': {'1': 0.3333333333333333, '0': 0.6666666666666666}}
Most probable path: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP
Total viterbi iterations: 2
Starting probabilities:
M : 50.0%
P : 50.0%
Transition probabilities:
M -> M : 96.55%
M -> P : 3.45%
P -> M : 0.0%
P -> P : 100.0%
Emission probabilities:
M -> 1 : 10.34%
M -> 0 : 89.66%
P -> 1 : 33.33%
P -> 0 : 66.67%


In [8]:
optimal_path_most_probable = viterbi(emissions, set(params_most_prob[0].keys()), params_most_prob[0], params_most_prob[1], params_most_prob[2])
print("O: " + "".join(emissions))
print("Final viterbi")
print("I: " + "".join(optimal_path_most_probable))
print("Starting inner state sequence")
print("I: " + "".join(inner_states_most_probable))

O: 01000000100000000000001000000010001010010001001001
Final viterbi
I: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP
Starting inner state sequence
I: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP


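We've reached almost the same maximum as before: the emission probabilities and the transitions out of M are identical, and only the transitions out of P differ (P -> M is 0% here versus 5% in the previous run).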

Repeat the previous exercise, but try setting all probabilities to 0.5. Do you arrive at the same result as from the original data?

In [13]:
probabilities = (
    {'M': 0.5, 'P': 0.5},
    {'P': {'P': 0.5, 'M': 0.5}, 'M': {'P': 0.5, 'M': 0.5}},
    {'P': {'1': 0.5, '0': 0.5}, 'M': {'1': 0.5, '0': 0.5}}
)
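
The assignment requires every distribution to sum to one, and the uniform values above satisfy that. A small check like the following can confirm it before training; this is a sketch, and the helper names `check_distribution` and `check_parameters` are hypothetical and not part of the imported modules.

```python
import math


def check_distribution(name, distribution):
    """Raise ValueError unless the probabilities are valid and sum to one."""
    if any(p < 0 or p > 1 for p in distribution.values()):
        raise ValueError(f"{name}: probability outside [0, 1]")
    total = sum(distribution.values())
    if not math.isclose(total, 1.0):
        raise ValueError(f"{name}: sums to {total}, expected 1")


def check_parameters(start, transitions, emissions):
    """Validate the start distribution and every transition and emission row."""
    check_distribution("start", start)
    for state, row in transitions.items():
        check_distribution(f"transitions from {state}", row)
    for state, row in emissions.items():
        check_distribution(f"emissions of {state}", row)


uniform = (
    {'M': 0.5, 'P': 0.5},
    {'P': {'P': 0.5, 'M': 0.5}, 'M': {'P': 0.5, 'M': 0.5}},
    {'P': {'1': 0.5, '0': 0.5}, 'M': {'1': 0.5, '0': 0.5}},
)
check_parameters(*uniform)  # passes silently
```

A row of all zeros, such as the one a never-visited state would receive from pure counting, fails this check.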

In [14]:
params = find_best_parameters(emissions, inner_states_presentation, max_iterations=100, initial_probabilities=probabilities)
format_output(params)

Initial inner states: PPPMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP
#0: 
Probabilities estimates: 
{'M': 0.5, 'P': 0.5} {'P': {'P': 0.5, 'M': 0.5}, 'M': {'P': 0.5, 'M': 0.5}} {'P': {'1': 0.5, '0': 0.5}, 'M': {'1': 0.5, '0': 0.5}}
Most probable path: PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP
#1: 
Probabilities estimates: 
{'P': 0.5, 'M': 0.5} {'P': {'P': 1.0, 'M': 0.0}, 'M': {'P': 0.0, 'M': 0.0}} {'P': {'1': 0.2, '0': 0.8}, 'M': {'1': 0.0, '0': 0.0}}
Most probable path: PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP
Total viterbi iterations: 2
Starting probabilities:
P : 50.0%
M : 50.0%
Transition probabilities:
P -> P : 100.0%
P -> M : 0.0%
M -> P : 0.0%
M -> M : 0.0%
Emission probabilities:
P -> 1 : 20.0%
P -> 0 : 80.0%
M -> 1 : 0.0%
M -> 0 : 0.0%


In [15]:
optimal_path = viterbi(emissions, set(params[0].keys()), params[0], params[1], params[2])
print("O: " + "".join(emissions))
print("Final viterbi")
print("I: " + "".join(optimal_path))
print("Starting inner state sequence")
print("I: " + "".join(inner_states_presentation))

O: 01000000100000000000001000000010001010010001001001
Final viterbi
I: PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP
Starting inner state sequence
I: PPPMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP


In [16]:
params = find_best_parameters(emissions, inner_states_most_probable, max_iterations=100, initial_probabilities=probabilities)
format_output(params)

Initial inner states: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP
#0: 
Probabilities estimates: 
{'M': 0.5, 'P': 0.5} {'P': {'P': 0.5, 'M': 0.5}, 'M': {'P': 0.5, 'M': 0.5}} {'P': {'1': 0.5, '0': 0.5}, 'M': {'1': 0.5, '0': 0.5}}
Most probable path: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
#1: 
Probabilities estimates: 
{'M': 0.5, 'P': 0.5} {'M': {'M': 1.0, 'P': 0.0}, 'P': {'M': 0.0, 'P': 0.0}} {'M': {'1': 0.2, '0': 0.8}, 'P': {'1': 0.0, '0': 0.0}}
Most probable path: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
Total viterbi iterations: 2
Starting probabilities:
M : 50.0%
P : 50.0%
Transition probabilities:
M -> M : 100.0%
M -> P : 0.0%
P -> M : 0.0%
P -> P : 0.0%
Emission probabilities:
M -> 1 : 20.0%
M -> 0 : 80.0%
P -> 1 : 0.0%
P -> 0 : 0.0%


In [17]:
optimal_path = viterbi(emissions, set(params[0].keys()), params[0], params[1], params[2])
print("O: " + "".join(emissions))
print("Final viterbi")
print("I: " + "".join(optimal_path))
print("Starting inner state sequence")
print("I: " + "".join(inner_states_most_probable))

O: 01000000100000000000001000000010001010010001001001
Final viterbi
I: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
Starting inner state sequence
I: MMMMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP


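In both cases we reached the same result as was shown in the presentation: starting from uniform probabilities of 0.5, the most probable path collapses to a single state (all P in the first run, all M in the second), so the parameters obtained from the original data are not recovered.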
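
One way to see why the uniform start collapses: when every start, transition and emission probability is 0.5, every state path has exactly the same joint probability, so the decoder's choice is determined purely by how it breaks ties. The sketch below checks this numerically; `path_log_probability` is a hypothetical helper written for this illustration.

```python
import math


def path_log_probability(emissions, path, start, transitions, emission_probs):
    """Joint log-probability of one state path together with the emissions."""
    log_p = math.log(start[path[0]]) + math.log(emission_probs[path[0]][emissions[0]])
    for previous, current, symbol in zip(path, path[1:], emissions[1:]):
        log_p += math.log(transitions[previous][current])
        log_p += math.log(emission_probs[current][symbol])
    return log_p


emissions = "01000000100000000000001000000010001010010001001001"
start = {'M': 0.5, 'P': 0.5}
transitions = {'P': {'P': 0.5, 'M': 0.5}, 'M': {'P': 0.5, 'M': 0.5}}
emission_probs = {'P': {'1': 0.5, '0': 0.5}, 'M': {'1': 0.5, '0': 0.5}}

candidates = [
    "P" * 50,
    "M" * 50,
    "PPPMMMMMMMMMMMMMMMMMMMMMMMMMMPPPPPPPPPPPPPPPPPPPPP",
]
scores = [
    path_log_probability(emissions, path, start, transitions, emission_probs)
    for path in candidates
]
print(scores)  # all three values are equal
```

Each path multiplies one start factor, 49 transition factors and 50 emission factors of 0.5, so the score is 100 * log(0.5) regardless of the path.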