### 1. Load and preprocess data

We'll start by loading a dataset. For this example, we'll create a small hand-tagged dataset: a list of tokenized sentences and a parallel list of tag sequences, one tag per token. In a real-world scenario, you would load a properly annotated corpus.
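
A minimal sketch of such a toy dataset is shown here. The three sentences are an assumption: they were reconstructed so that they are consistent with the probabilities printed in step 2, and may differ from the original notebook cell. The names `sentences` and `tags` match the ones used by the counting code in step 2.

```python
# Toy corpus: each sentence is a list of tokens, and each tag sequence
# is a parallel list holding one POS tag per token.
# Assumption: these sentences are reconstructed from the probabilities
# printed in step 2, not copied from the original notebook cell.
sentences = [
    ["The", "cat", "sat", "on", "the", "mat"],
    ["The", "dog", "ran", "away"],
    ["A", "bird", "flew", "high"],
]
tags = [
    ["DET", "NOUN", "VERB", "PREP", "DET", "NOUN"],
    ["DET", "NOUN", "VERB", "ADV"],
    ["DET", "NOUN", "VERB", "ADJ"],
]

# Sanity check: every sentence must align one-to-one with its tag sequence.
for sentence, tag_sequence in zip(sentences, tags):
    assert len(sentence) == len(tag_sequence)
```

Keeping tokens and tags in two parallel lists makes it easy to iterate over them together with `zip`.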

### 2. Estimate probabilities

Now, let's calculate the starting, transition, and emission probabilities from our dataset. We'll use simple frequency counts (maximum likelihood estimates) for this example. In practice, smoothing is often applied so that words and tag transitions never seen in training do not receive zero probability.
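
Written out, the frequency-count estimates take the following form. The notation is introduced here for illustration and is not from the original notebook: $C(\cdot)$ is a count over the training data, $t$ is a tag, $w$ is a word, $t_1$ is the first tag of a sentence, and $N$ is the number of sentences.

$$
P(t_i \mid t_{i-1}) = \frac{C(t_{i-1}, t_i)}{\sum_{t'} C(t_{i-1}, t')}, \qquad
P(w \mid t) = \frac{C(t, w)}{\sum_{w'} C(t, w')}, \qquad
\pi(t) = \frac{C(t_1 = t)}{N}
$$

Each estimate is a count divided by the total for its conditioning context, so the probabilities for a given previous tag, or for a given tag, sum to 1.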

In [5]:
from collections import defaultdict

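# Initialize counts
# transition_counts[prev_tag][tag]: how often `tag` follows `prev_tag`
transition_counts = defaultdict(lambda: defaultdict(int))
# emission_counts[tag][word]: how often `word` is tagged as `tag`
emission_counts = defaultdict(lambda: defaultdict(int))
# tag_counts[tag]: total number of occurrences of `tag`
tag_counts = defaultdict(int)
# starting_tag_counts[tag]: how often `tag` begins a sentence
starting_tag_counts = defaultdict(int)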

# Populate counts
for tag_sequence, sentence in zip(tags, sentences):
    starting_tag_counts[tag_sequence[0]] += 1
    for i in range(len(tag_sequence)):
        tag_counts[tag_sequence[i]] += 1
        emission_counts[tag_sequence[i]][sentence[i]] += 1
        if i > 0:
            transition_counts[tag_sequence[i-1]][tag_sequence[i]] += 1

# Calculate probabilities
transition_probabilities = defaultdict(lambda: defaultdict(float))
emission_probabilities = defaultdict(lambda: defaultdict(float))
starting_probabilities = defaultdict(float)

for tag, next_tags in transition_counts.items():
    total_transitions = sum(next_tags.values())
    for next_tag, count in next_tags.items():
        transition_probabilities[tag][next_tag] = count / total_transitions

for tag, words in emission_counts.items():
    total_emissions = sum(words.values())
    for word, count in words.items():
        emission_probabilities[tag][word] = count / total_emissions

total_starting_tags = sum(starting_tag_counts.values())
for tag, count in starting_tag_counts.items():
    starting_probabilities[tag] = count / total_starting_tags

print("\nTransition Probabilities:")
for tag, next_tags in transition_probabilities.items():
    print(f"{tag}: {dict(next_tags)}")

print("\nEmission Probabilities:")
for tag, words in emission_probabilities.items():
    print(f"{tag}: {dict(words)}")

print("\nStarting Probabilities:")
print(dict(starting_probabilities))


Transition Probabilities:
DET: {'NOUN': 1.0}
NOUN: {'VERB': 1.0}
VERB: {'PREP': 0.3333333333333333, 'ADV': 0.3333333333333333, 'ADJ': 0.3333333333333333}
PREP: {'DET': 1.0}

Emission Probabilities:
DET: {'The': 0.5, 'the': 0.25, 'A': 0.25}
NOUN: {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'bird': 0.25}
VERB: {'sat': 0.3333333333333333, 'ran': 0.3333333333333333, 'flew': 0.3333333333333333}
PREP: {'on': 1.0}
ADV: {'away': 1.0}
ADJ: {'high': 1.0}

Starting Probabilities:
{'DET': 1.0}
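
Because these estimates are unsmoothed, any word or transition that does not occur in the training data has probability zero. One common remedy is add-k (Laplace) smoothing. The sketch below illustrates the idea and is not part of the original notebook; the function name `smoothed_probabilities`, the parameter `k`, and the explicit `vocabulary` argument are all assumptions made for this example.

```python
def smoothed_probabilities(counts, vocabulary, k=1.0):
    """Add-k smoothing: P(x | c) = (count(c, x) + k) / (total(c) + k * |V|).

    `counts` maps a context (e.g. a tag) to a dict of outcome counts.
    `vocabulary` is the full set of outcomes that should receive mass.
    All names here are hypothetical, chosen for this sketch.
    """
    probabilities = {}
    for context, outcomes in counts.items():
        denominator = sum(outcomes.values()) + k * len(vocabulary)
        probabilities[context] = {
            outcome: (outcomes.get(outcome, 0) + k) / denominator
            for outcome in vocabulary
        }
    return probabilities


# Small hypothetical example: PREP was only ever followed by DET.
toy_counts = {"PREP": {"DET": 1}}
toy_tags = ["DET", "NOUN", "VERB"]
smoothed = smoothed_probabilities(toy_counts, toy_tags, k=1.0)
print(smoothed["PREP"])
```

With `k=1.0`, the observed transition keeps the largest share while the two unseen transitions each receive a small nonzero probability, and the distribution still sums to 1.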


### 3. Implement Viterbi algorithm

Now, let's implement the Viterbi algorithm to find the most likely tag sequence for a given sentence.
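
The following is a minimal sketch of one way to write it, not necessarily the original notebook's implementation. The function name `viterbi` and its parameters are assumptions. The probability tables are copied from the output printed in step 2, with `1 / 3` standing in for `0.3333333333333333`, so the block runs on its own.

```python
def viterbi(words, tag_set, start_p, trans_p, emit_p):
    """Return the most likely tag sequence for `words` under an HMM.

    Probabilities are nested dicts; missing entries count as 0.0.
    """
    if not words:
        return []

    # best[i][tag]: probability of the best path that ends in `tag` at i
    # back[i][tag]: the previous tag on that best path
    best = [{}]
    back = [{}]
    for tag in tag_set:
        emission = emit_p.get(tag, {}).get(words[0], 0.0)
        best[0][tag] = start_p.get(tag, 0.0) * emission
        back[0][tag] = None

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for tag in tag_set:
            emission = emit_p.get(tag, {}).get(words[i], 0.0)
            # Choose the predecessor that maximizes the path probability.
            prev_tag, score = max(
                ((prev, best[i - 1][prev] * trans_p.get(prev, {}).get(tag, 0.0))
                 for prev in tag_set),
                key=lambda pair: pair[1],
            )
            best[i][tag] = score * emission
            back[i][tag] = prev_tag

    # Backtrack from the most probable final tag.
    path = [max(best[-1], key=best[-1].get)]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Probability tables copied from the output printed in step 2.
tag_set = ["DET", "NOUN", "VERB", "PREP", "ADV", "ADJ"]
start_p = {"DET": 1.0}
trans_p = {
    "DET": {"NOUN": 1.0},
    "NOUN": {"VERB": 1.0},
    "VERB": {"PREP": 1 / 3, "ADV": 1 / 3, "ADJ": 1 / 3},
    "PREP": {"DET": 1.0},
}
emit_p = {
    "DET": {"The": 0.5, "the": 0.25, "A": 0.25},
    "NOUN": {"cat": 0.25, "mat": 0.25, "dog": 0.25, "bird": 0.25},
    "VERB": {"sat": 1 / 3, "ran": 1 / 3, "flew": 1 / 3},
    "PREP": {"on": 1.0},
    "ADV": {"away": 1.0},
    "ADJ": {"high": 1.0},
}

predicted = viterbi(["The", "dog", "ran", "away"], tag_set, start_p, trans_p, emit_p)
print(predicted)  # ['DET', 'NOUN', 'VERB', 'ADV']
```

This sketch multiplies raw probabilities, which is fine for short sentences. For longer inputs, summing log probabilities is a common way to avoid floating-point underflow. Note also that a word missing from every emission table makes all paths score zero, which is one reason smoothing matters.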

### 4. Evaluate the tagger

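Now, let's evaluate the performance of our Viterbi POS tagger. We'll use the provided dataset as a simple test set. Keep in mind that scoring a tagger on sentences it was trained on overstates its accuracy on unseen text.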
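
A common metric for this is token-level accuracy: the fraction of tokens whose predicted tag matches the gold tag. The sketch below shows one way to compute it. The function name `token_accuracy` is an assumption, and the gold and predicted sequences are made up for illustration; they are not results from the tagger.

```python
def token_accuracy(gold_sequences, predicted_sequences):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for gold, predicted in zip(gold_sequences, predicted_sequences):
        if len(gold) != len(predicted):
            raise ValueError("gold and predicted sequences must align")
        correct += sum(g == p for g, p in zip(gold, predicted))
        total += len(gold)
    # Avoid dividing by zero when there are no tokens to score.
    return correct / total if total else 0.0


# Hypothetical gold and predicted tags, made up for illustration only.
gold = [["DET", "NOUN", "VERB", "ADV"], ["DET", "NOUN", "VERB", "ADJ"]]
predicted = [["DET", "NOUN", "VERB", "ADV"], ["DET", "NOUN", "VERB", "ADV"]]
accuracy = token_accuracy(gold, predicted)
print(f"Token accuracy: {accuracy:.3f}")
```

In this made-up example, seven of the eight tokens match, so the accuracy is 0.875.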