<h1 style="color:red; text-align:center; text-decoration:underline;">Hidden Markov Models (HMM)</h1>


<h2 style="color:green; text-decoration:underline;">Test 1: Recognizing simple sequences</h2>


In [16]:
from hmmlearn.hmm import CategoricalHMM
import numpy as np

# Words and their indices
word_to_idx = {
    "fish": 0, "swim": 1, "in": 2, "water": 3,
    "cats": 4, "eat": 5, "dogs": 6, "run": 7,
    "park": 8, "birds": 9, "fly": 10, "over": 11, "trees": 12
}
idx_to_word = {v: k for k, v in word_to_idx.items()}

# Grammatical tags
tag_to_idx = {"NOUN": 0, "VERB": 1, "PREP": 2}
idx_to_tag = {v: k for k, v in tag_to_idx.items()}

# Sentences to test (word sequences)
phrases = [
    ["fish", "swim", "in", "water"],
    ["cats", "eat", "fish"],
    ["dogs", "run", "in", "park"],
    ["birds", "fly", "over", "trees"]
]

# Initialize the HMM with 3 states: NOUN, VERB, PREP
model = CategoricalHMM(n_components=3, n_iter=100, random_state=0)

# Initial tag probabilities
model.startprob_ = np.array([0.6, 0.3, 0.1])  # a sentence most likely starts with a noun

# Transition probabilities between tags (simple but realistic)
model.transmat_ = np.array([
    [0.2, 0.6, 0.2],  # NOUN → NOUN, VERB, PREP
    [0.3, 0.3, 0.4],  # VERB → NOUN, VERB, PREP
    [0.6, 0.3, 0.1],  # PREP → NOUN, VERB, PREP
])

# Emission probabilities: P(word | tag)
model.emissionprob_ = np.zeros((3, len(word_to_idx)))

# NOUNs
for word in ["fish", "water", "cats", "dogs", "park", "birds", "trees"]:
    model.emissionprob_[tag_to_idx["NOUN"], word_to_idx[word]] = 1.0
model.emissionprob_[tag_to_idx["NOUN"]] /= model.emissionprob_[tag_to_idx["NOUN"]].sum()

# VERBs
for word in ["swim", "eat", "run", "fly"]:
    model.emissionprob_[tag_to_idx["VERB"], word_to_idx[word]] = 1.0
model.emissionprob_[tag_to_idx["VERB"]] /= model.emissionprob_[tag_to_idx["VERB"]].sum()

# PREPs
for word in ["in", "over"]:
    model.emissionprob_[tag_to_idx["PREP"], word_to_idx[word]] = 1.0
model.emissionprob_[tag_to_idx["PREP"]] /= model.emissionprob_[tag_to_idx["PREP"]].sum()

# Test each sentence
for phrase in phrases:
    obs = np.array([[word_to_idx[w]] for w in phrase])
    logprob, states = model.decode(obs, algorithm="viterbi")
    tags = [idx_to_tag[s] for s in states]
    print("Sentence:", " ".join(phrase))
    print("Tags    :", " ".join(tags))
    print("-" * 40)


Sentence: fish swim in water
Tags    : NOUN VERB PREP NOUN
----------------------------------------
Sentence: cats eat fish
Tags    : NOUN VERB NOUN
----------------------------------------
Sentence: dogs run in park
Tags    : NOUN VERB PREP NOUN
----------------------------------------
Sentence: birds fly over trees
Tags    : NOUN VERB PREP NOUN
----------------------------------------


<h3 style="color:#0056b3; text-decoration:underline;">Result</h3>

The HMM assigned the expected tag (NOUN, VERB, PREP) to every word in the four test sentences.  
The Viterbi algorithm returned the most probable hidden state sequence for each sentence. Because each word in this vocabulary can be emitted by only one tag, the emission probabilities alone fix the tagging here, so the test mainly confirms that the model behaves as intended on simple, unambiguous sequences.
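
To make the decoding step more concrete, here is a minimal NumPy sketch of the Viterbi recursion applied to the Test 1 parameters. The function name `viterbi` and the helper variables (`words`, `tag_words`) are illustrative assumptions, not part of hmmlearn. The sketch works in log space and only shows the idea behind `model.decode(..., algorithm="viterbi")`; it is not the library's implementation.

```python
import numpy as np

def viterbi(obs, startprob, transmat, emissionprob):
    """Return (log probability, most likely state path) for a list of symbol indices."""
    # log(0) = -inf is intended here: it marks impossible events.
    with np.errstate(divide="ignore"):
        log_start = np.log(startprob)
        log_trans = np.log(transmat)
        log_emit = np.log(emissionprob)

    n_steps, n_states = len(obs), len(startprob)
    delta = np.empty((n_steps, n_states))             # best log score ending in each state
    backptr = np.zeros((n_steps, n_states), dtype=int)

    delta[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, n_steps):
        # scores[i, j]: best score of being in state i at t-1, then moving to state j
        scores = delta[t - 1][:, None] + log_trans
        backptr[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[:, obs[t]]

    # Follow the back-pointers from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(n_steps - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    path.reverse()
    return float(delta[-1].max()), path

# Test 1 parameters, rebuilt here so the sketch is self-contained.
words = ["fish", "swim", "in", "water", "cats", "eat", "dogs",
         "run", "park", "birds", "fly", "over", "trees"]
tags = ["NOUN", "VERB", "PREP"]
tag_words = {
    "NOUN": ["fish", "water", "cats", "dogs", "park", "birds", "trees"],
    "VERB": ["swim", "eat", "run", "fly"],
    "PREP": ["in", "over"],
}
startprob = np.array([0.6, 0.3, 0.1])
transmat = np.array([[0.2, 0.6, 0.2],
                     [0.3, 0.3, 0.4],
                     [0.6, 0.3, 0.1]])
emissionprob = np.zeros((len(tags), len(words)))
for i, tag in enumerate(tags):
    for w in tag_words[tag]:
        emissionprob[i, words.index(w)] = 1.0 / len(tag_words[tag])

sentence = ["fish", "swim", "in", "water"]
obs = [words.index(w) for w in sentence]
logprob, path = viterbi(obs, startprob, transmat, emissionprob)
print(" ".join(tags[s] for s in path))  # NOUN VERB PREP NOUN
```

Working with log probabilities turns the products into sums, which avoids numerical underflow on longer sequences.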

<h2 style="color:green; text-decoration:underline;">Test 2: Recognizing words from sound sequences</h2>


In [17]:
from hmmlearn.hmm import CategoricalHMM
import numpy as np

# Hidden states = letters of the word
letter_to_idx = {"H": 0, "A": 1, "L": 2, "R": 3}
idx_to_letter = {v: k for k, v in letter_to_idx.items()}

# Observable sounds
sound_to_idx = {"ha": 0, "la": 1, "ra": 2}
idx_to_sound = {v: k for k, v in sound_to_idx.items()}

# HMM: 4 possible hidden states (letters)
model = CategoricalHMM(n_components=4, n_iter=100, random_state=42)
model.startprob_ = np.array([0.25, 0.25, 0.25, 0.25])  # uniform

# Simple transitions between letters
model.transmat_ = np.array([
    [0.1, 0.6, 0.2, 0.1],  # H → H, A, L, R
    [0.2, 0.2, 0.4, 0.2],  # A → H, A, L, R
    [0.3, 0.3, 0.2, 0.2],  # L → H, A, L, R
    [0.2, 0.4, 0.3, 0.1],  # R → H, A, L, R
])

# Probability that a state (letter) emits each sound
model.emissionprob_ = np.array([
    [0.8, 0.1, 0.1],  # H → ha
    [0.3, 0.4, 0.3],  # A → any sound
    [0.1, 0.8, 0.1],  # L → la
    [0.1, 0.1, 0.8],  # R → ra
])

# Sound sequences to test
phrases_sons = [
    ["ha", "la", "ra"],   # HAL ?
    ["la", "ra", "ha"],   # LAR ?
    ["ra", "la", "ha"],   # RAH ?
    ["la", "la", "ra"],   # LLR ?
    ["ra", "ra", "la"],   # RRL ?
]

print("🔊 WORD RECOGNITION SIMULATION 🔊\n")

for sounds in phrases_sons:
    obs_seq = np.array([[sound_to_idx[s]] for s in sounds])
    logprob, states = model.decode(obs_seq, algorithm="viterbi")
    lettres = [idx_to_letter[i] for i in states]
    print(f"Observed sounds: {' '.join(sounds)}")
    print(f"Recognized word: {''.join(lettres)}")
    print("-" * 40)


🔊 WORD RECOGNITION SIMULATION 🔊

Observed sounds: ha la ra
Recognized word: HAR
----------------------------------------
Observed sounds: la ra ha
Recognized word: LRH
----------------------------------------
Observed sounds: ra la ha
Recognized word: RLH
----------------------------------------
Observed sounds: la la ra
Recognized word: LLR
----------------------------------------
Observed sounds: ra ra la
Recognized word: RAL
----------------------------------------


<h3 style="color:#0056b3; text-decoration:underline;">Result</h3>
The HMM decoded each uncertain sound sequence into its most probable letter sequence.  
Using the Viterbi algorithm, it weighed every sound against the transition and emission probabilities and returned one letter per sound, for example HAR for "ha la ra" and RAL for "ra ra la".  
The decoded strings follow the hand-set probabilities rather than a dictionary, so they are not always real words; the test still shows how an HMM resolves ambiguity in sound-pattern recognition and linguistic decoding tasks.
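
As a small worked check of why "ha la ra" decodes to HAR rather than the guessed HAL, the sketch below multiplies out the joint probability of each candidate path under the Test 2 parameters. The helper `path_probability` is a hypothetical name introduced for illustration; it is not an hmmlearn function.

```python
import numpy as np

letters = ["H", "A", "L", "R"]
sounds = ["ha", "la", "ra"]

# Test 2 parameters, copied from the cell above.
startprob = np.full(4, 0.25)
transmat = np.array([
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.2, 0.4, 0.2],
    [0.3, 0.3, 0.2, 0.2],
    [0.2, 0.4, 0.3, 0.1],
])
emissionprob = np.array([
    [0.8, 0.1, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
])

def path_probability(word, heard):
    """Joint probability P(letters, sounds) of one specific hidden path."""
    states = [letters.index(c) for c in word]
    obs = [sounds.index(s) for s in heard]
    prob = startprob[states[0]] * emissionprob[states[0], obs[0]]
    for prev, cur, o in zip(states, states[1:], obs[1:]):
        prob *= transmat[prev, cur] * emissionprob[cur, o]
    return prob

heard = ["ha", "la", "ra"]
p_hal = path_probability("HAL", heard)
p_har = path_probability("HAR", heard)
print(f"P(HAL, sounds) = {p_hal:.5f}")  # 0.00192
print(f"P(HAR, sounds) = {p_har:.5f}")  # 0.00768
```

The two paths differ only in the last step: A → L followed by L emitting "ra" contributes 0.4 × 0.1, while A → R followed by R emitting "ra" contributes 0.2 × 0.8, which makes HAR four times more probable.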