Steps to train the word2vec model on the provided text:

1/ Text preprocessing: tokenize and clean the text.

2/ Training the word2vec model: use Gensim to build the model.

3/ Using the trained model: extract the vector representation of a word, compute the similarity between two words, and extract the context words for a given word.

In [1]:
from gensim.models import Word2Vec
from nltk.tokenize import word_tokenize
import nltk

In [2]:
nltk.download('punkt')

# Text used to train the model
text = """Morocco and Marrakech: A Tapestry of Tradition and Modernity Morocco, located at the crossroads of Europe and Africa, is a country drenched in history, mystery, and cultural richness. A testament to the ancient civilizations that once flourished here, this North African kingdom boasts a unique blend of Arab, Berber, and European influences. At the heart of Morocco's rich tapestry lies Marrakech, one of its four imperial cities and a vibrant epicenter of tradition and modernity. Geographical Significance Morocco is bordered by the Atlantic Ocean to the west, the Mediterranean Sea to the north, Algeria to the east and southeast, and the vast Sahara desert to the south. Its strategic location has historically made it a sought-after territory and a melting pot of cultures, religions, and trade routes. Marrakech: The Red City Marrakech, often referred to as "The Red City" due to its distinctive red-hued buildings, stands against the backdrop of the snow-capped Atlas Mountains. Established in the 11th century, it has remained a crucial political, economic, and cultural center of Morocco. Journey through the Medina Marrakech's old town, the Medina, is a UNESCO World Heritage site and a labyrinthine maze of narrow alleys, bustling souks, and historical landmarks. The Djemaa el-Fna Square lies at the heart of the Medina and comes alive every evening with storytellers, musicians, snake charmers, and food stalls offering tantalizing Moroccan delicacies. Palaces and Gardens The city is also home to grand palaces like the Bahia Palace, showcasing intricate Islamic architecture, and the Saadian Tombs, remnants of the Saadian dynasty. The Majorelle Garden, restored by the fashion designer Yves Saint Laurent, is a tranquil oasis of cacti, palm trees, and cobalt blue accents. Modern Marrakech While tradition and history permeate Marrakech, the city is not averse to the modern world. 
Gueliz, the new town, is brimming with contemporary art galleries, stylish cafes, and chic boutiques, offering a stark contrast to the ancient Medina. Moroccan Cuisine No journey through Morocco and Marrakech would be complete without indulging in the local cuisine. Tagines, couscous, and pastilla are just a few of the many dishes that combine a plethora of flavors and spices like saffron, cumin, and mint. Paired with Moroccan mint tea, the culinary experience is truly unparalleled. In Conclusion Morocco, with Marrakech at its heart, offers travelers an unparalleled journey through time. The convergence of history, culture, architecture, and gastronomy makes it an enthralling destination for those seeking both adventure and reflection. As the Moroccan proverb goes, "He who does not travel does not know the value of men." In the case of Morocco and Marrakech, it's not just the value of men, but also the value of time, tradition, and tales that have spanned centuries."""


[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\imane\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping tokenizers\punkt.zip.


In [3]:
# Tokenize the text (lowercased first, so every vocabulary entry is lowercase)
tokens = word_tokenize(text.lower())
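
The preprocessing step mentions cleaning, but the cell above only lowercases and tokenizes, so punctuation tokens such as "," stay in the vocabulary. A minimal sketch of one possible cleaning rule follows. The helper name `clean_tokens` and the rule itself (keep tokens that contain at least one letter or digit) are assumptions for illustration, not part of the original notebook, and the outputs recorded further down were produced without this filter.

```python
def clean_tokens(tokens):
    """Keep only tokens that contain at least one letter or digit.

    Hypothetical helper: drops pure-punctuation tokens such as "," or "."
    while keeping words like "red-hued" or "11th".
    """
    return [tok for tok in tokens if any(ch.isalnum() for ch in tok)]


# Small hand-written token list standing in for the output of word_tokenize
sample = ["morocco", ",", "is", "``", "red-hued", "."]
print(clean_tokens(sample))
```

Passing `clean_tokens(tokens)` to the training step instead of `tokens` would change the vocabulary and therefore the similarity values.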

In [4]:
# Train the Word2Vec model (sg=1 selects skip-gram; the whole text is passed as a single sentence)
model = Word2Vec([tokens], vector_size=100, window=5, min_count=1, sg=1)
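
With `sg=1` and `window=5`, the model learns from (center word, context word) pairs taken from a window around each token. The sketch below enumerates such pairs with a fixed window to make the idea concrete. It is a simplified illustration: `skipgram_pairs` is a hypothetical helper, and Gensim's actual training loop differs in its details (for example, the effective window size can vary during training).

```python
def skipgram_pairs(tokens, window=5):
    """Return (center, context) pairs using a fixed symmetric window.

    Simplified illustration of how skip-gram training examples are formed.
    """
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


print(skipgram_pairs(["the", "red", "city"], window=1))
```

Each token is paired with its neighbours on both sides, so a larger `window` produces more pairs per token.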

In [5]:
# Function to extract the vector representation of a word
def get_word_vector(word):
    return model.wv[word]
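
Because the text was lowercased before training, a lookup such as `model.wv["Morocco"]` raises a `KeyError`, as does any word that never appeared in the text. A minimal sketch of a more forgiving lookup follows. The name `get_word_vector_or_none` is hypothetical, and the function only assumes that the object passed in supports `in` and `[]`, which `model.wv` does. A plain dictionary stands in for the trained vectors here so the example runs on its own.

```python
def get_word_vector_or_none(keyed_vectors, word):
    """Return the vector for `word`, or None if it is not in the vocabulary.

    The word is lowercased first to match the lowercased training tokens.
    """
    key = word.lower()
    if key in keyed_vectors:
        return keyed_vectors[key]
    return None


# Toy stand-in for model.wv, with made-up two-dimensional vectors
toy_vectors = {"morocco": [0.1, 0.2], "marrakech": [0.3, 0.4]}
print(get_word_vector_or_none(toy_vectors, "Morocco"))
print(get_word_vector_or_none(toy_vectors, "paris"))
```

With the trained model, the call would be `get_word_vector_or_none(model.wv, "Morocco")`.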

In [6]:
# Function to compute the similarity between two words
def calculate_similarity(word1, word2):
    return model.wv.similarity(word1, word2)
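
The value returned by `model.wv.similarity` is the cosine similarity between the two word vectors. The sketch below computes the same quantity by hand on plain Python lists, to show what the number means. The function name `cosine_similarity` and the toy vectors are illustrative only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v (sequences of numbers)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

The result lies between -1 and 1: values near 1 mean the vectors point in nearly the same direction, and values near 0 mean they are close to orthogonal.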

In [7]:
# Function to extract the context words for a given center word
# (most_similar returns the words whose vectors are closest by cosine similarity)
def get_contextual_words(word, topn=5):
    return model.wv.most_similar(word, topn=topn)
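
`most_similar` ranks words by vector similarity, which is not quite the same as listing the words that actually occur around the center word in the text. As a point of comparison, the sketch below counts the literal neighbours inside a window. `context_words` is a hypothetical helper and does not use the trained model at all.

```python
from collections import Counter


def context_words(tokens, center, window=5, topn=5):
    """Count the tokens that appear within `window` positions of `center`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != center:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the center occurrence itself
                counts[tokens[j]] += 1
    return counts.most_common(topn)


sample = ["red", "city", "red", "city", "walls"]
print(context_words(sample, "city", window=1))
```

On the notebook's data, the call would be `context_words(tokens, "city")`, and the two lists can be compared side by side.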

In [8]:
# Usage examples
word = "morocco"
print("Vector representation of '{}':".format(word))
print(get_word_vector(word))
print()

Vector representation of 'morocco':
[-1.0787104e-02  1.1357489e-02  4.7213864e-03  1.0605202e-02
  7.6770317e-03 -1.4883145e-03  1.3278982e-02  2.1486816e-03
 -1.1675926e-02  2.6663290e-03  3.6973555e-03 -8.6488556e-03
  8.4184278e-03 -2.9588938e-03  9.7317575e-03 -2.2813133e-03
 -6.0239160e-03 -5.5977260e-03 -8.4881019e-03 -9.4352737e-03
  1.3128738e-03  8.5188476e-03  1.2491137e-02 -9.0254219e-03
  4.6671103e-03  1.8947250e-03 -3.5687841e-03 -1.0920721e-02
 -1.3277067e-03 -9.2583150e-03  9.1907745e-03 -4.9551674e-03
  7.7742212e-03  9.6808234e-03 -1.3991615e-03  8.6256331e-03
 -5.6434353e-04  5.8940351e-03  2.5633541e-03 -1.3004911e-02
 -1.9822021e-03 -6.2569970e-04 -1.5414362e-03 -4.2665092e-04
  1.1741413e-03 -1.4533347e-03  1.2447486e-03  1.9608380e-03
 -3.6037229e-03 -2.9710140e-03  2.1029406e-03 -2.4319175e-03
 -8.0459012e-04 -4.3499409e-03  1.8931287e-03 -1.9703254e-04
  3.3160583e-03  3.7160804e-05  2.2861038e-03 -6.2813764e-03
 -5.8252569e-03 -5.4448931e-03  1.0642011e-

In [9]:
word1 = "morocco"
word2 = "marrakech"
print("Similarity between '{}' and '{}':".format(word1, word2))
print(calculate_similarity(word1, word2))
print()

Similarity between 'morocco' and 'marrakech':
0.322483



In [10]:
central_word = "city"
print("Context words for '{}':".format(central_word))
print(get_contextual_words(central_word))

Context words for 'city':
[('also', 0.35807088017463684), ('modern', 0.34658390283584595), ('goes', 0.295348584651947), ('of', 0.2804988622665405), (',', 0.2797464430332184)]
