# `NLP for Digital Humanities`
# `Tanagra (version 2)`

##### [Motasem Alrahabi](https://obtic.sorbonne-universite.fr/alrahabi/index.html), [ObTIC](https://obtic.sorbonne-universite.fr/) - [Sorbonne Université](https://www.sorbonne-universite.fr/)

2025

Shortlink: https://rb.gy/pg0tb8

# `Introduction`:
This notebook identifies, geolocates, and maps place names in texts, and associates each one with a positive, negative, or neutral sentiment based on its context.

Emotional mapping of place names associates emotions with specific geographic locations. It can be used to explore how places are perceived and felt by the people who live in or visit them.

Examples:
*   I love Paris...
*   I visited London and had a very good time there.
*   ...


Example projects: [Victorian London](https://www.historypin.org/en/victorian-london/geo/51.5128,-0.116085,12/bounds/51.423936,-0.222,51.601491,-0.01017/paging/1), [Mapping Arabia](https://www.google.com/maps/d/u/0/edit?hl=fr&mid=1M--Wq2CJcCPjIccrm5h8LYoTANTTH4cc&ll=48.84129625249132%2C2.356532365466477&z=13), [Palladio](https://hdlab.stanford.edu/palladio/), etc.

---

### **Method:**

To generate a map of the place names found in a text, we follow these steps:

1- **`Extracting place names from the text`** using NLP techniques to detect geographical entities in the text: city names, country names, rivers, mountains, etc.

--> Challenges: Coverage - geographical entities are an open list (unknown words, graphical and historical variations, abbreviations); Granularity and delimitation - nesting, scope (e.g., "jardin du Luxembourg"); Homonymy - Paris (France) vs. Paris Hilton; Orange (city) vs. Orange (fruit); Metonymy - Charles de Gaulle (person, airport, aircraft carrier, etc.).

2- **`Geocoding`** : sending place names to a geocoding service to convert them into geographical coordinates (latitude and longitude). This can be done using a geocoding library or by making queries to an online geocoding service.

--> Challenges: Disambiguating homonymous place names, e.g., Paris (France) vs. Paris (United States); Tunis (country) or Tunis (capital). Entity linking can also be performed, connecting one or more mentions to a unique reference (e.g., 1 av. de l'Exemple, Paris -> 1 avenue de l'Exemple, 75005 Paris). Databases like Wikidata are used for this purpose.

3- **`Displaying on a map`** : showing the geographical coordinates of places on a map using a mapping library such as Leaflet, Google Maps, Mapbox, etc. Markers can be created on the map for each place, and the place name can be displayed as a tooltip or label.
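
Step 2 can be sketched with the standard library alone. The snippet below is an illustration, not the notebook's implementation: `GAZETTEER`, `geocode`, and `remote_lookup` are hypothetical names, the coordinates and population figures are approximate and only there for the example, and choosing the most populous homonym is just one possible disambiguation heuristic.

```python
from typing import Callable, Dict, List, Optional, Tuple

Coordinates = Tuple[float, float]

# Hypothetical mini-gazetteer: lowercased name -> candidate places.
# Coordinates and populations are approximate, for illustration only.
GAZETTEER: Dict[str, List[dict]] = {
    "paris": [
        {"country": "FR", "lat": 48.8566, "lon": 2.3522, "population": 2_100_000},
        {"country": "US", "lat": 33.6609, "lon": -95.5555, "population": 25_000},
    ],
    "london": [
        {"country": "GB", "lat": 51.5074, "lon": -0.1278, "population": 8_900_000},
    ],
}


def geocode(
    name: str,
    remote_lookup: Optional[Callable[[str], Optional[Coordinates]]] = None,
    country: Optional[str] = None,
) -> Optional[Coordinates]:
    """Look a place name up locally first, then fall back to a remote service."""
    candidates = GAZETTEER.get(name.strip().lower(), [])
    if country is not None:
        # An explicit country hint is the simplest way to disambiguate.
        candidates = [c for c in candidates if c["country"] == country]
    if candidates:
        # Assumed heuristic: prefer the most populous homonym.
        best = max(candidates, key=lambda c: c["population"])
        return (best["lat"], best["lon"])
    if remote_lookup is not None:
        # Only unknown names reach the (slower, rate-limited) remote geocoder.
        return remote_lookup(name)
    return None


print(geocode("Paris"))                # the French capital wins by population
print(geocode("Paris", country="US"))  # the hint selects the homonym in the United States
```

Passing the remote geocoder as a function keeps the local lookup testable without network access.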

---

  ### **Steps:**
  The code performs the following steps:
  
  1.   The input text is segmented into sentences (with spaCy).
  2.   Each sentence is searched for LOC and GPE named entities (with spaCy). The model can be replaced with another one, and the user can customize a blacklist of unwanted terms.
  3.   For each entity found, the script looks up its geographical coordinates. It first searches the city list of the Geonamescache database (GeoNames is a global database of places: cities, rivers, mountains...). If there is no result, it queries the Nominatim server to geocode the location (Nominatim is a geocoding service based on the collaborative OpenStreetMap project).
  4.   If at least one named entity is found in a sentence, a function classifies the sentiment of the whole sentence as positive, negative, or neutral (using Bert-base-multilingual-uncased-sentiment, which can be replaced).
  5.   Using the Folium library (based on Leaflet, an open-source JavaScript library for creating interactive maps), the script displays the named entities on a map. The color of each marker reflects the dominant sentiment of the entity over the whole text: green for positive, red for negative, gray for neutral, and blue-violet when no sentiment dominates. The size of the marker is proportional to the number of occurrences of the entity in the whole text. For each marker, a popup shows the place name, the number of occurrences, and the number of positive, negative, and neutral sentences in the input text.
  6.   The script saves each entity in a tab-separated output CSV file with the columns: sentence, entity, label, score, latitude, and longitude.
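
Steps 4 and 5 can be illustrated with a short, self-contained sketch. It assumes the star labels returned by the sentiment model used in this notebook (`1 star` to `5 stars`) and the color scheme described above; the function names `overall_sentiment` and `marker_color` are hypothetical and only serve the example.

```python
from collections import Counter
from typing import Iterable

# Star labels grouped into three polarities.
STAR_TO_POLARITY = {
    "1 star": "Negative",
    "2 stars": "Negative",
    "3 stars": "Neutral",
    "4 stars": "Positive",
    "5 stars": "Positive",
}
POLARITY_TO_COLOR = {"Positive": "green", "Negative": "red", "Neutral": "gray"}


def overall_sentiment(star_labels: Iterable[str]) -> str:
    """Return the polarity that strictly dominates, or 'Mixed' on a tie."""
    counts = Counter(STAR_TO_POLARITY[label] for label in star_labels)
    ranked = counts.most_common()
    if not ranked:
        return "Mixed"  # no label at all: nothing dominates
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "Mixed"  # the two most frequent polarities are tied
    return ranked[0][0]


def marker_color(star_labels: Iterable[str]) -> str:
    """Map the dominant polarity to a marker color (blue-violet for 'Mixed')."""
    return POLARITY_TO_COLOR.get(overall_sentiment(list(star_labels)), "blueviolet")


print(overall_sentiment(["5 stars", "4 stars", "1 star"]))
print(marker_color(["5 stars", "1 star"]))
```

A strict majority is required here, so an entity mentioned once positively and once negatively is reported as mixed rather than arbitrarily assigned to one side.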

---

In [None]:
%%capture
!pip install -q folium
import folium
from folium import plugins
from datetime import datetime, timedelta
import matplotlib.pyplot as plt

## `1) Load the named entity recognition model`:

In [None]:
%%capture
!pip install -q spacy
! python -m spacy download fr_core_news_md
#! python -m spacy download en_core_web_sm

import spacy
from spacy import displacy
from spacy.pipeline import EntityRuler

nlp = spacy.load('fr_core_news_md')
#nlp = spacy.load('en_core_web_sm')
nlp.max_length = 1500000 # for large files: spaCy's default limit is 1,000,000 characters

## `2) Load a list of named entity terms to "exclude"`:


In [None]:
blacklist = ["XVIIIe", "XVIe"] # example entries only

# The list can also be read from an external file:

#with open("blacklist.txt", "r", encoding='utf-8') as file:
#  blacklist = [word.strip() for word in file.readlines()]

# Or uploaded through the Colab file picker:
#from google.colab import files, output
#print("Please upload the blacklist:")
#uploaded_blacklist = files.upload()
#blacklist = []
#if len(uploaded_blacklist.keys()) > 0:
#    blacklist_filename = list(uploaded_blacklist.keys())[0]
#    blacklist_content = uploaded_blacklist[blacklist_filename].decode("utf-8")
#    blacklist = blacklist_content.splitlines()
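
The filtering in step 6 lowercases the entity text before looking it up, so blacklist entries are easiest to match when they are stored in lowercase as well. The sketch below shows one way to normalize a blacklist read from a file or typed by hand; `normalize_blacklist` and `is_blacklisted` are hypothetical helper names, not part of the notebook.

```python
from typing import Iterable, Set


def normalize_blacklist(lines: Iterable[str]) -> Set[str]:
    """Strip whitespace, drop empty lines, and lowercase every entry."""
    return {line.strip().lower() for line in lines if line.strip()}


def is_blacklisted(entity_text: str, blacklist_terms: Set[str]) -> bool:
    """Case-insensitive membership test against the normalized blacklist."""
    return entity_text.strip().lower() in blacklist_terms


# Raw lines as they might come from readlines() or splitlines().
terms = normalize_blacklist(["XVIIIe\n", "  XVIe ", "", "\n"])
print(sorted(terms))
print(is_blacklisted("XVIIIe", terms), is_blacklisted("Paris", terms))
```

Using a set also makes each lookup constant-time, which matters when the blacklist grows.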

## `3) Load a list of named entity terms to "include"`:

In [None]:
patterns = [
    {"label": "GPE", "pattern": "a11bc22"},
    {"label": "GPE", "pattern": [{"LOWER": "chez"}, {"LOWER": "moi"}]} # adding {"OP": "*"} would allow multiword expressions
    #{"label": "ANNEE", "pattern": [{"TEXT": {"REGEX": r"^\d{4}$"}}]},  # years (e.g. 2023)
    #{"label": "MOD", "pattern": [{"LEMMA": "aimer"}]}
]
ruler = nlp.add_pipe("entity_ruler", before="ner")  # added before the NER component
ruler.add_patterns(patterns)

## `4) Load the sentiment analysis model`:

In [None]:
%%capture
!pip install --upgrade tensorflow transformers
from transformers import pipeline
sentiment_analysis_model = pipeline("sentiment-analysis", model="nlptown/bert-base-multilingual-uncased-sentiment", framework="pt")  # "pt" = PyTorch, "tf" = TensorFlow
#sentiment_analysis_model = pipeline("sentiment-analysis", model="bert-base-uncased", framework="pt")
# Users can also load their own model.

## `5) Segment the text into sentences`:

In [None]:
def segmenter_en_phrases(text):
  all_sentences = []
  text = text.replace('’', "'") # normalize typographic quotes
  text = text.replace('“', '"')
  text = text.replace('”', '"')
  text = text.replace('‘', "'")
  text = text.replace("\r", " ").replace("\n", " ") # remove line breaks
  text = text.replace("  ", " ")
  text = text.strip()
  doc = nlp(text)
  sentences = [sent.text for sent in doc.sents]
  all_sentences.extend(sentences)
  #print(all_sentences) # all sentences of the corpus in one list
  return all_sentences
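
The normalization performed before segmentation can also be written with a translation table and a regular expression. This is an alternative sketch, not the notebook's code: a single `str.replace("  ", " ")` pass only halves longer runs of spaces, whereas `\s+` collapses any run of whitespace, line breaks included. The name `normalize_text` is hypothetical.

```python
import re

# Typographic quotes mapped to their ASCII equivalents
# (U+2019, U+2018: single quotes; U+201C, U+201D: double quotes).
QUOTE_MAP = str.maketrans({
    "\u2019": "'",
    "\u2018": "'",
    "\u201c": '"',
    "\u201d": '"',
})


def normalize_text(text: str) -> str:
    """Replace curly quotes and collapse every whitespace run to one space."""
    text = text.translate(QUOTE_MAP)
    return re.sub(r"\s+", " ", text).strip()


print(normalize_text("J\u2019adore\r\n  \u201cParis\u201d   !  "))
```

The output of such a function could then be passed to the spaCy pipeline for sentence splitting, as the function above does.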

## `6) Process each sentence: named entities, coordinates and sentiments`:

In [None]:
%%capture
!pip install -q geonamescache
!pip install -q geopy

from geopy.geocoders import Nominatim
from geopy.exc import GeocoderTimedOut

import geonamescache

In [None]:
def analyze_sentence(sentence):
  doc = nlp(sentence)
  analyzed_sentence = []
  entities = []
  latitude = None
  longitude = None
  sentiment_score_sentence = 0
  sentiment_label_sentence = ""

  for ent in doc.ents:
      if ent.label_ in {"LOC", "GPE"}:  # keep only place entities (skip PER, ORG, ...)
        #print(ent.label_)
        if ent.text.lower() not in {term.lower() for term in blacklist}:  # case-insensitive blacklist check
            entities.append((ent.text, ent.start, ent.end))

            # Geolocation method:
            # geonamescache is used first (the full GeoNames database is more complete but heavier).
            # If the entity is not found there, fall back to the Nominatim API (this keeps the number of remote requests low).

            # Load the GeoNames cache:
            gc = geonamescache.GeonamesCache()
            matching_cities = gc.get_cities_by_name(ent.text)

            if matching_cities:
                first_matching_city = list(matching_cities[0].values())[0]
                latitude = first_matching_city['latitude']
                longitude = first_matching_city['longitude']
            else:
                # If the city is not found in geonamescache, use Nominatim as a second option:
                geolocator = Nominatim(user_agent='my-app', timeout=10)
                try:
                    location = geolocator.geocode(ent.text)
                    if location:
                      latitude = location.latitude
                      longitude = location.longitude
                    else:
                      raise GeocoderTimedOut("Geocoding timed out for location using Nominatim: " + ent.text)
                except GeocoderTimedOut as e:
                #except (GeocoderTimedOut, Exception) as e:
                  #print(f"Error geocoding {location}: {e}")
                  #print(f"Geocoding timed out for location using Nominatim: {ent.text}")
                  latitude = None
                  longitude = None
                  location = None

            if entities and latitude is not None and longitude is not None:
              # Run the sentiment model once and reuse its output
              prediction = sentiment_analysis_model(sentence)[0]
              sentiment_score_sentence = prediction["score"]
              sentiment_label_sentence = prediction["label"]
            result = {
                "entity": ent.text,
                "sentiment_label_sentence": sentiment_label_sentence,
                "sentiment_score_sentence": sentiment_score_sentence,
                "latitude": latitude,
                "longitude": longitude
            }
            analyzed_sentence.append(result)

  return analyzed_sentence
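
A long text repeats the same place names many times, and public geocoding services usually limit request rates. One possible optimization, sketched here under stated assumptions, is to memoize lookups with `functools.lru_cache`. `slow_lookup` is a hypothetical stand-in for a remote geocoder call and `CALLS` only exists to make the effect of the cache visible; neither is part of the notebook.

```python
from functools import lru_cache
from typing import List, Optional, Tuple

CALLS: List[str] = []  # records every simulated remote lookup


def slow_lookup(name: str) -> Optional[Tuple[float, float]]:
    """Hypothetical stand-in for a remote geocoder request."""
    CALLS.append(name)
    fake_index = {"Paris": (48.8566, 2.3522)}  # approximate coordinates
    return fake_index.get(name)


@lru_cache(maxsize=None)
def cached_lookup(name: str) -> Optional[Tuple[float, float]]:
    """Return the cached result when the same name was already looked up."""
    return slow_lookup(name)


for place in ["Paris", "Paris", "Atlantis", "Paris"]:
    cached_lookup(place)

print(CALLS)  # each distinct name reaches the simulated service only once
```

Failed lookups (`None`) are cached too, so an unknown name is not sent to the service again.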

## `7) Load the corpus: choose one of these four options`:  ⚠

### 7-1) The input text is provided in the code


In [None]:
# Option 1: hardcoded text:

#text = """J'adore Paris pour son atmosphère romantique et ses délicieuses pâtisseries.
#New York m'impressionne avec ses gratte-ciel imposants, mais je trouve la ville un peu trop bruyante à mon goût.
#La campagne française est d'une beauté paisible qui apaise l'âme.
#Je suis neutre à l'égard de Los Angeles, la ville a ses avantages, mais le trafic peut être vraiment agaçant.
#Venise m'a profondément ému avec ses canaux pittoresques et son histoire fascinante.
#Londres a une ambiance unique qui me rend nostalgique de mes voyages passés.
#Je n'aime pas du tout les hivers rigoureux de Montréal, mais l'été est absolument charmant.
#La plage de Bali est un véritable paradis, et j'adore passer du temps là-bas.
#Les déserts du Sahara sont à couper le souffle, mais la chaleur peut être accablante.
#Tokyo m'a laissé une impression mitigée, certains quartiers sont incroyablement modernes, tandis que d'autres conservent une atmosphère traditionnelle qui est fascinante."""

#all_sentences = segmenter_en_phrases(text)

### 7-2) The text is a file on the PC

In [None]:
# Download this text https://ws-export.wmcloud.org/?lang=fr&title=Le_Tour_du_monde_en_quatre-vingts_jours and upload it here:
import os
from google.colab import files, output
uploaded = files.upload()
file_name = next(iter(uploaded))
if os.path.isfile(file_name) and file_name.endswith(".txt"):
  with open(file_name, "r", encoding="utf-8") as file:
    text = file.read()
    all_sentences = segmenter_en_phrases(text)
else:
  print(f"The file {file_name} is not a .txt file and will not be processed.")



### 7-3) The text is a file on the Drive


In [None]:
# Option 3: load the text from Google Drive:
#import os
#from google.colab import drive
#drive.mount('/content/drive')
#directory = "/content/drive/MyDrive/Colab_Notebooks/Tanagra/input/"
#for filename in os.listdir(directory): # only one text is processed, so looping over the folder is not strictly needed
#    file_path = os.path.join(directory, filename)
#    if os.path.isfile(file_path) and filename.endswith(".txt"):
#      with open(file_path, 'r', encoding='utf-8') as file:
#        text = file.read()
#        all_sentences = segmenter_en_phrases(text)

### 7-4) The text is online

In [None]:
#import requests
#url = "https://www.gutenberg.org/cache/epub/800/pg800.txt"  # Le tour du monde...

#response = requests.get(url)
#if response.status_code == 200:
#    text = response.text
#    print("Text imported successfully!")
#else:
#    print(f"Error while fetching the text: {response.status_code}")

#all_sentences = segmenter_en_phrases(text)

## `8) Iterate over text and aggregate the results`:

In [None]:
import csv
from tqdm import tqdm
# Set a limit on the number of sentences to process (to reduce processing time):
limite = 100

# Save entity information to a CSV file
analyzed_sentences = []

with open('output.csv', 'w', newline='', encoding='utf-8') as csvfile:
    writer = csv.writer(csvfile, delimiter="\t")
    writer.writerow(['sentence', 'entity', 'label', 'score', 'latitude', 'longitude'])

    counter = 0
    # tqdm displays a progress bar
    for sentence in tqdm(all_sentences[:limite], desc="Traitement des phrases"):
        if counter >= limite:  # stop once `limite` rows have been written
            break
        analyzed_sentence = analyze_sentence(sentence)
        for result in analyzed_sentence:
            entity = result["entity"]
            sentiment_label = result["sentiment_label_sentence"]
            sentiment_score = result["sentiment_score_sentence"]
            latitude = result["latitude"]
            longitude = result["longitude"]

            # Skip results without coordinates
            if latitude is None or longitude is None:
                #print(f"Skipping result due to missing location data for entity: {entity}")
                continue  # skip this result and move on to the next one

            writer.writerow([sentence, entity, sentiment_label, round(sentiment_score, 2), latitude, longitude])
            counter += 1
            analyzed_sentences.append({
                'sentence': sentence,
                'entity': entity,
                'sentiment_label': sentiment_label,
                'sentiment_score': round(sentiment_score, 2),
                'latitude': latitude,
                'longitude': longitude
            })

# Iterate over analyzed_sentences to build a dictionary of entities
entities_dict = {}
for sentence_info in analyzed_sentences:
    entity = sentence_info['entity']
    sentiment_label = sentence_info['sentiment_label']

    # Initialize the entity record if it does not exist yet
    if entity not in entities_dict:
        entities_dict[entity] = {
            'latitude': sentence_info['latitude'],
            'longitude': sentence_info['longitude'],
            'occurrences': 1,
            'positive_labels': 0,
            'negative_labels': 0,
            'neutral_labels': 0,
            'overall_sentiment': sentiment_label
        }
    else:
        entities_dict[entity]['occurrences'] += 1

    # Update the sentiment label counts
    if sentiment_label in ['4 stars', '5 stars']:
        entities_dict[entity]['positive_labels'] += 1
    elif sentiment_label in ['1 star', '2 stars']:
        entities_dict[entity]['negative_labels'] += 1
    elif sentiment_label == '3 stars':
        entities_dict[entity]['neutral_labels'] += 1

# Update the overall sentiment based on the counts
for entity in entities_dict:
    positive_count = entities_dict[entity]['positive_labels']
    negative_count = entities_dict[entity]['negative_labels']
    neutral_count = entities_dict[entity]['neutral_labels']

    if positive_count > negative_count and positive_count > neutral_count:
        entities_dict[entity]['overall_sentiment'] = 'Positive'
    elif negative_count > positive_count and negative_count > neutral_count:
        entities_dict[entity]['overall_sentiment'] = 'Negative'
    elif neutral_count > positive_count and neutral_count > negative_count:
        entities_dict[entity]['overall_sentiment'] = 'Neutral'
    else:
        entities_dict[entity]['overall_sentiment'] = 'Mixed'

# Final result: entities_dict can be printed or saved if needed
#print(entities_dict)

Traitement des phrases: 100%|██████████| 100/100 [00:28<00:00,  3.45it/s]


* entities_dict can be filtered by hand or corrected with ChatGPT, and the corrected list of named entities and coordinates can then be loaded back in.
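
One possible way to support this manual correction loop is a round trip through a tab-separated file: export the entities, edit the file, then read the corrections back. The sketch below uses only the standard library and in-memory buffers; `export_entities`, `apply_corrections`, and the sample values are hypothetical, and the rule "an entity removed from the file is dropped" is an assumption made for the example.

```python
import csv
import io
from typing import Dict, TextIO

FIELDS = ["entity", "latitude", "longitude"]


def export_entities(entities: Dict[str, dict], handle: TextIO) -> None:
    """Write one row per entity so the coordinates can be reviewed by hand."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, delimiter="\t")
    writer.writeheader()
    for name, info in entities.items():
        writer.writerow({"entity": name,
                         "latitude": info["latitude"],
                         "longitude": info["longitude"]})


def apply_corrections(entities: Dict[str, dict], handle: TextIO) -> Dict[str, dict]:
    """Keep only the entities still listed, with their corrected coordinates."""
    kept = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        name = row["entity"]
        if name in entities:
            kept[name] = {**entities[name],
                          "latitude": float(row["latitude"]),
                          "longitude": float(row["longitude"])}
    return kept


# Sample data: "Paris" was geocoded to the wrong homonym, "Orange" is noise.
sample = {
    "Paris": {"latitude": 33.66, "longitude": -95.55, "occurrences": 3},
    "Orange": {"latitude": 44.13, "longitude": 4.8, "occurrences": 1},
}
buffer = io.StringIO()
export_entities(sample, buffer)

# The reviewer fixed Paris and deleted the Orange row.
corrected_text = "entity\tlatitude\tlongitude\nParis\t48.8566\t2.3522\n"
corrected = apply_corrections(sample, io.StringIO(corrected_text))
print(corrected)
```

The other fields of each entity, such as the occurrence and sentiment counts, are carried over unchanged.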

## `9) Visualization`:

In [None]:
# Initialize the map
# map_center = [20, 0]  # [48.8566, 2.3522]
# map_zoom = 2.5
# Tile themes:
# map = folium.Map(location=map_center, zoom_start=map_zoom)
# map = folium.Map(location=map_center, zoom_start=map_zoom, tiles='Stamen Terrain')
# map = folium.Map(location=map_center, zoom_start=map_zoom, tiles='Stamen Toner')
# map = folium.Map(location=map_center, zoom_start=map_zoom, tiles='Stamen Watercolor')
# map = folium.Map(location=map_center, zoom_start=map_zoom, tiles='CartoDB Dark_Matter')
# map = folium.Map(location=map_center, zoom_start=map_zoom, tiles='CartoDB Positron')  # , width='50%', height='50%')

# Iterate over the entities
# for entity, info in entities_dict.items():
#     latitude = info['latitude']
#     longitude = info['longitude']
#     occurrences = info['occurrences']
#     overall_sentiment = info['overall_sentiment']
#     positive_labels = info['positive_labels']
#     negative_labels = info['negative_labels']
#     neutral_labels = info['neutral_labels']

#     # Determine the marker color from the overall sentiment
#     if overall_sentiment == 'Positive':
#         color = 'green'
#     elif overall_sentiment == 'Negative':
#         color = 'red'
#     elif overall_sentiment == 'Neutral':
#         color = 'gray'
#     elif overall_sentiment == 'Mixed':
#         color = 'blueviolet'
#     else:
#         color = 'blue'

#     scaling_factor = 5  # adjust the size scaling of the markers
#     size = int(occurrences) * scaling_factor

#     # Build the marker popup
#     popup = f"<div style='width: 110px; height: 80px;'>"
#     popup += f"<b>Entity:</b> {entity}<br>"
#     popup += f"<b>Occurrences:</b> {occurrences}<br>"
#     popup += f"<b>Sentiment:</b> {overall_sentiment}<br>"
#     popup += f"<b>Positive:</b> {positive_labels}<br>"
#     popup += f"<b>Negative:</b> {negative_labels}<br>"
#     popup += f"<b>Neutral:</b> {neutral_labels}<br>"
#     popup += "</div>"

#     folium.CircleMarker(
#         location=[latitude, longitude],
#         radius=size,
#         popup=popup,
#         color=color,
#         fill=True,
#         fill_color=color,
#         tooltip=entity
#     ).add_to(map)

# Build the legend summarizing all results:
# total_entities = len(entities_dict)
# total_positive = sum(info['positive_labels'] for info in entities_dict.values())
# total_negative = sum(info['negative_labels'] for info in entities_dict.values())
# total_neutral = sum(info['neutral_labels'] for info in entities_dict.values())
# total_sentiments = total_positive + total_negative + total_neutral
# html_text = """
#     <div style="position: absolute;
#                 top: 50%; left: 0; transform: translate(0, -50%);
#                 z-index: 1000; background-color: white; border: 2px solid #ccc;
#                 padding: 10px; font-family: 'Trebuchet MS', sans-serif;">
#     <h4>Annotation Results:</h4>
#     <b>Total Entities: {}<br>
#     <b>Total Sentiments: {}<br>
#     <ul>
#     <li>Positive: {}</li>
#     <li>Negative: {}</li>
#     <li>Neutral: {}</li>
#     </ul>
#     </div>
#     """.format(total_entities, total_sentiments, total_positive, total_negative, total_neutral)

# Pie chart: show the sentiment distribution
# import matplotlib.pyplot as plt
# labels = ['Negative', 'Neutral', 'Positive']
# sizes = [total_negative, total_neutral, total_positive]
# colors = ['red', 'gray', 'green']
# plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=140)
# plt.title('Sentiment distribution')
# plt.setp(plt.gca().get_xticklabels(), fontsize=12)
# plt.setp(plt.gca().get_yticklabels(), fontsize=12)
# plt.show()

# Add the legend to the map
# map.get_root().html.add_child(folium.Element(html_text))

# map.save("map.html")  # save the file to disk
# map  # display without saving

## `10) Visualization - Timeline`:

In [None]:
# Sentiment counts initialization
sentiment_counts = {
    'Positive': 0,
    'Negative': 0,
    'Neutral': 0,
    'Mixed': 0
}

# Map configuration
map_center = [20, 0]
map_zoom = 2.5
#map = folium.Map(location=map_center, zoom_start=map_zoom)
map = folium.Map(location=map_center, zoom_start=map_zoom, tiles='CartoDB Positron')

# Function to determine marker color based on sentiment
def get_color(sentiment):
    if sentiment == 'Positive':
        return 'green'
    elif sentiment == 'Negative':
        return 'red'
    elif sentiment == 'Neutral':
        return 'gray'
    elif sentiment == 'Mixed':
        return 'blueviolet'
    else:
        return 'blue'

# Features for the map
features = []
base_time = datetime.now()

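# Sort entities by first occurrence. Step 8 does not store a 'first_occurrence' key,
# so every key defaults to 0 and the stable sort keeps insertion order,
# which is already the order of first appearance.
sorted_entities = sorted(entities_dict.items(), key=lambda x: x[1].get('first_occurrence', 0))

# Determine the maximum occurrences for scaling (the default avoids an error when no entity was found)
max_occurrences = max((info['occurrences'] for info in entities_dict.values()), default=1)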

for idx, (entity, info) in enumerate(sorted_entities):
    latitude = info['latitude']
    longitude = info['longitude']
    occurrences = info['occurrences']
    overall_sentiment = info['overall_sentiment']
    positive_labels = info['positive_labels']
    negative_labels = info['negative_labels']
    neutral_labels = info['neutral_labels']

    # Increment sentiment counts
    sentiment_counts[overall_sentiment] += 1

    # Adjust marker size relative to maximum occurrences
    scaling_factor = 20
    radius = (occurrences / max_occurrences) * scaling_factor

    # Time for animation
    event_time = base_time + timedelta(seconds=idx * 2)

    # Popup content
    popup_html = f"""
    <div style='width: 200px;'>
        <h4 style='margin:0; color:#2c3e50;'>{entity}</h4>
        <hr style='margin:5px 0;'>
        <p style='margin:5px 0;'><b>Occurrences:</b> {occurrences}</p>
        <p style='margin:5px 0;'><b>Sentiment:</b> {overall_sentiment}</p>
        <div style='margin-top:8px;'>
            <div style='color:green;'>Positive: {positive_labels}</div>
            <div style='color:red;'>Negative: {negative_labels}</div>
            <div style='color:gray;'>Neutral: {neutral_labels}</div>
        </div>
    </div>
    """

    # Feature for the map
    feature = {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            "coordinates": [longitude, latitude],
        },
        "properties": {
            "time": event_time.isoformat(),
            "style": {
                "color": get_color(overall_sentiment),
                "fillColor": get_color(overall_sentiment),
                "fillOpacity": 0.3,
                "weight": 2,
                "radius": radius,
            },
            "popup": popup_html,
            "icon": "circle"
        }
    }
    features.append(feature)

# Calculate totals
total_entities = len(entities_dict)
total_positive = sum(info['positive_labels'] for info in entities_dict.values())
total_negative = sum(info['negative_labels'] for info in entities_dict.values())
total_neutral = sum(info['neutral_labels'] for info in entities_dict.values())
total_sentiments = total_positive + total_negative + total_neutral

# Add timestamped GeoJSON to the map
timestamped_geojson = plugins.TimestampedGeoJson(
    {
        "type": "FeatureCollection",
        "features": features,
    },
    period="PT1S",
    duration='PT1H',
    add_last_point=True,
    loop=False,
    max_speed=1,
    transition_time=100,
    auto_play=True,
    loop_button=True,
    date_options='YYYY/MM/DD HH:mm:ss',
    time_slider_drag_update=True
)
timestamped_geojson.add_to(map)

# Enhanced legend with all counters
legend_html = f'''
<div style="
    position: fixed;
    bottom: 50px; left: 50px; width: 280px;
    border:2px solid #888; z-index:9999; font-size:14px;
    background-color:white;
    padding: 15px;
    border-radius: 6px;
    box-shadow: 0 2px 5px rgba(0,0,0,0.1);
">
    <h4 style="margin-top:0; color:#2c3e50;"><b>Annotation results</b></h4>
    <div style="margin:10px 0;">
        <div style="font-weight:bold; margin-bottom:5px;">Entity statistics:</div>
        <div>Total unique entities: {total_entities}</div>
        <div>Total sentiment labels: {total_sentiments}</div>
    </div>
    <div style="margin:10px 0;">
        <div style="font-weight:bold; margin-bottom:5px;">Sentiment distribution:</div>
        <ul style="list-style-type: none; padding-left: 0; margin: 5px 0;">
            <li style="margin:3px 0;"><i class="fa fa-circle fa-1x" style="color:green"></i>&nbsp;Positive: {total_positive}</li>
            <li style="margin:3px 0;"><i class="fa fa-circle fa-1x" style="color:red"></i>&nbsp;Negative: {total_negative}</li>
            <li style="margin:3px 0;"><i class="fa fa-circle fa-1x" style="color:gray"></i>&nbsp;Neutral: {total_neutral}</li>
        </ul>
    </div>
    <div style="font-size:12px; color:#666; margin-top:10px;">
        Circle size indicates occurrence frequency
    </div>
</div>
'''
map.get_root().html.add_child(folium.Element(legend_html))

# Pie chart sizes: the first three values are sentence-level label totals,
# while the Mixed value counts entities whose overall sentiment is Mixed
sizes = [total_negative, total_neutral, total_positive,
         sentiment_counts.get('Mixed', 0)]
labels = ['Negative', 'Neutral', 'Positive', 'Mixed']
colors = ['red', 'gray', 'green', 'blueviolet']

# Generate and add pie chart
plt.figure(figsize=(8, 8))
plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=90)
plt.axis('equal')
plt.title('Sentiment Distribution')
plt.savefig('sentiment_distribution.png', bbox_inches='tight', dpi=300)
plt.close()

# Save the map
map.save("dynamic_map.html")

# Display the map
map
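
The timeline relies on GeoJSON point features, whose coordinates are ordered longitude first, then latitude. The sketch below isolates that construction with the standard library only. `scaled_radius` and `make_feature` are hypothetical helpers, and the minimum radius is an assumption added for the example so that rarely mentioned places stay visible; it is not part of the cell above.

```python
import json
from datetime import datetime, timedelta


def scaled_radius(occurrences: int, max_occurrences: int,
                  max_radius: float = 20.0, min_radius: float = 3.0) -> float:
    """Scale the radius relative to the most frequent entity, with a floor."""
    if max_occurrences <= 0:
        return min_radius
    return max(min_radius, occurrences / max_occurrences * max_radius)


def make_feature(name: str, lat: float, lon: float,
                 radius: float, color: str, when: datetime) -> dict:
    """Build one timestamped GeoJSON point feature."""
    return {
        "type": "Feature",
        # GeoJSON order is [longitude, latitude]
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "time": when.isoformat(),
            "popup": name,
            "icon": "circle",
            "style": {"color": color, "fillColor": color, "radius": radius},
        },
    }


start = datetime(2025, 1, 1)  # arbitrary base time for the animation
feature = make_feature("Paris", 48.8566, 2.3522,
                       scaled_radius(1, 50), "green",
                       start + timedelta(seconds=2))
collection = json.dumps({"type": "FeatureCollection", "features": [feature]})
print(collection)
```

Serializing with `json.dumps` is a quick way to check that every value in the feature is JSON-compatible before it is handed to the map plugin.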



---


# `End`

<p xmlns:cc="http://creativecommons.org/ns#" >This material is distributed under the terms of the Creative Commons license <a href="http://creativecommons.org/licenses/by-nc-sa/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">CC BY-NC-SA 4.0

<img style="height:11px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:11px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"><img style="height:11px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/nc.svg?ref=chooser-v1"><img style="height:11px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1"></a></p>
