### Overview of the Dashboard

This Jupyter Notebook visualizes the results of the sentiment analysis and topic modeling. To recreate the plots, click "Run All".

**Note:** A single comment in the dataset can be assigned to multiple topic columns, so in some plots a comment may be counted more than once.

In [1]:
import dash
from dash import dcc, html
from dash.dependencies import Input, Output
import plotly.express as px
import pandas as pd
from jupyter_dash import JupyterDash
import warnings
from dashboard import *
warnings.simplefilter(action='ignore', category=FutureWarning)

In [2]:
# Load the dataset
df = pd.read_csv('../../data/ryanair_reviews_with_extra_features.csv')
df['Date Published Formatted'] = pd.to_datetime(df['Date Published'], errors='coerce')

# Topic names are shortened for the plots
topics = {
    "Luggage": "topic_luggage",
    "Boarding": "topic_boarding",
    "Punctual": "topic_punctual",
    "Service": "topic_service",
    "Comfort": "topic_comfort",
    "Other Fees": "topic_other_fees",
    "Delay": "topic_delay",
    "Clean": "topic_clean"
}

# Sentiment colors and order are defined for consistency
sentiment_colors = {
    'negative': 'rgb(220, 20, 60)',  # Crimson - Red 
    'positive': 'rgb(34, 139, 34)',  # ForestGreen - Green
    'neutral': 'rgb(70, 130, 180)'   # SteelBlue - Blue 
}

category_orders = {'Sentiment': ['positive', 'neutral', 'negative']}

# Create a long-form dataframe for easier plotting
long_df = pd.melt(df, id_vars=['Sentiment'], value_vars=list(topics.values()), 
                  var_name='Topic', value_name='Value')
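# Keep only (comment, topic) pairs where the topic applies; copy to avoid SettingWithCopyWarning below
long_df = long_df[long_df['Value'] == True].copy()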
long_df['Topic'] = long_df['Topic'].map({v: k for k, v in topics.items()})

# Counts can exceed the number of comments because one comment can be mapped to multiple topics
comment_counts = long_df.groupby(['Topic', 'Sentiment']).size().reset_index(name='Number of Comments')

note = 'Note: Some comments may address multiple topics; thus, some comments are counted more than once.'

topic_colors = {
    "Service": "lightgreen",
    "Luggage": "lightblue",
    "Punctual": "lightcoral",
    "Boarding": "lightgoldenrodyellow",
    "Comfort": "lightpink",
    "Other Fees": "lightseagreen",
    "Delay": "lightskyblue",
    "Clean": "lightsteelblue"
}
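
The double counting mentioned in the note can be seen on a small example. This is only a sketch on made-up data (the `toy` frame and its two topic columns are assumptions for illustration, not rows from the dataset); it repeats the melt-and-filter step above and compares the number of tagged comments with the number of rows in the long-form frame.

```python
import pandas as pd

# Hypothetical toy data: three comments, the first one tagged with two topics.
toy = pd.DataFrame({
    "Sentiment": ["negative", "positive", "neutral"],
    "topic_luggage": [True, False, False],
    "topic_delay": [True, True, False],
})
topic_columns = ["topic_luggage", "topic_delay"]

# Same reshaping as in the cell above: one row per (comment, topic) pair.
toy_long = pd.melt(toy, id_vars=["Sentiment"], value_vars=topic_columns,
                   var_name="Topic", value_name="Value")
toy_long = toy_long[toy_long["Value"]].copy()

# Comments that have at least one topic vs. rows in the long-form frame.
n_comments_with_topic = int(toy[topic_columns].any(axis=1).sum())
n_rows_long = len(toy_long)
print(n_comments_with_topic, n_rows_long)
```

The long-form frame has one more row than there are tagged comments, because the first comment contributes a row for each of its two topics.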

### Topics and Sentiment Distribution of Comments Per Topic - Treemap

The treemap visualizations provide a detailed overview of the distribution of comments by topic and sentiment. Each box represents a specific topic or sentiment, and the size of the box corresponds to the number of comments. Users can click any box to focus on it and hover over a box to see the exact values. To keep the interactive features working, make sure this cell is the most recently executed one.

In [17]:
fig_topic, fig_sen = create_treemap_visualizations(comment_counts, topic_colors, sentiment_colors, note)
fig_topic.show()
fig_sen.show()

### Topics and Sentiment Distribution of Comments Per Topic - Sunburst Map

This sunburst chart visualizes the distribution of sentiments across different topics. Each segment of the chart represents a topic, with sub-segments showing the breakdown of sentiments (positive, negative, and neutral) within that topic. The size of each segment corresponds to the number of comments related to that sentiment and topic.

In [18]:
fig = create_sunburst_plot(long_df, note)
fig.show()

### Heatmap of Sentiment by Topic

This heatmap visualizes the distribution of comments by sentiment across various topics. The chart provides a clear depiction of how frequently each sentiment is associated with different topics.

In [4]:
fig = create_heatmap_visualization(long_df, note)
fig.show()
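
A heatmap of this kind is a rendering of a topic-by-sentiment count matrix. As a rough sketch on made-up data (the `toy_long` frame is an assumption, and `create_heatmap_visualization` may compute its matrix differently), the matrix can be built with `pandas.crosstab`:

```python
import pandas as pd

# Hypothetical long-form data: one row per (comment, topic) pair.
toy_long = pd.DataFrame({
    "Topic": ["Luggage", "Luggage", "Delay", "Delay", "Delay"],
    "Sentiment": ["negative", "positive", "negative", "negative", "neutral"],
})

# Rows are topics, columns are sentiments, cells are comment counts.
matrix = pd.crosstab(toy_long["Topic"], toy_long["Sentiment"])

# Fix the column order so every plot lists sentiments the same way.
matrix = matrix.reindex(columns=["positive", "neutral", "negative"], fill_value=0)
print(matrix)
```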

### Sentiment Distribution Per Topic Over Time

This line chart illustrates the trends in customer sentiment over time, spanning from 2012 to 2024. The chart provides insights into how the percentage of comments categorized under each sentiment has changed over the years, with a key feature allowing users to filter the data by specific topics. 

To keep the interactive features working, make sure this cell is the most recently executed one.

If the plot stays in the "Loading" state, select the cell and run it again.

In [24]:
app = create_dash_app_sentiments_over_time(df, topics, sentiment_colors, note, category_orders)
app.run_server(mode='inline', debug=True)

### Total Number of Comments per Topic Over Time

This line chart visualizes the total number of comments per topic over time for Ryanair reviews, spanning from 2012 to 2024. The chart allows users to observe how the frequency of comments related to specific topics has varied over the years. Additionally, users can select multiple topics to filter the data, providing a more focused analysis.

To keep the interactive features working, make sure this cell is the most recently executed one.

If the plot stays in the "Loading" state, select the cell and run it again.

In [25]:
app = create_dash_app_topics_over_time(df, topics, note)
app.run_server(mode='inline', debug=True)

### Distribution of the Number of Comments Per Sentiment

This plot is an interactive dashboard designed to display the distribution of comments per sentiment for Ryanair reviews. The dashboard allows users to filter the data by various categories and topics, providing a detailed and customizable view of customer feedback. Users can select multiple values for each category to include in the analysis.

In [22]:
categories = [
    "Type Of Traveller", 
    "Origin", 
    "Destination", 
    "Date Flown", 
    "Seat Comfort", 
    "Cabin Staff Service", 
    "Food & Beverages", 
    "Ground Service", 
    "Value For Money", 
    "Recommended"
]

app = create_sentiment_distribution_app(df, categories, topics, sentiment_colors)
app.run_server(mode='inline', debug=True)
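
Filtering by several values per category amounts to combining one `isin` mask per column. The sketch below is an assumption about how such a filter could work, not the code behind `create_sentiment_distribution_app`; the helper `filter_by_selection` and the category values in `toy` are hypothetical.

```python
import pandas as pd

# Hypothetical reviews with two of the filterable categories.
toy = pd.DataFrame({
    "Type Of Traveller": ["Solo Leisure", "Business", "Family Leisure", "Business"],
    "Recommended": ["no", "yes", "no", "no"],
    "Sentiment": ["negative", "positive", "negative", "neutral"],
})

def filter_by_selection(frame, selection):
    """Keep rows whose value is in the selected list for every filtered column."""
    mask = pd.Series(True, index=frame.index)
    for column, values in selection.items():
        if values:  # an empty selection means "no filter" for that column
            mask &= frame[column].isin(values)
    return frame[mask]

# Multiple values within a category are OR-ed; categories are AND-ed.
selected = {"Type Of Traveller": ["Business", "Solo Leisure"], "Recommended": ["no"]}
sentiment_counts = filter_by_selection(toy, selected)["Sentiment"].value_counts()
print(sentiment_counts)
```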