`Module 2`
# Data Visualization in Snowflake

## Install Prerequisite Library
- Click *Packages* in the top-right menu and enter `matplotlib` in the text box.
- Click the Save button.

## Establish Snowflake Session

In [None]:
from snowflake.snowpark.context import get_active_session

session = get_active_session()

## Query Reviews with Sentiment Score

In [None]:
query = """
SELECT
    *
FROM
    REVIEWS_WITH_SENTIMENT
"""

df = session.sql(query).to_pandas()

df.head()

## Visualize Sentiment Scores by Product

In [None]:
import matplotlib.pyplot as plt

product_sentiment = df.groupby("PRODUCT")["SENTIMENT_SCORE"].mean().sort_values()
product_sentiment.plot(kind="barh", title="Average Sentiment by Product")
plt.xlabel("Sentiment Score")
plt.tight_layout()
plt.show()

## Visualize Shipping Patterns and Anomalies using a Time Series

In [None]:
# Count the orders shipped on each date, ordered chronologically.
shipment_counts = (
    df.groupby("SHIPPING_DATE")["ORDER_ID"]
      .count()
      .reset_index(name="SHIPMENT_COUNT")
      .sort_values("SHIPPING_DATE")
)

shipment_counts.head()
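
Grouping by `SHIPPING_DATE` only returns dates that have at least one order, so a day with no shipments at all never appears in `shipment_counts`. The sketch below shows one way to fill those gaps with zeros using a full daily date range. The toy data and the helper name `fill_missing_days` are hypothetical, and it assumes `SHIPPING_DATE` is (or has been converted to) a datetime column.

```python
import pandas as pd

# Toy stand-in for shipment_counts; the values are made up for illustration.
toy_counts = pd.DataFrame({
    "SHIPPING_DATE": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04"]),
    "SHIPMENT_COUNT": [12, 9, 15],
})


def fill_missing_days(counts: pd.DataFrame) -> pd.DataFrame:
    """Return one row per calendar day, with 0 for days that had no shipments."""
    indexed = counts.set_index("SHIPPING_DATE")["SHIPMENT_COUNT"]
    # Build every calendar day between the first and last shipping date.
    full_range = pd.date_range(indexed.index.min(), indexed.index.max(), freq="D")
    # Days absent from the original data get a count of 0.
    filled = indexed.reindex(full_range, fill_value=0)
    return filled.rename_axis("SHIPPING_DATE").reset_index()


filled_counts = fill_missing_days(toy_counts)
print(filled_counts)
```

With the gaps filled, a line chart drops to zero on idle days instead of drawing a straight segment across them.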

In [None]:
shipment_counts.plot(x="SHIPPING_DATE", y="SHIPMENT_COUNT", kind="line", title="Shipments Per Day")
plt.ylabel("Number of Shipments")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

## Identify Dates with Low Shipment Volume or Spikes

In [None]:
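# Days with fewer shipments than this threshold are flagged as low volume.
LOW_VOLUME_THRESHOLD = 5

low_volume_days = shipment_counts[shipment_counts["SHIPMENT_COUNT"] < LOW_VOLUME_THRESHOLD]
print("Low-volume shipping days:")
print(low_volume_days)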
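
The cell above covers low-volume days only. For spikes, one simple option is to flag days whose count sits well above the average. The sketch below uses a threshold of the mean plus two standard deviations. That rule, the toy data, and the helper name `find_spikes` are assumptions for illustration and are not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for shipment_counts; the values are made up for illustration.
toy_counts = pd.DataFrame({
    "SHIPPING_DATE": pd.date_range("2024-01-01", periods=10, freq="D"),
    "SHIPMENT_COUNT": [10, 11, 9, 10, 12, 40, 10, 11, 9, 10],
})


def find_spikes(counts: pd.DataFrame, n_std: float = 2.0) -> pd.DataFrame:
    """Return rows whose count exceeds mean + n_std * standard deviation."""
    mean = counts["SHIPMENT_COUNT"].mean()
    std = counts["SHIPMENT_COUNT"].std()  # sample standard deviation
    threshold = mean + n_std * std
    return counts[counts["SHIPMENT_COUNT"] > threshold]


spikes = find_spikes(toy_counts)
print("Spike days:")
print(spikes)
```

A fixed multiple of the standard deviation is easy to explain but is itself inflated by large outliers, so `n_std` may need tuning on real data.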

## Visualize Total Shipments by Carrier

In [None]:
carrier_counts = df.groupby("CARRIER")["ORDER_ID"].count()
carrier_counts.plot(kind="bar", title="Total Shipments by Carrier")
plt.ylabel("Number of Shipments")
plt.tight_layout()
plt.show()

## Visualize Average Sentiment by Shipping Status

In [None]:
avg_sentiment = df.groupby("STATUS")["SENTIMENT_SCORE"].mean().sort_values()

avg_sentiment.plot(kind="barh", title="Avg Sentiment by Shipping Status")
plt.xlabel("Sentiment Score")
plt.ylabel("Shipping Status")
plt.tight_layout()
plt.show()