## Business Objective

Nova Financial Solutions aims to enhance its predictive analytics capabilities to significantly boost its financial forecasting accuracy and operational efficiency through advanced data analysis.

As a Data Analyst at Nova Financial Solutions, your primary task is to conduct a rigorous analysis of the financial news dataset. The focus of your analysis should be two-fold:

1. **Sentiment Analysis**: Perform sentiment analysis on the `headline` text to quantify the tone and sentiment expressed in financial news. This will involve using natural language processing (NLP) techniques to derive sentiment scores, which can be associated with the respective `Stock Symbol` to understand the emotional context surrounding stock-related news.
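As a rough illustration of what deriving a sentiment score per headline and associating it with a stock symbol can look like, here is a minimal pure-Python sketch. It is not the method this notebook prescribes: the word lists, the helper names `headline_sentiment` and `mean_sentiment_by_symbol`, and the sample headlines are all invented for the example. A real analysis would more likely use a maintained NLP library.

```python
import re

# Hypothetical toy lexicon, invented for this sketch; not a validated resource.
POSITIVE = {"beats", "gain", "gains", "surge", "surges", "upgrade", "upgrades", "record", "strong"}
NEGATIVE = {"miss", "misses", "loss", "losses", "plunge", "plunges", "downgrade", "downgrades", "weak"}


def headline_sentiment(headline: str) -> float:
    """Score in [-1, 1]: (positive - negative) / matched words; 0.0 if nothing matches."""
    tokens = re.findall(r"[a-z]+", headline.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0


def mean_sentiment_by_symbol(rows):
    """Average the headline scores per symbol; rows is an iterable of (symbol, headline)."""
    totals = {}
    for symbol, headline in rows:
        running_sum, count = totals.get(symbol, (0.0, 0))
        totals[symbol] = (running_sum + headline_sentiment(headline), count + 1)
    return {symbol: s / n for symbol, (s, n) in totals.items()}


# Made-up headlines, used only to exercise the functions.
sample = [
    ("AAPL", "Apple beats estimates on strong iPhone sales"),
    ("AAPL", "Analyst downgrades Apple after weak guidance"),
    ("TSLA", "Tesla shares surge to record high"),
]
scores = mean_sentiment_by_symbol(sample)
```

Averaging per symbol is only one possible aggregation. Depending on the question, a per-day or per-publisher breakdown may be more informative.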

In [None]:
import pandas as pd

df = pd.read_csv("../data/news_data.csv")   # adjust this relative path if the data lives elsewhere
df.head()

## Step 6: Required Minimum EDA Code

The following cells perform the required exploratory data analysis on the financial news headlines:

- Headline length analysis
- Publisher count
- Date distribution
- Keyword frequency

In [None]:
# Headline length analysis
df["headline_length"] = df["headline"].str.len()
df["headline_length"].describe()

In [None]:
# Publisher count

df["publisher"].value_counts().head(10)

In [None]:
# Date distribution
import matplotlib.pyplot as plt

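# Parse the date column so the .dt accessor is available
df["date"] = pd.to_datetime(df["date"])
# Count articles per calendar day and plot them in chronological order
df["date"].dt.date.value_counts().sort_index().plot(figsize=(12, 6))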
plt.title("Articles Published Over Time")
plt.xlabel("Date")
plt.ylabel("Count")
plt.show()

In [None]:
# Keyword frequency
from collections import Counter
import re

words = " ".join(df["headline"].astype(str)).lower()
words = re.findall(r'\b[a-z]{3,}\b', words)   # keep alphabetic words of 3+ letters
Counter(words).most_common(20)
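Raw counts like these tend to be dominated by function words such as "the" and "for". A possible refinement, sketched below, filters a stopword list before counting. The `STOPWORDS` set, the helper name `top_keywords`, and the sample headlines are assumptions made for the example, and the list is deliberately short.

```python
import re
from collections import Counter

# Assumed, deliberately short stopword list; extend or replace it for real use.
STOPWORDS = {"the", "and", "for", "with", "from", "that", "this", "are", "was", "has", "its"}


def top_keywords(headlines, n=20, stopwords=STOPWORDS):
    """Return the n most common 3+ letter alphabetic words, skipping stopwords."""
    counts = Counter()
    for headline in headlines:
        tokens = re.findall(r"\b[a-z]{3,}\b", str(headline).lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts.most_common(n)


# Made-up headlines, used only to exercise the function.
sample = [
    "Stocks rally as the Fed holds rates",
    "Fed signals rates cut for the year",
    "Oil stocks slide with crude prices",
]
top = top_keywords(sample, n=3)
```

On the notebook's data this would be called as `top_keywords(df["headline"])`.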