# Market Sentiment Analysis through EDA and Clustering

## Introduction

The focus of this project is to analyze stock market behavior using basic historical stock data, including daily prices, trading volume, and returns, sourced from Yahoo Finance. This data allows us to explore fundamental patterns in stock performance across various sectors, such as technology, finance, and healthcare. By examining stocks at a granular level, we can gain insights into broader market trends, sector-specific behaviors, and how different assets respond to market fluctuations. This type of base-level data is widely accessible and forms the foundation for much of financial analysis, making it an ideal starting point for understanding market dynamics without the complexity of derivative-based metrics.

To uncover meaningful patterns and relationships within the dataset, we will apply several data analysis techniques, including exploratory data analysis (EDA), clustering, and correlation analysis. EDA will help us understand the distribution of key metrics, identify outliers, and gain an initial overview of stock performance across sectors. Clustering will allow us to group stocks based on similar performance characteristics, such as average returns and volatility, which can reveal natural groupings and shared behaviors across sectors. Finally, correlation analysis will be used to examine interdependencies between sectors, especially in response to major market events, allowing us to observe how shocks to one sector may influence others. By applying these techniques, we aim to derive actionable insights about sector relationships, stock volatility, and patterns of market sentiment, providing a clear, data-driven foundation for further financial analysis.

TODO: insert a correlation matrix or some other visual here

## Motivation
Understanding stock market behavior and the interdependencies between sectors is crucial for investors, analysts, and policymakers alike. In an increasingly interconnected global economy, sectors rarely operate in isolation; instead, they are influenced by common economic forces, investor sentiment, and market events. By studying these relationships, we can gain insights into how certain sectors react to economic shocks, periods of high volatility, and changes in market sentiment. For investors, this knowledge aids in making diversified, informed decisions, potentially reducing risk by understanding how assets interact under different market conditions. For policymakers and economists, understanding sectoral interdependencies can help predict and mitigate the broader impact of financial crises or policy changes. This analysis not only deepens our understanding of stock market dynamics but also equips readers with the ability to make data-driven decisions in a complex, evolving financial landscape.

By analyzing raw stock data and directly exploring interdependencies and sector dynamics, this project gives readers a firsthand understanding of market relationships, enabling them to draw their own conclusions rather than relying on potentially biased interpretations. Furthermore, the techniques used in this project, such as clustering and correlation analysis, allow us to uncover nuanced insights that may not be covered in general market commentary. This project shows readers how data-backed financial analysis is conducted and offers a transparent, replicable method for examining market behavior. Rather than simply consuming information, readers can engage with the analytical process, gaining a stronger foundation for making their own informed investment decisions and developing critical thinking skills regarding financial trends and market sentiment.

## Methods

### Exploratory Data Analysis (EDA) through Line Charts
The initial exploration of the dataset is conducted through line charts that compare sector performance over time and aggregate sector returns. By plotting the time series of stock prices and returns, we can visualize trends, seasonal patterns, and overall sector movement. Line charts are particularly effective for identifying fluctuations and periods of high or low volatility, allowing us to quickly assess how each sector behaves individually and in relation to others. This approach also highlights key events and market shifts, setting a strong foundation for deeper analysis. Line charts provide a clear, straightforward view of the data, making them a natural starting point for understanding market behavior across sectors.
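
As a minimal sketch of the data preparation behind such charts, the snippet below turns daily closing prices into daily returns, averages them by sector, and compounds them into a growth series suitable for a line chart. The tickers, the ticker-to-sector mapping, the equal weighting, and the prices themselves are illustrative assumptions, not the project's data or its exact aggregation method.

```python
import numpy as np
import pandas as pd

# Synthetic daily closing prices (illustrative only, not the project's data).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2024-01-01", periods=60)
tickers = ["AAA", "BBB", "CCC", "DDD"]  # hypothetical tickers
log_steps = rng.normal(0.0005, 0.01, size=(60, 4))
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(log_steps, axis=0)), index=dates, columns=tickers
)

# Hypothetical ticker-to-sector mapping.
sector_of = {
    "AAA": "Technology",
    "BBB": "Technology",
    "CCC": "Finance",
    "DDD": "Healthcare",
}

# Daily simple returns: P_t / P_{t-1} - 1. The first day has no prior price, so it is dropped.
returns = prices.pct_change().dropna()

# Equal-weighted sector return: average the returns of the tickers in each sector.
sector_returns = returns.T.groupby(sector_of).mean().T

# Growth of one unit invested in each sector: the series a line chart would display.
sector_growth = (1 + sector_returns).cumprod()
```

Calling `sector_growth.plot()` draws one line per sector when matplotlib is installed; plotting `sector_returns` instead makes periods of high and low volatility easier to see.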

### Before and After Density Plots
To examine the impact of specific market events, we use before-and-after density plots of returns for each sector. These plots allow us to visualize changes in the distribution of returns pre- and post-event, offering insights into how sectors respond to external shocks or significant events. For example, a major economic announcement might lead to a wider spread of returns (increased volatility) or a shift in the average return. By comparing the shape, spread, and central tendency of returns in these density plots, we can observe how market sentiment and sector stability are affected by particular events. This method is useful for identifying shifts in risk and investor behavior, making it an effective tool for event-based analysis.
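
The comparison can be sketched as follows. The text does not say which density estimator is used, so this example assumes a Gaussian kernel density estimate with a normal-reference rule-of-thumb bandwidth. The helper `gaussian_density` and the return samples are hypothetical and synthetic.

```python
import numpy as np

def gaussian_density(samples, grid):
    """Gaussian kernel density estimate of `samples`, evaluated on `grid`."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    # Normal-reference rule-of-thumb bandwidth: 1.06 * std * n^(-1/5).
    h = 1.06 * samples.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))

# Synthetic daily returns for one sector (illustrative only). The "after"
# window is drawn with a lower mean and twice the volatility of "before".
rng = np.random.default_rng(1)
before = rng.normal(0.001, 0.01, size=250)
after = rng.normal(-0.001, 0.02, size=250)

# Evaluate both densities on a shared grid so the curves are directly comparable.
grid = np.linspace(-0.1, 0.1, 401)
density_before = gaussian_density(before, grid)
density_after = gaussian_density(after, grid)

# Numeric summary of the shift in central tendency and spread.
shift_in_mean = after.mean() - before.mean()
volatility_ratio = after.std(ddof=1) / before.std(ddof=1)
```

Plotting `density_before` and `density_after` against `grid` on the same axes gives the before-and-after picture: a flatter, wider "after" curve corresponds to a `volatility_ratio` above 1.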

### Clustering
Clustering is applied to group stocks based on performance metrics, such as average returns and volatility, using unsupervised machine learning techniques. This method allows us to identify clusters of stocks that exhibit similar behavior, which may be driven by underlying factors like sector characteristics or economic sensitivities. Clustering helps us uncover natural groupings within the market, allowing us to observe how certain stocks or sectors react similarly to market events and volatility. By identifying these groups, we gain insights into diversification opportunities, as clustered stocks may exhibit co-movements that can be either leveraged or avoided, depending on investment strategy. This approach also enables us to explore market structure and relationships at a deeper level.
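
The text does not name a specific clustering algorithm, so the sketch below uses k-means as a stand-in, written directly in NumPy with a deterministic farthest-first initialisation. The eight stocks, their target means and volatilities, and the choice of `k=2` are assumptions made for illustration. The synthetic returns are rescaled so each stock's sample statistics match the targets exactly, which keeps the example reproducible.

```python
import numpy as np

def kmeans(points, k, n_iter=100):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from the centroids chosen so far.
    centroids = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[d.argmax()])
    centroids = np.array(centroids)

    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

# Hypothetical targets: four calmer stocks followed by four more volatile ones.
target_mean = np.array([0.0002, 0.0003, 0.0004, 0.0003, 0.0010, 0.0012, 0.0008, 0.0011])
target_vol = np.array([0.008, 0.009, 0.010, 0.011, 0.025, 0.027, 0.030, 0.028])

# Synthetic daily returns, rescaled to hit the targets exactly.
rng = np.random.default_rng(2)
noise = rng.normal(size=(250, 8))
noise = (noise - noise.mean(axis=0)) / noise.std(axis=0, ddof=1)
returns = target_mean + target_vol * noise  # shape: (days, stocks)

# Features per stock: average daily return and volatility (std of daily returns).
features = np.column_stack([returns.mean(axis=0), returns.std(axis=0, ddof=1)])

# Standardise so both features contribute on a comparable scale.
scaled = (features - features.mean(axis=0)) / features.std(axis=0)

labels, centroids = kmeans(scaled, k=2)
```

With these inputs the first four stocks fall into one cluster and the last four into the other. Standardising first matters because volatility is numerically much larger than average daily return and would otherwise dominate the distance calculation.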

### Correlation Heatmaps
Correlation heatmaps provide a visual representation of the relationships between sectors by displaying the correlation matrix of sector returns. Using color gradients to indicate correlation strength, heatmaps make it easy to spot sectors that tend to move together (positive correlation) or behave independently (low or negative correlation). This visualization is particularly useful when examining the impact of specific events, as we can create “before” and “after” heatmaps to see how sector relationships evolve in response to market shocks. The heatmaps provide an accessible, intuitive way to interpret complex interdependencies within the market, helping investors identify opportunities for diversification and manage risks associated with correlated movements across sectors.
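
A minimal sketch of the matrices behind such "before" and "after" heatmaps is shown below. The event date, sector names, and returns are synthetic assumptions: a shared market factor is given more weight after the hypothetical event so that sectors move together more strongly.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
dates = pd.bdate_range("2024-01-01", periods=400)
event_date = dates[200]  # hypothetical event date
sectors = ["Technology", "Finance", "Healthcare"]

# Synthetic sector returns: a market-wide factor plus sector-specific noise.
# The factor's weight rises after the event to mimic stronger co-movement.
market = rng.normal(0, 0.01, size=400)
specific = rng.normal(0, 0.01, size=(400, 3))
weight = np.where(dates < event_date, 0.3, 2.0)
sector_returns = pd.DataFrame(
    weight[:, None] * market[:, None] + specific, index=dates, columns=sectors
)

# Pearson correlation matrices for the two windows.
corr_before = sector_returns.loc[sector_returns.index < event_date].corr()
corr_after = sector_returns.loc[sector_returns.index >= event_date].corr()

# Positive entries mark sector pairs whose co-movement strengthened after the event.
corr_change = corr_after - corr_before
```

Either matrix can be rendered as a heatmap, for example with matplotlib's `imshow` or seaborn's `heatmap`. Fixing the color scale to the range -1 to 1 keeps the "before" and "after" panels comparable.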

## Main Results


```python
from dashboard import Dashboard

dashboard = Dashboard()
dashboard.show()
```

*[Interactive Bokeh dashboard output; not rendered in this static export.]*

## Conclusions

This project analyzed stock market behavior using basic historical stock data, focusing on sector relationships, stock volatility, and patterns of market sentiment. EDA, clustering, and correlation analysis each contributed a different view of market dynamics and sector behavior. Line charts of stock prices and returns gave a foundation for understanding market trends and for spotting patterns that general market commentary may not cover. Comparing sector performance over time and examining before-and-after density plots of returns showed how market sentiment and sector stability are affected by specific events. Clustering stocks on performance metrics identified groups that exhibit similar behavior, showing how certain stocks or sectors react similarly to market events and volatility. Correlation heatmaps visualized the relationships between sectors, enabling us to identify opportunities for diversification.