# Market Mood & Moves  
## Project Overview and System Architecture

This project explores how financial news sentiment can influence
stock market movements. The objective is to design a clean,
leakage-free data pipeline (one in which no feature uses information
that was unavailable at the time it would have been acted on) that
converts raw news into time-aligned sentiment signals, which can
later be used for predictive modeling.

At this stage, the focus is on building strong foundations:
- correct data handling
- robust preprocessing
- sound financial reasoning


## Motivation: Behavioral Finance Perspective

Traditional financial theory, in the form of the efficient market
hypothesis, assumes that prices fully reflect all available information.

However, behavioral finance suggests that the following factors can
temporarily influence prices, especially around major events such as
earnings releases, regulatory changes, or macroeconomic announcements:
- investor sentiment
- news tone
- collective psychology

This project aims to capture such sentiment-driven effects
in a structured and quantitative manner.


## Problem Statement

Financial news arrives continuously, while stock prices are observed
at discrete trading intervals.

Key challenges include:
- filtering irrelevant or ambiguous news
- avoiding look-ahead bias
- aligning news timestamps with trading days
- aggregating sentiment signals meaningfully

If these steps are handled incorrectly, any downstream model
will produce misleading results, for example by appearing predictive
only because future information leaked into its features.
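
To make the look-ahead and alignment challenges concrete, here is a minimal sketch of one possible alignment rule. It assumes timestamps have already been converted to exchange-local time (for example with Python's `zoneinfo`), that the market closes at 16:00, and that holidays are supplied by the caller. The names `MARKET_CLOSE`, `is_trading_day`, and `assign_trading_day` are hypothetical and not part of the project code.

```python
from datetime import date, datetime, time, timedelta

# Assumed closing time in exchange-local time; adjust per exchange.
MARKET_CLOSE = time(16, 0)


def is_trading_day(day: date, holidays: frozenset = frozenset()) -> bool:
    """Treat weekdays that are not in the caller-supplied holiday set as sessions."""
    return day.weekday() < 5 and day not in holidays


def assign_trading_day(published_local: datetime,
                       holidays: frozenset = frozenset()) -> date:
    """Map an exchange-local publication time to the first trading day
    whose close comes strictly after the article was published."""
    day = published_local.date()
    # News published at or after the close cannot affect that day's close,
    # so it is pushed to the next calendar day to avoid look-ahead bias.
    if published_local.time() >= MARKET_CLOSE:
        day += timedelta(days=1)
    # Roll weekends and holidays forward to the next session.
    while not is_trading_day(day, holidays):
        day += timedelta(days=1)
    return day
```

Under this rule, an article published on a Friday evening or over the weekend is attributed to Monday's session, so a feature for day *t* only contains news that was public before the close of day *t*.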


## System Architecture Overview

The project is structured as a multi-stage pipeline:

1. Data Ingestion  
   - Collect news articles and stock price data

2. Data Storage  
   - Persist raw data in a structured format for reproducibility

3. Text Processing & Sentiment Extraction  
   - Clean text
   - Apply financial sentiment models

4. Temporal Alignment  
   - Align news with the correct trading day
   - Handle market hours and weekends

5. Feature Preparation  
   - Create model-ready datasets for future analysis

Each stage is designed to be modular and interpretable.
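
As an illustration of what stage 3 might look like as a self-contained module, the sketch below cleans a headline and scores it with a toy word list. The word lists and the function names `clean_text` and `lexicon_score` are assumptions made for this example; the toy lexicon only stands in for whichever financial sentiment model the project ends up using.

```python
import re

# Toy word lists, purely illustrative; not a real financial lexicon.
POSITIVE = {"beats", "surge", "growth", "upgrade", "profit"}
NEGATIVE = {"misses", "plunge", "loss", "downgrade", "lawsuit"}


def clean_text(raw: str) -> str:
    """Lowercase, drop URLs and non-letter characters, collapse whitespace."""
    text = raw.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)       # keep letters and whitespace only
    return re.sub(r"\s+", " ", text).strip()


def lexicon_score(headline: str) -> float:
    """Return (positive - negative) / matched words, in [-1, 1].

    Headlines with no matched words score 0.0 (treated as neutral).
    """
    tokens = clean_text(headline).split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0
```

Keeping cleaning and scoring as separate functions means the scorer can later be swapped for a real model without touching the rest of the pipeline.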


## Data Flow

Raw News (timestamps, headlines)
        ↓
Text Cleaning & Filtering
        ↓
Sentiment Scores
        ↓
Timezone-Aware Alignment
        ↓
Daily Aggregated Signals
        ↓
Merged with Stock Prices
        ↓
Model-Ready Feature Table


## Scope of Current Implementation

Current focus (Weeks 1–2):
- building the data pipeline
- understanding sentiment models
- ensuring temporal correctness

Out of scope for now:
- predictive modeling
- trading strategies
- performance evaluation

These components will be introduced in later stages
once the data foundations are reliable.


## Project Notebook Roadmap

The project is organized into the following notebooks:

1. Project Overview & Architecture  
2. Data Ingestion & Storage  
3. Text Processing & Sentiment Engine  
4. Time Alignment & Feature Preparation  

Each notebook corresponds to a distinct component
of the overall system.
