Weather Prediction Using Markov Chain Model

Project Report

1. Introduction

This project demonstrates the application of Markov Chain theory to predict weather patterns. A Markov Chain is a mathematical model that describes a sequence of events where the probability of each event depends only on the state of the previous event, not on the entire history.

In the context of weather prediction, we use historical weather data to build a probabilistic model that can predict the likelihood of transitioning from one weather condition to another.

2. What Does This Project Do?

The Weather Markov Chain Predictor is a web application that:

Collects Real Weather Data: Fetches current and historical weather information for any major city worldwide
Builds a Markov Chain Model: Analyzes weather patterns to understand how weather conditions transition over time
Predicts Future Weather: Calculates the probability of different weather conditions occurring next
Visualizes Results: Displays predictions through interactive charts, graphs, and a transition probability matrix

Key Features:

Real-time weather data from 120+ cities worldwide
3 months (90 days) of historical weather data per city
Interactive 3D globe showing city locations
Beautiful visualizations of probability distributions
Complete transition matrix showing all possible weather state changes

3. Understanding Markov Chains

What is a Markov Chain?

A Markov Chain is a stochastic model that satisfies the Markov Property:

"The future state depends only on the current state, not on the sequence of events that preceded it."

In mathematical terms:

P(Xₙ₊₁ = x | X₀, X₁, ..., Xₙ) = P(Xₙ₊₁ = x | Xₙ)

Why Use Markov Chains for Weather?

Hour-to-hour weather patterns are approximately Markovian:

If it's currently Sunny, there's a certain probability it will become Cloudy next
If it's currently Rainy, there's a certain probability it will become Clear next
The next weather state depends primarily on the current state, not on what happened days ago

This makes Markov Chains a reasonable and simple model for short-term weather prediction.

4. How the Algorithm Works

Step 1: Data Collection

The system collects hourly weather data for a selected city:

Temperature (°C)
Humidity (%)
Wind Speed (km/h)
Weather Condition (Clear, Cloudy, Rain, etc.)

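For each city, we gather roughly 90 days × 24 hours = 2,160 hourly observations. The exact count varies slightly by city; the Bangalore dataset in Section 11, for example, contains 2,185.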

Step 2: State Identification

Weather conditions are categorized into discrete states:

Clear
Mainly Clear
Partly Cloudy
Overcast
Foggy
Light Drizzle
Drizzle
Heavy Drizzle
Light Rain
Rain
Heavy Rain
Light Snow
Snow
Heavy Snow
Showers
Thunderstorm
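
Open-Meteo reports conditions as numeric WMO weather codes, so one plausible way to obtain the labels above is a lookup table. The sketch below is an assumption about how such a mapping might look: the names `CONDITION_BY_CODE` and `conditionFromCode`, and the exact code-to-label assignment, are illustrative and not taken from the project.

```typescript
// Hypothetical lookup from WMO weather codes to the state labels listed above.
// The code-to-label assignment is an assumption made for illustration.
const CONDITION_BY_CODE: Record<number, string> = {
  0: "Clear",
  1: "Mainly Clear",
  2: "Partly Cloudy",
  3: "Overcast",
  45: "Foggy",
  48: "Foggy",
  51: "Light Drizzle",
  53: "Drizzle",
  55: "Heavy Drizzle",
  61: "Light Rain",
  63: "Rain",
  65: "Heavy Rain",
  71: "Light Snow",
  73: "Snow",
  75: "Heavy Snow",
  80: "Light Showers",
  81: "Showers",
  95: "Thunderstorm",
};

// Map a numeric code to a state label; unmapped codes fall back to "Unknown".
function conditionFromCode(code: number): string {
  const label = CONDITION_BY_CODE[code];
  return label === undefined ? "Unknown" : label;
}
```

Keeping the mapping in one table means the set of Markov states is defined in a single place, and any code not listed is grouped under one fallback label instead of silently creating a new state.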

Step 3: Transition Counting

The algorithm analyzes sequential weather records to count transitions:

Example (using simplified state names for brevity):

Hour 1: Sunny    → Hour 2: Sunny     (Sunny → Sunny: +1)
Hour 2: Sunny    → Hour 3: Cloudy    (Sunny → Cloudy: +1)
Hour 3: Cloudy   → Hour 4: Rainy     (Cloudy → Rainy: +1)
Hour 4: Rainy    → Hour 5: Cloudy    (Rainy → Cloudy: +1)

After processing all 2,159 transitions (2,160 observations yield 2,159 consecutive pairs), we have a count matrix showing how many times each transition occurred.
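
The counting step can be sketched as a small pure function over an ordered list of state labels. This is a minimal illustration with no database access; `countTransitions` and `CountMatrix` are hypothetical names, and the input reuses the simplified states from the example above.

```typescript
type CountMatrix = Record<string, Record<string, number>>;

// Count how often each state is immediately followed by each other state.
function countTransitions(sequence: string[]): CountMatrix {
  const counts: CountMatrix = {};
  for (let i = 0; i < sequence.length - 1; i++) {
    const from = sequence[i];
    const to = sequence[i + 1];
    if (!counts[from]) counts[from] = {};
    counts[from][to] = (counts[from][to] || 0) + 1;
  }
  return counts;
}

// The five hourly readings from the example give four transitions.
const exampleCounts = countTransitions(["Sunny", "Sunny", "Cloudy", "Rainy", "Cloudy"]);
// exampleCounts: { Sunny: { Sunny: 1, Cloudy: 1 }, Cloudy: { Rainy: 1 }, Rainy: { Cloudy: 1 } }
```

A sequence of n observations always produces n - 1 transitions, which is why the loop stops one element before the end.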

Step 4: Probability Calculation

For each weather state, we calculate the probability of transitioning to every other state:

P(State A → State B) = Count(A → B) / Total transitions from A

Example: If "Sunny" was the starting state of 500 transitions, distributed as follows:

Sunny: 300 times → Probability = 300/500 = 60%
Cloudy: 150 times → Probability = 150/500 = 30%
Rainy: 50 times → Probability = 50/500 = 10%

These probabilities form the Transition Matrix.
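
The normalization for a single row can be sketched as follows. `toProbabilities` is a hypothetical helper name, and the input is the "Sunny" row from the example above.

```typescript
// Convert one row of transition counts into probabilities that sum to 1.
function toProbabilities(row: Record<string, number>): Record<string, number> {
  const keys = Object.keys(row);
  let total = 0;
  for (const key of keys) total += row[key];

  const probabilities: Record<string, number> = {};
  for (const key of keys) {
    // Guard against an empty or all-zero row to avoid dividing by zero.
    probabilities[key] = total > 0 ? row[key] / total : 0;
  }
  return probabilities;
}

const sunnyRow = toProbabilities({ Sunny: 300, Cloudy: 150, Rainy: 50 });
// sunnyRow.Sunny is 0.6, sunnyRow.Cloudy is 0.3, sunnyRow.Rainy is 0.1
```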

Step 5: Building the Transition Matrix

The transition matrix is a table where:

Rows represent the current weather state
Columns represent the next weather state
Values represent the probability of transition

Each row sums to 100% (representing all possible next states).
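
A quick sanity check on a computed matrix is to confirm that every row sums to 1 within a floating-point tolerance. The sketch below assumes the matrix is held as a nested object of probabilities; `invalidRows` and `ProbabilityMatrix` are hypothetical names.

```typescript
type ProbabilityMatrix = Record<string, Record<string, number>>;

// Return the names of rows whose probabilities do not sum to 1 within a tolerance.
function invalidRows(matrix: ProbabilityMatrix, tolerance: number = 1e-9): string[] {
  const bad: string[] = [];
  for (const from of Object.keys(matrix)) {
    let sum = 0;
    for (const to of Object.keys(matrix[from])) sum += matrix[from][to];
    if (Math.abs(sum - 1) > tolerance) bad.push(from);
  }
  return bad;
}

const checked = invalidRows({
  Sunny: { Sunny: 0.6, Cloudy: 0.3, Rainy: 0.1 }, // valid row
  Cloudy: { Sunny: 0.5, Rainy: 0.4 },             // sums to 0.9, so it is flagged
});
// checked contains only "Cloudy"
```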

Step 6: Making Predictions

Given the current weather condition, the system:

Looks up the corresponding row in the transition matrix
Retrieves all possible next states with their probabilities
Ranks them from most likely to least likely
Displays the results visually
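
The lookup-and-rank steps above can be sketched against an in-memory matrix. This is an illustration only; `rankNextStates` and the `Prediction` shape are hypothetical, and the probabilities reuse the "Sunny" example from Step 4.

```typescript
interface Prediction {
  condition: string;
  probability: number;
}

// Look up the row for the current state and rank next states by probability.
function rankNextStates(
  matrix: Record<string, Record<string, number>>,
  current: string
): Prediction[] {
  const row = matrix[current];
  if (!row) return []; // state never observed as a starting point
  return Object.keys(row)
    .map((condition) => ({ condition, probability: row[condition] }))
    .sort((a, b) => b.probability - a.probability);
}

const ranked = rankNextStates(
  { Sunny: { Rainy: 0.1, Sunny: 0.6, Cloudy: 0.3 } },
  "Sunny"
);
// ranked lists Sunny (0.6), then Cloudy (0.3), then Rainy (0.1)
```

Returning an empty list for an unseen state makes the "no data" case explicit, so the caller can decide how to display it.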

5. Mathematical Foundation

Transition Matrix

A Markov Chain is defined by its transition matrix P, where:

P = [p₁₁  p₁₂  ...  p₁ₙ]
    [p₂₁  p₂₂  ...  p₂ₙ]
    [  ⋮    ⋮   ⋱    ⋮ ]
    [pₙ₁  pₙ₂  ...  pₙₙ]

Where:

pᵢⱼ = Probability of transitioning from state i to state j
Each row sums to 1: Σⱼ pᵢⱼ = 1

Stochastic Matrix Properties

The transition matrix is a stochastic matrix because:

All entries are non-negative: pᵢⱼ ≥ 0
Each row sums to 1: Σⱼ pᵢⱼ = 1

Prediction Formula

The probability of being in state j at time n+1, given we're in state i at time n:

P(Xₙ₊₁ = j | Xₙ = i) = pᵢⱼ

For multi-step predictions (k steps ahead):

P⁽ᵏ⁾ = Pᵏ

Where P⁽ᵏ⁾ is the k-step transition matrix.
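
A minimal sketch of the k-step computation by repeated matrix multiplication is shown below. The two-state matrix (Dry and Wet) and the function names `multiply` and `kStepMatrix` are assumptions made for illustration; the project itself only predicts one step ahead.

```typescript
// Multiply two square matrices of the same size.
function multiply(a: number[][], b: number[][]): number[][] {
  const n = a.length;
  const result: number[][] = [];
  for (let i = 0; i < n; i++) {
    const row: number[] = [];
    for (let j = 0; j < n; j++) {
      let sum = 0;
      for (let m = 0; m < n; m++) sum += a[i][m] * b[m][j];
      row.push(sum);
    }
    result.push(row);
  }
  return result;
}

// Compute P^k by starting from the identity matrix and multiplying by P k times.
function kStepMatrix(p: number[][], k: number): number[][] {
  let result: number[][] = p.map((row, i) => row.map((_, j) => (i === j ? 1 : 0)));
  for (let step = 0; step < k; step++) result = multiply(result, p);
  return result;
}

// Hypothetical two-state chain: index 0 = Dry, index 1 = Wet.
const oneStep = [
  [0.9, 0.1],
  [0.5, 0.5],
];
const twoStep = kStepMatrix(oneStep, 2);
// twoStep[0][1] = 0.9 * 0.1 + 0.1 * 0.5 = 0.14 (Dry now, Wet in two hours)
```

Each row of the result is still a probability distribution, so the rows of `twoStep` also sum to 1.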

6. Real-World Example

Let's walk through a real prediction from the system:

Scenario: Current Weather in Bangalore

Current Condition: Light Showers

Historical Analysis:

From 2,185 weather records, the system found:

"Light Showers" occurred 50 times
Transitions observed:
- Light Showers → Light Drizzle: 50 times
- Light Showers → Other states: 0 times

Calculated Probabilities:

P(Light Showers → Light Drizzle) = 50/50 = 100%

Prediction:

Next weather state: Light Drizzle (100% probability)

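This means that in Bangalore's historical data, every time it was "Light Showers", the next hour it became "Light Drizzle". With only 50 observations of this state, the 100% figure reflects limited data rather than a guaranteed outcome.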

Another Example: Overcast Weather

Current Condition: Overcast

From the transition matrix:

Overcast → Overcast: 76.2%
Overcast → Light Drizzle: 13.8%
Overcast → Partly Cloudy: 4.3%
Overcast → Other states: 5.7%

Prediction: Most likely to remain Overcast (76.2%), but there's a 13.8% chance of Light Drizzle.
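
As a worked illustration of what the 76.2% self-transition implies, assume the first-order model holds exactly, so each hour is an independent draw from the Overcast row. Under that assumption, the probability of staying Overcast for k further consecutive hours, and the expected length of an Overcast spell, follow from the geometric distribution:

```latex
\begin{align*}
P(\text{Overcast for } k \text{ more hours}) &= p^{k}, \qquad p = 0.762 \\
P(\text{Overcast for 3 more hours}) &= 0.762^{3} \approx 0.442 \\
E[\text{spell length in hours}] &= \frac{1}{1 - p} = \frac{1}{0.238} \approx 4.2
\end{align*}
```

So even a highly persistent state is more likely than not to have changed within three hours.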

7. Advantages of This Approach

1. Simplicity

Easy to understand and implement
No complex neural networks or machine learning required
Based on solid mathematical theory

2. Interpretability

Results are transparent and explainable
You can see exactly why a prediction was made
Probabilities are directly derived from historical data

3. Efficiency

Fast computation (processes 2,000+ records in seconds)
Low memory requirements
No iterative training phase; the model is built in a single counting pass

4. Real-Time Updates

Model can be rebuilt quickly whenever new data arrives
Reflects the latest weather patterns once rebuilt
Rebuilding only requires recounting transitions

5. Probabilistic Nature

Provides confidence levels (probabilities)
Shows all possible outcomes, not just one prediction
Helps in decision-making under uncertainty

8. Limitations

1. Memoryless Property

Only considers the current state
Doesn't account for longer-term patterns (e.g., seasonal trends)
Can't capture complex weather phenomena

2. Data Dependency

Accuracy depends on the quality and quantity of historical data
Rare weather events may not be well-represented
Local patterns may not generalize to other regions

3. Short-Term Predictions

Best for immediate next-state predictions
Accuracy decreases for longer-term forecasts
Not suitable for weekly or monthly predictions

4. Stationarity Assumption

Assumes weather patterns don't change over time
Doesn't account for climate change
May need periodic retraining with fresh data

9. Applications

This Markov Chain weather model can be used for:

Short-term Planning: Deciding whether to carry an umbrella
Event Management: Planning outdoor events based on weather probabilities
Agriculture: Irrigation scheduling based on rain predictions
Transportation: Route planning considering weather conditions
Education: Teaching probability theory and stochastic processes
Research: Baseline model for comparing advanced weather prediction methods

10. Technical Implementation

Algorithm Implementation

Below is the core Markov Chain algorithm implemented in this project:

async buildMarkovChain(city: string): Promise<MarkovChainData> {
  // Step 1: Fetch all historical weather records for the city
  const records = db.prepare(`
    SELECT * FROM weather_records
    WHERE city = ?
    ORDER BY timestamp ASC
  `).all(city);

  // Step 2: Initialize data structures
  const transitions: Record<string, Record<string, number>> = {};
  const states = new Set<string>();

  // Step 3: Count transitions between consecutive weather states
  for (let i = 0; i < records.length - 1; i++) {
    const fromCondition = records[i].condition;
    const toCondition = records[i + 1].condition;

    // Track all unique states
    states.add(fromCondition);
    states.add(toCondition);

    // Initialize nested object if needed
    if (!transitions[fromCondition]) {
      transitions[fromCondition] = {};
    }

    // Increment transition count
    transitions[fromCondition][toCondition] =
      (transitions[fromCondition][toCondition] || 0) + 1;
  }

  // Step 4: Calculate probabilities from counts
  const transitionsToStore = [];

  for (const [fromCondition, toStates] of Object.entries(transitions)) {
    // Calculate total transitions from this state
    const total = Object.values(toStates).reduce(
      (sum, count) => sum + count,
      0
    );

    // Calculate probability for each transition
    for (const [toCondition, count] of Object.entries(toStates)) {
      const probability = count / total;

      transitionsToStore.push({
        city,
        fromCondition,
        toCondition,
        transitionCount: count,
        probability,
      });
    }
  }

  // Step 5: Store in database for future use
  db.prepare("DELETE FROM markov_transitions WHERE city = ?").run(city);

  const insert = db.prepare(`
    INSERT INTO markov_transitions
    (city, from_condition, to_condition, transition_count, probability)
    VALUES (?, ?, ?, ?, ?)
  `);

  const insertMany = db.transaction((records: any[]) => {
    for (const record of records) {
      insert.run(
        record.city,
        record.fromCondition,
        record.toCondition,
        record.transitionCount,
        record.probability
      );
    }
  });

  insertMany(transitionsToStore);

  return {
    city,
    transitionMatrix: transitions,
    states: Array.from(states),
  };
}

Code Explanation

Part 1: Data Retrieval

const records = db
  .prepare(
    `
  SELECT * FROM weather_records 
  WHERE city = ? 
  ORDER BY timestamp ASC
`
  )
  .all(city);

Fetches all weather records for the specified city
Orders by timestamp to maintain chronological sequence
This ensures we analyze transitions in the correct order

Part 2: Initialization

const transitions: Record<string, Record<string, number>> = {};
const states = new Set<string>();

transitions: A nested object to store transition counts
- Structure: { "Sunny": { "Cloudy": 10, "Rainy": 5 }, ... }
states: A set to track all unique weather conditions

Part 3: Counting Transitions

for (let i = 0; i < records.length - 1; i++) {
  const fromCondition = records[i].condition;
  const toCondition = records[i + 1].condition;

  states.add(fromCondition);
  states.add(toCondition);

  if (!transitions[fromCondition]) {
    transitions[fromCondition] = {};
  }

  transitions[fromCondition][toCondition] =
    (transitions[fromCondition][toCondition] || 0) + 1;
}

Loops through consecutive pairs of weather records
For each pair, records a transition from fromCondition to toCondition
Increments the count for this specific transition
Example: If hour 5 is "Sunny" and hour 6 is "Cloudy", we increment transitions["Sunny"]["Cloudy"]

Part 4: Probability Calculation

for (const [fromCondition, toStates] of Object.entries(transitions)) {
  const total = Object.values(toStates).reduce((sum, count) => sum + count, 0);

  for (const [toCondition, count] of Object.entries(toStates)) {
    const probability = count / total;
    // Store probability...
  }
}

For each weather state, calculates the total number of transitions
Divides each transition count by the total to get probability
Example: If "Sunny" transitioned 300 times total:
- 180 times to "Sunny" → probability = 180/300 = 0.60 (60%)
- 90 times to "Cloudy" → probability = 90/300 = 0.30 (30%)
- 30 times to "Rainy" → probability = 30/300 = 0.10 (10%)

Part 5: Database Storage

const insertMany = db.transaction((records: any[]) => {
  for (const record of records) {
    insert.run(
      record.city,
      record.fromCondition,
      record.toCondition,
      record.transitionCount,
      record.probability
    );
  }
});

insertMany(transitionsToStore);

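Uses a database transaction for efficiency
Inserts all transitions in a single atomic operation (the preceding DELETE runs outside this transaction)
Batching the inserts in one transaction is much faster than committing each insert individually (the project reports a 30-60x speedup)
Allows reusing the computed probabilities without recalculation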

Prediction Algorithm

async predictNextWeather(
  city: string,
  currentCondition: string
): Promise<WeatherPrediction> {
  // Fetch all possible transitions from current state
  const transitions = db.prepare(`
    SELECT * FROM markov_transitions
    WHERE city = ? AND from_condition = ?
    ORDER BY probability DESC
  `).all(city, currentCondition);

  // Return predictions sorted by probability
  return {
    currentCondition,
    predictions: transitions.map((t) => ({
      condition: t.to_condition,
      probability: t.probability,
    })),
  };
}

How it works:

Queries the database for all transitions from the current weather state
Results are sorted by probability, highest first, by the ORDER BY clause in the query
Returns a list of possible next states with their probabilities
The UI displays this as a bar chart and summary cards

11. Results and Accuracy

Sample Results from Bangalore

Dataset: 2,185 hourly weather observations (90 days)

Weather States Identified: 11 unique conditions

Clear
Mainly Clear
Partly Cloudy
Overcast
Light Drizzle
Drizzle
Light Rain
Rain
Heavy Rain
Light Showers
Heavy Drizzle

Total Transitions Analyzed: 2,184

Sample Transition Probabilities:

Overcast → Overcast: 76.2% (most stable state)
Light Drizzle → Light Drizzle: 54.9% (tends to persist)
Drizzle → Light Rain: 24.8% (drizzle turns into light rain about a quarter of the time)
Clear → Clear: 64.7% (stable clear weather)

Observations

Weather Persistence: Most weather conditions tend to persist (diagonal values in matrix are high)
Gradual Transitions: Weather typically changes gradually (e.g., Clear → Partly Cloudy → Overcast)
Rare Sudden Changes: Direct transitions from Clear to Heavy Rain are rare
Local Patterns: Bangalore shows high persistence of Overcast conditions (76.2%)

12. Conclusion

This project successfully demonstrates the application of Markov Chain theory to weather prediction. By analyzing historical weather patterns, we can build a probabilistic model that provides meaningful predictions about future weather states.

Key Achievements:

✅ Implemented a complete Markov Chain model from scratch
✅ Processed 2,000+ weather records efficiently
✅ Built an interactive visualization system
✅ Achieved real-time predictions with probability distributions
✅ Created an educational tool for understanding stochastic processes

Learning Outcomes:

Understanding of Markov Chain theory and its applications
Practical experience with probability and statistics
Data processing and analysis skills
Algorithm implementation and optimization
Real-world application of theoretical concepts

Future Enhancements:

Higher-Order Markov Chains: Consider multiple previous states
Seasonal Adjustments: Account for seasonal weather patterns
Multi-City Comparisons: Compare weather patterns across cities
Weather Severity Scoring: Incorporate temperature and wind data
Long-Term Forecasting: Extend predictions beyond immediate next state
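
As an illustration of the first enhancement, a second-order chain could key each transition on the previous two states instead of one. The sketch below is one possible approach, not part of the current project; `countSecondOrderTransitions` and the `"A | B"` context key format are hypothetical.

```typescript
type ContextCounts = Record<string, Record<string, number>>;

// Count transitions where the context is the pair of the two most recent states.
function countSecondOrderTransitions(sequence: string[]): ContextCounts {
  const counts: ContextCounts = {};
  for (let i = 0; i < sequence.length - 2; i++) {
    const context = sequence[i] + " | " + sequence[i + 1];
    const next = sequence[i + 2];
    if (!counts[context]) counts[context] = {};
    counts[context][next] = (counts[context][next] || 0) + 1;
  }
  return counts;
}

const secondOrder = countSecondOrderTransitions([
  "Clear", "Clear", "Overcast", "Clear", "Clear", "Overcast",
]);
// secondOrder["Clear | Clear"] is { Overcast: 2 }
```

The trade-off is that the number of possible contexts grows with the square of the number of states, so each context is observed less often and more historical data is needed for stable probabilities.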

13. References

Markov Chain Theory
- Norris, J. R. (1997). "Markov Chains". Cambridge University Press.
Weather Prediction Models
- Wilks, D. S. (2011). "Statistical Methods in the Atmospheric Sciences". Academic Press.
Stochastic Processes
- Ross, S. M. (2014). "Introduction to Probability Models". Academic Press.
Data Source
- Open-Meteo API: https://open-meteo.com/
- Free weather data with historical archives

14. Acknowledgments

This project was developed as part of the Theory of Computation (TOC) course to demonstrate practical applications of mathematical models in computer science.

Technologies Used:

Algorithm: Markov Chain (Stochastic Process)
Data Source: Open-Meteo Weather API
Database: SQLite (for fast data processing)
Frontend: React, TypeScript, Tailwind CSS
Backend: Node.js, Express
Visualization: Recharts, Framer Motion, COBE

Project Completed: November 2025

Total Lines of Code: ~3,500

Total Weather Records Processed: 2,185+ per city

Average Prediction Time: 2-3 seconds

Accuracy: Based on historical patterns (varies by location and weather stability)


tarinagarwal/TOC-Assignment
