# Weather Prediction Dashboard Implementation

## Overview

The goal is to build a weather prediction system that ingests minute-by-minute readings, aggregates them into daily summaries, and uses those summaries to predict weather conditions. The system is implemented with **Python**, **Pandas**, **Streamlit**, and **Random Forest models**, and it forecasts the weather for the next 21 days.

In this system, we will:

1. Collect new weather data every minute.
2. Aggregate the minute-level data into daily summaries by calculating the mean for each feature.
3. Append each daily summary to the main weather dataset (a CSV file).
4. Display predictions interactively using **Streamlit** for better visualization and user interaction.

With **Streamlit**, the management team can explore the predictions, view trends, and analyze the weather data interactively.

## Steps for Implementation

### 1. Data Collection and Aggregation

- **Data Collection**: Minute-by-minute weather data (temperature, humidity, wind speed, pressure, and rain status) is collected throughout each day. The source could be an external weather API or local sensors.

- **Data Aggregation**: Minute-level data is too granular to feed to the prediction models directly, so each day's readings are reduced to one summary row: the **daily mean** of every numeric feature, plus the share of minutes with rain. This keeps the dataset compact, with one row per day.

Here’s how the data will be aggregated:
- **Average Temperature**: Calculate the mean of the temperature values for the day.
- **Humidity**: Calculate the mean humidity for the day.
- **Wind Speed**: Calculate the mean wind speed for the day.
- **Pressure**: Calculate the mean atmospheric pressure for the day.
- **Rain Percentage**: Calculate the percentage of the day's minutes in which it rained (the mean of the 0/1 rain flag, multiplied by 100).

The result is a single summary row per day, which is better suited to training prediction models and generating insights than the raw minute-level readings.
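The aggregation rules above can also be applied to several days of readings at once with a single pandas `groupby`. The following is a minimal sketch, assuming the minute-level readings sit in a DataFrame that uses the same column names as the script in the next section (`timestamp`, `avg_temperature`, `humidity`, `avg_wind_speed`, `pressure`, `rain_or_not`). The helper name `aggregate_by_day` and the sample values are hypothetical and only serve to illustrate the idea.

```python
import pandas as pd


def aggregate_by_day(df_minutes):
    """Collapse minute-level readings into one row per calendar date."""
    daily = (
        df_minutes
        .groupby(df_minutes['timestamp'].dt.date)
        .agg(
            avg_temperature=('avg_temperature', 'mean'),
            humidity=('humidity', 'mean'),
            avg_wind_speed=('avg_wind_speed', 'mean'),
            pressure=('pressure', 'mean'),
            # The mean of a 0/1 flag is the fraction of minutes with rain
            rain_percentage=('rain_or_not', 'mean'),
        )
    )
    daily['rain_percentage'] *= 100  # Convert the fraction to a percentage
    daily.index.name = 'date'
    return daily.reset_index()


# Tiny hand-made sample: four readings spread over two days (values are made up)
sample = pd.DataFrame({
    'timestamp': pd.to_datetime([
        '2024-01-01 00:00', '2024-01-01 00:01',
        '2024-01-02 00:00', '2024-01-02 00:01',
    ]),
    'avg_temperature': [20.0, 22.0, 25.0, 27.0],
    'humidity': [60.0, 70.0, 50.0, 50.0],
    'avg_wind_speed': [2.0, 4.0, 1.0, 3.0],
    'pressure': [1012.0, 1014.0, 1020.0, 1020.0],
    'rain_or_not': [1, 0, 0, 0],
})

daily_summary = aggregate_by_day(sample)
print(daily_summary)
```

Grouping by calendar date rather than averaging the whole frame means the same helper works whether the input holds one day of readings or a backlog of several days.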

### 2. Updating the Main Weather Dataset (CSV)

Once the daily data is aggregated, we update the main weather dataset (a CSV file) with the new daily row.

- **Step 1**: Check whether the CSV file exists. If it does, read it and append the new daily row, which holds the aggregated values for the day.

- **Step 2**: If the CSV file doesn’t exist, create it with the necessary headers and store the first day's data (i.e., the aggregated mean values of temperature, humidity, wind speed, and pressure, plus the rain percentage).

The script will look like this:


```python
import os

import numpy as np
import pandas as pd

# File path for the main weather dataset (CSV file)
csv_file_path = "updated_weather_data.csv"  # Replace with your actual path

# Function to collect new data (simulation for the example)
def collect_weather_data():
    # Simulating the collection of new minute-by-minute weather data for the day
    # In a real scenario, this data would be collected from a weather station or API
    # Start at midnight so that all 1440 samples fall on the same calendar date
    start_of_day = pd.Timestamp.now().normalize()
    new_data = {
        # 'min' is the minute alias; the older 'T' alias is deprecated in recent pandas versions
        'timestamp': pd.date_range(start=start_of_day, periods=1440, freq='min'),
        'avg_temperature': np.random.uniform(20, 30, 1440),  # Random data for temperature (°C)
        'humidity': np.random.uniform(50, 80, 1440),  # Random data for humidity (%)
        'avg_wind_speed': np.random.uniform(0, 10, 1440),  # Random data for wind speed (m/s)
        'pressure': np.random.uniform(1010, 1025, 1440),  # Random data for pressure (hPa)
        'rain_or_not': np.random.choice([0, 1], size=1440)  # Random rain (0 = No Rain, 1 = Rain)
    }
    df_new_data = pd.DataFrame(new_data)
    return df_new_data

# Function to aggregate the collected data into daily means
def aggregate_daily_data(df_new_data):
    # Calculate mean values for each feature for the entire day
    daily_data = {
        'date': pd.to_datetime(df_new_data['timestamp'].iloc[0]).date(),  # Extract the date
        'avg_temperature': df_new_data['avg_temperature'].mean(),
        'humidity': df_new_data['humidity'].mean(),
        'avg_wind_speed': df_new_data['avg_wind_speed'].mean(),
        'pressure': df_new_data['pressure'].mean(),
        'rain_percentage': df_new_data['rain_or_not'].mean() * 100  # Percentage of rain
    }
    return daily_data

# Function to update the main CSV file with the daily aggregated data
def update_csv_with_daily_data(daily_data, csv_file_path):
    # Convert the daily_data dictionary to a one-row DataFrame
    df_daily_data = pd.DataFrame([daily_data])

    if os.path.exists(csv_file_path):
        # The file exists: load it and append the new row
        df_existing = pd.read_csv(csv_file_path)
        df_updated = pd.concat([df_existing, df_daily_data], ignore_index=True)
    else:
        # The file does not exist yet: the new row (with its headers) becomes the dataset
        df_updated = df_daily_data

    # Save the updated DataFrame back to the CSV
    df_updated.to_csv(csv_file_path, index=False)

# Main function to run the process
def main():
    # Step 1: Collect the new weather data (minute-by-minute)
    df_new_data = collect_weather_data()

    # Step 2: Aggregate the data into daily mean values
    daily_data = aggregate_daily_data(df_new_data)

    # Step 3: Update the main weather dataset (CSV) with the new daily data
    update_csv_with_daily_data(daily_data, csv_file_path)

    print("Weather data for the day has been updated successfully!")

# Run the main function
if __name__ == "__main__":
    main()
```


When the script finishes, it prints `Weather data for the day has been updated successfully!` to confirm that the CSV file has been written.
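One caveat of appending unconditionally is that running the script twice on the same day writes two rows for that date. The following is a minimal sketch of one way to guard against this, assuming the CSV layout described above with a `date` column in ISO format. The function name `upsert_daily_row`, the temporary file name, and the sample values are hypothetical.

```python
import os
import tempfile

import pandas as pd


def upsert_daily_row(daily_data, csv_file_path):
    """Write daily_data to the CSV, replacing any existing row for the same date."""
    df_new = pd.DataFrame([daily_data])
    # Store the date as a string so it compares equal to values read back from the CSV
    df_new['date'] = df_new['date'].astype(str)

    if os.path.exists(csv_file_path):
        df_existing = pd.read_csv(csv_file_path, dtype={'date': str})
        df_updated = pd.concat([df_existing, df_new], ignore_index=True)
        # If the date was already present, keep only the most recent row for it
        df_updated = (
            df_updated
            .drop_duplicates(subset='date', keep='last')
            .reset_index(drop=True)
        )
    else:
        df_updated = df_new

    df_updated.to_csv(csv_file_path, index=False)
    return df_updated


# Usage example with a temporary file and made-up values
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_weather.csv")
    first = {
        'date': '2024-01-01', 'avg_temperature': 21.0, 'humidity': 65.0,
        'avg_wind_speed': 3.0, 'pressure': 1013.0, 'rain_percentage': 50.0,
    }
    rerun = dict(first, avg_temperature=22.5)  # Same date, updated reading
    upsert_daily_row(first, demo_path)
    result = upsert_daily_row(rerun, demo_path)

print(result)
```

Keeping the last row per date means a rerun overwrites the earlier summary instead of duplicating it. Whether overwriting or skipping is the right policy depends on how the data is collected.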


## Instructions to Run the Streamlit App

1. **Install the required libraries:**
   - Open your terminal or command prompt and run the following command:
     ```bash
     pip install streamlit pandas matplotlib scikit-learn
     ```

2. **Prepare your script:**
   - Ensure you have the following files:
     - **weather_prediction_script.py** (your Streamlit script; in this project it is `Q1_app.py`).
     - **weather_data.csv** (your weather data CSV).

3. **Run the Streamlit app:**
   - In the terminal or command prompt, navigate to the directory where **weather_prediction_script.py** is located.
   - Run the command:
     ```bash
     streamlit run weather_prediction_script.py  # in this project: streamlit run Q1_app.py
     ```

4. **Access the app:**
   - Streamlit will automatically start a local server and open the app in your web browser.

5. **Stop the app:**
   - To stop the app, press `CTRL + C` in the terminal or command prompt.
