Automates fetching, storing, and analyzing hourly weather data from the OpenWeatherMap API. Includes MySQL database integration, raw data storage, and scripts for analysis using SQL and Python. Extensible for more cities or data sources, ideal for learning ETL pipelines, API integration, and data management.
This project provides a comprehensive solution for fetching, storing, and analyzing weather data for multiple cities. Using the OpenWeatherMap API, we retrieve hourly weather conditions, store them in a MySQL database, and perform detailed analyses to answer various questions related to weather trends and statistics.
The repository is organized as follows:
config.json: Configuration file containing API key and database settingsmain.py: Main script to fetch and store weather datautils.py: Utility functions for API calls and database operationsanalysis.py: Script for analyzing weather data stored in raw JSON filesDDL.SQL: SQL file to create and set up database schemaQUERIES.SQL: SQL file containing analytical queriesraw_data: Directory for storing raw weather data (JSON format) , automatically created
Update the config.json file with your OpenWeatherMap API key and database credentials:
{
"api_key": "<your_api_key>",
"base_url": "https://api.openweathermap.org/data/3.0/onecall",
"db_config": {
"host": "localhost",
"user": "<your_user>",
"password": "<your_password>",
"database": "openweatherapi"
}
}Open the DDL.SQL file and execute the script in your MySQL database to create the necessary tables:
CREATE SCHEMA OpenWeatherAPI;
CREATE TABLE LK_WEATHER_TYPE (...);
CREATE TABLE LK_CITIES (...);
CREATE TABLE FACT_WEATHER (...);Insert lookup data for weather types and cities using the SQL provided in DDL.SQL.
The main.py script automates the process of retrieving and storing weather data.
Define the cities to fetch weather data for in main.py:
cities = ['Milano', 'Bologna', 'Cagliari']Run the script to fetch and store weather data for the last 3 days:
python main.pyThe script:
- Retrieves data using the OpenWeatherMap API.
- Saves raw JSON data in the
raw_datadirectory. - Inserts cleaned data into the
FACT_WEATHERtable in the MySQL database.
Open the QUERIES.SQL file to find SQL queries for analysis, including:
- Distinct weather conditions observed in a specific period.
- Ranking of the most common weather conditions per city.
- Average temperature observed per city.
- City with the highest temperature.
- City with the highest daily temperature variation.
- City with the strongest wind.
Execute these queries directly in your MySQL client to retrieve the analysis results.
Run analysis.py to perform data analysis directly on the raw JSON files. It answers the same questions as the SQL queries but uses pandas for processing.
python analysis.pyExample outputs include:
- Weather condition rankings per city.
- Temperature averages.
- Highest temperature and wind speed.
- Daily temperature variations.
Feel free to submit pull requests for improvements or additional features. For major changes, please discuss them via issues first.