-
Notifications
You must be signed in to change notification settings - Fork 0
data migration
windula_k edited this page Nov 3, 2023
·
6 revisions

Separate python Job was written to conduct Data Migration
* Temperature (C), Apparent Temperature (C) columns round-off to 1 decimal place
* Visibility (km) round-off to 1 decimal place
* Formatted Date modified to this format '2008-07-12 17:00:00'
* Storage Space: More decimal places require more storage space. While this might not be significant for a small number of records, it can become substantial with large datasets, leading to increased storage costs and potentially slower backup and restore times.
* Memory Usage: When these values are loaded into memory, they will consume more space, which can lead to more memory being used overall by the database system. This can be particularly impactful if the dataset is large and is frequently accessed, as it can lead to increased memory pressure.
* Index Size and Performance: Indexes on columns with many decimal places will also be larger. This can slow down index creation and maintenance (like insert, update, and delete operations), as well as potentially slow down query performance due to the larger index size leading to less efficient I/O operations.
* Calculation Overhead: Mathematical operations on numbers with many decimal places can take longer to compute. This is because the underlying algorithms that handle arithmetic operations have to process more digits, which can increase CPU usage and slow down query performance.
dbUrl=$url
dbUser=$user
dbPassword=$password
database=$database_name
port=$port numberpopulate these values in the main.py
python main.pyweather_data table
CREATE TABLE weather_analytics_data_db.weather_data (
weather_data_id int NOT NULL AUTO_INCREMENT,
observed_at datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
summary varchar(30) default null,
precip_type ENUM ('rain', 'snow', 'sleet', 'hail') DEFAULT null,
temperature_celsius decimal default 0.0,
apprnt_temperature_celsius decimal default 0.0,
humidity decimal default 0.0,
wind_speed_km_h decimal default 0.0,
wind_bearing_degrees int default 0,
visibility_km decimal default 0.0,
cloud_cover tinyint(1) default 0,
pressure_millibar decimal default 0.0,
daily_summary varchar(150) default null,
primary key (weather_data_id),
unique key weather_id_observed_at (weather_data_id,observed_at)
)weather_summary table
CREATE TABLE weather_analytics_data_db.weather_summary (
weather_summary_id int NOT NULL AUTO_INCREMENT,
summary varchar(30) default null,
precip_type ENUM ('rain', 'snow', 'sleet', 'hail') DEFAULT null,
daily_summary varchar(150) default null,
primary key (weather_summary_id),
unique key weather_summary_id_precip_type (summary,precip_type,daily_summary)
)