-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HERE Data Refresh #285
Comments
created a Python function that will backup a partitioned table month to a zipped file in a temporary directory and then will copy that zipped file to s3 def backup_to_s3_month(con, month, s3):
'''Backup a month of ta data to specified s3
'''
with tempfile.TemporaryDirectory() as tempdir:
#backup month data to compressed file
copy_query = sql.SQL("COPY here.{} TO STDOUT WITH (FORMAT 'csv', HEADER TRUE)")
#compress results of the copy command to a file in the compressed directory
os.chdir(tempdir)
data_file_path = 'here_{}.csv.gz'.format(month)
with gzip.GzipFile(filename=data_file_path, mode='w') as data_file:
with con.cursor() as cur:
cur.copy_expert(copy_query.format(sql.Identifier('ta_'+month)), data_file)
#copy file to s3 bucket
subprocess.check_call(['aws','s3','cp',data_file_path, s3]) |
Comparing the difference in the number of links and the amount they match between versions of the map Using routing streetsThese are directional, filtered to actually be navigable by auto, and have some other processing for routing SELECT COUNT(nu.link_dir) new_links, COUNT(ole.link_dir) ole_links,
SUM(CASE WHEN ole.link_dir = nu.link_dir THEN 1 ELSE 0 END) AS matches,
SUM(CASE WHEN ole.link_dir IS NOT NULL AND nu.link_dir IS NULL THEN 1 ELSE 0 END) AS old_not_new,
SUM(CASE WHEN ole.link_dir IS NULL AND nu.link_dir IS NOT NULL THEN 1 ELSE 0 END) AS new_not_old
FROM here.routing_streets_18_3 ole
FULL OUTER JOIN here.routing_streets_19_4 nu on nu.link_dir = ole.link_dir
StreetsIf we look at all links however (not directional) SELECT COUNT(nu.link_id) new_links, COUNT(ole.link_id) ole_links,
SUM(CASE WHEN ole.link_id = nu.link_id THEN 1 ELSE 0 END) AS matches,
SUM(CASE WHEN ole.link_id IS NOT NULL AND nu.link_id IS NULL THEN 1 ELSE 0 END) AS old_not_new,
SUM(CASE WHEN ole.link_id IS NULL AND nu.link_id IS NOT NULL THEN 1 ELSE 0 END) AS new_not_old
FROM here_gis.streets_18_3 ole
FULL OUTER JOIN here_gis.streets_19_4 nu on nu.link_id = ole.link_id
Traffic StreetsAnd if we look at "traffic streets", filtered to be paved and allowing cars and traffic SELECT COUNT(nu.link_id) new_links, COUNT(ole.link_id) ole_links,
SUM(CASE WHEN ole.link_id = nu.link_id THEN 1 ELSE 0 END) AS matches,
SUM(CASE WHEN ole.link_id IS NOT NULL AND nu.link_id IS NULL THEN 1 ELSE 0 END) AS old_not_new,
SUM(CASE WHEN ole.link_id IS NULL AND nu.link_id IS NOT NULL THEN 1 ELSE 0 END) AS new_not_old
FROM here_gis.traffic_streets_18_3 ole
FULL OUTER JOIN here_gis.traffic_streets_19_4 nu on nu.link_id = ole.link_id
Traffic_streets taking into account direction of travelWITH traffic_link_dirs_19_4 AS(
SELECT link_id || 'F'::text AS link_dir
FROM here_gis.traffic_streets_19_4
WHERE dir_travel = ANY (ARRAY['F', 'B'])
UNION ALL
SELECT link_id || 'T'::text AS link_dir
FROM here_gis.traffic_streets_19_4
WHERE dir_travel = ANY (ARRAY['T', 'B'])
)
,traffic_link_dirs_18_3 AS(
SELECT link_id || 'F'::text AS link_dir
FROM here_gis.traffic_streets_18_3
WHERE dir_travel = ANY (ARRAY['F', 'B'])
UNION ALL
SELECT link_id || 'T'::text AS link_dir
FROM here_gis.traffic_streets_18_3
WHERE dir_travel = ANY (ARRAY['T', 'B'])
)
SELECT COUNT(nu.link_dir) new_links, COUNT(ole.link_dir) ole_links,
SUM(CASE WHEN ole.link_dir = nu.link_dir THEN 1 ELSE 0 END) AS matches,
SUM(CASE WHEN ole.link_dir IS NOT NULL AND nu.link_dir IS NULL THEN 1 ELSE 0 END) AS old_not_new,
SUM(CASE WHEN ole.link_dir IS NULL AND nu.link_dir IS NOT NULL THEN 1 ELSE 0 END) AS new_not_old
FROM traffic_link_dirs_18_3 ole
FULL OUTER JOIN traffic_link_dirs_19_4 nu on nu.link_dir = ole.link_dir
NodesI forgot that nodes are not unique on SELECT COUNT(nu.node_id) new_nodes, COUNT(ole.node_id) ole_nodes,
SUM(CASE WHEN ole.node_id = nu.node_id THEN 1 ELSE 0 END) AS matches,
SUM(CASE WHEN ole.node_id IS NOT NULL AND nu.node_id IS NULL THEN 1 ELSE 0 END) AS old_not_new,
SUM(CASE WHEN ole.node_id IS NULL AND nu.node_id IS NOT NULL THEN 1 ELSE 0 END) AS new_not_old
FROM here_gis.zlevels_18_3 ole
FULL OUTER JOIN here_gis.zlevels_19_4 nu on nu.node_id = ole.node_id AND nu.link_id = ole.link_id
WHERE (ole.node_id IS NOT NULL AND ole.intrsect = 'Y') OR
(nu.node_id IS NOT NULL AND nu.intrsect = 'Y');
|
2019 data ALTER TABLE here.ta_201901 RENAME TO ta_201901_old; ALTER TABLE here.ta_201901_old NO INHERIT here.ta;
ALTER TABLE here.ta_201902 RENAME TO ta_201902_old; ALTER TABLE here.ta_201902_old NO INHERIT here.ta;
ALTER TABLE here.ta_201903 RENAME TO ta_201903_old; ALTER TABLE here.ta_201903_old NO INHERIT here.ta;
ALTER TABLE here.ta_201904 RENAME TO ta_201904_old; ALTER TABLE here.ta_201904_old NO INHERIT here.ta;
ALTER TABLE here.ta_201905 RENAME TO ta_201905_old; ALTER TABLE here.ta_201905_old NO INHERIT here.ta;
ALTER TABLE here.ta_201906 RENAME TO ta_201906_old; ALTER TABLE here.ta_201906_old NO INHERIT here.ta;
ALTER TABLE here.ta_201907 RENAME TO ta_201907_old; ALTER TABLE here.ta_201907_old NO INHERIT here.ta;
ALTER TABLE here.ta_201908 RENAME TO ta_201908_old; ALTER TABLE here.ta_201908_old NO INHERIT here.ta;
ALTER TABLE here.ta_201909 RENAME TO ta_201909_old; ALTER TABLE here.ta_201909_old NO INHERIT here.ta;
ALTER TABLE here.ta_201910 RENAME TO ta_201910_old; ALTER TABLE here.ta_201910_old NO INHERIT here.ta;
ALTER TABLE here.ta_201911 RENAME TO ta_201911_old; ALTER TABLE here.ta_201911_old NO INHERIT here.ta;
ALTER TABLE here.ta_201912 RENAME TO ta_201912_old; ALTER TABLE here.ta_201912_old NO INHERIT here.ta;
DO $do$
DECLARE
startdate DATE;
yyyymm TEXT;
basetablename TEXT := 'ta_';
tablename TEXT;
yyyy INT := 2019;
BEGIN
FOR mm IN 01..12 LOOP
startdate:= to_date(yyyy||'-'||mm||'-01', 'YYYY-MM-DD');
IF mm < 10 THEN
yyyymm:= yyyy||'0'||mm;
ELSE
yyyymm:= yyyy||''||mm;
END IF;
tablename:= basetablename||yyyymm;
EXECUTE format($$CREATE TABLE here.%I
(CHECK (tx >= DATE '$$||startdate ||$$'AND tx < DATE '$$||startdate ||$$'+ INTERVAL '1 month'),
UNIQUE(link_dir, tx)
) INHERITS (here.ta);
ALTER TABLE here.%I OWNER TO here_admins;
$$
, tablename, tablename);
PERFORM here.create_link_dir_idx(tablename);
PERFORM here.create_tx_idx(tablename);
END LOOP;
END;
$do$ LANGUAGE plpgsql |
The routing streets issue was a simple bug 😭, edited the number in the table above ☝️ #285 (comment) |
Hmm, if I create a routing table with |
On March 24th the map is updated to 2019Q4 and data will be refreshed. Need to clarify how far back data will be refreshed. Assuming Jan 2019 for now.
The text was updated successfully, but these errors were encountered: