Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Routing harvesting duplicates channelepochs (and related epochs) in the routing DB #33

Open
kaestli opened this issue Dec 19, 2022 · 4 comments

Comments

@kaestli
Copy link
Collaborator

kaestli commented Dec 19, 2022

pre-existing channel-epochs are not recognized - each harvesting of the routing declaration introduces all channel epochs again as new ones, rather than updating the last-seen dates of the pre-existing ones.
(This behaviour is not observed with network epochs and station epochs; however those still have the issue that the lastseen date is not updated; cfr #31

@kaestli
Copy link
Collaborator Author

kaestli commented Nov 23, 2023

affected are the entities channelepoch, epoch (if referred to by a channel epoch), and routing (if referring to such an epoch, however, this is always the case...)

evidence:
Select network.code as net, station.code as sta, channelepoch.id,
channelepoch.locationcode as loc, channelepoch.code as chan,
epoch.starttime as start,
epoch.endtime as end,
epoch.lastseen as ep_lastseen,
routing.lastseen as rout_lastseen,
service.name
from network inner join channelepoch on network.id = channelepoch.network_ref
inner join station on station.id = channelepoch.station_ref
inner join epoch on channelepoch.epoch_ref = epoch.id
left join routing on routing.epoch_ref = epoch.id
left join endpoint on routing.endpoint_ref = endpoint.id
left join service on endpoint.service_ref = service.id
where station.code = 'HAMIK'
and channelepoch.code = 'HGZ'
order by epoch.starttime, epoch.lastseen, channelepoch.id;
-- 5000
-- note: only channelepochs have routing,
-- but all - channelepoch, epoch and routing are multiplied

Select station.code as sta,
epoch.starttime as start,
epoch.endtime as end,
epoch.lastseen as ep_lastseen,
routing.lastseen as rout_lastseen,
service.name
from station inner join stationepoch on station.id = stationepoch.station_ref
inner join epoch on stationepoch.epoch_ref = epoch.id
left join routing on routing.epoch_ref = epoch.id
left join endpoint on routing.endpoint_ref = endpoint.id
left join service on endpoint.service_ref = service.id
where station.code = 'HAMIK'
order by epoch.starttime, epoch.lastseen, stationepoch.id;
-- 1 epoch, no routing

Select network.code as sta,
epoch.starttime as start,
epoch.endtime as end,
epoch.lastseen as ep_lastseen,
routing.lastseen as rout_lastseen,
service.name
from network inner join networkepoch on network.id = networkepoch.network_ref
inner join epoch on networkepoch.epoch_ref = epoch.id
left join routing on routing.epoch_ref = epoch.id
left join endpoint on routing.endpoint_ref = endpoint.id
left join service on endpoint.service_ref = service.id
where network.code = 'CH'
order by epoch.starttime, epoch.lastseen, networkepoch.id
-- 1 epoch, no routing.

@kaestli
Copy link
Collaborator Author

kaestli commented Nov 23, 2023

(note that this issue would widely be mediated by the cleanup mechanism for routes which disappeared from EIDA (old lastseen dates), however that process is not automated.

@kaestli
Copy link
Collaborator Author

kaestli commented Nov 24, 2023

While channelepochs and routes are duplicated rather than the lastseen date updated,
for network epochs and station epochs the lastseen date is not updated (which is probably fine as long as it is not used).

The issue does not prevent the federator from "functioning", however it bloats the database and makes queries slow in the long-run.

@kaestli
Copy link
Collaborator Author

kaestli commented Feb 1, 2024

Until the code is fixed, the database can be kept under control as follows:

-- do this only once:

CREATE INDEX IF NOT EXISTS epoch_epochtype_ref_idx
    ON public.epoch USING btree
    (epochtype_ref ASC NULLS LAST)
    TABLESPACE pg_default;

CREATE UNIQUE INDEX IF NOT EXISTS epochtype_id_idx
    ON public.epochtype USING btree
    (id ASC NULLS LAST)
    TABLESPACE pg_default;

-- do the following regularly:

delete from channelepoch where epoch_ref in
(select id from epoch where lastseen < (select ((now() - '3 days'::interval)::date)::timestamp));					

delete from routing where epoch_ref in
(select id from epoch where lastseen < (select ((now() - '3 days'::interval)::date)::timestamp));

delete from epoch where lastseen < (select ((now() - '3 days'::interval)::date)::timestamp)
and id not in (select epoch_ref from stationepoch)
and id not in (select epoch_ref from networkepoch);

delete from epochtype where id not in (select epochtype_ref from epoch);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant