Issues in handling duplicate report_ids #192
Comments
What I actually did is the following:
```sql
insert into repeated_report
select
    dupgrp_no,
    u,
    '2018-05-' || d || '/20180505T000008Z-NL-AS9143-web_connectivity-20180505T000008Z_AS9143_YXblHbyqIlBUxqzkwQ344hJM4O19Nx9q2E90RUv4W6yFTi4QyS-0.2.0-probe.json',
    '\xf74412b29a700451b8a0e6dedf6a4c49d781750a'::sha1 as orig_sha1,
    '\x873269d0cf236fdfa30da85017c3bf0ee45b5b5e8796076010fe65bee749fa01f8f9e1fc35137b45875af95867a1613784371f00c5dcdddf3263fe38e7b9fe0b'::sha512 as orig_sha512
from (values ('06', false), ('07', true), ('08', false), ('09', false), ('10', false)) as t1 (d, u),
     (values (nextval('dupgrp_no_seq'))) as t2 (dupgrp_no);
```

I considered that extending
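To unpack the cross-join trick in the SQL above, here is a rough Python sketch of the bookkeeping it performs (the helper function and dict layout are hypothetical illustrations, not pipeline code): five copies of the same report, one per day from 06 to 10, all share a single `dupgrp_no` drawn from the sequence, and exactly one copy (day 07, with `u = true`) is flagged as the one to keep.

```python
from itertools import count

dupgrp_no_seq = count(1)  # stand-in for the Postgres sequence dupgrp_no_seq

def make_duplicate_group(days_and_used, filename_tpl):
    # One sequence value is drawn per group, not per row: the cross join in the
    # SQL pairs every (d, u) row with the same nextval('dupgrp_no_seq') result.
    dupgrp_no = next(dupgrp_no_seq)
    return [
        {"dupgrp_no": dupgrp_no, "used": used,
         "filename": filename_tpl.format(day=day)}
        for day, used in days_and_used
    ]

rows = make_duplicate_group(
    [("06", False), ("07", True), ("08", False), ("09", False), ("10", False)],
    "2018-05-{day}/20180505T000008Z-...-probe.json",  # full path elided
)
assert len(rows) == 5
assert len({r["dupgrp_no"] for r in rows}) == 1  # all rows share one group id
assert sum(r["used"] for r in rows) == 1         # exactly one copy is kept
```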
I was looking at old notes and I've found the following one:
Unfortunately, the affected
@FedericoCeratto this may be relevant to the issues related to duplicate
Fixed by moving to measurement uid
When the same report is submitted twice, because OONI Probe attempted to re-submit it, the centrifugation step fails due to a duplicate-key error.
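To illustrate the failure mode, here is a minimal sketch (SQLite in place of Postgres, and a toy schema, not the pipeline's actual one): with `report_id` as a unique key, a plain `INSERT` of a re-submitted report raises a duplicate-key error, while an idempotent insert skips the copy instead of crashing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE report (report_id TEXT PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO report VALUES (?, ?)", ("rid-1", "first submission"))

# A plain INSERT of the same report_id fails with a duplicate-key error,
# which is the kind of failure the centrifugation step hits:
try:
    conn.execute("INSERT INTO report VALUES (?, ?)", ("rid-1", "re-submission"))
    raised = False
except sqlite3.IntegrityError:
    raised = True
assert raised

# An idempotent insert (SQLite's INSERT OR IGNORE; Postgres spells it
# INSERT ... ON CONFLICT DO NOTHING) silently skips the duplicate:
conn.execute("INSERT OR IGNORE INTO report VALUES (?, ?)",
             ("rid-1", "re-submission"))
count = conn.execute("SELECT COUNT(*) FROM report").fetchone()[0]
assert count == 1  # still only the original row
```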
I see that there is already some logic to handle this situation in `load_global_duplicate_reports` (https://github.com/TheTorProject/ooni-pipeline/blob/master/af/shovel/centrifugation.py#L586), which is great 👍, but it does not apply cleanly when you are backfilling data.

My understanding is that this table should somehow be populated by running `shovel/canned_repeated.py`, however I don't see it ever being called (nor is it part of the DAG).

@darkk what is the process for handling duplicate reports?
Here are the relevant log lines of a situation of this sort: