Social media event detector #404

Closed
wants to merge 10 commits into from
FedericoCeratto (Contributor) commented Dec 9, 2022

Initial experimental version.

  • Support ClickHouse
  • Move computation into the database to improve performance
  • Generate blocking_events and blocking_status database tables for research
  • Generate CC+ASN RSS feeds
  • Configure social media URLs for various services
  • Translate CC to country names
  • Fix Explorer URLs in RSS feeds
  • Fine-tune event detection parameters
  • Implement reprocessing of historical data on Prod
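
As a rough illustration of the historical reprocessing item: the query quoted in the review further down filters on a half-open range (`measurement_start_time >= start_date AND < end_date`), so one plausible way to replay old data is to step a fixed-size window over the period and run the query once per window. The helper name `iter_windows` and the one-hour step are assumptions made for this sketch, not names or values taken from the PR.

```python
from datetime import datetime, timedelta


def iter_windows(start, end, step):
    """Yield consecutive half-open (start_date, end_date) windows.

    Hypothetical helper: each pair could be bound to the
    %(start_date)s / %(end_date)s parameters of the query.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cur = start
    while cur < end:
        # Clamp the last window so it never runs past `end`.
        nxt = min(cur + step, end)
        yield cur, nxt
        cur = nxt


if __name__ == "__main__":
    for start_date, end_date in iter_windows(
        datetime(2022, 12, 1), datetime(2022, 12, 1, 3), timedelta(hours=1)
    ):
        print(start_date.isoformat(), "->", end_date.isoformat())
```

Because the windows are half-open and contiguous, a measurement falling exactly on a boundary is counted in one window only.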

sql = """
INSERT INTO blocking_status (test_name, input, probe_cc, probe_asn,
confirmed_perc, pure_anomaly_perc, accessible_perc, cnt, status, old_status, change, stability)
SELECT test_name, input, probe_cc, probe_asn,
Member

You can improve this query by dropping one level of nesting and moving totcnt into a CTE, like this:

WITH new.cnt + blocking_status.cnt AS totcnt
SELECT
 empty(blocking_status.test_name) ? new.test_name : blocking_status.test_name AS test_name,
 empty(blocking_status.input) ? new.input : blocking_status.input AS input,
 empty(blocking_status.probe_cc) ? new.probe_cc : blocking_status.probe_cc AS probe_cc,
 (blocking_status.probe_asn = 0) ? new.probe_asn : blocking_status.probe_asn AS probe_asn,
 new.confirmed_perc    * new.cnt/totcnt * 0.015 + blocking_status.confirmed_perc    * blocking_status.cnt/totcnt * 0.985 AS confirmed_perc,
 new.pure_anomaly_perc * new.cnt/totcnt * 0.015 + blocking_status.pure_anomaly_perc * blocking_status.cnt/totcnt * 0.985 AS pure_anomaly_perc,
 new.accessible_perc   * new.cnt/totcnt * 0.015 + blocking_status.accessible_perc   * blocking_status.cnt/totcnt * 0.985 AS accessible_perc,
 new.cnt * 0.015 + blocking_status.cnt * 0.985 AS cnt,
 multiIf(
   accessible_perc < 80 AND stability > 0.95, 'BLOCKED',
   accessible_perc > 95 AND stability > 0.97, 'OK',
   blocking_status.status) AS status,
 blocking_status.status AS old_status,
 if(status = old_status, change * 0.985, 1) AS change,
 if(new.cnt > 0,
   cos(3.14/2*(new.accessible_perc - blocking_status.accessible_perc)/100) * 0.7 + blocking_status.stability * 0.3,
   blocking_status.stability) AS stability
FROM blocking_status FINAL
FULL OUTER JOIN
(
 SELECT test_name, input, probe_cc, probe_asn,
  countIf(confirmed = 't') * 100 / cnt AS confirmed_perc,
  countIf(anomaly = 't') * 100 / cnt - confirmed_perc AS pure_anomaly_perc,
  countIf(anomaly = 'f') * 100 / cnt AS accessible_perc,
  count() AS cnt,
  0 AS stability
 FROM fastpath
 WHERE test_name IN ['web_connectivity']
 AND msm_failure = 'f'
 AND measurement_start_time >= %(start_date)s
 AND measurement_start_time < %(end_date)s
 AND input IN %(urls)s
 GROUP BY test_name, input, probe_cc, probe_asn
) AS new
ON
 new.input = blocking_status.input
 AND new.probe_cc = blocking_status.probe_cc
 AND new.probe_asn = blocking_status.probe_asn
 AND new.test_name = blocking_status.test_name
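
To make the `stability` and `status` expressions in the query easier to reason about, here is a minimal Python sketch of the same arithmetic. It is a reading aid, not code from the PR: the function names are hypothetical, and it assumes that `accessible_perc` and `stability` inside `multiIf` resolve to the freshly computed aliases rather than the stored columns.

```python
import math


def next_stability(new_accessible_perc, old_accessible_perc, old_stability, new_cnt):
    """Mirror of the `stability` expression in the query."""
    if new_cnt <= 0:
        # No fresh measurements in this window: keep the previous value.
        return old_stability
    delta = new_accessible_perc - old_accessible_perc
    # 3.14 / 2 approximates pi / 2, as in the query: a delta of 0 gives
    # cos(0) = 1, a delta of +/-100 points gives a value close to 0.
    agreement = math.cos(3.14 / 2 * delta / 100)
    return agreement * 0.7 + old_stability * 0.3


def next_status(accessible_perc, stability, old_status):
    """Mirror of the multiIf(...) expression: the first matching branch wins."""
    if accessible_perc < 80 and stability > 0.95:
        return "BLOCKED"
    if accessible_perc > 95 and stability > 0.97:
        return "OK"
    # Neither threshold pair is met: the previous status is kept.
    return old_status


if __name__ == "__main__":
    s = next_stability(50.0, 50.0, 1.0, new_cnt=10)
    print(s, next_status(50.0, s, "OK"))
```

Read this way, a status only flips when the accessibility percentage is clearly on one side and has stayed steady across updates; a single noisy window lowers `stability` and leaves the previous status in place.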

@FedericoCeratto
Contributor Author

Moved to new repository.
