Logging and reporting is a crucial aspect of a data factory system like this.
What kind of logs
Log format
Log storage
Log access
Job stories
When a Run is initiated by an Operator, they want to see that it is running and be notified of application and (meta)data errors as soon as possible, especially “halts”, so that they can debug and re-run
If there are a lot of (data) errors, I want to examine them in a system that lets me view and analyse them easily (i.e. my page shouldn’t crash as it tries to load 100k error messages)
I don’t want to receive 100k error emails …
When a scheduled Run happens, as an Operator (Sysadmin) I want to be notified afterwards (with a report?) if something went wrong, so that I can do something about it …
When I need to report to my colleagues about the Harvesting system, I want an overall report of how it is going (e.g. how many datasets have been harvested) so that I can tell them
Domain Model
Status info: this Run is running, it has finished, it took this long …
If the process takes longer than I expect, we could show a window with live logs (using the Airflow API). We don’t yet have statuses like “running step X”, “running step Y”, “stopped by error”, “finished”; we need to add these to the NG Harvester.
(Raw) Log information …
Logs on run execution (classic INFO, WARN etc logging)
Including handled application errors ERROR
(Meta)data errors (and warnings) => What do these look like?
(Unhandled) Exceptions or errors (caught by parent system)
Reports / Summaries, e.g. 200 records processed, 5 errors, 2 warnings, 8 new datasets, 192 existing records updated
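As a sketch of what such a summary could look like as data, here is a minimal example. The field names are assumptions for illustration, not a settled schema:

```python
import json

def build_run_summary(processed, errors, warnings, new_datasets, updated):
    """Aggregate per-run counts into a machine-readable summary dict."""
    return {
        "records_processed": processed,
        "errors": errors,
        "warnings": warnings,
        "datasets_new": new_datasets,
        "records_updated": updated,
    }

# The example numbers from the Domain Model above:
summary = build_run_summary(200, 5, 2, 8, 192)
print(json.dumps(summary, indent=2))
```

A structured summary like this could feed both the post-run notification email and the overall “how is harvesting going” report.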
4 cases
Run Status Info (Live and Historic)
Who: someone running a Job in real time. When something does not work, I want to see the history of jobs (e.g. when jobs stopped running) so that I can debug.
Provided by: the Orchestrator (i.e. Airflow). TODO: does the orchestrator provide historic info?
Format: whatever that API gives
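For live and historic status, a thin client over the Airflow 2.x stable REST API could be enough. The base URL and DAG id below are placeholders; the endpoint and field names (`dag_runs`, `dag_run_id`, `state`) follow the stable API:

```python
import json
import urllib.request

AIRFLOW_URL = "http://localhost:8080/api/v1"  # placeholder deployment URL

def fetch_dag_runs(dag_id, limit=5):
    """Fetch the most recent runs of a DAG from the Airflow REST API."""
    req = urllib.request.Request(
        f"{AIRFLOW_URL}/dags/{dag_id}/dagRuns?order_by=-start_date&limit={limit}",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def summarise_runs(payload):
    """Reduce an API response to (run_id, state) pairs for a quick status view."""
    return [(r["dag_run_id"], r["state"]) for r in payload.get("dag_runs", [])]
```

This also answers the “historic info” TODO in part: the `dagRuns` endpoint returns past runs with their final states, so a simple history view can be built on top of it.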
App Log
Who: Someone running a Job (if they want real-time feedback)
Someone debugging a failed job (and a specific source)
Someone creating a new pipeline and wanting to debug it
Provided by: logging in the code using the standard log library, with the storage location configured either in code or by the orchestrator
Format: regular logs (text format) plus a custom JSON file as a final log report
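A minimal sketch of this combination, using only the standard library. The file names and report fields are illustrative assumptions, not a decided layout:

```python
import json
import logging

# Regular text logs during the run...
logging.basicConfig(
    filename="run.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
log = logging.getLogger("harvester.pipeline")

def finish_run(report, path="run_report.json"):
    """...plus a custom machine-readable JSON report written at the end."""
    log.info("run finished: %s", report)
    with open(path, "w") as f:
        json.dump(report, f, indent=2)
    return report

finish_run({"records_processed": 200, "errors": 5, "warnings": 2})
```

Keeping the text log for humans and the JSON file for tooling means the orchestrator can collect the former as-is, while report dashboards consume the latter.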
(Meta)Data Quality Warn / Errors
Who: “Owner” of a harvest source who wants to get those corrected
A Harvest Admin who is overseeing the process and wants to know what happened (and maybe how to fix the pipeline)
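Answering the earlier question of what (meta)data errors might look like: one possible shape is a small structured record per issue, so that thousands of them can be stored, filtered, and browsed rather than emailed. The class and field names below are assumptions for discussion:

```python
from dataclasses import dataclass, asdict

@dataclass
class DataQualityIssue:
    severity: str   # "warning" or "error"
    source_id: str  # which harvest source produced the record
    record_id: str  # identifier of the offending record
    field: str      # the field that failed validation
    message: str    # human-readable description

# One issue, ready to write to a database or a JSON-lines file:
issue = DataQualityIssue("error", "src-42", "rec-1001", "license", "missing licence URI")
row = asdict(issue)
```

With issues stored as rows, both the source “owner” and the Harvest Admin can query by source, severity, or field instead of paging through raw logs.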
Google Cloud Composer already provides a lot of logs. We may be able to create a sink in GCP Cloud Logging (Operations) and redirect the collected logs to another service.