DF/data4es: Log diversification #375

Closed
wants to merge 5 commits into from


anastasiakaida
Contributor

@anastasiakaida anastasiakaida commented Jun 25, 2020

It is currently unclear how to identify the producer of a log message, that is, which stage script wrote it.

We propose a solution that leaves the stage scripts as they are.
The supervising script (data4es-run) runs the processing chains, so even if a chain changes (stages are removed, modified or added), the logging approach stays the same.

  • Create a function (a so-called "wrapper") that reads all of a stage's messages from STDERR and adds the script name to each message
  • Import and apply the function in the data4es scripts

The STDERR wrapper catches all messages from STDERR,
adds the name of the log producer (the script name) to each one,
line by line, and redirects the messages to STDERR again.
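
For illustration, a minimal sketch of such a wrapper is shown here. It is not the code from this PR: the `(name)` prefix format, the use of file descriptor 3 and the local variable names are assumptions made for the example. It runs the command given as its arguments, leaves STDOUT untouched, prefixes every STDERR line with the base name of the first argument and returns the exit status of the wrapped command.

```shell
#!/usr/bin/env bash

# Hypothetical sketch, not the PR's implementation.
# Usage: err_wrapper <script> [args...]
err_wrapper() {
  local name status
  # Assumption: the log producer is the base name of the first argument.
  name="$(basename "$1")"
  {
    # Inside the pipeline: send the command's STDERR into the pipe,
    # and restore its STDOUT to the original one saved on fd 3.
    "$@" 2>&1 1>&3 3>&- | while IFS= read -r line || [ -n "$line" ]; do
      # Assumed prefix format: "(script_name) message".
      printf '(%s) %s\n' "$name" "$line"
    done >&2 3>&-
    # Exit status of the wrapped command, not of the prefixing loop.
    status="${PIPESTATUS[0]}"
  } 3>&1
  return "$status"
}
```

With this sketch, `err_wrapper sh -c 'echo data; echo oops >&2'` would write `data` to STDOUT and `(sh) oops` to STDERR. A pipe is used rather than process substitution so that the function only returns after all prefixed lines have been written.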

The function is meant to be applied to processing chains
as follows:

err_wrapper param1 param2 ... param*

where the set of parameters can be passed as a sequence
kept in a single variable, like this:

"${base_dir}/../093_datasetsFormat/datasets_format.py -m s"

It doesn't matter how many arguments are passed: the script name
is always taken from the first argument.
The function is applied to all commands named "$cmd_%stage_number%"
in the following chains:

  • Sink chain
  • Subchain 17
  • Subchain 91
  • Source chain
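
To make the "first argument" rule concrete, here is a small sketch of how the producer name could be derived when a whole command is kept in one variable. The helper `producer_name`, the placeholder value of `base_dir` and the array variant `cmd_93_arr` are hypothetical and exist only for this example. Note that the string form relies on unquoted expansion (word splitting), so quote characters stored inside the variable would be passed on literally rather than interpreted.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the name a wrapper would use as log producer.
producer_name() {
  basename "$1"
}

base_dir="/tmp/run"   # placeholder value, for illustration only
cmd_93="${base_dir}/../093_datasetsFormat/datasets_format.py -m s"

# Unquoted on purpose: word splitting turns the string into three
# arguments, so "$1" is the script path and "-m" "s" are its options.
producer_name $cmd_93              # prints: datasets_format.py

# Quoted, the whole string arrives as a single argument, so the
# options end up in the derived name as well.
producer_name "$cmd_93"            # prints: datasets_format.py -m s

# Possible alternative (an assumption, not part of this PR): keep the
# command in an array, which preserves argument boundaries exactly.
cmd_93_arr=("${base_dir}/../093_datasetsFormat/datasets_format.py" -m s)
producer_name "${cmd_93_arr[@]}"   # prints: datasets_format.py
```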
@anastasiakaida anastasiakaida changed the title [WIP] DF/data4es: Log differentiation DF/data4es: Log differentiation Jun 25, 2020
@anastasiakaida anastasiakaida changed the title DF/data4es: Log differentiation DF/data4es: Log diversification Jun 25, 2020
We need to return not only a variable's value, but also its name.
The wrapper function now adds information about the stage,
not just the bare script name.
@mgolosova
Collaborator

@anastasiakaida,

Couldn't make it work properly (see below). Did I miss something?

(dkb-dev) [dkb@aiatlas171 run]$ ./data4es-start --debug
(INFO) 14-08-2020 11:33:34 (data4es) Starting process.
Stage 91, datasets_processing.py: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: '/home/dkb/dkb-dev.git/Utils/Dataflow/run/../091_datasetsRucio/datasets_processing.py: No such file or directory
Stage 25', stage.py: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: '/home/dkb/dkb-dev.git/Utils/Dataflow/run/../025_chicagoES/stage.py: No such file or directory
Stage 19', run.sh: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: '/home/dkb/dkb-dev.git/Utils/Dataflow/run/../019_esFormat/run.sh: No such file or directory
Stage 17', adjustMetadata.py: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: '/home/dkb/dkb-dev.git/Utils/Dataflow/run/../017_adjustMetadata/adjustMetadata.py: No such file or directory
Stage 16', task2es.py: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: '/home/dkb/dkb-dev.git/Utils/Dataflow/run/../016_task2es/task2es.py: No such file or directory
Stage 19', run.sh: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: '/home/dkb/dkb-dev.git/Utils/Dataflow/run/../019_esFormat/run.sh: No such file or directory
Stage 'cmd', 'tee: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: 'tee: command not found
Stage 91, datasets_processing.py: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: '/home/dkb/dkb-dev.git/Utils/Dataflow/run/../091_datasetsRucio/datasets_processing.py: No such file or directory
Stage 'cmd', 'tee: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: 'tee: command not found
Stage 09', Oracle2JSON.py: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: '/home/dkb/dkb-dev.git/Utils/Dataflow/run/../009_oracleConnector/Oracle2JSON.py: No such file or directory
Stage 93', datasets_format.py: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: '/home/dkb/dkb-dev.git/Utils/Dataflow/run/../093_datasetsFormat/datasets_format.py: No such file or directory
Stage 95', amiDatasets.py: /home/dkb/dkb-dev.git/Utils/Dataflow/run/../shell_lib/err_wrapper: line 7: '/home/dkb/dkb-dev.git/Utils/Dataflow/run/../095_datasetInfoAMI/amiDatasets.py: No such file or directory
(INFO) 14-08-2020 11:33:34 (data4es) Finished process (took: 0 sec).

@mgolosova
Collaborator

The PR is closed in favor of: #404, #406

@mgolosova mgolosova closed this Aug 31, 2020