Oracle-ES consistency #240

Merged 28 commits on May 28, 2019
Changes from 1 commit
Commits
9903938
Add consistency query to stage 009.
Evildoor Mar 27, 2019
42d55b3
Add script for consistency check to stage 069.
Evildoor Mar 29, 2019
f27f278
Get index name from config rather than code.
Evildoor Apr 2, 2019
af9f212
Check for index' existence before working.
Evildoor Apr 2, 2019
4c560a6
Update documentation.
Evildoor Apr 2, 2019
2296f6d
Add a very basic consistency check script.
Evildoor Apr 2, 2019
b8a2ab1
Generalize 069-consistency.
Evildoor Apr 4, 2019
acfe45a
Save and display the info about different tasks.
Evildoor Apr 5, 2019
e39efe2
Merge remote-tracking branch 'origin/master' into oracle-es-consistency
Evildoor Apr 5, 2019
8a71791
Move certain shell functions to library.
Evildoor Apr 5, 2019
90380a9
Remove DEBUG mode.
Evildoor Apr 17, 2019
7bac202
Move ES consistency script into a separate stage.
Evildoor Apr 17, 2019
12dd86e
Update a query description.
Evildoor Apr 18, 2019
944b5a2
Update and explain a magic number.
Evildoor Apr 18, 2019
26a1dfe
Reword es_connect() description.
Evildoor Apr 18, 2019
20875b1
Change log prefixes to standard ones.
Evildoor Apr 18, 2019
46cf0af
Fix pop() results handling.
Evildoor Apr 18, 2019
72d85a9
Update ES parameters handling.
Evildoor Apr 18, 2019
cacba11
Remove batching of inconsistent records.
Evildoor Apr 18, 2019
4d8eb83
Merge remote-tracking branch 'origin/master' into oracle-es-consistency
Evildoor Apr 19, 2019
181fb14
Add consistency data samples.
Evildoor Apr 19, 2019
7117242
Update the dataflow README.
Evildoor Apr 19, 2019
165c5d2
Ignore two additional fields.
Evildoor May 21, 2019
8f84d22
Change messages formatting.
Evildoor May 22, 2019
d195650
Add _parent field handling.
Evildoor May 22, 2019
612bf52
Remove service fields before checking.
Evildoor May 22, 2019
01ae258
Remove interpreter directives from lib files.
Evildoor May 22, 2019
8f86ddd
Simplify a field retrieval.
Evildoor May 28, 2019
20 changes: 3 additions & 17 deletions Utils/Dataflow/run/data4es-consistency-check
@@ -3,29 +3,15 @@
 DEBUG=

 base_dir=$( cd "$(dirname "$(readlink -f "$0")")"; pwd)
+lib="$base_dir/../shell_lib"

 # Directories with configuration files
 [ -n "$DATA4ES_CONSISTENCY_CONFIG_PATH" ] && \
   CONFIG_PATH="$DATA4ES_CONSISTENCY_CONFIG_PATH" || \
   CONFIG_PATH="${base_dir}/../config:${base_dir}/../../Elasticsearch/config"

-# Find configuration file in $CONFIG_PATH
-get_config() {
-  [ -z "$1" ] && echo "get_config(): no arguments passed." && return 1
-  dirs=$CONFIG_PATH
-  while [ -n "$dirs" ]; do
-    dir=${dirs%%:*}
-    [ "$dirs" = "$dir" ] && \
-      dirs='' || \
-      dirs="${dirs#*:}"
-    [ -f "${dir}/${1}" ] && readlink -f "${dir}/${1}" && return 0
-  done
-}
-
-# EOP filter (required due to the unconfigurable EOP marker in pyDKB)
-eop_filter() {
-  sed -e"s/\\x00//g"
-}
+source $lib/get_config
+source $lib/eop_filter

 # Oracle
 cfg009=`get_config "consistency009.cfg"`
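The pattern this hunk introduces (resolve the script's own directory, derive the `shell_lib` path from it, then `source` the needed functions) can be tried in isolation. The sketch below is only an illustration: the temporary layout, the `greet` library and the `demo` script are hypothetical names that do not exist in the repository, and `readlink -f` is assumed to be available (GNU coreutils).

```shell
#!/usr/bin/env bash
# Hypothetical throwaway layout mimicking run/ and shell_lib/ side by side.
root=$(mktemp -d)
mkdir -p "$root/run" "$root/shell_lib"

# A library file: it only defines a function and does nothing else when sourced.
cat > "$root/shell_lib/greet" <<'EOF'
greet() {
  echo "hello from shell_lib, called by $1"
}
EOF

# A caller script that resolves its own directory, then sources the library
# the same way data4es-consistency-check does after this change.
cat > "$root/run/demo" <<'EOF'
#!/usr/bin/env bash
base_dir=$( cd "$(dirname "$(readlink -f "$0")")"; pwd)
lib="$base_dir/../shell_lib"
source "$lib/greet"
greet "$(basename "$0")"
EOF
chmod +x "$root/run/demo"

# Run the script from an unrelated directory: the library is still found,
# because the path is derived from the script location, not from $PWD.
result=$(cd / && "$root/run/demo")
echo "$result"

rm -rf "$root"
```

Quoting the sourced path (`source "$lib/greet"`) is a choice made in this sketch so that directories containing spaces still work; the diff itself uses the unquoted form.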
27 changes: 4 additions & 23 deletions Utils/Dataflow/run/data4es-start
@@ -6,14 +6,9 @@ BATCH_SIZE=100
 DEBUG=

 base_dir=$( cd "$(dirname "$(readlink -f "$0")")"; pwd)
+lib="$base_dir/../shell_lib"

-log() {
-  level=INFO
-  [ $# -eq 2 ] && level="$1" && shift
-  [ -z "$SCRIPT_NAME" ] && SCRIPT_NAME="$(basename "$0")"
-  msg="$1"
-  echo "($level) `date +'%d-%m-%Y %T'` ($SCRIPT_NAME) $msg" >&2
-}
+source $lib/log

 # Directories with configuration files
 [ -n "$DATA4ES_CONFIG_PATH" ] && \
@@ -49,18 +44,7 @@ pidfile="$HOME_DIR/pid"
 # Define commands to be used as dataflow nodes
 # ---

-# Find configuration file in $CONFIG_PATH
-get_config() {
-  [ -z "$1" ] && log ERROR "get_config(): no arguments passed." && return 1
-  dirs=$CONFIG_PATH
-  while [ -n "$dirs" ]; do
-    dir=${dirs%%:*}
-    [ "$dirs" = "$dir" ] && \
-      dirs='' || \
-      dirs="${dirs#*:}"
-    [ -f "${dir}/${1}" ] && readlink -f "${dir}/${1}" && return 0
-  done
-}
+source $lib/get_config

 # Oracle Connector
 cmd_09="${base_dir}/../009_oracleConnector/Oracle2JSON.py"
@@ -101,10 +85,7 @@ cmd_95="${base_dir}/../095_datasetInfoAMI/amiDatasets.py -m s --userkey $AUTH_KE

 # Service (glue) functions
 # ---
-# EOP filter (required due to the unconfigurable EOP marker in pyDKB)
-eop_filter() {
-  sed -e"s/\\x00//g"
-}
+source $lib/eop_filter

 # Buffer file name
 get_buffer() {
6 changes: 6 additions & 0 deletions Utils/Dataflow/shell_lib/eop_filter
@@ -0,0 +1,6 @@
+#!/usr/bin/env bash
+
+# EOP filter (required due to the unconfigurable EOP marker in pyDKB)
+eop_filter() {
+  sed -e"s/\\x00//g"
+}
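To see what `eop_filter` does to a stream, the sketch below pipes a sample message through it and compares byte counts. The function body is copied from the diff; the `emit_sample` helper and its payload are hypothetical, and GNU sed is assumed, since the `\x00` escape is not portable to every sed implementation.

```shell
#!/usr/bin/env bash
# Same definition as in shell_lib/eop_filter (GNU sed is assumed for \x00).
eop_filter() {
  sed -e"s/\\x00//g"
}

# Hypothetical stage output: one JSON message, a newline, then a NUL byte
# standing in for the end-of-process marker.
emit_sample() {
  printf '{"taskid": 1}\n\0'
}

# Count bytes before and after filtering; the difference is the removed marker.
raw_bytes=$(emit_sample | wc -c | tr -d ' ')
clean_bytes=$(emit_sample | eop_filter | wc -c | tr -d ' ')
echo "raw=$raw_bytes clean=$clean_bytes"
```

Byte counts are used rather than a string comparison because bash drops NUL bytes inside command substitution, which would hide the effect of the filter.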
18 changes: 18 additions & 0 deletions Utils/Dataflow/shell_lib/get_config
@@ -0,0 +1,18 @@
+#!/usr/bin/env bash
+
+get_config_lib=$(cd "$(dirname "$BASH_SOURCE")"; pwd)
+source $get_config_lib/log
+
+# Find configuration file in $CONFIG_PATH
+get_config() {
+  [ -z "$1" ] && log ERROR "get_config(): no arguments passed." && return 1
+  [ -z "$CONFIG_PATH" ] && CONFIG_PATH=`pwd`
+  dirs=$CONFIG_PATH
+  while [ -n "$dirs" ]; do
+    dir=${dirs%%:*}
+    [ "$dirs" = "$dir" ] && \
+      dirs='' || \
+      dirs="${dirs#*:}"
+    [ -f "${dir}/${1}" ] && readlink -f "${dir}/${1}" && return 0
+  done
+}
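The lookup in `get_config` walks the colon-separated `$CONFIG_PATH` from left to right and prints the canonical path of the first match. A minimal sketch of that behaviour follows; the `get_config` body is taken from the diff, while the shortened `log` stand-in, the temporary directories and the file names `example.cfg` and `absent.cfg` are assumptions made for the illustration.

```shell
#!/usr/bin/env bash
# Shortened stand-in for shell_lib/log so the sketch is self-contained.
log() {
  level=INFO
  [ $# -eq 2 ] && level="$1" && shift
  echo "($level) $1" >&2
}

# Same lookup logic as shell_lib/get_config.
get_config() {
  [ -z "$1" ] && log ERROR "get_config(): no arguments passed." && return 1
  [ -z "$CONFIG_PATH" ] && CONFIG_PATH=`pwd`
  dirs=$CONFIG_PATH
  while [ -n "$dirs" ]; do
    dir=${dirs%%:*}
    [ "$dirs" = "$dir" ] && \
      dirs='' || \
      dirs="${dirs#*:}"
    [ -f "${dir}/${1}" ] && readlink -f "${dir}/${1}" && return 0
  done
}

# Hypothetical layout: two config directories, the file exists only in the second.
tmp=$(mktemp -d)
mkdir -p "$tmp/first" "$tmp/second"
touch "$tmp/second/example.cfg"
CONFIG_PATH="$tmp/first:$tmp/second"

# Found in the second directory; the full resolved path is printed.
found=$(get_config "example.cfg")
echo "found: $found"

# Not present anywhere: nothing is printed and the status is non-zero.
missing_status=0
missing=$(get_config "absent.cfg") || missing_status=$?
echo "missing: '$missing' (status $missing_status)"

rm -rf "$tmp"
```

Callers such as ``cfg009=`get_config "consistency009.cfg"` `` receive an empty string when the file is not found, so checking the result before use seems advisable.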
9 changes: 9 additions & 0 deletions Utils/Dataflow/shell_lib/log
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+
+log() {
+  level=INFO
+  [ $# -eq 2 ] && level="$1" && shift
+  [ -z "$SCRIPT_NAME" ] && SCRIPT_NAME="$(basename "$0")"
+  msg="$1"
+  echo "($level) `date +'%d-%m-%Y %T'` ($SCRIPT_NAME) $msg" >&2
+}
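A short usage sketch for `log` follows. The function body is copied from the diff; the `SCRIPT_NAME` value and the message texts are made up for the example. With one argument the level defaults to `INFO`; with two, the first argument is taken as the level.

```shell
#!/usr/bin/env bash
# Same definition as in shell_lib/log.
log() {
  level=INFO
  [ $# -eq 2 ] && level="$1" && shift
  [ -z "$SCRIPT_NAME" ] && SCRIPT_NAME="$(basename "$0")"
  msg="$1"
  echo "($level) `date +'%d-%m-%Y %T'` ($SCRIPT_NAME) $msg" >&2
}

# Fix the script name so the output does not depend on how this file is run.
SCRIPT_NAME=demo

# One argument: the message is logged at the default INFO level.
log "consistency check started"

# Two arguments: the first one is used as the level.
log WARN "index not found"

# Messages go to stderr, so they must be redirected to be captured.
captured=$(log ERROR "something failed" 2>&1)
echo "captured: $captured"
```

Because everything is written to stderr, the log lines do not mix with data that a dataflow stage passes along on stdout.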