enhancement: read grok aliases from file #194

Merged 22 commits into vectordotdev:main on Dec 23, 2023

itkovian
Contributor

This PR aims to address (in part) #89 by allowing GROK aliases to be defined in one or more files. Open points I am not sure about:

  • Files are assumed to contain JSON, with a single outer Object.
  • Duplicate patterns are not handled.
  • Proper tests are still needed.
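
To illustrate what "duplicate patterns are not handled" could mean in practice, here is a minimal sketch. It assumes each alias file has already been parsed into a name-to-pattern map and that the maps are merged in order. The helper `merge_aliases` is hypothetical and not taken from this PR; the alias names and patterns only mirror the test fixture quoted later in the review.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not from the PR): merge alias maps loaded from
/// several files into one map, and report alias names seen more than once.
fn merge_aliases(
    sources: &[BTreeMap<String, String>],
) -> (BTreeMap<String, String>, Vec<String>) {
    let mut merged = BTreeMap::new();
    let mut duplicates = Vec::new();
    for source in sources {
        for (name, pattern) in source {
            // `insert` returns the previous value when the key already existed,
            // so a later file silently wins unless the collision is recorded.
            if merged.insert(name.clone(), pattern.clone()).is_some() {
                duplicates.push(name.clone());
            }
        }
    }
    (merged, duplicates)
}

fn main() {
    let first = BTreeMap::from([
        ("_status".to_string(), "%{POSINT:status}".to_string()),
        ("_message".to_string(), "%{GREEDYDATA:message}".to_string()),
    ]);
    // Assumed second file that redefines `_status`.
    let second = BTreeMap::from([("_status".to_string(), "%{INT:status}".to_string())]);

    let (merged, duplicates) = merge_aliases(&[first, second]);
    println!("merged: {merged:?}");
    println!("duplicates: {duplicates:?}");
}
```

Whether a duplicate should be an error, a warning, or a silent override is a design decision; the sketch only makes the collision visible.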

(recreated from vectordotdev/vector#14914)

Member

@fuchsnj fuchsnj left a comment

I added a few comments, but overall this seems reasonable. There was a large refactoring recently that moved most of the files from lib/* to src/*. The merge conflicts you are seeing are probably from that.

@@ -519,18 +520,22 @@ pub enum Error {

#[error(r#"mutation of read-only value"#)]
ReadOnlyMutation { context: String },

#[error(r#"invalid alias source"#)]
InvalidAliasSource { path: PathBuf },
Member

Function-specific errors should not go here. I would suggest just reusing the existing InvalidArgument error above.

expr,
})?
.try_bytes_utf8_lossy()
.expect("filename not bytes")
Member

This should not panic, since nothing ensures that the inputs are strings here. The argument type is ARRAY, but the elements can be of any type. This should check that all elements are strings and return an InvalidArgument error if any of them is not.
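
A minimal sketch of the check described above: validate every array element and return an error instead of calling `expect`. The `Value` enum and `InvalidArgument` struct here are simplified stand-ins invented for illustration; they are not VRL's real types, and `paths_from_array` is a hypothetical name.

```rust
/// Simplified stand-in for a dynamically typed value (NOT VRL's real `Value`).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bytes(Vec<u8>),
    Integer(i64),
}

/// Hypothetical error standing in for an "invalid argument" error.
#[derive(Debug, PartialEq)]
struct InvalidArgument {
    index: usize,
    reason: &'static str,
}

/// Convert every element to a string, failing on the first non-string
/// element instead of panicking.
fn paths_from_array(items: &[Value]) -> Result<Vec<String>, InvalidArgument> {
    items
        .iter()
        .enumerate()
        .map(|(index, item)| match item {
            Value::Bytes(bytes) => Ok(String::from_utf8_lossy(bytes).into_owned()),
            _ => Err(InvalidArgument {
                index,
                reason: "alias source must be a string",
            }),
        })
        // Collecting into `Result` stops at the first `Err`.
        .collect()
}

fn main() {
    let good = vec![Value::Bytes(b"aliases.json".to_vec())];
    println!("{:?}", paths_from_array(&good));

    let bad = vec![Value::Bytes(b"aliases.json".to_vec()), Value::Integer(42)];
    match paths_from_array(&bad) {
        Ok(paths) => println!("paths: {paths:?}"),
        Err(error) => println!("element {} rejected: {}", error.index, error.reason),
    }
}
```

Collecting an iterator of `Result` values keeps the happy path short while still surfacing the offending index to the caller.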

Member

I realize you probably just copied this from aliases above, but that is also a bug (I just opened an issue for it).

Contributor Author

OK. Will change.

"_status": "%{POSINT:status}",
"_message": "%{GREEDYDATA:message}"
}),
alias_sources: Value::Array(vec![]),
Member

I would like to see a test with at least one real alias file loaded. If it's easier, you can use the more general VRL testing framework at lib/tests/tests/functions/... and include a file containing an alias.

@itkovian
Contributor Author

itkovian commented Jun 5, 2023

@fuchsnj I tackled the other alias thing as well, can you check if this is what you expect to happen?

@fuchsnj
Member

fuchsnj commented Jun 27, 2023

> @fuchsnj I tackled the other alias thing as well, can you check if this is what you expect to happen?

Yes, this looks great, thanks!

@fuchsnj
Member

fuchsnj commented Jun 27, 2023

Looks good overall. Please add a line to CHANGELOG.md for these changes. It also looks like the wasm32 CI check is failing. Make sure imports or other functionality that was added that isn't supported on wasm goes in the non_wasm module in this file so the conditional compilation works correctly. Let me know if you need any help with this.
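
As a rough sketch of the conditional-compilation pattern mentioned above, file-system code can be gated so it is only compiled on non-wasm targets. Only the module name `non_wasm` comes from the comment; the function names and the fallback behaviour on wasm32 are assumptions made for illustration, not the PR's actual code.

```rust
// Everything that needs the file system lives in a module that is only
// compiled when the target is not wasm32.
#[cfg(not(target_arch = "wasm32"))]
mod non_wasm {
    use std::fs;
    use std::io;
    use std::path::Path;

    /// Hypothetical helper: read the raw contents of one alias file.
    pub fn read_alias_source(path: &Path) -> io::Result<String> {
        fs::read_to_string(path)
    }
}

/// Native targets delegate to the gated module.
#[cfg(not(target_arch = "wasm32"))]
fn load_alias_source(path: &str) -> Result<String, String> {
    non_wasm::read_alias_source(std::path::Path::new(path)).map_err(|e| e.to_string())
}

/// Assumed fallback: on wasm32 the feature is reported as unsupported.
#[cfg(target_arch = "wasm32")]
fn load_alias_source(_path: &str) -> Result<String, String> {
    Err("alias files are not supported on wasm32".to_string())
}

fn main() {
    match load_alias_source("does-not-exist.json") {
        Ok(contents) => println!("loaded {} bytes", contents.len()),
        Err(error) => println!("could not load alias file: {error}"),
    }
}
```

Keeping the imports inside the gated module means the wasm32 build never sees them, which is the kind of failure a wasm32 CI check would catch.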

@pront
Contributor

pront commented Aug 2, 2023

Hello @itkovian, let us know if you have time to update this based on the feedback or if you want to hand it off. And thanks again for submitting this!

@itkovian
Contributor Author

I was on holiday; I will check ASAP.

@itkovian
Contributor Author

itkovian commented Sep 4, 2023

> Looks good overall. Please add a line to CHANGELOG.md for these changes. It also looks like the wasm32 CI check is failing. Make sure imports or other functionality that was added that isn't supported on wasm goes in the non_wasm module in this file so the conditional compilation works correctly. Let me know if you need any help with this.

I think I fixed the wasm thing, let me know if anything else is needed. Thx!

@pront pront requested a review from fuchsnj September 6, 2023 15:13
@pront
Contributor

pront commented Sep 6, 2023

Hi @itkovian, there are some test failures. You can iterate on these locally by running ./scripts/checks.sh from the repo root.

@itkovian
Contributor Author

Tests fail for the main branch as well?

The following warnings were emitted during compilation:

warning: error: unable to create target: 'No available targets are compatible with triple "wasm32-unknown-unknown"'
warning: 1 error generated.
warning: error: unable to create target: 'No available targets are compatible with triple "wasm32-unknown-unknown"'
warning: 1 error generated.

error: failed to run custom build command for `zstd-sys v2.0.8+zstd.1.5.5`

Caused by:
  process didn't exit successfully: `/Users/ageorges/.cargo/build-cache/debug/build/zstd-sys-c0f0ce56419d5513/build-script-build` (exit status: 1)
  --- stdout
  cargo:rerun-if-env-changed=ZSTD_SYS_USE_PKG_CONFIG
  cargo:rustc-cfg=feature="std"
  cargo:rerun-if-changed=wasm-shim/stdlib.h
  cargo:rerun-if-changed=wasm-shim/string.h
  TARGET = Some("wasm32-unknown-unknown")
  OPT_LEVEL = Some("0")
  HOST = Some("aarch64-apple-darwin")
  cargo:rerun-if-env-changed=CC_wasm32-unknown-unknown
  CC_wasm32-unknown-unknown = None
  cargo:rerun-if-env-changed=CC_wasm32_unknown_unknown
  CC_wasm32_unknown_unknown = None
  cargo:rerun-if-env-changed=TARGET_CC
  TARGET_CC = None
  cargo:rerun-if-env-changed=CC
  CC = None
  cargo:rerun-if-env-changed=CFLAGS_wasm32-unknown-unknown
  CFLAGS_wasm32-unknown-unknown = None
  cargo:rerun-if-env-changed=CFLAGS_wasm32_unknown_unknown
  CFLAGS_wasm32_unknown_unknown = None
  cargo:rerun-if-env-changed=TARGET_CFLAGS
  TARGET_CFLAGS = None
  cargo:rerun-if-env-changed=CFLAGS
  CFLAGS = None
  cargo:rerun-if-env-changed=CRATE_CC_NO_DEFAULTS
  CRATE_CC_NO_DEFAULTS = None
  DEBUG = Some("true")
  cargo:rerun-if-env-changed=CC_wasm32-unknown-unknown
  CC_wasm32-unknown-unknown = None
  cargo:rerun-if-env-changed=CC_wasm32_unknown_unknown
  CC_wasm32_unknown_unknown = None
  cargo:rerun-if-env-changed=TARGET_CC
  TARGET_CC = None
  cargo:rerun-if-env-changed=CC
  CC = None
  cargo:rerun-if-env-changed=CFLAGS_wasm32-unknown-unknown
  CFLAGS_wasm32-unknown-unknown = None
  cargo:rerun-if-env-changed=CFLAGS_wasm32_unknown_unknown
  CFLAGS_wasm32_unknown_unknown = None
  cargo:rerun-if-env-changed=TARGET_CFLAGS
  TARGET_CFLAGS = None
  cargo:rerun-if-env-changed=CFLAGS
  CFLAGS = None
  cargo:rerun-if-env-changed=CRATE_CC_NO_DEFAULTS
  CRATE_CC_NO_DEFAULTS = None
  cargo:rerun-if-env-changed=CC_wasm32-unknown-unknown
  CC_wasm32-unknown-unknown = None
  cargo:rerun-if-env-changed=CC_wasm32_unknown_unknown
  CC_wasm32_unknown_unknown = None
  cargo:rerun-if-env-changed=TARGET_CC
  TARGET_CC = None
  cargo:rerun-if-env-changed=CC
  CC = None
  cargo:rerun-if-env-changed=CFLAGS_wasm32-unknown-unknown
  CFLAGS_wasm32-unknown-unknown = None
  cargo:rerun-if-env-changed=CFLAGS_wasm32_unknown_unknown
  CFLAGS_wasm32_unknown_unknown = None
  cargo:rerun-if-env-changed=TARGET_CFLAGS
  TARGET_CFLAGS = None
  cargo:rerun-if-env-changed=CFLAGS
  CFLAGS = None
  cargo:rerun-if-env-changed=CRATE_CC_NO_DEFAULTS
  CRATE_CC_NO_DEFAULTS = None
  cargo:rerun-if-env-changed=CC_wasm32-unknown-unknown
  CC_wasm32-unknown-unknown = None
  cargo:rerun-if-env-changed=CC_wasm32_unknown_unknown
  CC_wasm32_unknown_unknown = None
  cargo:rerun-if-env-changed=TARGET_CC
  TARGET_CC = None
  cargo:rerun-if-env-changed=CC
  CC = None
  cargo:rerun-if-env-changed=CFLAGS_wasm32-unknown-unknown
  CFLAGS_wasm32-unknown-unknown = None
  cargo:rerun-if-env-changed=CFLAGS_wasm32_unknown_unknown
  CFLAGS_wasm32_unknown_unknown = None
  cargo:rerun-if-env-changed=TARGET_CFLAGS
  TARGET_CFLAGS = None
  cargo:rerun-if-env-changed=CFLAGS
  CFLAGS = None
  cargo:rerun-if-env-changed=CRATE_CC_NO_DEFAULTS
  CRATE_CC_NO_DEFAULTS = None
  running: "clang" "-O0" "-ffunction-sections" "-fdata-sections" "-fPIC" "-g" "-fno-omit-frame-pointer" "--target=wasm32-unknown-unknown" "-I" "wasm-shim/" "-I" "zstd/lib/" "-I" "zstd/lib/common" "-fvisibility=hidden" "-DXXH_STATIC_ASSERT=0" "-DZSTD_LIB_DEPRECATED=0" "-DXXH_PRIVATE_API=" "-DZSTDLIB_VISIBILITY=" "-DZSTDERRORLIB_VISIBILITY=" "-o" "/Users/ageorges/.cargo/build-cache/wasm32-unknown-unknown/debug/build/zstd-sys-3ee678dedc1a7237/out/zstd/lib/common/debug.o" "-c" "zstd/lib/common/debug.c"
  cargo:warning=error: unable to create target: 'No available targets are compatible with triple "wasm32-unknown-unknown"'
  cargo:warning=1 error generated.
  exit status: 1
  running: "clang" "-O0" "-ffunction-sections" "-fdata-sections" "-fPIC" "-g" "-fno-omit-frame-pointer" "--target=wasm32-unknown-unknown" "-I" "wasm-shim/" "-I" "zstd/lib/" "-I" "zstd/lib/common" "-fvisibility=hidden" "-DXXH_STATIC_ASSERT=0" "-DZSTD_LIB_DEPRECATED=0" "-DXXH_PRIVATE_API=" "-DZSTDLIB_VISIBILITY=" "-DZSTDERRORLIB_VISIBILITY=" "-o" "/Users/ageorges/.cargo/build-cache/wasm32-unknown-unknown/debug/build/zstd-sys-3ee678dedc1a7237/out/zstd/lib/common/entropy_common.o" "-c" "zstd/lib/common/entropy_common.c"
  cargo:warning=error: unable to create target: 'No available targets are compatible with triple "wasm32-unknown-unknown"'
  cargo:warning=1 error generated.
  exit status: 1

  --- stderr


  error occurred: Command "clang" "-O0" "-ffunction-sections" "-fdata-sections" "-fPIC" "-g" "-fno-omit-frame-pointer" "--target=wasm32-unknown-unknown" "-I" "wasm-shim/" "-I" "zstd/lib/" "-I" "zstd/lib/common" "-fvisibility=hidden" "-DXXH_STATIC_ASSERT=0" "-DZSTD_LIB_DEPRECATED=0" "-DXXH_PRIVATE_API=" "-DZSTDLIB_VISIBILITY=" "-DZSTDERRORLIB_VISIBILITY=" "-o" "/Users/ageorges/.cargo/build-cache/wasm32-unknown-unknown/debug/build/zstd-sys-3ee678dedc1a7237/out/zstd/lib/common/debug.o" "-c" "zstd/lib/common/debug.c" with args "clang" did not execute successfully (status code exit status: 1).


warning: build failed, waiting for other jobs to finish...

vrl on 🌱 main [$] is 📦 v0.6.0 via 🦀 v1.72.0 took 19m55s 
✗ rustup target add wasm32-unknown-unknown
info: component 'rust-std' for target 'wasm32-unknown-unknown' is up to date

@pront
Contributor

pront commented Sep 19, 2023

> Tests fail for the main branch as well?

Hi @itkovian, checks are passing on the current HEAD (ad17a96).

The failure you posted above seems like an environment setup issue. Two commands to try:

rustup target add wasm32-unknown-unknown
cargo clean

If this is blocking you, you can just pick and choose what to run:

./scripts/checks.sh clippy tests vrl_tests

@pront
Contributor

pront commented Oct 10, 2023

Hi @itkovian, just following up on this PR. Are you still interested in driving this to completion?

Cargo.toml (outdated, resolved)
@itkovian
Contributor Author

@pront Yes, please :) Wasm build/tests should be fixed.

CHANGELOG.md Outdated
@@ -47,6 +47,7 @@
- fixed type definitions for side-effects inside of queries (https://github.com/vectordotdev/vrl/pull/258)
- replaced `Program::final_type_state` with `Program::final_type_info` to give access to the type definitions of both the target and program result (https://github.com/vectordotdev/vrl/pull/262)
- added `from_unix_timestamp` vrl function (https://github.com/vectordotdev/vrl/pull/277)
- added the `alias_sources` parameter for `parse_groks` to read sources from files
Contributor

Please move this line under "unreleased".

Contributor

@pront pront left a comment

Thank you @itkovian. Two things remain before we can merge this PR:

  • We are missing a real test.
  • I left a question about security.

Feel free to ignore the nits.

src/stdlib/parse_groks.rs (outdated, resolved)
src/stdlib/parse_groks.rs (outdated, resolved)
src/stdlib/parse_groks.rs (outdated, resolved)
src/stdlib/parse_groks.rs (resolved)
src/stdlib/parse_groks.rs (resolved)
Contributor

@pront pront left a comment

Thank you for this contribution @itkovian!

I left one comment. Also, some CI checks are failing (running cargo fmt should fix them).

lib/tests/tests/functions/test.json (outdated, resolved)
@pront pront enabled auto-merge December 23, 2023 08:57
@pront pront added this pull request to the merge queue Dec 23, 2023
Merged via the queue into vectordotdev:main with commit a1279c8 Dec 23, 2023
9 checks passed
@jszwedko
Member

I just realized we never documented this addition in the Vector documentation. If you feel inspired, would you mind doing that, @itkovian? See vectordotdev/vector#19081 for an example.

@itkovian
Contributor Author

@jszwedko Sure.
