filter: add attribute matching capabilities
- filter: add --attr-(not-)matches, unittests, updated docs with examples.
- filter: add --skip-broken because my hard drive contains tons of bad .conf
  files made by humans.
- filter: improve error handling around invalid conf files.
- Update changelog
lowell80 committed Sep 29, 2023
1 parent d1fe106 commit 6c45127
Showing 5 changed files with 168 additions and 21 deletions.
11 changes: 11 additions & 0 deletions docs/source/changelog.rst
@@ -27,6 +27,17 @@ Ksconf 0.12
   In many cases, this really isn't a new dependency, since pluggy requires it as well.


+Ksconf v0.12.1 (DRAFT)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+* Add new attribute-level matching logic to ``ksconf filter``.
+  Use ``--attr-matches`` and/or ``--attr-not-matches`` to match specific attribute and value combinations for stanza matching.
+  This can be used to find props with a specific ``KV_MODE``, find saved searches containing a specific search command, or list indexes not using the ``volume:`` designation.
+  See the `ksconf_cmd_filter` docs for example usage.
+* Fixed documentation generation bug that prevented command line options from showing up in the docs.
+
+
+
 Ksconf v0.12.0 (2023-09-27)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

55 changes: 55 additions & 0 deletions docs/source/cmd_filter.rst
@@ -37,9 +37,64 @@ installed yet, then you'll understand the value of this.

 In many other cases, the usage of both ``ksconf filter`` and ``btool`` differ significantly.

+
+.. note:: What if I want to filter default & local at the same time?
+
+    In situations where it would be beneficial to filter based on the combined view of default and local, simply use `ksconf_cmd_merge` first.
+    Here are two options.
+
+
+    *Option 1:* Use a named temporary file
+
+    .. code-block:: sh
+
+        ksconf merge search/{default,local}/savedsearches.conf > savedsearches.conf
+        ksconf filter savedsearches.conf --stanza "* last 3 hours"
+
+    *Option 2:* Chain both commands together
+
+
+    .. code-block:: sh
+
+        ksconf merge search/{default,local}/savedsearches.conf | ksconf filter - --stanza "* last 3 hours"
+
 Examples
 --------

+
+Searching for attribute/value combinations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Find all enabled input stanzas with a sourcetype prefixed with ``apache:``.
+
+.. code-block:: sh
+
+    ksconf filter etc/apps/*/{default,local}/inputs.conf \
+        --enabled-only --attr-eq sourcetype 'apache:*'
+
+List the names of saved searches using potentially expensive search commands:
+
+.. code-block:: sh
+
+    ksconf filter etc/apps/*/{default,local}/savedsearches.conf \
+        -b --match regex \
+        --attr-eq search '.*\|\s*(streamstats|transaction) .*'
+
+Show sourcetype stanzas where ``EVENT_BREAKER`` is defined but not enabled:
+
+.. code-block:: sh
+
+    ksconf filter etc/deployment-apps/*/{default,local}/props.conf \
+        --skip-broken --match regex \
+        --attr-matches EVENT_BREAKER '.+' \
+        --attr-not-matches EVENT_BREAKER_ENABLE '(true|1)'
+
+Note that both conditions must match for a stanza to be selected; multiple attribute rules are combined with a logical AND, not an OR. Also note the use of ``--skip-broken``, because Splunkbase apps sometimes ship invalid conf files.
+
+
 Lift and shift
 ~~~~~~~~~~~~~~

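Review note: the matching rules documented above (every attribute rule must hold, and a missing attribute is compared as an empty string) can be sketched in plain Python. This is only an illustration that assumes wildcard mode and uses the standard library's `fnmatch`; the helper `stanza_matches` and the sample stanzas are hypothetical and are not part of ksconf.

```python
from fnmatch import fnmatchcase


def stanza_matches(attributes, must_match=(), must_not_match=()):
    """Return True only if every (attr, pattern) rule holds (logical AND).

    Hypothetical helper for illustration; not ksconf's implementation.
    """
    for attr, pattern in must_match:
        # A missing attribute is compared as an empty string.
        if not fnmatchcase(attributes.get(attr, ""), pattern):
            return False
    for attr, pattern in must_not_match:
        if fnmatchcase(attributes.get(attr, ""), pattern):
            return False
    return True


# Made-up props.conf content, keyed by stanza name.
props = {
    "apache:access": {"EVENT_BREAKER": "([\\r\\n]+)", "EVENT_BREAKER_ENABLE": "true"},
    "apache:error": {"EVENT_BREAKER": "([\\r\\n]+)"},
    "syslog": {"TIME_FORMAT": "%b %d %H:%M:%S"},
}

selected = [
    name for name, attrs in props.items()
    if stanza_matches(attrs,
                      must_match=[("EVENT_BREAKER", "?*")],
                      must_not_match=[("EVENT_BREAKER_ENABLE", "true")])
]
print(selected)
```

Only `apache:error` is selected here: `apache:access` fails the negative rule, and `syslog` fails the positive rule because its missing `EVENT_BREAKER` is compared as an empty string.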
22 changes: 19 additions & 3 deletions docs/source/dyn/cli.rst
@@ -325,12 +325,13 @@ ksconf filter

 .. code-block:: none

-    usage: ksconf filter [-h] [-o FILE] [--comments] [--verbose]
+    usage: ksconf filter [-h] [-o FILE] [--comments] [--verbose] [--skip-broken]
                          [--match {regex,wildcard,string}] [--ignore-case]
                          [--invert-match] [--files-with-matches]
                          [--count | --brief] [--stanza PATTERN]
-                         [--attr-present ATTR] [-e | -d] [--keep-attrs WC-ATTR]
-                         [--reject-attrs WC-ATTR]
+                         [--attr-present ATTR] [--attr-matches ATTR PATTERN]
+                         [--attr-not-matches ATTR PATTERN] [-e | -d]
+                         [--keep-attrs WC-ATTR] [--reject-attrs WC-ATTR]
                          CONF [CONF ...]

     Filter the contents of a conf file in various ways. Stanzas can be included or
@@ -348,6 +349,9 @@ ksconf filter
                             to standard out.
       --comments, -C        Preserve comments. Comments are discarded by default.
       --verbose             Enable additional output.
+      --skip-broken         Skip broken input files. Without this things like
+                            duplicate stanzas and invalid entries will cause
+                            processing to stop.
       --match {regex,wildcard,string}, -m {regex,wildcard,string}
                             Specify pattern matching mode. Defaults to 'wildcard'
                             allowing for '*' and '?' matching. Use 'regex' for
@@ -381,6 +385,18 @@ ksconf filter
       --attr-present ATTR   Match any stanza that includes the ATTR attribute.
                             ATTR supports bulk attribute patterns via the
                             'file://' prefix.
+      --attr-matches ATTR PATTERN, --attr-eq ATTR PATTERN
+                            Match any stanza containing ATTR == PATTERN. PATTERN
+                            supports the special 'file://filename' syntax.
+                            Matching can be a direct string comparison (equals),
+                            or a regex and wildcard match. Note that all '--attr-
+                            match' and '--attr-not-match' arguments are matched
+                            together. For a stanza to match, all rules must apply.
+                            If attr is missing from a stanza, the value becomes an
+                            empty string for matching purposes.
+      --attr-not-matches ATTR PATTERN, --attr-ne ATTR PATTERN
+                            Match any stanza containing ATTR != PATTERN. See '--
+                            attr-matches' for additional details.
       -e, --enabled-only    Keep only enabled stanzas. Any stanza containing
                             'disabled = 1' will be removed. The value of
                             'disabled' is assumed to be false by default.
76 changes: 60 additions & 16 deletions ksconf/commands/filter.py
@@ -20,10 +20,12 @@
 import argparse
 import sys
 from argparse import ArgumentParser
+from typing import List, Tuple

 from ksconf.commands import ConfFileType, KsconfCmd, dedent
-from ksconf.conf.parser import PARSECONF_MID_NC, conf_attr_boolean, write_conf_stream
-from ksconf.consts import EXIT_CODE_SUCCESS
+from ksconf.conf.parser import (PARSECONF_MID_NC, ConfParserException,
+                                conf_attr_boolean, write_conf_stream)
+from ksconf.consts import EXIT_CODE_BAD_CONF_FILE, EXIT_CODE_SUCCESS
 from ksconf.filter import FilteredList, FilteredListWildcard, create_filtered_list
 from ksconf.util.completers import conf_files_completer

@@ -64,6 +66,10 @@ def register_args(self, parser: ArgumentParser):
         parser.add_argument("--verbose", action="store_true", default=False,
                             help="Enable additional output.")

+        parser.add_argument("--skip-broken", action="store_true", default=False,
+                            help="Skip broken input files. Without this things like duplicate "
+                            "stanzas and invalid entries will cause processing to stop.")
+
         parser.add_argument("--match", "-m", # metavar="MODE",
                             choices=["regex", "wildcard", "string"],
                             default="wildcard",
@@ -112,20 +118,26 @@ def register_args(self, parser: ArgumentParser):
                             Match any stanza that includes the ATTR attribute.
                             ATTR supports bulk attribute patterns via the ``file://`` prefix."""))

-        '''# Add next
-        pg_sel.add_argument("--attr-eq", metavar=("ATTR", "PATTERN"), nargs=2, action="append",
+        pg_sel.add_argument("--attr-matches",
+                            "--attr-eq",
+                            metavar=("ATTR", "PATTERN"), nargs=2, action="append",
                             default=[],
-                            help="""
-                            Match any stanza that includes an attribute matching the pattern.
-                            PATTERN supports the special ``file://filename`` syntax.""")
-        '''
-        ''' # This will be more difficult
-        pg_sel.add_argument("--attr-ne", metavar=("ATTR", "PATTERN"), nargs=2, action="append",
+                            help=dedent("""
+                            Match any stanza containing ATTR == PATTERN.
+                            PATTERN supports the special ``file://filename`` syntax. Matching can be a direct
+                            string comparison (equals), or a regex and wildcard match.
+                            Note that all ``--attr-match`` and ``--attr-not-match`` arguments are matched together.
+                            For a stanza to match, all rules must apply.
+                            If attr is missing from a stanza, the value becomes an empty string for matching purposes."""))
+
+        pg_sel.add_argument("--attr-not-matches",
+                            "--attr-ne",
+                            metavar=("ATTR", "PATTERN"), nargs=2, action="append",
                             default=[],
-                            help="""
-                            Match any stanza that includes an attribute matching the pattern.
-                            PATTERN supports the special ``file://`` syntax.""")
-        '''
+                            help=dedent("""
+                            Match any stanza containing ATTR != PATTERN.
+                            See ``--attr-matches`` for additional details."""))

         pg_eod = pg_sel.add_mutually_exclusive_group()
         pg_eod.add_argument("-e", "--enabled-only", action="store_true",
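Review note: the way `nargs=2` combined with `action="append"` and an alias option string collects rules can be checked with the standard `argparse` module alone. The following is a minimal standalone sketch; the parser name `demo` and the sample values are made up, and it registers the options on a bare parser rather than on ksconf's argument group.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
# Both spellings share one destination, derived from the first long option.
parser.add_argument("--attr-matches", "--attr-eq",
                    metavar=("ATTR", "PATTERN"), nargs=2, action="append",
                    default=[])
parser.add_argument("--attr-not-matches", "--attr-ne",
                    metavar=("ATTR", "PATTERN"), nargs=2, action="append",
                    default=[])

args = parser.parse_args([
    "--attr-eq", "sourcetype", "apache:*",
    "--attr-matches", "index", "web",
    "--attr-ne", "disabled", "1",
])

# Each use of the option appends one [ATTR, PATTERN] pair.
print(args.attr_matches)
print(args.attr_not_matches)
```

Because each occurrence yields a two-item list, a loop of the form `for attr, value in args.attr_matches` unpacks cleanly, and the `default=[]` keeps that loop safe when the option is never given.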
@@ -167,6 +179,20 @@ def prep_filters(self, args):
         self.attr_presence_filters = create_filtered_list(args.match, flags)
         self.attr_presence_filters.feedall(args.attr_present)

+        # Q: Should we check to see if the same attribute is used more than once (likely a typo?)
+        # A: No, let's trust the user; and avoid code bloat for hypothetical mistakes.
+        self.attr_value_filters: List[Tuple[str, FilteredList]] = []
+        if args.attr_matches:
+            for attr, value in args.attr_matches:
+                value_filter = create_filtered_list(args.match, flags)
+                value_filter.feed(value)
+                self.attr_value_filters.append((attr, value_filter))
+        if args.attr_not_matches:
+            for attr, value in args.attr_not_matches:
+                value_filter = create_filtered_list(args.match, flags | FilteredList.INVERT)
+                value_filter.feed(value)
+                self.attr_value_filters.append((attr, value_filter))
+
         if args.enabled_only:
             self.disabled_filter = lambda attrs: not is_disabled(attrs)
         elif args.disabled_only:
@@ -191,6 +217,14 @@ def _test_stanza(self, stanza: str, attributes: dict) -> bool:
         if not self.disabled_filter(attributes):
             return False

+        # If attr matching is in use, then test all attribute/match. All must match.
+        if self.attr_value_filters:
+            for attr_name, attr_filter in self.attr_value_filters:
+                value = attributes.get(attr_name, "")
+                if not attr_filter.match(value):
+                    return False
+            return True
+
         # If there are no attribute level filters, automatically keep (preserves empty stanzas)
         if not self.attr_presence_filters.has_rules:
             return True
@@ -249,8 +283,18 @@ def run(self, args):
         # Still would be helpful for a quick "grep" of a large number of files

         for conf in args.conf:
-            conf.set_parser_option(keep_comments=args.comments)
-            cfg = conf.data
+            try:
+                conf.set_parser_option(keep_comments=args.comments)
+                cfg = conf.data
+            except ConfParserException as e:
+                action = "Aborting"
+                if args.skip_broken:
+                    action = "Skipping"
+                self.stderr.write(f"{action} due to parsing error during {conf.name} due to {e}\n")
+                if action == "Aborting":
+                    return EXIT_CODE_BAD_CONF_FILE
+                continue
+
             cfg_out = dict()
             for stanza_name, attributes in cfg.items():
                 keep = self._test_stanza(stanza_name, attributes) ^ args.invert_match
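Review note: the abort-or-skip control flow added to `run()` can be exercised in isolation. The sketch below is a simplified stand-in, not ksconf code: `ParseError`, the toy `parse` function, the `process` helper, and the exit code values are all hypothetical and exist only to show how one flag switches between stopping at the first bad file and continuing past it.

```python
import io


class ParseError(Exception):
    """Stand-in for a parser exception (hypothetical; not ksconf's class)."""


EXIT_OK = 0        # placeholder exit codes chosen for this sketch only
EXIT_BAD_FILE = 1


def parse(text):
    """Toy parser: reject duplicate stanza headers."""
    seen = set()
    for line in text.splitlines():
        if line.startswith("["):
            if line in seen:
                raise ParseError(f"duplicate stanza {line}")
            seen.add(line)
    return seen


def process(files, skip_broken, stderr):
    """Parse each file; on error either skip it or abort the whole run."""
    parsed = []
    for name, text in files.items():
        try:
            parse(text)
        except ParseError as e:
            action = "Skipping" if skip_broken else "Aborting"
            stderr.write(f"{action} {name} due to parsing error: {e}\n")
            if not skip_broken:
                return EXIT_BAD_FILE, parsed
            continue
        parsed.append(name)
    return EXIT_OK, parsed


files = {
    "good.conf": "[a]\nx = 1\n",
    "bad.conf": "[a]\n[a]\n",
    "also_good.conf": "[b]\n",
}
print(process(files, skip_broken=True, stderr=io.StringIO()))
print(process(files, skip_broken=False, stderr=io.StringIO()))
```

With skipping enabled the run finishes and reports the two readable files; without it, processing stops at `bad.conf` and the files after it are never examined.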
25 changes: 23 additions & 2 deletions tests/test_cli_filter.py
@@ -16,6 +16,10 @@
 from tests.cli_helper import FakeStdin, TestWorkDir, ksconf_cli


+def get_brief(stdout: str, ret_type=set):
+    return ret_type(f for f in stdout.split("\n") if f)
+
+
 class CliKsconfFilter(unittest.TestCase):

     _sample01 = """\
@@ -292,8 +296,8 @@ def test_output_count(self):
     def test_output_brief(self):
         with ksconf_cli:
             ko = ksconf_cli("filter", self.sample01, "--stanza", "Errors in *", "-b")
-            self.assertRegex(ko.stdout, r"Errors in the last hour[\r\n]")
-            self.assertRegex(ko.stdout, r"Errors in the last 24 hours[\r\n]")
+            stanzas = get_brief(ko.stdout, set)
+            self.assertSetEqual(stanzas, {"Errors in the last hour", "Errors in the last 24 hours"})

     def test_output_list_combos(self):
         with ksconf_cli:
@@ -485,6 +489,23 @@ def test_filter_attrs_whbllist(self):
             self.assertIn("DATETIME_CONFIG", keys)
             self.assertEqual(len(out["iis"]), 0)

+    def test_attr_eq_ne_savedsearch_match(self):
+        with ksconf_cli:
+            ko = ksconf_cli("filter", self.sample01, "-b",
+                            "--attr-eq", "search", "*sourcetype=access_*")
+            self.assertEqual(ko.returncode, EXIT_CODE_SUCCESS)
+            stanzas = get_brief(ko.stdout, set)
+            self.assertSetEqual(stanzas, {"Errors in the last hour", "Errors in the last 24 hours"})
+
+        with ksconf_cli:
+            ko = ksconf_cli("filter", self.sample01, "-b",
+                            "--attr-ne", "search", "*metrics.log*")
+            self.assertEqual(ko.returncode, EXIT_CODE_SUCCESS)
+            stanzas = get_brief(ko.stdout, set)
+            self.assertSetEqual(stanzas, {"Errors in the last hour",
+                                          "Errors in the last 24 hours",
+                                          "Splunk errors last 24 hours"})
+

 if __name__ == '__main__': # pragma: no cover
     unittest.main()
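Review note: the switch from `assertRegex` to a set comparison in `test_output_brief` makes the test stricter, since a regex search still passes when unexpected stanzas are present in the output. A small standalone check of the `get_brief` helper from this commit, using made-up output strings:

```python
def get_brief(stdout: str, ret_type=set):
    # Same helper as in tests/test_cli_filter.py: non-empty lines of stdout.
    return ret_type(f for f in stdout.split("\n") if f)


# Hypothetical "--brief" outputs listing the same stanzas in different order.
out_a = "Errors in the last hour\nErrors in the last 24 hours\n"
out_b = "Errors in the last 24 hours\nErrors in the last hour\n"
# Output that contains an extra, unexpected stanza.
out_c = out_a + "Splunk errors last 24 hours\n"

print(get_brief(out_a) == get_brief(out_b))   # order does not matter
print(get_brief(out_a) == get_brief(out_c))   # extra stanza is detected
print(get_brief(out_a, list))                 # list keeps the original order
```

Passing `list` as `ret_type` is useful when a test does care about output order.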
