diff --git a/examples/custom_search/README.md b/examples/custom_search/README.md
deleted file mode 100644
index 07bf5911e..000000000
--- a/examples/custom_search/README.md
+++ /dev/null
@@ -1,137 +0,0 @@
-# Custom Search
-
-custom_search is a custom Splunk app (http://www.splunk.com/base/Documentation/latest/Developer/AppIntro)
-that provides a single custom search command 'usercount'.
-
-The purpose of the app is to provide an example of how to define a custom search
-command, how to configure it, and what the input and output should look like in
-order to work with Splunk. Most of this is also documented in the Splunk
-documentation at http://www.splunk.com/base/Documentation/latest/SearchReference/Aboutcustomsearchcommands
-
-## Example Commands
-
-* Count the number of processes each user has in a unix "top" event:
-  [examples/custom_search/bin/usercount.py](https://github.com/splunk/splunk-sdk-python/blob/master/examples/custom_search/bin/usercount.py)
-* Count the top hashtags in a set of tweets:
-  [examples/twitted/twitted/bin/tophashtags.py](https://github.com/splunk/splunk-sdk-python/blob/master/examples/twitted/twitted/bin/tophashtags.py)
-* Add a hashtags multivalue field to each tweet:
-  [examples/twitted/twitted/bin/hashtags.py](https://github.com/splunk/splunk-sdk-python/blob/master/examples/twitted/twitted/bin/hashtags.py)
-
-## Defining a Custom Search Command
-
-A custom search command is merely a Python script that reads input from stdin
-and writes output to stdout. Input comes in as CSV (with an optional header),
-and is in general meant to be read using Python's stdlib csv module
-(using `csv.reader` or `csv.DictReader`). Output is also expected to be in CSV,
-and is likewise meant to be written with `csv.writer` or `csv.DictWriter`.
-
-## Conceptual
-
-As noted above, a custom search command is just a Python script that reads data
-in and writes data out. However, it is useful to make a distinction
-between two subtypes of custom search commands:
-
-* A streaming custom search command is one that has data streamed into it. You can
-  think of it as applying a "function"/"transformation" to each event and then
-  writing out the result of that operation. It is a kind of "mapper". An
-  example of such a command might be a command that adds a field to each event.
-* A non-streaming custom search command expects to have all the data before
-  it operates on it. As such, it is usually "reducing" the data into the
-  output by applying some sort of summary transformation on it. An example of
-  a non-streaming command is the 'stats' command, which will collect all the
-  data before it can calculate the statistics.
-
-Note that neither of these cases precludes having previews of the data, and you
-can enable or disable preview functionality in the configuration.
-
-## Configuration
-
-Configuration of custom search commands is done in the local/commands.conf file
-of your custom app. You can take a look at a few examples in the SDK:
-
-* [examples/custom_search/local/commands.conf](https://github.com/splunk/splunk-sdk-python/blob/master/examples/custom_search/local/commands.conf)
-* [examples/twitted/twitted/local/commands.conf](https://github.com/splunk/splunk-sdk-python/blob/master/examples/twitted/twitted/local/commands.conf)
-
-The authoritative documentation for commands.conf can be found here:
-http://www.splunk.com/base/Documentation/latest/Admin/commandsconf
-
-## Input
-
-The input format is just CSV, with an optional header. The general format
-definition is:
-
-a. Several lines of header, in the form of "key:value" pairs, separated by
-   newlines. OPTIONAL
-b. A blank line
-c. Data
-
-The presence of the header (and some fields in it) can be controlled in
-commands.conf.
-
-An annotated sample input is included below. Python-style '###' comments are used
-to point out salient features. This input is truncated for brevity - you can see
-the full input at tests/custom_search/usercount.in
-
-```
-### The following line is the first line of the header
-authString:itayitay6e49d9164a4eced1a006f46d5710715c
-sessionKey:6e49d9164a4eced1a006f46d5710715c
-owner:itay
-namespace:custom_search
-keywords:%22%22%22sourcetype%3A%3Atop%22%22%20%22
-search:search%20sourcetype%3D%22top%22%20%7C%20head%202%20%7C%20usercount%20%7C%20head%20100%20%7C%20export
-sid:1310074215.71
-realtime:0
-preview:0
-truncated:0
-### The above line is the last line of the header, followed by the mandatory blank line.
-
-### Data starts in the line below. The first line includes the CSV "column headers", followed by the actual data for each row
-"_cd","_indextime","_kv","_raw","_serial","_si","_sourcetype","_time",eventtype,host,index,linecount,punct,source,sourcetype,"splunk_server","tag::eventtype",timestamp
-"28:138489",1310074203,1," PID USER PR NI VIRT RES SHR S pctCPU pctMEM cpuTIME COMMAND
-  469 root ? ? 2378M 1568K 244K ? 7.2 ? 00:00.15 top
-  95 _coreaudiod ? ? 2462M 12M 952K ? 5.3 ? 88:47.70 coreaudiod
-  7722 itay ? ? 4506M 608M 99M ? 3.9 ? 75:02.81 pycharm
-",0,"Octavian.local
-os",top,1310074203,"top usb_device_registration_Linux_syslog","Octavian.local",os,120,"__________________________________________________",top,top,"Octavian.local","Linux
-USB
-device
-os
-process
-registration
-report
-success
-syslog
-top",none
-### This is the start of the second CSV record
-"28:138067",1310074173,1," PID USER PR NI VIRT RES SHR S pctCPU pctMEM cpuTIME COMMAND
-  369 root ? ? 2378M 1568K 244K ? 7.3 ? 00:00.15 top
-  95 _coreaudiod ? ? 2462M 12M 952K ? 5.4 ? 88:46.09 coreaudiod
-  7722 itay ? ? 4506M 608M 99M ? 3.9 ? 75:01.67 pycharm
-",1,"Octavian.local
-os",top,1310074173,"top usb_device_registration_Linux_syslog","Octavian.local",os,120,"__________________________________________________",top,top,"Octavian.local","Linux
-USB
-device
-os
-process
-registration
-report
-success
-syslog
-top",none
-### Above line is end of input
-```
-
-## Output
-
-Output of a custom search command is also in CSV. It follows the same format as
-the input: an optional header, followed by a blank line, followed by the data.
-Included below is a sample output for the above input:
-
-```
-### The configuration does not call for a header, so we start with the data immediately. The line below contains the CSV column headers.
-daemon,usbmuxd,windowserver,www,mdnsresponder,coreaudiod,itay,locationd,root,spotlight
-1,1,1,1,1,1,73,1,37,1
-1,1,1,1,1,1,73,1,37,1
-### The end of the output. The preceding lines are the actual records for each row
-```
\ No newline at end of file
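As the removed README describes, a custom search command simply reads CSV from stdin and writes CSV to stdout. For reference, a minimal streaming command in that style might look like the sketch below. This is not part of the diff: the command and its `greeting` field are hypothetical, and it assumes `enableheader = false` so that stdin begins directly with the CSV column headers.

```python
#!/usr/bin/env python
# Minimal sketch of a streaming custom search command (hypothetical,
# assuming enableheader = false so no key:value header precedes the CSV).
# It adds a static "greeting" field to every event it is streamed.
import csv
import sys

def main():
    reader = csv.DictReader(sys.stdin)
    fields = (reader.fieldnames or []) + ["greeting"]

    writer = csv.DictWriter(sys.stdout, fieldnames=fields)
    writer.writeheader()

    # A streaming command transforms each event independently,
    # like a "mapper", as the removed README puts it.
    for event in reader:
        event["greeting"] = "hello"
        writer.writerow(event)

if __name__ == "__main__":
    main()
```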
diff --git a/examples/custom_search/bin/usercount.py b/examples/custom_search/bin/usercount.py
deleted file mode 100755
index 7346fd212..000000000
--- a/examples/custom_search/bin/usercount.py
+++ /dev/null
@@ -1,196 +0,0 @@
-#!/usr/bin/env python
-#
-# Copyright 2011-2015 Splunk, Inc.
-#
-# Licensed under the Apache License, Version 2.0 (the "License"): you may
-# not use this file except in compliance with the License. You may obtain
-# a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
-# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
-# License for the specific language governing permissions and limitations
-# under the License.
-
-from __future__ import absolute_import
-import csv, sys
-import os
-
-sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), os.pardir, os.pardir, os.pardir)))
-
-from splunklib import six
-from splunklib.six.moves import zip
-from splunklib.six import StringIO
-from splunklib.six.moves import urllib
-
-# Tees output to a logfile for debugging
-class Logger:
-    def __init__(self, filename, buf = None):
-        self.log = open(filename, 'w')
-        self.buf = buf
-
-    def flush(self):
-        self.log.flush()
-
-        if self.buf is not None:
-            self.buf.flush()
-
-    def write(self, message):
-        self.log.write(message)
-        self.log.flush()
-
-        if self.buf is not None:
-            self.buf.write(message)
-            self.buf.flush()
-
-# Tees input as it is being read, also logging it to a file
-class Reader:
-    def __init__(self, buf, filename = None):
-        self.buf = buf
-        if filename is not None:
-            self.log = open(filename, 'w')
-        else:
-            self.log = None
-
-    def __iter__(self):
-        return self
-
-    def next(self):
-        return self.readline()
-
-    __next__ = next
-
-    def readline(self):
-        line = self.buf.readline()
-
-        if not line:
-            raise StopIteration
-
-        # Log to a file if one is present
-        if self.log is not None:
-            self.log.write(line)
-            self.log.flush()
-
-        # Return to the caller
-        return line
-
-def output_results(results, mvdelim = '\n', output = sys.stdout):
-    """Given a list of dictionaries, each representing
-    a single result, and an optional list of fields,
-    output those results to stdout for consumption by the
-    Splunk pipeline"""
-
-    # We collect all the unique field names, as well as
-    # convert all multivalue keys to the right form
-    fields = set()
-    for result in results:
-        for key in result.keys():
-            if(isinstance(result[key], list)):
-                result['__mv_' + key] = encode_mv(result[key])
-                result[key] = mvdelim.join(result[key])
-        fields.update(list(result.keys()))
-
-    # convert the fields into a list and create a CSV writer
-    # to output to stdout
-    fields = sorted(list(fields))
-
-    writer = csv.DictWriter(output, fields)
-
-    # Write out the fields, and then the actual results
-    writer.writerow(dict(list(zip(fields, fields))))
-    writer.writerows(results)
-
-def read_input(buf, has_header = True):
-    """Read the input from the given buffer (or stdin if no
-    buffer is supplied). An optional header may be present as well"""
-
-    # Use stdin if there is no supplied buffer
-    if buf is None:
-        buf = sys.stdin
-
-    # Attempt to read a header if necessary
-    header = {}
-    if has_header:
-        # Until we get a blank line, read "attr:val" lines,
-        # setting the values in 'header'
-        last_attr = None
-        while True:
-            line = buf.readline()
-
-            # remove last character (which is a newline)
-            line = line[:-1]
-
-            # When we encounter a blank line, we are done with the header
-            if len(line) == 0:
-                break
-
-            colon = line.find(':')
-
-            # If we can't find a colon, then it might be that we are
-            # on a new line, and it belongs to the previous attribute
-            if colon < 0:
-                if last_attr:
-                    header[last_attr] = header[last_attr] + '\n' + urllib.parse.unquote(line)
-                else:
-                    continue
-
-            # extract it and set value in settings
-            last_attr = attr = line[:colon]
-            val = urllib.parse.unquote(line[colon+1:])
-            header[attr] = val
-
-    return buf, header
-
-def encode_mv(vals):
-    """For multivalues, values are wrapped in '$' and separated using ';'
-    Literal '$' values are represented with '$$'"""
-    s = ""
-    for val in vals:
-        val = val.replace('$', '$$')
-        if len(s) > 0:
-            s += ';'
-        s += '$' + val + '$'
-
-    return s
-
-def main(argv):
-    stdin_wrapper = Reader(sys.stdin)
-    buf, settings = read_input(stdin_wrapper, has_header = True)
-    events = csv.DictReader(buf)
-
-    results = []
-
-    for event in events:
-        # For each event, we read in the raw event data
-        raw = StringIO(event["_raw"])
-        top_output = csv.DictReader(raw, delimiter = ' ', skipinitialspace = True)
-
-        # And then, for each row of the output of the 'top' command
-        # (where each row represents a single process), we look at the
-        # owning user of that process.
-        usercounts = {}
-        for row in top_output:
-            user = row["USER"]
-            user = user if not user.startswith('_') else user[1:]
-
-            usercount = 0
-            if user in usercounts:
-                usercount = usercounts[user]
-
-            usercount += 1
-            usercounts[user] = usercount
-
-        results.append(usercounts)
-
-    # And output it to the next stage of the pipeline
-    output_results(results)
-
-
-if __name__ == "__main__":
-    try:
-        main(sys.argv)
-    except Exception:
-        import traceback
-        traceback.print_exc(file=sys.stdout)
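The `encode_mv` helper in the removed script implements the multivalue encoding its docstring describes: each value is wrapped in `$`, values are joined with `;`, and a literal `$` is escaped as `$$`. A standalone sketch of the same rule, with made-up sample values:

```python
# Standalone sketch of the multivalue encoding implemented by encode_mv
# in the removed script: wrap values in '$', join with ';', escape '$' as '$$'.
def encode_mv(vals):
    return ';'.join('$' + val.replace('$', '$$') + '$' for val in vals)

# Made-up sample values, for illustration only:
print(encode_mv(['hello', 'wor$ld']))  # prints: $hello$;$wor$$ld$
```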
diff --git a/examples/custom_search/default/app.conf b/examples/custom_search/default/app.conf
deleted file mode 100644
index 3db1dd146..000000000
--- a/examples/custom_search/default/app.conf
+++ /dev/null
@@ -1,13 +0,0 @@
-#
-# Splunk app configuration file
-#
-
-[ui]
-is_visible = 1
-label = custom_search
-
-[launcher]
-author = Splunk
-description =
-version = 1.0
-
diff --git a/examples/custom_search/default/data/ui/nav/default.xml b/examples/custom_search/default/data/ui/nav/default.xml
deleted file mode 100644
index c2128a6f3..000000000
--- a/examples/custom_search/default/data/ui/nav/default.xml
+++ /dev/null
@@ -1,18 +0,0 @@
diff --git a/examples/custom_search/default/data/ui/views/README b/examples/custom_search/default/data/ui/views/README
deleted file mode 100644
index 6cf74f0bc..000000000
--- a/examples/custom_search/default/data/ui/views/README
+++ /dev/null
@@ -1 +0,0 @@
-Add all the views that your app needs in this directory
diff --git a/examples/custom_search/local/app.conf b/examples/custom_search/local/app.conf
deleted file mode 100644
index 78666a91e..000000000
--- a/examples/custom_search/local/app.conf
+++ /dev/null
@@ -1,3 +0,0 @@
-[ui]
-
-[launcher]
diff --git a/examples/custom_search/local/commands.conf b/examples/custom_search/local/commands.conf
deleted file mode 100644
index 891c6af09..000000000
--- a/examples/custom_search/local/commands.conf
+++ /dev/null
@@ -1,7 +0,0 @@
-[usercount]
-filename = usercount.py
-streaming = false
-retainsevents = false
-overrides_timeorder = true
-enableheader = true
-passauth = true
\ No newline at end of file
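The `enableheader = true` setting in the commands.conf removed above is what causes Splunk to prepend the "key:value" header lines and mandatory blank line documented in the README. A minimal sketch of consuming such a header, mirroring the `read_input` logic of the removed script (Python 3 `urllib.parse` assumed; the continuation-line handling of `read_input` is omitted for brevity):

```python
import sys
from urllib.parse import unquote  # the removed script used six.moves.urllib

def read_header(buf=sys.stdin):
    """Read 'key:value' header lines until the mandatory blank line,
    URL-decoding each value as read_input() in the removed script does."""
    header = {}
    for line in buf:
        line = line.rstrip('\n')
        if not line:  # the blank line terminates the header
            break
        key, _, value = line.partition(':')
        header[key] = unquote(value)
    return header
```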
diff --git a/examples/custom_search/metadata/default.meta b/examples/custom_search/metadata/default.meta
deleted file mode 100644
index ad9ff9361..000000000
--- a/examples/custom_search/metadata/default.meta
+++ /dev/null
@@ -1,29 +0,0 @@
-
-# Application-level permissions
-
-[]
-access = read : [ * ], write : [ admin, power ]
-
-### EVENT TYPES
-
-[eventtypes]
-export = system
-
-
-### PROPS
-
-[props]
-export = system
-
-
-### TRANSFORMS
-
-[transforms]
-export = system
-
-
-### VIEWSTATES: even normal users should be able to create shared viewstates
-
-[viewstates]
-access = read : [ * ], write : [ * ]
-export = system
diff --git a/examples/custom_search/metadata/local.meta b/examples/custom_search/metadata/local.meta
deleted file mode 100644
index f62d553a7..000000000
--- a/examples/custom_search/metadata/local.meta
+++ /dev/null
@@ -1,7 +0,0 @@
-[app/ui]
-owner = itay
-version = 4.2.2
-
-[app/launcher]
-owner = itay
-version = 4.2.2
diff --git a/tests/test_examples.py b/tests/test_examples.py
index fe71a1e4b..059d54645 100755
--- a/tests/test_examples.py
+++ b/tests/test_examples.py
@@ -253,67 +253,6 @@ def test_upload(self):
             "upload.py --help",
             "upload.py --index=sdk-tests %s" % file_to_upload)
 
-    # The following tests are for the custom_search examples. The way
-    # the tests work mirrors how Splunk would invoke them: they pipe a
-    # known-good input file into the custom search Python file, and then
-    # compare the resulting output file to a known-good one.
-    def test_custom_search(self):
-
-        def test_custom_search_command(script, input_path, baseline_path):
-            output_base, _ = os.path.splitext(input_path)
-            output_path = output_base + ".out"
-            output_file = io.open(output_path, 'bw')
-
-            input_file = io.open(input_path, 'br')
-
-            # Execute the command
-            result = run(script, stdin=input_file, stdout=output_file)
-            self.assertEquals(result, 0)
-
-            input_file.close()
-            output_file.close()
-
-            # Make sure the test output matches the baseline
-            baseline_file = io.open(baseline_path, 'br')
-            baseline = baseline_file.read().decode('utf-8')
-
-            output_file = io.open(output_path, 'br')
-            output = output_file.read().decode('utf-8')
-
-            # TODO: DVPL-6700: Rewrite this test so that it is insensitive to ties in score
-
-            message = "%s: %s != %s" % (script, output_file.name, baseline_file.name)
-            check_multiline(self, baseline, output, message)
-
-            # Cleanup
-            baseline_file.close()
-            output_file.close()
-            os.remove(output_path)
-
-        custom_searches = [
-            {
-                "script": "custom_search/bin/usercount.py",
-                "input": "../tests/data/custom_search/usercount.in",
-                "baseline": "../tests/data/custom_search/usercount.baseline"
-            },
-            {
-                "script": "twitted/twitted/bin/hashtags.py",
-                "input": "../tests/data/custom_search/hashtags.in",
-                "baseline": "../tests/data/custom_search/hashtags.baseline"
-            },
-            {
-                "script": "twitted/twitted/bin/tophashtags.py",
-                "input": "../tests/data/custom_search/tophashtags.in",
-                "baseline": "../tests/data/custom_search/tophashtags.baseline"
-            }
-        ]
-
-        for custom_search in custom_searches:
-            test_custom_search_command(
                custom_search['script'],
-                custom_search['input'],
-                custom_search['baseline'])
-
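The removed test drove each command the way Splunk would: pipe a recorded input file into the script and compare the output against a known-good baseline. Outside the test suite, the same check can be sketched as below (Python 3 `subprocess.run` assumed; paths are those from the removed test and are relative to the examples directory; like the removed test, this is a strict comparison and shares the sensitivity to ties noted in the TODO):

```python
# Sketch: pipe a recorded input into a custom search command and
# compare its output with a recorded baseline, as the removed test did.
import subprocess

def check_command(script, input_path, baseline_path):
    with open(input_path, 'rb') as input_file:
        result = subprocess.run(['python', script],
                                stdin=input_file,
                                stdout=subprocess.PIPE)
    assert result.returncode == 0

    with open(baseline_path, 'rb') as baseline_file:
        baseline = baseline_file.read()

    assert result.stdout == baseline, '%s: output != baseline' % script

check_command('custom_search/bin/usercount.py',
              '../tests/data/custom_search/usercount.in',
              '../tests/data/custom_search/usercount.baseline')
```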
     # The following tests are for the Analytics example
     def test_analytics(self):
         # We have to add the current path to the PYTHONPATH,