Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding DSV over HTTP data adapter. #4096

Merged
merged 5 commits into from Sep 12, 2017
Merged

Adding DSV over HTTP data adapter. #4096

merged 5 commits into from Sep 12, 2017

Conversation

@dennisoelkers
Copy link
Member

@dennisoelkers dennisoelkers commented Aug 16, 2017

Description

Motivation and Context

This adapter allows us to fetch delimiter-separated value files over
HTTP to back a lookup table with their content. The user is able to
specify the delimiter, (a) character(s) indicating lines which are
supposed to be ignored (comments) and if the lookup table should only
check for the presence of a key (in which case only true/false is
returned instead of the value).

Unlike the CSV adapter, this adapter does not support a starting line
naming the columns and specifying the key/value column.

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
This adapter allows us to fetch delimiter-separated value files over
HTTP to back a lookup table with their content. The user is able to
specify the delimiter, (a) character(s) indicating lines which are
supposed to be ignored (comments) and if the lookup table should only
check for the presence of a key (in which case only true/false is
returned instead of the value).

Unlike the CSV adapter, this adapter does not support a starting line
naming the columns and specifying the key/value column.
@dennisoelkers dennisoelkers force-pushed the add-dsv-over-http-data-adapter branch from 08c3993 to 74a08e3 Aug 18, 2017
Copy link
Contributor

@joschi joschi left a comment

Please add the missing license headers:

[INFO] --- license-maven-plugin:2.11:check (default) @ graylog2-server ---
[INFO] Checking licenses...
[WARNING] Missing header in: /home/travis/build/Graylog2/graylog2-server/graylog2-server/src/test/java/org/graylog2/lookup/adapters/dsvhttp/DSVParserTest.java
[WARNING] Missing header in: /home/travis/build/Graylog2/graylog2-server/graylog2-server/src/test/java/org/graylog2/lookup/adapters/dsvhttp/HTTPFileRetrieverTest.java
[WARNING] Missing header in: /home/travis/build/Graylog2/graylog2-server/graylog2-server/src/main/java/org/graylog2/lookup/adapters/dsvhttp/HTTPFileRetriever.java
[WARNING] Missing header in: /home/travis/build/Graylog2/graylog2-server/graylog2-server/src/main/java/org/graylog2/lookup/adapters/dsvhttp/DSVParser.java
@joschi
Copy link
Contributor

@joschi joschi commented Aug 22, 2017

The DSVParser class currently doesn't support using the delimiter character inside a quoted string:

diff --git a/graylog2-server/src/test/java/org/graylog2/lookup/adapters/dsvhttp/DSVParserTest.java b/graylog2-server/src/test/java/org/graylog2/lookup/adapters/dsvhttp/DSVParserTest.java
index 1db1a6b6e..93cb70a58 100644
--- a/graylog2-server/src/test/java/org/graylog2/lookup/adapters/dsvhttp/DSVParserTest.java
+++ b/graylog2-server/src/test/java/org/graylog2/lookup/adapters/dsvhttp/DSVParserTest.java
@@ -72,7 +72,9 @@ public class DSVParserTest {
         final String input = "# Sample file for testing\n" +
                 "\"foo\":\"23\"\n" +
                 "\"bar\":\"42\"\n" +
-                "\"baz\":\"17\"";
+                "\"baz\":\"17\"\n" +
+                "\"qux\":\"42:23\"\n" +
+                "\"qu:ux\":\"42\"";
         final DSVParser dsvParser = new DSVParser("#", ":", "\"", false, false, 0, Optional.of(1));

         final Map<String, String> result = dsvParser.parse(input);
@@ -80,11 +82,13 @@ public class DSVParserTest {
         assertThat(result)
                 .isNotNull()
                 .isNotEmpty()
-                .hasSize(3)
+                .hasSize(5)
                 .containsExactly(
                         new AbstractMap.SimpleEntry<>("foo", "23"),
                         new AbstractMap.SimpleEntry<>("bar", "42"),
-                        new AbstractMap.SimpleEntry<>("baz", "17")
+                        new AbstractMap.SimpleEntry<>("baz", "17"),
+                        new AbstractMap.SimpleEntry<>("qux", "42:23"),
+                        new AbstractMap.SimpleEntry<>("qu:ux", "42")
                 );
     }

Test result:

java.lang.AssertionError: 
Expecting:
  <{"bar"="42", "baz"="17", "foo"="23", "qu"="ux", "qux"="42"}>
to contain exactly (and in same order):
  <[foo=23, bar=42, baz=17, qux=42:23, qu:ux=42]>
but some elements were not found:
  <[MapEntry[key="qux", value="42:23"], MapEntry[key="qu:ux", value="42"]]>
and others were not expected:
  <[MapEntry[key="qux", value="42"], MapEntry[key="qu", value="ux"]]>


	at org.graylog2.lookup.adapters.dsvhttp.DSVParserTest.parseQuotedStrings(DSVParserTest.java:86)

As a user I would expect this to work, otherwise what's the point of having quoted keys/values?

@joschi
Copy link
Contributor

@joschi joschi commented Aug 22, 2017

@dennisoelkers Would it make sense to make the entry-separator (currently "\n") configurable?

Some sources might use "\r" or "\r\n" for line separation, and some might not even use newline characters but something like k1:v1;k2:v2;k3:v2;....

(Inspired by the Splitter class from Guava)

@joschi joschi self-assigned this Aug 22, 2017
@dennisoelkers
Copy link
Member Author

@dennisoelkers dennisoelkers commented Sep 11, 2017

Thanks, @joschi. Addressed both of your remarks.

Jochen Schalanda
@joschi
joschi approved these changes Sep 12, 2017
@joschi joschi merged commit 9cdc220 into master Sep 12, 2017
5 checks passed
5 checks passed
ci-web-linter Jenkins build graylog-pr-linter-check 1896 has succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
graylog-project/pr Jenkins build graylog-project-pr-snapshot 428 has succeeded
Details
license/cla Contributor License Agreement is signed.
Details
@joschi joschi deleted the add-dsv-over-http-data-adapter branch Sep 12, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.