This repository has been archived by the owner on May 12, 2021. It is now read-only.

METRON-627: Add HyperLogLogPlus implementation to Stellar #397

Closed
wants to merge 9 commits

Conversation

mmiklavc
Contributor

@mmiklavc mmiklavc commented Dec 15, 2016

This PR addresses https://issues.apache.org/jira/browse/METRON-627

Leverages the HLLP implementation from https://github.com/addthis/stream-lib/blob/master/src/main/java/com/clearspring/analytics/stream/cardinality/HyperLogLogPlus.java

4 new Stellar functions have been added that allow a user to initialize a cardinality estimator, add items, merge estimators, and calculate cardinality estimates.

HLLP_ADD

  • Description: Add value to the HyperLogLogPlus estimator set.
  • Input:
    • hyperLogLogPlus - the hllp estimator to add a value to
    • value+ - value to add to the set. Takes a single item or a list.
  • Returns: The HyperLogLogPlus set with a new value added

HLLP_CARDINALITY

  • Description: Returns HyperLogLogPlus-estimated cardinality for this set.
  • Input:
    • hyperLogLogPlus - the hllp set
  • Returns: Long value representing the cardinality for this set

HLLP_INIT

  • Description: Initializes the HyperLogLogPlus estimator set. p must be a value between 4 and sp, and sp must be greater than 4 and less than 32.
  • Input:
    • p - the precision value for the normal set
    • sp - the precision value for the sparse set. If p is set, but sp is 0 or not specified, the sparse set will be disabled.
  • Returns: A new HyperLogLogPlus set
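
The choice of p trades accuracy against memory: the standard HyperLogLog bound puts the relative error at roughly 1.04/sqrt(2^p). A quick back-of-the-envelope check in Python (this constant is from the HyperLogLog literature, not from the Metron code; the stream-lib implementation may behave differently while its sparse representation is active):

```python
import math

# Rough relative-error bound for a HyperLogLog-family estimator
# with 2^p registers. Illustration only; not the Metron implementation.
def hll_relative_error(p):
    m = 2 ** p  # number of registers
    return 1.04 / math.sqrt(m)

for p in (4, 10, 14):
    print("p=%2d  ~%.2f%% relative error" % (p, 100 * hll_relative_error(p)))
```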

HLLP_MERGE

  • Description: Merge hllp sets together. The resulting estimator is initialized with p and sp precision values from the first provided hllp estimator set.
  • Input:
    • hllp - List of hllp estimators to merge. Takes a single hllp set or a list.
  • Returns: A new merged HyperLogLogPlus estimator set

Note: Added new library to metron-common pom and added 3 new items to dependencies_with_url.csv.

Testing

Stellar REPL

Spun up the Stellar REPL in quick-dev and verified that function composition works as expected, returning correct cardinality estimates for simple sparse-set cases. For example:

[Stellar]>>> HLLP_CARDINALITY(HLLP_MERGE( HLLP_ADD(HLLP_ADD(HLLP_INIT(), "runnings"), "cool"), HLLP_ADD(HLLP_ADD(HLLP_INIT(), "bobsled"), "team")))
4
[Stellar]>>> HLLP_CARDINALITY(HLLP_ADD(null, [1, 2, 3, 4, 4, 5]))
5
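
For sets this small the estimates coincide with exact distinct counts, which makes the REPL output easy to sanity-check; in plain Python terms the two calls above are estimating:

```python
# Exact distinct counts corresponding to the two REPL examples above.
merged = {"runnings", "cool"} | {"bobsled", "team"}
print(len(merged))        # 4, as HLLP_CARDINALITY estimated

values = [1, 2, 3, 4, 4, 5]
print(len(set(values)))   # 5 distinct values
```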

Profiler

Note: In order to fully test this PR with the profiler, the fixes in #420 are required.

Testing in a single-node cluster with Bro will not give you a reasonable distribution of data, so I've ginned up a script to create some. This script replaces the timestamp and destination IP address in a DNS and an HTTP record template. The split of distinct IPs is set to 60%, 20%, 20%.

#!/usr/bin/python
import random
import sys
import time
import datetime

def main():
  freq_s = float(sys.argv[1])
  r1 = '{"dns": {"ts":%TIMESTAMP%,"uid":"CXtY9x4GUiVhHwLLAa","id.orig_h":"192.168.66.1","id.orig_p":5353,"id.resp_h":"%DEST_ADDR%","id.resp_p":5353,"proto":"udp","trans_id":0,"query":"_googlecast._tcp.local","qclass":1,"qclass_name":"C_INTERNET","qtype":12,"qtype_name":"PTR","AA":false,"TC":false,"RD":false,"RA":false,"Z":0,"rejected":false}}'
  r2 = '{"http": {"ts":%TIMESTAMP%,"uid":"C9RfeyXjNU00nvE8i","id.orig_h":"192.168.138.158","id.orig_p":49184,"id.resp_h":"%DEST_ADDR%","id.resp_p":80,"trans_depth":1,"method":"GET","host":"va872g.g90e1h.b8.642b63u.j985a2.v33e.37.pa269cc.e8mfzdgrf7g0.groupprograms.in","uri":"/?285a4d4e4e5a4d4d4649584c5d43064b4745","user_agent":"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0)","request_body_len":0,"response_body_len":560,"status_code":200,"status_msg":"OK","tags":[],"resp_fuids":["FvZ5o85OV1E3xiZH"],"resp_mime_types":["text/html"]}}'

  while True:
    # print str(random.gauss(mu, sigma))
    num = random.randint(1,10)
    dest = "224.0.0.25"
    if (num <= 6):
        dest = dest + str(1)
    elif (num <= 8):
        dest = dest + str(2)
    else:
        dest = dest + str(3)

    ts = datetime.datetime.now()
    if (random.randint(1,2) == 1):
        record = r1
    else:
        record = r2
    print record.replace("%DEST_ADDR%", dest).replace("%TIMESTAMP%", ts.strftime("%s.%f"))
    sys.stdout.flush()
    time.sleep(freq_s)

if __name__ == '__main__':
  main()

Run as follows to create a random record every 3 seconds:
python bro-data.py 3 | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic bro
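
As a quick sanity check on the 60/20/20 split (not part of the test plan): the randint branching in the script maps draws 1-6, 7-8, and 9-10 to the three destination addresses, and counting over many draws lands near those proportions:

```python
import random
from collections import Counter

random.seed(42)  # reproducible check
counts = Counter()
for _ in range(100000):
    num = random.randint(1, 10)
    if num <= 6:
        counts["224.0.0.251"] += 1
    elif num <= 8:
        counts["224.0.0.252"] += 1
    else:
        counts["224.0.0.253"] += 1

total = float(sum(counts.values()))
for addr in sorted(counts):
    print(addr, round(counts[addr] / total, 3))  # ~0.6, ~0.2, ~0.2
```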

Set up $METRON_HOME/config/profiler.properties

profiler.period.duration=1
profiler.period.duration.units=MINUTES

Also adjust $METRON_HOME/config/zookeeper/global.json to set the capture duration:

"profiler.client.period.duration" : "1",
"profiler.client.period.duration.units" : "MINUTES"

Set up profiler.json with 3 profiles. (Note that if you delete a profile, you will need to restart the topology in order to remove it fully.)

{
    "profiles": [
    {
        "profile": "cardinality",
        "foreach": "ip_src_addr",
        "onlyif": "true",
        "init" : {
            "hllp": "HLLP_INIT()"
        },
        "update": {
            "hllp" : "HLLP_ADD(hllp, ip_dst_addr)"
        },
        "result": "HLLP_CARDINALITY(hllp)"
    },
    {
        "profile": "sketchy_mad_state",
        "foreach": "ip_src_addr",
        "onlyif": "true",
        "init" : {
            "last_cardinality" : "GET_FIRST(PROFILE_GET('cardinality', ip_src_addr, 1, 'MINUTES'))",
            "trailing_outlier_state" : "OUTLIER_MAD_STATE_MERGE(PROFILE_GET('sketchy_mad_state', ip_src_addr, 5, 'MINUTES'))",
            "s": "OUTLIER_MAD_ADD(trailing_outlier_state, last_cardinality)"
        },
        "update": { },
        "result": "s"
    },
    {
        "profile": "sketchy_mad_score",
        "foreach": "ip_src_addr",
        "onlyif": "true",
        "init" : {
            "last_cardinality" : "GET_FIRST(PROFILE_GET('cardinality', ip_src_addr, 1, 'MINUTES'))",
            "trailing_outlier_state" : "OUTLIER_MAD_STATE_MERGE(PROFILE_GET('sketchy_mad_state', ip_src_addr, 5, 'MINUTES'))",
            "score" : "OUTLIER_MAD_SCORE(trailing_outlier_state, last_cardinality)"
        },
        "update": { },
        "result": "score"
    }
    ]
}

The 3 profiles are set up as follows:

  1. Initialize the HLLP set and maintain an HLLP cardinality set per ip_src_addr encountered.
  2. Add the latest cardinality calculation to the MAD state.
  3. Generate a MAD score every tick over a sliding window of the last 5 minutes of hllp scores for this ip_src_addr.
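
OUTLIER_MAD_SCORE is based on the modified z-score built from the median absolute deviation (MAD); a minimal sketch of that idea in Python (an illustration of the math, not the actual Metron implementation):

```python
import statistics

def mad_score(window, value, k=0.6745):
    """Modified z-score of `value` against a window of past observations."""
    med = statistics.median(window)
    mad = statistics.median([abs(x - med) for x in window])
    if mad == 0:
        return 0.0  # degenerate window; real implementations need a fallback scale
    return k * abs(value - med) / mad

# Cardinalities hovering around 3 distinct IPs, then a spike:
window = [2, 2, 3, 4, 4]
print(mad_score(window, 3))   # 0.0 - nothing unusual
print(mad_score(window, 40))  # large score; > 3.5 is a common outlier cutoff
```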

Set up $METRON_HOME/config/zookeeper/enrichments/bro.json

{
  "enrichment" : {
    "fieldMap": {
      "geo": ["ip_dst_addr", "ip_src_addr"],
      "host": ["host"],
      "stellar" : {
          "config" : {
              "profiler_score" : "GET_FIRST(PROFILE_GET( 'sketchy_mad_score', ip_src_addr, 1, 'MINUTES'))",
              "is_alert" : "if profiler_score > 3.5 then true else false"
          }
      }
    }
  },
  "threatIntel": {
    "fieldMap": {
      "hbaseThreatIntel": ["ip_src_addr", "ip_dst_addr"]
    },
    "fieldToTypeMap": {
      "ip_src_addr" : ["malicious_ip"],
      "ip_dst_addr" : ["malicious_ip"]
    },
    "triageConfig" : {
      "riskLevelRules" : {
        "profiler_score > 3.5" : 10
      },
      "aggregator" : "MAX"
    }
  }
}

Push the config to Zookeeper
$METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper/ -z node1:2181

Start the profiler topology
$METRON_HOME/bin/start_profiler_topology.sh

@@ -21,6 +21,8 @@ com.jcraft:jsch:jar:0.1.42:compile,BSD,http://www.jcraft.com/jsch/
com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:compile,CDDL,http://jaxb.java.net/
com.sun.xml.bind:jaxb-impl:jar:2.2.5-2:compile,CDDL,http://jaxb.java.net/
com.twitter:jsr166e:jar:1.1.0:compile,CC0 1.0 Universal,http://github.com/twitter/jsr166e
it.unimi.dsi:fastutil:jar:6.5.11:compile,ASLv2,https://github.com/vigna/fastutil
it.unimi.dsi:fastutil:jar:7.0.6:compile,ASLv2,https://github.com/vigna/fastutil
Member

Any idea if we can rely on just 7.0.6 rather than both 6.5.11 and 7.0.6? Is stream-lib depending on both?

Contributor Author

I have to look at this further. I just did another cursory dependency review and I only see 7.0.6 referenced. 6.5.11 was definitely showing up in the report.

Contributor Author

Update - this 6.5.11 dep was added somewhere else along the way: org.apache.solr:solr-test-framework:jar:5.2.1:test

Contributor Author

This is part of the transitive test dependencies for Solr. I excluded fastutil and ran the integration tests for metron-solr and nothing broke. Worst case scenario, we can later add 7.0.6 as a test dep if needed, or simply keep the original. But seeing as it works without it, I'm inclined to keep the exclusion unless anyone has any objections.

Member

@cestella cestella left a comment

Cool! Good contribution @mmiklavc! A few small things, but overall great. :)

* fill - the fill character string
* len - the required length
* Returns: Last element of the list
* p (required) - the precision value for the normal set
Member

What are the bounds for p and sp?

* Description: Add value to the set
* Input:
* hyperLogLogPlus - the hllp set
* o - Object to add to the set
Member

Can we make o be an object or a list? This way you can call HLLP_OFFER(hllp, [ field1, field2]) or HLLP_OFFER(hllp, field1).

Contributor Author

Heh, I just made the same comment above. Incidentally, our BLOOM_ADD currently works like a vararg. I may open a separate PR to change that as well.

Member

Sounds good

* hllpn - additional sets to merge
* Returns: A new merged HyperLogLogPlus estimator set

### `HLLP_OFFER`
Member

@cestella cestella Dec 16, 2016

I'm not in love with HLLP_OFFER... I'd prefer something like HLLP_ADD to match BLOOM_ADD. I understand that HLLP is probabilistic, so they're probably being cagey about having their API calls make checks that their maths can't cash, but I think we can educate sufficiently to make it safe to call this ADD, IMO.

Contributor Author

In addition to this terminology change, what do you think of changing the implementation to accept a single item or array?
HLLP_ADD(hllp, "value1")
or
HLLP_ADD(hllp, [ "value1", "value2", "value3" ] )

Member

I like that idea! We should probably make BLOOM_ADD function that way too.

Contributor

That is a great idea. That seems like a common API design best practice for Stellar; we could probably use a Stellar programmer's guide with stuff like that:

  • DO consider accepting scalar and list inputs to your functions
  • DO return NaN from xxxxxx

This is a 'plan for success' kind of thing. We want the Stellar library to be large, so large that you can't rely on looking through all the other functions to find the 'right' way to do things.

, name="INIT"
, description="Initializes the set"
, params = {
"p (required) - the precision value for the normal set"
Member

Are there any documents we can link to that describe the tradeoffs for these values? I'm thinking of something like here, discussing the tradeoff between accuracy and size.

Contributor Author

There are some diagrams. Think it's sufficient to link in the description, or should we consider embedding these images right into the README?

Member

I think a link and any description or guidance you can provide in the description will help.

, name="MERGE"
, description="Merge hllp sets together"
, params = {
"hllp1 - first hllp set"
Member

Can we make this fit the pattern from BLOOM_MERGE to just take a list? That was done because interactions with the profiler, which is largely where this will be used, will be made easier with that arrangement (i.e. PROFILE_GET returns a list of these, so you'd call this 99% of the time via HLLP_MERGE(PROFILE_GET(....)).

Contributor Author

Agreed.

* @return New List with the elements cast to the desired type
*/
public static <T, U> List<U> convertList(List<T> from, Class<U> clazz) {
return Lists.transform(from, s -> clazz.cast(s));
Member

@cestella cestella Dec 16, 2016

My issue with this is that above we are calling a method convert and it is a coercive operation (e.g. convert("1", Integer.class) == 1) rather than just a simple cast. Can you make convertList call convert on the elements? If you really want a naive version of this, feel free to make a castList method.


import java.io.Serializable;
import java.util.List;

Member

Can you link the google paper from arxiv on which this is based in the Javadoc comments and in the Stellar description?

Contributor Author

Yep

Contributor Author

Heads up, I searched arxiv and it doesn't appear that this paper is there.

Contributor Author

I'm going to link to the Google research link - https://research.google.com/pubs/pub40671.html

@cestella
Member

cestella commented Dec 16, 2016

Because this is a function that will 99% be used from the profiler, could you please design a simple test around using this with the profiler and give us quick rundown on the test procedure as a comment on this PR. I'm thinking something similar (but perhaps less intense) to here.

@ottobackwards
Contributor

Should functions like this have a more expansive description, that perhaps includes a set of sample data and the outcome? Similar to what the Profiler functions have?

@ottobackwards
Contributor

I also often find myself wondering if we need peer jiras for documentation/wiki work on this stuff created

@mmiklavc
Contributor Author

@ottobackwards +1 on some working examples. We currently plop the whole lot of Stellar documentation in the metron-common README and I wonder if we should start splitting any of this out into separate md files. This set of functions really works as a small suite. Same for the Bloom filter functions. And some deeper-dive might prove useful. I like @cestella 's recommendation of using the profiler for an end-to-end test. This really gets at illustrating the value from a user/customer perspective.

I think we want to consider how we choose to use the Metron wiki vs the README files. It feels like we could keep the READMEs lean and mean, dealing specifically with relevant options and links to external papers and documentation. And then on the wiki we can go into greater technical detail and maybe even start assembling a cookbook of examples. Thoughts?

@cestella
Member

A couple things. First off, these functions (Bloom and HLL) really should exist in stellar-statistics rather than in common, IMO (thoughts?). Second, Stellar should exist in its own project with a bare set of common functions and the Stellar infrastructure.

@ottobackwards
Contributor

I agree with keeping the READMEs lean. What I'm struggling with (and this may just be me as an FOSS noob) is that there seems to be no process around calling out documentation. On internal company projects, we would require Jira and Confluence entries for product documentation and internal documentation needs, to put in the backlog. I don't see where we do that here. There is a general understanding that "even if you don't contribute code, contributing documentation is good", but where are we tracking what needs it, etc.?

@cestella
Member

Now, regarding the wiki vs the Readme. The reason I am hesitant to move things to the wiki is that the wiki doesn't really track with the code well. What happens when we change things around and want examples from an older version? We don't have a solution for that at the moment.

@ottobackwards
Contributor

In that case, we should have multiple concise READMEs with more detail and a top-level overview README.

@ottobackwards
Contributor

ottobackwards commented Dec 16, 2016

Maybe we could build a readme-wiki, like a section of our wiki which is just the readme's from the current build?

That would also give doc-contributors a way to just work in confluence and not have to go through the code. That might help?

@cestella
Member

I imagine the wiki is for things like process docs (bylaws, etc.) and high level coverage of end-to-end usecases aimed at multiple features working in concert to achieve an end (e.g. the blogs). These get updated per release. The README examples are more targeted and aimed at demonstrating the feature without necessarily being tied to a specific usecase (e.g. the MAD example just generates random data, not security data).

@ottobackwards
Contributor

But, I just download the binaries or install through ambari? You mean I have to look through the codebase?

@cestella
Member

@ottobackwards Good point, there is some overlap between end-user docs and the stuff in the README.md's. Things like the stellar language and function docs should be part of the wiki, I suspect. We probably need something more high level than the examples in the stellar-statistics README that I cited earlier.

@ottobackwards
Contributor

what we need is a plan, whatever it is ;)

@cestella
Member

Ok, I propose a [discuss] around docs. These are great questions, but they are probably not going to result in getting us HLL functionality soon ;)

@nickwallen
Contributor

I was thinking how I could use your HLLP functionality with the Profiler. I think I could use this functionality to track the in-degree and out-degree of a host over time. This might be an interesting motivating example.

For example, calculating the in-degree would look like the following.

{
  "profile": "in-degree",
  "onlyif": "source.type == 'yaf'",
  "foreach": "ip_dst_addr",
  "init": {
    "in": "HLLP_INIT(5, 6)"
  },
  "update": {
    "in": "HLLP_ADD(in, ip_src_addr)"
  },
  "result": "HLLP_CARDINALITY(in)"
}

Calculating the out-degree would look like this.

{
  "profile": "out-degree",
  "onlyif": "source.type == 'yaf'",
  "foreach": "ip_src_addr",
  "init": {
    "out": "HLLP_INIT(5, 6)"
  },
  "update": {
    "out": "HLLP_ADD(out, ip_dst_addr)"
  },
  "result": "HLLP_CARDINALITY(out)"
}

Of course, with your HLLP_MERGE it might be more interesting to store the HLLP object itself and call HLLP_CARDINALITY after-the-fact, but I just want to make sure I'm using the API correctly.

@nickwallen
Contributor

@mmiklavc Meant to ask previously, does this look right? Am I using your API correctly?

@cestella
Member

@nickwallen That does look right, but you probably want the result to just be out so you can decouple the in-degree and out-degree calculation from the window size in the profiler.

@nickwallen
Contributor

Yes, understood. I think that's what I wrote at the bottom of my last post. It's clearer for a motivating example to actually do the calculation as the result.

@cestella
Member

@nickwallen Fair point; I missed that last sentence. Sorry for the confusion.

@nickwallen
Contributor

Very cool stuff, @mmiklavc. Can't wait to see it hit master.

Just to lend my $.02, I do agree with @ottobackwards that a more detailed example would be helpful. Maybe something in a REPL session that uses multiple lines. Kind of hard for me to grok the single-line example in the PR description.

Also, per @cestella, I agree that this makes sense in metron-analytics/metron-statistics. He mentioned stellar-statistics, but I think he meant metron-statistics.

@mmiklavc
Contributor Author

mmiklavc commented Jan 3, 2017

@nickwallen Thanks for the feedback! - I'm currently working on finishing up some performance metrics (p/sp vals vs accuracy, cardinality, serialized mem consumption) in addition to an example utilizing the profiler. Your example above is exactly in line with what I was thinking might make for a good demo of this functionality.

Also, agreed on the single-line example - that was meant to make cut-and-paste easy. The unit and integration tests are much easier to read. I'll provide similar examples here.

@mmiklavc
Contributor Author

@nickwallen @cestella Moved the HLLP implementation wrapper and Stellar functions to the metron-statistics project. The Bloom filter Stellar functions should probably be moved to the stats project as well, but that's for a separate PR, I feel.

I also added a utility for generating performance metrics that includes a README (HLLP.md) that is linked to in the main metron-statistics README as well as the metron-common README. There is a link provided to the Google paper, an exposition of performance results, and a new default precision for the sparse and dense (normal) sets.

Demo sample soon to follow using the profiler.

@mmiklavc
Contributor Author

Hm, something fishy going on here with Travis. All tests run fine from Maven (on OS X) locally.

[INFO] metron-analytics ................................... SUCCESS [  1.079 s]
[INFO] metron-maas-common ................................. SUCCESS [ 10.757 s]
[INFO] metron-maas-service ................................ SUCCESS [01:25 min]
[INFO] metron-statistics .................................. SUCCESS [ 46.321 s]
[INFO] metron-profiler-common ............................. SUCCESS [ 13.036 s]
[INFO] metron-profiler-client ............................. SUCCESS [01:21 min]
[INFO] metron-profiler .................................... SUCCESS [04:34 min]

@dlyle65535
Contributor

I was able to duplicate the failures in my environment by doing the following (based on what Travis does).

mkdir mike_test
cd mike_test
git clone --depth=50 https://github.com/apache/incubator-metron.git apache/incubator-metron
cd apache/incubator-metron/
git fetch origin +refs/pull/397/merge
git checkout -qf FETCH_HEAD
mvn -q integration-test install && build_utils/verify_licenses.sh

It bombs out with:

cardinality_gives_distinct_value_estimate_with_precisions_set(org.apache.metron.statistics.approximation.HyperLogLogPlusFunctionsIntegrationTest)  Time elapsed: 0.004 sec  <<< ERROR!
org.apache.metron.common.dsl.ParseException: Unable to parse HLLP_CARDINALITY(
  HLLP_ADD(
    HLLP_ADD(
      HLLP_INIT(5, 6),
      val1
    ),
    val2
  )
)
: Syntax error @ 8:3 no viable alternative at input 'HLLP_CARDINALITY(HLLP_ADD(HLLP_ADD(HLLP_INIT(5,6),val1),val2)\n'

@jjmeyer0
Contributor

This looks like an issue with the grammar. My guess is it isn't properly handling whitespace at the end. If you change the variable hllpDefaultConstructorRule to the below, it should work. I know this isn't a permanent solution, but it narrows the problem down.

  /**
   *HLLP_CARDINALITY(
   *  HLLP_ADD(
   *    HLLP_ADD(
   *      HLLP_INIT(),
   *      val1
   *    ),
   *    val2
   *  )
   *  )*/
  @Multiline
  private static String hllpDefaultConstructorRule;

@mmiklavc
Contributor Author

@dlyle65535 the thing I'm not fully understanding is how this is working in my local branch. Doing a git fetch and checkout shouldn't merge anything from master, so the most recent master shouldn't be in this test run. Doing a git pull does a fetch and merge, which I would anticipate causing problems.

@jjmeyer0 Yeah, that's what I thought as well. Again, unclear yet why this is not a problem locally. Furthermore, it wasn't a problem in the last commit I had that I believe had the newline. I'll give it a shot anyhow.

@cestella
Member

@jjmeyer0 @mmiklavc It appears that a regression was added relatively recently that affects whitespace at the end of Stellar rules (the whitespace isn't sent to the skip channel in the lexer in that situation, for some reason... not sure why). A stopgap is what JJ said; a slightly more permanent fix, considering whitespace has no meaning in Stellar, is to trim the rule in BaseStellarProcessor (line 111). I'd still consider it important to investigate what is confusing the grammar suddenly, mind you.

@jjmeyer0
Contributor

Okay, I took a look. It's because EOL isn't defined as a fragment.

If EOL : '\n'; is changed to fragment EOL : '\n'; it should work. Without the fragment keyword, ANTLR4 makes a token for EOL and gets confused about how to parse it.

@cestella
Member

@jjmeyer0 Makes sense and that explains things. @mmiklavc despite it being out of the scope of this ticket, it should just be a 1-liner and we should probably correct it as part of this IMO.

@jjmeyer0
Contributor

@mmiklavc have you rebased lately? If not, I'm sure the PR I submitted for Stellar comparisons is the culprit.

@mmiklavc
Contributor Author

Ok, this is making more sense now. GitHub has merged this PR out of band of the build; that fetch is not doing a merge, it's fetching the merge commit GitHub already created - git fetch origin +refs/pull/397/merge. This also explains how GitHub can tell us whether a PR has merge conflicts long before Travis runs its build. So yes, the grammar corrections should work. On it.

@mmiklavc
Contributor Author

Merged with latest master and modified Stellar.g4 to handle EOL fragments. Test went from failing to passing locally.

@mmiklavc mmiklavc closed this Jan 18, 2017
@mmiklavc
Contributor Author

Kick Travis

@mmiklavc mmiklavc reopened this Jan 18, 2017
@cestella
Member

Did you verify that these functions run on:

  • the profiler
  • the parser topology via field transformations
  • the enrichment topology via stellar enrichments
  • the Stellar REPL

If so, then I'm +1

@mmiklavc
Contributor Author

@cestella Verified the above - modifying the testing plan now

@cestella
Member

@mmiklavc Can you please merge master in and commit this?

@asfgit asfgit closed this in f3ca3c0 Jan 27, 2017