Merged
60 commits
4e5fc2a
murmurhash implementation and tests
sanzmauro Aug 28, 2020
8f565bc
fixing rubocop
sanzmauro Aug 28, 2020
134ecfc
updated csv file
sanzmauro Sep 2, 2020
ff17436
wip
sanzmauro Sep 3, 2020
3371b7e
pr feedback
sanzmauro Sep 7, 2020
40d46c8
pr feedback
sanzmauro Sep 7, 2020
c9a8295
Merge pull request #328 from splitio/add-murmurhash-128-custom
sanzmauro Sep 10, 2020
5bc5be3
wip
sanzmauro Sep 9, 2020
148b550
add_bulk_v2 implementation working
sanzmauro Sep 9, 2020
28e0fb6
fixed rubocop
sanzmauro Sep 10, 2020
ec09e80
removed TODOs and fix tests
sanzmauro Sep 10, 2020
5a9eac9
added impression manager tests
sanzmauro Sep 10, 2020
fedf835
fix rubocop
sanzmauro Sep 10, 2020
ca6b2b7
added impression observer
sanzmauro Sep 10, 2020
102ba45
added impression observer test
sanzmauro Sep 10, 2020
1547993
added previousTime in impression to send
sanzmauro Sep 10, 2020
bd221e6
added should_add_pt
sanzmauro Sep 10, 2020
2fd7986
fix rubocop
sanzmauro Sep 10, 2020
af207cb
added impression mode config
sanzmauro Sep 10, 2020
726ee66
updated error message
sanzmauro Sep 11, 2020
5240d0e
fixed robocop
sanzmauro Sep 11, 2020
6104e8b
added counter and tests
sanzmauro Sep 11, 2020
15806f3
fix rubocop
sanzmauro Sep 11, 2020
d1aa523
debug
sanzmauro Sep 11, 2020
85b66e4
debug
sanzmauro Sep 11, 2020
60d12fd
pr feedback
sanzmauro Sep 14, 2020
53753f0
Merge pull request #329 from splitio/add-impression-manager-v2
sanzmauro Sep 14, 2020
08362e3
Merge pull request #330 from splitio/add-impression-manager-v2-cleani…
sanzmauro Sep 14, 2020
918018f
Merge pull request #331 from splitio/add-impression-observer-v2
sanzmauro Sep 14, 2020
e141735
added tests
sanzmauro Sep 14, 2020
98de4bf
deubg
sanzmauro Sep 14, 2020
bf8bc28
Merge pull request #332 from splitio/add-impression-modes
sanzmauro Sep 14, 2020
751cfff
Merge pull request #333 from splitio/add-impression-counter
sanzmauro Sep 14, 2020
6ec912f
removed matching_key
sanzmauro Sep 14, 2020
57dacc4
fix tests
sanzmauro Sep 14, 2020
5e15875
updated dto and tests
sanzmauro Sep 14, 2020
019999c
added header and removed ip logic
sanzmauro Sep 14, 2020
5ad56f4
Merge pull request #334 from splitio/update-dto-impressions
sanzmauro Sep 14, 2020
57245e3
added task sender and tests
sanzmauro Sep 15, 2020
eda4c5a
improvements
sanzmauro Sep 15, 2020
4283b25
fix tests
sanzmauro Sep 16, 2020
54e1846
pr feedback
sanzmauro Sep 16, 2020
349d9c0
polishing and added tests
sanzmauro Sep 16, 2020
0e8d5f4
fixed rubocop
sanzmauro Sep 16, 2020
c1a942f
renamed header name
sanzmauro Sep 16, 2020
8f75f2b
added e2e tests
sanzmauro Sep 16, 2020
7e8023c
added debug mode tests
sanzmauro Sep 16, 2020
4c86496
added block until ready
sanzmauro Sep 16, 2020
f754cd3
fixed rubocop
sanzmauro Sep 16, 2020
1bef23c
Merge pull request #335 from splitio/new-endpoint-and-task
sanzmauro Sep 17, 2020
adb69dd
wip
sanzmauro Sep 17, 2020
f3a4fa6
Merge pull request #336 from splitio/integration-tests
sanzmauro Sep 17, 2020
8ae6704
polishing
sanzmauro Sep 21, 2020
1969f10
fix unicorn and polishing
sanzmauro Sep 22, 2020
900a213
pr suggestions
sanzmauro Sep 23, 2020
86868e6
added test
sanzmauro Sep 23, 2020
6107e5d
Merge pull request #337 from splitio/imp-dedupe-polishing
sanzmauro Sep 24, 2020
da1454c
updated changes
sanzmauro Sep 25, 2020
385293d
updated version
sanzmauro Sep 25, 2020
a1a68d8
Merge pull request #338 from splitio/impression-dedupe-support
sanzmauro Sep 28, 2020
7 changes: 6 additions & 1 deletion CHANGES.txt
@@ -1,7 +1,12 @@
CHANGES

7.2.0 (Sep 25, 2020)
- Added deduplication logic for impressions data.
- Now there are two modes for Impressions when the SDK is in standalone mode, OPTIMIZED (default) that only ships unique impressions and DEBUG for times where you need to send ALL impressions to debug an integration.
- Impression listener remains unchanged and will still get all impressions.

7.1.3 (Jul 31, 2020)
- Updated rake development dependency to ~> 12.3.3
- Updated rake development dependency to ~> 12.3.3.

7.1.2 (Jun 15, 2020)
- Fixed uninitialized constant LocalhostSplitStore::YAML for console apps.
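
The difference between the two impression modes described in the 7.2.0 entry can be sketched roughly as follows. This is a simplified illustration, not the SDK's code: the `impressions_to_send` helper, the `:optimized`/`:debug` symbols, and the choice to treat key, feature, and treatment as the uniqueness tuple are all assumptions made for the example, and any time-window handling is ignored.

```ruby
require 'set'

# Hypothetical helper: decides which impressions would be shipped.
# :debug     -> every impression is sent
# :optimized -> only the first occurrence of each (key, feature, treatment)
def impressions_to_send(impressions, mode)
  return impressions.dup if mode == :debug

  seen = Set.new
  # Set#add? returns nil when the element is already present,
  # so repeated impressions are filtered out.
  impressions.select { |imp| seen.add?(imp.values_at(:k, :f, :t)) }
end

a = { k: 'user-1', f: 'checkout', t: 'on' }
b = { k: 'user-2', f: 'checkout', t: 'off' }

puts impressions_to_send([a, a, b], :debug).size
puts impressions_to_send([a, a, b], :optimized).size
```

An impression listener, per the changelog, would still receive all three impressions in either mode.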
117 changes: 117 additions & 0 deletions ext/murmurhash/3_x64_128.c
@@ -0,0 +1,117 @@
/*
* MurmurHash3_x64_128 (C) Austin Appleby
*/

#include "murmurhash.h"

void
murmur_hash_process3_x64_128(const char * key, uint32_t len, uint32_t seed, void *out)
{
const uint8_t * data = (const uint8_t*)key;
const int nblocks = len / 16;

uint64_t h1 = seed;
uint64_t h2 = seed;

const uint64_t c1 = (uint64_t)BIG_CONSTANT(0x87c37b91114253d5);
const uint64_t c2 = (uint64_t)BIG_CONSTANT(0x4cf5ad432745937f);

//----------
// body

const uint64_t * blocks = (const uint64_t *)(data);

int i;

for(i = 0; i < nblocks; i++)
{
uint64_t k1 = getblock64(blocks,i*2+0);
uint64_t k2 = getblock64(blocks,i*2+1);

k1 *= c1; k1 = ROTL64(k1,31); k1 *= c2; h1 ^= k1;

h1 = ROTL64(h1,27); h1 += h2; h1 = h1*5+0x52dce729;

k2 *= c2; k2 = ROTL64(k2,33); k2 *= c1; h2 ^= k2;

h2 = ROTL64(h2,31); h2 += h1; h2 = h2*5+0x38495ab5;
}

//----------
// tail

const uint8_t * tail = (const uint8_t*)(data + nblocks*16);

uint64_t k1 = 0;
uint64_t k2 = 0;

switch(len & 15)
{
case 15: k2 ^= ((uint64_t)tail[14]) << 48;
case 14: k2 ^= ((uint64_t)tail[13]) << 40;
case 13: k2 ^= ((uint64_t)tail[12]) << 32;
case 12: k2 ^= ((uint64_t)tail[11]) << 24;
case 11: k2 ^= ((uint64_t)tail[10]) << 16;
case 10: k2 ^= ((uint64_t)tail[ 9]) << 8;
case 9: k2 ^= ((uint64_t)tail[ 8]) << 0;
k2 *= c2; k2 = ROTL64(k2,33); k2 *= c1; h2 ^= k2;

case 8: k1 ^= ((uint64_t)tail[ 7]) << 56;
case 7: k1 ^= ((uint64_t)tail[ 6]) << 48;
case 6: k1 ^= ((uint64_t)tail[ 5]) << 40;
case 5: k1 ^= ((uint64_t)tail[ 4]) << 32;
case 4: k1 ^= ((uint64_t)tail[ 3]) << 24;
case 3: k1 ^= ((uint64_t)tail[ 2]) << 16;
case 2: k1 ^= ((uint64_t)tail[ 1]) << 8;
case 1: k1 ^= ((uint64_t)tail[ 0]) << 0;
k1 *= c1; k1 = ROTL64(k1,31); k1 *= c2; h1 ^= k1;
};

//----------
// finalization

h1 ^= len; h2 ^= len;

h1 += h2;
h2 += h1;

h1 = fmix64(h1);
h2 = fmix64(h2);

h1 += h2;
h2 += h1;

((uint64_t*)out)[0] = h1;
((uint64_t*)out)[1] = h2;
}

VALUE
murmur3_x64_128_finish(VALUE self)
{
uint8_t digest[16];
uint64_t out[2];

_murmur_finish128(self, out, murmur_hash_process3_x64_128);
assign_by_endian_128(digest, out);
return rb_str_new((const char*) digest, 16);
}

VALUE
murmur3_x64_128_s_digest(int argc, VALUE *argv, VALUE klass)
{
uint8_t digest[16];
uint64_t out[2];

_murmur_s_digest128(argc, argv, klass, (void*)out, murmur_hash_process3_x64_128);
assign_by_endian_128(digest, out);
return rb_str_new((const char*) digest, 16);
}

VALUE
murmur3_x64_128_s_rawdigest(int argc, VALUE *argv, VALUE klass)
{
uint64_t out[2];

_murmur_s_digest128(argc, argv, klass, (void*)out, murmur_hash_process3_x64_128);
return rb_assoc_new(ULL2NUM(out[0]), ULL2NUM(out[1]));
}
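
For readers who want to trace the body, tail, and finalization steps without compiling the extension, here is a pure-Ruby sketch of the same routine. It is an illustration only: the `Murmur3Sketch` module and its method names are made up for this example, `fmix64` is written out from the standard MurmurHash3 finalizer because its definition lives in `murmurhash.h` and is not shown in this diff, and every intermediate value is masked to 64 bits since Ruby integers do not overflow.

```ruby
# Illustrative pure-Ruby port of MurmurHash3_x64_128 (names are hypothetical).
module Murmur3Sketch
  MASK64 = 0xFFFFFFFFFFFFFFFF
  C1 = 0x87c37b91114253d5
  C2 = 0x4cf5ad432745937f

  module_function

  def rotl64(x, r)
    ((x << r) | (x >> (64 - r))) & MASK64
  end

  # Standard MurmurHash3 64-bit finalizer.
  def fmix64(k)
    k ^= k >> 33
    k = (k * 0xff51afd7ed558ccd) & MASK64
    k ^= k >> 33
    k = (k * 0xc4ceb9fe1a85ec53) & MASK64
    k ^ (k >> 33)
  end

  def mix_k1(k1)
    k1 = (k1 * C1) & MASK64
    k1 = rotl64(k1, 31)
    (k1 * C2) & MASK64
  end

  def mix_k2(k2)
    k2 = (k2 * C2) & MASK64
    k2 = rotl64(k2, 33)
    (k2 * C1) & MASK64
  end

  def hash128(key, seed = 0)
    bytes = key.b.bytes
    len = bytes.length
    h1 = h2 = seed & 0xFFFFFFFF
    nblocks = len / 16

    # body: two little-endian 64-bit words per 16-byte block
    nblocks.times do |i|
      k1, k2 = bytes[i * 16, 16].pack('C*').unpack('Q<Q<')
      h1 ^= mix_k1(k1)
      h1 = (rotl64(h1, 27) + h2) & MASK64
      h1 = (h1 * 5 + 0x52dce729) & MASK64
      h2 ^= mix_k2(k2)
      h2 = (rotl64(h2, 31) + h1) & MASK64
      h2 = (h2 * 5 + 0x38495ab5) & MASK64
    end

    # tail: zero-padding to 16 bytes is equivalent to the fall-through switch
    tail = bytes.drop(nblocks * 16)
    k1, k2 = (tail + [0] * (16 - tail.length)).pack('C*').unpack('Q<Q<')
    h2 ^= mix_k2(k2) if tail.length > 8
    h1 ^= mix_k1(k1) unless tail.empty?

    # finalization
    h1 ^= len
    h2 ^= len
    h1 = (h1 + h2) & MASK64
    h2 = (h2 + h1) & MASK64
    h1 = fmix64(h1)
    h2 = fmix64(h2)
    h1 = (h1 + h2) & MASK64
    h2 = (h2 + h1) & MASK64
    [h1, h2]
  end
end

p Murmur3Sketch.hash128('')
```

With an empty input and seed 0, every step operates on zero, so the result is `[0, 0]`; this makes a convenient sanity check when comparing against the native `rawdigest`.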
140 changes: 140 additions & 0 deletions ext/murmurhash/MurmurHash3.java
@@ -159,4 +159,144 @@ public static long murmurhash3_x86_32(CharSequence data, int seed) {

return h1 & 0xFFFFFFFFL;
}

// The following methods and constants are borrowed from
// `org.apache.commons.codec.digest.MurmurHash3`.

// Constants for 128-bit variant
private static final long C1 = 0x87c37b91114253d5L;
private static final long C2 = 0x4cf5ad432745937fL;
private static final int R1 = 31;
private static final int R2 = 27;
private static final int R3 = 33;
private static final int M = 5;
private static final int N1 = 0x52dce729;
private static final int N2 = 0x38495ab5;

/**
* Gets the little-endian long from 8 bytes starting at the specified index.
*
* @param data The data
* @param index The index
* @return The little-endian long
*/
private static long getLittleEndianLong(final byte[] data, final int index) {
return (((long) data[index ] & 0xff) ) |
(((long) data[index + 1] & 0xff) << 8) |
(((long) data[index + 2] & 0xff) << 16) |
(((long) data[index + 3] & 0xff) << 24) |
(((long) data[index + 4] & 0xff) << 32) |
(((long) data[index + 5] & 0xff) << 40) |
(((long) data[index + 6] & 0xff) << 48) |
(((long) data[index + 7] & 0xff) << 56);
}

public static long[] hash128x64(final String data, final long seed) {
final byte[] dataBytes = data.getBytes();
return hash128x64(dataBytes, 0, dataBytes.length, seed);
}

/**
* Generates 128-bit hash from the byte array with the given offset, length and seed.
*
* <p>This is an implementation of the 128-bit hash function {@code MurmurHash3_x64_128}
* from Austin Appleby's original MurmurHash3 {@code c++} code in SMHasher.</p>
*
* @param data The input byte array
* @param offset The first element of array
* @param length The length of array
* @param seed The initial seed value
* @return The 128-bit hash (2 longs)
*/
public static long[] hash128x64(final byte[] data, final int offset, final int length, final long seed) {
long h1 = seed;
long h2 = seed;
final int nblocks = length >> 4;

// body
for (int i = 0; i < nblocks; i++) {
final int index = offset + (i << 4);
long k1 = getLittleEndianLong(data, index);
long k2 = getLittleEndianLong(data, index + 8);

// mix functions for k1
k1 *= C1;
k1 = Long.rotateLeft(k1, R1);
k1 *= C2;
h1 ^= k1;
h1 = Long.rotateLeft(h1, R2);
h1 += h2;
h1 = h1 * M + N1;

// mix functions for k2
k2 *= C2;
k2 = Long.rotateLeft(k2, R3);
k2 *= C1;
h2 ^= k2;
h2 = Long.rotateLeft(h2, R1);
h2 += h1;
h2 = h2 * M + N2;
}

// tail
long k1 = 0;
long k2 = 0;
final int index = offset + (nblocks << 4);
switch (offset + length - index) {
case 15:
k2 ^= ((long) data[index + 14] & 0xff) << 48;
case 14:
k2 ^= ((long) data[index + 13] & 0xff) << 40;
case 13:
k2 ^= ((long) data[index + 12] & 0xff) << 32;
case 12:
k2 ^= ((long) data[index + 11] & 0xff) << 24;
case 11:
k2 ^= ((long) data[index + 10] & 0xff) << 16;
case 10:
k2 ^= ((long) data[index + 9] & 0xff) << 8;
case 9:
k2 ^= data[index + 8] & 0xff;
k2 *= C2;
k2 = Long.rotateLeft(k2, R3);
k2 *= C1;
h2 ^= k2;

case 8:
k1 ^= ((long) data[index + 7] & 0xff) << 56;
case 7:
k1 ^= ((long) data[index + 6] & 0xff) << 48;
case 6:
k1 ^= ((long) data[index + 5] & 0xff) << 40;
case 5:
k1 ^= ((long) data[index + 4] & 0xff) << 32;
case 4:
k1 ^= ((long) data[index + 3] & 0xff) << 24;
case 3:
k1 ^= ((long) data[index + 2] & 0xff) << 16;
case 2:
k1 ^= ((long) data[index + 1] & 0xff) << 8;
case 1:
k1 ^= data[index] & 0xff;
k1 *= C1;
k1 = Long.rotateLeft(k1, R1);
k1 *= C2;
h1 ^= k1;
}

// finalization
h1 ^= length;
h2 ^= length;

h1 += h2;
h2 += h1;

h1 = fmix64(h1);
h2 = fmix64(h2);

h1 += h2;
h2 += h1;

return new long[] { h1, h2 };
}
}
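
Two details of the Java port are worth seeing in isolation: the little-endian read done by `getLittleEndianLong`, and the fact that Java `long` values are signed, whereas the C extension returns its words through `ULL2NUM` as unsigned integers. The sketch below shows both in Ruby; the helper names `little_endian_long` and `to_signed64` are hypothetical and exist only for this illustration.

```ruby
# Reads 8 bytes starting at `index` as an unsigned little-endian 64-bit word,
# mirroring what getLittleEndianLong does with shifts and masks.
def little_endian_long(bytes, index)
  bytes[index, 8].pack('C*').unpack1('Q<')
end

# Reinterprets an unsigned 64-bit value as a signed one, which is how the
# same bit pattern would appear in a Java long.
def to_signed64(unsigned)
  [unsigned].pack('Q').unpack1('q')
end

word = little_endian_long([0, 0, 0, 0, 0, 0, 0, 0x80], 0)
puts word
puts to_signed64(word)
```

Because the hash is only used as a cache key within one process, a signed versus unsigned representation should not matter as long as a single runtime is used consistently.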
6 changes: 5 additions & 1 deletion ext/murmurhash/murmurhash.c
@@ -243,9 +243,13 @@ Init_murmurhash(void)
iv_seed = rb_intern("@seed");
iv_buffer = rb_intern("@buffer");


cDigest_MurmurHash3_x86_32 = rb_path2class("Digest::MurmurHashMRI3_x86_32");
rb_define_singleton_method(cDigest_MurmurHash3_x86_32, "digest", murmur3_x86_32_s_digest, -1);
rb_define_singleton_method(cDigest_MurmurHash3_x86_32, "rawdigest", murmur3_x86_32_s_rawdigest, -1);
rb_define_private_method(cDigest_MurmurHash3_x86_32, "finish", murmur3_x86_32_finish, 0);

cDigest_MurmurHash3_x64_128 = rb_path2class("Digest::MurmurHashMRI3_x64_128");
rb_define_singleton_method(cDigest_MurmurHash3_x64_128, "digest", murmur3_x64_128_s_digest, -1);
rb_define_singleton_method(cDigest_MurmurHash3_x64_128, "rawdigest", murmur3_x64_128_s_rawdigest, -1);
rb_define_private_method(cDigest_MurmurHash3_x64_128, "finish", murmur3_x64_128_finish, 0);
}
Binary file modified lib/murmurhash/murmurhash.jar
Binary file not shown.
5 changes: 5 additions & 0 deletions lib/splitclient-rb.rb
@@ -12,6 +12,8 @@
require 'splitclient-rb/cache/adapters/redis_adapter'
require 'splitclient-rb/cache/fetchers/segment_fetcher'
require 'splitclient-rb/cache/fetchers/split_fetcher'
require 'splitclient-rb/cache/hashers/impression_hasher'
require 'splitclient-rb/cache/observers/impression_observer'
require 'splitclient-rb/cache/repositories/repository'
require 'splitclient-rb/cache/repositories/segments_repository'
require 'splitclient-rb/cache/repositories/splits_repository'
@@ -28,6 +30,7 @@
require 'splitclient-rb/cache/senders/impressions_sender'
require 'splitclient-rb/cache/senders/metrics_sender'
require 'splitclient-rb/cache/senders/events_sender'
require 'splitclient-rb/cache/senders/impressions_count_sender'
require 'splitclient-rb/cache/senders/localhost_repo_cleaner'
require 'splitclient-rb/cache/stores/store_utils'
require 'splitclient-rb/cache/stores/localhost_split_builder'
@@ -52,6 +55,8 @@
require 'splitclient-rb/engine/api/segments'
require 'splitclient-rb/engine/api/splits'
require 'splitclient-rb/engine/api/events'
require 'splitclient-rb/engine/common/impressions_counter'
require 'splitclient-rb/engine/common/impressions_manager'
require 'splitclient-rb/engine/parser/condition'
require 'splitclient-rb/engine/parser/partition'
require 'splitclient-rb/engine/parser/evaluator'
34 changes: 34 additions & 0 deletions lib/splitclient-rb/cache/hashers/impression_hasher.rb
@@ -0,0 +1,34 @@
module SplitIoClient
  module Hashers
    class ImpressionHasher
      def initialize
        @murmur_hash_128_64 = case RUBY_PLATFORM
                              when 'java'
                                Proc.new { |key, seed| Java::MurmurHash3.hash128x64(key, seed) }
                              else
                                Proc.new { |key, seed| Digest::MurmurHashMRI3_x64_128.rawdigest(key, [seed].pack('L')) }
                              end
      end

      def process(impression)
        impression_data = "#{unknown_if_null(impression[:k])}"
        impression_data << ":#{unknown_if_null(impression[:f])}"
        impression_data << ":#{unknown_if_null(impression[:t])}"
        impression_data << ":#{unknown_if_null(impression[:r])}"
        impression_data << ":#{zero_if_null(impression[:c])}"

        @murmur_hash_128_64.call(impression_data, 0)[0]
      end

      private

      def unknown_if_null(value)
        value == nil ? "UNKNOWN" : value
      end

      def zero_if_null(value)
        value == nil ? 0 : value
      end
    end
  end
end
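
The string that `process` feeds into the hash can be reproduced on its own, which helps when checking which impressions will be treated as duplicates. The sketch below rebuilds only that string and skips the hashing step; `impression_key` is a hypothetical helper, and the reading of the short field names (`:k` key, `:f` feature, `:t` treatment, `:r` rule label, `:c` change number) is an assumption, since the diff does not spell them out.

```ruby
# Hypothetical helper mirroring how ImpressionHasher#process builds its input:
# nil text fields become "UNKNOWN", a nil change number becomes 0.
def impression_key(impression)
  fields = %i[k f t r].map do |name|
    impression[name].nil? ? 'UNKNOWN' : impression[name]
  end
  fields << (impression[:c].nil? ? 0 : impression[:c])
  fields.join(':')
end

puts impression_key({ k: 'user-1', f: 'checkout', t: 'on', r: nil, c: nil })
# prints "user-1:checkout:on:UNKNOWN:0"
```

Note that the timestamp (`:m`) is not part of the string, so two impressions that differ only in time produce the same hash.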
22 changes: 22 additions & 0 deletions lib/splitclient-rb/cache/observers/impression_observer.rb
@@ -0,0 +1,22 @@
module SplitIoClient
  module Observers
    class ImpressionObserver
      LAST_SEEN_CACHE_SIZE = 500000

      def initialize
        @cache = LruRedux::TTL::ThreadSafeCache.new(LAST_SEEN_CACHE_SIZE)
        @impression_hasher = Hashers::ImpressionHasher.new
      end

      def test_and_set(impression)
        return if impression.nil?

        hash = @impression_hasher.process(impression)
        previous = @cache[hash]
        @cache[hash] = impression[:m]

        previous.nil? ? nil : [previous, impression[:m]].min
      end
    end
  end
end
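
The behaviour of `test_and_set` can be tried without the `lru_redux` gem or the native hash. The sketch below is a stand-in under stated assumptions: `TinyObserver` is a made-up class, a plain Ruby `Hash` (which preserves insertion order) plays the role of the LRU cache, and a joined string of fields replaces the MurmurHash key. It is neither thread-safe nor TTL-aware.

```ruby
# Hypothetical stand-in for ImpressionObserver using only core Ruby.
class TinyObserver
  def initialize(max_size = 3)
    @max_size = max_size
    @cache = {} # insertion-ordered, oldest entry first
  end

  # Returns nil the first time an impression is seen; afterwards returns the
  # smaller of the previously stored time and the current one.
  def test_and_set(impression)
    return if impression.nil?

    key = impression.values_at(:k, :f, :t, :r, :c).join(':')
    previous = @cache.delete(key) # delete so re-insert marks it most recent
    @cache[key] = impression[:m]
    @cache.shift if @cache.size > @max_size # evict the oldest entry

    previous.nil? ? nil : [previous, impression[:m]].min
  end
end

observer = TinyObserver.new
imp = { k: 'user-1', f: 'checkout', t: 'on', r: 'default', c: 1, m: 100 }
p observer.test_and_set(imp)               # first sighting
p observer.test_and_set(imp.merge(m: 200)) # repeat with a later time
```

The first call prints `nil` and the second prints `100`, the earlier of the two timestamps.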