Change BloomFilter implementation for Sparse Joins #1806

Merged
merged 46 commits into from May 3, 2019
Changes from 37 commits
d70edee
update docs
nevillelyh Mar 22, 2019
8181a11
fix maven-metadata.xml URL in scripts/bump_scio.sh
nevillelyh Mar 22, 2019
b4c2a0c
Move scio-contrib back to scio-extra (#1776)
nevillelyh Mar 25, 2019
9584792
Add a hand written Coder for pairs (#1775)
jto Mar 26, 2019
b75aaa1
Make BQ annotations serializable (#1773)
anish749 Mar 26, 2019
d3a41b5
use golang image for CircleCI deploy, fix #1772 (#1779)
nevillelyh Mar 26, 2019
01dd7bc
Remove the usage of Future around ScioContext and Tap's (#1666)
regadas Feb 13, 2019
e5675b9
Fix benchmark rebase
regadas Mar 13, 2019
62bbd20
Add schema and row coders (#1698)
jto Mar 20, 2019
dc6f4a4
Simplify query row transform (#1767)
regadas Mar 21, 2019
f48c2cd
Fix typos
jto Mar 26, 2019
4a8496c
use camelCase for typed arguments, fix #1770 (#1780)
nevillelyh Mar 26, 2019
ddf1f09
Rework sql syntax (#1778)
regadas Mar 27, 2019
ac346dc
Fix: use same protoc (#1781)
regadas Mar 27, 2019
65f0f43
Bump coursier version (#1782)
regadas Mar 27, 2019
6bb54ed
Merge remote-tracking branch 'origin/master'
anish749 Mar 31, 2019
a8fba4f
update pair scollection functions to use mutable bloom filters.
anish749 Apr 2, 2019
96b9088
turn off scalastyle for use of return
anish749 Apr 2, 2019
80ab0b5
Merge branch 'master' of github.com:spotify/scio
anish749 Apr 4, 2019
def347f
Merge branch 'master' of github.com:spotify/scio
anish749 Apr 8, 2019
244210a
Merge branch 'master' into mutableBloomFilters
anish749 Apr 8, 2019
5ebb747
add sparse bloom filter implementation to save memory
anish749 Apr 10, 2019
4d63959
delayed mutable bf
Apr 24, 2019
756d6f4
fix bf set init condition, delayed init tests.
Apr 24, 2019
5780fae
clean up sparse mutable bf
Apr 25, 2019
32348f5
less values
Apr 25, 2019
1a59d10
add gens for sparse mutable bf
Apr 26, 2019
c3e0fbd
Merge branch 'master2k19' into mutableBloomFilters
anish749 Apr 29, 2019
3f05ce6
new scala fmt
anish749 Apr 29, 2019
dcb8056
fix imports and ret type
anish749 Apr 29, 2019
0f251dd
Update scio-core/src/main/scala/com/spotify/scio/util/BloomFilter.scala
nevillelyh May 1, 2019
d416e6a
Update scio-core/src/main/scala/com/spotify/scio/util/BloomFilter.scala
nevillelyh May 1, 2019
d346d61
Update scio-core/src/main/scala/com/spotify/scio/util/BloomFilter.scala
nevillelyh May 1, 2019
070bb8b
Update scio-core/src/main/scala/com/spotify/scio/util/BloomFilter.scala
nevillelyh May 1, 2019
a5fed50
Update scio-core/src/main/scala/com/spotify/scio/util/BloomFilter.scala
nevillelyh May 1, 2019
1b719c3
Update scio-core/src/main/scala/com/spotify/scio/util/BloomFilter.scala
nevillelyh May 1, 2019
cdfe002
Update scio-core/src/main/scala/com/spotify/scio/util/BloomFilter.scala
nevillelyh May 1, 2019
699fe13
add mutation detection tests
anish749 May 1, 2019
4f1a3d7
Merge branch 'master' into mutableBloomFilters
May 2, 2019
52f6313
Update scio-core/src/main/scala/com/spotify/scio/util/BloomFilter.scala
regadas May 2, 2019
ae146cf
Apply suggestions from code review
regadas May 2, 2019
9ff97a7
address review comments
anish749 May 2, 2019
e6bab9f
more restricted access
anish749 May 2, 2019
66bd624
foreach -> while
anish749 May 2, 2019
5ed4707
add warning
anish749 May 2, 2019
2ff5176
remove return
anish749 May 2, 2019
4 changes: 3 additions & 1 deletion build.sbt
@@ -457,7 +457,9 @@ lazy val scioBigQuery: Project = Project(
 // DataFlow testing requires junit and hamcrest
 "org.hamcrest" % "hamcrest-all" % hamcrestVersion % "test,it",
 "com.github.alexarchambault" %% "scalacheck-shapeless_1.13" % scalacheckShapelessVersion % "test,it",
-"me.lyh" %% "shapeless-datatype-core" % shapelessDatatypeVersion % "test"
+"me.lyh" %% "shapeless-datatype-core" % shapelessDatatypeVersion % "test",
+// Our BloomFilters are Algebird Monoids and hence uses tests from Algebird Test
+"com.twitter" %% "algebird-test" % algebirdVersion % "test"
 )
 )
 .dependsOn(
@@ -18,11 +18,10 @@
 package com.spotify.scio.coders.instances

 import com.spotify.scio.coders.Coder
-import com.twitter.algebird.{BF, Batched, CMS, TopK}
+import com.twitter.algebird.{Batched, CMS, TopK}

 trait AlgebirdCoders {
   implicit def cmsCoder[K]: Coder[CMS[K]] = Coder.kryo
-  implicit def bfCoder[K]: Coder[BF[K]] = Coder.kryo
   implicit def topKCoder[K]: Coder[TopK[K]] = Coder.kryo
   implicit def batchedCoder[U]: Coder[Batched[U]] = Coder.kryo
 }
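
The commit messages above describe moving the pair SCollection functions to mutable Bloom filters that can be merged like Algebird Monoids. As a rough illustration of that idea only, here is a minimal sketch of a mutable, mergeable Bloom filter backed by `java.util.BitSet`. The class name `SimpleMutableBloomFilter`, its methods, and the choice of seeded `MurmurHash3` hashing are all assumptions made for this example; none of them is taken from the PR or from Algebird.

```scala
import java.util.BitSet
import scala.util.hashing.MurmurHash3

// Hypothetical illustration; NOT the BloomFilter implementation added in this PR.
final class SimpleMutableBloomFilter(val numBits: Int, val numHashes: Int) {
  require(numBits > 0 && numHashes > 0, "numBits and numHashes must be positive")

  private val bits = new BitSet(numBits)

  // Derive the i-th bit index from a MurmurHash3 of the item, seeded with i.
  private def index(item: String, i: Int): Int = {
    val h = MurmurHash3.stringHash(item, i)
    ((h % numBits) + numBits) % numBits // map possibly negative hashes into [0, numBits)
  }

  // Insert in place instead of allocating a new filter per element.
  def +=(item: String): this.type = {
    var i = 0
    while (i < numHashes) {
      bits.set(index(item, i))
      i += 1
    }
    this
  }

  // In-place union with a filter of the same shape (the monoid "plus").
  def ++=(other: SimpleMutableBloomFilter): this.type = {
    require(other.numBits == numBits && other.numHashes == numHashes, "shape mismatch")
    bits.or(other.bits)
    this
  }

  // May return false positives, never false negatives.
  def mightContain(item: String): Boolean =
    (0 until numHashes).forall(i => bits.get(index(item, i)))
}

object SimpleMutableBloomFilterDemo {
  def main(args: Array[String]): Unit = {
    val left = new SimpleMutableBloomFilter(1 << 16, 3)
    val right = new SimpleMutableBloomFilter(1 << 16, 3)
    left += "key-1"
    right += "key-2"
    left ++= right // left now covers the keys of both filters
    println(left.mightContain("key-1"))
    println(left.mightContain("key-2"))
  }
}
```

Mutating the bit set in place avoids allocating a new filter for every inserted key, which is the usual motivation for a mutable variant. The trade-off is that callers must not reuse a filter after merging it into another one if they expect it to stay unchanged.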