Add String Column Support for Count Distinct Aggregation (#1196)
* add string support for count distinct column

* fix: use native hashing algorithm and fix bug
atangwbd committed Jun 28, 2023
1 parent b4b4e02 commit 169b86e
Showing 1 changed file with 3 additions and 1 deletion.
@@ -186,7 +186,9 @@ private[offline] object SlidingWindowFeatureUtils {
     // In Feathr's use case, we want to treat the count aggregation as simple count of non-null items.
     val rewrittenDef = s"CASE WHEN ${featureDef} IS NOT NULL THEN 1 ELSE 0 END"
     new CountAggregate(rewrittenDef)
-  case AggregationType.COUNT_DISTINCT => new CountDistinctAggregate(featureDef)
+  case AggregationType.COUNT_DISTINCT =>
+    val rewrittenDef = s"CASE WHEN ${featureDef} IS NOT NULL THEN hash(${featureDef}) ELSE 0 END"
+    new CountDistinctAggregate(rewrittenDef)
   case AggregationType.AVG => new AvgAggregate(featureDef)
   case AggregationType.MAX => new MaxAggregate(featureDef)
   case AggregationType.MIN => new MinAggregate(featureDef)
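
For reviewers, here is a minimal sketch of the string rewrite this change applies to COUNT_DISTINCT feature definitions. The object `CountDistinctRewriteSketch`, the helper `rewriteCountDistinct`, and the column name `user_city` are hypothetical and exist only for illustration. The sketch assumes the resulting expression is later evaluated by Spark SQL, where `hash` is a built-in function that returns an integer. That would presumably let the aggregate work on numeric values even when the underlying column is a string.

```scala
// Hypothetical helper, not part of Feathr: it only mirrors the string
// interpolation added in the diff so the generated SQL can be inspected.
object CountDistinctRewriteSketch {

  // Wraps a feature definition so that non-null values are replaced by their
  // hash and nulls are mapped to 0, as in the COUNT_DISTINCT branch above.
  def rewriteCountDistinct(featureDef: String): String =
    s"CASE WHEN ${featureDef} IS NOT NULL THEN hash(${featureDef}) ELSE 0 END"

  def main(args: Array[String]): Unit = {
    // "user_city" is an assumed example column name.
    println(rewriteCountDistinct("user_city"))
    // prints: CASE WHEN user_city IS NOT NULL THEN hash(user_city) ELSE 0 END
  }
}
```

Two things may be worth checking in review: null inputs are rewritten to the literal 0 rather than skipped, and two distinct strings that hash to the same integer would be counted as one value.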