This repository has been archived by the owner on Sep 20, 2022. It is now read-only.

[HIVEMALL-90] Refine incomplete AUC UDAF implementation #63

Closed
takuti wants to merge 11 commits into apache:master from takuti:fix-auc

takuti
Member

@takuti takuti commented Mar 18, 2017

What changes were proposed in this pull request?

Since AUC UDAF (classification) did not work correctly for some specific merge orders, this PR fixes the issue by modifying the UDAF's merge() and terminate() implementation.

Moreover, unit tests are refined accordingly, and a utility method is created in HiveUtils.
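
To illustrate why merge order matters for a distributed AUC, here is a minimal Java sketch of an aggregate whose merge() gives the same result in any order. It is not the implementation in this PR: the class AucMergeSketch, its Partial type, and the per-score count layout are hypothetical, chosen only so that merge() reduces to commutative additions and all order-sensitive trapezoid work is deferred to terminate().

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

public class AucMergeSketch {

    /** Hypothetical partial aggregate: per-score counts of positive and negative labels. */
    static final class Partial {
        // Keys iterate in descending score order; value[0] = positives, value[1] = negatives.
        final TreeMap<Double, long[]> counts = new TreeMap<>(Comparator.reverseOrder());

        void iterate(double score, int label) {
            long[] c = counts.computeIfAbsent(score, k -> new long[2]);
            if (label == 1) {
                c[0]++;
            } else {
                c[1]++;
            }
        }

        /** Adding counts is commutative and associative, so merge order cannot matter. */
        void merge(Partial other) {
            for (Map.Entry<Double, long[]> e : other.counts.entrySet()) {
                long[] c = counts.computeIfAbsent(e.getKey(), k -> new long[2]);
                c[0] += e.getValue()[0];
                c[1] += e.getValue()[1];
            }
        }

        /** Trapezoidal rule over the ROC curve, walking scores from high to low. */
        double terminate() {
            long tp = 0, fp = 0, tpPrev = 0, fpPrev = 0;
            double area = 0.0;
            for (long[] c : counts.values()) {
                tp += c[0];
                fp += c[1];
                area += (fp - fpPrev) * (tp + tpPrev) / 2.0;
                tpPrev = tp;
                fpPrev = fp;
            }
            if (tp == 0 || fp == 0) {
                return Double.NaN; // AUC is undefined without both classes (a choice made for this sketch)
            }
            return area / ((double) tp * fp);
        }
    }

    /** Builds a partial from parallel arrays of scores and 0/1 labels. */
    static Partial of(double[] scores, int[] labels) {
        Partial p = new Partial();
        for (int i = 0; i < scores.length; i++) {
            p.iterate(scores[i], labels[i]);
        }
        return p;
    }

    public static void main(String[] args) {
        // Merge a <- b.
        Partial ab = of(new double[] {0.9, 0.6}, new int[] {1, 0});
        ab.merge(of(new double[] {0.8, 0.7}, new int[] {0, 1}));
        // Merge b <- a.
        Partial ba = of(new double[] {0.8, 0.7}, new int[] {0, 1});
        ba.merge(of(new double[] {0.9, 0.6}, new int[] {1, 0}));
        System.out.println(ab.terminate()); // 0.75
        System.out.println(ba.terminate()); // 0.75
    }
}
```

Keeping every distinct score in the buffer costs memory, which a real UDAF may not be able to afford. The sketch only shows the property the fix is after: the final value must not depend on which partial result is merged into which.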

What type of PR is it?

Bug Fix

What is the Jira issue?

https://issues.apache.org/jira/browse/HIVEMALL-90

How was this patch tested?

  • Unit test
  • Manual test on EMR

How to use this feature?

Usage is unchanged from the current AUC UDAF.

@coveralls

coveralls commented Mar 18, 2017

Coverage increased (+0.2%) to 36.955% when pulling 4937579 on takuti:fix-auc into cb63532 on apache:master.

long fp = PrimitiveObjectInspectorFactory.writableLongObjectInspector.get(fpObj);
long tp = PrimitiveObjectInspectorFactory.writableLongObjectInspector.get(tpObj);
long fpPrev = PrimitiveObjectInspectorFactory.writableLongObjectInspector.get(fpPrevObj);
long tpPrev = PrimitiveObjectInspectorFactory.writableLongObjectInspector.get(tpPrevObj);

Map<Double, Double> areaPartialMap = (Map<Double, Double>) ObjectInspectorFactory.getStandardMapObjectInspector(
Member

ObjectInspectorFactory.getStandardMapObjectInspector(
  PrimitiveObjectInspectorFactory.writableDoubleObjectInspector,
  PrimitiveObjectInspectorFactory.writableLongObjectInspector
)

Invalid cast: Map<DoubleWritable, LongWritable> is cast to Map<Double, Double>.
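
The failure mode behind this comment can be reproduced without Hive. The sketch below is an illustration under stated assumptions, not code from this PR: DoubleBox and LongBox are hypothetical stand-ins for DoubleWritable and LongWritable, and getMap() mimics an API that returns Map<?, ?>. Because generics are erased, the cast compiles with only an unchecked warning and fails later, when an element is read.

```java
import java.util.HashMap;
import java.util.Map;

public class UncheckedMapCastSketch {

    /** Hypothetical stand-in for a writable wrapper around a double. */
    static final class DoubleBox {
        private final double value;
        DoubleBox(double value) { this.value = value; }
        double get() { return value; }
    }

    /** Hypothetical stand-in for a writable wrapper around a long. */
    static final class LongBox {
        private final long value;
        LongBox(long value) { this.value = value; }
        long get() { return value; }
    }

    /** Mimics an inspector-style call that hands back an untyped map of wrapper objects. */
    static Map<?, ?> getMap() {
        Map<DoubleBox, LongBox> m = new HashMap<>();
        m.put(new DoubleBox(0.9), new LongBox(3L));
        return m;
    }

    /** Returns true if reading through the wrongly typed view throws ClassCastException. */
    @SuppressWarnings("unchecked")
    static boolean castFailsOnRead() {
        // Compiles: erasure means nothing is checked at the cast itself.
        Map<Double, Double> wrong = (Map<Double, Double>) getMap();
        try {
            double sum = 0.0;
            for (Map.Entry<Double, Double> e : wrong.entrySet()) {
                sum += e.getKey(); // the compiler-inserted cast to Double fails here
            }
            return false;
        } catch (ClassCastException ex) {
            return true;
        }
    }

    /** Safer alternative: unwrap each key and value explicitly into Java types. */
    static Map<Double, Long> unwrap() {
        Map<Double, Long> out = new HashMap<>();
        for (Map.Entry<?, ?> e : getMap().entrySet()) {
            DoubleBox k = (DoubleBox) e.getKey();
            LongBox v = (LongBox) e.getValue();
            out.put(k.get(), v.get());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(castFailsOnRead()); // true
        System.out.println(unwrap());          // {0.9=3}
    }
}
```

Either unwrapping entry by entry, as in unwrap(), or choosing an inspector that already yields Java objects avoids the mismatch. Which of the two suits the UDAF is a design decision for the PR author.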

@@ -35,7 +39,9 @@
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AbstractAggregationBuffer;
import org.apache.hadoop.hive.serde2.io.DoubleWritable;
import org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryMap;
Member

Unused import

PrimitiveObjectInspectorFactory.writableLongObjectInspector).getMap(
HiveUtils.castLazyBinaryObject(areaPartialMapObj));

Map<Double, Long> fpPartialMap = (Map<Double, Long>) ObjectInspectorFactory.getStandardMapObjectInspector(
Member

terminatePartial() returns Writable objects, but this code receives them as Java objects.

partialResult[2] = new LongWritable(myAggr.fp);
partialResult[3] = new LongWritable(myAggr.tp);
partialResult[4] = new LongWritable(myAggr.fpPrev);
partialResult[5] = new LongWritable(myAggr.tpPrev);
partialResult[6] = myAggr.areaPartialMap;
Member

Revise the OI. Should this be javaDoubleObjectInspector instead of writableDoubleObjectInspector?

@myui
Member

myui commented Apr 9, 2017

@takuti The terminatePartial/merge OIs are invalid.

@takuti
Member Author

takuti commented Apr 11, 2017

@myui Fixed. Please check them.

@coveralls

coveralls commented Apr 11, 2017

Coverage increased (+0.3%) to 37.038% when pulling 810f540 on takuti:fix-auc into cb63532 on apache:master.

@asfgit asfgit closed this in 8aae974 Apr 13, 2017
@myui
Member

myui commented Apr 13, 2017

@takuti LGTM. Merged with small modifications. Thanks!

@takuti takuti deleted the fix-auc branch April 13, 2017 07:30
takuti added a commit to takuti/incubator-hivemall that referenced this pull request Apr 21, 2017