Skip to content

fix MAX_INTERMEDIATE_SIZE for DoubleMeanHolder#9568

Merged
himanshug merged 4 commits intoapache:masterfrom
aP0StAl:fix-mean-holder
Mar 28, 2020
Merged

fix MAX_INTERMEDIATE_SIZE for DoubleMeanHolder#9568
himanshug merged 4 commits intoapache:masterfrom
aP0StAl:fix-mean-holder

Conversation

@aP0StAl
Copy link
Contributor

@aP0StAl aP0StAl commented Mar 26, 2020

Description

Fixed the bug with allocating a large amount of memory in the buffer

@himanshug
Copy link
Contributor

+1 after build, thanks , good catch.

How did you get to this? this was really sneaky.

@himanshug himanshug added this to the 0.18.0 milestone Mar 26, 2020
@himanshug himanshug added the Bug label Mar 26, 2020
Copy link
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice finding! Would you add a unit test? You can add a test method to a new class DoubleMeanAggregatorFactoryTest that verifies DoubleMeanAggregatorFactory.getMaxIntermediateSize().

@aP0StAl
Copy link
Contributor Author

aP0StAl commented Mar 26, 2020

@himanshug I found this when I was trying to figure out how to implement a module to solve this issue.
I think I found another bug:

@Override
public Object deserialize(Object object)
{
if (object instanceof String) {
return DoubleMeanHolder.fromBytes(StringUtils.decodeBase64(StringUtils.toUtf8((String) object)));
} else if (object instanceof DoubleMeanHolder) {
return object;
} else {
throw new IAE("Unknown object type [%s]", Utils.safeObjectClassGetName(object));
}
}
@Nullable
@Override
public Object finalizeComputation(@Nullable Object object)
{
if (object instanceof DoubleMeanHolder) {
return ((DoubleMeanHolder) object).mean();
} else if (object == null) {
return null;
} else {
throw new IAE("Unknown object type [%s]", object.getClass().getName());
}
}

In some cases, the object may turn out to be byte[] . It's right?

@aP0StAl
Copy link
Contributor Author

aP0StAl commented Mar 26, 2020

@jihoonson Ok. I will add a test tomorrow.

@himanshug
Copy link
Contributor

got it, so you were looking at the code and knew Long.SIZE wan't number of bytes but bits.

In some cases, the object may turn out to be byte[] . It's right?

at that point we are never gonna get a byte array.

@aP0StAl
Copy link
Contributor Author

aP0StAl commented Mar 26, 2020

Are you sure? I had an exception before I fixed a similar code (deserialize and finalizeComputation):
} else if (object instanceof byte[]) {
return DoubleSumVersionFilterHolder.fromBytes((byte[]) object).getSum();

Unknown Exception / Unknown object type [[B] / org.apache.druid.java.util.common.IAE

@himanshug
Copy link
Contributor

himanshug commented Mar 26, 2020

can you send me the druid query that reproduces behavior that of getting byte[] in doubleMean deserialize(..) ?

@aP0StAl
Copy link
Contributor Author

aP0StAl commented Mar 27, 2020

@himanshug It can be either deserialize or finalizeComputation:

{
"queryType": "groupBy",
"dataSource": "test_4",
"granularity": "all",
"dimensions": [],
"aggregations": [
{ "type": "doubleMean", "name": "value", "fieldName": "value"}
],
"intervals": [ "2018-12-01T00:00:00.000/2018-12-02T00:00:00.000" ]
}

@aP0StAl aP0StAl closed this Mar 27, 2020
@aP0StAl aP0StAl reopened this Mar 27, 2020
@himanshug
Copy link
Contributor

ok, can you please add the byte[] type handling in deserialize(..) and finalizeComputation(..) please. thanks

return DoubleMeanHolder.fromBytes(StringUtils.decodeBase64(StringUtils.toUtf8((String) object)));
} else if (object instanceof DoubleMeanHolder) {
return object;
} else if (object instanceof byte[]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make byte[] be the firs if check in both places ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But DoubleMeanHolder in both places too. Maybe we will replace only String in deserialize(..)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant, just move the byte[] check as being the first check in if-statement chain as I am expecting that would be the case most of the time.

@himanshug himanshug merged commit 9081b5f into apache:master Mar 28, 2020
himanshug pushed a commit to himanshug/druid that referenced this pull request Mar 28, 2020
* fix MAX_INTERMEDIATE_SIZE for DoubleMeanHolder

* byte[] type handling in deserialize and finalizeComputation for DoubleMeanAggregatorFactory

* DoubleMeanAggregatorFactory tests: Max Intermediate Size, Deserialize, finalizeComputation

* moved byte[] check to first position

Co-authored-by: Stanislav <S.Poryadnyi@abcconsulting.ru>
clintropolis pushed a commit that referenced this pull request Mar 31, 2020
* fix MAX_INTERMEDIATE_SIZE for DoubleMeanHolder (#9568)

* fix MAX_INTERMEDIATE_SIZE for DoubleMeanHolder

* byte[] type handling in deserialize and finalizeComputation for DoubleMeanAggregatorFactory

* DoubleMeanAggregatorFactory tests: Max Intermediate Size, Deserialize, finalizeComputation

* moved byte[] check to first position

Co-authored-by: Stanislav <S.Poryadnyi@abcconsulting.ru>

* remove commons-lang3  usage from DoubleMeanAggregatorFactoryTest

Co-authored-by: Stanislav Poryadnyi <37914083+aP0StAl@users.noreply.github.com>
Co-authored-by: Stanislav <S.Poryadnyi@abcconsulting.ru>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants