Script: ulong via fields API #76519

Merged
merged 16 commits into from
Aug 17, 2021

stu-elastic
Contributor

Exposes unsigned long via the fields API.

Unsigned longs default to Java signed longs, which means the upper range
appears negative. Consumers should use `Long.compareUnsigned(long, long)`,
`Long.divideUnsigned(long, long)` and `Long.remainderUnsigned(long, long)`
to correctly work with values known to be unsigned long.
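
As a rough illustration of why the unsigned helpers matter, the following standalone sketch stores the largest unsigned value in a signed `long` and compares signed and unsigned handling. `UnsignedLongDemo` and `unsignedLessThan` are hypothetical names used only for this example; they are not part of the change.

```java
// Illustrative only: UnsignedLongDemo is a hypothetical class, not code from this PR.
final class UnsignedLongDemo {
    // 2^64 - 1 stored in a signed long has every bit set, so it reads as -1.
    static final long MAX_UNSIGNED = -1L;

    // Compare two longs as if both held unsigned 64-bit values.
    static boolean unsignedLessThan(long a, long b) {
        return Long.compareUnsigned(a, b) < 0;
    }

    public static void main(String[] args) {
        long big = MAX_UNSIGNED;
        System.out.println(big < 1L);                         // true: the signed comparison is misleading
        System.out.println(unsignedLessThan(big, 1L));        // false: the unsigned comparison is correct
        System.out.println(Long.divideUnsigned(big, 2L));     // 9223372036854775807
        System.out.println(Long.remainderUnsigned(big, 10L)); // 5
    }
}
```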

Alternatively, users may treat the unsigned long type as `BigInteger` using
the field API, `field('ul').as(Field.BigInteger).getValue(BigInteger.ZERO)`.
```
field('ul').as(Field.BigInteger).getValue(BigInteger.valueOf(1000))
field('ul').getValue(1000L)
```
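
The fields API calls above need a running Painless context, so the sketch below only shows the underlying idea in plain Java: reinterpreting the 64 bits of a signed `long` as an unsigned `BigInteger` and back. The class and method names are assumptions made for illustration and may not match how the PR implements the `ulong <-> BigInteger` converter.

```java
import java.math.BigInteger;

// Illustrative only: hypothetical helpers, not the converter code from this PR.
final class UnsignedBigIntegerDemo {
    private static final BigInteger TWO_POW_64 = BigInteger.ONE.shiftLeft(64);

    // Interpret the raw 64 bits of a signed long as an unsigned value.
    static BigInteger toUnsignedBigInteger(long raw) {
        BigInteger signed = BigInteger.valueOf(raw);
        return raw >= 0 ? signed : signed.add(TWO_POW_64);
    }

    // Inverse direction: a BigInteger in [0, 2^64) back to its raw long bits.
    static long toRawLong(BigInteger value) {
        if (value.signum() < 0 || value.bitLength() > 64) {
            throw new ArithmeticException("out of unsigned long range: " + value);
        }
        return value.longValue(); // keeps the low-order 64 bits
    }

    public static void main(String[] args) {
        System.out.println(toUnsignedBigInteger(-1L));                       // 18446744073709551615
        System.out.println(toRawLong(new BigInteger("18446744073709551615"))); // -1
    }
}
```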

This change also implements the beginning of the converters for the fields
API.  The following conversions have been added:
```
ulong <-> BigInteger
long <-> BigInteger
double -> BigInteger
String (parsed as long or double) -> BigInteger
double -> long
String (parsed as long or double) -> long
Date (epoch milliseconds) -> long
Nano Date (epoch nanoseconds) -> long
boolean (1L for true, 0L for false) -> long
```
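
A minimal sketch of what a few of the listed conversions could look like in plain Java follows. All helper names are hypothetical, and the rounding and overflow behavior shown (truncation toward zero for `double -> long`, exact arithmetic for nanoseconds) is an assumption; the description does not state what the actual converters do in those cases.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.time.Instant;

// Illustrative only: hypothetical helpers, not the Converters class from this PR.
final class ConversionSketch {
    // boolean -> long: 1L for true, 0L for false, as stated in the description.
    static long booleanToLong(boolean b) {
        return b ? 1L : 0L;
    }

    // double -> long: assumed here to be a plain cast (truncates toward zero).
    static long doubleToLong(double d) {
        return (long) d;
    }

    // double -> BigInteger: goes through BigDecimal; throws NumberFormatException for NaN or infinity.
    static BigInteger doubleToBigInteger(double d) {
        return new BigDecimal(d).toBigInteger();
    }

    // Date -> long as epoch milliseconds.
    static long dateToLong(Instant instant) {
        return instant.toEpochMilli();
    }

    // Nano Date -> long as epoch nanoseconds; the exact-arithmetic overflow check is an assumption.
    static long nanoDateToLong(Instant instant) {
        long secondsAsNanos = Math.multiplyExact(instant.getEpochSecond(), 1_000_000_000L);
        return Math.addExact(secondsAsNanos, instant.getNano());
    }

    public static void main(String[] args) {
        System.out.println(booleanToLong(true));                      // 1
        System.out.println(doubleToLong(3.9));                        // 3
        System.out.println(doubleToBigInteger(1e19));                 // 10000000000000000000
        System.out.println(nanoDateToLong(Instant.ofEpochSecond(1, 5))); // 1000000005
    }
}
```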

Fixes: elastic#64361
@stu-elastic stu-elastic added >enhancement :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache v8.0.0 v7.15.0 labels Aug 13, 2021
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Aug 13, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@stu-elastic stu-elastic added the auto-backport Automatically create backport pull requests when merged label Aug 13, 2021
Contributor

@jdconrad jdconrad left a comment

Walked through this fully with @stu-elastic and this looks great! Thanks for this change.

Member

@rjernst rjernst left a comment

Looks good! Most of my comments are small nits.

* Converts between one scripting {@link Field} type and another, {@code CF}, with a different underlying
* value type, {@code CT}.
*/
public interface Converter<CT, CF extends Field<CT>> {
Member

nit: maybe these should be TC for target class and FC for Field class?

Contributor Author

Changed.

// No instances, please
private Converters() {}

public static BigIntegerField LongToBigInteger(LongField sourceField) {
Member

These can be private (or package private if we want tests for these), since they are only called in this file right?

Contributor Author

Changed to package private.

* {@link Converters} for scripting fields. These constants are exposed as static fields on {@link Field} to
* allow a user to convert via {@link Field#as(Converter)}.
*/
public class Converters {
Member

I think this can be package private since it is only accessed by Field?

Contributor Author

It's used by UnsignedLongField in x-pack, so we have to leave it public.

}

@Override
public java.math.BigInteger getNonPrimitiveValue() {
Member

nit: why are these fully qualified?

Contributor Author

Removed package prefixes.

return new BigIntegerField(sourceField.getName(), new DelegatingFieldValues<BigInteger, String>(fv) {
    protected long parseNumber(String str) {
        try {
            return Long.parseLong(str);
Member

This will fail on values larger than Long.MAX_VALUE, and then lose precision (or fail again) when trying to parse as double.

I think for string, since the value can be arbitrarily large, we should completely use BigInteger and BigDecimal. So something like this:

try {
    return new BigInteger(str);
} catch (NumberFormatException e) {
    return new BigDecimal(str).toBigInteger();
}
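
To make the failure mode concrete, here is a small standalone sketch comparing the suggested `BigInteger`/`BigDecimal` parsing with the `long` and `double` paths. `StringToBigIntegerDemo`, `parse` and `viaDouble` are hypothetical names for illustration; only the body of `parse` mirrors the snippet above.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Illustrative only: a hypothetical wrapper around the parsing approach suggested above.
final class StringToBigIntegerDemo {
    // Try an exact integer parse first, then fall back to a decimal parse and drop the fraction.
    static BigInteger parse(String str) {
        try {
            return new BigInteger(str);
        } catch (NumberFormatException e) {
            return new BigDecimal(str).toBigInteger();
        }
    }

    // The lossy alternative: parse as double, then convert.
    static BigInteger viaDouble(String str) {
        return new BigDecimal(Double.parseDouble(str)).toBigInteger();
    }

    public static void main(String[] args) {
        String big = "18446744073709551615"; // 2^64 - 1, larger than Long.MAX_VALUE
        System.out.println(parse(big));      // 18446744073709551615
        System.out.println(parse("1.5e3"));  // 1500
        System.out.println(viaDouble(big));  // 18446744073709551616 (precision lost)
        try {
            Long.parseLong(big);
        } catch (NumberFormatException e) {
            System.out.println("Long.parseLong rejects the value");
        }
    }
}
```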

Contributor Author

Ahh nice, changed.

public double getDoubleValue() {
    String str = values.getNonPrimitiveValue();
    try {
        return Double.parseDouble(str);
Member

What's the reason for first trying double? The underlying data could likewise be a BigInteger or BigDecimal that isn't parsable as long. The user asked to convert to long, so if the data can't fit it, it seems like getting a NumberFormatException is appropriate?

Contributor Author

Makes sense, changed to using getLongValue(), which will attempt to parse the long.

/**
* Convert this {@code Field} into another {@code Field} type using a {@link Converter} of the target type.
*
* As this is called from user scripts, {@code as} may be called to convert a field into it's same type, if
Member

nit: it's -> its

Contributor Author

Changed.

* Convert this {@code Field} into another {@code Field} type using a {@link Converter} of the target type.
*
* As this is called from user scripts, {@code as} may be called to convert a field into it's same type, if
* so {@code this} is cast via that converters {@link Converter#getFieldClass()}.
Member

nit: I think "that" can be removed

Contributor Author

Removed.

// UnsignedLongFields must define their own conversions as they are in x-pack
@Override
public <CT, CF extends Field<CT>> Field<CT> as(Converter<CT, CF> converter) {
    if (converter.getFieldClass().isInstance(this)) {
Member

It's unfortunate we have to repeat this. I wonder if we could instead have as be final on Field, and have an implementation method (eg convert) that is protected which can be overridden here, so that the identity check can always be first.

Contributor Author

Added convert which is what subclasses should override.
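
The shape discussed in this thread is a template method: a final `as` that always performs the identity check, delegating to an overridable `convert`. The sketch below shows that pattern with heavily simplified, hypothetical stand-ins (`SketchField`, `SketchConverter`, and so on); the real `Field` and `Converter` signatures in the PR are different and are not reproduced here.

```java
// Illustrative only: every type here is a hypothetical, simplified stand-in for the PR's classes.
final class ConvertTemplateDemo {
    public static void main(String[] args) {
        SketchLongField longField = new SketchLongField(7L);
        // Same-type conversion: the identity check returns the original instance.
        System.out.println(longField.as(new ToLong()) == longField);                 // true
        // Cross-type conversion goes through convert().
        System.out.println(new SketchStringField("42").as(new ToLong()).getValue()); // 42
    }
}

interface SketchConverter<TC, FC extends SketchField<TC>> {
    FC convert(SketchField<?> source);
    Class<FC> getFieldClass();
}

abstract class SketchField<T> {
    abstract T getValue();

    // final, so subclasses cannot skip the identity check.
    public final <TC, FC extends SketchField<TC>> FC as(SketchConverter<TC, FC> converter) {
        if (converter.getFieldClass().isInstance(this)) {
            return converter.getFieldClass().cast(this);
        }
        return convert(converter);
    }

    // Subclasses override this to add their own conversions.
    protected <TC, FC extends SketchField<TC>> FC convert(SketchConverter<TC, FC> converter) {
        return converter.convert(this);
    }
}

final class SketchLongField extends SketchField<Long> {
    private final long value;
    SketchLongField(long value) { this.value = value; }
    @Override Long getValue() { return value; }
}

final class SketchStringField extends SketchField<String> {
    private final String value;
    SketchStringField(String value) { this.value = value; }
    @Override String getValue() { return value; }
}

final class ToLong implements SketchConverter<Long, SketchLongField> {
    @Override public SketchLongField convert(SketchField<?> source) {
        return new SketchLongField(Long.parseLong(String.valueOf(source.getValue())));
    }
    @Override public Class<SketchLongField> getFieldClass() { return SketchLongField.class; }
}
```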

        return converter.getFieldClass().cast(bigIntegerField);
    }

    return super.as(converter);
Member

Should there be a special case for converting to Long, since getLongValue() doesn't return the right long in this case? I.e., if the value is under Long.MAX_VALUE then the long is returned; otherwise an error should be raised rather than a negative number returned. String would be similar; otherwise large values would show up as negative numbers.
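
For reference, a strict conversion along the lines proposed here could be sketched as below. This is a hypothetical helper, not what the PR does; per the reply that follows, the PR returns the raw and possibly negative long.

```java
// Illustrative only: a hypothetical strict conversion, not behavior implemented in this PR.
final class StrictUnsignedToLong {
    // A raw long with the high bit set represents an unsigned value above Long.MAX_VALUE.
    static long toLongExact(long rawUnsigned) {
        if (rawUnsigned < 0) {
            throw new ArithmeticException(
                "unsigned value " + Long.toUnsignedString(rawUnsigned) + " does not fit in a long");
        }
        return rawUnsigned;
    }

    public static void main(String[] args) {
        System.out.println(toLongExact(1000L)); // 1000
        try {
            toLongExact(-1L); // raw bits of 2^64 - 1
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage());
        }
    }
}
```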

Contributor Author

@stu-elastic stu-elastic Aug 17, 2021

UnsignedLongScriptDocValues/UnsignedLongField returns longs with exactly this property: values may be negative.

I expect diligent users will perform "confirming" type conversions and never take the original type from field('fieldname'). In this case, a user who wants to handle possible negative values themselves would only be able to do a field('ul').as(Field.Long) confirming type conversion.

If we want to allow field('ul').as(Field.UnsignedLong), we'll need to implement Field augmentation for painless.

Member

I understand the complication with Long type since that is the underlying native type returned here, but I don’t think conversion to string will work correctly? It should not show negative values, yet I think it will since it would just stringify the raw long value. A minimal test for this conversion would be good.
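
The concern can be reproduced in plain Java without any Elasticsearch types. The sketch below uses hypothetical helper names and only contrasts the two JDK stringification calls; it does not represent a converter from this PR.

```java
// Illustrative only: hypothetical helpers showing the two ways to stringify raw unsigned bits.
final class UnsignedToStringDemo {
    // Stringifies the raw signed value, so large unsigned values come out negative.
    static String naive(long raw) {
        return Long.toString(raw);
    }

    // Treats the same 64 bits as unsigned.
    static String unsigned(long raw) {
        return Long.toUnsignedString(raw);
    }

    public static void main(String[] args) {
        long raw = Long.MIN_VALUE; // raw bits of the unsigned value 2^63
        System.out.println(naive(raw));    // -9223372036854775808
        System.out.println(unsigned(raw)); // 9223372036854775808
    }
}
```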

Contributor Author

Added minimal test.

Member

@rjernst rjernst Aug 17, 2021

Can you point at the test? I can't find it. I was specifically talking about conversion to string, to ensure that unsigned long values do not show up as negative values with this conversion.

Contributor Author

This PR does not implement conversion to String fields, only from String fields. To BigInteger and to Long are implemented here.

@rjernst
Member

rjernst commented Aug 17, 2021

One general comment I forgot to make is I think we should have tests for every method/conversion we have exposed. I realize that is a lot, it doesn’t have to all come in this PR, but at least some minimal verification that this all works as we think it does.

@stu-elastic stu-elastic merged commit aea8bff into elastic:master Aug 17, 2021
@elasticsearchmachine
Collaborator

💔 Backport failed

| Branch | Result |
|--------|--------|
| 7.x | Commit could not be cherrypicked due to conflicts |

To backport manually, run `backport --upstream elastic/elasticsearch --pr 76519`

stu-elastic added a commit to stu-elastic/elasticsearch that referenced this pull request Aug 17, 2021
Backport: aea8bff
stu-elastic added a commit that referenced this pull request Aug 17, 2021
Backport: aea8bff
@stu-elastic
Contributor Author

master (8.x): aea8bff
7.x (7.15.0): 5491d94

2lambda123 pushed a commit to 2lambda123/elastic-elasticsearch that referenced this pull request May 3, 2024
We introduced script values support for unsigned_long in 7.15
in elastic/elasticsearch#76519,
but forgot to add the documentation for 7.x.

This just backports the documentation from
elastic/elasticsearch#64422.
Labels
auto-backport Automatically create backport pull requests when merged :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement Team:Core/Infra Meta label for core/infra team v7.15.0 v8.0.0-alpha2
Development

Successfully merging this pull request may close these issues.

Support unsigned long field type in painless
6 participants