Function Score: Refactor RandomScoreFunction to be consistent, and return values in range [0.0, 1.0] #7446

Closed
wants to merge 5 commits into from
@@ -140,8 +140,9 @@ not.

===== Random

-The `random_score` generates scores via a pseudo random number algorithm
-that is initialized with a `seed`.
+The `random_score` generates scores using a hash of the `_uid` field,
+with a `seed` for variation. If `seed` is not specified, the current
+time is used.

[source,js]
--------------------------------------------------
@@ -20,57 +20,54 @@

 import org.apache.lucene.index.AtomicReaderContext;
 import org.apache.lucene.search.Explanation;
+import org.apache.lucene.util.StringHelper;
+import org.elasticsearch.index.fielddata.AtomicFieldData;
+import org.elasticsearch.index.fielddata.IndexFieldData;
+import org.elasticsearch.index.fielddata.SortedBinaryDocValues;

 /**
  * Pseudo randomly generate a score for each {@link #score}.
  */
 public class RandomScoreFunction extends ScoreFunction {

-    private final PRNG prng;
+    private int originalSeed;
+    private int saltedSeed;
+    private final IndexFieldData<?> uidFieldData;
+    private SortedBinaryDocValues uidByteData;

-    public RandomScoreFunction(long seed) {
+    /**
+     * Creates a RandomScoreFunction.
+     *
+     * @param seed A seed for randomness
+     * @param salt A value to salt the seed with, ideally unique to the running node/index
+     * @param uidFieldData The field data for _uid to use for generating consistent random values for the same id
+     */
+    public RandomScoreFunction(int seed, int salt, IndexFieldData<?> uidFieldData) {
         super(CombineFunction.MULT);
-        this.prng = new PRNG(seed);
+        this.originalSeed = seed;
+        this.saltedSeed = seed ^ salt;
+        this.uidFieldData = uidFieldData;
+        if (uidFieldData == null) throw new NullPointerException("uid missing");
     }

     @Override
     public void setNextReader(AtomicReaderContext context) {
-        // intentionally does nothing
+        AtomicFieldData leafData = uidFieldData.load(context);
+        uidByteData = leafData.getBytesValues();
+        if (uidByteData == null) throw new NullPointerException("failed to get uid byte data");
     }

     @Override
     public double score(int docId, float subQueryScore) {
-        return prng.nextFloat();
+        uidByteData.setDocument(docId);
+        int hash = StringHelper.murmurhash3_x86_32(uidByteData.valueAt(0), saltedSeed);
+        return (hash & 0x00FFFFFF) / (float)(1 << 24); // only use the lower 24 bits to construct a float from 0.0-1.0
     }

     @Override
     public Explanation explainScore(int docId, float subQueryScore) {
         Explanation exp = new Explanation();
-        exp.setDescription("random score function (seed: " + prng.originalSeed + ")");
+        exp.setDescription("random score function (seed: " + originalSeed + ")");
         return exp;
     }
-
-    /**
-     * A non thread-safe PRNG
-     */
-    static class PRNG {
-
-        private static final long multiplier = 0x5DEECE66DL;
-        private static final long addend = 0xBL;
-        private static final long mask = (1L << 48) - 1;
-
-        final long originalSeed;
-        long seed;
-
-        PRNG(long seed) {
-            this.originalSeed = seed;
-            this.seed = (seed ^ multiplier) & mask;
-        }
-
-        public float nextFloat() {
-            seed = (seed * multiplier + addend) & mask;
-            return seed / (float)(1 << 24);
-        }
-
-    }
 }
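
To make the change to `score` easier to follow, here is a minimal standalone sketch of the two approaches. It is an illustration under assumptions, not the code from this PR: `UidScoreSketch`, `mix`, `hash`, `toUnitFloat` and `oldFirstFloat` are hypothetical names, and `mix` is a simple stand-in mixer rather than Lucene's `StringHelper.murmurhash3_x86_32`. It shows that masking a 32-bit hash to its low 24 bits and dividing by 2^24 always lands in [0.0, 1.0) and is repeatable for the same uid and seed, while the removed PRNG divided a 48-bit state by 2^24 and so could return values far above 1.0.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class UidScoreSketch {

    // Stand-in 32-bit mixer (assumption): NOT Lucene's murmurhash3_x86_32.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Hypothetical helper: hash the uid bytes together with the salted seed.
    static int hash(byte[] uid, int saltedSeed) {
        return mix(Arrays.hashCode(uid) ^ saltedSeed);
    }

    // Same arithmetic as the new score(): low 24 bits divided by 2^24.
    static float toUnitFloat(int hash) {
        return (hash & 0x00FFFFFF) / (float) (1 << 24);
    }

    static float score(String uid, int seed, int salt) {
        int saltedSeed = seed ^ salt;
        return toUnitFloat(hash(uid.getBytes(StandardCharsets.UTF_8), saltedSeed));
    }

    // Same arithmetic as the removed PRNG: first value produced for a given seed.
    static float oldFirstFloat(long seed) {
        final long multiplier = 0x5DEECE66DL;
        final long addend = 0xBL;
        final long mask = (1L << 48) - 1;
        long state = (seed ^ multiplier) & mask;
        state = (state * multiplier + addend) & mask;
        return state / (float) (1 << 24); // 48-bit state over 2^24: not bounded by 1.0
    }

    public static void main(String[] args) {
        System.out.println(score("doc#1", 42, 7)); // same inputs always give the same score
        System.out.println(score("doc#1", 42, 7));
        System.out.println(score("doc#2", 42, 7));
        System.out.println(oldFirstFloat(0L));     // old formula, typically much larger than 1.0
    }
}
```

A `float` carries a 24-bit significand, so every 24-bit integer divided by 2^24 is exactly representable, which is presumably why only the lower 24 bits are kept.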
@@ -75,7 +75,7 @@ public static FactorBuilder factorFunction(float boost) {
         return (new FactorBuilder()).boostFactor(boost);
     }

-    public static RandomScoreFunctionBuilder randomFunction(long seed) {
+    public static RandomScoreFunctionBuilder randomFunction(int seed) {
         return (new RandomScoreFunctionBuilder()).seed(seed);
     }

@@ -28,7 +28,7 @@
  */
 public class RandomScoreFunctionBuilder implements ScoreFunctionBuilder {

-    private Long seed = null;
+    private Integer seed = null;
Contributor

Is there a particular reason (which I obviously fail to see :-)) why this is an object and not a regular int?

Contributor

I think it is so that the toXContent method can check if (seed != null)?

Contributor

Makes sense, I rest my case. I'm not sure the no-args constructor makes sense, though: if that one were gone, the seed would never be empty.

Member Author

I just went with what was there. I think having the option to not supply the seed (i.e. you don't care about reproducing, you just want some randomness) is a good option to keep.
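
To illustrate the point discussed in this thread, here is a small sketch of why a boxed `Integer` is convenient when the seed is optional. It assumes a simplified builder: `SeedBuilderSketch` and `toJson` are hypothetical stand-ins for `RandomScoreFunctionBuilder` and `toXContent`, and the JSON is assembled by hand rather than through `XContentBuilder`.

```java
class SeedBuilderSketch {

    // null means "no seed supplied"; a primitive int would need a sentinel value instead.
    private Integer seed = null;

    SeedBuilderSketch seed(int seed) {
        this.seed = seed;
        return this;
    }

    // Hypothetical stand-in for toXContent: the field is emitted only when a seed was set.
    String toJson() {
        StringBuilder sb = new StringBuilder("{\"random_score\":{");
        if (seed != null) {
            sb.append("\"seed\":").append(seed.intValue());
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        System.out.println(new SeedBuilderSketch().toJson());
        System.out.println(new SeedBuilderSketch().seed(7).toJson());
    }
}
```

With no seed set, the request body carries no `seed` field at all, which leaves the choice of seed to the server side.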

Is there a reason this was changed from a long to an int? In 1.4 I can no longer use this function because the seed I was using matches data that is only 64 bits.

Member Author

@harmsk This was due to using a 32-bit hash. It was fixed later in #8311, so longs (as well as strings) work.
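
For context on the narrowing discussed above, the following sketch shows what is lost when a 64-bit seed is cast to `int`, and one possible way a caller could fold it instead. This is an assumption for illustration only: it is not what #8311 does, and `SeedNarrowingSketch`, `truncate` and `fold` are hypothetical names.

```java
class SeedNarrowingSketch {

    // A plain cast keeps only the low 32 bits of the seed.
    static int truncate(long seed) {
        return (int) seed;
    }

    // Long.hashCode XORs the high and low halves, so both halves influence the result.
    static int fold(long seed) {
        return Long.hashCode(seed);
    }

    public static void main(String[] args) {
        long a = 1L;
        long b = (1L << 32) | 1L; // differs from a only in the high 32 bits
        System.out.println(truncate(a) == truncate(b)); // the cast cannot tell them apart
        System.out.println(fold(a) == fold(b));         // folding keeps them distinct here
    }
}
```

Folding still maps 2^64 inputs onto 2^32 outputs, so collisions remain possible; it only avoids ignoring the high bits entirely.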


public RandomScoreFunctionBuilder() {
}
@@ -44,7 +44,7 @@ public String getName() {
      *
      * @param seed The seed.
      */
-    public RandomScoreFunctionBuilder seed(long seed) {
+    public RandomScoreFunctionBuilder seed(int seed) {
         this.seed = seed;
         return this;
     }
@@ -53,7 +53,7 @@ public RandomScoreFunctionBuilder seed(long seed) {
     public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
         builder.startObject(getName());
         if (seed != null) {
-            builder.field("seed", seed.longValue());
+            builder.field("seed", seed.intValue());
         }
         return builder.endObject();
     }
@@ -24,6 +24,8 @@
 import org.elasticsearch.common.lucene.search.function.RandomScoreFunction;
 import org.elasticsearch.common.lucene.search.function.ScoreFunction;
 import org.elasticsearch.common.xcontent.XContentParser;
+import org.elasticsearch.index.fielddata.IndexFieldData;
+import org.elasticsearch.index.mapper.FieldMapper;
 import org.elasticsearch.index.query.QueryParseContext;
 import org.elasticsearch.index.query.QueryParsingException;
 import org.elasticsearch.index.query.functionscore.ScoreFunctionParser;
@@ -32,9 +34,6 @@

 import java.io.IOException;

-/**
- *
- */
 public class RandomScoreFunctionParser implements ScoreFunctionParser {

     public static String[] NAMES = { "random_score", "randomScore" };
@@ -51,7 +50,7 @@ public String[] getNames() {
     @Override
     public ScoreFunction parse(QueryParseContext parseContext, XContentParser parser) throws IOException, QueryParsingException {

-        long seed = -1;
+        int seed = -1;

         String currentFieldName = null;
         XContentParser.Token token;
@@ -60,28 +59,23 @@ public ScoreFunction parse(QueryParseContext parseContext, XContentParser parser
                 currentFieldName = parser.currentName();
             } else if (token.isValue()) {
                 if ("seed".equals(currentFieldName)) {
-                    seed = parser.longValue();
+                    seed = parser.intValue();
                 } else {
                     throw new QueryParsingException(parseContext.index(), NAMES[0] + " query does not support [" + currentFieldName + "]");
                 }
             }
         }

         if (seed == -1) {
-            seed = parseContext.nowInMillis();
+            seed = (int)parseContext.nowInMillis();
         }

         ShardId shardId = SearchContext.current().indexShard().shardId();
-        seed = salt(seed, shardId.index().name(), shardId.id());
+        int salt = (shardId.index().name().hashCode() << 10) | shardId.id();

-        return new RandomScoreFunction(seed);
-    }
+        final FieldMapper<?> mapper = SearchContext.current().mapperService().smartNameFieldMapper("_uid");
+        IndexFieldData<?> uidFieldData = SearchContext.current().fieldData().getForField(mapper);

-    public static long salt(long seed, String index, int shardId) {
-        long salt = index.hashCode();
-        salt = salt << 32;
-        salt |= shardId;
-        return salt^seed;
+        return new RandomScoreFunction(seed, salt, uidFieldData);
     }

}
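
As a rough illustration of the new salting step in the parser, the sketch below reproduces the arithmetic in isolation. `SaltSketch`, `salt` and `saltedSeed` are hypothetical names; the index name and shard id are passed in directly here instead of being read from `SearchContext`.

```java
class SaltSketch {

    // Same expression as the parser: index-name hash shifted left by 10 bits, OR-ed with the shard id.
    static int salt(String indexName, int shardId) {
        return (indexName.hashCode() << 10) | shardId;
    }

    // Same expression as the RandomScoreFunction constructor.
    static int saltedSeed(int seed, int salt) {
        return seed ^ salt;
    }

    public static void main(String[] args) {
        int seed = 5;
        // Shards of the same index get different salts, and so different salted seeds.
        System.out.println(saltedSeed(seed, salt("a", 0)));
        System.out.println(saltedSeed(seed, salt("a", 1)));
    }
}
```

The shift leaves 10 low bits for the shard id, so shard ids of 1024 or more would overlap with the hash bits, and the top 10 bits of the index-name hash are discarded.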

This file was deleted.