New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FunctionManager #11851

Open
wants to merge 3 commits into
base: master
from

Conversation

@rongrong
Contributor

rongrong commented Nov 5, 2018

Add FunctionManager and FunctionNamespace, and use FunctionManager instead of FunctionRegistry through out the codebase.

@nezihyigitbasi

Add FunctionNamespace interface and StaticFunctionNamespace

  • the commit message detail lines can be longer (72 characters), no need to wrap them at a shorter width
Show resolved Hide resolved ...o-main/src/main/java/com/facebook/presto/metadata/FunctionNamespace.java Outdated
Signature resolveFunction(QualifiedName name, List<TypeSignatureProvider> parameterTypes);
WindowFunctionSupplier getWindowFunctionImplementation(Signature signature);

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor

nit: getAggregateFunctionImplementation/getScalarFunctionImplementation return an implementation, but getWindowFunctionImplementation returns a supplier for the implementation. What do you think about naming this method getWindowFunctionSupplier or getWindowFunctionImplementationSupplier?

Show resolved Hide resolved ...o-main/src/main/java/com/facebook/presto/metadata/FunctionNamespace.java Outdated
@nezihyigitbasi

Add catalog and schema to Signature LGTM

I don't recall exactly why we wanted to add catalog and schema to the Signature, but it was probably because it was simplifying the implementation.

this.schema = requireNonNull(schema, "schema is null");
}
public Signature(String name,

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor
public Signature(
        String name,
@@ -109,6 +127,19 @@ public static Signature internalScalarFunction(String name, TypeSignature return
return new Signature(name, SCALAR, ImmutableList.of(), ImmutableList.of(), returnType, argumentTypes, false);
}
public static Signature qualifySignature(Signature signature, String catalog, String schema)
{
return new Signature(signature.name,

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor
return new Signature(
        signature.name,
@nezihyigitbasi

Add FunctionManager as new function resolution class

public FunctionManager(TypeManager typeManager, BlockEncodingSerde blockEncodingSerde, FeaturesConfig featuresConfig)
{
functionNamespaces = ImmutableMap.of();

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor

nit: We can use this to be consistent and also change the order a bit

this.typeManager = requireNonNull(typeManager, "typeManager is null");
this.blockEncodingSerde = requireNonNull(blockEncodingSerde, "blockEncodingSerde is null");
this.featuresConfig = requireNonNull(featuresConfig, "featuresConfig is null");
this.functionNamespaces = ImmutableMap.of();
this.operatorNamespace = new StaticFunctionNamespace(typeManager, blockEncodingSerde, featuresConfig);
addFunctions(TEMP_DEFAULT_CATALOG, functions);
}
public synchronized void addFunctions(String catalog, List<? extends SqlFunction> functions)

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor

Do we need this method to be public? For now we will only have a default catalog/schema, so the other method should be enough, and we can mark this as private and in the future if we need this to be public we can update it.

public synchronized void addFunctions(String catalog, List<? extends SqlFunction> functions)
{
Map<Boolean, List<SqlFunction>> operatorAndFunctionMap = functions.stream()
.collect(Collectors.partitioningBy(function -> isOperator(function.getSignature())));

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor

static import partitioningBy

public Signature resolveFunction(Session session, QualifiedName name, List<TypeSignatureProvider> parameterTypes)
{
try {
return operatorNamespace.resolveFunction(name, parameterTypes);

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor

I think we should have two methods: resolveFunction and resolveOperator instead of trying to do both in the same method.

}
for (SqlPathElement element : session.getPath().getParsedPath()) {
FunctionNamespace namespace = functionNamespaces.get(getCatalog(element, session));
if (namespace != null) {

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor

If we check against null and continue, that would help us get rid of the nestedness below:

if (namespace == null) {
    continue;
}
try {
    Signature signature = namespace.resolveFunction(name, parameterTypes);
    return qualifySignature(signature, getCatalog(element, session), TEMP_DEFAULT_SCHEMA);
}
catch (PrestoException e) {
    if (!e.getErrorCode().equals(FUNCTION_NOT_FOUND.toErrorCode())) {
        throw e;
    }
}
public WindowFunctionSupplier getWindowFunctionImplementation(Signature signature)
{
try {
return operatorNamespace.getWindowFunctionImplementation(signature);

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor

Looks like we do in many of these methods:

  • try resolving in the operator namespace
  • on exception, try the function namespace

Is it easy to get rid of this behavior? (I don't really remember/know why we check the operator namespace for functions).

This comment has been minimized.

@rongrong

rongrong Nov 7, 2018

Contributor

I don't really like how the code looks with handling unresolved functions through exceptions. It would look nicer if the namespace.resolveFunction would just return null and only FunctionManager would throw.

throw e;
}
}
throw new OperatorNotFoundException(

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor

Why don't we throw e above when it is an instance of OperatorNotFoundException? Doesn't it already include the info we need?

throw e;
}
}
throw new OperatorNotFoundException(OperatorType.CAST, ImmutableList.of(fromType), toType);

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 6, 2018

Contributor

ditto

@rongrong rongrong force-pushed the rongrong:function branch from 0feb8d8 to 5b4fc90 Nov 7, 2018

@rongrong

This comment has been minimized.

Contributor

rongrong commented Nov 7, 2018

A few changes:

  • removed the operatorNamespace from FunctionManager for now cause it's not obvious from these diffs why this is needed. We can introduce it when necessary.
  • Separate the global function namespace out as a special one. Currently this would have no catalog and schema attached to it (the catalog, schema from the resolved function would be null).
  • The map of dynamic function namespaces in FunctionManager will be a Map<SqlPathElement, FunctionNamespace> so each schema can define their own functions.
public synchronized void addFunctions(String catalog, String schema, List<? extends SqlFunction> functions)
{
if (!dynamicFunctionNamespaces.containsKey(catalog)) {

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 7, 2018

Contributor

Now that dynamicFunctionNamespaces is a Map<SqlPathElement, FunctionNamespace>, you should fix the various containsKey and get calls here and below as they pass String to those methods instead of SqlPathElement.

public class FunctionManager
{
private final FunctionNamespace globalFunctionNamespace;
private volatile Map<SqlPathElement, FunctionNamespace> dynamicFunctionNamespaces;

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 7, 2018

Contributor

Currently we only have a single subclass (StaticFunctionNamespace) of FunctionNamespace, and we use instances of StaticFunctionNamespace for the dynamicFunctionNamespaces field, which is confusing. Looking at the rest of the patch, do you think we will need a DynamicFunctionNamespace class? If we won't, maybe we should make FunctionNamespace a class instead of an interface, and use it for both static/dynamic namespaces. Thoughts?

This comment has been minimized.

@rongrong

rongrong Nov 8, 2018

Contributor

Within this three diffs dynamicFunctionNamespaces are not really used. If we really don't want to introduce potentially unnecessary logic, I can change FunctionNamespace to be a class for now (and we can change that later when needed). We can remove the dynamicFunctionNamespaces private variable all together for now.

try {
return globalFunctionNamespace.resolveFunction(name, parameterTypes);
}
catch (Exception e) {

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 7, 2018

Contributor

As you pointed out offline, I also don't like the way we do resolution through catching exceptions. We should get rid of this.

boolean canResolveOperator(OperatorType operatorType, Type returnType, List<? extends Type> argumentTypes);
Signature resolveOperator(OperatorType operatorType, List<? extends Type> argumentTypes)
throws OperatorNotFoundException;

This comment has been minimized.

@electrum

electrum Nov 8, 2018

Collaborator

Nit: don't declare unchecked exceptions

}
public Signature resolveOperator(OperatorType operatorType, List<? extends Type> argumentTypes)
throws OperatorNotFoundException

This comment has been minimized.

@electrum

electrum Nov 8, 2018

Collaborator

Same here

public StaticFunctionNamespace(TypeManager typeManager, BlockEncodingSerde blockEncodingSerde, FeaturesConfig featuresConfig)
{
functionRegistry = new FunctionRegistry(typeManager, blockEncodingSerde, featuresConfig);

This comment has been minimized.

@electrum

electrum Nov 8, 2018

Collaborator

Rather than passing in the exact arguments needed to construct a FunctionRegistry and then constructing it directly (without any "added value"), we should instead pass in a FunctionRegistry directly.

try {
return globalFunctionNamespace.getWindowFunctionSupplier(signature);
}
catch (PrestoException e) {

This comment has been minimized.

@electrum

electrum Nov 8, 2018

Collaborator

Name the parameter ignored

Same goes for below

This comment has been minimized.

@dain

dain Nov 12, 2018

Contributor

For context, naming it ignored (or `expected) suppresses the IntelliJ warning about unused exception field.

return globalFunctionNamespace.getWindowFunctionSupplier(signature);
}
catch (PrestoException e) {
//do nothing and try to resolve normally

This comment has been minimized.

@electrum

electrum Nov 8, 2018

Collaborator

nit: space after //

Same goes for below

try {
return globalFunctionNamespace.getCoercion(fromType, toType);
}
catch (PrestoException e) {

This comment has been minimized.

@electrum

electrum Nov 8, 2018

Collaborator

I believe this can be simplified to

catch (OperatorNotFoundException e) {
    throw new OperatorNotFoundException(...);
}
{
private static final String OPERATOR_PREFIX = "$operator$";
private FunctionUtils()

This comment has been minimized.

@electrum

electrum Nov 8, 2018

Collaborator

Nit: declare empty constructor as

private FunctionUtils() {}
*/
package com.facebook.presto.metadata;
public class FunctionUtils

This comment has been minimized.

@electrum

electrum Nov 8, 2018

Collaborator

Utility classes should be final

{
}
public static boolean isOperator(Signature signature)

This comment has been minimized.

@electrum

electrum Nov 8, 2018

Collaborator

This method is not used anywhere

@@ -40,6 +41,8 @@
private final TypeSignature returnType;
private final List<TypeSignature> argumentTypes;
private final boolean variableArity;
private final Optional<String> catalog;

This comment has been minimized.

@electrum

electrum Nov 8, 2018

Collaborator

I don't think it's valid to have a schema without a catalog. This should probably be Optional<CatalogSchemaName>

@rongrong rongrong force-pushed the rongrong:function branch 3 times, most recently from 12a1d61 to f347597 Nov 11, 2018

@nezihyigitbasi

"Add FunctionNamespace and FunctionManager" looks good.

@nezihyigitbasi

"Move static methods from FunctionRegistry to FunctionUtils" looks good.

@@ -143,7 +143,7 @@ public SqlScalarFunction build()
private static boolean isOperator(Signature signature)
{
for (OperatorType operator : OperatorType.values()) {
if (signature.getName().equals(FunctionRegistry.mangleOperatorName(operator))) {
if (signature.getName().equals(FunctionUtils.mangleOperatorName(operator))) {

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 12, 2018

Contributor

static import

@@ -202,12 +202,12 @@ public Expression toExpression(Object object, Type type)
// able to encode it in the plan that gets sent to workers.
// We do this by transforming the in-memory varbinary into a call to from_base64(<base64-encoded value>)
FunctionCall fromBase64 = new FunctionCall(QualifiedName.of("from_base64"), ImmutableList.of(new StringLiteral(VarbinaryFunctions.toBase64((Slice) object).toStringUtf8())));
Signature signature = FunctionRegistry.getMagicLiteralFunctionSignature(type);
Signature signature = FunctionUtils.getMagicLiteralFunctionSignature(type);

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 12, 2018

Contributor

static import (also the ones at L209 and L210)

@@ -98,7 +98,7 @@
public static class BenchmarkData
{
@Param({"$operator$hash_code", "old_hash", "another_hash"})
private String name = FunctionRegistry.mangleOperatorName(HASH_CODE);
private String name = FunctionUtils.mangleOperatorName(HASH_CODE);

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 12, 2018

Contributor

static import

@nezihyigitbasi

Add CatalogSchemaName to Signature looks good.

@@ -49,7 +52,8 @@ public Signature(
@JsonProperty("longVariableConstraints") List<LongVariableConstraint> longVariableConstraints,
@JsonProperty("returnType") TypeSignature returnType,
@JsonProperty("argumentTypes") List<TypeSignature> argumentTypes,
@JsonProperty("variableArity") boolean variableArity)
@JsonProperty("variableArity") boolean variableArity,
@JsonProperty("catalog") Optional<CatalogSchemaName> catalogSchemaName)

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 12, 2018

Contributor

@JsonProperty("catalog") -> @JsonProperty("catalogSchemaName") ?

@nezihyigitbasi

Use FunctionManager instead of FunctionRegistry in Metadata looks good.

@@ -69,7 +69,6 @@ public boolean isAggregationFunction(QualifiedName name)
throw new UnsupportedOperationException();
}
@Override

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 12, 2018

Contributor

why is this removed?

BlockEncodingManager blockEncodingManager = new BlockEncodingManager(typeManager);
FeaturesConfig featuresConfig = new FeaturesConfig();
FunctionManager functionManager = new FunctionManager(typeManager, blockEncodingManager, featuresConfig);
return new FunctionRegistry(typeManager, blockEncodingManager, new FeaturesConfig(), functionManager);

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Nov 12, 2018

Contributor

We create multiple FeaturesConfigs in this method (another one is created on L299).

@nezihyigitbasi

Overall LGTM.

@martint it would be nice if you can also take a quick look.

@dain dain self-requested a review Nov 12, 2018

@dain dain self-assigned this Nov 12, 2018

@dain

This comment has been minimized.

Contributor

dain commented Nov 12, 2018

I would like to review this, so please don't merge yet

@dain

This comment has been minimized.

Contributor

dain commented Nov 12, 2018

The author of these commits should be Cole Bowden <bowdencm@fb.com>

@dain

@rongrong instead of saying This PR contains the first 3 commits of #11161 with related build failures fixed. can you let me know what the change actually is, and maybe for this one, the motivation. I ask because this is a pretty small part of the changes from Cole, and removes the multiple namespaces part of his PR.

It looks like this change is basically to hide FunctionRegistry using this abstraction chain: Metadata -> FunctionManager->`` -> FunctionNamespace -> `FunctionRegistry`. If so, I think we can make `FunctionNamespace` and `FunctionRegistry` package protected to prevent further leaking of their details.

It doesn't seem like Signature.getCatalogSchemaName() is ever called, so I'm not sure why we are making this change now. (motivation and context would help here).

Other than the desire for more context this seems very straight forward.

import static com.facebook.presto.spi.type.VarbinaryType.VARBINARY;
import static com.google.common.base.Preconditions.checkArgument;
public final class FunctionUtils

This comment has been minimized.

@dain

dain Nov 12, 2018

Contributor

I feel like this should have a more specific name like: FunctionSignatureUtils or SignatureMangler.... not sureI like those two, but something more specific than "utilities for functions".

This comment has been minimized.

@rongrong

rongrong Nov 13, 2018

Contributor

This is one of the commits that's more or less still kept its look from the original PR. I think the intention is to move the static methods out so other classes don't reference FunctionRegistry directly. However, if you look at the methods here, only mangleOperatorName is widely used. All others can potentially be moved somewhere else. I was debating whether I want to do that now, but figured that maybe it's better to do it later once FunctionRegistry is removed and the logic for operators and functions are sorted out into proper places.

This comment has been minimized.

@dain

dain Nov 13, 2018

Contributor

If we can make the literal changes I suggested, I think we can rename this to something like OperatorSignatureUtils

@rongrong

This comment has been minimized.

Contributor

rongrong commented Nov 13, 2018

@dain The author is no longer Cole because these are pretty much all reimplemented. The structure and content of all commits you are seeing now are dramatically changed compare to the original PR. There were a lot of git reset due to restructure of diffs which caused the name change. Not sure what's your motivation for put Cole's name behind it. I can change them to Cole, but these commits are probably 80%+ done by me at the moment and I should be responsible for them.

@rongrong

This comment has been minimized.

Contributor

rongrong commented Nov 13, 2018

For the change to Signature, the only reason it's here is that adding Catalog and Schema was the first commit in the original PR. I agree that it's not related. We can move that out.

@rongrong rongrong force-pushed the rongrong:function branch 3 times, most recently from a23415b to 297aaaa Nov 13, 2018

@dain

I agree about dropping the Signature change for now, as it will be confusing to everyone since it is not used. I also suggest that Add FunctionNamespace and FunctionManager and Use FunctionManager instead of FunctionRegistry in Metadata be squashed together; they are not really independent, and I like to see the code along with the use if possible. Finally for the, Move static methods from FunctionRegistry to FunctionUtils commit, I have suggestions for different organization (BTW, if you do that the commit will need a new title).

I read through both PRs, and this looks like all new code to me also, so the author tag is correct.

return OPERATOR_PREFIX + operatorName;
}
public static Type typeForMagicLiteral(Type type)

This comment has been minimized.

@dain

dain Nov 13, 2018

Contributor

I'm pretty sure this is @ VisibleForTesting also

This comment has been minimized.

@rongrong

rongrong Nov 14, 2018

Contributor

It's used in LiteralEncoder as well.

private FunctionUtils() {}
public static Signature getMagicLiteralFunctionSignature(Type type)

This comment has been minimized.

@dain

dain Nov 13, 2018

Contributor

Ok did a bunch of digging... I suggest we move the literal related functions to LiteralEncoder. I also suggest we hide the "magic" literal stuff a bit better, by introducing this method:

    private static Optional<Signature> resolveLiteralFunction(QualifiedName name, List<TypeSignatureProvider> parameterTypes, TypeManager typeManager)
    {
        if (!name.getSuffix().startsWith(MAGIC_LITERAL_FUNCTION_PREFIX)) {
            return Optional.empty();
        }

        // extract type from function name
        String typeName = name.getSuffix().substring(MAGIC_LITERAL_FUNCTION_PREFIX.length());

        // lookup the type
        Type type = typeManager.getType(parseTypeSignature(typeName));

        // verify we have one parameter of the proper type
        checkArgument(parameterTypes.size() == 1, "Expected one argument to literal function, but got %s", parameterTypes);

        return Optional.of(getMagicLiteralFunctionSignature(type));
    }

Then in FunctionRegistry.resolveFunction, we do something like:

        return resolveLiteralFunction(name, parameterTypes, typeManager)
                .orElseThrow(() -> new PrestoException(FUNCTION_NOT_FOUND, message));

This comment has been minimized.

@rongrong

rongrong Nov 17, 2018

Contributor

If this is just a private function in FunctionRegistry not sure how much "hiding" it adds. A practical problem with this is that there's an error about message being not final. Do you think this is really necessary?

import static com.facebook.presto.spi.type.VarbinaryType.VARBINARY;
import static com.google.common.base.Preconditions.checkArgument;
public final class FunctionUtils

This comment has been minimized.

@dain

dain Nov 13, 2018

Contributor

If we can make the literal changes I suggested, I think we can rename this to something like OperatorSignatureUtils

rongrong added some commits Nov 9, 2018

Add FunctionNamespace and FunctionManager
Current implmentation of FunctionNamespace delegate all operations to
FunctionRegistry. FunctionManager only manages one FunctionNamespace
which is the wrapper of FunctionRegistry.

@rongrong rongrong force-pushed the rongrong:function branch 3 times, most recently from 2e825f3 to 738d707 Nov 17, 2018

@rongrong rongrong force-pushed the rongrong:function branch from 738d707 to e9a7c6e Nov 17, 2018

@rongrong

This comment has been minimized.

Contributor

rongrong commented Nov 17, 2018

@dain addressed your comments other than the resolveLiteralFunction suggestion. Let me know what you think.

@dain

dain approved these changes Nov 19, 2018

One minor comment. Looks good.

My point above about the resolveLiteralFunction change is that we could encapsulate the magic literal concept, and gather all of that code that deal with it into the LiteralEncoder. I think that change is independent of this PR, so I can deal with it after this goes in.

It looks like @electrum hasn't cleared his changes requested flag, and I'm not sure if @martint wanted to review this.

}
@VisibleForTesting
public static OperatorType unmangleOperator(String mangledName)

This comment has been minimized.

@dain

dain Nov 19, 2018

Contributor

For consistency, should this be unmangleOperatorName?

@martint

Looks good. However, it needs an explanation of the motivation in the commit message (especially the first one). Since these commits are a small part of a larger plan, they lack context and it looks like a somewhat unnecessary refactoring by themselves.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment