Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - Improve select clause transformation for Linq provider #2079

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

maca88
Copy link
Contributor

@maca88 maca88 commented Mar 20, 2019

Fixes #2234
Fixes #2334

@maca88 maca88 changed the title Fix conditional expressions not being translated into hql WIP - Fix conditional expressions not being translated into hql Mar 20, 2019
@maca88 maca88 changed the base branch from 5.2.x to master April 7, 2019 12:40
@maca88 maca88 changed the title WIP - Fix conditional expressions not being translated into hql WIP - Improve select clause transformation for Linq provider Apr 7, 2019
@maca88
Copy link
Contributor Author

maca88 commented Apr 7, 2019

Rebased to master and extended the purpose of this PR (description updated).

@maca88
Copy link
Contributor Author

maca88 commented Sep 30, 2019

Rebased and all tests should be now fixed.

@maca88 maca88 changed the title WIP - Improve select clause transformation for Linq provider Improve select clause transformation for Linq provider Sep 30, 2019
@@ -417,7 +417,7 @@ private void OverrideStandardHQLFunctions()
RegisterFunction("nullif", new StandardSafeSQLFunction("nullif", 2));
RegisterFunction("lower", new StandardSafeSQLFunction("lower", NHibernateUtil.String, 1));
RegisterFunction("upper", new StandardSafeSQLFunction("upper", NHibernateUtil.String, 1));
RegisterFunction("mod", new StandardSafeSQLFunction("mod", NHibernateUtil.Double, 2));
RegisterFunction("mod", new StandardSafeSQLFunction("mod", NHibernateUtil.Int32, 2));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally depending on the dialect it should be either Int32 (FB <3.0) or Int64 (FB >3.0).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the documentation the result varies depending on the arguments. Also the arguments are rounded before appling the mod operator, which is not consistent with .NET and other databases. In the current master, the mod operator is never executed on the DB, which allows to consistently calculate the mod operation on the client side, but it prevents us from using it in a subquery. In this PR the mod operator is currently always executed on the database, which will not work correctly when using decimal values as arguments as all dialects have Int32 set as the return type. In order to fix that I will make custom ISQLFunction classes for each dialect, which will have the information whether the mod operator can be translated for the given arguments or not. For example, in firebird when one of the arguments is decimal the mod operator will not be translated to sql in order to keep the consistency. As mod is probably not the only function which its result type varies depending on the arguments I am setting again to WIP, until the mod and other similar functions are fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried supplying null as the type? In such case, NHibernate infers the type from the first argument of the function. But maybe you need to do that according to the second argument, or considering both, in which case some changes a bit deeper will be required. (See FindFunctionReturnType.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I tried, but in case of mod function the returning type depends on both parameters. I am intending to add an overload for FindFunctionReturnType which will take all function arguments and also extend ISQLFunction with a transitional interface that will do the same for ReturnType mehod. Unfurtunately, the new ReturnType method will not be enough, because the same method is called also for a hql query that may not have the information about the type. For example:

var q = session.CreateQuery("select mod(a.Id, :p1) from Animal a")
               .SetParameter("p1", 2.1m);

In the above query, the hql string is analyzed (FindFunctionReturnType gets called) when the CreateQuery method is called, which means that we do not have the type of the paramater. For this reason the transitional ISQLFunction interface will need to have an additional property DefaultReturnType which will be used when arguments types cannot be determined. So for the given example, all values will be converted to int, because the default return type for the mod function is set to Int32, which is also how it currently works. For a linq query the parameter will be wrapped with a transparentcast in order to preserve the type when building the hql tree, so that FindFunctionReturnType will be able to determine all parameter types.

case TypeCode.Double:
return true;
default:
return new[]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to replace with just if-elif-else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've simplified it by using the || operator.

@maca88 maca88 changed the title Improve select clause transformation for Linq provider WIP - Improve select clause transformation for Linq provider Oct 1, 2019
@maca88
Copy link
Contributor Author

maca88 commented Nov 12, 2019

Due to multiple issues that I've encountered along the way to address the parameter type issue for methods that have more than one parameters, I expanded what this PR does. For better understanding of the changes that were made in the last commit, I will try to explain what and why was changed. As previously mentioned, ISQLFunction was extended with GetReturnType method which supports multiple arguments and DefaultReturnType property which is used when the argument types are not known (in practice it is used only for hql). GetReturnType has an additional throwOnError parameter in order to preserve the validation that aggregate functions were doing inside the old method.
mod function was updated to use the new GetReturnType method where the return type is calculated based on the argument types. By testing I find out that all databases have the same logic for the return type so all dialects shares the same logic (ModulusFunctionTypeDetector), the only difference is that some databases do not support floating-point and decimal types.
A dedicated iif function was added in order to return the correct return type, as previously it was hardcoded to return the second argument inside MethodNode class.
The first issue that I encountered with parameter type detection were polymorphic queries, which I've tried to fix but at the end due to added complexity I decided to leave them as they are. In order to explain the issue lets see the following polymorphic query:

s.Query<IEntity>().Where(o => o.MyEnum == MyEnum.Option1).ToList()

Entities that implement IEntity may map MyEnum differently, one as a string and others as a number. In order to detect the correct type for MyEnum.Option1 in advance we would need to use the ExpressionsHelper.TryGetMappedType for all implementors and for each create a VisitorParameters instance and pass it to implementor QueryTranslatorImpl instance inside ASTQueryTranslatorFactory. Unfortunately, this is not enough because the equivalent hql query would also fail when having different MyEnum mappings, as ParameterMetadata is bound to IQueryPlan and not to IQueryTranslator. By having ParameterMetadata bound to IQueryPlan, queries that are executed for each IQueryTranslator are using the same parameter types. And there is a third issue which is related to IResultTransformer and PostExecuteTransformer for Linq that are shared for all IQueryTranslator, for example:

s.Query<IEntity>().Select(o => new { o.Id, o.Name }).ToList()

When some implementors have the Name property mapped and some not, the query would fail to execute as only one IResultTransformer is created for both type of implementors. As this is a very rare scenario that nobody seem to encounter and considering the complexity, I've concluded that it is not worth supporting.

The next thing that was changed is how the hql tree is created for Linq queries and when parameter types are detected. Prior this PR, the parameter types were all detected when translating the hql tree except for parameters that are using MappedAs method, which were detected by ExpressionParameterVisitor and passed inside DefaultQueryProvider.PrepareQuery method in order to override the type detected in hql translation. This PR deprecates ExpressionParameterVisitor which also modifies the linq expression by removing all MappedAs calls and it is replaced by ExpressionKeyVisitor and ConstantTypeLocator. The parameter detection is now done by ExpressionKeyVisitor but it does not remove the MappedAs calls and neither tries to get the type from MappedAs. MappedAs calls are removed by ConstantTypeLocator upon detection of all parameter types, which are then stored inside VisitorParameters, meaning that users having its own query model rewriter factory (IQueryModelRewriterFactory) could use them. By removing ExpressionParameterVisitor, the creation of NhLinqExpression is cheaper as the expression is traversed only once by ExpressionKeyVisitor to generate the key, which is used to retrieve the query plan. ConstantTypeLocator is used only in the translation step which occurs when building the query plan. Prior this PR, when using MappedAs only one query plan was created even when different types where used for the same query and it worked because the types were detected on NhLinqExpression instantiation and were overriden in DefaultQueryProvider.PrepareQuery. With this PR, the ExpressionKeyVisitor generates different keys when MappedAs type is changed in order to avoid using the same query plan and the parameter types are always retrieved from the ParameterMetadata.
ExpressionKeyVisitor was additionally updated to be more consistent with .NET ExpressionStringBuilder by supporting all expressions that could be used, except for few ( e.g. TryExpression, SwitchExpression, CatchBlock) that doesn't make sense supporting them. It also gained the ability to create a key for a parsed re-linq expression containing a QueryModel by mimic QueryModel.ToString() method, which is used to create a child expression key to prevent selecting the same columns multiple times. For example:

s.Query<Animal>().Select(a => a.MotherOrFather != null ? a.MotherOrFather : a).ToList(); // where MotherOrFather is an unmapped property

in the above example the hql nominates would be all three a from the test, if true and if false expressions, which means that prior this PR instead of 5 columns (suppose that Animal has 5 columns), 15 would be selected instead.
The logic for determine parameter types that hql is using (inside AbstractQueryImpl) was moved into ParameterHelper which is now also used by ConstantTypeLocator when there is no related member expression to get the type from.
ExpressionParameterVisitor also converts char to a string, which is probably because two tests (CharIndexFunction and CharIndexOffsetNegativeFunction) fail on Sql Server and Sql CE databases. For me this looks like a workaround and instead of doing the same trick, I've modified AbstractCharType.Set method to set the parameter size according to the SqlType length.
The rewriting of query model was moved from HqlGeneratorExpressionVisitor into a new class QueryModelRewriter in order to make HqlGeneratorExpressionVisitor a readonly visitor where its job should be only to generate the hql tree.
In the previous post I've mentioned that I will be using TransparentCast in order to preserve the parameter types for the hql translation, but due to its limitation it is only possible to preserve primitive types. In order to avoid the limitation, the parameter types are send directly to HqlSqlWalker, which are used to set the IExpectedTypeAwareNode.ExpectedType property upon parsing a parameter. By having the parameter types set, the newly added method GetReturnType for hql functions will always have all argument types and the DefaultReturnType property would never be used for Linq queries. Hql translation was modified in order to set the expected type only when is not set in order to prevent overriding types that were passed into HqlSqlWalker. ConstantTypeLocator is smarter than the hql logic for type detection, for example hql does assume parameter types for arithemtic operations based on the other side (e.g. select @p1 + DoubleColumn), which is not always correct. Hql logic also fails to determine the parameter type when having nested conditional expressions. TransaprentCast was updated to support also unsigned numbers, which are supported by some databases (e.g. MySql).
As with this PR arithemtic operations are executed on the server, I made a fixture (DriverNumericTypesFixture) that tests arithemtic operations by mixing different numeric types and comparing server and client side results, producing more than 800 variations. The fixture revealed issues that are related to floating-point types in MySql and Oracle and decimal types in general. In Oracle dialect floating-types are mapped as FLOAT(24) and DOUBLE PRECISION, which do not conform to the IEEE standard and produce more than 70 variations to fail. It gets even worse when using a floating-type number bigger than the max .NET decimal number which throws an overflow exception as the driver tries to parse the value to decimal. This issue is also documented in one of the linq average tests. Even when the arithemtic operations were executed on the client the same result was returned because the returned floating-point value was different from the one that was saved. The only solution that I found to work was to use BINARY_FLOAT and BINARY_DOUBLE types that were introduced with version 10g, which is what I've used to fix the issue. Those binary types are not supported by OracleClientDriver so the fix was not applied for it and it seems that Microsoft deprecated System.Data.OracleClient, which I think we should too.
MySql has also similar issues with floating-point types, which may cause a different value to be returned from the database. For example float.MaxValue will throw an exception in MySql upon saving (SO post). In order to avoid the issue, the values where carefully selected so that they will stay the same when they are saved. Unfortunately, setting specific values was not enough as there is another issue that is not related to MySql database, but with MySql driver. Tests that are using floating-point parameters are failing and upon debugging the driver code I found out that parameters are sent as strings. Sending parameters as strings has an impact on double and float as they can be differently evaluated by the database. For example when there is no e-notation (e.g. 1.2f) in the string the value will be evaluated as NUMBER type, which may cause issues for both double and float parameters. When there is an e-notation (e.g. 1.3e+15) the value will be evaluated as DOUBLE by the database, which may produce unexpected results for float parameters. From the driver code the only solution in order to avoid sending parameters as strings is to prepare the query before executing it. By preparing the query, the parameters are sent in a binary format which fixed the failing tests. Enabling prepared statements will throw an exception when saving an entity with a native generator, because two sql statements will be executed within the same command separated by a semicolon, an insert statement and a query to retrieve the last id (SELECT LAST_INSERT_ID()), which is not supported by Prepare method. As there are only 4 tests that fail I made a rule to skip them for MySql. Another issue involving MySql is about casting to float or decimal type. As MySql prior version 8, does not support casting to floating-point types, MySQL5Dialect uses DECIMAL(19,5) for emulate that, which truncates values that are larger than the defined precision/scale, which produced tests with a big float parameter to fail (e.g. Select(o => @floatParameter * DoubleColumn)). Removing the defined casts was not a solution as it produces more than 50 other failing tests, so the only solution was to merge #2036, which adds a cast only when the database types are different. As float and dobule uses the same sql type (DECIMAL(19,5)), the cast is not applied and therefore the tests were fixed.
The final issue was with decimal types as the defined dialect precision for decimal is DECIMAL(19,5), which lower than .NET decimal. In order to avoiding truncation, arithemtic operations on decimals are by default executed on the client. For users that want to override this behavior two special static methods were introduced ClientEval and DatabaseEval. For example:

db.Products.Select(a => DatabaseEval(() => a.UnitPrice * 1234.4321m)).ToList()

the above query will execute the expression inside DatabaseEval argument on the database instead on the client. Similar for ClientEval:

db.Animals.Select(a => ClientEval(() => a.Id + 5)).ToList()

the above query will execute the ClientEval argument on the client which allows the user to fallback to the old behavior.
AddJoinsReWriter was fixed for not mapped properties where the property type was an entity type (adding FatherOrMother property of type Animal on Animal entity) by using ExpressionsHelper.TryGetMappedType, otherwise a join was added for the not mapped property.

As this PR now depends on #2036, I will leave it to WIP.

@maca88
Copy link
Contributor Author

maca88 commented Nov 13, 2019

Rebased on top of #2036 and fixed Odbc tests by moving the logic of setting the parameter size from AbstractCharType.Set to each driver that sets it. Not sure how to set oracle.use_binary_floating_point_types to true for TeamCity, adding the setting to teamcity.build doesn't seem to be enough.

_useBinaryFloatingPointTypes = PropertiesHelper.GetBoolean(
Environment.OracleUseBinaryFloatingPointTypes,
settings,
false);

base.Configure(settings);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow I messed up this in #2349.

@@ -432,7 +432,7 @@ private void OverrideStandardHQLFunctions()
{
RegisterFunction("current_timestamp", new CurrentTimeStamp());
RegisterFunction("current_date", new NoArgSQLFunction("current_date", NHibernateUtil.LocalDate, false));
RegisterFunction("length", new StandardSafeSQLFunction("char_length", NHibernateUtil.Int64, 1));
RegisterFunction("length", new StandardSafeSQLFunction("char_length", NHibernateUtil.Int32, 1));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aligned with other dialects.

@@ -31,6 +31,11 @@ public IBodyClause Clone(CloneContext cloneContext)
return new NhOuterJoinClause(JoinClause.Clone(cloneContext));
}

public override string ToString()
{
return $"outer {JoinClause}";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to add it in #2328, which is needed for creating unique cache keys.

@maca88
Copy link
Contributor Author

maca88 commented May 25, 2020

Squashed and rebased.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants