Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NH-4043 - Complete keyword registration in dialects. #654

Merged
merged 2 commits into from
Jul 9, 2017

Conversation

ngbrown
Copy link
Contributor

@ngbrown ngbrown commented Jun 28, 2017

This is NH-4043.

  • Need to use results from CI Build to finish dialects.
  • Baseline all dialects with ANSI SQL 2003 keywords
  • Lowercase all remaining dialect keywords.

RegisterKeyword("integer"); // a commonly-used alias for 'int'
RegisterKeyword("tinyint");
RegisterKeyword("smallint");
RegisterKeyword("absolute");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be done in a loop somehow? Like following:

var keywords = new [] { ... };  
foreach (var keyword in keywords) 
{
    RegisterKeyword(keyword);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I would like to have following function on base dialect:

protected void RegisterKeywords(IEnumerable<string> keywords) //maybe also with params
{
    Keywords.UnionWith(keywords);
}

Copy link
Contributor Author

@ngbrown ngbrown Jun 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were you concerned by the number of lines it takes, or by the slowness of hundreds of function calls?

If I change a ReSharper array initializer wrap setting, I get something like this:

private static readonly string[] keywordsToRegister =
{
	"absolute", "action", "ada", "add", "admin", "after", "aggregate", "alias", "all", "allocate", "alter", "and", "any",
	"are", "array", "as", "asc", "assertion", "at", "authorization", "avg", "backup", "before", "begin", "between",
	"bigint", "binary", "bit", "bit_length", "blob", "boolean", "both", "breadth", "break", "browse", "bulk", "by",
	"call", "cascade", "cascaded", "case", "cast", "catalog", "char", "char_length", "character", "character_length",
	"check", "checkpoint", "class", "clob", "close", "clustered", "coalesce", "collate", "collation", "column", "commit",
	"completion", "compute", "connect", "connection", "constraint", "constraints", "constructor", "contains",
	"containstable", "continue", "convert", "corresponding", "count", "create", "cross", "cube", "current",
	"current_date", "current_path", "current_role", "current_time", "current_timestamp", "current_user", "cursor",
	"cycle", "data", "database", "date", "datetime", "day", "dbcc", "deallocate", "dec", "decimal", "declare", "default",
	"deferrable", "deferred", "delete", "deny", "depth", "deref", "desc", "describe", "descriptor", "destroy",
	"destructor", "deterministic", "diagnostics", "dictionary", "disconnect", "disk", "distinct", "distributed",
	"domain", "double", "drop", "dummy", "dump", "dynamic", "each", "else", "end", "end-exec", "equals", "errlvl",
	"escape", "every", "except", "exception", "exec", "execute", "exists", "exit", "external", "extract", "false",
	"fetch", "file", "fillfactor", "first", "float", "for", "foreign", "fortran", "found", "free", "freetext",
	"freetexttable", "from", "full", "function", "general", "get", "global", "go", "goto", "grant", "group", "grouping",
	"having", "holdlock", "host", "hour", "identity", "identity_insert", "identitycol", "if", "ignore", "image",
	"immediate", "in", "include", "index", "indicator", "initialize", "initially", "inner", "inout", "input",
	"insensitive", "insert", "int", "integer", "intersect", "interval", "into", "is", "isolation", "iterate", "join",
	"key", "kill", "language", "large", "last", "lateral", "leading", "left", "less", "level", "like", "limit", "lineno",
	"load", "local", "localtime", "localtimestamp", "locator", "lower", "map", "match", "max", "min", "minute",
	"modifies", "modify", "module", "money", "month", "names", "national ", "natural", "nchar", "nclob", "new", "next",
	"no", "nocheck", "nonclustered", "none", "not", "ntext", "null", "nullif", "numeric", "nvarchar", "object",
	"octet_length", "of", "off", "offsets", "old", "on", "only", "open", "opendatasource", "openquery", "openrowset",
	"openxml", "operation", "option", "or", "order", "ordinality", "out", "outer", "output", "over", "overlaps", "pad",
	"parameter", "parameters", "partial", "pascal", "path", "percent", "plan", "position", "postfix", "precision",
	"prefix", "preorder", "prepare", "preserve", "primary", "print", "prior", "privileges", "proc", "procedure",
	"public", "raiserror", "read", "reads", "readtext", "real", "reconfigure", "recursive", "ref", "references",
	"referencing", "relative", "replication", "restore", "restrict", "result", "return", "returns", "revoke", "right",
	"role", "rollback", "rollup", "routine", "row", "rowcount", "rowguidcol", "rows", "rule", "save", "savepoint",
	"schema", "scope", "scroll", "search", "second", "section", "select", "sequence", "session", "session_user", "set",
	"sets", "setuser", "shutdown", "size", "smalldatetime", "smallint", "smallmoney", "some", "space", "specific",
	"specifictype", "sql", "sql_variant", "sqlca", "sqlcode", "sqlerror", "sqlexception", "sqlstate", "sqlwarning",
	"start", "state", "statement", "static", "statistics", "structure", "substring", "sum", "system_user", "table",
	"temporary", "terminate", "text", "textsize", "than", "then", "time", "timestamp", "timezone_hour",
	"timezone_minute", "tinyint", "to", "top", "trailing", "tran", "transaction", "translate", "translation", "treat",
	"trigger", "trim", "true", "truncate", "tsequal", "under", "union", "unique", "uniqueidentifier", "unknown",
	"unnest", "update", "updatetext", "upper", "usage", "use", "user", "using", "value", "values", "varbinary",
	"varchar", "variable", "varying", "view", "waitfor", "when", "whenever", "where", "while", "with", "without", "work",
	"write", "writetext", "year", "zone",
};

protected virtual void RegisterKeywords()
{
	RegisterKeywords(keywordsToRegister);
}

The nice thing about one-per-line was that it could be easily sorted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By this: "by the slowness of hundreds of function calls"

Copy link
Member

@hazzik hazzik Jun 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nice thing about one-per-line was that it could be easily sorted.

You can keep them one by line in the array initializer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont need to put this into the static field

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, does not really matter, leave it as many functions. It seems that C# does not bother with optimizing string arrays (as it does for numbers)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could optimize this only by providing this array to the HashSet<> constructor

Copy link
Member

@fredericDelaporte fredericDelaporte Jun 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So maybe a RegisterKeywords(params string[] keywords)? If the HashSet is still empty, it could be replaced by a new one constructed from the array. But that is maybe a bit overkill for a small optimization.

Assert.That(match, Is.EquivalentTo(reservedDb));
}

[Test]
public void CheckForExcessReservedWordsHardCodedInDialect()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this test relevant? It looks like the approach taken to solve this issue is to set keywords on old dialect like SQL2000, while using indeed a more recent server likely having more keywords, like SQL2016 (what we have on the build agent). So indeed testing this against an actual SQL Server 2000 would likely yield some excess words. Does it matter much? I do not think. So I do not think this test is useful.

(Do not try to emulate this with SQL Server compatibility level: that is just a backward compatibility setting, re-enabling obsolete features but not disabling newer ones. For actually testing a SQL Server 2000, only an actual SQL Server 2000 is suitable.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fredericDelaporte I did check against an actual SQL Server 2000... yes, checking every database and version is a pain.

@ngbrown
Copy link
Contributor Author

ngbrown commented Jun 28, 2017

One option is to do what Hibernate did, and automatically register all ANSI SQL 2003 keywords, and then give each dialect a chance to register additional ones.

@@ -140,7 +147,7 @@ public void CheckForExcessReservedWordsHardCodedInDialect()
}

// Don't fail incase the driver returns nothing.
//Assert.That(sf.Dialect.Keywords, Is.EquivalentTo(reservedDb));
// This is an info-only test.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may issue an Assert.Warn instead. Since a recent PR it will not cause the build to fail.
Assume.That would be slightly better, but it currently causes a failure, excepted in a PR where I have started using it and fixed it.

Copy link
Member

@fredericDelaporte fredericDelaporte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have concerns about the last commit. It looks to me it weakens quoting effectiveness. Granted, auto-quoting and auto-importing seem a bit messy. (I am not even sure those features have any effectiveness, because they just apply to schema generation, not to querying.)

factory.Dialect.Keywords.UnionWith(GetReservedWords(dialect, connectionHelper));
var dialect = sessionFactory.Dialect;
var connectionHelper = new SuppliedConnectionProviderConnectionHelper(sessionFactory.ConnectionProvider);
sessionFactory.Dialect.Keywords.UnionWith(GetReservedWords(dialect, connectionHelper));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be further simplified to:

dialect.Keywords.UnionWith(GetReservedWords(dialect, connectionHelper));

}

public static void QuoteTableAndColumns(Configuration configuration)
{
ISet<string> reservedDb = GetReservedWords(configuration.GetDerivedProperties());
// Instatiates a new instance of the dialect so doesn't benefit from the Update call.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, Insta**n**tiates.

ISet<string> reservedDb = GetReservedWords(configuration.GetDerivedProperties());
// Instatiates a new instance of the dialect so doesn't benefit from the Update call.
var dialect = Dialect.Dialect.GetDialect(configuration.GetDerivedProperties());
QuoteTableAndColumns(configuration, dialect);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feature loss I think. Previously, this was calling GetReservedWords which was (re)loading the reserved words as the Update call, but only to complete its local list before quoting. Now it will not, causing the quoting to be less effective.

Same for the new method taking a dialect: if the dialect was not previously updated with GetReservedWords, it will no more have the words from GetReservedWords. A number of code paths do call QuoteTableAndColumns without updating the dialect before.

Auto-quoting and auto-importing were somewhat two separate things, not really depending on each other.

Copy link
Contributor Author

@ngbrown ngbrown Jul 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the problem was, that the QuoteTableAndColumns wasn't even using the keywords loaded into the Dialect. This was a major problem on the .NET Standard branch because there is no functioning auto-importing, so schema names weren't getting quoted at all, because the DbConnection based GetSchema had to return an empty list.

The auto-import is enabled if auto-quoting is enabled, and auto-import takes place first during the session factory construction.

I think the only paths that use auto-quoting without going through the session factory construction, and thus the auto-importing, are some unit tests that are trying to test just that feature. For those the auto-quoting to conflate auto-importing, it's not allowing the tests the isolation they may desire.

Copy link
Member

@fredericDelaporte fredericDelaporte Jul 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about here and here?

Those are not code from NHibernate.Test, and your change removes for them the addition of reserved word by querying the db.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for the feedback.

I reverted the original signature for QuoteTableAndColumns to it's previous behavior, marked it obsolete, and nothing in NHibernate now uses it. Everywhere that calls the new QuoteTableAndColumns overload calls Update beforehand (except for the two explicit unit tests).

I also refactored the GetReservedWords to better encapsulate the dispose pattern they are using (finally { Release() }).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A number of code paths do call QuoteTableAndColumns without updating the dialect before.

@fredericDelaporte I think this is intentional. We do not need to update the dialect on SchemaExport/SchemaUpdate as it is a throw-away dialect. However, it does not harm.

private static ISet<string> GetReservedWords(Dialect.Dialect dialect, IConnectionHelper connectionHelper)
{
ISet<string> reservedDb = new HashSet<string>();
ISet<string> reservedDb = new HashSet<string>(dialect.Keywords);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Causes Update code to try re-add all the already known words. Would be better to avoid this for Update case.


if (sf.ConnectionProvider.Driver is OdbcDriver)
{
Assert.Inconclusive("ODBC has excess keywords reserved");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not supported by the "result comparison tool" currently, causing a build failure. You already need this change in teamcity.build file.

Feel free to replicate it in your PR, I will adjust mine if your get merged before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now available as a PR on its own. #655

Copy link
Member

@fredericDelaporte fredericDelaporte Jul 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is merged, you can rebase/merge master for avoiding Assert.Inconclusive causing a failure for builds using results comparison.

Copy link
Member

@fredericDelaporte fredericDelaporte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is looking good, excepted some remaining points.

By the way I have installed a Sap SQL Anywhere 17 locally, going to check its keyword list.

@@ -2246,6 +2248,11 @@ protected void RegisterKeyword(string word)
Keywords.Add(word);
}

protected void RegisterKeywords(params string[] keywords)
{
Keywords.UnionWith(keywords);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small additional improvement maybe:

Keywords.UnionWith(keywords.Select(k => k.ToLowerInvariant()));

That would avoid future bugs like the uppercase registration in Sybase dialects.

It would be more robust to change those lookups constructor to use StringComparer.OrdinalIgnoreCase, but that would need to be done everywhere we fetch a new set from the one of the dialect. _sqlFunctions is already built as a case insensitive dictionary by example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be higher performance to not generate new strings, so I made the HashSet's case insensitive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know if this a really more performant overall, token lookups included, since an Ordinal comparison is maybe faster than an OrdinalIgnoreCase. But at least it is more robust, so I prefer having that ignore case.


RegisterKeywords();

RegisterDefaultProperties();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

DataTable dtTypes = connection.GetSchema(DbMetaDataCollectionNames.DataTypes);
foreach (DataRow row in dtTypes.Rows)
{
result.Add(row["TypeName"].ToString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add an overridable boolean property IncludeDataTypesInReservedWords, defaulting to true, overridden to false in PostgreSQLMetadata. That would avoid to duplicate the logic in this other class. And that would allow to easily remove them from other databases if needed. (Since we do not test all of them, we may know it is needed a bit late.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with IncludeDataTypesInReservedWords.... squashed into previous commit.

@@ -201,7 +201,7 @@ public SessionFactoryImpl(Configuration cfg, IMapping mapping, Settings settings
}
if (settings.IsAutoQuoteEnabled)
{
SchemaMetadataUpdater.QuoteTableAndColumns(cfg);
SchemaMetadataUpdater.QuoteTableAndColumns(cfg, this.Dialect);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unneeded this qualifier.

@@ -141,7 +141,8 @@ public void Execute(Action<string> scriptAction, bool doUpdate)
autoKeyWordsImport = autoKeyWordsImport.ToLowerInvariant();
if (autoKeyWordsImport == Hbm2DDLKeyWords.AutoQuote)
{
SchemaMetadataUpdater.QuoteTableAndColumns(configuration);
SchemaMetadataUpdater.Update(configuration, dialect);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to be moved in some initialization method (in constructors indeed). Here it will needlessly reload the keywords at each Execute call after the first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of doing remote IO in a constructor. There will be no improvement in this case because everywhere Execute is called, it is immediately after newing up a new SchemaUpdate instance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with SchemaExport, it could be done on a Initialize method, called only once from method requiring it. SchemaUpdate can be used by users, we cannot assume they will never call many time its execute.

@fredericDelaporte
Copy link
Member

Here is the output of EnsureReservedWordsHardCodedInDialect for a Sap SQL Anywhere 17 (previously Sybase):

Update Dialect SybaseSQLAnywhere12Dialect with RegisterKeyword:
  RegisterKeyword("aes_decrypt");
  RegisterKeyword("asc");
  RegisterKeyword("attach");
  RegisterKeyword("backup");
  RegisterKeyword("bit");
  RegisterKeyword("bottom");
  RegisterKeyword("break");
  RegisterKeyword("capability");
  RegisterKeyword("cascade");
  RegisterKeyword("char_convert");
  RegisterKeyword("checkpoint");
  RegisterKeyword("clear");
  RegisterKeyword("comment");
  RegisterKeyword("compressed");
  RegisterKeyword("conflict");
  RegisterKeyword("convert");
  RegisterKeyword("dbspace");
  RegisterKeyword("deleting");
  RegisterKeyword("desc");
  RegisterKeyword("detach");
  RegisterKeyword("elseif");
  RegisterKeyword("encrypted");
  RegisterKeyword("endif");
  RegisterKeyword("exception");
  RegisterKeyword("executing");
  RegisterKeyword("executing_user");
  RegisterKeyword("existing");
  RegisterKeyword("externlogin");
  RegisterKeyword("extract");
  RegisterKeyword("force");
  RegisterKeyword("forward");
  RegisterKeyword("goto");
  RegisterKeyword("holdlock");
  RegisterKeyword("identified");
  RegisterKeyword("index");
  RegisterKeyword("inserting");
  RegisterKeyword("install");
  RegisterKeyword("instead");
  RegisterKeyword("integrated");
  RegisterKeyword("invoking");
  RegisterKeyword("invoking_user");
  RegisterKeyword("iq");
  RegisterKeyword("isolation");
  RegisterKeyword("json");
  RegisterKeyword("kerberos");
  RegisterKeyword("key");
  RegisterKeyword("lock");
  RegisterKeyword("login");
  RegisterKeyword("long");
  RegisterKeyword("membership");
  RegisterKeyword("message");
  RegisterKeyword("mode");
  RegisterKeyword("modify");
  RegisterKeyword("noholdlock");
  RegisterKeyword("notify");
  RegisterKeyword("nvarchar");
  RegisterKeyword("off");
  RegisterKeyword("openstring");
  RegisterKeyword("openxml");
  RegisterKeyword("option");
  RegisterKeyword("options");
  RegisterKeyword("others");
  RegisterKeyword("passthrough");
  RegisterKeyword("pivot");
  RegisterKeyword("print");
  RegisterKeyword("privileges");
  RegisterKeyword("proc");
  RegisterKeyword("procedure_owner");
  RegisterKeyword("publication");
  RegisterKeyword("raiserror");
  RegisterKeyword("readtext");
  RegisterKeyword("reference");
  RegisterKeyword("refresh");
  RegisterKeyword("remote");
  RegisterKeyword("remove");
  RegisterKeyword("rename");
  RegisterKeyword("reorganize");
  RegisterKeyword("resource");
  RegisterKeyword("restore");
  RegisterKeyword("restrict");
  RegisterKeyword("rowtype");
  RegisterKeyword("save");
  RegisterKeyword("session");
  RegisterKeyword("session_user");
  RegisterKeyword("setuser");
  RegisterKeyword("share");
  RegisterKeyword("spatial");
  RegisterKeyword("sqlcode");
  RegisterKeyword("stop");
  RegisterKeyword("subtrans");
  RegisterKeyword("subtransaction");
  RegisterKeyword("synchronize");
  RegisterKeyword("temporary");
  RegisterKeyword("timeline");
  RegisterKeyword("tinyint");
  RegisterKeyword("truncate");
  RegisterKeyword("tsequal");
  RegisterKeyword("unbounded");
  RegisterKeyword("uniqueidentifier");
  RegisterKeyword("unpivot");
  RegisterKeyword("unsigned");
  RegisterKeyword("updating");
  RegisterKeyword("validate");
  RegisterKeyword("varbinary");
  RegisterKeyword("varbit");
  RegisterKeyword("variable");
  RegisterKeyword("varray");
  RegisterKeyword("view");
  RegisterKeyword("wait");
  RegisterKeyword("waitfor");
  RegisterKeyword("work");
  RegisterKeyword("writetext");
  RegisterKeyword("xml");

@ngbrown
Copy link
Contributor Author

ngbrown commented Jul 3, 2017

@fredericDelaporte I had downloaded SQL Anywhere 16 and gotten similar results, but which Dialect should that be added to? To a new Dialect for 16/17 or to one of the older ones, 10, 11, or 12?

@fredericDelaporte
Copy link
Member

fredericDelaporte commented Jul 3, 2017

Adding a new dialect without testing if it needs some changes should not be done I think. (That is the reason I will likely not contribute the specific driver I have locally written for accessing Anywhere 17, which has once again changed its assembly name.) I rather let actual Anywhere users contribute these.

I am more for adding them to an old dialect. Having more keyword than needed should not be harmful, while having less is surely harmful.

(Well, when quoted, for some databases they start to be case sensitive while being insensitive when not quoted. So quoting too much can be "harmful" too. But anyway, users are supposed to upgrade from old databases and so they will have to tackle the trouble sooner or later for these "extra" keyword. So I deem it preferable to have a bit too many keywords, including "futures" one, than having missing keywords.)


static AnsiSqlKeywords()
{
var keywordsSql2003 = new HashSet<string>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be case insensitive as well?

Copy link
Contributor Author

@ngbrown ngbrown Jul 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hazzik, It's not necessary. The AnsiSqlKeywords are iterated over as an IEnumerable when initializing the _sqlKeywords HashSet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, why this is even a set? we just need an array, dont we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was following what Hibernate had. They had a HashSet too, even though I assume Java unions HashSets the same way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way, I changed it to an array for lower initialization cost.

@ngbrown ngbrown force-pushed the NH-4043 branch 3 times, most recently from 704c5b8 to 507239f Compare July 3, 2017 23:13
Copy link
Member

@fredericDelaporte fredericDelaporte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix for Sybase types declaration is welcome, but should be isolated in a commit referencing a specific Jira for this. (NH-4046)

if (IncludeDataTypesInReservedWords)
{
DataTable dtTypes = connection.GetSchema(DbMetaDataCollectionNames.DataTypes);
foreach (DataRow row in dtTypes.Rows)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anywhere 17 supports this too, without a different name (using TypeName works for it). But its schema specific class overrides that method, and has not be updated for including data types. Maybe should it be updated.

RegisterColumnType(DbType.String, 32767, "NVARCHAR($l)");
RegisterColumnType(DbType.String, 2147483647, "LONG NVARCHAR");
RegisterColumnType(DbType.Binary, "BINARY(1)");
RegisterColumnType(DbType.Binary, "BINARY(8000)");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I understand now why basic tests were failing with string truncation now... But this fix deserves a Jira on its own.

I have created NH-4046. I think it should be fixed at least as a separated commit, maybe even a separated PR. SybaseASE15Dialect has the issue too.

We should be able to use SybaseASA9Dialect default lengths safely, since this is an old name for Anywhere. In this regard, Binary(8000) is maybe too big. SybaseASA9Dialect declares Binary(255) instead.


private static ISet<string> GetReservedWords(Dialect.Dialect dialect, IConnectionHelper connectionHelper)
{
ISet<string> reservedWords = new HashSet<string>(dialect.Keywords, StringComparer.OrdinalIgnoreCase);
Copy link
Member

@hazzik hazzik Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This HashSet is completely useless:

  1. We create this hash set from dialect.Keywords.
  2. We add another hash set from metaData.GetReservedWords().
  3. Then we merge this hash set into dialect.Keywords;

But instead it should just return metaData.GetReservedWords();


string autoKeyWordsImport = PropertiesHelper.GetString(Environment.Hbm2ddlKeyWords, configuration.Properties, "not-defined");
autoKeyWordsImport = autoKeyWordsImport.ToLowerInvariant();
if (autoKeyWordsImport == Hbm2DDLKeyWords.AutoQuote)
Copy link
Member

@hazzik hazzik Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, probably, shall be the same check as in SessionFactoryImpl.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, probably, shall be the same check as in SessionFactoryImpl.

It would look more consistent, but would complicate code for nothing, in its current state. The check in SessionFactoryImpl treats AutoQuote and KeywordImport as two separated options. But they indeed are only one configuration setting, currently not allowing having auto-quoting without having keyword-import.

To have things actually consistent, we would need to put a helper somewhere for getting those two booleans from the actual configuration setting, or refactor SchemaUpdate and SchemaExport for taking/building a session factory setting object. I do not think it is worth the hassle.

@hazzik
Copy link
Member

hazzik commented Jul 5, 2017

I have cleaned the code:

  • Dialect encapsulates Keywords
  • AbstractDataBaseSchema and SybaseAnywhereColumnMetaData return case insensitive hash set (in fact, they can return just an array/IEnumerable)
  • SchemaMetadataUpdater does not do unnecessary hash sets recast
  • SchemaMetadataUpdater just sets IsQuoted for table and column names, so it does not do unnecessary string operations

@ngbrown
Copy link
Contributor Author

ngbrown commented Jul 6, 2017

@hazzik Thanks for the cleanup. I assume I still need to split off the type changes into NH-4046?

@hazzik
Copy link
Member

hazzik commented Jul 6, 2017

yes.

@ngbrown
Copy link
Contributor Author

ngbrown commented Jul 6, 2017

I just added a commit to undo it. I could have re-written the history, but maybe this whole thing will be squashed on merge anyways?

@hazzik
Copy link
Member

hazzik commented Jul 6, 2017

There is JIRA issue will be fixed by this PR: https://nhibernate.jira.com/browse/NH-3158

- Dialect encapsulates Keywords
- Use case-insensitive HashSet for keywords
- AbstractDataBaseSchema and SybaseAnywhereColumnMetaData return case insensitive hash set (in fact, they can return an array/IEnumerable<string> instead of ISet<string>)
- SchemaMetadataUpdater just sets IsQuoted for table and column names, so it does not do unnecessary string operations
- Initialize SchemaUpdate once
@hazzik
Copy link
Member

hazzik commented Jul 6, 2017

I've regrouped changes into 2 commits. One is NH-3158, another one is NH-4043. There were no tests that checking that dialect keywords were used.

- Baseline all dialects with ANSI SQL 2003 keywords
- Add TypeNames to returned keyword words list
@ngbrown
Copy link
Contributor Author

ngbrown commented Jul 8, 2017

Looks good.

@ngbrown ngbrown mentioned this pull request Jul 8, 2017
29 tasks

string autoKeyWordsImport = PropertiesHelper.GetString(Environment.Hbm2ddlKeyWords, configuration.Properties, "not-defined");
autoKeyWordsImport = autoKeyWordsImport.ToLowerInvariant();
if (autoKeyWordsImport == Hbm2DDLKeyWords.AutoQuote)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, probably, shall be the same check as in SessionFactoryImpl.

It would look more consistent, but would complicate code for nothing, in its current state. The check in SessionFactoryImpl treats AutoQuote and KeywordImport as two separated options. But they indeed are only one configuration setting, currently not allowing having auto-quoting without having keyword-import.

To have things actually consistent, we would need to put a helper somewhere for getting those two booleans from the actual configuration setting, or refactor SchemaUpdate and SchemaExport for taking/building a session factory setting object. I do not think it is worth the hassle.

@fredericDelaporte fredericDelaporte merged commit eea4cf4 into nhibernate:master Jul 9, 2017
@hazzik hazzik added this to the 5.0 milestone Aug 3, 2017
@ngbrown ngbrown deleted the NH-4043 branch September 15, 2017 05:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants