Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update column where value is not null #1379

Open
michasacuer opened this issue Nov 12, 2020 · 18 comments
Open

Update column where value is not null #1379

michasacuer opened this issue Nov 12, 2020 · 18 comments
Labels
feature A new feature (we all like those) propose-close

Comments

@michasacuer
Copy link

michasacuer commented Nov 12, 2020

Describe the question
I want to update columns where values are not null. I can find values that are null in my database like that:

Update.Table("someTable").InSchema("schema")
                .Set(new { IsValid = false }).Where(new { IsValid = DBNull.Value });

but how to reverse Where(new { IsValid = DBNull.Value }); condition to is not null ?

Documentation Pages You've Read So Far
github docs

@jzabroski
Copy link
Collaborator

To the best of my knowledge, this isn't possible. I would have to double check the test coverage to see if I'm forgetting something.

The workaround is to use

 Execute.Sql(@"
UPDATE [schema].[SomeTable]
Set IsValid = false
WHERE IsValid is not null");

@michasacuer
Copy link
Author

Waiting for updates

@fubar-coder
Copy link
Member

fubar-coder commented Nov 23, 2020

It's not supported and it most likely never will, because FM is not an ORM. But you can still use the SetExistingRowsTo extension method for new colums. Example:

Create.Column("IsValid").OnTable("SomeTable").AsBoolean()
                .SetExistingRowsTo(false).WithDefaultValue(false);

The DB processor should automatically ensure that the existing rows are set to the new value, by creating an UPDATE SQL statement or other means.

EDIT: BTW: There is a way to use an ORM from inside a migration by using Execute.WithConnection. If your ORM doesn't support an external transaction, you can still pass TransactionBehavior.None to the Migration attribute. Dapper or other micro ORMs (like RepoDB) are very useful in this case.

@jzabroski
Copy link
Collaborator

jzabroski commented Nov 23, 2020

It's not supported and it most likely never will, because FM is not an ORM.

Well, I can honestly see supporting ORM-like expressions in some lightweight context. It's just more of a wish list feature given the workaround is not that unpleasant. I have never heard anyone say they wouldn't use FluentMigrator because they had to call Execute.Sql occasionally for things Update doesn't support. I think we could spec out what that ORM-like expression syntax would look like as an Issue, and sticky it as a wish list for someone to go do.

One thing that would be nice is if calling Create.Column("x").OnTable("Y").AsDataType().NotNullable() better supported chaining so that it first seeds data, then makes it not nullable. In this way you could "filet" the migration so that instead of SetExistingRowsTo(constant value), you could call/cc the computation, such that you had knowledge of the column and table name and could use that to lambda into the seeding statement:

Create.Column("x").OnTable("Y").AsBoolean().NotNullable()
  .SeedWith(
    false,
    () => new { x = false }, // dummy expression to create anonymous type T without specifying the T via type inference
    (obj) => obj.x == false || obj.x != null
  );

Frankly, Create.Column .. NotNullable should have a Roslyn warning about requiring such seeding.

@jzabroski
Copy link
Collaborator

Waiting for updates

@michasacuer Sorry if I just didn't understand your brief comment, but I am a little confused about what you are waiting for. I gave you the workaround. Why do you think you are still blocked? Or are you just really hoping to only use the Update expression syntax?

@fubar-coder
Copy link
Member

fubar-coder commented Nov 23, 2020

I think we could spec out what that ORM-like expression syntax would look like as an Issue, and sticky it as a wish list for someone to go do.

I fear that this might might evolve into a micro ORM over time and there are already micro ORMs.

Create.Column("x").OnTable("Y").AsBoolean().NotNullable()
  .SeedWith(
    false,
    () => new { x = false }, // dummy expression to create anonymous type T without specifying the T via type inference
    (obj) => obj.x == false || obj.x != null
  );

This will look quite complicated for the average user and I'm not sure if it's worth the hassle. Another problem with this syntax is, that F# or VB.NET might produce Linq expressions that are different from C#'s. This function also remembers me of my colleagues that are having massive problems understanding the Aggregate LINQ function.

Frankly, Create.Column .. NotNullable should have a Roslyn warning about requiring such seeding.

This sounds like an excellent idea.

@jzabroski
Copy link
Collaborator

That's a good point about too complicated for the average user. The moment you said it I knew you were right. I was just brainstorming. I have, similar to @michasacuer , wanted to do a simple seeding on a non-nullable column.

What about one of these two variants?

Create.Column("x").OnTable("Y").AsBoolean().NotNullable()
  .SeedWith(
    (Execute exec) => exec.Sql("UPDATE Y SET x = false")
  );
Create.Column("x").OnTable("Y").AsBoolean().NotNullable()
  .SeedWith(
    // problem here would be that
    // there is no way to know where Execute is "rooted", so may be complex control flow
    () => Execute.Sql("UPDATE Y SET x = false") 
  );

@michasacuer
Copy link
Author

Waiting for updates

@michasacuer Sorry if I just didn't understand your brief comment, but I am a little confused about what you are waiting for. I gave you the workaround. Why do you think you are still blocked? Or are you just really hoping to only use the Update expression syntax?

Thanks for that, dont think about that comment :P If its not supported; its not and i think its the end of a topic and you can close this issue, thanks for help

@fubar-coder
Copy link
Member

What about one of these two variants?

Create.Column("x").OnTable("Y").AsBoolean().NotNullable()
  .SeedWith(
    (Execute exec) => exec.Sql("UPDATE Y SET x = false")
  );

I like this one, but what about a DML interface that allows both a fluent syntax for updates and the execution of SQL statements?

Create.Column("x").OnTable("Y").AsBoolean().NotNullable()
  .SeedWith(
    // problem here would be that
    // there is no way to know where Execute is "rooted", so may be complex control flow
    () => Execute.Sql("UPDATE Y SET x = false") 
  );

Yes, we would need to be able to handle nested DDL/DML statements. Sounds interesting too.

There is still something we need to think about for each of the suggestions above: Support for "IfDatabase" or something similar.

@jzabroski
Copy link
Collaborator

I like this one, but what about a DML interface that allows both a fluent syntax for updates and the execution of SQL statements?

What do you have in mind?

I thought about your comment about being ORM-like. I had another use case for expression syntax that allows for boolean comparisons: Partitioning functions in SQL Server. These are soooo painful to type out in T-SQL syntax with a lot of repetitiveness. It would be a lot nicer to write some C# with higher order functions.

@fubar-coder
Copy link
Member

fubar-coder commented Nov 26, 2020

Something like this:

Create.Column("x").OnTable("Y").AsBoolean().NotNullable()
  .SeedWith(
    (IDmlExpression exec) => exec.Sql("UPDATE Y SET x = false")
  );

with IDmlExpression being:

public interface IDmlExpression : IExecuteExpressionRoot
{
    IDeleteExpressionRoot Delete { get; }
    IUpdateExpressionRoot Update { get; }
}

@jzabroski
Copy link
Collaborator

public interface IDmlExpression : IExecuteExpressionRoot
{
    IDeleteExpressionRoot Delete { get; }
    IUpdateExpressionRoot Update { get; }
}

I am not clear why you want to expose Delete { get; } and Update { get; } from here. For Delete, how would you seed a column value via Delete?

Why not something closer to just:

  1. Update ColumnExpressionBuilderHelper.SetNullable(bool isNullable)
         /// <summary>
        /// Either updates the IsNullable flag on the column, or creates/removes the SetNotNull expression, depending
        /// on whether the column has a 'Set existing rows' expression or 'Seed with' expression.
        /// </summary>
        public virtual void SetNullable(bool isNullable)
  1. Implement SeedWith with logic similar to how ColumnExpressionBuilderHelper.SetExistingRowsTo works
        /// <inheritdoc />
        public ICreateColumnOptionSyntax SeedWith(Action<IExecuteExpressionRoot, IExecuteExpressionRoot> exec)
        {
            // this is just pseudo-code, I doubt it works.
            var execExpr = exec();
            ColumnHelper.SeedWith(execExpr); // similar to how ColumnHelper.SetExistingRowsTo works
            return this;
        }
  1. Wave my hands on the hard part, which is implementing and testing ColumnExpressionBuilderHelper.SeedWith ;-)

@fubar-coder
Copy link
Member

Yes, you're right about the Delete part.

I'm not so sure that the SeedWith functionality is really useful. I'm always thinking about Firebird where you add a non-nullable column to the table and set the default value with a UPDATE tableWithNewColumn SET newColumn=newValue WHERE newColumn IS NULL. Other databases might have different mechanisms to seed the new column with the correct value and I'm not sure that it's a good idea to use a DB-specific SQL statement instead of a SetExistingRowsTo which can be used to set the default value regardless of the underlying DB, because the DB processor takes care of the differences.

It would surely be interesting to allow expressions to initialize the columns - especially when those columns are usually filled by an expression in an INSERT trigger, but I cannot see an easy way to do this.

Using a SeedWith would require (allow?) adding more DB specific code to migrations in cases where a simple SetExistingRowsTo won't suffice, but we're already having a Execute.Sql, so I'm not sure that it makes sense to add a SeedWith.

It's difficult for me to exmplain why I'm that hesitant, but a SeedWith doesn't seem that useful to me. In other words: I cannot see an advantage over using either SetExistingRowsTo, Execute.Sql, or a Micro ORM.

@jzabroski
Copy link
Collaborator

jzabroski commented Dec 1, 2020

I'm always thinking about Firebird where you add a non-nullable column to the table and set the default value with a UPDATE tableWithNewColumn SET newColumn=newValue WHERE newColumn IS NULL.

I'm not familiar with what Firebird does. Can you elaborate?

What I was thinking of, was given:

Create.Column("FooEnum").OnTable("Example").AsInt32().NotNullable()
  .SeedWith(
    () => Execute.Sql(@"
UPDATE ex
FROM dbo.Example ex
  LEFT JOIN dbo.FooLookupTable flt
    on ex.FooVarCharValue = flt.[Name]
SET Foo = (COALESCE(flt.[Id], 0);") 
  );

generating the following:

-- We know this column needs to be NOT NULL,
-- but we're given instructions on how to seed it, 
-- so we can make it NULL first, then flip it to NOT NULL after the seeding is done.
ALTER TABLE dbo.Example ADD FooEnum INT NULL;

UPDATE ex
FROM dbo.Example ex
  LEFT JOIN dbo.FooLookupTable flt
    on ex.FooVarCharValue = flt.[Name]
SET Foo = (COALESCE(flt.[Id], 0);

ALTER TABLE dbo.Example ADD FooEnum INT NOT NULL;

This is a pretty common refactoring pattern for me, especially in SQL Server, because when a Fact table uses strings for Dimension values, SQL Server cannot do an "optimized bitmap filter trick": When a query plan uses a batch-mode hash join where one of the inputs is an optimized bitmap filter with an underlying bookmark lookup or key lookup, the bitmap filter is "pushed down" into the Index Scan, thus making an Index Scan behave like a massively parallel index seek bookmark lookup.

It's annoying to have to write:

Create.Column("FooEnum").OnTable("Example").AsInt32().Nullable();

Execute.Sql(@"
UPDATE ex
FROM dbo.Example ex
  LEFT JOIN dbo.FooLookupTable flt
    on ex.FooVarCharValue = flt.[Name]
SET Foo = (COALESCE(flt.[Id], 0);
");

Alter.Column("FooEnum").OnTable("Example").AsInt32().NotNullable();

Why?

  1. I can't use fluent indexes on this column, because I can't alter a column that has indexes/constraints.
  2. I have to repeat myself and it's less physically contained in terms of what my actual intent here is.
  3. Sometimes, my seeding script involves Execute.WithConnection and reading mapping values in from a CSV file defined by a business analyst.

@michasacuer Since you sparked this debate - any thoughts?

@jzabroski
Copy link
Collaborator

I guess the downside to this syntax would be debuggability if for some reason the update were the Deadlock on a Provider that doesn't support Transactional DDL like SQL Server's Transact-SQL dialect of SQL. It would definitely be a mystery to some why the whole statement would fail and how to fix it. But I guess that can happen with SetExistingRowsTo as well.

Just thinking out loud and trying to be my own devil's advocate.

@fubar-coder
Copy link
Member

I'm always thinking about Firebird where you add a non-nullable column to the table and set the default value with a UPDATE tableWithNewColumn SET newColumn=newValue WHERE newColumn IS NULL.

I'm not familiar with what Firebird does. Can you elaborate?

The situation is strange:

On Firebird 2, you can only add nullable columns, but you can modify the metadata to make it non-nullable. Firebird 3 was extended to support altering the column to be non-null. The difficult thing around added non-nullable fields is described in this null guide.

Create.Column("FooEnum").OnTable("Example").AsInt32().NotNullable()
  .SeedWith(
    () => Execute.Sql(@"
UPDATE ex
FROM dbo.Example ex
  LEFT JOIN dbo.FooLookupTable flt
    on ex.FooVarCharValue = flt.[Name]
SET Foo = (COALESCE(flt.[Id], 0);") 
  );

This is a pretty common refactoring pattern for me, especially in SQL Server, because when a Fact table uses strings for Dimension values, SQL Server cannot do an "optimized bitmap filter trick": When a query plan uses a batch-mode hash join where one of the inputs is an optimized bitmap filter with an underlying bookmark lookup or key lookup, the bitmap filter is "pushed down" into the Index Scan, thus making an Index Scan behave like a massively parallel index seek bookmark lookup.

So, you usually use numeric enum values instead of strings for SQL Server?

@jzabroski
Copy link
Collaborator

jzabroski commented Dec 2, 2020

On Firebird 2, you can only add nullable columns, but you can modify the metadata to make it non-nullable. Firebird 3 was extended to support altering the column to be non-null. The difficult thing around added non-nullable fields is described in this null guide.

Wow, I'm so glad I don't use Firebird!

So, you usually use numeric enum values instead of strings for SQL Server?

Yes. I usually use an ORM like EntityFramework so that my presentation tier can transparently convert those numeric enum values back into strings. My LookupTable tables are all the same: LookupTableId, Name. Sometimes, there is a Description column as well if the concept needs something like a tooltip affordance. If I want a table with string columns, I create a view that joins to the LookupTable and provides the Name instead of the Id.

While you can compress string data via page compression and/or columnar indices, SQL Server does not support "optimized bitmap filter trick" on strings. Has to be an INT data type. Very powerful trick. Many do not know about it. SQL Server 2019 was extended to further support this on columnar clustered indices. It is also useful for encoding time as integer dimensions, as then time queries become faster. It is also useful for encoding intervals as integer values, as then Allen's Interval Algebra becomes more easily scalable in a SQL database that doesn't support bitemporal indices / access methods that allow efficiently traversing both sides of a range (almost every popular DB doesn't support this, although Postgres has a pg_bitemporal extension that support Allen's relations according to this PowerPoint slide deck).

@jzabroski jzabroski added the feature A new feature (we all like those) label Dec 2, 2020
@michasacuer
Copy link
Author

Any updates?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature A new feature (we all like those) propose-close
Projects
None yet
Development

No branches or pull requests

4 participants