Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Breaking changes for auto increment/pk fields #19

Closed
bombsimon opened this issue Nov 25, 2019 · 8 comments
Closed

Breaking changes for auto increment/pk fields #19

bombsimon opened this issue Nov 25, 2019 · 8 comments

Comments

@bombsimon
Copy link
Collaborator

The commit eab1a9e introduces breaking changes.

Click to see diff for the commit
diff --git a/bulk_insert.go b/bulk_insert.go
index ee09f01..cae9d99 100644
--- a/bulk_insert.go
+++ b/bulk_insert.go
@@ -103,7 +103,7 @@ func extractMapValue(value interface{}, excludeColumns []string) (map[string]int
                _, hasForeignKey := field.TagSettingsGet("FOREIGNKEY")

                if !containString(excludeColumns, field.Struct.Name) && field.StructField.Relationship == nil && !hasForeignKey &&
-                       !field.IsIgnored && !(field.DBName == "id" || fieldIsAutoIncrement(field)) {
+                       !field.IsIgnored && !fieldIsAutoIncrement(field) {
                        if field.Struct.Name == "CreatedAt" || field.Struct.Name == "UpdatedAt" {
                                attrs[field.DBName] = time.Now()
                        } else if field.StructField.HasDefaultValue && field.IsBlank {
@@ -125,5 +125,5 @@ func fieldIsAutoIncrement(field *gorm.Field) bool {
        if value, ok := field.TagSettingsGet("AUTO_INCREMENT"); ok {
                return strings.ToLower(value) != "false"
        }
-       return field.IsPrimaryKey
+       return false
 }

If you use the default gorm.Model for your ID/primary key the field won't have the tag AUTO_INCREMENT. Since the field is missing the tag fieldIsAutoIncrement will always return false and thus not be ignored. This will result in a duplicate key error. The field does however have the primary_key tag:

type Model struct {
    ID        uint `gorm:"primary_key"`
    CreatedAt time.Time
    UpdatedAt time.Time
    DeletedAt *time.Time `sql:"index"`
}

I know that a field doesn't auto increment by default just because it's a primary key but even if the table is created with gorm migrates or not I think it's common to use gorm.Model for your ID field and want it to be auto incremented.

Code to reproduce

package main

import (
	"github.com/jinzhu/gorm"
	_ "github.com/jinzhu/gorm/dialects/postgres"
	gormbulk "github.com/t-tiger/gorm-bulk-insert"
)

type MyType struct {
	gorm.Model
	Value string
}

func main() {
	// docker run -it --rm -p 5432:5432  -e POSTGRES_PASSWORD=pass postgres
	db, _ := gorm.Open(
		"postgres",
		"host=localhost port=5432 user=postgres dbname=test password=pass sslmode=disable",
	)

	defer db.Close()

	toInsert := []interface{}{
		MyType{Value: "first"},
		MyType{Value: "second"},
	}

	if err := gormbulk.BulkInsert(db, toInsert, 3000); err != nil {
		panic(err)
	}
}

Proposed fix

diff --git a/bulk_insert.go b/bulk_insert.go
index fb49a8b..dc38b05 100644
--- a/bulk_insert.go
+++ b/bulk_insert.go
@@ -124,5 +125,5 @@ func fieldIsAutoIncrement(field *gorm.Field) bool {
        if value, ok := field.TagSettingsGet("AUTO_INCREMENT"); ok {
                return strings.ToLower(value) != "false"
        }
-       return false
+       return field.DBName == "id" && field.IsPrimaryKey
 }
@t-tiger
Copy link
Owner

t-tiger commented Nov 25, 2019

Thank you for your feedback. Actually, I was aware this potential problem will be occurred. Therefore I set the Release Tag, previous version is v1.0.0 and this time is v1.1.0. (Sorry, there is not enough announcement)

Let me explain why I made this modification.

In previous code, there was a problem that insertion was not possible unless the primary key is auto_increment. Because when the column is primary_key, the return value of fieldIsAutoIncrement is always true (unless AUTO_INCREMENT is explicitly set false), and the column will not be targeted of insertion.

There is no problem with auto_increment columns. This is because the database itself will give a number. But what about natural keys that aren't surrogate keys? if it's a natural key, it will fail to insert because this will not be the target of insertion for the reason of primary_key, even though the value has been set by manually.

After all, in my opinion, only way to make both natural key and surrogate key compatible is to set the AUTO_INCREMENT tag explicitly for the surrogate key.

But if there are other good ideas, I'd like to adopt them. I'd be glad to know that there was such a context for this change.

@bombsimon
Copy link
Collaborator Author

Thank you for your response!

Regarding the new release tag I did a similar mistake myself recently which made me re-visit semver.org. If you're doing any changes that's not in a backwards compatible manner (which this isn't) you should probably bump your major version and release v2.0.0. All code relying on the previous behaviour is now broken.

I'm currently not really too familiar with the gorm source code but it's obvious that there are underlying mechanics to figure out which field(s) are primary and that seems to be based on the name id like your previous commit. I would guess that the loop here somehow filters out blank primary key fields. At least when using gorm.AutoMigrate() the primary key id will be set to auto increment without the tag.

Regarding the issue you had, see my proposed solution. That is, not only look for primary key but also ensure the name of the field is id. If you create a custom field named something other than id and set it to primary key the function would still return false.

I'm not sure I fully understand your thoughts around surrogate keys. If I would need a surrogate key I would specify it manually, definitely not name it id and manually set the AUTO_INCREMENT tag (which would exclude it from your columns to let the DBM allocate it). This is an example:

type MyType struct {
	gorm.Model      // Gives me PRIMARY KEY id
	Value           string
	CustomSurrogate int    `gorm:"AUTO_INCREMENT"`
	PrimaryValue    string `gorm:"PRIMARY_KEY"`
}

With your current version this gives me a duplicate key error, with the proposed diff this works as intended and allocates an ID both for my ID field (from gorm.Model) and for CustomSurrogate.

MySQL only supports one primary key but PostgreSQL supports multiple so I'm using multiple primary keys in my example to show that you can set a value to PrimaryValue and ensure it's preserved while inserting.


With the change you made, what's your proposed solution for people using gorm.Model?
Don't you think this is the preferred and recommended way?
Do you think everyone should duplicate gorm.Model just to add the AUTO_INCREMENT tag?
Should I manually populate my ID field in the struct?
If so, how do I avoid duplicates?

In my opinion I think it's of importance to support embedding gorm.Model out of the box and I'll gladly discuss ideas and help with implementation if you wish. :)

@t-tiger
Copy link
Owner

t-tiger commented Nov 26, 2019

Thank you for your ideas about the release version and auto_increment. I agree with all of these.

I think this library should follow the gorm ecosystem, and respect the default behavior. In that viewpoint, my perception was not enough.

I don't have enough time today because of my private situation, so I will think about it carefully and reply tomorrow. I'm grateful for your continuous effort.

@t-tiger
Copy link
Owner

t-tiger commented Nov 27, 2019

I confirmed the gorm behavior again and I came up with one idea for this issue. To explain why it was introduced, I'd like to summarize the point of issues.

gorm.Model behavior

When embedding gorm.Model, it will behave as follows.

  • Fields of ID, CreatedAt, UpdatedAt, DeletedAt are set
  • ID is attached to primary_key tag, but auto_increment tag does not exist
  • However there is no auto_increment tag, when AutoMigrate is executed, a table with auto_increment will be created

For this reason, forcing to add the auto_increment tag is against gorm's standard behavior, so I want to abolish it. This is due to my lack of awareness.

Previous problem

On the other hand, the previous version had the following problems.

  • Return value of the fieldIsAutoIncrement function is true event for the primary_key that is not actually auto_increment.
  • Then, the column is not subject to insertion from if statement condition.
  • There is no problem when the value is set automatically like auto_increment, but when the value is set manually like natural key, error occurs because natural key (not surrogate key) is missed by above reason.

My idea

Both current and previous versions have some issues. But I came up with the following solution.

  • If the column is primary_key and the value is empty, it will not be insertion target. (this is for auto_increment)
  • If the column is primary_key and if the value is present, it will be insertion target. (this is for natural key)

Specifically, example code is as follows.

if! containString (excludeColumns, field.Struct.Name) && field.StructField.Relationship == nil &&! hasForeignKey &&
   ! field.IsIgnored &&! fieldIsAutoIncrement (field) && !fieldIsPrimaryAndBlank (field) {
    // ~~
}

func fieldIsAutoIncrement(field *gorm.Field) bool {
	if value, ok := field.TagSettingsGet("AUTO_INCREMENT"); ok {
		return strings.ToLower(value) != "false"
	}
	return false
}

// add this function
func fieldIsPrimaryAndBlank (field * gorm.Field) bool {
   return field.IsPrimaryKey && field.IsBlank
}

In this case, the compatibility is not be broken, and it is in line with the behavior of gorm.
Please let me know your thoughts about this.

@bombsimon
Copy link
Collaborator Author

I think it looks good! The only difference here is that you'll also leave out blank primary keys even if it's not the id field. I think this is OK because a blank primary key would be set to it's default value (e.g. 0 or "") and then it would error on duplicate primary key either way.

This might not be the perfect solution but on the other hand I don't see any use case where you would use this package and omit your primary keys. At least I wouldn't expect that to work.

@t-tiger
Copy link
Owner

t-tiger commented Dec 1, 2019

Thank you for confirmation. (Sorry for the late reply🙏)

As you say, there is an issue that primary_key it's not the ID field is also excluded, but such a case should not be assumed originally.

I will implement this fix today. If you don't mind, would you please review at that time?

@bombsimon
Copy link
Collaborator Author

Sounds great, just tag me when ready and I'll review your implementation!

@bombsimon
Copy link
Collaborator Author

bombsimon commented Dec 2, 2019

Closing this, solved in 8a8235c. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants