added support for 1 to 1 relations and added support for n unique key…#3
| if len(conflict.columns) == 1 && len(conflictingColumns) == 0 { | ||
| isFKManyto1 = 0 | ||
| conflictingColumns = conflict.columns | ||
| selectParams = []interface{}{secondaryID} |
Just thought of a case that is not covered. If the secondary user has a row but the primary does not, the secondary row will still be deleted. We also need to check that the primary has a row before deleting the secondary's row; otherwise there is no conflict to resolve.
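The edge case above can be sketched as a small decision function. This is only an illustration: `rowPresence` and `shouldDeleteSecondaryRow` are hypothetical names that do not appear in the PR. The point is that the secondary row is deleted only when both users own a row.

```go
package main

// rowPresence records which of the two users being merged owns a row in a
// one-to-one table. Hypothetical type for illustration; not from the PR.
type rowPresence struct {
	primaryHasRow   bool
	secondaryHasRow bool
}

// shouldDeleteSecondaryRow reports whether the secondary user's row must be
// removed before the merge. A conflict exists only when both users have a row.
// If only the secondary has one, the row can be re-pointed to the primary.
func shouldDeleteSecondaryRow(p rowPresence) bool {
	return p.primaryHasRow && p.secondaryHasRow
}
```

With this check, a secondary-only row survives the merge and is later moved to the primary by the foreign key update.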
nikooo777
left a comment
I don't fully understand this PR, but I took a look at it and it seems to logically make sense.
Still I don't feel experienced enough to say anything else.
lyoshenka
left a comment
i recommend splitting the two cases out into separate functions. it will be far easier, and i don't think there's much overlap
| selectParams = []interface{}{secondaryID} | ||
| } | ||
|
|
||
| //This query checks for conflicting rows based on the keys. There are two cases. The first is when we have a 1To1 |
i'd prefer not to have large comments like this in the code. we can either use godoc-style comments at the top of the function that describe what the function does, or write up the docs somewhere else. other than that, the code should speak for itself as much as possible
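A godoc-style comment, as suggested above, might look like the following sketch. The struct is an assumed, simplified shape inferred from how `conflictingUniqueKey` is used in this thread, and `isOneToOne` is a hypothetical helper name, not something the PR defines.

```go
package main

// conflictingUniqueKey describes a unique key that can collide when two
// objects are merged. Simplified, assumed definition for illustration.
type conflictingUniqueKey struct {
	table          string
	objectIdColumn string
	columns        []string
}

// isOneToOne reports whether the unique key consists solely of the column
// that references the merged object, which means each object owns at most
// one row in the table.
func isOneToOne(k conflictingUniqueKey) bool {
	return len(k.columns) == 1 && k.columns[0] == k.objectIdColumn
}
```

The comment sits directly above the declaration and starts with the declared name, so `go doc` picks it up and the function body needs no inline explanation.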
|
some code i threw together real quick
func mergeModels(tx boil.Executor, primaryID uint64, secondaryID uint64, foreignKeys []foreignKey, conflictingKeys []conflictingUniqueKey) error {
if len(foreignKeys) < 1 {
return nil
}
var err error
for _, conflict := range conflictingKeys {
if len(conflict.columns) == 1 && conflict.columns[0] == conflict.objectIdColumn {
err = deleteOneToOneConflictsBeforeMerge(tx, conflict, primaryID, secondaryID)
} else {
err = deleteOneToManyConflictsBeforeMerge(tx, conflict, primaryID, secondaryID)
}
if err != nil {
return err
}
}
for _, fk := range foreignKeys {
// TODO: use NewQuery here, not plain sql
query := fmt.Sprintf(
"UPDATE %s SET %s = %s WHERE %s = %s",
fk.foreignTable, fk.foreignColumn, strmangle.Placeholders(dialect.IndexPlaceholders, 1, 1, 1),
fk.foreignColumn, strmangle.Placeholders(dialect.IndexPlaceholders, 1, 2, 1),
)
_, err = tx.Exec(query, primaryID, secondaryID)
if err != nil {
return errors.Err(err)
}
}
return checkMerge(tx, foreignKeys)
}
func deleteOneToOneConflictsBeforeMerge(tx boil.Executor, conflict conflictingUniqueKey, primaryID uint64, secondaryID uint64) error {
query := fmt.Sprintf(
"SELECT COUNT(*) FROM %s WHERE %s IN (%s)",
conflict.table, conflict.objectIdColumn,
strmangle.Placeholders(dialect.IndexPlaceholders, 2, 1, 1),
)
var count int
err := tx.QueryRow(query, primaryID, secondaryID).Scan(&count)
if err != nil {
return errors.Err(err)
}
if count > 2 {
return errors.Err("it should not be possible to have more than two rows here")
} else if count != 2 {
return nil // no conflicting rows
}
query = fmt.Sprintf(
"DELETE FROM %s WHERE %s = %s",
conflict.table, conflict.objectIdColumn, strmangle.Placeholders(dialect.IndexPlaceholders, 1, 1, 1),
)
_, err = tx.Exec(query, secondaryID)
return errors.Err(err)
}
func deleteOneToManyConflictsBeforeMerge(tx boil.Executor, conflict conflictingUniqueKey, primaryID uint64, secondaryID uint64) error {
conflictingColumns := strmangle.SetComplement(conflict.columns, []string{conflict.objectIdColumn})
if len(conflictingColumns) < 1 {
return nil
} else if len(conflictingColumns) > 1 {
return errors.Err("this doesnt work for unique keys with more than two columns (yet)")
}
query := fmt.Sprintf(
"SELECT %s FROM %s WHERE %s IN (%s) GROUP BY %s HAVING count(distinct %s) > 1",
conflictingColumns[0], conflict.table, conflict.objectIdColumn,
strmangle.Placeholders(dialect.IndexPlaceholders, 2, 1, 1),
conflictingColumns[0], conflict.objectIdColumn,
)
rows, err := tx.Query(query, primaryID, secondaryID)
if err != nil {
return errors.Err(err)
}
// defer only after the error check: rows is nil when Query fails
defer rows.Close()
args := []interface{}{secondaryID}
for rows.Next() {
var value string
err = rows.Scan(&value)
if err != nil {
return errors.Err(err)
}
args = append(args, value)
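}
// surface any error that ended the iteration early
if err = rows.Err(); err != nil {
return errors.Err(err)
}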
// if no rows found, no need to delete anything
if len(args) < 2 {
return nil
}
query = fmt.Sprintf(
"DELETE FROM %s WHERE %s = %s AND %s IN (%s)",
conflict.table, conflict.objectIdColumn, strmangle.Placeholders(dialect.IndexPlaceholders, 1, 1, 1),
conflictingColumns[0], strmangle.Placeholders(dialect.IndexPlaceholders, len(args)-1, 2, 1),
)
_, err = tx.Exec(query, args...)
if err != nil {
return errors.Err(err)
}
return nil
} |
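To make the placeholder arithmetic in the DELETE statement above easier to follow, here is a minimal sketch that builds the same query shape using only the standard library. `placeholders` is a hypothetical stand-in for `strmangle.Placeholders` restricted to the ungrouped case, and its exact output format is an assumption; `buildDeleteQuery` and the table and column names are likewise invented for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholders is a hypothetical stand-in for strmangle.Placeholders, limited
// to the ungrouped case. It returns count placeholders, numbered from start
// when indexed is true.
func placeholders(indexed bool, count, start int) string {
	parts := make([]string, count)
	for i := range parts {
		if indexed {
			parts[i] = fmt.Sprintf("$%d", start+i)
		} else {
			parts[i] = "?"
		}
	}
	return strings.Join(parts, ",")
}

// buildDeleteQuery mirrors the shape of the DELETE statement in
// deleteOneToManyConflictsBeforeMerge: one placeholder for the secondary ID,
// then one placeholder per conflicting value, starting at index 2.
func buildDeleteQuery(indexed bool, table, idColumn, valueColumn string, numValues int) string {
	return fmt.Sprintf(
		"DELETE FROM %s WHERE %s = %s AND %s IN (%s)",
		table, idColumn, placeholders(indexed, 1, 1),
		valueColumn, placeholders(indexed, numValues, 2),
	)
}
```

The first argument slot always carries the secondary ID, which is why the value placeholders start at 2 and why their count is `len(args)-1` in the code above.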
b0f6d96 to
226fa42
|
@lyoshenka I split it out into the two methods per your suggestion. I also tested the merge for the flagged edge case and it passes. |
lyoshenka
left a comment
i'm having a bit of trouble following some of this. what's an example in lbry where we'd use this now?
also, could you add some tests for this behavior? sqlboiler already has a lot of features for testing. it would be good to keep using them.
finally, comments like //Grab scanned values for query arguments are unnecessary, and may be a sign that the code itself is not readable. please consider rewriting the code so that it reads better instead.
| // used in the delete query. | ||
| colNames, err := rows.Columns() | ||
| if err != nil { | ||
| log.Fatal(err) |
| } | ||
|
|
||
| args := []interface{}{secondaryID} | ||
| //Since we don't don't know if advance how many columns the query returns, we have dynamically assign them to be |
why don't we know in advance how many columns will be returned? isn't it just len(conflictingColumns)?
We do, but we can't predict the columns in the code, so we have to be dynamic. One query might return 1 column, another might return 5. I can't create a generic struct for scanning purposes. From the research I did online, the recommended approach for dynamic scans is this pointer solution.
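The pointer approach mentioned above can be sketched as follows. This is an assumed reconstruction of the general technique, not the PR's actual code, and `newScanTargets` is a hypothetical helper name.

```go
package main

// newScanTargets allocates one value slot per column and returns the slots
// together with a parallel slice of pointers to them. The pointer slice is
// what would be handed to a variadic scan call; the scanned data then shows
// up in the values slice.
func newScanTargets(numColumns int) (values []interface{}, ptrs []interface{}) {
	values = make([]interface{}, numColumns)
	ptrs = make([]interface{}, numColumns)
	for i := range values {
		ptrs[i] = &values[i]
	}
	return values, ptrs
}
```

With `database/sql`, the column count would come from `rows.Columns()`, and each iteration of `rows.Next()` would call `rows.Scan(ptrs...)` and then append the contents of `values` to the query arguments.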
@lyoshenka is this acceptable, or were you just curious? If you approve, I would like to deploy this fix.
|
Feel free to merge, though you should still remove the |
|
Also, would still love tests for this if possible |
|
@tiger5226 should i merge this? |
|
No, I was talking with @tzarebczan and we will need to make changes to the schema. There are certain conflicts where we don't want to delete. For example, youtube data. So in this example we need to change the index to not be unique. |
|
If it's not on github, it didn't happen 😉 What schema changes are you planning to make? I think youtube_data should probably still be unique. All the code Niko wrote depends on that, so it would be a lot of work to change it. I'd prefer allowing Merge to error if there's a conflict, rather than making youtube_data not unique. Then we can handle the merge manually. |
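One way to picture the "let Merge error on conflict" option is a per-table policy. This is purely a hypothetical sketch: `conflictPolicy`, `resolveConflict`, and the policy map are invented for illustration and are not part of the PR or of sqlboiler.

```go
package main

import "fmt"

// conflictPolicy says what a merge should do when both users own a
// conflicting row in a table. Hypothetical; the PR has no such type.
type conflictPolicy int

const (
	deleteSecondary conflictPolicy = iota // drop the secondary user's row
	failMerge                             // abort so the merge is handled manually
)

// resolveConflict returns nil when the secondary row may be deleted and an
// error when the table is configured to block the merge. Tables missing from
// the map fall back to deleteSecondary, the zero value.
func resolveConflict(table string, policies map[string]conflictPolicy) error {
	if policies[table] == failMerge {
		return fmt.Errorf("merge conflict in table %q: resolve manually", table)
	}
	return nil
}
```

Under this sketch, a table such as youtube_data would be mapped to `failMerge`, while tables where automatic cleanup is wanted keep the default.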
|
In my defense I posted that about 2 minutes after we discussed it :) to make sure it was on github haha. I was actually planning on merging it when Tom flagged his concern based on your comment to merge it. |
|
I talked with Tom about this more tonight. This PR removes the need for manual merging, which is what we want for things like stripe. However, youtube data is potentially a special exception. @tzarebczan mentioned that he thinks the youtube data is actually unique per channel, not per user. He plans on double-checking with @nikooo777 tomorrow. |
|
what do you mean by unique per channel and not per user? the youtube_data table is strictly related to the user table, a youtube_data row can only exist if there is a user associated with it, and only one row can exist per user. i don't think we have a concept of channels in our database, do we? |
|
@nikooo777 I will let @tzarebczan answer your question, but based on what you said, what is the expected behavior when we merge 2 users that each have a youtube account? If two records conflict (meaning the uniqueness keys all match), we delete the secondary user's record. The main concern is that we would have two accounts for two users that request to be merged into one. I think we should delete the secondary youtube_data until we support multiple youtube accounts per user. What impact could this have? |
|
I don't think deleting data is acceptable in this case. I'd much rather drop the uniqueness constraint as I had previously proposed in an issue on internal-apis. |
I was referring to this comment. I would prefer that we drop the constraint as well and allow multiple channels. Does this mean that this PR is blocked by that, or can this be merged before that is handled? How common is it that we merge two users with different youtube channels? Also, is there an issue for this? |
|
probably not my call to make, but I'd vote for blocked. I don't agree with ever dropping any youtube_data rows. Opinions @lyoshenka and @kauffj ? |
|
Per @nikooo777 from slack, the issue for multiple youtube accounts is https://github.com/lbryio/internal-apis/issues/323, so this will be blocked by it for now. @alyssaoc just want to make sure this is on your radar. |
|
We have a PR in review https://github.com/lbryio/internal-apis/pull/639. Once this PR is merged we can potentially merge this as well. |
|
@tiger5226 how comfortable do you feel merging this? Should we still run through a few scenarios locally before doing so? |
|
Yes! This can be merged now! I am really confident. Nonetheless, I would still want to retest it since it's a months-old issue.
…s for conflict resolution during merging. split out into two functions.
226fa42 to
29172e9
added support for 1 to 1 relations and added support for n unique keys for conflict resolution during merging.