Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mssql support #592

Merged
merged 12 commits into from Jan 16, 2024
Merged

Add mssql support #592

merged 12 commits into from Jan 16, 2024

Conversation

ddadaal
Copy link
Contributor

@ddadaal ddadaal commented Jan 10, 2024

image

Fixes #591.

Notes on MSSQL:

  • no TRUE or FALSE literals in mssql.
    • Use 1 and 0 instead
  • cannot give an alias to the table when update: in update table as t set xxx = xxx, as t is invalid
  • mssql doesn't allow circular cascade paths on foreign keys (see this stackoverflow answer)
  • some words are not allowed as identifiers (like users)
    • Add a utils.QuoteDbIdentifiers and utils.QuoteSql functions to quote identifiers in SQLs in any dialectors
  • use datetimeoffset instead of timestamp for datetime when modeling.
    • Use CustomTime for datetime properties/columns and don't specify type using tag. CustomTime will choose correct type for all dbs
  • no join using
    • use regular join on instead

Tested functionalities:

  • Login and Register
  • Collect data from wakatime plugins
  • Dashboard
  • Projects
  • Leaderboards
  • Email (not data yet)
  • Integration with wakatime
    • sync to wakatime.com
    • import data (import can start, but no data can be downloaded. error is Get "https://api.wakatime.com/api/v1/users/current/data_dumps": unexpected EOF. seems not related to db.

Copy link
Owner

@muety muety left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this contribution and for diving into the depths of all of this! 🙌 I did a bunch of quick tests myself with MSSQL and from what I found, everything is working like a charm!

Only when running the API tests against SQLite, I'm getting a bunch of errors, including 500 responses for the metrics- and WakaTime projects-endpoints.

utils/db.go Show resolved Hide resolved
repositories/heartbeat.go Outdated Show resolved Hide resolved
models/summary.go Outdated Show resolved Hide resolved
models/shared.go Show resolved Hide resolved
repositories/user.go Outdated Show resolved Hide resolved
repositories/heartbeat.go Outdated Show resolved Hide resolved
@muety
Copy link
Owner

muety commented Jan 12, 2024

Let me know when you want me to go for another round of review!

@ddadaal
Copy link
Contributor Author

ddadaal commented Jan 13, 2024

Let me know when you want me to go for another round of review!

Thanks for your review. Definitely! I focused only on the mssql functionalities in the last few days. I will try with other dbs and fix the problems. Some problems seem to be pretty annoying, especially the foreign key constraint one. It may take a few days to fix all of these.

@ddadaal
Copy link
Contributor Author

ddadaal commented Jan 14, 2024

Hello, I think it's now ready for another review.

@ddadaal ddadaal marked this pull request as ready for review January 14, 2024 10:26
@ddadaal ddadaal requested a review from muety January 15, 2024 01:18
@muety
Copy link
Owner

muety commented Jan 15, 2024

Very cool, thank you so much! Only found one last issue.

Every heartbeat has a hash property that is used for detecting duplicates. The heartbeats table has a unique index defined on the according column and in the repository, we effectively set ON CONFLICT DO NOTHING (see here) when inserting new heartbeats. This causes duplicate heartbeats to be silently ignored and prevents from clutter in the database. However, this mechanism doesn't seem to work with MSSQL.

Try inserting the same heartbeat twice in a row:

curl --location 'http://localhost:3000/api/heartbeat' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <YOUR-BASE64-HASHED-API-KEY>' \
--data '[{
    "entity": "/home/user1/dev/project1/main.go",
    "project": "wakapi",
    "language": "Go",
    "is_write": true,
    "type": "file",
    "category": null,
    "branch": null,
    "time": 1705342578.955
}]'

Both requests should result in a 201 response status, while only one heartbeat entry is actually inserted to the database. When running with MSSQL, I'm getting a 500 error for the second request, though, alongside the following error message:

[ERROR] failed to batch-insert heartbeats - mssql: Cannot insert duplicate key row in object 'dbo.heartbeats' with unique index 'idx_heartbeats_hash'. The duplicate key value is (2234263491a774d0)

Any ideas how to work around this?

@ddadaal
Copy link
Contributor Author

ddadaal commented Jan 16, 2024

Very cool, thank you so much! Only found one last issue.

Every heartbeat has a hash property that is used for detecting duplicates. The heartbeats table has a unique index defined on the according column and in the repository, we effectively set ON CONFLICT DO NOTHING (see here) when inserting new heartbeats. This causes duplicate heartbeats to be silently ignored and prevents from clutter in the database. However, this mechanism doesn't seem to work with MSSQL.

Try inserting the same heartbeat twice in a row:

curl --location 'http://localhost:3000/api/heartbeat' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <YOUR-BASE64-HASHED-API-KEY>' \
--data '[{
    "entity": "/home/user1/dev/project1/main.go",
    "project": "wakapi",
    "language": "Go",
    "is_write": true,
    "type": "file",
    "category": null,
    "branch": null,
    "time": 1705342578.955
}]'

Both requests should result in a 201 response status, while only one heartbeat entry is actually inserted to the database. When running with MSSQL, I'm getting a 500 error for the second request, though, alongside the following error message:

[ERROR] failed to batch-insert heartbeats - mssql: Cannot insert duplicate key row in object 'dbo.heartbeats' with unique index 'idx_heartbeats_hash'. The duplicate key value is (2234263491a774d0)

Any ideas how to work around this?

It is because of a bug in sqlserver dialector. An issue for it has been opened for a long time, but have not got response from gorm authors yet. I tried to fix it (attempted fix commit) but it didn't work, and the problem seems too complicated.

The root cause is sqlserver doesn't support on conflict clause, so the dialector developers use merge into to emulate the behavior. The merge into approach needs to explicitly write what columns might get conflict in the final sql, but the code uses primary keys, not unique columns.

Current workaround is that all heartbeats will be inserted one by one, and errors caused by duplicate hash will be ignored. It guarantees correctness but impacts performance. Considering most of time only one heartbeat will be inserted at once, I think the performance penalty should be acceptable.

Another alternative approach is to write raw sql just for mssql. Although there are a lot of ways to emulate the behavior (merge into is the most common one), all of them are so complex that I think it is not worth it.

@muety
Copy link
Owner

muety commented Jan 16, 2024

Thanks a lot for the detailed explanation and for the fix. I think it's very reasonable as a workaround. Again: 🙏 🙌.

@muety muety merged commit d9aefdd into muety:master Jan 16, 2024
6 of 7 checks passed
@muety
Copy link
Owner

muety commented Jan 16, 2024

I'll let the code run in production for a short while and push out a new release soon if no errors pop up (likely tomorrow or so).

models/user.go Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature request] Support SQL Server as database
2 participants