Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(warehouse): temp table support for postgres #2964

Merged
merged 46 commits into from
Mar 9, 2023

Conversation

achettyiitr
Copy link
Member

@achettyiitr achettyiitr commented Feb 9, 2023

Description

Temp table support

  • Temp staging tables support for POSTGRES. It will make a lot of our pipeline simpler by not adding extra handling for CrashRecoveries, Dropping, etc.
  • Added separate components for LoadTable, LoadUserTable, and Diagnostic.
  • Added support for observability around loading tables for POSTGRES.
  • Make dedup optional based on destinationIDs.

Additional changes

  • Added a separate component called loadfile.Downloader. This will download the load files for a specific table during an upload.
    • number of workers configured
  • Deepsource recommendations
    • use %q instead of "%s" for quoted strings
    • Unnecessary block

Notion Ticket

https://www.notion.so/rudderstacks/Use-temp-tables-for-POSTGRES-286d6889c2e744d78f6297110054b551?pvs=4

Security

  • The code changed/added as part of this pull request won't create any security issues with how the software is being used.

@codecov
Copy link

codecov bot commented Feb 9, 2023

Codecov Report

Patch coverage: 45.62% and project coverage change: +0.27 🎉

Comparison is base (1d9f63b) 53.43% compared to head (7f5611b) 53.71%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2964      +/-   ##
==========================================
+ Coverage   53.43%   53.71%   +0.27%     
==========================================
  Files         342      346       +4     
  Lines       52747    53614     +867     
==========================================
+ Hits        28185    28797     +612     
- Misses      22947    23181     +234     
- Partials     1615     1636      +21     
Impacted Files Coverage Δ
utils/misc/misc.go 51.37% <ø> (-0.49%) ⬇️
warehouse/identity/identity.go 0.51% <0.00%> (+0.06%) ⬆️
...ehouse/integrations/azure-synapse/azure-synapse.go 0.31% <0.00%> (+0.02%) ⬆️
warehouse/integrations/postgres/postgres.go 0.53% <0.00%> (-28.10%) ⬇️
warehouse/internal/service/recovery.go 100.00% <ø> (ø)
warehouse/upload.go 23.58% <0.00%> (-0.02%) ⬇️
warehouse/integrations/mssql/mssql.go 1.62% <6.20%> (+0.66%) ⬆️
warehouse/integrations/postgres-legacy/postgres.go 28.62% <28.62%> (ø)
warehouse/integrations/manager/manager.go 51.19% <54.16%> (-3.10%) ⬇️
...nternal/service/loadfiles/downloader/downloader.go 82.55% <82.55%> (ø)
... and 17 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@achettyiitr achettyiitr marked this pull request as draft February 10, 2023 11:31
@@ -135,7 +149,7 @@ func connect(cred credentialsT) (*sql.DB, error) {
Host: net.JoinHostPort(cred.host, strconv.Itoa(port)),
RawQuery: query.Encode(),
}
pkgLogger.Debugf("synapse connection string : %s", connUrl.String())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This contains sensitive information like passwords and access keys. We should probably not log it.

@achettyiitr achettyiitr marked this pull request as ready for review February 25, 2023 09:02
@@ -151,7 +163,6 @@ func Connect(cred CredentialsT) (*sql.DB, error) {
Host: net.JoinHostPort(cred.Host, strconv.Itoa(port)),
RawQuery: query.Encode(),
}
pkgLogger.Debugf("mssql connection string : %s", connUrl.String())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This contains sensitive information like Passwords, we should not probably log it.

@achettyiitr achettyiitr removed the request for review from abhimanyubabbar February 28, 2023 13:55
@@ -0,0 +1,132 @@
package load_file_downloader
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can create a package for all things associated with loadfiles

Suggested change
package load_file_downloader
package loadfile

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then we can have Downloader inside this file.

fileNamesLock sync.RWMutex
)

objects := l.uploader.GetLoadFilesMetadata(warehouseutils.GetLoadFilesOptionsT{Table: tableName})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we download all the files associated with an upload table ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We download only the once corresponding to that table.


// Creating staging table
sortedColumnKeys = warehouseutils.SortColumnKeysFromColumnMap(tableSchemaInUpload)
stagingTableName = warehouseutils.StagingTableName(provider, tableName, tableNameLimit)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flag for a temporary table.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are now having legacy implementation back driven by config Warehouse.postgres.useLegacy set to true by default.

@achettyiitr achettyiitr requested a review from lvrach March 1, 2023 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants