Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: warehouse backend config refactoring #3602

Merged
merged 40 commits into from
Aug 7, 2023
Merged

Conversation

fracasula
Copy link
Collaborator

@fracasula fracasula commented Jul 7, 2023

Description

Small refactoring for the backend config in the warehouse.

Changes:

  • removed duplicated code
  • decreased number of global variables
  • simplified backend config management
  • increased test coverage
  • better memory management
  • rewrite of the warehouse GRPC tests
  • removed tenant manager from slaves
    • this is initialized only once in the warehouse Setup() if isMaster()

TODOs

  • write tests for new backend config
  • run warehouse tests with -race
  • follow up on new TODO comments added in the code

Notion Ticket

< Notion Link >

Security

  • The code changed/added as part of this pull request won't create any security issues with how the software is being used.

@fracasula fracasula force-pushed the chore.whBackendConfig branch 2 times, most recently from d798c8d to 7bd3c22 Compare July 21, 2023 14:42
@achettyiitr achettyiitr force-pushed the chore.whBackendConfig branch 6 times, most recently from 6301bf8 to 5bbf8d8 Compare August 1, 2023 15:30
Comment on lines -1066 to -1068
var connectionFlags backendconfig.ConnectionFlags
for _, wConfig := range config {
connectionFlags = wConfig.ConnectionFlags // the last connection flags should be enough, since they are all the same in multi-workspace environments
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are already doing this in bcManager, no need do it here.

Comment on lines -1093 to -1097
if val, ok := connectionFlags.Services["warehouse"]; ok {
if UploadAPI.connectionManager != nil {
UploadAPI.connectionManager.Apply(connectionFlags.URL, val)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are already doing this in bcManager, no need do it here.

@@ -193,173 +190,3 @@ func TestUploadJob_ProcessingStats(t *testing.T) {
})
}
}

func Test_GetNamespace(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is moved to bcManager.

@cisse21 cisse21 merged commit e48c98e into master Aug 7, 2023
37 checks passed
@cisse21 cisse21 deleted the chore.whBackendConfig branch August 7, 2023 09:26
bcm.internalControlPlaneClient = cpclient.NewInternalClientWithCache(
backendconfig.GetConfigBackendURL(),
c.GetString("CONFIG_BACKEND_URL", "https://api.rudderstack.com"),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we keep duplicating the variable name and its default? Isn't GetConfigBackendURL good enough?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid calling global function for backendconfig was difficult to test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, because you replaced the bc with the tenant manager. Let's keep it this way for now then 👍

Comment on lines +92 to +94
case data, ok := <-ch:
if !ok {
return
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achettyiitr nice catch 👍


namespace, err := s.schema.GetNamespace(ctx, source.ID, destination.ID)
if err != nil {
s.logger.Errorw("getting namespace",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achettyiitr we don't have enough context here, right? We should at least log the error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Let me make the changes in a separate PR.

for _, tc := range testcases {
tc := tc

t.Run("should return namespace", func(t *testing.T) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achettyiitr How come the test name is always the same? Shall we include a name for the test in the test case itself? Something that would describe the type of test in order to make it self-explanatory would be nice.

@@ -1447,12 +1405,23 @@ func Start(ctx context.Context, app app.App) error {
}
}()

g, ctx := errgroup.WithContext(ctx)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achettyiitr it feels like we don't need a group here since none of the two routines are returning an error. also, we're overriding the context with the group one which could cause confusion in the code after the group that might be using the context. wdyt?

Comment on lines 108 to 112
s.warehousesMu.Lock()
if len(s.warehouses) > 0 {
ch <- s.warehouses
}
s.warehousesMu.Unlock()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achettyiitr this version was better, now we're holding the lock for the duration of the whole subscribe. I think we should put this code back.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason, we are adding this is because there is a race condition happening if we are not subscribers and we get backed config. The subscriber will not get anything until the next time we get the backend config changes
So, during the subscription also, we are sending the last copy of the data.

Comment on lines +258 to +260
warehouses = lo.Filter(warehouses, func(warehouse model.Warehouse, _ int) bool {
return warehouse.Destination.DestinationDefinition.Name == wh.destType
})
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achettyiitr we could also do this to save ourselves the next for loop:

warehouses = lo.Filter(warehouses, func(warehouse model.Warehouse, _ int) bool {
	if warehouse.Destination.DestinationDefinition.Name != wh.destType {
		return false
	}

	if warehouseutils.IDResolutionEnabled() &&
		slices.Contains(warehouseutils.IdentityEnabledWarehouses, wh.destType) {
		wh.setupIdentityTables(ctx, warehouse)
		if shouldPopulateHistoricIdentities && warehouse.Destination.Enabled {
			// non-blocking populate historic identities
			wh.populateHistoricIdentities(ctx, warehouse)
		}
	}

	return true
})

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants