Bug: Flaky integration test for multi-shard aggregation #1884

Closed
etiennedi opened this issue Mar 29, 2022 · 2 comments · Fixed by #1885
Labels: bug, Technical Debt

Comments

@etiennedi
Member

--- FAIL: Test_Aggregations_MultiShard (1.04s)
    --- FAIL: Test_Aggregations_MultiShard/numerical_aggregations_without_grouping_(formerly_Meta) (0.02s)
        --- FAIL: Test_Aggregations_MultiShard/numerical_aggregations_without_grouping_(formerly_Meta)/multiple_fields,_multiple_aggregators (0.01s)
            --- FAIL: Test_Aggregations_MultiShard/numerical_aggregations_without_grouping_(formerly_Meta)/multiple_fields,_multiple_aggregators/text_fields_(sector) (0.00s)
                aggregations_integration_test.go:1450: 
                    	Error Trace:	aggregations_integration_test.go:1450
                    	Error:      	elements differ
                    	            	
                    	            	extra elements in list A:
                    	            	([]interface {}) (len=1) {
                    	            	 (aggregation.TextOccurrence) {
                    	            	  Value: (string) (len=4) "Food",
                    	            	  Occurs: (int) 60
                    	            	 }
                    	            	}
                    	            	
                    	            	
                    	            	extra elements in list B:
                    	            	([]interface {}) (len=2) {
                    	            	 (aggregation.TextOccurrence) {
                    	            	  Value: (string) (len=4) "Food",
                    	            	  Occurs: (int) 52
                    	            	 },
                    	            	 (aggregation.TextOccurrence) {
                    	            	  Value: (string) (len=10) "Financials",
                    	            	  Occurs: (int) 9
                    	            	 }
                    	            	}
                    	            	
                    	            	
                    	            	listA:
                    	            	([]aggregation.TextOccurrence) (len=1) {
                    	            	 (aggregation.TextOccurrence) {
                    	            	  Value: (string) (len=4) "Food",
                    	            	  Occurs: (int) 60
                    	            	 }
                    	            	}
                    	            	
                    	            	
                    	            	listB:
                    	            	([]aggregation.TextOccurrence) (len=2) {
                    	            	 (aggregation.TextOccurrence) {
                    	            	  Value: (string) (len=4) "Food",
                    	            	  Occurs: (int) 52
                    	            	 },
                    	            	 (aggregation.TextOccurrence) {
                    	            	  Value: (string) (len=10) "Financials",
                    	            	  Occurs: (int) 9
                    	            	 }
                    	            	}
                    	Test:       	Test_Aggregations_MultiShard/numerical_aggregations_without_grouping_(formerly_Meta)/multiple_fields,_multiple_aggregators/text_fields_(sector)
@etiennedi
Member Author

I think I know what’s causing the flakiness: in this particular test, we set the limit of the text aggregation to 1. This means each shard returns only its single most common value for this field. In a “normal”, non-flaky run we end up with this:

  • Shard 1: “Food” (15 matches out of 30)
  • Shard 2: “Food” (26 matches out of 30)
  • Shard 3: “Food” (19 matches out of 30)

This is aggregated to “Food” (60 matches out of 90), which is also what we expect in the tests.

However, on a flaky run what we see is this:

  • Shard 1: “Financials” (15 matches out of 30)
  • Shard 2: “Food” (26 matches out of 30)
  • Shard 3: “Food” (19 matches out of 30)

This is aggregated to:

  • “Food” (45 out of 90)
  • “Financials” (15 out of 90)
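The merge step described above can be reproduced with a small standalone sketch. It is not the actual merge code: `mergeShardResults` is a hypothetical helper, and `TextOccurrence` is redefined locally with the two fields visible in the test output. The per-shard inputs are the numbers quoted in this comment.

```go
package main

import (
	"fmt"
	"sort"
)

// TextOccurrence mirrors the fields seen in the test output (Value, Occurs).
// It is redefined here so the sketch is self-contained.
type TextOccurrence struct {
	Value  string
	Occurs int
}

// mergeShardResults is a hypothetical helper: it sums the occurrences that
// each shard reported per value and orders the result by count, highest
// first (ties ordered by value so the output is stable).
func mergeShardResults(shards [][]TextOccurrence) []TextOccurrence {
	totals := map[string]int{}
	for _, shard := range shards {
		for _, occ := range shard {
			totals[occ.Value] += occ.Occurs
		}
	}
	merged := make([]TextOccurrence, 0, len(totals))
	for value, count := range totals {
		merged = append(merged, TextOccurrence{Value: value, Occurs: count})
	}
	sort.Slice(merged, func(i, j int) bool {
		if merged[i].Occurs != merged[j].Occurs {
			return merged[i].Occurs > merged[j].Occurs
		}
		return merged[i].Value < merged[j].Value
	})
	return merged
}

func main() {
	// Each shard returns only its top 1 value (limit = 1).
	stable := [][]TextOccurrence{
		{{"Food", 15}}, {{"Food", 26}}, {{"Food", 19}},
	}
	flaky := [][]TextOccurrence{
		{{"Financials", 15}}, {{"Food", 26}}, {{"Food", 19}},
	}
	fmt.Println(mergeShardResults(stable)) // [{Food 60}]
	fmt.Println(mergeShardResults(flaky))  // [{Food 45} {Financials 15}]
}
```

Because every shard truncates to its own top 1 before merging, whatever a shard drops is lost to the final result. That is why a single shard switching its answer changes both the merged count and the number of merged elements.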

So what’s happening here? It seems shard 1 owns exactly 15 “Food” and 15 “Financials”. So there is no real top 1 group, and which one we get back is random. I’ll look for ways to make this more deterministic. Maybe it’s enough to reorder the ids in a way that we never end up with this 15/15 split on a single shard.
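Besides reordering the ids, another general way to remove this kind of randomness is an explicit tie-break when picking the top values on a shard. The sketch below illustrates that alternative only; it is not what this issue ended up doing, and `topK` and the alphabetical tie-break rule are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// TextOccurrence mirrors the fields seen in the test output (Value, Occurs).
type TextOccurrence struct {
	Value  string
	Occurs int
}

// topK is a hypothetical helper that returns the k most frequent values.
// Ties on the count are broken alphabetically by value, so the result does
// not depend on the order in which the counts are visited.
func topK(counts map[string]int, k int) []TextOccurrence {
	all := make([]TextOccurrence, 0, len(counts))
	for value, count := range counts {
		all = append(all, TextOccurrence{Value: value, Occurs: count})
	}
	sort.Slice(all, func(i, j int) bool {
		if all[i].Occurs != all[j].Occurs {
			return all[i].Occurs > all[j].Occurs
		}
		return all[i].Value < all[j].Value
	})
	if k < 0 {
		k = 0
	}
	if k < len(all) {
		all = all[:k]
	}
	return all
}

func main() {
	// The 15/15 split suspected on shard 1.
	shard1 := map[string]int{"Food": 15, "Financials": 15}
	fmt.Println(topK(shard1, 1)) // [{Financials 15}]
}
```

Note that the tie-break rule itself is arbitrary. With an alphabetical rule the shard would consistently return “Financials”, so the test expectation would have to be adjusted to match; the gain is only that the outcome is the same on every run.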

@etiennedi
Member Author

I’ve now changed the (fixed) ids so that the “Food” matches are distributed as 16 / 26 / 18. The first shard now always has a clear majority (16 vs. 15). The per-shard totals are now 31, 30, and 29.
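A fixture like this can be guarded against regressions with a small check that no shard has a tie for its top value. This is a minimal sketch: `hasTopTie` is a hypothetical helper, and the per-value counts for shard 1 are taken from the “15/15” and “16 vs. 15” figures in this thread.

```go
package main

import "fmt"

// hasTopTie is a hypothetical helper. It reports whether the highest count
// is shared by more than one value, i.e. whether a top-1 result for this
// shard would be ambiguous.
func hasTopTie(counts map[string]int) bool {
	best, holders := -1, 0
	for _, count := range counts {
		switch {
		case count > best:
			best, holders = count, 1
		case count == best:
			holders++
		}
	}
	return holders > 1
}

func main() {
	// Shard 1 before the change: 15 "Food" and 15 "Financials".
	before := map[string]int{"Food": 15, "Financials": 15}
	// Shard 1 after the change: 16 vs. 15 out of 31 objects.
	after := map[string]int{"Food": 16, "Financials": 15}

	fmt.Println(hasTopTie(before), hasTopTie(after)) // true false
}
```

The same check generalizes to a limit of k by comparing the k-th and (k+1)-th highest counts, since any tie across that boundary makes the truncated result ambiguous.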

etiennedi added the bug and Technical Debt labels Mar 29, 2022
etiennedi added a commit that referenced this issue Mar 29, 2022
See #1884 for a detailed explanation of what was going on. Fixes #1884.
antas-marcin added a commit that referenced this issue Mar 29, 2022
gh-1884 WEAVIATE-70 fix flaky multi-shard integration test