Skip to content

Conversation

@komamitsu
Copy link
Contributor

@komamitsu komamitsu commented Jun 18, 2024

Description

After merging #1909, we observed the integration test often failed with the Cosmos DB emulator due to failing to create a namespace.

admin.createNamespace(getNamespaceName(firstColumnType), true, options);

I noticed CosmosCrossPartitionScanIntegrationTest used the default thread number (10). So, this PR limits it to 1 as well as other tests.

Related issues and/or PRs

Changes made

  • Reduce the concurrency for DDL operations in CosmosCrossPartitionScanIntegrationTest from 10 to 3

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

  • I first tried to reduce the concurrency to 3. But it still failed, and I changed it to 1.
  • If this change makes the integration test with Cosmos DB emulator more stable, we may be able to increase maxParallelForks to 2 or 3 later.

Release notes

N/A

@komamitsu komamitsu self-assigned this Jun 18, 2024
@komamitsu komamitsu added the github_actions Pull requests that update GitHub Actions code label Jun 18, 2024
@komamitsu komamitsu changed the title Reduce the number of concurrent DDLs to 3 for Cosmos DB in the integration tests Reduce the number of concurrent DDLs from 10 to 1 for Cosmos DB in the integration tests Jun 18, 2024
Copy link
Contributor

@feeblefakie feeblefakie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

Copy link
Contributor

@Torch3333 Torch3333 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you.

}

@Override
protected int getThreadNum() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we override the isParallelDdlSupported() method instead of this? Or we might be able to remove the isParallelDdlSupported() method and use getThreadNum() instead for the sake of consistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brfrn169 Good point! Actually I noticed the same and created a branch to remove executeDdls() and isParallelDdlSupported() a17d246.

I'll create a PR with the change for only branches 3 and master that contain those unnecessary methods (reducing concurrency for Cosmos DB integration test is needed for all the release and support branches)

@brfrn169
Copy link
Collaborator

@komamitsu Also, maybe, we can override isParallelDdlSupported() (and make it return false) in CosmosMultiplePartitionKeyIntegrationTest and CosmosMultipleClusteringKeyScanIntegrationTest just in case?

@komamitsu komamitsu requested a review from brfrn169 June 19, 2024 01:37
@komamitsu
Copy link
Contributor Author

we can override isParallelDdlSupported()

@brfrn169 I'll remove the method later as a separate PR since the method doesn't exist in older branches than 3, while reducing the concurrency addressed with this PR is needed in all the release and support branches (#1920 (comment)).

@brfrn169
Copy link
Collaborator

@komamitsu Sorry, I checked the source code again. It seems like isParallelDdlSupported() is still necessary because the thread pool is used not only for DDLs but also the test execution. So maybe, in this PR, I think we can override isParallelDdlSupported() and make it return false in CosmosCrossPartitionScanIntegrationTest, CosmosMultiplePartitionKeyIntegrationTest and CosmosMultipleClusteringKeyScanIntegrationTest. What do you think?

@komamitsu
Copy link
Contributor Author

@brfrn169 Ah, you're right. I didn't notice the executorService was also used for DML operations. But, the older branches than branch 3 don't have execDdls() and isParallelDdlSupported(), so we need to port the methods to the old branches...?

I'm wondering what is the purpose of concurrently issuing the DML operations in the test class. If it's to reduce the integration test duration, I noticed the total duration wasn't increased with a single thread and we don't need execDdls() and isParallelDdlSupported() by leveraging on getThreadNum():

With 10 threads:

If the concurrent DML operations are needed to see if things work fine even under multi-threads, we need to separate the concurrencies for DDL and DML by introducing execDdls() and isParallelDdlSupported().

@komamitsu
Copy link
Contributor Author

I noticed the total duration wasn't increased with a single thread and we don't need execDdls() and isParallelDdlSupported() by leveraging on getThreadNum():

However, separating the concurrencies for DDL and DML might be always helpful for flexible tuning. Let me know if you think we should add execDdls() and isParallelDdlSupported() in all the release and supported branches.

@brfrn169
Copy link
Collaborator

@komamitsu

I'm wondering what is the purpose of concurrently issuing the DML operations in the test class.

Actually, I don't remember the reasons.

If it's to reduce the integration test duration, I noticed the total duration wasn't increased with a single thread and we don't need execDdls() and isParallelDdlSupported() by leveraging on getThreadNum():

If it doesn't affect the integration test duration for all the storages, we can execute the DDLs in a single thread. Could you please check that? Thanks.

@komamitsu
Copy link
Contributor Author

@brfrn169

If it doesn't affect the integration test duration for all the storages, we can execute the DDLs in a single thread. Could you please check that? Thanks.

Oh, I didn't talked about other storages at all, but only Cosmos DB. My understanding is as follows:

  • Configuring getThreadNum() affects the concurrencies of DDL and DML operations
  • For the Cosmos DB integration test, making getThreadNum() return 1 seemed to mitigate the namespace failure issue and the total duration didn't increase. This is the quickest solution for the Cosmos DB integration test. (I guess the bottleneck of the test cases is Cosmos DB and increasing the client side concurrency doesn't help much)
  • For other storage integration test, whether concurrent DDL reduces the duration depends on underlying database/storage. The configurable concurrent DDL should be kept as is

Okay. I can port execDdls() and isParallelDdlSupported() to the older branches than 3, and then mitigate the namespace creation failure issue by isParallelDdlSupported(). Do you want me to do that?

@brfrn169
Copy link
Collaborator

@komamitsu

Okay. I can port execDdls() and isParallelDdlSupported() to the older branches than 3, and then mitigate the namespace creation failure issue by isParallelDdlSupported(). Do you want me to do that?

Sounds good. Thank you!

via isParallelDdlSupported() instead of getThreadNum()
Copy link
Collaborator

@brfrn169 brfrn169 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

feeblefakie pushed a commit that referenced this pull request Jun 19, 2024
…e integration tests (#1920)

Co-authored-by: Toshihiro Suzuki <brfrn169@gmail.com>
komamitsu added a commit that referenced this pull request Jun 19, 2024
…e integration tests (#1920)

Co-authored-by: Toshihiro Suzuki <brfrn169@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

github_actions Pull requests that update GitHub Actions code improvement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants