@kota2and3kan kota2and3kan commented Feb 29, 2024

Description

This PR updates create-no-superuser-sqlserver.sh, the script that creates the "no superuser" user for the SQL Server integration test.

The CI failed intermittently in the following ways.

  • The sqlcmd command failed with the error Login timeout expired.
  • The Create no superuser step hung and was killed by the GitHub Actions timeout (360 min by default).
  • The integration test for Japanese words failed with garbled characters.

However, each failure disappeared on retry without any code changes, so the test appears to be flaky.

To address and debug these issues, we discussed them and decided to update the CI as follows.

  • The sqlcmd command failed with the error Login timeout expired.
    • In the current CI, we run the sqlcmd command right after the SQL Server container starts, so we suspect that the SQL Server process has not finished starting in the container at that point.
    • So, as the first step of the script, we sleep and then check whether SELECT 1 can run.
    • We try SELECT 1 several times; once it succeeds, we continue to the next steps.
  • The Create no superuser step hung and was killed by the GitHub Actions timeout (360 min by default).
    • The root cause has not been found.
    • So, as a workaround, I set the timeout of the Create no superuser step to 1 min instead of the default 360 min.
    • When the CI succeeds, the Create no superuser step takes about 10 seconds, and this update configures up to 30 sec of retry waiting (3 sec * 10 times). So I think 1 min is an appropriate timeout.
  • The integration test for Japanese words failed with garbled characters.
    • In the current CI, we set the default collation via an environment variable. However, we suspect that the default collation is not applied to test_db for some reason.
    • So, we set the collation explicitly when we create test_db, using the SQL CREATE DATABASE test_db COLLATE Japanese_BIN2.
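
The timeout reasoning above can be checked with a little arithmetic. The following is a minimal sketch, not part of this PR: retry_budget is a hypothetical helper, and its optional third argument (the time a single failed attempt takes) is an assumption added for illustration. With the values from this PR, the sleep time alone adds up to 30 seconds; if a failed attempt itself blocks for a while, the worst case grows accordingly.

```shell
#!/bin/bash
set -u

# retry_budget MAX_RETRY_COUNT RETRY_INTERVAL [ATTEMPT_SECONDS]
# Hypothetical helper: prints the worst-case number of seconds spent
# before the retry loop gives up.
# ATTEMPT_SECONDS is an assumed value: the time one failed attempt takes
# (0 means the command fails immediately).
retry_budget() {
  local max_retry_count=$1 retry_interval=$2 attempt_seconds=${3:-0}
  echo $(( max_retry_count * (retry_interval + attempt_seconds) ))
}

# Values from this PR: 10 retries, 3-second interval.
echo "sleep only: $(retry_budget 10 3)s"
# Assumption for illustration: each failed attempt blocks for 5 seconds.
echo "with 5s per failed attempt: $(retry_budget 10 3 5)s"
```

If failed attempts turn out to block for several seconds each, the 1 min step timeout could fire before all 10 retries are used, which may be worth keeping in mind when tuning these values.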

Please take a look!

Related issues and/or PRs

Changes made

  • Implement a connection check (SELECT 1) and a retry feature in create-no-superuser-sqlserver.sh.
  • Add echo commands to help debug the issues.
  • Specify the database collation explicitly when creating test_db, which we use for the integration test.
  • Update CI to use the new script.
  • Set the timeout configuration of GitHub Actions.

Checklist

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

I confirmed that this script works in my local environment, as follows.

  • Success pattern
    $ ./ci/no-superuser/create-no-superuser-sqlserver.sh sqlserver22 SqlServer22 10 3
    INFO: Creating no superuser start.
    INFO: Sleep 3 seconds to wait for SQL Server start.
    INFO: Retry count: 0
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 1
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 2
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 3
    
    -----------
              1
    
    (1 rows affected)
    INFO: sqlcmd command succeeded. Continue creating no superuser.
    INFO: Create login start
    INFO: Create login end
    INFO: Create database start
    INFO: Create database end
    INFO: Create no_superuser start
    INFO: Create no_superuser end
    INFO: Add role db_ddladmin start
    INFO: Add role db_ddladmin end
    INFO: Add role db_datawriter start
    INFO: Add role db_datawriter end
    INFO: Add role db_datareader start
    INFO: Add role db_datareader end
    INFO: SELECT @@version start
    name collation_name
    ---- --------------
    master SQL_Latin1_General_CP1_CI_AS
    tempdb SQL_Latin1_General_CP1_CI_AS
    model SQL_Latin1_General_CP1_CI_AS
    msdb SQL_Latin1_General_CP1_CI_AS
    test_db Japanese_BIN2
    
    (5 rows affected)
    INFO: SELECT @@version end
    INFO: SELECT @@version start
                                                                                                                                                                                                                      
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Microsoft SQL Server 2022 (RTM-CU11) (KB5032679) - 16.0.4105.2 (X64)
            Nov 14 2023 18:33:19
            Copyright (C) 2022 Microsoft Corporation
            Express Edition (64-bit) on Linux (Ubuntu 22.04.3 LTS) <X64>                                     
    
    (1 rows affected)
    INFO: SELECT @@version end
    INFO: Creating no superuser succeeded.
  • Error pattern
    $ ./ci/no-superuser/create-no-superuser-sqlserver.sh sqlserver22 SqlServer22 10 3
    INFO: Creating no superuser start.
    INFO: Sleep 3 seconds to wait for SQL Server start.
    INFO: Retry count: 0
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 1
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 2
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 3
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 4
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 5
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 6
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 7
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 8
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    INFO: Retry count: 9
    Error response from daemon: No such container: sqlserver22
    INFO: sqlcmd command failed. Will retry after 3 seconds.
    ERROR: sqlcmd command failed 10 times. Please check your configuration.

Also, I ran the CI (SQL Server integration test) 25 times, and the issue was not reproduced. So, I think the updates in this PR may fix the issue.
https://github.com/scalar-labs/scalardb/actions/runs/8182728411

Note: Attempt 3 failed. After discussing it, we found the root cause: it stems from the SQL Server specification (how it treats byte data) and is not related to this PR. This is a very rare case, so we will address it separately.
https://github.com/scalar-labs/scalardb/actions/runs/8182728411/attempts/3

Release notes

N/A

@kota2and3kan kota2and3kan added the bugfix and github_actions labels Feb 29, 2024
@kota2and3kan kota2and3kan self-assigned this Feb 29, 2024
@kota2and3kan
Contributor Author

We discussed this and decided to keep the CI simple for as long as possible.
So, I am closing this PR. I will fix only the username (the root cause) in another PR.

@kota2and3kan kota2and3kan deleted the fix-ci-issue-sqlserver branch February 29, 2024 04:33
@kota2and3kan kota2and3kan restored the fix-ci-issue-sqlserver branch March 6, 2024 08:52
@kota2and3kan
Contributor Author

The SQL Server integration test has failed several more times. We discussed it and decided to update the CI to resolve the issue and to help with further debugging. So, I am re-opening this PR.

@kota2and3kan kota2and3kan reopened this Mar 6, 2024
@kota2and3kan kota2and3kan changed the title from Add connection check and retry process to script that creates "no superuser" to Update CI to fix issues of SQL Server integration test Mar 6, 2024
MSSQL_PID: "Express"
SA_PASSWORD: "SqlServer17"
ACCEPT_EULA: "Y"
MSSQL_COLLATION: "Japanese_BIN2"
Contributor Author

I set the collation with CREATE DATABASE in the script, so we can remove this configuration from here.

Comment on lines +612 to +613
run: ./ci/no-superuser/create-no-superuser-sqlserver.sh sqlserver17 SqlServer17 10 3
timeout-minutes: 1
Contributor Author

I set the retry count to 10 and the retry interval to 3 sec. We can reconsider these values in the future if CI errors occur frequently.

Also, I set the timeout of this step to 1 min. When the CI succeeds, this step takes about 10 seconds.

@@ -1,23 +1,79 @@
#!/bin/bash
-set -eu
+set -u
Contributor Author

I removed the set -e option so that the script continues (and retries) even after SELECT 1 fails.
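
As a side note, removing set -e is one way to let the retry run; another is to test the command directly in an if condition, where a failure does not trigger set -e. The following is a minimal sketch of that alternative, not the approach taken in this PR; check_once is a hypothetical helper name.

```shell
#!/bin/bash
set -eu

# check_once runs a command inside an `if` condition. A failure there does
# not trigger `set -e`, so the caller can decide whether to retry.
check_once() {
  if "$@"; then
    echo "succeeded"
  else
    echo "failed"
  fi
}

check_once false   # prints "failed"; the script keeps running
check_once true    # prints "succeeded"
echo "still running"
```

With this pattern the rest of the script could keep set -e, so that unexpected failures in later steps (for example, CREATE LOGIN) still stop the run.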

Comment on lines +18 to +39
while [[ ${COUNT} -lt ${MAX_RETRY_COUNT} ]]
do
  sleep ${RETRY_INTERVAL}

  echo "INFO: Retry count: ${COUNT}"

  docker exec -t ${SQL_SERVER_CONTAINER_NAME} /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P ${SQL_SERVER_PASSWORD} -d master -Q "SELECT 1"
  if [[ $? -eq 0 ]]; then
    break
  else
    echo "INFO: sqlcmd command failed. Will retry after ${RETRY_INTERVAL} seconds."
  fi

  COUNT=$((COUNT + 1))

  if [[ ${COUNT} -eq ${MAX_RETRY_COUNT} ]]; then
    echo "ERROR: sqlcmd command failed ${MAX_RETRY_COUNT} times. Please check your configuration." >&2
    exit 1
  fi
done

echo "INFO: sqlcmd command succeeded. Continue creating no superuser."
Contributor Author

First, this script checks whether it can access SQL Server by running SELECT 1.
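
For reference, the same check-and-retry idea can be factored into a reusable function. This is a minimal sketch under stated assumptions, not the code merged in this PR: retry and flaky are hypothetical names, and the demo uses a stand-in command instead of sqlcmd so that it runs without Docker.

```shell
#!/bin/bash
set -u

# retry MAX_RETRY_COUNT RETRY_INTERVAL COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds, at most
# MAX_RETRY_COUNT times, sleeping RETRY_INTERVAL seconds before each attempt.
retry() {
  local max_retry_count=$1 retry_interval=$2
  shift 2
  local count=0
  while [ "${count}" -lt "${max_retry_count}" ]; do
    sleep "${retry_interval}"
    echo "INFO: Retry count: ${count}"
    if "$@"; then
      return 0
    fi
    echo "INFO: Command failed. Will retry after ${retry_interval} seconds."
    count=$((count + 1))
  done
  echo "ERROR: Command failed ${max_retry_count} times." >&2
  return 1
}

# Demo with a stand-in command that fails twice and then succeeds.
attempts_file=$(mktemp)
echo 0 > "${attempts_file}"
flaky() {
  local n
  n=$(( $(cat "${attempts_file}") + 1 ))
  echo "${n}" > "${attempts_file}"
  [ "${n}" -ge 3 ]
}
retry 5 0 flaky && echo "INFO: Command succeeded."
rm -f "${attempts_file}"
```

In the CI script, the stand-in would be replaced by the actual check, for example retry 10 3 docker exec -t "${SQL_SERVER_CONTAINER_NAME}" /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P "${SQL_SERVER_PASSWORD}" -d master -Q "SELECT 1".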

# Create database
-docker exec -t ${SQL_SERVER_CONTAINER_NAME} /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P ${SQL_SERVER_PASSWORD} -d master -Q "CREATE DATABASE test_db"
+echo "INFO: Create database start"
+docker exec -t ${SQL_SERVER_CONTAINER_NAME} /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P ${SQL_SERVER_PASSWORD} -d master -Q "CREATE DATABASE test_db COLLATE Japanese_BIN2"
Contributor Author

This script sets the collation to Japanese_BIN2 explicitly in the CREATE DATABASE command.


# Check the collation of test_db (for debugging purposes)
echo "INFO: SELECT @@version start"
docker exec -t ${SQL_SERVER_CONTAINER_NAME} /opt/mssql-tools/bin/sqlcmd -S localhost -U no_superuser -P no_superuser_password -d test_db -Q "SELECT name, collation_name FROM sys.databases" -W
Contributor Author

For debugging purposes, I added an SQL command that shows the database collation settings.
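
If the collation ever needs to be asserted rather than only printed, the output of this query could be parsed. The following is a minimal sketch, assuming the two-column "name collation_name" layout that sqlcmd prints with -W; collation_of is a hypothetical helper that is not part of this PR, and the sample rows are copied from the local run shown in the description.

```shell
#!/bin/bash
set -u

# collation_of DATABASE_NAME
# Hypothetical helper: reads "name collation_name" rows (as printed by
# sqlcmd with -W) from stdin and prints the collation of DATABASE_NAME.
collation_of() {
  # tr strips carriage returns, which `docker exec -t` can add to each line.
  tr -d '\r' | awk -v db="$1" '$1 == db { print $2 }'
}

# Sample rows copied from the local run shown in this PR.
sample_output='name collation_name
---- --------------
master SQL_Latin1_General_CP1_CI_AS
test_db Japanese_BIN2'

actual=$(printf '%s\n' "${sample_output}" | collation_of test_db)
if [ "${actual}" = "Japanese_BIN2" ]; then
  echo "INFO: test_db collation is ${actual}"
else
  echo "ERROR: unexpected collation for test_db: '${actual}'" >&2
fi
```

In CI, the sample text would be replaced by the real sqlcmd output piped into collation_of.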

@kota2and3kan kota2and3kan requested review from a team, Torch3333, brfrn169, feeblefakie and komamitsu and removed request for a team March 8, 2024 09:15
Contributor

@komamitsu komamitsu left a comment

LGTM! Thank you!

Collaborator

@brfrn169 brfrn169 left a comment

LGTM! Thank you!

@brfrn169 brfrn169 merged commit e448bbc into master Mar 11, 2024
@brfrn169 brfrn169 deleted the fix-ci-issue-sqlserver branch March 11, 2024 14:11