Add retry for DNS lookup failure exception in TiDBInitializer #3884

Merged
merged 7 commits into from Apr 1, 2021

Conversation

handlerww
Contributor

@handlerww handlerww commented Mar 30, 2021

What problem does this PR solve?

When the DNS service is unstable, DNS lookups made from inside the pod can fail. This PR adds retry logic for the exception those failed lookups raise.
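
As a rough illustration of retrying a failed lookup, here is a minimal sketch using only the Python standard library. It is not the code from this PR: the startup script shown under the manual test retries `MySQLdb.connect` and catches `MySQLdb.OperationalError`, whereas this sketch calls `socket.getaddrinfo` directly and catches `socket.gaierror`. The function name `resolve_with_retry` and the `resolver` and `sleep` parameters are hypothetical, added so the sketch can run without a real DNS server.

```python
import socket
import time


def resolve_with_retry(host, port, attempts=10, delay=1.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve host:port, retrying when the lookup itself fails.

    `resolver` and `sleep` are hypothetical injection points for testing;
    they are not part of the PR.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return resolver(host, port)
        except socket.gaierror as err:  # getaddrinfo raises this on lookup failure
            last_error = err
            sleep(delay)
    # Every attempt failed: surface the last lookup error to the caller.
    raise last_error
```

The defaults of 10 attempts and a 1 second pause mirror the values in the PR's script; a transient DNS outage shorter than that window would be absorbed by the loop.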

What is changed and how does it work?

Code changes

  • Has Go code change
  • Has CI related scripts change

Tests

  • Unit test
  • E2E test
  • Manual test
  • No code

Manual test:

  1. Deploy a basic TiDB cluster
  2. Create TidbInitializer.
apiVersion: pingcap.com/v1alpha1
kind: TidbInitializer
metadata:
  name: initialize-demo
spec:
  image: tnir/mysqlclient
  imagePullPolicy: IfNotPresent
  cluster:
    name: initialize-demo
  initSql: "create database hello;"
  3. The initialization succeeded.
cluster1-tidb-initializer-tpq5h            0/1     Completed   0          13m
  4. The startup script in the configMap is shown below:
import os, sys, time, MySQLdb
host = 'cluster1-tidb'
permit_host = '%'
port = 4000
retry_count = 0
for i in range(0, 10):
    try:
        conn = MySQLdb.connect(host=host, port=port, user='root', connect_timeout=5, charset='utf8mb4')
    except MySQLdb.OperationalError as e:
        print(e)
        retry_count += 1
        time.sleep(1)
        continue
    break
if retry_count == 10:
    sys.exit(1)
with open('/data/init.sql', 'r') as sql:
    for line in sql.readlines():
        conn.cursor().execute(line)
        conn.commit()
if permit_host != '%%':
    conn.cursor().execute("update mysql.user set Host=%s where User='root';", (permit_host,))
conn.cursor().execute("flush privileges;")
conn.commit()
conn.close()
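
The retry loop in the script above tracks failures with a separate `retry_count`. One alternative, sketched here under assumptions and not taken from the PR, is to wrap the loop in a function that returns on the first success, so that falling out of the loop can only mean every attempt failed. The name `connect_with_retry` and its `connect`, `retryable`, and `sleep` parameters are hypothetical.

```python
import sys
import time


def connect_with_retry(connect, retryable, attempts=10, delay=1.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds or the attempts are used up.

    `connect` is any zero-argument callable; `retryable` is the exception
    type (or tuple of types) that should trigger another attempt.
    """
    for _ in range(attempts):
        try:
            conn = connect()
        except retryable as err:
            print(err)
            sleep(delay)
        else:
            # Success: hand the connection back immediately.
            return conn
    # Reaching this line means no attempt succeeded.
    sys.exit(1)
```

In the startup script this could be used as `conn = connect_with_retry(lambda: MySQLdb.connect(host=host, port=port, user='root', connect_timeout=5, charset='utf8mb4'), MySQLdb.OperationalError)`, which keeps the same 10 attempts and 1 second pause without a counter variable.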

Side effects

  • Breaking backward compatibility
  • Other side effects:

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release Notes

Please refer to Release Notes Language Style Guide before writing the release note.

Add retry for DNS lookup failure exception in TiDBInitializer.

@handlerww
Contributor Author

/cc @DanielZhangQD @dragonly @july2993

@codecov-io

codecov-io commented Mar 30, 2021

Codecov Report

Merging #3884 (d5de01d) into master (a456093) will decrease coverage by 5.47%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #3884      +/-   ##
==========================================
- Coverage   67.87%   62.39%   -5.48%     
==========================================
  Files         175      171       -4     
  Lines       18612    18122     -490     
==========================================
- Hits        12632    11308    -1324     
- Misses       4881     5717     +836     
+ Partials     1099     1097       -2     
Flag       Coverage Δ
e2e        ?
unittest   62.39% <ø> (-0.02%) ⬇️

@@ -341,11 +341,18 @@ var tidbInitStartScriptTpl = template.Must(template.New("tidb-init-start-script"
host = '{{ .ClusterName }}-tidb'
permit_host = '{{ .PermitHost }}'
port = 4000
for i in range(0, 10):
Contributor

If all 10 attempts fail, the script will continue to run even though conn was never connected successfully?
I think this is not expected.
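
The concern can be reproduced with a reduced sketch of such a loop. This is an illustration under assumptions, not the code under review: `connect_unguarded` is a hypothetical name, and `ConnectionError` stands in for `MySQLdb.OperationalError` so the example runs without a database client.

```python
def connect_unguarded(connect, attempts=10):
    """Retry connect() but never check whether any attempt succeeded."""
    for _ in range(attempts):
        try:
            conn = connect()
        except ConnectionError:
            continue
        break
    # If every call raised, `conn` was never assigned, so this line raises
    # UnboundLocalError instead of reporting a clear connection failure.
    return conn
```

At module level, as in the startup script, the same mistake would surface as a `NameError` on the first use of `conn`. The `if retry_count == 10: sys.exit(1)` check in the script shown in the PR description closes this gap.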

Contributor

@DanielZhangQD DanielZhangQD left a comment

LGTM

@ti-chi-bot
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • DanielZhangQD
  • july2993

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@DanielZhangQD
Contributor

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: d5de01d

@ti-chi-bot ti-chi-bot merged commit 3761156 into pingcap:master Apr 1, 2021
ti-srebot pushed a commit to ti-srebot/tidb-operator that referenced this pull request Apr 1, 2021
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Contributor

cherry pick to release-1.1 in PR #3888

@handlerww handlerww deleted the exception-initializer branch April 1, 2021 03:02
shonge pushed a commit to shonge/tidb-operator that referenced this pull request Apr 6, 2021