[YSQL] Several PgRegress test failures #2388

Closed
jaki opened this issue Sep 20, 2019 · 7 comments


jaki commented Sep 20, 2019

Several Postgres regress test failures have gone unaddressed for a long time. Most of them are in the CentOS ASAN build mode. Here is a list of tests to look into:

  • TestPgRegressBetaFeature
  • TestPgRegressFeature
  • TestPgRegressMisc
  • TestPgRegressPgMisc
  • TestPgRegressPgMiscIndependent

This issue will serve to track the progress and investigation on these tests.

@jaki jaki added area/ysql Yugabyte SQL (YSQL) kind/failing-test Tests and testing infra labels Sep 20, 2019

jaki commented Sep 20, 2019

I'll begin with a few details that I have picked up. There are two prominent kinds of failure:

  • Memory leaks: Under ASAN, a detected memory leak makes the process exit with a non-zero exit code. As @mbautin has mentioned, some memory leaks shouldn't be tracked (I suppose they are false positives), so environment variables are used to suppress them. A related issue is #2372 (Set more env vars for ASAN yb-ctl). Each leak should be investigated to determine whether it is genuine or should be suppressed.

  • RPC timeouts: This happens even in a DEBUG build. Here are some errors that I have gathered from the TestPgRegressFeature java test:

    • Postgres regress test yb_feature_temp has ERROR: Timed out: Read RPC to 127.160.221.135:21464 timed out after 19.861s (or similar) after the statement CREATE TEMP TABLE temptest (col int);. On ASAN, it also exits with exit code 1 due to memory leaks.

      • [BAD EXIT][FAIL] Jenkins, CentOS, Clang, ASAN
      • [FAIL][INTERMITTENT] Jenkins, CentOS, Clang, DEBUG
      • [BAD EXIT][FAIL] Local, CentOS, Clang, ASAN
    • Postgres regress test yb_feature_db has ERROR: Timed out: CreateNamespace timed out after deadline expired. Time elapsed: 120.003s, allowed: 120.000s: CreateNamespace RPC to 127.160.221.135:21464 timed out after 120.000s (or similar) after the statement CREATE DATABASE test_1; (or a similar statement).

      • [FAIL] Jenkins, CentOS, Clang, ASAN
      • [PASS] Jenkins, CentOS, Clang, DEBUG
      • [FAIL] Local, CentOS, Clang, ASAN
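
For triage, the two timeout messages quoted above can be pulled apart mechanically. The following is a minimal sketch in Python, assuming the messages always contain the shape "<name> RPC to <ip>:<port> timed out after <seconds>s" as in the two examples. The names RpcTimeout and parse_rpc_timeout are hypothetical helpers, not part of the YugabyteDB test infrastructure, and other message shapes would simply not match.

```python
import re
from typing import NamedTuple, Optional


class RpcTimeout(NamedTuple):
    """One parsed timeout (hypothetical helper type)."""
    rpc: str
    address: str
    seconds: float


# Pattern inferred only from the two messages quoted in this issue.
_TIMEOUT_RE = re.compile(
    r"(?P<rpc>\w+) RPC to (?P<address>\d+(?:\.\d+){3}:\d+) "
    r"timed out after (?P<seconds>\d+(?:\.\d+)?)s"
)


def parse_rpc_timeout(line: str) -> Optional[RpcTimeout]:
    """Return the RPC name, target address, and duration, or None if no match."""
    match = _TIMEOUT_RE.search(line)
    if match is None:
        return None
    return RpcTimeout(
        match.group("rpc"),
        match.group("address"),
        float(match.group("seconds")),
    )
```

Grouping the parsed results by RPC name would separate the short Read timeouts from the 120-second CreateNamespace timeouts.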


jaki commented Sep 20, 2019

The following tests have a bad exit code (i.e. a memory leak) in Jenkins, CentOS, Clang, ASAN:

  • yb_feature_alter_table, yb_feature_temp, yb_feature_db in TestPgRegressFeature
  • yb_pg_triggers, yb_triggers in TestPgRegressBetaFeatures
  • yb_create_index in TestPgRegressMisc
  • yb_pg_errors in TestPgRegressPgMisc
  • yb_pg_insert, yb_index_including in TestPgRegressPgMiscIndependent

That is not to say that there are no failures in addition to the memory leaks.
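
Since a bad exit code and an output mismatch can occur together, it helps to track them as separate labels. The sketch below assumes that [BAD EXIT] means a non-zero process exit code and [FAIL] means the regress output differs from the expected output; that reading is inferred from the comments above, and the Outcome and classify names are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    BAD_EXIT = "BAD EXIT"


def classify(exit_code: int, has_output_diff: bool) -> set:
    """Return every label that applies to one regress test run."""
    labels = set()
    if exit_code != 0:
        # Assumption: under ASAN a detected leak shows up as a non-zero exit.
        labels.add(Outcome.BAD_EXIT)
    if has_output_diff:
        # Assumption: FAIL means actual output differs from expected output.
        labels.add(Outcome.FAIL)
    # A run with neither problem is a pass.
    return labels or {Outcome.PASS}
```

Keeping the two labels independent means that suppressing a false-positive leak cannot hide an output mismatch in the same test.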


jaki commented Sep 20, 2019

As for yb_feature_db, the failure seems to have been masked by the max log size exceeded failure. After commit e2a2f91, the RPC timeout became visible. The Jenkins logs show this: https://jenkins.dev.yugabyte.com/job/github-yugabyte-db-centos-master-clang-asan/lastCompletedBuild/testReport/junit/org.yb.pgsql/TestPgRegressFeature/history/.

  • Build 203: has max log size exceeded failure
  • Build 204: has commit e2a2f91
  • Build 206: has RPC timeout failure

This means that this issue may have been around for quite a while without our knowing.


jaki commented Sep 23, 2019

The yb_feature_db CREATE DATABASE CreateNamespace timed out error seems related to issue #1573.

spolitov added a commit that referenced this issue Oct 2, 2019
Summary:
The psql function describeOneTableDetails calls PQclear only if the result has more than 0 tuples.
That is wrong: even an empty PGresult should be cleared.

Test Plan: ybd asan --java-test org.yb.pgsql.TestPgRegressMisc#testPgRegressMisc

Reviewers: timur, neil, dmitry

Reviewed By: dmitry

Subscribers: ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D7329
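
The leak pattern this commit describes can be illustrated in isolation. This is only a Python analogy, not the psql C code: FakeResult is a hypothetical stand-in for a libpq PGresult, and its clear method stands in for the result-clearing call (PQclear in libpq).

```python
class FakeResult:
    """Hypothetical stand-in for a libpq PGresult; not a real binding."""

    def __init__(self, ntuples: int):
        self.ntuples = ntuples
        self.cleared = False

    def clear(self) -> None:
        self.cleared = True


def describe_buggy(result: FakeResult) -> int:
    # Mirrors the bug: cleanup is skipped when the result has no tuples,
    # so an empty result is never released.
    count = result.ntuples
    if count > 0:
        result.clear()
    return count


def describe_fixed(result: FakeResult) -> int:
    # Mirrors the fix: cleanup runs regardless of the tuple count.
    try:
        return result.ntuples
    finally:
        result.clear()
```

With an empty result, describe_buggy leaves cleared as False, which is the kind of allocation a leak detector would report at exit.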

jaki commented Jan 14, 2020

According to https://jenkins.dev.yugabyte.com/job/github-yugabyte-db-centos-master-clang-debug/568/artifact/java/yb-pgsql/target/surefire-reports_org.yb.pgsql.TestPgRegressFeature__testPgRegressFeature/org.yb.pgsql.TestPgRegressFeature-output.txt.gz and https://jenkins.dev.yugabyte.com/job/github-yugabyte-db-mac-master-clang-debug/849/artifact/java/yb-pgsql/target/surefire-reports_org.yb.pgsql.TestPgRegressFeature__testPgRegressFeature/org.yb.pgsql.TestPgRegressFeature-output.txt.gz, there are still failures for yb_feature_temp and yb_feature_db, respectively. The failure messages for each of these tests are still present.

This makes TestPgRegressFeature still fail. However, the other tests originally listed now seem okay. If they start failing again, they should not be tied to this issue.


jaki commented Jan 25, 2021

One year later, and yb_feature_db is still failing.

--- /tmp/yb_test.tmp.7350.8848.14057.pid10219/pgregress_output/results/yb_feature_db.out  2021-01-23 00:19:43.977390426 +0000
***************
*** 35,44 ****
--- 35,48 ----
  -- Test case for CREATE DATABASE supported options
  --
  CREATE DATABASE test_1;
+ ERROR:  Timed out: Timed out waiting for Namespace Creation
  CREATE DATABASE test_2 TEMPLATE = template0 IS_TEMPLATE = FALSE;
  CREATE DATABASE test_3 TEMPLATE = DEFAULT IS_TEMPLATE = DEFAULT LC_COLLATE = DEFAULT LC_CTYPE = DEFAULT ENCODING = DEFAULT TABLESPACE = DEFAULT ALLOW_CONNECTIONS = FALSE CONNECTION LIMIT = 10;
+ ERROR:  Timed out: Timed out waiting for Namespace Creation
  CREATE DATABASE test_4 ENCODING = UNICODE;
+ ERROR:  Timed out: Timed out waiting for Namespace Creation
  CREATE DATABASE test_5 ENCODING = UTF8;
+ ERROR:  Timed out: Timed out waiting for Namespace Creation
  --
  -- Test case for ALTER DATABASE unsupported options
  --
***************

org.yb.pgsql.TestPgRegressFeature#testPgRegressFeature has a 16.5% failure rate overall and a 100% failure rate on ASAN. Over the last 500 commits, yb_feature_db fails but yb_feature_temp does not, so the yb_feature_db failure is probably the only one remaining.
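
To list which statements picked up a new error in a diff like the one above, one option is to remember the last unchanged statement line and report it whenever an added ERROR line follows. This is a rough sketch that assumes single-line statements and the context-diff prefixes shown above (two spaces for unchanged lines, "+ " for added lines); the function name is hypothetical.

```python
def statements_with_new_errors(diff_text: str) -> list:
    """Return unchanged statements that are directly followed by an added ERROR."""
    failed = []
    last_statement = None
    for line in diff_text.splitlines():
        if line.startswith("+ ERROR:"):
            # An error line that appears only in the actual output.
            if last_statement is not None:
                failed.append(last_statement)
        elif line.startswith("  "):
            text = line.strip()
            # Skip blank context lines and SQL comments such as "--".
            if text and not text.startswith("--"):
                last_statement = text
    return failed
```

Applied to the diff above, this reports the CREATE DATABASE statements for test_1, test_3, test_4, and test_5, and leaves out test_2.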


jaki commented Jan 25, 2021

I will close this because TestPgRegressFeature is covered in issue #4809 and the rest of the failures here are fixed.

This issue was closed.