Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically detect if data/logs/wals are pointing to tmpfs and soft disable O_DIRECT instead of crashing #18

Closed
bmatican opened this issue Jan 23, 2018 · 2 comments
Assignees
Labels
kind/enhancement This is an enhancement of an existing feature

Comments

@bmatican
Copy link
Contributor

bmatican commented Jan 23, 2018

Recently we had a report (#17) of a docker run which has the data dirs pointing to a tmpfs path and ultimately crashed the masters:
F0123 07:19:33.281553 44 master_main.cc:82] Check failed: _s.ok() Bad status: IO error (yb/util/env_posix.cc:202): Unable to initialize catalog manager: Failed to initialize sys tables async: Failed to open new log: Call to mkstemp() failed on name template /yuga/mem/yb-master/yb-data/master/wals/table-sys.catalog.uuid/tablet-00000000000000000000000000000000/.tmp.newsegmentXXXXXX: Invalid argument (error 22)

The crash happens because O_DIRECT is not actually supported on tmpfs!

As a proposed fix, we'll add some defensive code in our servers such that, if any of the paths pointed to for data/logs/wals are on a file system we cannot open a file with O_DIRECT, then we softly downgrade the durable_wal_write flag to false and print out a notice about this.

@mbautin
Copy link
Collaborator

mbautin commented Jan 23, 2018

@bmatican: perhaps we could have two modes for durable WAL writes: the current behavior, in which we refuse to start in case we can't use O_DIRECT, and a "prefer durable WAL write" / "autodetect" mode (which could be the default) with the graceful downgrade logic you're describing.

@rkarthik007 rkarthik007 added this to To Do in YBase features via automation Jan 26, 2018
@rkarthik007 rkarthik007 added the kind/enhancement This is an enhancement of an existing feature label Jan 27, 2018
@rkarthik007 rkarthik007 moved this from To Do to In progress in YBase features Apr 3, 2018
@ayushsengupta1991
Copy link
Contributor

Closed with 4c5813a

YBase features automation moved this from In progress to Done Apr 3, 2018
mbautin added a commit that referenced this issue Jan 3, 2020
…up on macOS

Summary:
Add a DNS lookup of the local hostname to postmaster startup to force
macOS network libraries to get initialized before any fork() calls
happen.  This fixes failures of ~20 tests in macOS debug mode. Without
this, PostgreSQL backends would frequently crash with SIGSEGV and dump
cores when trying to do the same DNS lookup.

Here is a SIGSEGV stack trace that we would previously get without this
fix:

```
frame #0: 0x00007fff7bd53e34 libsystem_trace.dylib`_os_log_cmp_key + 4
frame #1: 0x00007fff7bbfcb74 libsystem_c.dylib`rb_tree_find_node + 53
frame #2: 0x00007fff7bd52021 libsystem_trace.dylib`os_log_create + 368
frame #3: 0x00007fff7bc5b127 libsystem_info.dylib`gai_log_init + 23
frame #4: 0x00007fff7bd37ce3 libsystem_pthread.dylib`__pthread_once_handler + 65
frame #5: 0x00007fff7bd2daab libsystem_platform.dylib`_os_once_callout + 18
frame #6: 0x00007fff7bd37c7f libsystem_pthread.dylib`pthread_once + 56
frame #7: 0x00007fff7bc5a4ab libsystem_info.dylib`gai_log + 27
frame #8: 0x00007fff7bc5b33f libsystem_info.dylib`_gai_load_libnetwork_once + 63
frame #9: 0x00007fff7bd37ce3 libsystem_pthread.dylib`__pthread_once_handler + 65
frame #10: 0x00007fff7bd2daab libsystem_platform.dylib`_os_once_callout + 18
frame #11: 0x00007fff7bd37c7f libsystem_pthread.dylib`pthread_once + 56
frame #12: 0x00007fff7bc5b29b libsystem_info.dylib`_gai_load_libnetwork + 27
frame #13: 0x00007fff7bc5b64f libsystem_info.dylib`_gai_nat64_v4_address_requires_synthesis + 31
frame #14: 0x00007fff7bc5aaa0 libsystem_info.dylib`_gai_nat64_second_pass + 512
frame #15: 0x00007fff7bc39847 libsystem_info.dylib`si_addrinfo + 1959
frame #16: 0x00007fff7bc38f77 libsystem_info.dylib`_getaddrinfo_internal + 231
frame #17: 0x00007fff7bc38e7d libsystem_info.dylib`getaddrinfo + 61
frame #18: 0x000000011512f8e5 libyb_util.dylib`yb::GetFQDN(hostname="...") at net_util.cc:371:20
```

Test Plan: Jenkins

Reviewers: mihnea, dmitry

Reviewed By: dmitry

Subscribers: yql

Differential Revision: https://phabricator.dev.yugabyte.com/D7757
ttyusupov added a commit that referenced this issue Jul 16, 2021
Summary:
Some time ago we had optimizations disabled for debug build type, but it was enabled during fix of #1291: f710367 ( https://phabricator.dev.yugabyte.com/D6660 ). Now we no longer have `retryable_rpc_single_call_timeout_ms` flag, also optimizations in debug build make it harder to investigate issues because of optimized stack traces and variables. So, we can disable these optimizations again to make debugging easier.

Before (note <optimized out> values that are not available for debugging):
```
(gdb) bt
#0  0x00007f97990bfa6b in raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/pt-raise.c:35
#1  0x00007f97a45268b9 in AddHash (num_probes=6, total_bits=523776, num_lines=<optimized out>, data=0x2bdc000 "", h=4266458700) at ../../src/yb/rocksdb/util/bloom.cc:66
#2  rocksdb::(anonymous namespace)::FixedSizeFilterBitsBuilder::AddKey (this=<optimized out>, key=...) at ../../src/yb/rocksdb/util/bloom.cc:463
#3  0x00007f97a44f0818 in rocksdb::FixedSizeFilterBlockBuilder::AddKey (this=this@entry=0x1c87a40, key=...) at ../../src/yb/rocksdb/table/fixed_size_filter_block.cc:97
#4  0x00007f97a44f08b0 in rocksdb::FixedSizeFilterBlockBuilder::Add (this=0x1c87a40, key=...) at ../../src/yb/rocksdb/table/fixed_size_filter_block.cc:91
#5  0x00007f97a44cf295 in rocksdb::BlockBasedTableBuilder::Add (this=0x1cd1c00, key=..., value=...) at ../../src/yb/rocksdb/table/block_based_table_builder.cc:468
#6  0x00007f97a439b9a4 in rocksdb::BuildTable (dbname=..., env=0x7f97a48a2c00 <rocksdb::Env::Default()::default_env>, ioptions=..., env_options=..., table_cache=0x1ccf740, iter=0x7f97969858f8, meta=0x7f97969864a0, internal_comparator=
    std::shared_ptr<const rocksdb::InternalKeyComparator> (use count 3, weak count 0) = {...}, int_tbl_prop_collector_factories=std::vector of length 1, capacity 1 = {...}, column_family_id=0,
    snapshots=std::vector of length 0, capacity 0, earliest_write_conflict_snapshot=72057594037927935, compression=rocksdb::kSnappyCompression, compression_opts=..., paranoid_file_checks=false, internal_stats=0x1d40200,
    boundary_values_extractor=0x1cd03f0, io_priority=rocksdb::Env::IO_HIGH, table_properties=0x7f9796987000) at ../../src/yb/rocksdb/db/builder.cc:160
#7  0x00007f97a4444ced in rocksdb::FlushJob::WriteLevel0Table (this=this@entry=0x7f9796986f40, mems=..., edit=0x1d44268, meta=meta@entry=0x7f97969864a0) at ../../src/yb/rocksdb/db/flush_job.cc:290
#8  0x00007f97a444669c in rocksdb::FlushJob::Run (this=this@entry=0x7f9796986f40, file_meta=file_meta@entry=0x7f9796986d00) at ../../src/yb/rocksdb/db/flush_job.cc:191
#9  0x00007f97a43fb5ba in rocksdb::DBImpl::FlushMemTableToOutputFile (this=this@entry=0x1d24000, cfd=cfd@entry=0x1a7b000, mutable_cf_options=..., made_progress=made_progress@entry=0x7f9796987f47,
    job_context=job_context@entry=0x7f9796987d70, log_buffer=0x7f9796987480) at ../../src/yb/rocksdb/db/db_impl.cc:1873
#10 0x00007f97a43fc505 in rocksdb::DBImpl::BackgroundFlush (this=this@entry=0x1d24000, made_progress=made_progress@entry=0x7f9796987f47, job_context=job_context@entry=0x7f9796987d70, log_buffer=log_buffer@entry=0x7f9796987480,
    cfd=0x1a7b000, cfd@entry=0x0) at ../../src/yb/rocksdb/db/db_impl.cc:3202
#11 0x00007f97a4406cb3 in rocksdb::DBImpl::BackgroundCallFlush (this=this@entry=0x1d24000, cfd=cfd@entry=0x0) at ../../src/yb/rocksdb/db/db_impl.cc:3276
#12 0x00007f97a4406f6d in rocksdb::DBImpl::BGWorkFlush (db=db@entry=0x1d24000) at ../../src/yb/rocksdb/db/db_impl.cc:3132
#13 0x00007f97a4540875 in rocksdb::ThreadPool::BGThread (this=0x1adeb60, thread_id=0) at ../../src/yb/rocksdb/util/thread_posix.cc:126
#14 0x00007f97a4540899 in operator() (__closure=<optimized out>) at ../../src/yb/rocksdb/util/thread_posix.cc:165
#15 std::_Function_handler<void(), rocksdb::ThreadPool::StartBGThreads()::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
    at /opt/yb-build/brew/linuxbrew-20181203T161736v9-3ba4c2ed9b0587040949a4a9a95b576f520bae/Cellar/gcc/5.5.0_4/include/c++/5.5.0/functional:1871
#16 0x00007f979d804626 in operator() (this=0x1cbdc78) at /opt/yb-build/brew/linuxbrew-20181203T161736v9-3ba4c2ed9b0587040949a4a9a95b576f520bae/Cellar/gcc/5.5.0_4/include/c++/5.5.0/functional:2267
#17 yb::Thread::SuperviseThread (arg=0x1cbdc20) at ../../src/yb/util/thread.cc:771
#18 0x00007f97990b7694 in start_thread (arg=0x7f9796990700) at pthread_create.c:333
#19 0x00007f9798df941d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
```

After:
```
#0  0x00007f1a9109ba6b in raise (sig=11) at ../sysdeps/unix/sysv/linux/pt-raise.c:35
#1  0x00007f1a9edd7343 in rocksdb::(anonymous namespace)::AddHash (h=4266458700, data=0x2d58000 "", num_lines=1023, total_bits=523776, num_probes=6) at ../../src/yb/rocksdb/util/bloom.cc:66
#2  0x00007f1a9edd87c8 in rocksdb::(anonymous namespace)::FixedSizeFilterBitsBuilder::AddKey (this=0x2d30040, key=...) at ../../src/yb/rocksdb/util/bloom.cc:463
#3  0x00007f1a9ed8e757 in rocksdb::FixedSizeFilterBlockBuilder::AddKey (this=0x1ef22d0, key=...) at ../../src/yb/rocksdb/table/fixed_size_filter_block.cc:97
#4  0x00007f1a9ed8e6eb in rocksdb::FixedSizeFilterBlockBuilder::Add (this=0x1ef22d0, key=...) at ../../src/yb/rocksdb/table/fixed_size_filter_block.cc:91
#5  0x00007f1a9ed5b2ea in rocksdb::BlockBasedTableBuilder::Add (this=0x1e4dc00, key=..., value=...) at ../../src/yb/rocksdb/table/block_based_table_builder.cc:468
#6  0x00007f1a9eb80bd4 in rocksdb::BuildTable (dbname="/tmp/yb_tests__2020-12-14T18_15_29__23449.18214.17918/mytestdb-814110369", env=0x7f1a9f35fe40 <rocksdb::Env::Default()::default_env>, ioptions=..., env_options=...,
    table_cache=0x1e4b740, iter=0x7f1a8e961858, meta=0x7f1a8e962480, internal_comparator=std::shared_ptr<const rocksdb::InternalKeyComparator> (use count 3, weak count 0) = {...},
    int_tbl_prop_collector_factories=std::vector of length 1, capacity 1 = {...}, column_family_id=0, snapshots=std::vector of length 0, capacity 0, earliest_write_conflict_snapshot=72057594037927935,
    compression=rocksdb::kSnappyCompression, compression_opts=..., paranoid_file_checks=false, internal_stats=0x1ebc200, boundary_values_extractor=0x1e4c3f0, io_priority=rocksdb::Env::IO_HIGH, table_properties=0x7f1a8e962f70)
    at ../../src/yb/rocksdb/db/builder.cc:160
#7  0x00007f1a9ec8e56b in rocksdb::FlushJob::WriteLevel0Table (this=0x7f1a8e962eb0, mems=..., edit=0x1ec0268, meta=0x7f1a8e962480) at ../../src/yb/rocksdb/db/flush_job.cc:290
#8  0x00007f1a9ec8d767 in rocksdb::FlushJob::Run (this=0x7f1a8e962eb0, file_meta=0x7f1a8e962c70) at ../../src/yb/rocksdb/db/flush_job.cc:191
#9  0x00007f1a9ec10c56 in rocksdb::DBImpl::FlushMemTableToOutputFile (this=0x1ea0000, cfd=0x1bf7000, mutable_cf_options=..., made_progress=0x7f1a8e9640b7, job_context=0x7f1a8e963ee0, log_buffer=0x7f1a8e9635f0)
    at ../../src/yb/rocksdb/db/db_impl.cc:1873
#10 0x00007f1a9ec18bb6 in rocksdb::DBImpl::BackgroundFlush (this=0x1ea0000, made_progress=0x7f1a8e9640b7, job_context=0x7f1a8e963ee0, log_buffer=0x7f1a8e9635f0, cfd=0x1bf7000) at ../../src/yb/rocksdb/db/db_impl.cc:3202
#11 0x00007f1a9ec1914d in rocksdb::DBImpl::BackgroundCallFlush (this=0x1ea0000, cfd=0x0) at ../../src/yb/rocksdb/db/db_impl.cc:3276
#12 0x00007f1a9ec182fa in rocksdb::DBImpl::BGWorkFlush (db=0x1ea0000) at ../../src/yb/rocksdb/db/db_impl.cc:3132
#13 0x00007f1a9ee02747 in rocksdb::ThreadPool::BGThread (this=0x1c5ab60, thread_id=0) at ../../src/yb/rocksdb/util/thread_posix.cc:126
#14 0x00007f1a9ee028c6 in rocksdb::ThreadPool::<lambda()>::operator()(void) const (__closure=0x1e39c78) at ../../src/yb/rocksdb/util/thread_posix.cc:165
#15 0x00007f1a9ee03140 in std::_Function_handler<void(), rocksdb::ThreadPool::StartBGThreads()::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
    at /opt/yb-build/brew/linuxbrew-20181203T161736v9-3ba4c2ed9b0587040949a4a9a95b576f520bae/Cellar/gcc/5.5.0_4/include/c++/5.5.0/functional:1871
#16 0x00007f1aa1efe732 in std::function<void ()>::operator()() const (this=0x1e39c78) at /opt/yb-build/brew/linuxbrew-20181203T161736v9-3ba4c2ed9b0587040949a4a9a95b576f520bae/Cellar/gcc/5.5.0_4/include/c++/5.5.0/functional:2267
#17 0x00007f1a95e6bcf9 in yb::Thread::SuperviseThread (arg=0x1e39c20) at ../../src/yb/util/thread.cc:771
#18 0x00007f1a91093694 in start_thread (arg=0x7f1a8e96c700) at pthread_create.c:333
#19 0x00007f1a90dd541d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
```

Test Plan:
```
#!/usr/bin/env bash

set -euo pipefail

for i in {1..50}
do
  echo "Iteration: $i"
  rm -rf build/debug-gcc-dynamic-ninja/share/initial_sys_catalog_snapshot
  ybd --sj packaged
  ./bin/yb-ctl destroy
  ./bin/yb-ctl start
  ./bin/ysqlsh -c "SELECT 1"
done
./bin/yb-ctl destroy
```

Reviewers: bogdan, sergei, dmitry, mbautin

Reviewed By: mbautin

Subscribers: eng

Differential Revision: https://phabricator.dev.yugabyte.com/D10121
jasonyb pushed a commit that referenced this issue Dec 15, 2023
Summary:
YB Seq Scan code path is not hit because Foreign Scan is the default and
pg hint plan does not work.  Upcoming merge with YB master will bring in
master commit 465ee2c which changes the
default to YB Seq Scan.

To test YB Seq Scan, a temporary patch is needed (see the test plan).
With that, two bugs are encountered: fix them.

1. FailedAssertion("TTS_IS_VIRTUAL(slot)"

   On simple test case

       create table t (i int primary key, j int);
       select * from t;

   get

       TRAP: FailedAssertion("TTS_IS_VIRTUAL(slot)", File: "../../../../../../../src/postgres/src/backend/access/yb_access/yb_scan.c", Line: 3473, PID: 2774450)

   Details:

       #0  0x00007fd52616eacf in raise () from /lib64/libc.so.6
       #1  0x00007fd526141ea5 in abort () from /lib64/libc.so.6
       #2  0x0000000000af33ad in ExceptionalCondition (conditionName=conditionName@entry=0xc2938d "TTS_IS_VIRTUAL(slot)", errorType=errorType@entry=0xc01498 "FailedAssertion",
           fileName=fileName@entry=0xc28f18 "../../../../../../../src/postgres/src/backend/access/yb_access/yb_scan.c", lineNumber=lineNumber@entry=3473)
           at ../../../../../../../src/postgres/src/backend/utils/error/assert.c:69
       #3  0x00000000005c26bd in ybFetchNext (handle=0x2600ffc43680, slot=slot@entry=0x2600ff6c2980, relid=16384)
           at ../../../../../../../src/postgres/src/backend/access/yb_access/yb_scan.c:3473
       #4  0x00000000007de444 in YbSeqNext (node=0x2600ff6c2778) at ../../../../../../src/postgres/src/backend/executor/nodeYbSeqscan.c:156
       #5  0x000000000078b3c6 in ExecScanFetch (node=node@entry=0x2600ff6c2778, accessMtd=accessMtd@entry=0x7de2b9 <YbSeqNext>, recheckMtd=recheckMtd@entry=0x7de26e <YbSeqRecheck>)
           at ../../../../../../src/postgres/src/backend/executor/execScan.c:133
       #6  0x000000000078b44e in ExecScan (node=0x2600ff6c2778, accessMtd=accessMtd@entry=0x7de2b9 <YbSeqNext>, recheckMtd=recheckMtd@entry=0x7de26e <YbSeqRecheck>)
           at ../../../../../../src/postgres/src/backend/executor/execScan.c:182
       #7  0x00000000007de298 in ExecYbSeqScan (pstate=<optimized out>) at ../../../../../../src/postgres/src/backend/executor/nodeYbSeqscan.c:191
       #8  0x00000000007871ef in ExecProcNodeFirst (node=0x2600ff6c2778) at ../../../../../../src/postgres/src/backend/executor/execProcnode.c:480
       #9  0x000000000077db0e in ExecProcNode (node=0x2600ff6c2778) at ../../../../../../src/postgres/src/include/executor/executor.h:285
       #10 ExecutePlan (execute_once=<optimized out>, dest=0x2600ff6b1a10, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT,
           use_parallel_mode=<optimized out>, planstate=0x2600ff6c2778, estate=0x2600ff6c2128) at ../../../../../../src/postgres/src/backend/executor/execMain.c:1650
       #11 standard_ExecutorRun (queryDesc=0x2600ff675128, direction=<optimized out>, count=0, execute_once=<optimized out>)
           at ../../../../../../src/postgres/src/backend/executor/execMain.c:367
       #12 0x000000000077dbfe in ExecutorRun (queryDesc=queryDesc@entry=0x2600ff675128, direction=direction@entry=ForwardScanDirection, count=count@entry=0, execute_once=<optimized out>)
           at ../../../../../../src/postgres/src/backend/executor/execMain.c:308
       #13 0x0000000000982617 in PortalRunSelect (portal=portal@entry=0x2600ff90e128, forward=forward@entry=true, count=0, count@entry=9223372036854775807, dest=dest@entry=0x2600ff6b1a10)
           at ../../../../../../src/postgres/src/backend/tcop/pquery.c:954
       #14 0x000000000098433c in PortalRun (portal=portal@entry=0x2600ff90e128, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true,
           dest=dest@entry=0x2600ff6b1a10, altdest=altdest@entry=0x2600ff6b1a10, qc=0x7fffc14a13c0) at ../../../../../../src/postgres/src/backend/tcop/pquery.c:786
       #15 0x000000000097e65b in exec_simple_query (query_string=0x2600ffdc6128 "select * from t;") at ../../../../../../src/postgres/src/backend/tcop/postgres.c:1321
       #16 yb_exec_simple_query_impl (query_string=query_string@entry=0x2600ffdc6128) at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5060
       #17 0x000000000097b7a5 in yb_exec_query_wrapper_one_attempt (exec_context=exec_context@entry=0x2600ffdc6000, restart_data=restart_data@entry=0x7fffc14a1640,
           functor=functor@entry=0x97e033 <yb_exec_simple_query_impl>, functor_context=functor_context@entry=0x2600ffdc6128, attempt=attempt@entry=0, retry=retry@entry=0x7fffc14a15ff)
           at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5028
       #18 0x000000000097d077 in yb_exec_query_wrapper (exec_context=exec_context@entry=0x2600ffdc6000, restart_data=restart_data@entry=0x7fffc14a1640,
           functor=functor@entry=0x97e033 <yb_exec_simple_query_impl>, functor_context=functor_context@entry=0x2600ffdc6128)
           at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5052
       #19 0x000000000097d0ca in yb_exec_simple_query (query_string=query_string@entry=0x2600ffdc6128 "select * from t;", exec_context=exec_context@entry=0x2600ffdc6000)
           at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5075
       #20 0x000000000097fe8a in PostgresMain (dbname=<optimized out>, username=<optimized out>) at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5794
       #21 0x00000000008c8354 in BackendRun (port=0x2600ff8423c0) at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:4791
       #22 BackendStartup (port=0x2600ff8423c0) at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:4491
       #23 ServerLoop () at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:1878
       #24 0x00000000008caa55 in PostmasterMain (argc=argc@entry=25, argv=argv@entry=0x2600ffdc01a0) at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:1533
       #25 0x0000000000804ba8 in PostgresServerProcessMain (argc=25, argv=0x2600ffdc01a0) at ../../../../../../src/postgres/src/backend/main/main.c:208
       #26 0x0000000000804bc8 in main ()

       3469    ybFetchNext(YBCPgStatement handle,
       3470                            TupleTableSlot *slot, Oid relid)
       3471    {
       3472            Assert(slot != NULL);
       3473            Assert(TTS_IS_VIRTUAL(slot));

       (gdb) p *slot
       $2 = {type = T_TupleTableSlot, tts_flags = 18, tts_nvalid = 0, tts_ops = 0xeaf5e0 <TTSOpsHeapTuple>, tts_tupleDescriptor = 0x2600ff6416c0, tts_values = 0x2600ff6c2a00, tts_isnull = 0x2600ff6c2a10, tts_mcxt = 0x2600ff6c2000, tts_tid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 0, yb_item = {ybctid = 0}}, tts_tableOid = 0, tts_yb_insert_oid = 0}

   Fix by making YB Seq Scan always use virtual slot.  This is similar
   to what is done for YB Foreign Scan.

2. segfault in ending scan

   Same simple test case gives segfault at a later stage.

   Details:

       #0  0x00000000007de762 in table_endscan (scan=0x3debfe3ab88) at ../../../../../../src/postgres/src/include/access/tableam.h:997
       #1  ExecEndYbSeqScan (node=node@entry=0x3debfe3a778) at ../../../../../../src/postgres/src/backend/executor/nodeYbSeqscan.c:298
       #2  0x0000000000787a75 in ExecEndNode (node=0x3debfe3a778) at ../../../../../../src/postgres/src/backend/executor/execProcnode.c:649
       #3  0x000000000077ffaf in ExecEndPlan (estate=0x3debfe3a128, planstate=<optimized out>) at ../../../../../../src/postgres/src/backend/executor/execMain.c:1489
       #4  standard_ExecutorEnd (queryDesc=0x2582fdc88928) at ../../../../../../src/postgres/src/backend/executor/execMain.c:503
       #5  0x00000000007800f8 in ExecutorEnd (queryDesc=queryDesc@entry=0x2582fdc88928) at ../../../../../../src/postgres/src/backend/executor/execMain.c:474
       #6  0x00000000006f140c in PortalCleanup (portal=0x2582ff900128) at ../../../../../../src/postgres/src/backend/commands/portalcmds.c:305
       #7  0x0000000000b3c36a in PortalDrop (portal=portal@entry=0x2582ff900128, isTopCommit=isTopCommit@entry=false)
           at ../../../../../../../src/postgres/src/backend/utils/mmgr/portalmem.c:514
       #8  0x000000000097e667 in exec_simple_query (query_string=0x2582ffdc6128 "select * from t;") at ../../../../../../src/postgres/src/backend/tcop/postgres.c:1331
       #9  yb_exec_simple_query_impl (query_string=query_string@entry=0x2582ffdc6128) at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5060
       #10 0x000000000097b79a in yb_exec_query_wrapper_one_attempt (exec_context=exec_context@entry=0x2582ffdc6000, restart_data=restart_data@entry=0x7ffc81c0e7d0,
           functor=functor@entry=0x97e028 <yb_exec_simple_query_impl>, functor_context=functor_context@entry=0x2582ffdc6128, attempt=attempt@entry=0, retry=retry@entry=0x7ffc81c0e78f)
           at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5028
       #11 0x000000000097d06c in yb_exec_query_wrapper (exec_context=exec_context@entry=0x2582ffdc6000, restart_data=restart_data@entry=0x7ffc81c0e7d0,
           functor=functor@entry=0x97e028 <yb_exec_simple_query_impl>, functor_context=functor_context@entry=0x2582ffdc6128)
           at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5052
       #12 0x000000000097d0bf in yb_exec_simple_query (query_string=query_string@entry=0x2582ffdc6128 "select * from t;", exec_context=exec_context@entry=0x2582ffdc6000)
           at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5075
       #13 0x000000000097fe7f in PostgresMain (dbname=<optimized out>, username=<optimized out>) at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5794
       #14 0x00000000008c8349 in BackendRun (port=0x2582ff8403c0) at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:4791
       #15 BackendStartup (port=0x2582ff8403c0) at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:4491
       #16 ServerLoop () at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:1878
       #17 0x00000000008caa4a in PostmasterMain (argc=argc@entry=25, argv=argv@entry=0x2582ffdc01a0) at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:1533
       #18 0x0000000000804b9d in PostgresServerProcessMain (argc=25, argv=0x2582ffdc01a0) at ../../../../../../src/postgres/src/backend/main/main.c:208
       #19 0x0000000000804bbd in main ()

       294             /*
       295              * close heap scan
       296              */
       297             if (tsdesc != NULL)
       298                     table_endscan(tsdesc);

   Reason is initial merge 55782d5
   incorrectly merges end of ExecEndYbSeqScan.  Upstream PG
   9ddef36278a9f676c07d0b4d9f33fa22e48ce3b5 removes code, but initial
   merge duplicates lines.  Remove those lines.

Test Plan:
Apply the following patch to activate YB Seq Scan:

    diff --git a/src/postgres/src/backend/optimizer/path/allpaths.c b/src/postgres/src/backend/optimizer/path/allpaths.c
    index 8a4c38a965..854d84a648 100644
    --- a/src/postgres/src/backend/optimizer/path/allpaths.c
    +++ b/src/postgres/src/backend/optimizer/path/allpaths.c
    @@ -576,7 +576,7 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                     else
                     {
                         /* Plain relation */
    -                    if (IsYBRelationById(rte->relid))
    +                    if (false)
                         {
                             /*
                              * Using a foreign scan which will use the YB FDW by

On almalinux 8,

    ./yb_build.sh fastdebug --gcc11
    pg15_tests/run_all_tests.sh fastdebug --gcc11 --sj --sp --scb

fails the following tests:

- test_D29546
- test_pg15_regress: yb_pg15
- test_types_geo: yb_pg_box
- test_hash_in_queries: yb_hash_in_queries

Manually check to see that they are due to YB Seq Scan explain output
differences.

Reviewers: aagrawal, tfoucher

Reviewed By: tfoucher

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D31139
jasonyb pushed a commit that referenced this issue Jun 11, 2024
Issue - (#16): PG-112: Change the column name "ip" to "client_ip" for readability purpose.
Issue - (#17): PG-111: Show all the queries from complete and incomplete buckets.
Issue - (#18): PG-108: Log the bucket start time.
Issue - (#19): PG-99: Response time histogram.
Issue - (#20): PG-97: Log CPU time for a query.
Issue - (#21): PG-96: Show objects(tables) involved in the query.
Issue - (#22): PG-93: Retain the bucket, and don't delete the bucket automatically.
Issue - (#23): PG-91: Log queries of all the databases.
Issue - (#24): PG-116: Restrict the query size.
Issue - (#3) : README file update.
devansh-ism pushed a commit to devansh-ism/yugabyte-db that referenced this issue Jul 19, 2024
devansh-ism pushed a commit to devansh-ism/yugabyte-db that referenced this issue Jul 19, 2024
Summary:
Copyright update.

Initial commit.

The plugin was first named json_decoding_plugin but changed the name
after discussion on -hackers. It is a POC to illustrate that it is
possible for changeset extraction work just fine. The changeset
extraction feature is already in development, hence, it is expected that
this code changes a lot until the feature is set in stone.

It is based on:

repository: git://git.postgresql.org/git/users/andresfreund/postgres.git
branch: xlog-decoding-rebasing-remapping
commit: 6f819a68cd10938df126ee44a3cfef7f457c8e1b

Generate a long bytea instead of explicitly writing a long one.

It reduces the sql file size. Suggested by Merlin Moncure.

In commit ea177a3ba7a7901f6467eadb0a407e03d46462fd, the internal structure was changed.

Unbreak regression tests.

Remove contrib dependency.

The old Makefile relies heavily on been in the contrib tree. While
analysing the ethansf pull request, I realized that that premise does
not hold water because you could build it without a source code
tree.

Although I didn't have used his patch [1], this code was inspired on it.

[1] eulerto/wal2json#1

Fix regression tests.

The regression tests was very unstable. The main reason is the
associated LSN in some CONTEXT messages. Adding VERBOSITY setting to
supress those CONTEXT messages.

Update copyright.

Include LSN position for the next transaction.

This feature was discussed with @ethansf and is useful for tool that
consumes WAL using pg_recvlogical and similar tools. The LSN is
important to mark the apply progress. It stores the LSN pointing to the
end of commit record + 1. Hence, if the consuming tool crashes while
applying a transaction (JSON document), it can restart the
pg_recvlogical streaming from the last saved LSN position. This is
essential for such consuming tool to work properly.

Although, I did not have used Ethans' patch, this piece of code is
based on his idea.

Rename "next lsn" JSON key.

It is a comestic change because you can access values like:

foo['next lsn']

The other form (foo.bar) is not possible because of the blank space.

Fix handling of nonexistant columns in old tuple versions.

This fix was discoverd in bug yugabyte#13470 and commit
d47a1136e441cebe7ae7fe72d70eb8ce278d5cd6 fixes it.

pretty print option.

The new parameter 'pretty-print' chooses between a compact (without
spaces) or a pretty-print format.

The default changed to compact format since tools usually transform the
JSON objects.

Control how the stream is written.

The new parameter 'write-in-chunks' controls how to write in the
stream. If true, write occurs per tuple; else it writes at the end
of transaction.

The default is true. Hence, the behavior is the same as prior code.

Add a real README.

README has instructions to build and install wal2json. Also, there are
examples of use.

Add Coverity Scan Badge

Replace palloc with palloc0

Avoid garbage on JsonDecodingData.

Fixed tests results

Give an error instead of a warning in case of bad param

Added gitignore

Ignore .o files

Error message on unknown params reworded

Merge pull request yugabyte#13 from dvarrazzo/gitignore

Added gitignore

Merge pull request yugabyte#12 from dvarrazzo/err-on-bad-option

Give an error instead of a warning in case of bad param

Merge pull request yugabyte#11 from dvarrazzo/fix-tests

Fixed tests results

Default for write-in-chunks changed to false

Default for include-xids changed to false

style fix

Update copyright year information

Dropped include-xids=0 from test suite

Avoid leaking memory when we have some can-not-happen situations

Fix LICENSE

Forgot to replace organization.

Merge pull request yugabyte#18 from dvarrazzo/better-defaults

Better defaults

Use '1' and '0' instead of 't' and 'f' on tests

Forgot to add some static on functions

Daniele Varrazzo

Support generic WAL messages for logical decoding

This feature allows software to insert data into WAL stream that can be
read by wal2json. Those messages could be useful to control replication,
for example. Messages can be sent as transactional or not. Non-transactional messages mean that it is sent even if the transaction is rollbacked.

There was a PR yugabyte#20 for this same feature but I didn't use it. Indeed,
this code was dusty in my computer for a few months.

NOTE: 'message' test will fail on <= 9.5 because this feature was coded
in 9.6 (I don't want to complicate Makefile).

Fix generic WAL messages

As spotted in issue yugabyte#28, I committed a code that was not supposed to be
part of the original commit 645ab69.

Thanks to Julien Rouhaud (@rjuju) to bring it up.

Support type name with modifier

Issue yugabyte#31 asked for type name with type modifier (if available). This
commit implemented type modifier inclusion by default. It is possible to
use the old type name, if you use the option 'include-typmod' as 'false'.

I'll change the regression tests to use the new format later. I've added
a new test (typmod) to exercise this new option.

Fix a description

Remove superfluous includes

rel.h includes pg_class.h, index.h, and relcache.h. logical.h includes
output_plugin.h. All of the others are not necessary.

TupleDescAttr() support

Commit d34a74dd064af959acd9040446925d9d53dff15b introduced
TupleDescAttr(tupdesc, i) in back branches (9.4.14, 9.5.9, 9.6.5, 10). If the
version supports it, use this macro. This change will unbreak (future) version
11.

Support type oids

Issue yugabyte#37 asked for type oids in the output because psycopg2 uses it
directly for internal type manipulation. Although @lionel-panhaleux had
provided PR yugabyte#38, I didn't like the way it was implemented mainly because
it mixes options. The new option 'include-type-oids' has default to
false. I didn't provide tests for this feature.

Support not-null constraints information

Issue yugabyte#44 asked for not-null constraints information because the other
side could benefit from this. The new option 'include-not-null' will add
another name/value pair (columnoptionals) that contains false for
not-null constraint (it means that the value is not optional) or true,
otherwise. The default is false.

Improve JSON encoding: * use the postgres builtin escape_json() function * escape all type, schema, table, field names. Add tests. [yugabyte#35] * leave `\x` alone unless it's a bytea prefix [yugabyte#23] * escape generic logical decoding messages with tests

Update call to AllocSetContextCreate for PostgreSQL 9.6+

Merge pull request yugabyte#55 from davidfetter/pg_96_plus

Update call to AllocSetContextCreate for PostgreSQL 9.6+

Filter out tables

The new parameter 'filter-tables' is a comma separated value that
contains schema.table elements. Both schema and table can be * that
denotes all. Hence, *.table_1 means table_1 in all schemas and
schema_1.* means all tables in schema_1.

Some characters (space, single quote, comma, period, asterisk) have
special treatment because they are part of the comma separated value. If
any of these characters is part of the schema or table name, add a
backslash (\) before it. For example, table public."foo bar" should be
specified as 'public.foo\ bar'.

Select tables to retrieve data

The new parameter 'add-tables' select only data from those tables. Like
parameter 'filter-tables', it is a comma separated value that contains
schema.table elements. Both schema and table can be * that denotes all.
Escape also works for special characters. By default, all tables in all
schemas are selected.

Merge pull request yugabyte#40 from koordinates/encoding

Improve JSON encoding

Preserve unchanged toast values by default. Use `include-unchanged-toast=0` option to suppress.

Merge pull request yugabyte#50 from koordinates/include-unchanged-toast

Preserve unchanged toast values by default.

This commit introduces a behavior change since TOAST values will always be output from now on. If you want the previous behavior, add include-unchanged-toast=0.

Style fixes for last commit

Copyright update

Document parameters

Update examples

The examples don't reflect the defaults options. Also, rename tables in
SQL functions example (avoid table conflict with the other test).

Build wal2json out of the tree on Windows

Build out of the tree is a common task in Unix-like systems (PostgreSQL
has PGXS but it doesn't support Windows). This commit adds a project
file to build out of the tree on Windows. I also add minimal
instructions to build on Windows. I tested wal2json on MS Visual Studio
2017 (version 15.6.4) on a x64 platform.

Improve style

Fix oversight in copyright message

LICENSE is correct but C file has wrong copyright message.

Tweak message test

Per PR yugabyte#59, replace bytea value with valid data.

Sanity checks after selective table options

The sanity checks should be run after filter-tables and add-tables
options to avoid a bunch of unsolicited warnings (per issue yugabyte#64).

Refactor wal2json output

Almost all writes were handled in two different ways: with or without
pretty print. It means more lines of code and an error-prone code. The
new code avoids all of the pretty-print tests and also uses only one
setence per output. It reduces the code by more than 150 lines.

I also fix some comments (use only C89 comments) and improve style.

Revert include-unchanged-toast option

Per discussion in issue yugabyte#74, we can't rely on access unchanged TOAST
data because it is not in the WAL stream.

It reverts commits 947043e
ce82d73 and
d86a13b .

Add an error message to include-unchanged-toast parameter

Per issue yugabyte#74, parameter include-unchanged-toast wasn't safe, hence,
remove it. Since it is part of release 1.0, deprecate this option for
the next release.

Make wal2json logging less chatty

Suppress the "XXX argument is null" message. Decrease the TOAST message
level from WARNING to DEBUG1. Also, decrease some debug message levels.

Improve perf on deletes when repl ident full

Merge pull request yugabyte#82 from rkrage/repl-ident-full-perf

Improve perf on deletes when repl ident full

Add tests for include-timestamp option

Add tests for include-lsn option

Add tests for include-xids option

Merge pull request yugabyte#94 from paulczajka/add-tests

Add tests for timestamp, lsn, xids options

Avoid null pointer dereference with empty filter-tables

As spotted by clang:
wal2json.c:321:21: warning: Dereference of null pointer
                        rawstr = pstrdup(strVal(elem->arg));
                                         ^~~~~~~~~~~~~~~~~
/usr/include/postgresql/11/server/nodes/value.h:54:20: note: expanded from macro 'strVal'

Add option format-version

This option defines which format to use. Only version 1 is currently
supported. New formats will be available.

Note include-unchanged-toast is deprecated

Merge pull request yugabyte#102 from benjie/patch-1

Note `include-unchanged-toast` is deprecated

Merge pull request yugabyte#97 from xrmx/nullfiltertables

Avoid null pointer dereference with empty filter-tables

remove empty line

Add 'cd' command to README

Merge pull request yugabyte#107 from benjie/patch-2

Add 'cd' command to README

PATH should contain the parent folder of pg_config

Merge pull request yugabyte#108 from benjie/patch-3

PATH should contain the parent folder of pg_config

Fix add-tables to filter single-character schemas

Add extra_float_digits to tests

Postgres commit 02ddd499322ab6f2f0d58692955dc9633c2150fc for v12 and
later, changes the default number of trailing digits output for double
precision. Add extra_float_digits to keep regression tests stable.

Filter messages by prefix

The new parameter 'filter-msg-prefixes' filters messages with those
prefixes. It a comma separated value. By default no messages are
filtered. Per off-list discussion with @martinmarques

Updated copyright to 2019

Add messages by prefix

The new parameter 'add-msg-prefixes' adds only messages with these
prefixes. It is a comma separated value. By default all messages are
included. wal2json enforces this rule after 'filter-msg-prefixes'. Per
off-list discussion with @martinmarques

Merge pull request yugabyte#123 from davidfetter/copyright_2019

Updated copyright to 2019

Update instructions

Since version 10, we don't need to adjust parameters. Also, pg_hba.conf
changes its behavior. It matches normal connections instead of
(physical) replication connections.

Merge pull request yugabyte#113 from benjie/patch-4

Fix add-tables to filter single-character schemas

New wal2json format

This is a new format for wal2json. You can choose it using
option 'format-version' = 2. This format is completely different from
version 1.

Features are:

* one JSON per tuple;
* each JSON has an "action" (BEGIN, COMMIT, INSERT, UPDATE, DELETE,
MESSAGE);
* one (optional) JSON object for BEGIN/COMMIT;
* BEGIN contains xid, timestamp, and lsn;
* COMMIT contains xid, timestamp, and lsn;
* INSERT/UPDATE/DELETE contains lsn, schema, table, columns, identity;
* "columns" and "identity" are arrays of elements;
* each "columns" element is an object that contains name, type, value,
and optional;
* "identity" is an array of elements (REPLICA IDENTITY for UPDATE /
DELETE statements) that contains name, type, and value;
* MESSAGE contains xid, timestamp, lsn, transactional, prefix, and
content.

This new format solves the big transaction issue (that consumes a lot of
memory) since tuples aren't accumulate until the end of transaction to
be written. Users can control transactions using option
include-transaction that will emit a JSON at the beginning of the
transaction and another one at the end of the transaction.

Fix tests

Postgres 12 breaks random() predictability. Since we use the same tests
for all versions, replace random() with a sequence.

Improve regression test

message regression test used to fail with 9.4 and 9.5. That's because
message API is available in 9.6 and later. Filter out this test if the
major version is 9.4 or 9.5.

Add parameter include-origin

This parameter includes replication origin. Default is false. Format
version 2 adds origin information in BEGIN, COMMIT, CHANGE, and MESSAGE.
In format version 1, origin is also available. Since replication origin
was introduced in 9.5, 9.4 does not print origin information.

Support TRUNCATE command for logical decoding

This feature includes TRUNCATE commands into wal2json output. It was
introduced in PostgreSQL 11 which means that wal2json will support it
only for this version and on. Format version 2 will output it by
default. Format version 1 won't (maintain backward compatibility). (A
new option will be added to select actions -- insert, update, delete,
truncate -- and it should be possible to output TRUNCATE commands in
version 1 too).

Option actions

This new feature let users select actions they want to receive. Actions
supported are insert, update, delete, and truncate. This actions are
similar to PUBLICATION command.

Format version 1 won't enable truncate by default (to maintain backward
compatibility). Format version 2 will enable all options by default.

A new test was added (run only on Postgres v11 or later).  That's
because test includes truncate and it will only supported by v11 or
later.

Adjust default format version

Format version 1 will still remain the default version. wal2json does
not require a format version until a few commits ago (with no new
version released) and use a new format version will break calls that
does not inform format-version parameter. Let's give some time to users
adopt format-version parameter.

format-version 2 example

This new example shows the output for format-version 2.

Fix license oversight

The wal2json license is BSD 3-clause, however, there is a sentence in the
README that says it is released under PostgreSQL license (it is not). This is
an oversight in the commit 0b653ee.

Update copyright

Adjust instructions

Use default max_replication_slots value as suggestion for examples /
tests.

Fix oversight in option actions

This oversight was introduced in commit
aa431a9.

Issue yugabyte#148 reported by @dko-slapdash.

Fix memory leak while using add-tables and filter-tables

This issue was reported in PR yugabyte#147 by @haizz. The proposed fix was
incomplete (as I pointed out) and I fixed add-tables option and also
format 2 (add-tables and filter-tables).

Fix segfault on 32 bits

I forgot to pass a Datum and I'm surprised it haven't crashed yet.  This
commit fixes issue yugabyte#142.

Reported by @df7cb and confirmed by @cpaelzer.

Use correct format specifier

uint64 should use UINT64_FORMAT instead of %lu because format specifier
depends on a type probe. It suppresses a compiler warning.

Move some debug code for general function

Some debug messages were only printed if we are using format 1, let's
move them to general function.

Add macro for backward compatibility

UInt64GetDatum was introduced in 9.6. I forgot to test old stable
versions and didn't get it. Fortunately, @df7cb reported it.

Add parameter include-domain-data-type

This parameter replaces domain name with its underlying data type.
Default is false (backward compatibility). Both formats (1 and 2) are
supported. It is useful if you want to expose the "real" data type.

struct ReorderBufferTXN changed

Some struct members (including has_catalog_changes) was turned into a
single variable since v13 (cf postgres commit
a7b6ab5db1d35438112f74f3531354ddd61970b5).

Rephrase include-unchanged-toast docs

This parameter was deprecated some time ago. However, its description is
confusing users. Let's add a blink alert to NOT use it. It will
eventually be removed.

Add expected file for parameter include-domain-data-type

Commit e33df94 forgot to add expected
file.

Add parameter filter-origins

This parameter is useful in multidirectional replication solutions. It
can prevent replicating some data back and forth. Although, each
transaction has origin information, filter using this callback is more
efficient. This callback is available in 9.5+.

Improve Linux installation instructions

Since there are some issues that are opened from time to time, I add a
few instructions to Linux distros. I add instructions to install the
available packages from Debian/Ubuntu and RHEL/CentOS. I also write some
instructions to compile on those OSes. I hope it should clarify doubts
about installation.

Fix an oversight for old minor versions

Format 2 (1b0cbac) forgot to consider
the workaround I applied to TupleDescAttr
(5352cc4).  Since I did not test with
old minor versions, this bug was not caught by tests. This was reported
in issue yugabyte#162.

Improve configuration section

Add a comment explaining which parameters need adjust based on the
version users are using. Version 10 or later just need to change
wal_level, for example. Though, old versions have to adjust a few
parameters to use logical replication.

Add parameter include-column-positions

This parameter is useful to detect schema changes (since attnum are
numbered from 1 up). Both formats (1 and 2) are supported although
element names are different ("columnpositions" for format 1 and
"position" for format 2). Default is false.

Issue yugabyte#160

Add parameter include-default

This parameter adds "columndefaults" that contains all defaults values
(same as pg_get_expr). Both formats (1 and 2) are supported although
element names are different ("columndefaults" for format 1 and "default"
for format 2). Each default value is represented as string because
it can be an expression. Tests are included. Default is false.

I also added prototypes to 2 static functions.

Add parameter include-pk

This parameter adds primary key information if it is available. Both
formats (1 and 2) are supported. Each "pk" object provides column name
and its data type. Tests are included. Default is false.

Fix an oversight in format 2: nextlsn

Format 2 (1b0cbac) forgot to add
nextlsn. Format 1 includes this information if include-lsn is provided.
This information can be useful for applications that use LSN position as
a restart point.

Fix package name

As pointed by @MaxmaxmaximusAWS Debian/Ubuntu package name was wrong.
It fixes yugabyte#175.

Update instructions to Postgres 13

Since Postgres 13 was released a few days ago, let's update wal2json
instructions. I also replaced wal2json version with the latest version
(2.3).

Improve pretty print

Replace strcpy() and strncpy() with an assignment.

Improve documentation

Add 'message' usage to the examples. In comparison to the widely used
output plugins such as pgoutput and pglogical 2, wal2json is the only
plugin that supports message_cb.

Fix table names in the example

I renamed the tables in the first example but forgot to change the
script.

Remove unused code

Don't rely on index attribute names for primary key

If you rename a column that is part of the primary key, the attname for
the primary key index isn't renamed. Someone could argue that it is a
Postgres bug (I think it is a catalog inconsistency that could probably
be fixed eventually) but it seems the cure is worse than the disease
(see [1]).

The code was refactored to avoid comparison by attribute name, instead
it uses RelationGetIndexAttrBitmap() that obtains a bitmap of index
attribute numbers based on its kind (primary key, identity, ...).

Closes yugabyte#11

[1] https://www.postgresql.org/message-id/CAA8M49r%3D1XoE27tQ08sAPhft_ayBu4Vvib%2BubwX4SRqtUGJ%2B3g%40mail.gmail.com

Combine common code into functions

Instead of repeating code into different wal2json format output
functions, combine the common code into functions and use them.

Add docs on include-origin option.

document include-transaction option

Filter is not applied for TRUNCATE

Filter is failing to exclude tables that matchs the filter. Filter code
was refactored at commit a96dd31 but I
forgot to replace the filter in the truncate function. I include a
regression test to cover this code.

Merge pull request yugabyte#207 from olirice/master

Document include-transaction option

Avoid duplicate double quotes for type names

When an internal function already returns a type name with double
quotes, doesn't add another one.

Fix yugabyte#200

Stamp 2.4

default value was not always printed in v1

This bug prevents a default value to be printed in v1 after a TRUNCATE
statement.

Reported by @ls-guillaume-rebesche

Fixes yugabyte#225

Parameter write-in-chunks is only used for format version 1

If you set format-version to 2, parameter write-in-chunks has no effect.
Document this behavior.

Fixes yugabyte#228

Add Postgres 14 to the installation instructions

Refer to the latest Postgres version in the instructions.

Update copyright year

Add include-type-oids support in format-version 2

unbreak test on old versions

'default' test uses TRUNCATE that is not supported in v10 and previous
versions. Add an alternate test file to cover this case.

Array type does not print the correct type name

If the base type requires quotes and the type is an array, wal2json is
not quoting the whole type name. Instead, it was not considering the
brackets as part of the type name. Fix both format versions.

Fixes yugabyte#233

Lag tracking support

See lag times in pg_stat_replication.

Add PostgreSQL 15 support

Merge pull request yugabyte#247 from Naros/postgres15-compatibility

Add PostgreSQL 15 support

Stamp 2.5

Add option to output numeric data types as string. (yugabyte#255)

Add option to output numeric data types as string.

Data types like `numeric`, `real`, `double precision` supports `Infinity`,
`-Infinity` and `NaN` values. Currently these values output as `null` because
JSON specification does not recognize them as valid numeric values. This will
create problems for the users of wal2json who need these values to maintain
data integerity.

invalid JSON for non-transaction message in v1 (yugabyte#266)

Non-transactional messages contain only one object. It means comma
should not be provided for such JSON objects. If you are sending a
non-transactional message into a transaction, it should guarantee that
comma is only emitted for transactional commands. Hence, the previous
check is weak and it should also check for transactional information.

Fixes yugabyte#266

Correct DELETED tuple reference pg_decode_change_v1 (yugabyte#252)

Avoid variable shadowing

Fix typo

Fix lag tracking support for large transactions

Postgres fixed some time ago a logical replication timeout during large
transactions (commit
postgres/postgres@f95d53e). For
Postgres 16, a proper fix was added in the logical decoding but the
previous versions (10 to 14) a new function
(update-replication_progress) was added. It is the same function used by
pgoutput plugin.

Fixes yugabyte#270

Add version

Packages might rely on the version from source code instead of
extracting the version from the tarball file name.

If the WAL2JSON_VERSION contains "dev" suffix it means it is not a
released version. WAL2JSON_VERSION_NUM changes only for release. After a
release, the next commit will add the "dev" suffix that will be removed
only for the next release.

Replace tab with space

Update .gitignore

Fix include-pk when replica identity is full

In format 1, the primary key information was not included even though
the include-pk option is enabled. It happens when the table has a
primary key and the replica identity is full. Format 2 was not affected.

Fixes yugabyte#273

Update instructions with the latest PostgreSQL version

wal2json supports up to the latest supported version: 16.

Fix Red Hat package name

Update copyright year

Add PostgreSQL 17 support

Postgres 17 changed the ReorderBufferChange data structure (commit
postgres/postgres@08e6344).

Update build infrastructure for Windows

It updates MS Visual Studio from 2017 (v141) to 2022 (v143). It also
changes the target OS from 8.1 to 10.0 because Windows 8 was EOL for
some time (Windows 10 is also deprecated 3 months ago but since it is
the one I used to test these builds, I kept it. Feel free to update
WindowsTargetPlatformVersion if you are using a newer Windows version.)

It also includes a solution file (wal2json.sln -- although there aren't
multiple project files) that you can also use to trigger the build using
the following command:

msbuild wal2json.sln /verbosity:normal /p:Configuration=Release

I tested these builds with Postgres 10 and later on x86_64.

Stamp 2.6

Add 'src/postgres/third-party-extensions/wal2json/' from commit '75629c2e1e81a12350cc9d63782fc53252185d8d'

git-subtree-dir: src/postgres/third-party-extensions/wal2json
git-subtree-mainline: 66ed3a5
git-subtree-split: 75629c2

Test Plan: jenkins: compile only

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D36690
myang2021 added a commit that referenced this issue Aug 8, 2024
Summary:
The DDL atomicity stress tests failed more on pg15 branch with an error like:

```
WARNING: ThreadSanitizer: data race (pid=180911)
  Write of size 8 at 0x7b2c000257b8 by thread T17 (mutexes: write M0):
    #0 profile_open_file prof_file.c (libkrb5.so.3+0xf45b3)
    #1 profile_init_flags <null> (libkrb5.so.3+0xfb056)
    #2 k5_os_init_context <null> (libkrb5.so.3+0xe5546)
    #3 krb5_init_context_profile <null> (libkrb5.so.3+0xabc90)
    #4 krb5_init_context <null> (libkrb5.so.3+0xabbd5)
    #5 krb5_gss_init_context init_sec_context.c (libgssapi_krb5.so.2+0x448da)
    #6 acquire_cred_from acquire_cred.c (libgssapi_krb5.so.2+0x39159)
    #7 krb5_gss_acquire_cred_from acquire_cred.c (libgssapi_krb5.so.2+0x39072)
    #8 gss_add_cred_from <null> (libgssapi_krb5.so.2+0x1fcd3)
    #9 gss_acquire_cred_from <null> (libgssapi_krb5.so.2+0x1f69d)
    #10 gss_acquire_cred <null> (libgssapi_krb5.so.2+0x1f431)
    #11 pg_GSS_have_cred_cache ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/../../../../../../src/postgres/src/interfaces/libpq/fe-gssapi-common.c:68:10 (libpq.so.5+0x543fe)
    #12 PQconnectPoll ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/../../../../../../src/postgres/src/interfaces/libpq/fe-connect.c:2909:22 (libpq.so.5+0x359ca)
    #13 connectDBComplete ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/../../../../../../src/postgres/src/interfaces/libpq/fe-connect.c:2241:10 (libpq.so.5+0x30807)
    #14 PQconnectdb ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/../../../../../../src/postgres/src/interfaces/libpq/fe-connect.c:719:10 (libpq.so.5+0x30af1)
    #15 yb::pgwrapper::PGConn::Connect(string const&, std::chrono::time_point<yb::CoarseMonoClock, std::chrono::duration<long long, std::ratio<1l, 1000000000l>>>, bool, string const&) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_utils.cc:348:24 (libpq_utils.so+0x13c5b)
    #16 yb::pgwrapper::PGConn::Connect(string const&, bool, string const&) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_utils.h:254:12 (libpq_utils.so+0x1a77e)
    #17 yb::pgwrapper::PGConnBuilder::Connect(bool) const ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_utils.cc:743:10 (libpq_utils.so+0x1a77e)
    #18 yb::pgwrapper::LibPqTestBase::ConnectToDBAsUser(string const&, string const&, bool) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_test_base.cc:54:6 (libpg_wrapper_test_base.so+0x26f34)
    #19 yb::pgwrapper::LibPqTestBase::ConnectToDB(string const&, bool) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_test_base.cc:44:10 (libpg_wrapper_test_base.so+0x26b1e)
    #20 yb::pgwrapper::LibPqTestBase::Connect(bool) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_test_base.cc:40:10 (libpg_wrapper_test_base.so+0x26b1e)
    #21 yb::pgwrapper::PgDdlAtomicityStressTest::Connect() ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/pg_ddl_atomicity_stress-test.cc:147:25 (pg_ddl_atomicity_stress-test+0x136d6c)
    #22 yb::pgwrapper::PgDdlAtomicityStressTest::TestDdl(std::vector<string, std::allocator<string>> const&, int) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/pg_ddl_atomicity_stress-test.cc:165:15 (pg_ddl_atomicity_stress-test+0x136df5)
    #23 yb::pgwrapper::PgDdlAtomicityStressTest_StressTest_Test::TestBody()::$_2::operator()() const ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/pg_ddl_atomicity_stress-test.cc:316:5 (pg_ddl_atomicity_stress-test+0x13d2eb)
```

It appears that the function `yb::pgwrapper::LibPqTestBase::Connect` isn't
thread safe. I restructured the code to make the connections in a single thread
and then pass them to various concurrent threads for testing.
Jira: DB-2996

Test Plan:
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/0 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/1 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/2 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/3 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/4 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/5 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/6 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/7 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/8 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/9 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/10 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/11 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/12 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/13 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/14 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/15 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/16 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/17 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/18 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/19 --clang17

Verified that no more tsan errors.

Reviewers: fizaa

Reviewed By: fizaa

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D37111
myang2021 added a commit that referenced this issue Aug 8, 2024
… in tsan build

Summary:
The DDL atomicity stress tests failed more on pg15 branch with an error like:

```
WARNING: ThreadSanitizer: data race (pid=180911)
  Write of size 8 at 0x7b2c000257b8 by thread T17 (mutexes: write M0):
    #0 profile_open_file prof_file.c (libkrb5.so.3+0xf45b3)
    #1 profile_init_flags <null> (libkrb5.so.3+0xfb056)
    #2 k5_os_init_context <null> (libkrb5.so.3+0xe5546)
    #3 krb5_init_context_profile <null> (libkrb5.so.3+0xabc90)
    #4 krb5_init_context <null> (libkrb5.so.3+0xabbd5)
    #5 krb5_gss_init_context init_sec_context.c (libgssapi_krb5.so.2+0x448da)
    #6 acquire_cred_from acquire_cred.c (libgssapi_krb5.so.2+0x39159)
    #7 krb5_gss_acquire_cred_from acquire_cred.c (libgssapi_krb5.so.2+0x39072)
    #8 gss_add_cred_from <null> (libgssapi_krb5.so.2+0x1fcd3)
    #9 gss_acquire_cred_from <null> (libgssapi_krb5.so.2+0x1f69d)
    #10 gss_acquire_cred <null> (libgssapi_krb5.so.2+0x1f431)
    #11 pg_GSS_have_cred_cache ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/../../../../../../src/postgres/src/interfaces/libpq/fe-gssapi-common.c:68:10 (libpq.so.5+0x543fe)
    #12 PQconnectPoll ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/../../../../../../src/postgres/src/interfaces/libpq/fe-connect.c:2909:22 (libpq.so.5+0x359ca)
    #13 connectDBComplete ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/../../../../../../src/postgres/src/interfaces/libpq/fe-connect.c:2241:10 (libpq.so.5+0x30807)
    #14 PQconnectdb ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/../../../../../../src/postgres/src/interfaces/libpq/fe-connect.c:719:10 (libpq.so.5+0x30af1)
    #15 yb::pgwrapper::PGConn::Connect(string const&, std::chrono::time_point<yb::CoarseMonoClock, std::chrono::duration<long long, std::ratio<1l, 1000000000l>>>, bool, string const&) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_utils.cc:348:24 (libpq_utils.so+0x13c5b)
    #16 yb::pgwrapper::PGConn::Connect(string const&, bool, string const&) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_utils.h:254:12 (libpq_utils.so+0x1a77e)
    #17 yb::pgwrapper::PGConnBuilder::Connect(bool) const ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_utils.cc:743:10 (libpq_utils.so+0x1a77e)
    #18 yb::pgwrapper::LibPqTestBase::ConnectToDBAsUser(string const&, string const&, bool) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_test_base.cc:54:6 (libpg_wrapper_test_base.so+0x26f34)
    #19 yb::pgwrapper::LibPqTestBase::ConnectToDB(string const&, bool) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_test_base.cc:44:10 (libpg_wrapper_test_base.so+0x26b1e)
    #20 yb::pgwrapper::LibPqTestBase::Connect(bool) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/libpq_test_base.cc:40:10 (libpg_wrapper_test_base.so+0x26b1e)
    #21 yb::pgwrapper::PgDdlAtomicityStressTest::Connect() ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/pg_ddl_atomicity_stress-test.cc:147:25 (pg_ddl_atomicity_stress-test+0x136d6c)
    #22 yb::pgwrapper::PgDdlAtomicityStressTest::TestDdl(std::vector<string, std::allocator<string>> const&, int) ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/pg_ddl_atomicity_stress-test.cc:165:15 (pg_ddl_atomicity_stress-test+0x136df5)
    #23 yb::pgwrapper::PgDdlAtomicityStressTest_StressTest_Test::TestBody()::$_2::operator()() const ${YB_SRC_ROOT}/src/yb/yql/pgwrapper/pg_ddl_atomicity_stress-test.cc:316:5 (pg_ddl_atomicity_stress-test+0x13d2eb)
```

It appears that the function `yb::pgwrapper::LibPqTestBase::Connect` isn't
thread safe. I restructured the code to make the connections in a single thread
and then pass them to various concurrent threads for testing.
Jira: DB-2996

Original commit: bd4874b / D37111

Test Plan:
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/0 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/1 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/2 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/3 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/4 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/5 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/6 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/7 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/8 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/9 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/10 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/11 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/12 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/13 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/14 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/15 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/16 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/17 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/18 --clang17
./yb_build.sh tsan --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest/PgDdlAtomicityStressTest.StressTest/19 --clang17

Verified that no more tsan errors.

Reviewers: fizaa

Reviewed By: fizaa

Subscribers: yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D37167
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement This is an enhancement of an existing feature
Projects
Development

No branches or pull requests

5 participants