tdengine-3.0.2.4 swarm cluster (3mnode-3dnodes-3replica): 2/3 nodes restart cannot be recovered #19837
Labels
bug
Something isn't working
Comments
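I've run into the same problem here and haven't solved it yet; it's a headache.
You could upgrade to 3.0.3.0. If the problem persists, I can take a look remotely.
This is probably caused by Docker's own domain name resolution mechanism.
Updated to 3.0.3.0; restarting the cluster still produces errors.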
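tdengine-3.0.4.0 swarm cluster restart test. The attachment contains the td-1 logs [taoslogs_20230417_td-3.0.4.0.tar.gz]. The test steps are the same as in the previous test; td-1/2/3 are the 3 TDengine services.
service td-2/td-3 fail to start
service td-1 fails to start:
Bug
Solution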
lazyky
added a commit
to lazyky/TDengine
that referenced
this issue
Apr 18, 2023
add check if dnode has been created when TDengine docker image starts Resolves taosdata#19837
sangshuduo
pushed a commit
that referenced
this issue
Apr 20, 2023
* fix: tdengine swarm cluster(3 mnodes) startup error: add check if dnode has been created when TDengine docker image starts. Resolves #19837
* refactor: change the check (#6f53e8ed76) location
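The commit message above describes the fix only as a check, made when the TDengine docker image starts, for whether the dnode has already been created. As a rough illustration of that idea (not the actual patch), the sketch below decides whether a restarting container should register itself again. It assumes the list of known dnodes is available as text with `|`-separated columns, one of which holds the `host:port` endpoint. The function names, the action strings, and the sample listing are all hypothetical.

```python
# Hypothetical sketch of a "has this dnode already been created?" check.
# Assumption: the dnode listing is plain text, one dnode per line, with
# '|'-separated columns, one of which is the endpoint as "host:port".

def dnode_already_created(endpoint: str, dnode_listing: str) -> bool:
    """Return True if `endpoint` appears as a whole column in the listing."""
    for line in dnode_listing.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if endpoint in fields:  # exact column match, not a substring match
            return True
    return False


def startup_action(endpoint: str, dnode_listing: str) -> str:
    """Pick what a (re)starting container should do (hypothetical labels)."""
    if dnode_already_created(endpoint, dnode_listing):
        return "skip-create"   # already registered: just start the service
    return "create-dnode"      # first start: register with the cluster


# Made-up sample listing, for illustration only.
SAMPLE_LISTING = """
 id | endpoint  | vnodes | status |
  1 | td-1:6030 |      2 | ready  |
  2 | td-2:6030 |      2 | ready  |
"""

if __name__ == "__main__":
    for ep in ("td-2:6030", "td-3:6030"):
        print(ep, startup_action(ep, SAMPLE_LISTING))
```

Comparing whole columns rather than searching for a substring avoids treating `td-2:603` as a match for `td-2:6030`.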
AdamEECS
added a commit
that referenced
this issue
Apr 23, 2023
* fix: tdengine swarm cluster(3 mnodes) startup error (#20966)
* fix: tdengine swarm cluster(3 mnodes) startup error add check if dnode has been created when TDengine docker image starts Resolves #19837
* refactor: change the check (#6f53e8ed76) location
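Bug Description
Disaster-recovery test of a docker swarm cluster on three low-performance PCs (3mnode-3dnodes-3replica).
[...] server-01/server-02, data was inserted; the TDengine cluster ran normally and the sync logs were visible.
[...] server-01 and server-02; after they were powered back on, td-2 and td-3 showed the following errors, and the TDengine cluster could not be deployed normally.
To Reproduce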
Steps to reproduce the behavior:
Expected Behavior
After the restart, the tdengine cluster should deploy normally.
Environment (please complete the following information):