observer fails to start, and the error messages in the log file are too generic to pinpoint the actual error #97
Comments
There is an internal restriction that memory_limit must be in [8G, ). The documentation and the parameter description are wrong, and we will fix this later. For your question, setting memory_limit to 8G may solve the problem. Thank you very much for your feedback.
Bad luck: after modifying the configuration parameters, observer still failed to start up. ......
Please provide more of the log for this error.
Run grep ERROR observer.log and find the first errors.
[Issue summary]
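observer fails to start, but it is hard for the user to get useful error information from the log file (observer.log). The log records a return code (-4147),
but looking up this error code does not lead to a corresponding error message. This is not as helpful as Oracle's oerr tool: you enter an error number and it returns
the corresponding error message, and the message is available in multiple languages. For example, with the locale set to zh_CN, running oerr xxx returns the error message in Chinese.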
[Steps]
ERROR [SERVER] init_config (ob_server.cpp:832) [1320][0][Y0-0000000000000000] [lt=32] invalid config from cmdline options(opts_.optstr_="__min_full_resource_pool_memory=268435456,datafile_size=8G,memory_limit=4G,system_memory=2G,stack_size=512K,cpu_count=1,cache_wash_threshold=1G,workers_per_cpu_quota=10,schema_history_expire_time=1d,net_thread_count=1,sys_bkgd_migration_retry_num=3,minor_freeze_times=10,enable_separate_sys_clog=0,enable_merge_by_turn=False,datafile_disk_percentage=20", ret=-4147) BACKTRACE:0x90a107e 0x90008fb 0x24d18a1 0x251b43b 0x8702a2e 0x86fe493 0x24be805 0x7fb34909f555 0x24bd4e9
Full ERROR output from observer.log
[root@redis-server-1 log]# grep -i "error" observer.log
[2021-06-10 17:36:44.655716] ERROR [SERVER] init_config (ob_server.cpp:832) [6630][0][Y0-0000000000000000] [lt=32] invalid config from cmdline options(opts_.optstr_="__min_full_resource_pool_memory=268435456,datafile_size=8G,memory_limit=2G,system_memory=2G,stack_size=128K,cpu_count=1,cache_wash_threshold=512M,workers_per_cpu_quota=1,schema_history_expire_time=1d,net_thread_count=1,sys_bkgd_migration_retry_num=3,minor_freeze_times=10,enable_separate_sys_clog=0,enable_merge_by_turn=False,datafile_disk_percentage=20", ret=-4147) BACKTRACE:0x90a107e 0x90008fb 0x24d18a1 0x251b43b 0x8702a2e 0x86fe493 0x24be805 0x7f5e95a07555 0x24bd4e9
[2021-06-10 17:36:44.656767] INFO ob_server_config.cpp:242 [6630][0][Y0-0000000000000000] [lt=4] | ignore_replay_checksum_error = False
[2021-06-10 17:36:44.656810] INFO ob_server_config.cpp:242 [6630][0][Y0-0000000000000000] [lt=4] | ignore_replica_checksum_error = False
[2021-06-10 17:36:44.657534] INFO ob_server_config.cpp:242 [6630][0][Y0-0000000000000000] [lt=4] | enable_rich_error_msg = False
[2021-06-10 17:36:44.661805] ERROR [SERVER] init (ob_server.cpp:165) [6630][0][Y0-0000000000000000] [lt=5] init config fail(ret=-4147) BACKTRACE:0x90a107e 0x90008fb 0x24c152f 0x24c04f6 0x86fee88 0x24be805 0x7f5e95a07555 0x24bd4e9
[2021-06-10 17:36:44.663553] ERROR stop (ob_ddl_task_executor.cpp:176) [6630][0][Y0-0000000000000000] [lt=5] invalid tg id BACKTRACE:0x90a107e 0x90008fb 0x24c00eb 0x24bd7a5 0x6257766 0x5ef0630 0x65dfebf 0x86fd61f 0x86fef2d 0x24be805 0x7f5e95a07555 0x24bd4e9
[2021-06-10 17:36:44.663732] ERROR wait (ob_ddl_task_executor.cpp:181) [6630][0][Y0-0000000000000000] [lt=177] invalid tg id BACKTRACE:0x90a107e 0x90008fb 0x24c00eb 0x24bd7a5 0x6257956 0x5ef0638 0x65dfebf 0x86fd61f 0x86fef2d 0x24be805 0x7f5e95a07555 0x24bd4e9
[2021-06-10 17:36:44.666418] ERROR [SERVER] main (main.cpp:485) [6630][0][Y0-0000000000000000] [lt=6] observer init fail(ret=-4147) BACKTRACE:0x90a107e 0x90008fb 0x24c152f 0x24c04f6 0x24bea5e 0x7f5e95a07555 0x24bd4e9
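As a rough illustration of the "find the first error" advice, the ERROR lines above could be reduced to the source location, the return code, and the command-line options that were rejected. This is only a sketch: the regular expressions are assumptions based on the log lines pasted in this issue, not a documented observer log format, and `first_error` is a hypothetical helper name.

```python
import re

# Patterns inferred from the log lines pasted in this issue (an assumption,
# not a documented format): "ERROR ... (file.cpp:LINE) ... ret=-NNNN".
ERROR_RE = re.compile(r'ERROR\s.*?\((?P<src>[\w.]+:\d+)\).*?ret=(?P<ret>-?\d+)')
OPTSTR_RE = re.compile(r'optstr_="(?P<opts>[^"]*)"')


def first_error(lines):
    """Return (source, ret_code, options) for the first ERROR line with a ret code.

    Returns None when no such line exists. `options` is a dict built from the
    optstr_ value on that line, or an empty dict if the line has none.
    """
    for line in lines:
        match = ERROR_RE.search(line)
        if not match:
            continue  # INFO lines and ERROR lines without ret= are skipped
        options = {}
        optstr = OPTSTR_RE.search(line)
        if optstr:
            for pair in optstr.group("opts").split(","):
                if pair:
                    key, _, value = pair.partition("=")
                    options[key] = value
        return match.group("src"), int(match.group("ret")), options
    return None
```

Called on the lines of observer.log, this would surface the init_config failure and show at a glance which memory_limit value the server actually received, which can differ from what the configuration file appears to say.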
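I looked through the configuration file for a long time and could not find anything wrong. I am testing on a virtual machine that does not have that much memory or CPU allocated, so I only modified the corresponding parameters.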
oceanbase-ce:
  servers:
    - 127.0.0.1
  global:
    home_path: /root/observer
    devname: lo
    mysql_port: 2883
    rpc_port: 2882
    zone: zone1
    cluster_id: 1
    datafile_size: 8G
    memory_limit: 4G
    system_memory: 2G
    stack_size: 512K
    cpu_count: 1
    cache_wash_threshold: 512M
    __min_full_resource_pool_memory: 268435456
    workers_per_cpu_quota: 1
    schema_history_expire_time: 1d
    net_thread_count: 1
    sys_bkgd_migration_retry_num: 3
    minor_freeze_times: 10
    enable_separate_sys_clog: 0
    enable_merge_by_turn: FALSE
    datafile_disk_percentage: 20
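The first validity check of the configuration parameters should happen when obd cluster deploy is executed:
-- If validation passes, complete the deployment.
-- If validation fails (for example, the /root/observer directory is not empty), print the corresponding error message.
Then the configuration file must be validated again when the cluster is started, because the user may well have modified it after deploy.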
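The two-stage check described above might look roughly like the following. It is a minimal sketch, not how obd or observer actually validates: `validate` and `parse_size` are hypothetical names, the only parameter rule encoded is the 8G lower bound for memory_limit that the maintainers mention in the comments, and the non-empty home_path check is the example given in this issue.

```python
import os

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}
# Lower bound for memory_limit as stated by the maintainers in this thread.
MIN_MEMORY_LIMIT = 8 * UNITS["G"]


def parse_size(text):
    """Convert values such as '4G', '512K' or '268435456' to bytes."""
    text = str(text).strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def validate(config, deploying=False):
    """Return a list of human-readable problems; an empty list means OK.

    `config` is the 'global' mapping from the deployment file. Set
    `deploying=True` to also run the checks that only apply at deploy time.
    """
    problems = []
    limit = config.get("memory_limit")
    if limit is not None and parse_size(limit) < MIN_MEMORY_LIMIT:
        problems.append(f"memory_limit={limit} is below the minimum of 8G")
    home = config.get("home_path")
    if deploying and home and os.path.isdir(home) and os.listdir(home):
        problems.append(f"home_path {home} is not empty")
    return problems
```

Running the same function at deploy time and again at start time (with `deploying=False`) would cover the case where the file is edited after deployment, and each failure names the offending parameter instead of only a return code.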
[Suggestions]
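The current validation flow itself has no major problem; the main problem lies in the interaction with the user. The log file is an important clue that helps users locate errors,
but at present, although it prints a great deal of content, it offers little substantive, useful information and can hardly give users
a clear lead to the real problem. This needs to be improved; the product still has some way to go before it meets product-grade standards.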