TeamCity CI data fault-tolerance mechanism #11254

Closed
guochaorong opened this issue Jun 7, 2018 · 1 comment

guochaorong commented Jun 7, 2018

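On the night of June 6, the TeamCity DB was deleted and the historical build data was lost.

Root cause analysis:

Starting at 11:30 PM on June 6, the TeamCity DB began failing: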
org.springframework.transaction.CannotCreateTransactionException: Could not open JDBC Connection for transaction; nested exception is java.sql.SQLException: Connection to 172.19.32.197:5432 refused.

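The log recorded more than 4,000 DB connection errors per minute on average, continuing until 11:54 PM.
During the same period, the server was also handling a large volume of RPC traffic with other agents.

We suspect that the Docker container hosting the TeamCity DB was stopped at this point.

The crontab on our TeamCity server machine contains: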
[root@k8s-node3 ~]# crontab -l
0 0 * * * docker system prune -f
1 0 * * * docker exec -d 3fade6bf77e9 bash /opt/teamcity/bin/daily_backup.sh

The entry 0 0 * * * docker system prune -f removes all stopped Docker containers at midnight, which would include the stopped DB container.

Inside the TeamCity server container:
root@3fade6bf77e9:/opt/teamcity/logs# cat /opt/teamcity/bin/daily_backup.sh
rm -f /data/teamcity_server/datadir/backup/daily_backup.zi*
/opt/teamcity/bin/maintainDB.sh backup --all -F /data/teamcity_server/datadir/backup/daily_backup
The script deletes all existing backups first, and only then starts the new backup.

As a result, all build history data was deleted.

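Fault-tolerance measures:

  1. When running 0 0 * * * docker system prune -f, exclude the TeamCity server and TeamCity DB containers.

  2. Add a daily crontab job that runs docker commit on the DB container and pushes the image to Docker Hub (this step can be dropped if step 3 preserves the build history).

  3. Data backup: take the new backup first (and upload it to a cloud server), and delete the previous backup only after the new one succeeds.

  4. The password on the TeamCity server machine is too weak and needs to be changed.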
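
Measure 3 could be sketched roughly as below. This is only an illustration under assumptions, not the script that was actually deployed: the function name safe_backup, the BACKUP_DIR and BACKUP_CMD overrides, and the timestamped file name are hypothetical, and the sketch assumes maintainDB.sh exits non-zero on failure and writes a file whose name starts with the value passed to -F. The upload step is left as a comment because the issue does not name a storage service.

```shell
#!/usr/bin/env bash
# Sketch of "back up first, delete old backups only on success".
# Paths default to the ones quoted in the issue; BACKUP_DIR and BACKUP_CMD
# are hypothetical overrides added so the function can be tried out safely.

safe_backup() {
    local backup_dir="${BACKUP_DIR:-/data/teamcity_server/datadir/backup}"
    local backup_cmd="${BACKUP_CMD:-/opt/teamcity/bin/maintainDB.sh}"
    local stamp new_base new_file f

    stamp="$(date +%Y%m%d_%H%M%S)"
    new_base="${backup_dir}/daily_backup_${stamp}"

    # 1. Write the new backup under a fresh name; existing backups stay untouched.
    if ! "$backup_cmd" backup --all -F "$new_base"; then
        echo "backup failed, keeping previous backups" >&2
        rm -f -- "${new_base}"*    # drop any partial output of the failed run
        return 1
    fi

    # 2. Make sure the tool really produced a non-empty file.
    new_file="$(ls -1 "${new_base}"* 2>/dev/null | head -n 1)"
    if [ -z "$new_file" ] || [ ! -s "$new_file" ]; then
        echo "backup file missing or empty, keeping previous backups" >&2
        return 1
    fi

    # 3. (Assumption) upload "$new_file" to remote storage here and
    #    return 1 if the upload fails, before anything is deleted.

    # 4. Only now remove older backups, skipping the one just created.
    for f in "${backup_dir}"/daily_backup*; do
        [ -e "$f" ] || continue
        case "$f" in
            "${new_base}"*) continue ;;
        esac
        rm -f -- "$f"
    done
}
```

The cron job would then call safe_backup instead of running rm before maintainDB.sh. For measure 1, one possible approach (assuming a Docker version whose prune command supports label filters, and that the two containers were created with a label such as keep=true) is docker system prune -f --filter "label!=keep=true". Labels cannot be added to an existing container, so the containers would need to be recreated with the label first.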

@shanyi15

Hello, this issue has not been updated in the past month, so we will close it today for the sake of other users' experience. If you still need to follow up on this question after it is closed, please feel free to reopen it, and we will get back to you within 24 hours. We apologize for any inconvenience caused by the closure, and thank you for your support of PaddlePaddle!
