Teslamate fails with "Bad Gateway" after 1-4 days of normal operation while Grafana shows "404 page not found" #3793
Comments
Thanks for reporting.
Your TZ environment variable is not valid. Even if unrelated: your Postgres version is also very outdated. Please do a full backup and follow the upgrade guide at https://docs.teslamate.org/docs/maintenance/upgrading_postgres Also check your Traefik config, as the unreachable issue is related to your config, imo. |
@JakobLichterfeld thank you kindly for the reply and feedback, as well as for flagging that I am running an older Postgres version. What version # should I use? While I will be sure to back up the DB and update Postgres, can you please explain how it is possible that Teslamate, Traefik, the "TZ Environment" and "old" Postgres work together just fine for 2-4 days when I restore from the last backup, and then just days later I get the "Bad Gateway" error? Here is a redacted image of my .env file: Something is happening after it runs just fine for a few days, and this all started about 2 Teslamate version updates ago, right before Docker needed to be updated to v20+. Thank you. |
The correct name of the time zone environment variable is TZ. It was not really used in old versions; #3646 changed that as of 1.28.3. In #3678 people had the same issue because an incorrect TZ variable name was used. |
Thanks @JakobLichterfeld but I am using the correct timezone variable. Here is an excerpt of my YML config file: Now again, the mystery is why it works fine for a few days after a restore from backup. I actually just had to restore from backup again, since it was showing "Bad Gateway". Now it again works fine with TM v1.28.5, but I noticed something new: it is possible the core issue is that it is running out of disk space! I issued a "df -h" and got this: Again, even with 0 free space it is working now, but I bet it will crash again in a day or two unless I figure out how to free up some space... To solve this I first ran "curl -Ls http://bit.ly/clean-centos-disk-space | sudo bash" but it only cleaned up about 500MB. Digging deeper, here is the output from "docker system df": If the core issue was indeed lack of space, the currently running restored backup of TM v1.28.5 should now run without fail for more than a few days. Let's see... |
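For anyone wanting to catch a full disk before it takes the stack down, here is a minimal Python sketch. It is not part of TeslaMate; the function name `disk_report` and the 10% threshold are assumptions chosen for illustration. It only uses `shutil.disk_usage` from the standard library and reports roughly what "df -h" shows for one mount point.

```python
import shutil


def disk_report(path="/", min_free_fraction=0.10):
    """Return (free_fraction, is_low) for the filesystem containing `path`.

    `min_free_fraction` is an illustrative threshold, not a TeslaMate setting.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    if usage.total == 0:
        # Pseudo-filesystems can report zero size; treat that as "low".
        return 0.0, True
    free_fraction = usage.free / usage.total
    return free_fraction, free_fraction < min_free_fraction


if __name__ == "__main__":
    free, low = disk_report("/")
    note = "  <-- low, consider freeing space" if low else ""
    print(f"free space on /: {free:.1%}{note}")
```

Run from cron or a systemd timer, a check like this would flag the problem days before "Bad Gateway" appears, assuming disk exhaustion really is the cause.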
Yeah sorry, your TZ seems to be right, but the log shows otherwise. Yeah a Anyways, as I first said, this is not a TeslaMate issue |
Thanks @JakobLichterfeld Interesting that you confirm TZ is set up correctly, yet the log still shows an issue. Also, I attempted the Postgres version update from v12 to v15 (as per here), made a DB backup and DB restore (as per here), but it failed to come back up! Let's see if it stays active now for more than a few days... |
As you do not use the official installation, nothing is confirmed 😀 If the disk is full, weird symptoms may occur. I do not see a TeslaMate issue here, as I wrote before. |
Sorry @JakobLichterfeld but could you please elaborate on what you mean when you say I am not using the "official installation"? I am using the "Advanced installation with Traefik, Let's Encrypt & HTTP Basic Auth" (as per HERE), and have been using it for 3 years! |
There are numerous possible ways you can configure Teslamate to start. As much as we would like to, we are not able to provide support for every single user. Which doesn't mean there is anything wrong with what you are doing, just that our ability to help you is limited. Having said that I notice the line above is:
Shouldn't that be:
? |
Good catch @brianmay thank you. I have fixed As for my configuration, it is "stock" and the advanced config in the manual. Only thing that changed in the last 3Y was the version of Docker and the version of Postgres DB. Nothing specific to me. |
Same here. Disk space is fine, Grafana is working but TeslaMate gives me a 404. |
What version of Docker are those impacted running? See #3754 for more details. There was also a suggestion in the following comment to add a line to your YML file. |
Hi @cwanja I believe my issue was old Docker 19 followed by lack of disk space, since after the last full system image restore on 3/30/2024 the system is still up. Today is the 4th day still up, so if it works tomorrow as well then lack of space was the issue. Purging stale images in Docker did the trick for me. I also updated Docker to v26 (see here) but failed to update Postgres DB from 12 to 14 so still running PG DB 12. Hopefully that does not become an issue as well. |
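The thread does not say which command was used to purge the stale images, so the following is only a sketch under assumptions: it uses the standard Docker CLI subcommands `docker system df` and `docker image prune`, wrapped in a hypothetical Python helper `run_steps` that defaults to a dry run so nothing is deleted by accident.

```python
import subprocess

# Assumption: standard Docker CLI subcommands. The exact command used in
# this thread is not stated, so treat this list as illustrative only.
PRUNE_STEPS = [
    ["docker", "system", "df"],                # space used by images, containers, volumes
    ["docker", "image", "prune", "-a", "-f"],  # remove all images not used by any container
    ["docker", "system", "df"],                # show usage again to see what was reclaimed
]


def run_steps(steps, dry_run=True):
    """Print each command and run it only when dry_run is False.

    Returns the commands as strings so the plan can be inspected first.
    """
    shown = []
    for cmd in steps:
        line = " ".join(cmd)
        shown.append(line)
        print("$ " + line)
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return shown


if __name__ == "__main__":
    run_steps(PRUNE_STEPS)  # dry run: prints the plan only
```

Note that `docker image prune -a` removes every image not referenced by a container, so run it while the TeslaMate stack is up (or be prepared to pull images again), and take a backup first.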
Thanks @SemoTech. Please close the issue whenever you feel it's completed. |
Thanks for reporting. If your disk space is fine, then it is not the same issue, so please open a separate one (if you are using docker-compose v1, please migrate to docker compose v2 and rebuild your docker-compose stack). |
My Teslamate and Grafana still work fine today so I believe the space issue was the core cause. I am closing this case. Thanks again for all the help in troubleshooting and keep up the great work in Teslamate! |
@SemoTech Can you provide instructions on how you upgraded your system? The official guide for Debian is not working for me; I'm hosting TM on Google Cloud. |
My issue was running out of space. I ran the docker command ( |
Is there an existing issue for this?
What happened?
For the last 2 updates (currently on v1.28.5) and after a few days of normal operation, Teslamate becomes unreachable and shows just "Bad Gateway". At the same time, trying to access Grafana I get "404 page not found"
I saw there was another thread with this "Bad Gateway" issue where there were reports that restarting Docker fixed it, but in my case neither restarting Docker nor rebooting my server helps! Both Teslamate and Grafana are broken!
My only resort has been to revert to a snapshot taken 2 days before (while losing all data gathered since). If I fail to do this, the next automated snapshot will be of the broken Teslamate, and I will be stuck permanently until there is a proper & permanent fix.
For reference, my OS is CentOS 7, Docker version is v26.0.0, build 2ae903e, Docker Compose is version v2.25.0 and here is my yaml file:
Expected Behavior
Normal operation
Steps To Reproduce
Deploy latest teslamate, wait 1-4 days...
Relevant log output
Screenshots
No response
Additional data
No response
Type of installation
Docker
Version
v1.28.5