Applications deployed with strategy --rolling do not produce crash logs. #2587

Closed
dzaslavskiy opened this issue Dec 9, 2021 · 1 comment

dzaslavskiy commented Dec 9, 2021

Issue

Applications deployed with strategy --rolling do not produce crash logs.

Context

Normally, when an application crashes it produces a log with a message detailing the crash:

App instance exited with guid f94d3889-3229-4045-befd-af282865dc2a payload: {"instance"=>"1016193b-39d2-42dc-69c8-27d9", "index"=>0, "cell_id"=>"b701ee4f-885a-4c07-80c1-036e3d494e0a", "reason"=>"CRASHED", "exit_description"=>"APP/PROC/WEB: Exited with status 1", "crash_count"=>1, "crash_timestamp"=>1638762864524964378, "version"=>"0d53cd6f-6270-49e9-8ba8-1acd5d9efb58"}
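When checking whether this message is present in a log stream, it can help to pull the fields out of it. The following is a minimal sketch, not part of Cloud Controller: the function name `parse_crash_line` and the regular expression are hypothetical, and it assumes the payload is printed in Ruby hash-inspect form with only string and integer values, none of which contain `=>`.

```python
import json
import re

# Hypothetical pattern matching the crash message quoted above.
LINE_RE = re.compile(
    r"App instance exited with guid (?P<guid>\S+) payload: (?P<payload>\{.*\})"
)

def parse_crash_line(line):
    """Return (guid, payload_dict) for a crash line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    # Assumption: swapping Ruby's "=>" for ":" yields valid JSON for this payload.
    payload = json.loads(match.group("payload").replace("=>", ":"))
    return match.group("guid"), payload

sample = (
    'App instance exited with guid f94d3889-3229-4045-befd-af282865dc2a payload: '
    '{"instance"=>"1016193b-39d2-42dc-69c8-27d9", "index"=>0, '
    '"cell_id"=>"b701ee4f-885a-4c07-80c1-036e3d494e0a", "reason"=>"CRASHED", '
    '"exit_description"=>"APP/PROC/WEB: Exited with status 1", "crash_count"=>1, '
    '"crash_timestamp"=>1638762864524964378, '
    '"version"=>"0d53cd6f-6270-49e9-8ba8-1acd5d9efb58"}'
)

guid, payload = parse_crash_line(sample)
```

Lines that do not follow this format, such as the shorter `Process has crashed with type: "web"` message, make the function return `None`.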

However, if the application is deployed with strategy --rolling and then redeployed, it no longer produces these log messages when it crashes.

There is another log message that is associated with crashes. This message is produced for both types of deployment strategies but it does not contain all of the information of the above message.

Process has crashed with type: "web"

Steps to Reproduce

Deploy an application with strategy --rolling, then redeploy it. Cause the app to crash. It will not produce the above log message.

Expected result

The application produces the above log message regardless of deployment strategy.

Current result

The application does not produce the above log.

Possible Fix

This may be related to the fact that with strategy --rolling the process GUID changes in a different way than with normal deployments.


tcdowney commented May 9, 2024

Interesting, good find.

This may be related to the fact that with strategy --rolling the process GUID changes in a different way than with normal deployments.

I think your hunch is correct here. To make the v2->v3 migration easier, the web process starts out with a GUID that is identical to its parent app's GUID. This allows the two to be used almost interchangeably. Rolling deployments create new web processes, which causes the GUIDs to drift.
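The drift described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `App` and `Process` classes and the `create_app` and `rolling_deploy` helpers are hypothetical stand-ins, not Cloud Controller's actual models.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Process:
    guid: str
    app_guid: str
    type: str = "web"

@dataclass
class App:
    guid: str
    processes: list = field(default_factory=list)

def create_app():
    """Legacy behavior: the first web process reuses the app's GUID."""
    app = App(guid=str(uuid.uuid4()))
    app.processes.append(Process(guid=app.guid, app_guid=app.guid))
    return app

def rolling_deploy(app):
    """A rolling deployment replaces the web process with one that has a fresh GUID."""
    new_web = Process(guid=str(uuid.uuid4()), app_guid=app.guid)
    app.processes = [new_web]
    return new_web

app = create_app()
before = app.processes[0]
after = rolling_deploy(app)

print(before.guid == app.guid)  # process GUID matches the app GUID
print(after.guid == app.guid)   # no longer matches after the rolling deploy
```

In this model, any code that treats a process GUID as an app GUID keeps working until the first rolling deployment, which matches the reproduction steps in the issue.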

I think this also means crash events are broken for multi-process apps. We are sending the process GUID all the way through from tps-watcher to Cloud Controller, but at the point the event is created, it is treated as an "app" GUID rather than a process GUID. I wonder if we could fix things by changing this call to actually pass through the app instead of a process.

tcdowney added a commit that referenced this issue May 9, 2024
- The internal app crash event endpoint assumed that the web process
  guid == app guid (legacy behavior from the v2->v3 data model migrations)
- Rolling deployments create new web processes that have different guids
  which made it so this endpoint no longer created events associated with
  the parent app
- This change updates the endpoint to make app crash events for the app
  guid instead of the process guid which should resolve the rolling
  deployment issue and support crash events for non-web process types

Fixes #2587
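
The change described in the commit message can be sketched roughly as follows. This is a simplified illustration, not the actual Cloud Controller code: the `Process` class, both `crash_event_*` functions, `events_for_app`, and the event field names are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Process:
    guid: str
    app_guid: str
    type: str

def crash_event_legacy(process, payload):
    # Old assumption: web process guid == app guid, so the process guid is
    # recorded as if it were the app guid.
    return {"type": "app.crash", "actee": process.guid, "metadata": payload}

def crash_event_fixed(process, payload):
    # Record the event against the parent app, whatever the process guid or type.
    return {"type": "app.crash", "actee": process.app_guid, "metadata": payload}

def events_for_app(events, app_guid):
    """Return the events whose actee is the given app guid."""
    return [event for event in events if event["actee"] == app_guid]

# A web process created by a rolling deployment, and a non-web process.
rolled_web = Process(guid="proc-2", app_guid="app-1", type="web")
worker = Process(guid="proc-3", app_guid="app-1", type="worker")
payload = {"reason": "CRASHED"}

legacy_events = [crash_event_legacy(p, payload) for p in (rolled_web, worker)]
fixed_events = [crash_event_fixed(p, payload) for p in (rolled_web, worker)]

print(len(events_for_app(legacy_events, "app-1")))  # 0
print(len(events_for_app(fixed_events, "app-1")))   # 2
```

With the legacy lookup, neither crash is associated with the parent app; keying on the app GUID covers both the rolled web process and the worker process.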
tcdowney added a commit that referenced this issue May 14, 2024