Panic in state backend can stall checkpointing #444

Closed
mwylde opened this issue Dec 8, 2023 · 1 comment
mwylde commented Dec 8, 2023

This code in the parquet state backend:

    tokio::spawn(async move {
        loop {
            if !self.flush_iteration().await.unwrap() {
                return;
            }
        }
    });

is responsible for flushing data to the filesystem. It currently calls unwrap() on the result of each iteration, so any error from flush_iteration panics the spawned task. The flusher then stops running, which stalls checkpointing:

ERROR arroyo_server_common: panicked at /opt/arroyo/src/arroyo-state/src/parquet.rs:1256:50:
called `Result::unwrap()` on an `Err` value: object store error: Generic { store: "S3", source: Error { retries: 0, message: "request error", source: Some(reqwest::Error { kind: Request, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("s3.ap-southeast-1.amazonaws.com")), port: None, path: "/arroyo-streaming/checkpoints/job_Qwj7ye3Egl/checkpoints/checkpoint-0013414/operator-job_source_0/table-k-000", query: None, fragment: None }, source: hyper::Error(Io, Os { code: 104, kind: ConnectionReset, message: "Connection reset by peer" }) }), status: None } }
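
For illustration, here is a minimal sketch of one way such a loop could handle the error instead of unwrapping. It is not the change made in #445, which is not shown in this thread. It is a synchronous, std-only model: `run_flusher`, `FlusherExit`, `max_retries`, and `base_delay` are hypothetical names, and the closure stands in for `flush_iteration`, with a plain `String` as the error type.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical outcome type: reported to the caller instead of panicking.
#[derive(Debug, PartialEq)]
pub enum FlusherExit {
    /// The flush step returned `Ok(false)`: nothing left to flush.
    Finished,
    /// Retries were exhausted; carries the last error message.
    Failed(String),
}

/// Drives `flush_iteration` until it signals completion.
/// A transient error (such as a connection reset) is logged and retried
/// rather than unwrapped, so a single failure does not kill the loop.
pub fn run_flusher<F>(
    mut flush_iteration: F,
    max_retries: u32,
    base_delay: Duration,
) -> FlusherExit
where
    F: FnMut() -> Result<bool, String>,
{
    let mut consecutive_failures = 0u32;
    loop {
        match flush_iteration() {
            // More work may remain: reset the failure count and keep going.
            Ok(true) => consecutive_failures = 0,
            // Clean shutdown, matching the `return` in the original loop.
            Ok(false) => return FlusherExit::Finished,
            Err(e) => {
                consecutive_failures += 1;
                if consecutive_failures > max_retries {
                    // Surface the failure so the caller can react to it.
                    return FlusherExit::Failed(e);
                }
                eprintln!("flush failed (attempt {consecutive_failures}): {e}; retrying");
                // Linear backoff: the delay grows with each consecutive failure.
                sleep(base_delay * consecutive_failures);
            }
        }
    }
}
```

In an async version the same match would sit inside the spawned task, with an async sleep in place of `std::thread::sleep`. Returning `FlusherExit::Failed` rather than panicking leaves it to the caller to decide whether to fail the checkpoint or restart the flusher.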

Thanks @harshit2283 for the report

@mwylde mwylde added the bug Something isn't working label Dec 8, 2023

mwylde commented Dec 8, 2023

Fixed with #445

@mwylde mwylde closed this as completed Dec 8, 2023