[BUG] HikariDataSource HikariDataSource (HikariPool-1) has been closed #1003

Closed
dilipdhankecha2530 opened this issue Apr 22, 2024 · 19 comments

@dilipdhankecha2530

JobRunr Version

7.0.0

JDK Version

eclipse-temurin:21-jre-alpine

Your SQL / NoSQL database

Postgres

What happened?

We are using JobRunr for distributed jobs within our Kubernetes instances. We have a shared Postgres server where multiple applications use the same Postgres instance with different databases. We assign a connection pool size of 20 to our Spring application.
We use the Spring auto-configuration for JobRunr with the configuration shown below:
(screenshot of the JobRunr configuration properties)
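For reference, a minimal application.properties sketch of such a setup (the exact values here are assumptions for illustration; the reporter's actual settings were only attached as a screenshot):

    # illustrative only - Hikari pool size of 20 as described above
    spring.datasource.hikari.maximum-pool-size=20
    # JobRunr Spring Boot auto-configuration
    jobrunr.background-job-server.enabled=true
    jobrunr.dashboard.enabled=true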

To give an idea of the scheduled task: we have 100k records to process, in batches of 200, within a single job, and inside the job we use virtual threads to manage the multithreading. During processing we sometimes face the issue mentioned below.

2024-04-22T10:11:08.880Z ERROR 1 --- [backgroundjob-worker] org.jobrunr.server.BackgroundJobPerformer.updateJobStateToFailedAndRunJobFilters : ERROR - could not update job(id=018f0548-020a-70ca-a32d-20ea2188ea00, jobName='InitBPNGenerationProcess') to FAILED state

(screenshot of the stack trace showing the HikariDataSource (HikariPool-1) has been closed exception)

Can anyone help us resolve this issue?
Thanks in advance.

How to reproduce?

    @Job(name = "LongProcess")
    @Recurring(id = "LongProcess", cron = "${scheduler.cron}")
    public void process() {
        doSomeLongProcess();
    }
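For context, a rough sketch of what doSomeLongProcess() could look like, based on the description above (100k records in batches of 200, fanned out over virtual threads). The batch type and per-batch work are hypothetical, not the reporter's actual code:

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class LongProcess {

        // ~100k records, split into batches of 200, processed on virtual threads (JDK 21)
        public void doSomeLongProcess(List<List<String>> batches) {
            try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
                for (List<String> batch : batches) {
                    executor.submit(() -> processBatch(batch)); // hypothetical per-batch work
                }
            } // ExecutorService.close() waits for all submitted tasks to finish
        }

        private void processBatch(List<String> batch) {
            // placeholder for the real per-record processing
            batch.forEach(item -> { /* ... */ });
        }
    }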

Relevant log output

No response

@rdehuyss
Contributor

@dilipdhankecha2530 - although I have also seen this issue in the last few weeks, we just tried to reproduce it and were unable to do so (we created a long-running job and Hikari waits until the server has shut down correctly).

Can you reproduce it easily?

@dilipdhankecha2530
Author

@rdehuyss
We can't seem to replicate this issue locally. It seems to occur only on the server where we deploy the application with k8s, with specific resource limits: 3Gi max memory and 1 CPU core.
We started facing this issue after introducing JobRunr.

@rdehuyss
Contributor

Without a way to reproduce it, it is going to be hard to fix....

@rdehuyss
Contributor

Please also update your GitHub profile as requested in the JobRunr community guidelines...

@dilipdhankecha2530
Author

@rdehuyss Could you please suggest what I have to do for my GitHub profile?

@rdehuyss
Contributor

You can find everything when you create an issue (it's the text before the fields)

@rdehuyss
Contributor

Cool, thx for updating your profile. Any luck finding a way to reproduce it? We cannot reproduce it in K8s either.

@dilipdhankecha2530
Author

@rdehuyss I'm setting up the scenario in our local environment, trying to replicate it. Once everything's ready, I'll keep you posted on the progress.

@dilipdhankecha2530
Author

dilipdhankecha2530 commented Apr 24, 2024

@rdehuyss During testing I ran into another issue.

(screenshots of the errors encountered; one shows a job marked as orphaned)

This issue occurred while processing a large amount of data: we have 100k records to process and we use multithreading here. During processing we use a 10-second Thread.sleep, which seems to cause the issue. I assume JobRunr is not able to manage the threads properly, but I'm not sure.

@rdehuyss
Contributor

In case of such an exception, the dashboard allows you to automatically create a GitHub issue so we can diagnose the root cause. See the video at https://www.jobrunr.io/en/blog/2021-02-07-v1.3. Without that, I'm afraid we cannot help.

Also, an orphaned job means that either a server died or a stop-the-world garbage collection ran for a long period. This is also mentioned on the Dashboard page.

We test JobRunr regularly with 10 million jobs and Thread.sleep and that works without any problems.
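Not the actual test setup, but a minimal sketch of that kind of load test using the public BackgroundJob API (the SleepService class is hypothetical, and JobRunr is assumed to already be initialized, e.g. via the Spring Boot starter):

    import org.jobrunr.scheduling.BackgroundJob;

    public class SleepJobLoadTest {

        public static class SleepService {
            public void sleepTenSeconds() throws InterruptedException {
                Thread.sleep(10_000); // simulate slow work
            }
        }

        public static void main(String[] args) {
            SleepService sleepService = new SleepService();
            // enqueue many small jobs whose only work is a Thread.sleep
            for (int i = 0; i < 1_000; i++) {
                BackgroundJob.enqueue(() -> sleepService.sleepTenSeconds());
            }
        }
    }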

@rdehuyss
Contributor

Which DB are you using?

@dilipdhankecha2530
Author

Postgres DB

@dilipdhankecha2530
Author

I'm attempting to replicate the issue with my application setup in Kubernetes, but I'm still unable to do so.

@uben01

uben01 commented Apr 29, 2024

It also happens to us, roughly every week for the last month or so.

(screenshot of the HikariDataSource has been closed error)

We no longer have any logs about it, but if it occurs again, I'll let you know. I assumed it was because of the CloudSQL maintenance, but I might be wrong here.

@rdehuyss
Contributor

Indeed, this looks more related to your database / Hikari timing out than to JobRunr.

We are also still looking, but we have not been able to reproduce it...

@rdehuyss
Contributor

rdehuyss commented May 6, 2024

Hi @dilipdhankecha2530 @uben01 - if we don't get a reproducer, I'm afraid we will need to close this issue. We try to keep the number of open issues as small as possible.

@dilipdhankecha2530
Author

@rdehuyss
Here's some more insight into our process:
With just 1 CPU core, when the load is high almost 80% of the CPU is occupied, which leads to the application throwing errors.
Do you know if JobRunr has any documentation that specifies the recommended memory and CPU configuration for handling around 6 GB of data?
Additionally, we have the following database configuration:
DB configs: 2 vCores, 8 GiB RAM, 64 GiB storage, Storage IOPS: 240, Compute IOPS: max 3200

@rdehuyss
Contributor

rdehuyss commented May 6, 2024

Hi @dilipdhankecha2530 - we cannot give you any documentation around this, as it depends on the jobs that you run.

I'm afraid that for the open-source version we require a reproducer to solve these kinds of issues.

@rdehuyss
Contributor

We're unable to reproduce this and did not get a reproducer. If you're able to reproduce it, please reopen the issue with a link to a reproducer.
