Temporary working folders are left behind on Middle Managers after tasks complete #12332

Open
sergioferragut opened this issue Mar 15, 2022 · 8 comments

@sergioferragut
Contributor

Affected Version

Apache Druid 0.22.1

Description

This problem was originally reported here: https://www.druidforum.org/t/temp-folder-size-was-increasing-due-to-that-peons-processing-taking-more-time-how-to-clear-temp-folder-automatically/7139

I was able to reproduce it on a small minikube deployment by running the vanilla wikipedia index_parallel ingestion a few times, each with a different target datasource name, and confirmed that the temporary folders for the tasks are not removed after the jobs complete. After 3 runs, the ~/var/tmp folder still contained three empty folders:

~/var/tmp $ ls -l
total 12
drwx------    2 druid    druid         4096 Mar 14 23:39 druid-realtime-persist1040350100896362009
drwx------    2 druid    druid         4096 Mar 14 23:32 druid-realtime-persist668375622911252079
drwx------    2 druid    druid         4096 Mar 14 23:34 druid-realtime-persist944793843865837077
~/var/tmp $ ls -l druid-realtime-persist944793843865837077
total 0
~/var/tmp $ ls -l druid-realtime-persist668375622911252079
total 0
~/var/tmp $ ls -l druid-realtime-persist1040350100896362009
total 0

The original report on Druid Forum spoke of thousands of such folders left behind.
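
For reference, reproducing this boils down to something like the following sketch. The endpoint, port, and spec path are the standard quickstart defaults, not taken from the original report, so adjust them to your deployment; the dataSource in the spec is changed between runs.

# submit the vanilla wikipedia (index_parallel) ingestion task to the task API
# (edit "dataSource" in the spec between runs to get distinct datasources)
curl -X POST -H 'Content-Type: application/json' \
  -d @quickstart/tutorial/wikipedia-index.json \
  http://localhost:8888/druid/indexer/v1/task

# once the task succeeds, the peon's persist dir is still present on the Middle Manager
ls -l ~/var/tmp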

@lejinghu

We saw this too in our clusters. Timed-out queries also leave tmp folders behind.
As a workaround we are cleaning them up manually with cron jobs.
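
For anyone else needing the same stopgap, a minimal sketch of such a cron job is below. The path matches the listing above; the hourly schedule and the 24-hour age threshold are assumptions to tune for your cluster.

# crontab entry for the druid user on each Middle Manager:
# every hour, remove stale persist dirs not modified in the last 24 hours
0 * * * * find ~/var/tmp -maxdepth 1 -type d -name 'druid-realtime-persist*' -mmin +1440 -exec rm -rf {} +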

github-actions bot commented Dec 11, 2023

This issue has been marked as stale due to 280 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If this issue is still
relevant, please simply write any comment. Even if closed, you can still revive the
issue at any time or discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions bot added the stale label Dec 11, 2023

github-actions bot commented Jan 9, 2024

This issue has been closed due to lack of activity. If you think that
is incorrect, or the issue requires additional review, you can revive the issue at
any time.

@github-actions github-actions bot closed this as not planned Jan 9, 2024
@asdf2014
Member

I recommend reopening this issue, as I've also encountered this problem. It can lead to ingestion task failures if the disk fills up, and it is a significant concern since it amounts to a resource leak 😅

@asdf2014
Member

Hi @sergioferragut , have you had a chance to check the ~/var/druid/task/ dir? I found many outdated single_phase_sub_task_xxx directories containing druid-input-entity-xxx.tmp files, which is worse than the empty tmp folders..
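
For anyone measuring the impact, a quick check along these lines shows what is left behind. The var/druid/task path matches the comment above and is controlled by druid.indexer.task.baseTaskDir in the Middle Manager config, so adjust it for your setup.

# total size held by leftover sub-task working dirs on a Middle Manager
du -sh ~/var/druid/task/single_phase_sub_task_* 2>/dev/null

# count the orphaned temporary input files inside them
find ~/var/druid/task -name 'druid-input-entity-*.tmp' 2>/dev/null | wc -l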

@asdf2014
Member

@abhishekagarwal87 Do you have any ideas on this one? 😄

@abhishekagarwal87
Contributor

What version are you on? I don't see such folders on my local box. Can you post the ingestion spec that you are running?

@asdf2014
Member

Hi @abhishekagarwal87 , we were on the same version that @sergioferragut mentioned in this issue. Yes, this is indeed a very low probability event. Now that we are using the MoK mode with the latest version of Druid, this issue no longer affects us 😅
