Remote log consumption is stuck because the semaphore permit cannot be acquired #3085

@zuston

Description

Search before asking

  • I searched in the issues and found nothing similar.

Fluss version

0.9.0 (latest release)

Please describe the bug 🐞

We have a Flink job that has been consuming a Fluss log table for about a week, but recently its throughput dropped to zero. After analyzing the subtask thread dump, it appears that remote log fetching is stuck because a semaphore permit cannot be acquired.

The thread dump of one subtask is as follows:

Image

I then used Arthas to print the internal state of LogFetchBuffer (the first entry in pendingFetches):

vmtool -c 312787f0 --action getInstances --className org.apache.fluss.client.table.scanner.log.LogFetchBuffer --limit 5 --express '
instances.length == 0 ? "no LogFetchBuffer" : (
  #buf = instances[0],
  #f = #buf.getClass().getDeclaredField("pendingFetches"),
  #f.setAccessible(true),
  #pmap = #f.get(#buf),
  #pmap.isEmpty() ? "pendingFetches empty" : (
    #firstList = #pmap.values().iterator().next(),
    #firstList.isEmpty() ? "first list empty" : #firstList.get(0).toString()
  )
)'
Image

I also printed some fields of RemoteLogDownloader (the sizes of segmentsToFetch and segmentsToRecycle, and the available permits of prefetchSemaphore):

vmtool -c 312787f0 --action getInstances --className org.apache.fluss.client.table.scanner.log.RemoteLogDownloader --limit 10 --express 'instances.length==0 ? "no RemoteLogDownloader" : (#d=instances[0],#c=#d.getClass(),#f1=#c.getDeclaredField("segmentsToFetch"),#f1.setAccessible(true),#f2=#c.getDeclaredField("segmentsToRecycle"),#f2.setAccessible(true),#f3=#c.getDeclaredField("prefetchSemaphore"),#f3.setAccessible(true),#m=new java.util.LinkedHashMap(),#m.put("segmentsToFetch_size",#f1.get(#d).size()),#m.put("segmentsToRecycle_size",#f2.get(#d).size()),#m.put("availablePermits",#f3.get(#d).availablePermits()),#m)'
Image

There are no download failure logs in the TaskManager logs.
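
To illustrate the symptom, here is a minimal Java sketch of how a prefetch limiter built on java.util.concurrent.Semaphore stops making progress when some code path never returns its permits. This is not Fluss code: PermitLeakDemo, MAX_PREFETCH and download are hypothetical names, and whether a permit leak like this is what actually happens in RemoteLogDownloader is an assumption that still needs to be confirmed.

```java
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a prefetch limiter; NOT the Fluss implementation.
public class PermitLeakDemo {
    // Assumed cap on concurrently prefetched segments (illustrative value).
    public static final int MAX_PREFETCH = 2;

    /**
     * Tries to start `segments` downloads, each needing one permit.
     * Returns how many downloads obtained a permit.
     */
    public static int download(Semaphore permits, int segments, boolean releaseOnFinish) {
        int started = 0;
        for (int i = 0; i < segments; i++) {
            // Non-blocking tryAcquire() so the demo terminates; a blocking
            // acquire() would wait forever once the permits are exhausted.
            if (!permits.tryAcquire()) {
                break;
            }
            started++;
            if (releaseOnFinish) {
                // Healthy path: the permit is returned once the segment is consumed.
                permits.release();
            }
            // Leaky path: the permit is never returned.
        }
        return started;
    }

    public static void main(String[] args) {
        Semaphore healthy = new Semaphore(MAX_PREFETCH);
        System.out.println("healthy: started=" + download(healthy, 5, true)
                + " availablePermits=" + healthy.availablePermits());

        Semaphore leaky = new Semaphore(MAX_PREFETCH);
        System.out.println("leaky:   started=" + download(leaky, 5, false)
                + " availablePermits=" + leaky.availablePermits());
    }
}
```

In the leaky case only MAX_PREFETCH downloads start and availablePermits() stays at 0 afterwards, so a thread calling the blocking acquire() would wait indefinitely without any failure being logged.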

Solution

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
