
Slowness due to blocked threads in Channel.getChannel call#887

Merged: mwiede merged 6 commits into mwiede:master from DavidTavoularis:master on Sep 17, 2025

Conversation

@DavidTavoularis
Contributor

No description provided.

@DavidTavoularis DavidTavoularis changed the title Slowness due to blocked threads in Channel.getChannel call #885 Slowness due to blocked threads in Channel.getChannel call Aug 21, 2025
@DavidTavoularis
Contributor Author

Summary of changes made:

  1. Replaced Vector with ConcurrentHashMap for the channel pool - eliminates synchronization bottlenecks
  2. Changed from static int index to AtomicInteger - ensures thread-safe ID generation
  3. Optimized channel lookup - Direct HashMap access O(1) instead of linear search O(n)
  4. Updated del method - Changed from del(Channel c) to del(int id) for efficiency
  5. Maintained OpenSSH compatibility - Preserved the INT_MAX capping logic in a thread-safe manner using compare-and-set
  6. Updated ChannelDirectTCPIP - Fixed the API change for the new del(int id) signature

The implementation now provides better performance and thread safety for concurrent channel operations while maintaining backward compatibility with the OpenSSH workaround.
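The scheme described in the summary can be sketched roughly as follows. This is a hypothetical, condensed illustration (field and method names are mine, not jsch's actual code), showing the ConcurrentHashMap pool, the AtomicInteger id generation with the OpenSSH-style INT_MAX cap done via compare-and-set, and the O(1) lookup and removal:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class Main {
    // Hypothetical pool sketch; names are illustrative, not jsch's.
    private static final Map<Integer, Object> pool = new ConcurrentHashMap<>();
    private static final AtomicInteger index = new AtomicInteger(0);

    // Thread-safe id generation preserving the OpenSSH-style INT_MAX cap via CAS.
    static int nextId() {
        int current, next;
        do {
            current = index.get();
            next = (current + 1) & Integer.MAX_VALUE; // wraps at INT_MAX, never negative
        } while (!index.compareAndSet(current, next));
        return current;
    }

    public static void main(String[] args) {
        int a = nextId();
        int b = nextId();
        pool.put(a, "channelA");            // O(1) insert instead of Vector.addElement
        System.out.println(b - a);          // ids are consecutive
        System.out.println(pool.get(a));    // O(1) lookup instead of a linear search
        pool.remove(a);                     // O(1) removal, i.e. del(int id)
        System.out.println(pool.get(a));
    }
}
```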

Contributor

@norrisjeremy left a comment


The strategy being proposed here could potentially result in memory leaks due to stranded Channels.

With the old strategy, while likely slow as you have pointed out, there was no chance for a Channel not to be added to the pool, even if two Channels somehow shared the same id value.

With the strategy proposed here, however, if two Channels ever ended up sharing the same id value, the later-created one would overwrite the earlier one's position in the pool and thus cause a memory leak (since the earlier-created Channel would no longer be present in the pool).

int currentIndex, nextIndex;
do {
    currentIndex = index.get();
    nextIndex = (currentIndex + 1) & Integer.MAX_VALUE;
} while (!index.compareAndSet(currentIndex, nextIndex));
id = currentIndex;
pool.put(id, this);


To illustrate what I mean: suppose at time X, the id value for a particular Channel A was computed as 5.
Then suppose at some later time Y, the id value for a different Channel B also happens to be computed as 5 (since the AtomicInteger index value cycled through enough values to roll over and begin to be recycled).
The pool.put(id, this) call here, which puts Channel B into the pool under the key 5, overwrites the previous entry with key 5 that points at the earlier Channel A.
And thus we have a memory leak, since Channel A is no longer being maintained in the pool.
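The overwrite hazard described above can be demonstrated in a few lines, using a plain HashMap for brevity (a ConcurrentHashMap behaves the same way for put on a duplicate key):

```java
import java.util.HashMap;
import java.util.Map;

public class Main {
    public static void main(String[] args) {
        Map<Integer, String> pool = new HashMap<>();
        pool.put(5, "Channel A");           // time X: Channel A registered under id 5
        pool.put(5, "Channel B");           // time Y: id rolled over, same key reused
        System.out.println(pool.size());    // only one entry survives
        System.out.println(pool.get(5));    // the earlier Channel A is now unreachable
    }
}
```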

@DavidTavoularis
Contributor Author

Thank you very much for identifying this problem:

  • With ConcurrentHashMap, if two channels get the same ID (after ~2 billion channel creations and ID rollover), the second channel would overwrite the first in the pool
  • The overwritten channel becomes orphaned - still consuming resources but unreachable for cleanup
  • Original Vector approach, while slow, never lost channels even with ID collisions

Here is solution 1:

while (true) {
    int candidateId = index.getAndUpdate(i -> (i + 1) & Integer.MAX_VALUE);
    if (pool.putIfAbsent(candidateId, this) == null) {
        id = candidateId;
        break;
    }
}

Here is solution 2:

private static AtomicLong index = new AtomicLong(0);

Channel() {
    while (true) {
        int candidateId = (int) (index.getAndIncrement() % (Integer.MAX_VALUE + 1L));
        if (pool.putIfAbsent(candidateId, this) == null) {
            id = candidateId;
            break;
        }
    }
}

@norrisjeremy I prefer solution 1; what about you?

@norrisjeremy
Contributor

Hi @DavidTavoularis,

I'm honestly not a fan of either solution, because each artificially restricts the reuse of specific id values.
I'm working on an alternate solution to see if it would be possible to remedy the issue you have reported by simply using a ReadWriteLock.
I'll report back once I've had some time to perform more testing.

Thanks,
Jeremy

@DavidTavoularis
Contributor Author

Hi @norrisjeremy thank you for suggesting to use ReadWriteLock.

I just updated the pull request with these new changes:

  • ReadWriteLock for concurrent access control
  • ArrayList instead of Vector (no need for Vector's synchronization)
  • Read lock for lookups - allows multiple concurrent reads
  • Write lock for modifications - exclusive access when adding/removing

Benefits:

  1. Better concurrency - Multiple threads can search channels simultaneously
  2. No ID collision issues - IDs can be freely reused as channels are removed
  3. No memory leaks - Every channel is tracked in the list
  4. Cleaner code - Using modern ArrayList with explicit locking instead of legacy Vector
  5. Maintains original behavior - ID overflow handling preserved exactly

Key operations:

  • getChannel() - Uses read lock, allows concurrent lookups
  • Channel() constructor - Uses write lock for adding to pool
  • del() - Uses write lock for removal
  • disconnect() - Uses read lock to collect channels, then disconnects outside the lock

This solution provides thread safety without the artificial restrictions of collision detection, while improving performance for concurrent read operations.
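The key operations above can be condensed into a sketch like the following. This is illustrative only (the real Channel class stores actual Channel objects and has more state); it shows the read-lock/write-lock split and the disconnect pattern of snapshotting under the read lock and acting on the copy outside it:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Main {
    private static final List<String> pool = new ArrayList<>();
    private static final ReadWriteLock poolLock = new ReentrantReadWriteLock();

    // Constructor path: exclusive write lock while adding to the pool.
    static void add(String c) {
        Lock w = poolLock.writeLock();
        w.lock();
        try { pool.add(c); } finally { w.unlock(); }
    }

    // del path: exclusive write lock while removing.
    static void del(String c) {
        Lock w = poolLock.writeLock();
        w.lock();
        try { pool.remove(c); } finally { w.unlock(); }
    }

    // getChannel path: shared read lock, so many threads may search concurrently.
    static boolean contains(String c) {
        Lock r = poolLock.readLock();
        r.lock();
        try { return pool.contains(c); } finally { r.unlock(); }
    }

    // disconnect path: snapshot under the read lock, disconnect outside the lock.
    static List<String> snapshot() {
        Lock r = poolLock.readLock();
        r.lock();
        try { return new ArrayList<>(pool); } finally { r.unlock(); }
    }

    public static void main(String[] args) {
        add("c1");
        add("c2");
        System.out.println(contains("c1"));
        del("c1");
        System.out.println(contains("c1"));
        System.out.println(snapshot().size());
    }
}
```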

@norrisjeremy
Contributor

Hi @DavidTavoularis,

See #888 for my current draft proposal.

Thanks,
Jeremy

@norrisjeremy
Contributor

FYI, I still need to think through the memory usage ramifications of changing the pool structure from a Vector to either an ArrayList as you have proposed or a HashMap as I have done in my draft PR.

@DavidTavoularis
Contributor Author

@norrisjeremy I like your approach; here are my comments regarding your pull request:

  1. Redundant ID check in getChannel:
    if (c.id == id && c.session == session)
    Since you're already looking in pool.get(id), the c.id == id check is redundant. Should be:
    if (c.session == session)

  2. SonarQube: unused import - Collection is imported but never used.
    The other SonarQube finding, "Remove this assignment of "index"", can be marked as "OK" (no change compared to before).

  3. Should you put pool & poolLock as final ?

  4. While this new implementation handles collisions, the Set approach adds small memory overhead of maintaining Sets even for single channels, so should you replace:
    pool.computeIfAbsent(id, k -> new HashSet<>()).add(this);
    by:
    pool.computeIfAbsent(id, k -> new HashSet<>(1)).add(this);
    as 99.99999% of the time, the HashSet size will be 1 ?

Best Regards
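For reference, the Set-per-id pool discussed in item 4 could look roughly like this. This is a hypothetical sketch under my own naming, not the exact code in #888; it shows how an id collision keeps both channels reachable and how removal drops empty sets:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class Main {
    // Collisions on id keep every channel reachable instead of overwriting.
    private static final Map<Integer, Set<String>> pool = new HashMap<>();

    static void add(int id, String channel) {
        // Sized for the common case of a single channel per id.
        pool.computeIfAbsent(id, k -> new HashSet<>(1)).add(channel);
    }

    static void del(int id, String channel) {
        Set<String> set = pool.get(id);
        if (set != null) {
            set.remove(channel);
            if (set.isEmpty()) {
                pool.remove(id); // drop empty sets so the map does not grow unbounded
            }
        }
    }

    public static void main(String[] args) {
        add(5, "Channel A");
        add(5, "Channel B"); // id collision: both channels are retained
        System.out.println(pool.get(5).size());
        del(5, "Channel A");
        System.out.println(pool.get(5).size());
    }
}
```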

@DavidTavoularis
Contributor Author

@norrisjeremy Let's estimate the memory usage for 10000 Channels.

Original Vector (or ArrayList) approach = O(N):

  • Vector object: ~24 bytes overhead
  • Array storage: 10000 × 8 bytes (object references) = 80,000 bytes
  • Total: ~80,024 bytes

Map<Integer, Set> approach = O(1):

  • HashMap object: ~48 bytes overhead
  • Entry array: 10000 × 32 bytes (Entry objects) = 320,000 bytes
  • HashSet objects: 10000 × 48 bytes = 480,000 bytes
  • HashSet arrays: 10000 × 4 bytes (capacity=1, single element) = 40,000 bytes
  • Total: ~888,048 bytes

Another approach would be to have 2 maps (a regular one, and one only for collisions) = O(1):
Map<Integer, Channel> singleChannelPool = new HashMap<>();
Map<Integer, Set> collisionPool = new HashMap<>(); // empty 99.999% of the time

  • HashMap object: ~48 bytes overhead
  • Entry array: 10,000 × 32 bytes (Entry objects) = 320,000 bytes
  • Subtotal: ~320,048 bytes
    This last approach would add unnecessary code complexity...

80KB vs 890KB vs 320KB

I agree that it requires more thinking.
David

@norrisjeremy
Contributor

(Quoting the four review comments above.)

Hi @DavidTavoularis,

I've implemented 1, 2 & 4 as you suggested.
As for item 3, I don't see any reason to further increase the size of the Channel object by permanently storing additional properties pointing at the read & write locks derived from the ReadWriteLock.

@DavidTavoularis
Contributor Author

@norrisjeremy Thanks, I agree with you for item 3

@norrisjeremy
Contributor

(Quoting the memory-usage estimate above.)

Yes, that is what I'm worried about and thus want to think further on.

Comment on lines +52 to +53
private static final Lock readLock = poolLock.readLock();
private static final Lock writeLock = poolLock.writeLock();
Contributor

I would omit these two properties and simply derive the read & write locks in the methods that need to use them like I do in #888.

@DavidTavoularis
Contributor Author

@norrisjeremy Regarding the local ephemeral port range (typically ~28,000-65,000 ports): is it true that each Channel uses a different local ephemeral port?

If so, then the maximum number of simultaneous Channels is about 50,000, but I don't believe anyone would use more than 10,000 simultaneous Channels; a more typical figure would be 1,000 simultaneous Channels for high-performance systems (this is my use case).
Using Map<Integer, Set> would increase memory usage by 800KB for 10,000 Channels and by only 80KB for 1,000 Channels.

@norrisjeremy
Contributor

(Quoting the question above.)

The id values are only locally significant to a particular session.
You could have thousands of sessions, and in each session have thousands of channels.

@DavidTavoularis
Contributor Author

I believe there is usually a limit of 10 channels per SSH session (cf. MaxSessions 10 in /etc/ssh/sshd_config).

-        channels[count++] = c;
+        channels.add(c);
        }
      } catch (Exception e) {
Contributor

I'm not sure why this inner try block that catches exceptions & ignores them exists.
I wonder if we should just drop it?

Contributor Author

Agreed; I dropped it and pushed to the PR.

@DavidTavoularis
Contributor Author

Main Achievement: Replaced slow synchronized blocks with ReadWriteLock for better thread concurrency.

Key Changes:

  • Migrated from Vector + synchronized(pool) to ArrayList + ReentrantReadWriteLock
  • Read operations (getChannel, disconnect) use read locks - allow concurrent execution
  • Write operations (constructor, del) use write locks - require exclusive access
  • Removed unnecessary exception handling in disconnect() method

Benefits: Multiple threads can now lookup channels concurrently instead of being serialized. Write operations still require exclusive access for data consistency. Same memory usage (~80KB for 10K channels) with significantly better scalability for read-heavy SSH workloads.

Compatibility: All APIs unchanged, maintains OpenSSH compatibility and existing behavior.

-  private static Vector<Channel> pool = new Vector<>();
+  private static final ReadWriteLock poolLock = new ReentrantReadWriteLock();
   private static int index = 0;
+  private static List<Channel> pool = new ArrayList<>();
Contributor

I think pool could be marked final?

Contributor

@norrisjeremy left a comment


I think going with this more conservative approach is safer than the alternate approach I explored in #888, so I think we should go with this.

@DavidTavoularis
Contributor Author

I think going with this more conservative approach is safer than the alternate approach I explored in #888, so I think we should go with this.

agreed

@norrisjeremy
Contributor

Hi @DavidTavoularis,

So I have started exploring a different approach to fixing this issue; see #889.
Basically, it removes the global pool from the Channel class and instead maintains a per-Session List of all Channels linked to that particular Session.
This would increase memory use per Session (since each Session object now maintains an ArrayList and a ReentrantReadWriteLock), but the performance gain may be worth it.

Thanks,
Jeremy

@norrisjeremy
Contributor

@mwiede I think what we should do is merge this PR (#887), and then I will rebase my PR (#889) on top of the changes in this, and then I will take #889 out of draft status and you can then merge it.
That will give us the ability to easily rollback the changes in #889 in case we discover any sort of regression.

@mwiede mwiede merged commit 6995f49 into mwiede:master Sep 17, 2025
8 checks passed