Performance optimizations: O(n*m) to O(n) lookups, lazy OUI loading, … #457

Merged
kimocoder merged 1 commit into master from claude/scan-project-issues-rodxf on Mar 6, 2026
Conversation

@kimocoder (Owner)

…cached FD checks

  • airodump.py: Replace O(n*m) nested loops with O(1) dict lookups for both client-to-target mapping and old-target matching (250K+ comparisons → ~1K)
  • airodump.py: Sample only first 4KB for chardet encoding detection instead of reading entire CSV file, reducing I/O per scan cycle
  • airodump.py: Suppress repeated null-byte warnings (log once per parse)
  • config.py: Lazy-load OUI manufacturer database (1.7MB file) on first access instead of parsing at startup, improving startup time
  • process.py: Cache file descriptor count with 2-second TTL to avoid os.listdir('/proc/pid/fd/') on every process creation
  • process.py: Reduce FD limit sleep from 500ms to 100ms since cleanup is sync
  • scanner.py: Use heapq.nlargest() for O(n log k) target capping instead of O(n log n) full sort
  • scanner.py: Remove duplicate pid.poll() system call in main scan loop

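The 4 KB encoding-sampling change described above can be sketched as follows. This is an illustrative stand-in, not wifite's actual code: the function name `detect_encoding` and the fallback behavior are assumptions.

```python
def detect_encoding(path, sample_size=4096):
    """Guess a file's encoding from its first 4 KB instead of reading it all."""
    with open(path, 'rb') as f:
        sample = f.read(sample_size)  # only the first 4 KB hits the disk
    try:
        import chardet  # third-party; optional in this sketch
        result = chardet.detect(sample)
        return result['encoding'] or 'utf-8'
    except ImportError:
        # Fallback when chardet is unavailable: assume UTF-8
        return 'utf-8'
```

For airodump CSVs, which are rewritten every scan cycle, sampling a fixed prefix keeps encoding detection O(1) in file size rather than O(file length).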

https://claude.ai/code/session_012qvM97c3D9M98CrMjmzuQd
Copilot AI review requested due to automatic review settings March 6, 2026 06:07
@kimocoder kimocoder merged commit bf0ff90 into master Mar 6, 2026
8 of 9 checks passed

Copilot AI left a comment


Pull request overview

This PR focuses on reducing scan-loop overhead and startup latency by replacing repeated linear scans and eager file parsing with cached/dict-based lookups and lazy loading in the WiFi scanning pipeline.

Changes:

  • Optimizes scan-time data processing (dict lookups, heapq.nlargest, fewer redundant system calls).
  • Introduces lazy loading for the OUI manufacturer database instead of parsing at startup.
  • Adds a TTL cache for /proc/<pid>/fd counting to reduce filesystem scans during frequent process creation.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Summary per file:

  • wifite/util/scanner.py: Removes the redundant pid.poll() check and caps target lists using heapq.nlargest for better asymptotic performance.
  • wifite/util/process.py: Adds FD-count caching with a TTL and shortens the FD-limit sleep delay.
  • wifite/ui/selector_view.py: Ensures the manufacturer DB is loaded before OUI lookup in the selector UI.
  • wifite/ui/scanner_view.py: Ensures the manufacturer DB is loaded before OUI lookup in the scanner TUI.
  • wifite/tools/airodump.py: Replaces nested loops with dict lookups, reduces CSV encoding-detection I/O, and suppresses repeated null-byte warnings.
  • wifite/model/target.py: Loads the manufacturer DB on demand when rendering the manufacturer column in Target.to_str().
  • wifite/config.py: Moves OUI parsing to a lazy-load method and defers manufacturer DB initialization.


Comment on lines +123 to +127

```python
    def load_manufacturers(cls):
        """Lazy-load OUI manufacturer database on first access."""
        if cls._manufacturers_loaded:
            return
        cls._manufacturers_loaded = True
```

Copilot AI Mar 6, 2026


Configuration.load_manufacturers() reads cls._manufacturers_loaded, but _manufacturers_loaded is not defined as a class attribute (it’s only set inside initialize). If load_manufacturers() is called before initialize(), this will raise AttributeError. Define _manufacturers_loaded = False at class scope (and consider using getattr(cls, '_manufacturers_loaded', False) for extra safety).
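A minimal sketch of the fix the comment suggests, with the flag defined at class scope so `load_manufacturers()` is safe even if it runs before `initialize()`. The `Configuration` class and the stubbed parse step here are illustrative stand-ins, not wifite's actual implementation:

```python
class Configuration:
    _manufacturers_loaded = False   # defined at class scope: no AttributeError
    manufacturers = {}

    @classmethod
    def load_manufacturers(cls):
        """Lazy-load the OUI manufacturer database on first access."""
        if cls._manufacturers_loaded:
            return
        cls._manufacturers_loaded = True
        # ... parse the 1.7 MB OUI file here; stubbed for illustration:
        cls.manufacturers = {'00:11:22': 'ExampleVendor'}
```

Because the guard flag is a class attribute, repeated calls are cheap no-ops and the expensive parse happens at most once.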

Comment on lines 456 to +466

```diff
 def get_open_fd_count():
-    """Get current open file descriptor count from /proc/{pid}/fd"""
+    """Get current open file descriptor count from /proc/{pid}/fd (cached with TTL)"""
+    now = time.time()
+    if now - Process._fd_cache_time < Process._FD_CACHE_TTL:
+        return Process._fd_cache_value
     try:
         proc_fd_dir = f'/proc/{os.getpid()}/fd'
         if os.path.exists(proc_fd_dir):
-            return len(os.listdir(proc_fd_dir))
+            Process._fd_cache_value = len(os.listdir(proc_fd_dir))
+            Process._fd_cache_time = now
+            return Process._fd_cache_value
```

Copilot AI Mar 6, 2026


get_open_fd_count() caches results for 2s, but check_fd_limit() calls it twice back-to-back (before and immediately after cleanup). With the TTL, the “after cleanup” call will return the cached pre-cleanup value, making freed/new_count wrong and potentially keeping the system in a “high FD usage” state for up to the TTL. Consider adding a force_refresh/use_cache parameter or explicitly invalidating the cache before the post-cleanup re-check.
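A sketch of the `force_refresh` escape hatch the comment proposes, so the post-cleanup re-check bypasses the stale cached value. Attribute names mirror the diff, but the `Process` class here is a simplified stand-in:

```python
import os
import time

class Process:
    _FD_CACHE_TTL = 2.0   # seconds
    _fd_cache_time = 0.0
    _fd_cache_value = 0

    @staticmethod
    def get_open_fd_count(force_refresh=False):
        """Count open FDs via /proc, serving a cached value within the TTL."""
        now = time.time()
        if not force_refresh and now - Process._fd_cache_time < Process._FD_CACHE_TTL:
            return Process._fd_cache_value
        proc_fd_dir = f'/proc/{os.getpid()}/fd'
        if os.path.exists(proc_fd_dir):
            Process._fd_cache_value = len(os.listdir(proc_fd_dir))
        Process._fd_cache_time = now
        return Process._fd_cache_value
```

A `check_fd_limit()` caller would then use `get_open_fd_count(force_refresh=True)` immediately after cleanup so `freed`/`new_count` reflect reality rather than the pre-cleanup cache.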

Comment on lines +187 to +196

```diff
+        # Use dict lookup (O(1)) instead of nested loop (O(n*m))
+        old_by_bssid = {t.bssid: t for t in old_targets}
         for new_target in new_targets:
-            just_found = True
-            for old_target in old_targets:
-                # If the new_target is found in old_target copy attributes from old target
-                if old_target == new_target:
-                    # Identify decloaked targets
-                    if new_target.essid_known and not old_target.essid_known:
-                        # We decloaked a target!
-                        new_target.decloaked = True
-
-                    old_target.transfer_info(new_target)
-                    just_found = False
-                    break
-
-            # If the new_target is not in old_targets, check target_archives
-            # and copy attributes from there
-            if just_found and new_target.bssid in target_archives:
+            old_target = old_by_bssid.get(new_target.bssid)
+            if old_target is not None:
+                # Identify decloaked targets
+                if new_target.essid_known and not old_target.essid_known:
+                    new_target.decloaked = True
+                old_target.transfer_info(new_target)
+            elif new_target.bssid in target_archives:
```

Copilot AI Mar 6, 2026


old_by_bssid = {t.bssid: t for t in old_targets} changes behavior if old_targets contains duplicate BSSIDs (possible because CSV parsing doesn’t dedupe). The previous nested loop copied from the first matching old_target; this dict keeps the last one. If you want to preserve previous semantics, build the dict with “first wins” (e.g., only set the key if it’s not already present).
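The "first wins" construction the comment recommends can be written with `dict.setdefault`. This is a generic sketch; `first_wins_index` is a hypothetical helper name, not part of wifite:

```python
def first_wins_index(targets):
    """Index targets by BSSID, keeping the FIRST entry seen for duplicates.

    A plain dict comprehension ({t.bssid: t for t in targets}) keeps the
    LAST duplicate, which silently changes the old nested-loop semantics.
    """
    index = {}
    for t in targets:
        # setdefault only stores a value if the key is not already present
        index.setdefault(t.bssid, t)
    return index
```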

```diff
-        self.targets = self.targets[:self._max_targets]
+        # Use heapq.nlargest: O(n log k) vs O(n log n) for full sort
+        self.targets = heapq.nlargest(self._max_targets, self.targets, key=lambda x: x.power)
```

Copilot AI Mar 6, 2026


heapq.nlargest() is not stable for ties, so when multiple targets share the same power, their relative order may become non-deterministic compared to the previous stable list.sort(). If target ordering is user-visible (and affects selection indices), consider adding a deterministic tie-breaker (e.g., (power, bssid)), or keep the stable sort when order matters.

Suggested change

```diff
-self.targets = heapq.nlargest(self._max_targets, self.targets, key=lambda x: x.power)
+# Add deterministic tie-breaker (BSSID) to avoid non-deterministic ordering for equal power
+self.targets = heapq.nlargest(
+    self._max_targets,
+    self.targets,
+    key=lambda x: (x.power, getattr(x, 'bssid', ''))
+)
```

@kimocoder kimocoder deleted the claude/scan-project-issues-rodxf branch March 6, 2026 06:17