fix: improve error handling and fix bugs#2499
Conversation
- Fix bug: return err instead of closeErr in DecodeData error path (httpx.go) - Improve error handling for strconv.Atoi calls in multiple locations - Change hammingDistanceThreshold from var to const since it's never modified - Fix redundant return statement in ListFilesWithPattern - Improve port parsing to handle empty port strings explicitly
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
WalkthroughThis PR improves error handling and input validation across four files. HTTP globbing and response decoding now properly surface errors instead of losing them. Port extraction is hardened against parse failures in two independent codepaths. A package constant is introduced to replace a mutable variable. ChangesError Handling and Port Parsing Improvements
🎯 2 (Simple) | ⏱️ ~8 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR updates httpx runner behavior around deduplication/proxy defaults and persistence, adds CPE storage to DB backends, and bumps dependencies/Go version.
Changes:
- Tighten input and response deduplication logic (thread-safe
testAndSet, IP-aware simhash dedupe) and add tests. - Add
Options.HasMatcherOrFilter()and change response storage to persist only matched outputs when matchers/filters are active. - Add
cpecolumn support in Postgres/MySQL schemas and batch inserts; update proxy env fallback and dependency versions.
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| runner/runner.go | Adds mutex to testAndSet, changes simhash cache value type and dedupe logic, updates input streaming, and modifies response storing behavior. |
| runner/runner_test.go | Adds unit and concurrency tests for testAndSet, matcher/filter detection, and duplicate detection semantics. |
| runner/options.go | Adds HasMatcherOrFilter() for detecting active output matchers/filters and updates resolver file reading. |
| runner/headless.go | Falls back to proxy settings from environment variables for headless browser. |
| common/httpx/httpx.go | Uses protocol constants, adds proxy-from-env default, adjusts retryable http2 fallback behavior, and fixes error handling. |
| common/httpx/httpx_test.go | Adds tests validating HTTP/1.1 disables HTTP/2 fallback client. |
| internal/db/postgres.go | Adds cpe column to schema and inserts it in batch writes; adds upgrade-time ALTER for back-compat. |
| internal/db/mysql.go | Adds cpe column to schema and inserts it in batch writes; adds upgrade-time column ensure helper. |
| common/hashes/jarm/jarmhash.go | Makes port parsing robust when URL has no/invalid port. |
| common/fileutil/fileutil.go | Changes ListFilesWithPattern to always return nil error when files exist. |
| README.md | Minor wording/punctuation fixes. |
| go.mod / go.sum | Bumps Go version and updates multiple dependencies. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for storedHash, storedIPs := range r.simHashes.GetALL(false) { | ||
| if simhash.Compare(storedHash, respSimHash) > 3 { | ||
| continue | ||
| } | ||
| if ip == "" || sliceutil.Contains(storedIPs, ip) { | ||
| gologger.Debug().Msgf("Skipping duplicate response (simhash %d, ip %s) for URL %s\n", respSimHash, ip, result.URL) | ||
| return true | ||
| } | ||
| _ = r.simHashes.Set(storedHash, append(storedIPs, ip)) | ||
| return false | ||
| } | ||
| _ = r.simHashes.Set(respSimHash, struct{}{}) | ||
|
|
||
| _ = r.simHashes.Set(respSimHash, []string{ip}) |
| func (options *Options) HasMatcherOrFilter() bool { | ||
| return len(options.matchStatusCode) > 0 || | ||
| len(options.matchContentLength) > 0 || | ||
| len(options.filterStatusCode) > 0 || | ||
| len(options.filterContentLength) > 0 || | ||
| len(options.matchRegexes) > 0 || | ||
| len(options.filterRegexes) > 0 || | ||
| len(options.matchLinesCount) > 0 || | ||
| len(options.matchWordsCount) > 0 || | ||
| len(options.filterLinesCount) > 0 || | ||
| len(options.filterWordsCount) > 0 || | ||
| len(options.OutputMatchString) > 0 || | ||
| len(options.OutputFilterString) > 0 || | ||
| len(options.OutputMatchFavicon) > 0 || | ||
| len(options.OutputFilterFavicon) > 0 || | ||
| len(options.OutputMatchCdn) > 0 || | ||
| len(options.OutputFilterCdn) > 0 || | ||
| len(options.OutputFilterPageType) > 0 || | ||
| options.OutputMatchCondition != "" || | ||
| options.OutputFilterCondition != "" || | ||
| options.OutputMatchResponseTime != "" || | ||
| options.OutputFilterResponseTime != "" | ||
| } |
| return nil, errors.New("no files found") | ||
| } | ||
| return files, err | ||
| return files, nil |
| -- Back-compat for databases whose schema was created before CPE support. | ||
| -- New installs already get this column via the CREATE TABLE above; this | ||
| -- statement only matters for in-place upgrades. | ||
| -- TODO: replace these ad-hoc ALTER TABLE statements with a proper | ||
| -- migration framework (e.g. golang-migrate / goose) once more schema | ||
| -- changes accumulate. | ||
| ALTER TABLE %s ADD COLUMN IF NOT EXISTS cpe JSONB; |
Revert the defensive strconv.Atoi guard around the SkipDedupe branches: values stored in r.hm are always integer strings written by this same code path, so the parseErr/cnt>0 check is unreachable in practice, and when taken literally it skipped the numHosts/numTargets increment for duplicate inputs.
Summary
DecodeDatafailure returned wrong errorstrconv.Atoiparsing errorshammingDistanceThresholdto constantMotivation
Improve code robustness and fix silent error handling issues.
Changes
Summary by CodeRabbit