Optimise encodeNonUTF8QueryURLs #485

Merged
NGTmeaty merged 1 commit into internetarchive:main from vbanos:opt-encodeNonUTF8QueryURLs on Sep 19, 2025
vbanos (Collaborator) commented Sep 19, 2025

Two optimisations:

We iterate over each query `key` and its `values`, then over each `value` in `values`. The `key` was re-encoded for every `value`, although it only needs to be encoded once. The key encoding now happens outside the `values` loop.

We called `enc.NewEncoder().String()` every time we needed to encode a value, which created a new encoder on each call. We now define `encoder := enc.NewEncoder()` once at the beginning and reuse it.

We also add a unit test for a bad URL, just to be sure.
vbanos changed the title from "Optimise internal/pkg/postprocessor/extractor/html_document_test.go" to "Optimise encodeNonUTF8QueryURLs" on Sep 19, 2025
codecov-commenter commented Sep 19, 2025

Codecov Report

❌ Patch coverage is 87.50000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.39%. Comparing base (0837f4d) to head (2c7ad38).
⚠️ Report is 63 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ernal/pkg/postprocessor/extractor/html_document.go | 87.50% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #485      +/-   ##
==========================================
- Coverage   56.42%   56.39%   -0.03%     
==========================================
  Files         130      130              
  Lines        8091     8096       +5     
==========================================
+ Hits         4565     4566       +1     
- Misses       3161     3164       +3     
- Partials      365      366       +1     
| Flag | Coverage Δ |
|---|---|
| e2etests | 40.65% <68.75%> (-0.01%) ⬇️ |
| unittests | 29.37% <68.75%> (-0.01%) ⬇️ |


NGTmeaty merged commit dc714e4 into internetarchive:main on Sep 19, 2025
2 checks passed