PageMatch: Return fetched content in match_data #10143
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
brew style
with your changes locally?brew typecheck
with your changes locally?brew tests
with your changes locally?brew man
locally and committed any changes?This is a follow-up to #10113, which made modifications to
PageMatch
to support working with cached content, as a way of being able to modify theXorg
strategy to usePageMatch
internally (like other strategies). With those changes,PageMatch
properly works with page content that's passed to#find_versions
.However, the issue is that
Xorg
isn't actually able to cache page content in the current state becausePageMatch
isn't passing fetched content back to it in its return hash (match_data
). Before #10113, I had been deleting the:content
frommatch_data
(as it wasn't necessary to keep it in the hash) and I simply forgot to modify that line when modifyingPageMatch#find_versions
to support cached content.In general, this PR makes the following changes:
PageMatch#find_versions
to return fetched page content in thematch_data
hash (i.e., not actively deleting it like we have been doing).provided_content
inPageMatch#find_versions
to ensure thatprovided_content
is a string before using it.match_data[:cached] = true
whenPageMatch#find_versions
is usingprovided_content
. In this case,Cached?: Yes
is printed in the debug output and["meta"]["cached"]
is set totrue
in the verbose JSON output. I missed this issue before because it's sometimes hard to tell when caching is actually working, so it's a good idea to surface this information.Though I know it's the holidays, I'm labeling this "critical" so this fix can be integrated if anyone happens to review it over the next few days.