12 changes: 9 additions & 3 deletions crates/oxide/src/scanner/mod.rs
@@ -553,11 +553,17 @@ where
a
})
.into_iter()
- .map(|s| unsafe { String::from_utf8_unchecked(s.to_vec()) })
+ .filter_map(|s| match String::from_utf8(s.to_vec()) {
+ Ok(s) => Some(s),
+ Err(_e) => {
+ // Optionally log or handle invalid UTF-8 here
+ // eprintln!("Skipped invalid UTF-8 candidate, error: {:?}", _e);
+ None
+ }
+ })
Contributor

This is an intentional performance optimization. String::from_utf8 does a validation pass over the string, which should be unnecessary here. The extractor's implementation should guarantee that the resulting candidates are already valid UTF-8.

If you have a test case for this that crashes when scanning files please open a separate issue and I will take a look.

.collect();

- // SAFETY: Unstable sort is faster and in this scenario it's also safe because we are
- // guaranteed to have unique candidates.
+ // Unstable sort is performant & remains correct with unique candidates
Contributor

While this is not a traditional safety message, it is 100% intentional: behavior would be incorrect with non-unique candidates combined with an unstable sort.

result.par_sort_unstable();

result
@@ -1,4 +1,5 @@
import { parseCandidate } from '../../../../tailwindcss/src/candidate'
+ import { parse as parseHtml } from 'node-html-parser'
import type { DesignSystem } from '../../../../tailwindcss/src/design-system'
import { DefaultMap } from '../../../../tailwindcss/src/utils/default-map'
import * as version from '../../utils/version'
@@ -187,31 +188,24 @@ export function isSafeMigration(
return true
}

- // Assumptions:
- // - All `<style` tags appear before the next `</style>` tag
- // - All `<style` tags are closed with `</style>`
- // - No nested `<style>` tags
+ // Robustly locates all <style> blocks (with or without attributes) using an HTML parser.
const styleBlockRanges = new DefaultMap((source: string) => {
- let ranges: number[] = []
- let offset = 0
-
- while (true) {
- let startTag = source.indexOf('<style', offset)
- if (startTag === -1) return ranges
-
- offset = startTag + 1
-
- // Ensure the style looks like:
- // - `<style>` (closed)
- // - `<style …>` (with attributes)
- if (!source[startTag + 6].match(/[>\s]/)) continue
-
- let endTag = source.indexOf('</style>', offset)
- if (endTag === -1) return ranges
- offset = endTag + 1
-
- ranges.push(startTag, endTag)
+ const ranges: number[] = []
+ try {
+ const root = parseHtml(source, { lowerCaseTagName: false, comment: false, blockTextElements: { style: true } })
+ const styleNodes = root.querySelectorAll('style')
+ styleNodes.forEach(node => {
+ const nodeHtml = node.toString()
+ const start = typeof node.range === 'object' && node.range !== null
+ ? node.range[0]
+ : source.indexOf(nodeHtml)
+ const end = start + nodeHtml.length
+ ranges.push(start, end)
+ })
+ } catch (_) {
+ // fallback: do nothing if parser fails
}
+ return ranges
})
Comment on lines +191 to 209

⚠️ Potential issue | 🟡 Minor

Robust HTML parser implementation with minor fallback concern.

The replacement of manual string scanning with node-html-parser is a significant improvement that correctly handles style tags with attributes, nesting, and complex HTML structures. The parser options are appropriate:

  • blockTextElements: { style: true } ensures content inside <style> tags isn't parsed as HTML
  • lowerCaseTagName: false preserves case sensitivity for accuracy

However, the fallback on line 201 (source.indexOf(nodeHtml)) could match the wrong occurrence if the source contains multiple identical <style> blocks. This is an edge case but could lead to incorrect range detection.

The silent error handling (line 205-207) is reasonable for graceful degradation, but consider logging errors in development mode to aid debugging.

Optional improvement for the indexOf fallback:

-      const start = typeof node.range === 'object' && node.range !== null
-        ? node.range[0]
-        : source.indexOf(nodeHtml)
+      // Only use indexOf as last resort; range should be available in most cases
+      const start = typeof node.range === 'object' && node.range !== null && node.range[0] !== undefined
+        ? node.range[0]
+        : source.indexOf(nodeHtml)
🤖 Prompt for AI Agents
In packages/@tailwindcss-upgrade/src/codemods/template/is-safe-migration.ts
around lines 191 to 209, the fallback that uses source.indexOf(nodeHtml) can
pick the wrong identical <style> occurrence; replace the fallback with a
deterministic search by tracking a moving search offset (e.g., maintain a
lastIndex variable and call source.indexOf(nodeHtml, lastIndex) then advance
lastIndex to end) or prefer parser-provided node.range/startIndex if available,
and ensure ranges are pushed in document order; additionally, change the silent
catch to optionally emit a development-only debug log (or rethrow in dev) rather
than completely swallowing the parser error.
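
The moving-offset search described in the prompt above could look roughly like the sketch below. `locateInOrder` and its `fragments` parameter are hypothetical names, not part of this PR; `fragments` stands in for the `node.toString()` results in document order, and the sketch assumes each serialized fragment appears verbatim in the source, which the parser does not necessarily guarantee.

```typescript
// Hypothetical helper (not part of the PR): resolve each fragment's position
// by searching forward from the end of the previous match, so identical
// fragments map to successive occurrences instead of always the first one.
function locateInOrder(source: string, fragments: string[]): number[] {
  const ranges: number[] = []
  let lastIndex = 0

  for (const fragment of fragments) {
    const start = source.indexOf(fragment, lastIndex)
    // Assumption: skip fragments that cannot be found rather than guessing.
    if (start === -1) continue

    const end = start + fragment.length
    ranges.push(start, end)
    lastIndex = end
  }

  return ranges
}

const block = '<style>a{}</style>'
const source = `${block}<p></p>${block}`

// A plain indexOf resolves both identical blocks to the first occurrence.
const naive = [block, block].map((b) => source.indexOf(b)) // [0, 0]

// Tracking the offset yields one range per occurrence, in document order.
const tracked = locateInOrder(source, [block, block]) // [0, 18, 25, 43]
```

This only addresses the duplicate-match problem in the fallback path; when the parser provides `node.range`, that value would still be preferred.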


const BACKSLASH = 0x5c