Skip to content

Conversation

@johnowhitaker
Copy link
Contributor

Related issue: #668

Issue: inline spaces weren't preserved:

e.g. "Howdy <a href="https://answer.ai">answer</a> how are you?" -> "Howdy" (no trailing space).

We changed it to only strip newlines from strings in _parse, and added a test. The original test failed, the HTML was functionally identical but different by one newline - independant of the trailing space bug. We now compare two soups of the HTML before vs after the round trip

@gitnotebooks
Copy link

gitnotebooks bot commented Nov 13, 2025

Found 1 changed notebook. Review the changes at https://app.gitnotebooks.com/AnswerDotAI/fasthtml/pull/797

@johnowhitaker
Copy link
Contributor Author

@jph00 can you review when conveneint (I can explain when we next chat :) )

@johnowhitaker
Copy link
Contributor Author

To expand, this was a bug we picked up during an impromptu office hours.
We first looked at the example from the issue and explored it in a dialog here.

Then, once we were convinced it was a good idea, we moved to the fasthtml.components nb and made the tweak. We added an inline, informal test, but then noticed the 'tests' section right below and added a test for this case instead. It failed though, an errant newline unrelated to our change (I think). @erikgaas had the bright idea to check if bs4 could check equivalence of two soups, and we changed the test to use that instead of comparing the xml strings directly.

@jph00 I am not 100% sure this is right - can you think if there are cases where we definitely do want to strip() strings that are parts of elements? Or other issues with our fix? Do you think it makes sense to still strip newlines, or should weget rid of the strip()s entirely?

@jph00
Copy link
Contributor

jph00 commented Nov 13, 2025

lgtm! :) tbh I don't recall why the strip is there at all...

@jph00 jph00 merged commit 9994651 into main Nov 13, 2025
2 checks passed
@jph00 jph00 deleted the html2ft_space branch November 13, 2025 20:09
@jph00 jph00 added the bug Something isn't working label Nov 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants