Fix Lexer::clone leak and UB + tests #390

Merged: 1 commit, May 3, 2024

@Jakobeha (Contributor) commented May 2, 2024

`Lexer::clone` shouldn't clone the inner `ManuallyDrop`, because doing so clones the inner value, which has already been moved out in `Lexer::next`.

This causes a use-after-free if the lexer is cloned after the last-returned token has been dropped, especially if the token has a custom implementation of `Clone` (such as `Rc`) that tries to read the dropped data.

It also causes a memory leak if the token contains a heap-allocated value, because cloning makes a new allocation. That allocation lives in the `ManuallyDrop` and is guaranteed to be overwritten before the next call to `ManuallyDrop::take`, so it is never freed.

Another thing: maciejhirsz#263 (make `Lexer` implement `Copy`) should probably be added (referencing it here because it looks like the issue has been forgotten).
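
To illustrate the idea behind the fix, here is a minimal sketch using a hypothetical `ToyLexer` (not the real `logos::Lexer`, whose internals may differ). It assumes the token is parked in a `ManuallyDrop<Option<_>>` slot that `next` overwrites and then moves out of, and it shows a `Clone` impl that resets the slot instead of cloning it.

```rust
use std::mem::ManuallyDrop;
use std::rc::Rc;

/// Hypothetical stand-in for a lexer that parks the current token in a
/// `ManuallyDrop` slot until `next` moves it out.
struct ToyLexer<'s> {
    source: &'s str,
    pos: usize,
    token: ManuallyDrop<Option<Rc<str>>>,
}

impl<'s> ToyLexer<'s> {
    fn new(source: &'s str) -> Self {
        ToyLexer { source, pos: 0, token: ManuallyDrop::new(None) }
    }
}

impl<'s> Iterator for ToyLexer<'s> {
    type Item = Rc<str>;

    fn next(&mut self) -> Option<Rc<str>> {
        // Skip leading whitespace, then take one whitespace-delimited word.
        let rest = &self.source[self.pos..];
        let trimmed = rest.trim_start();
        let start = self.pos + (rest.len() - trimmed.len());
        let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        self.pos = start + len;
        // Overwriting the slot does not drop its old (already moved-out) contents.
        self.token = ManuallyDrop::new(if len == 0 {
            None
        } else {
            Some(Rc::from(&trimmed[..len]))
        });
        // SAFETY: the slot was written just above and is read exactly once
        // before it is written again.
        unsafe { ManuallyDrop::take(&mut self.token) }
    }
}

impl<'s> Clone for ToyLexer<'s> {
    fn clone(&self) -> Self {
        ToyLexer {
            source: self.source,
            pos: self.pos,
            // Do NOT clone `self.token`: it may hold the stale bits of a
            // token that the caller has already dropped.
            token: ManuallyDrop::new(None),
        }
    }
}

/// Clones the lexer while a token is alive and again after it is dropped.
fn clone_after_drop_demo() -> (usize, Option<Rc<str>>, Option<Rc<str>>) {
    let mut lexer = ToyLexer::new("let x = 1");
    let first = lexer.next().expect("source has a first word");
    // Cloning the lexer must not create a second handle to `first`.
    let held = lexer.clone();
    let count = Rc::strong_count(&first);
    drop(held);
    drop(first); // the slot in `lexer` now holds stale bits
    let mut copy = lexer.clone(); // fine: the stale slot is never read
    (count, copy.next(), lexer.next())
}
```

In this sketch the reference count of the held token stays at 1 after cloning, and both the clone and the original continue with the same next token.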
@jeertmans (Collaborator)

Hello @Jakobeha, thank you for this very comprehensive analysis!

But, if I understand correctly, doesn't cloning `Lexer` then become virtually useless?

jeertmans added the `bug` label May 2, 2024
@Jakobeha (Contributor, Author) commented May 2, 2024

Cloning can be used as a kind of lookahead: the cloned lexer will return exactly the same tokens that the original lexer would return, without advancing the original (provided the lexing functions are pure). One can also call `clone_from` on the original, passing the cloned lexer, to "commit" the peeked tokens.

I'm not sure it's the most efficient way to implement lookahead (I don't know how its performance compares with storing an array of spans and tokens and making `next` conditional), and cloning `source` this way is redundant, but I'm sure that in most cases it's fast enough not to matter.

The same thing (creating a "lookahead" lexer) can also be done using `Lexer::with_extras(original.remainder(), original.extras)` or, to keep the same spans for the lexed tokens,

let mut clone = Lexer::with_extras(original.source(), original.extras);
clone.bump(original.span().end);

This is less convenient, though, especially if the lexer is wrapped in another data structure, because then `Clone` has to be implemented manually on the outer structure.
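
The clone-as-lookahead pattern and the `clone_from` "commit" step can be sketched as follows. `WordLexer` and `peek_n` are hypothetical names introduced for illustration; the sketch stands in for any `Clone` lexer with pure lexing functions and does not use the logos API.

```rust
/// Hypothetical word lexer; stands in for any `Clone` lexer whose
/// lexing functions are pure.
#[derive(Clone, Debug, PartialEq)]
struct WordLexer<'s> {
    source: &'s str,
    pos: usize,
}

impl<'s> Iterator for WordLexer<'s> {
    type Item = &'s str;

    fn next(&mut self) -> Option<&'s str> {
        let source = self.source;
        let rest = &source[self.pos..];
        let trimmed = rest.trim_start();
        if trimmed.is_empty() {
            self.pos = source.len();
            return None;
        }
        let start = self.pos + (rest.len() - trimmed.len());
        let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        self.pos = start + len;
        Some(&source[start..self.pos])
    }
}

/// Peeks at up to `n` upcoming tokens without advancing `lexer`.
fn peek_n<'s>(lexer: &WordLexer<'s>, n: usize) -> Vec<&'s str> {
    lexer.clone().take(n).collect()
}

fn lookahead_demo() -> (Vec<&'static str>, Option<&'static str>, Option<&'static str>) {
    let mut lexer = WordLexer { source: "if x then y", pos: 0 };

    // Peeking works on a clone, so the original stays where it was.
    let peeked = peek_n(&lexer, 2);
    let first = lexer.next();

    // Read ahead speculatively on a clone, then "commit" with `clone_from`.
    let mut ahead = lexer.clone();
    let _ = ahead.next();
    lexer.clone_from(&ahead);
    let after_commit = lexer.next();

    (peeked, first, after_commit)
}
```

Because the outer type here only derives `Clone`, wrapping such a lexer in another struct keeps working with `#[derive(Clone)]`, which is the convenience the comment above refers to.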


Relatedly, I also thought about `Copy` on `Lexer`, and personally I don't think it's a good idea, because it makes it easy to unintentionally copy the lexer instead of mutably borrowing it. `Clone` doesn't have this issue because it's explicit.

@maciejhirsz (Owner)

This is a pretty good catch!

FWIW, whenever the mythical rewrite happens that turns all the internal gotos from functions into match branches over an enum in a loop, the `token` field can be replaced by just a variable on the stack inside `next`.
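
For readers unfamiliar with that shape, here is a rough sketch of a lexer step written as a loop over a state enum. Everything in it (`Tok`, `State`, `next_token`) is hypothetical and much simpler than anything logos would generate. The point is only that the finished token is returned directly from the loop as a local value, so no field on the lexer is needed to hold it.

```rust
#[derive(Debug, PartialEq)]
enum Tok {
    Ident(String),
    Number(u64),
}

/// Lexer states; each variant remembers where the current token started.
enum State {
    Start,
    Ident(usize),
    Number(usize),
}

/// Lexes one token from `src` starting at `*pos`, advancing `*pos` past it.
fn next_token(src: &str, pos: &mut usize) -> Option<Tok> {
    let bytes = src.as_bytes();
    let mut state = State::Start;
    loop {
        let byte = bytes.get(*pos).copied();
        // Each match arm plays the role of one "goto" target.
        state = match (state, byte) {
            (State::Start, Some(b)) if b.is_ascii_whitespace() => State::Start,
            (State::Start, Some(b)) if b.is_ascii_alphabetic() => State::Ident(*pos),
            (State::Start, Some(b)) if b.is_ascii_digit() => State::Number(*pos),
            // End of input or an unsupported byte: stop lexing.
            (State::Start, _) => return None,
            (State::Ident(s), Some(b)) if b.is_ascii_alphanumeric() => State::Ident(s),
            (State::Number(s), Some(b)) if b.is_ascii_digit() => State::Number(s),
            // The token is built and returned here, straight from the stack.
            (State::Ident(s), _) => return Some(Tok::Ident(src[s..*pos].to_string())),
            (State::Number(s), _) => {
                return src[s..*pos].parse::<u64>().ok().map(Tok::Number)
            }
        };
        *pos += 1;
    }
}
```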

jeertmans merged commit 0bcfb6a into maciejhirsz:master May 3, 2024
18 checks passed
akrantz01 pushed a commit to akrantz01/antsi that referenced this pull request Aug 2, 2024
[![Mend Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [logos](https://logos.maciej.codes/) ([source](https://togithub.com/maciejhirsz/logos)) | dependencies | patch | `0.14.0` -> `0.14.1` |

---

### Release Notes

<details>
<summary>maciejhirsz/logos (logos)</summary>

### [`v0.14.1`](https://togithub.com/maciejhirsz/logos/releases/tag/v0.14.1): 0.14.1 - Debug feature and fixes

#### What's Changed

- fix(doc): reset logos2 to logos by [@jeertmans](https://togithub.com/jeertmans) in maciejhirsz/logos#372
- chore(book): add JSON-borrowed parser example by [@jeertmans](https://togithub.com/jeertmans) in maciejhirsz/logos#373
- Add Rc<T> and Arc<T> sources by [@InfiniteCoder01](https://togithub.com/InfiniteCoder01) in maciejhirsz/logos#340
- Fix unicode dot by [@RustyYato](https://togithub.com/RustyYato) in maciejhirsz/logos#376
- chore(docs): cleanup examples by [@jeertmans](https://togithub.com/jeertmans) in maciejhirsz/logos#381
- chore(lib): add debug feature by [@jeertmans](https://togithub.com/jeertmans) in maciejhirsz/logos#382
- Cleanup unused Source features by [@kmicklas](https://togithub.com/kmicklas) in maciejhirsz/logos#335
- chore(deps): bump peaceiris/actions-mdbook from 1 to 2 by [@dependabot](https://togithub.com/dependabot) in maciejhirsz/logos#387
- Fix `Lexer::clone` leak and UB + tests by [@Jakobeha](https://togithub.com/Jakobeha) in maciejhirsz/logos#390
- fix(lib): correctly handle miss for loop in loop by [@lukas-code](https://togithub.com/lukas-code) in maciejhirsz/logos#393
- chore(lib): remove error branch from LUT if it is unreachable by [@RustyYato](https://togithub.com/RustyYato) in maciejhirsz/logos#386
- fix(docs): typo by [@joerivanruth](https://togithub.com/joerivanruth) in maciejhirsz/logos#396
- chore(docs): Adds graph debug documentation to book by [@afreeland](https://togithub.com/afreeland) in maciejhirsz/logos#379
- chore: drop python linting frmo pre-commit-config by [@LeoDog896](https://togithub.com/LeoDog896) in maciejhirsz/logos#403
- refactor: don't use deprecated max_value() method by [@LeoDog896](https://togithub.com/LeoDog896) in maciejhirsz/logos#404
- chore(version): bump logos version to 0.14.1 by [@jeertmans](https://togithub.com/jeertmans) in maciejhirsz/logos#409
- fix(docs): change old 0.14.0 by [@jeertmans](https://togithub.com/jeertmans) in maciejhirsz/logos#410

#### New Contributors

- [@InfiniteCoder01](https://togithub.com/InfiniteCoder01) made their first contribution in maciejhirsz/logos#340
- [@RustyYato](https://togithub.com/RustyYato) made their first contribution in maciejhirsz/logos#376
- [@Jakobeha](https://togithub.com/Jakobeha) made their first contribution in maciejhirsz/logos#390
- [@lukas-code](https://togithub.com/lukas-code) made their first contribution in maciejhirsz/logos#393
- [@joerivanruth](https://togithub.com/joerivanruth) made their first contribution in maciejhirsz/logos#396
- [@afreeland](https://togithub.com/afreeland) made their first contribution in maciejhirsz/logos#379
- [@LeoDog896](https://togithub.com/LeoDog896) made their first contribution in maciejhirsz/logos#403

**Full Changelog**: maciejhirsz/logos@v0.14...v0.14.1

</details>

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>