Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AST source locations #16

Merged
merged 35 commits into from Aug 17, 2022
Merged

AST source locations #16

merged 35 commits into from Aug 17, 2022

Conversation

rvanasa
Copy link
Contributor

@rvanasa rvanasa commented Aug 14, 2022

Resolves #7.

Progress:

  • Find a good way to convert usize ranges to line/column numbers
  • Refactor Exp_, Type_, etc. to include source locations
  • Refactor the lexer to use the same source location pattern
  • Update usages throughout the codebase to account for these changes
  • Update the parser to preserve source locations from tokens

@rvanasa rvanasa marked this pull request as draft August 14, 2022 18:19
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct Loc<X>(pub X, pub Source);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it help in any way to absorb the Box<_> around the X into this type definition?

That way, we avoid having to think about the Box and the Loc separately, and they combine in the AST structure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally included Box<_> in the definition, but I ran into edge cases where I just wanted the source information without storing everything on the heap (specifically for the lexer). This could also give us room for optimizations down the line in the interpreter.

src/lib/ast.rs Outdated

pub type Prog = Decs;

pub type Dec_ = Loc<Box<Dec>>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if reading Loc<Dec> is simpler, where it would imply the box?

See my earlier comment about this question, near the def for Loc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense for the AST but adds a bunch of extra overhead to the lexer. If this is enough of a QoL issue, we could maybe add a standard type alias called LB<_> or similar.

We could also add some specialized sugar functions by adding and impl<X> for Loc<Box<X>>, which makes it seem okay to have the optional heap allocation from an abstraction standpoint.

vec: vec![d],
has_trailing: false,
pub fn dec_into_exp(d: Dec_) -> Exp_ {
Loc(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question, and more evidence for combining I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a counterexample where it makes sense to use Loc<_> without Box<_>. We could also eventually use this to optimize the VM, since boxing would add a bit of extra overhead which could make an impact on the interpreter's performance. Seems worth having a "separation of concerns" similar to Rust's Arc<Mutex<_>> pattern for reusability in different situations. I'll definitely add a shorthand for Loc<Box<_>> though, since it will probably continue to be used in most cases.

@matthewhammer
Copy link
Contributor

This is looking really good, and very tedious in terms of the changes required. Thank you for undertaking it!

@rvanasa
Copy link
Contributor Author

rvanasa commented Aug 15, 2022

Replaced all usages of Loc<Box<_>> with a type alias called Node<_>, which has a specialized impl with its own helper functions, etc. Let me know if you think of a better name, since I just chose Node since it seemed to cause slightly less cognitive overhead than LB.

@matthewhammer matthewhammer marked this pull request as ready for review August 15, 2022 21:27
@matthewhammer matthewhammer marked this pull request as draft August 15, 2022 21:27
@matthewhammer
Copy link
Contributor

Accidentally converted from "draft" to "ready to review", so just reverted that.

@rvanasa rvanasa marked this pull request as ready for review August 17, 2022 18:14
@rvanasa rvanasa merged commit ce7d4ed into main Aug 17, 2022
@rvanasa rvanasa deleted the issue-7-line-col-numbers branch August 18, 2022 00:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Line numbers. Column numbers.
2 participants