New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial implementation of the Monkey language #1

Merged
merged 1 commit into from Oct 19, 2018

Conversation

Projects
None yet
6 participants
@pauldix
Copy link
Owner

pauldix commented Oct 19, 2018

I'm opening this so I can leave questions on some lines for specific things. Hopefully I can come back to these and answer them!

@pauldix
Copy link
Owner

pauldix left a comment

Leaving some open questions here.

// a HashLiteral isn't a valid expression as a key in a monkey hash.
impl Hash for HashLiteral {
fn hash<H: Hasher>(&self, _state: &mut H) {
panic!("hash not implemented for HashLiteral");

This comment has been minimized.

@pauldix

pauldix Oct 19, 2018

Owner

Is this bad form to panic on a method implemented for a trait? This should never get called.

This comment has been minimized.

@Boscop

Boscop Oct 23, 2018

If executing this code should be a bug (the programmer's fault) it's usually recommended to use unreachable!() to signal that this code should never be executed.
(In many cases, this also allows the compiler to do more optimizations, also when using zero-sized types like Never which can never be instantiated.)

This comment has been minimized.

@pauldix

pauldix Oct 24, 2018

Owner

Thanks!

pub struct IfExpression {
pub condition: Expression,
pub consequence: BlockStatement,
pub alternative: Option<BlockStatement>,

This comment has been minimized.

@pauldix

pauldix Oct 19, 2018

Owner

I had some confusion about some spots where it seems like you can use Option in lieu of using a Box. Is that true, what's going on here?

This comment has been minimized.

@Boscop

Boscop Oct 23, 2018

You can use Option for any type that has a fixed size (that is Sized), e.g. structs, enums. Box is usually used for trait objects (as in Box<dyn Trait> when the trait is "object-safe" (not all traits can be turned into trait objects)) or sometimes boxed slices (Box<[T]> because while a reference to a slice is Sized (it's a fat pointer that also stores the len), a slice itself is not Sized).
Btw, a reference to a trait object is also a fat pointer (pointer to the instance and a pointer to the vtable of its trait impl).
The doc has a detailed explanation of the rules for when a trait is object-safe (sorry, I'm on my phone right now).

Box of course also differs from Option in that it represents a definitely existing T instead of an optional T, and thus automatically dereferences to its content (via the Deref trait).

Btw, I just read your blog post and I'm interested in helping you on your journey of learning Rust and using it in production :)

You mentioned that you're also interested in gRPC and web servers. There are several crates for gRPC (when I tried it there was only one but it was working well) and for Web servers I've been using rocket in production (before actix-web existed) and now also actix-web (because it's faster / async (and it allows serving websockets on the same port)).
There are also some other async web server frameworks like tower-web but they are in early stages.

This comment has been minimized.

@OvermindDL1

OvermindDL1 Oct 23, 2018

Another +1 on actix/actix-web, I'm surprised at just how solid it is and it seems to be taking over in any place I see such programming styles being used.

This comment has been minimized.

@pauldix

pauldix Oct 24, 2018

Owner

@Boscop thanks, that's great info. I'll have to read more up on trait objects. Didn't really get a chance to utilize traits in this code, but it's coming up for some future hacking. I'll search through the gRPC crates. I saw in some thread criticism about Rocket because it's not async so I'll probably dig into actix first.

input: &'a str,
expected: i64,
}
let tests = vec![

This comment has been minimized.

@pauldix

pauldix Oct 19, 2018

Owner

Is the table test idiom used by Rustaceans or is this frowned upon?

This comment has been minimized.

@pastjean

pastjean Oct 23, 2018

Yeah it is common pattern used, you can even look into quickcheck & proptest which are really popular libraries that helps doing those table tests in some "more powerful" ways.

Builtin(Builtin),
Array(Rc<Array>),
Hash(Rc<MonkeyHash>),
Null,

This comment has been minimized.

@pauldix

pauldix Oct 19, 2018

Owner

The Null object is referenced everywhere and I always create a new one when used. In the book a constant was used. I couldn't quite figure out how to do this in basic Rust. Would you use something like the lazy_static crate to define constant objects?

This comment has been minimized.

@gmacon

gmacon Oct 22, 2018

Rust represents enum variants of unit type (like Null) very efficiently, so creating them over and over is fine.

In this example, creating the unit variant compiles down to a single assembly instruction (mov [r14], 2). While the exact memory layout of an enum is an implementation detail, I don't have any reason to believe that it would ever get much more expensive than that.

}

impl Object {
pub fn inspect(&self) -> String {

This comment has been minimized.

@pauldix

pauldix Oct 19, 2018

Owner

The book used the inspect method, but I think if I were more idiomatic I would just implement the Display trait.

This comment has been minimized.

@OvermindDL1

OvermindDL1 Oct 23, 2018

Display is useful for something displayable in a user-formatted way. If you want a low level detail representation then look at the Debug trait, that seems closer to what inspect is from what I see about what inspect is supposed to be?

This comment has been minimized.

@pauldix

pauldix Oct 24, 2018

Owner

Yeah, I think that's right. I would imagine in a more complete implementation of a language I would want to implement both (not as derived, but specific to those objects)

impl Builtin {
pub fn lookup(name: &str) -> Option<Object> {
match name {
"len" => Some(Object::Builtin(Builtin::Len)),

This comment has been minimized.

@pauldix

pauldix Oct 19, 2018

Owner

Same here as with Object::Null, would be nice to re-use a constant of some sort. How to do this?

This comment has been minimized.

@Boscop

Boscop Oct 23, 2018

You only need lazy_static if the value can't be computed at compile-time.
Btw, the range of expressions that can be computed at compile-time is being expanded (through const fn, similar to const-expr in C++).

For this case, you could just define a non-lazy (normal) static or const:

const BUILTIN_LEN = Object::Builtin(Builtin::Len);

The difference between a const and a static is that consts don't have a memory location, they are inlined at every use location, so it's not recommended to use const for large arrays, they will be copied onto the stack at the use locations and can blow the stack.

pub struct Function {
pub parameters: Vec<ast::IdentifierExpression>,
pub body: ast::BlockStatement,
pub env: Rc<RefCell<Environment>>,

This comment has been minimized.

@pauldix

pauldix Oct 19, 2018

Owner

This is the spot where I end up creating a memory leak since there are circular references. I'll log an issue to talk about what to do about this since it's a longer thing. One simple question though: I've seen Rc<RefCell<... and I've seen RefCell<Rc<.... Is there a difference between the two? Which is "more correct"?

This comment has been minimized.

@Boscop

Boscop Oct 23, 2018

For avoiding memory leaks caused by cycles in the interpreted code, this gc library seems like a good fit (but it's in an early stage):
https://www.reddit.com/r/rust/comments/9ozaut/shifgrethor_i_garbage_collection_as_a_rust_library/

RefCell is like Cell (allows mutation) except it works for types that aren't Copy. Rc wraps it in a reference counted "box", so if you want to be able to mutate the same object from different (non-concurrent) code paths (in the same thread), you'd usually use Rc<RefCell<T>>, e.g. with the job_scheduler crate if you want to be able to mutate an object from different jobs (that are not run in parallel).
It's used when the borrow checker won't let you do it without this, because it can't prove that the mutable borrows don't overlap in time, but you know it for sure (RefCell will panic at runtime if a second mutable borrow happens while the first one is still active).
This pattern is basically runtime borrow checking. It should only be used when compile-time borrow checking doesn't suffice (because there is the (small) ref counting overhead and it's better to prove the absence of a bug at compile-time than to check for it at runtime).
This pattern is also used heavily in the nphysics engine (the last time I checked) because it has to store a lot of cross references that allow mutating the objects.

AFAIR, I've never used the second way (RefCell<Rc<...) (since I started using Rust in 2014), where did you see that?

This comment has been minimized.

@pauldix

pauldix Oct 24, 2018

Owner

I thought I had seen it the second way somewhere, but now I can't seem to dig up where I saw this. So maybe just misremembering all the different things I read on these.

Ok(left_exp)
}

fn prefix_fn(&mut self) -> Option<PrefixFn> {

This comment has been minimized.

@pauldix

pauldix Oct 19, 2018

Owner

The way I did this felt kind of odd. He had in the book a global map that had a string key to match the token to the associated function used to parse. I got the same effect with this match and passing functions, but it felt a little weird. Is there a better way to do this?

This comment has been minimized.

@Boscop

Boscop Oct 23, 2018

This is basically the more type-safe version of that. Instead of looking up the function by the token's string repr, you're looking it up by the token variant. I don't think this is weird but I'd probably manually inline infix_fn at its call location to make it more apparent / readable what its purpose is (so that the person reading this code doesn't have to jump around), also because the infix_fn is short and only used at this one location.
So I'd write let infix_fn = match { ... and then match infix_fn { ....


type ParseError = String;
type ParseErrors = Vec<ParseError>;
pub type ParseResult<T> = Result<T, ParseError>;

This comment has been minimized.

@pauldix

pauldix Oct 19, 2018

Owner

The error handling examples in both The Rust Programming Language and Programming Rust used just Return as a name. But when I tried to do that I got some weird error about cycles. Is that not allowed anymore and is it now the convention to do <Thing>Result and <Thing>Error?

This comment has been minimized.

@sagebind

sagebind Oct 23, 2018

I assume you mean that you tried pub type Result<T> = Result<T, ParseError>;? This doesn't work because the compiler thinks you are defining the type alias in terms of itself, hence the cycle detected error (probably not the best error message).

Try using the fully qualified name instead:

pub type Result<T> = ::std::result::Result<T, ParseError>;

@pauldix pauldix merged commit 78c705f into master Oct 19, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment