Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code Review: Smart Contract VM 02/13/19 #921

Merged
merged 50 commits into from Feb 19, 2019
Merged

Conversation

kantai
Copy link
Member

@kantai kantai commented Feb 13, 2019

This PR adds the following:

  1. Contract interface for executing a transaction within the context of a transaction, with an in-memory implementation of that interface.
  2. (define-public ...) function for specifying the public functions in a contract.
  3. Support for buffer literals via hex strings or ascii string literals.
  4. Type definitions that use the structural representation described in SIP-002 (i.e., (list 3 (buffer 5)))
  5. Max value size enforcement
  6. Max stack and context depth enforcement

This PR also reimplements the lexer from #908 so that it is a much more standard lexer (and the lexer itself handles literals).

This PR addresses:

#911
#913 -- though this will need to be revisited during the course of the implementation
#914
#915
#917
#918

…paration for creating contract contexts-- a transaction should _never_ be able to modify a contract's global context.
@kantai
Copy link
Member Author

kantai commented Feb 15, 2019

Pushed some more work into this PR:

Adding type parameters to (define-public ...): #922

  • Use loose type admission for maximum buffer/tuple size: basically, we want it so that we enforce a maximum buffer size, but will accept smaller buffers (the containing type is just the maximum of its composite types)

Adding initial implementation of a "Principal" type, parsed as described in SIP-002. I got a little carried away on this and implemented C32 address decoding -- see the implementation in src/address/c32.rs

}
}
} else {
let types = self.types.as_ref().unwrap(); // if types is None, and is_public = true, we should panic.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally fine that this panics, but could you insert an error message here as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, you could make this type of function unrepresentable by removing is_public from DefinedFunction and instead create a type enumeration like this:

pub enum FunctionTypes {
   PublicFunction(DefinedFunction),
   PrivateFunction(DefinedFunction)
}

Then, you'd simply handle this as a match statement.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I'll create a new type for this

@@ -21,26 +28,38 @@ impl <'a> Environment <'a> {
}
}

// Aaron: note -- only the global context will ever have DefinedFunctions
// so it is probably worthwhile to separate into 2 types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a good idea -- goes along with the general Rust-ism of avoiding non-sensical instantiations by making them unrepresentable in the first place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

src/vm/errors.rs Outdated
BadSymbolicRepresentation(String),
ReservedName(String),
InterpreterError(String),
MultiplyDefined(String)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"MultiplyDefined"? As in, defined multiple times, or something to do with the multiplication operator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it's an error when you use the same variable name multiple times in a variable declaration like, the function definition:

(define (foo a a) (+ a a))

I can rename it to VariableDefinedMultipleTimes

src/vm/types.rs Outdated
List(Vec<Value>, TypeSignature),
Principal(u8, Vec<u8>), // a principal is a version byte + hash160 (20 bytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can enforce this in the type system by using [u8; 20] instead of Vec<u8>

src/vm/types.rs Outdated
let mut value_size: i128 = 0;
for (name, type_signature) in self.type_map.iter() {
// we only accept ascii names, so 1 char = 1 byte.
name_size = name_size.checked_add(name.len() as i128).unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these unwrap()s, can you add an error message?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, how certain are we that these unwrap()s never occur?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure -- will add an error message.

These panics should never occur, since the composite values of the tuple should be smaller than MAX_VALUE_SIZE (much smaller than i128::max bytes) and combining them until the tuple is too large would require an extraordinarily large program. However, since the max value size check for tuple (and list) types occurs using the size() method, it's theoretically possible. So I'll change these to raise a ValueTooLarge error.

// e.g.: (list "abcd" "abc") will currently error because one etry is
// if type (buffer 4) and the other is of type (buffer 3)
// my feeling is that this should probably be allowed, and the resulting
// type should be (list 2 (buffer 4))
Copy link
Member

@jcnelson jcnelson Feb 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, just as long as we can make it clear somehow that (list "abcdefghijk" "abc") costs as much to process as (list "abcdefghijk" "\x00\x00\x00\x00\x00\x00\x00\x00\x00abc"). We'd also want to be careful about how we "expand" shorter buffer items -- do we left-pad them with 0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep -- we'd need to make this clear. I think the behavior we want is that for any functions which accepts two buffers, to treat the smaller buffer as left padded with zeros. currently we don't have any buffer functions (except eq?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you want to handle eq? for buffers that get implicitly left-padded with zeros? I don't think \x00abc and abc are necessarily equal. Maybe the buffer type should include a length field internally to avoid this problem?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was just going to comment on this. Those shouldn't be equal. Buffers should have a length, which they do currently (via their vec.len()), but the Buffer type has a maximum length, which is a little silly for a buffer's runtime type (because its length is always equal to its maximum length), but for declared types like public function inputs and datamap keys, values, it makes sense, because the maximum length determines whether or not a given runtime value would be admissable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this contradicts my previous response up there --- I don't think we should be automatically extending buffers. if we decide to include buffer functions like strncat, they should specify their padding behavior, if they have any)

src/vm/parser.rs Outdated
// lazy_static (or just hand implementing that), and I'm not convinced
// it's worth either (1) an extern macro, or (2) the complexity of hand implementing.
let lex_matchers: &[LexMatcher] = &[
LexMatcher::new(r##""(?P<value>((\\")|([[:ascii:]&&[^"\n\r\t]]))*)""##, TokenType::StringLiteral),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is :ascii: here? Is it printable ASCII characters? Also, can we have a lexer test below for ensuring that a code snippit with non-printable characters does not parse?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the str type is a unicode string slice. We'll want the lexer to reject non-ASCII strings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah -- that :ascii: class should actually be :print: for printable ascii characters.

But this lexer will reject non-ASCII strings. None of these matchers will match a non-ascii character, which will cause the lexer to fail (if the lexer cannot process a given character, it exhaust all the matches and returns a ParseError)

@kantai
Copy link
Member Author

kantai commented Feb 19, 2019

Okay -- I pushed changes that addressed all of those issues.

  1. Use two types for Global vs. Local contexts.
  2. Use two types for public vs. private defined functions.
  3. Raise a ValueTooLarge error in the event of an i128 overflow in type size.
  4. Use printable ascii character class instead of ascii character class (and add unicode rejection test)
  5. Use [u8;20] for the principal data
  6. Rename MultiplyDefined error
  7. Implement value equality check which checks for data equality not type equality. Will create a new issue for speccing out buffer functions.

@jcnelson
Copy link
Member

Thanks! As soon as CircleCI finishes, please go ahead and merge to develop. Thanks also for implementing c32.rs 😀

@kantai
Copy link
Member Author

kantai commented Feb 19, 2019

Thanks for the review! The Circle CI tests look like they've finished successfully (https://circleci.com/gh/blockstack/blockstack-core/1716), so I'll go ahead and merge (Circle github notifications/commit check updates appear to currently be broken).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants