Feature/char type #957
Codecov Report
@@ Coverage Diff @@
## master #957 +/- ##
==========================================
+ Coverage 74.97% 75.35% +0.38%
==========================================
Files 429 435 +6
Lines 17170 17565 +395
==========================================
+ Hits 12873 13236 +363
- Misses 4297 4329 +32
I vote for putting single quotes (e.g. 'a') in output files, for greater clarity and syntactic uniformity.
Regarding support for
let g = '\xH2A';
let h = '\xH9';
let i = '\xO011';
let j = '\xO172';
actually the RFC (https://github.com/AleoHQ/leo/blob/master/docs/rfc/001-initial-strings.md#characters), and the ABNF PR (#954), only allow
let a = '\x0a';
let b = '\x7E';
let c = '\x44';
...
that is, \x followed by an octal digit followed by a hex digit. The rationale was to follow Rust (https://doc.rust-lang.org/stable/reference/tokens.html#ascii-escapes) and to have something simple. So we need to change either the implementation or the RFC and ABNF PR, to make things consistent one way or the other.
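For reference, the rule described here (an octal digit followed by a hex digit after \x) can be sketched as a small check. This is only an illustration: `parse_ascii_escape` is a hypothetical helper name, not a function from the Leo lexer, and it takes just the characters that follow `\x`.

```rust
/// Parses the characters that follow `\x` in an ASCII escape.
/// Hypothetical helper for illustration; not taken from the Leo lexer.
fn parse_ascii_escape(digits: &str) -> Option<char> {
    let mut chars = digits.chars();
    let high = chars.next()?;
    let low = chars.next()?;
    // Exactly two digits are allowed after `\x`.
    if chars.next().is_some() {
        return None;
    }
    // The first digit must be octal (0-7) so the value stays within 7-bit ASCII.
    let high = high.to_digit(8)?;
    // The second digit may be any hex digit (0-9, a-f, A-F).
    let low = low.to_digit(16)?;
    char::from_u32(high * 16 + low)
}
```

Under this rule `0a`, `7E` and `44` are accepted, while `H2A`, `H9`, `O011` and `O172` are rejected.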
Ah, I totally misread how that worked, @acoglio. Whoops. Either way, I don't mind changing the code to match the RFC, which in turn matches Rust. We could also change the RFC, but I'm more in favor of matching Rust's behavior.
I'm also in favor of matching Rust's behavior -- thanks.
…ts for now since no constraints
Output files now write chars with single quotes, e.g. 'a'. Using #920 for tests. Note that tests that should pass currently fail until snarkVM supports field equality.
- currently uses back quotes "`" for strings, change later
- ast -> asg unimplemented, strings need to be processed on canonicalization stage
…those are in they will work
I started on #946. This is possible bc even though |
[Feature] String parsing
Looks good.
Overall good!
One note. Should we have consistency in the naming of tokens? We have CharLit and AddressLit tokens, but the string token is StringLiteral. I can submit a quick patch.
UPD: done.
Found two issues:
let b: char = 'abcdefg';
let c: char = '\u{bbbbb}\u{aaaa}'; // works only with unicode escapes
let d: char = '😭😂😘';
let s = "😒";
// thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ValidationFailed',
// 5: leo_parser::tokenizer::lexer::<impl leo_parser::tokenizer::token::Token>::eat
// at ./parser/src/tokenizer/lexer.rs:227:29
@damirka #1 shouldn't be allowed, but it should be an easy fix; I definitely just messed that up when I changed the char parsing style. #2 is probably an issue because a Unicode symbol spans several u8 bytes and, since it's not an escape, I probably attempted to send it over one byte at a time. At least that's my guess. I'll look into both and see what we can get fixed.
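Both points can be illustrated with a short sketch. `single_char` is a hypothetical helper, not code from this PR, and it ignores escapes: it walks the literal body by Unicode scalar values rather than by bytes, so a multi-byte symbol counts as one character, and anything holding more than one character is rejected.

```rust
/// Returns the single character in a char literal body, or None if the
/// body is empty or holds more than one character.
/// Hypothetical helper for illustration; escapes are not handled here.
fn single_char(body: &str) -> Option<char> {
    // `chars()` iterates over Unicode scalar values, not over u8 bytes.
    let mut chars = body.chars();
    let first = chars.next()?;
    if chars.next().is_some() {
        return None;
    }
    Some(first)
}
```

With this approach `"😒"` (four bytes in UTF-8) yields one character, while `"abcdefg"` and `"😭😂😘"` are rejected.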
#1 has been resolved.
LGTM
Tested multiple scenarios, and it looks like we got it working.
Implements the basic char type in Leo.
Closes #939.
Closes #940.
Closes #942.
Closes #946.
Closes #950.
What it does is allow the base type into the language.
Adds Parser Tests for the char type.
Adds Compiler Tests for the char type. Should I merge in the compiler tests branch first and have that branch depend on this one? Then we can merge this branch to master after that one is in, or we can merge this one into that branch.
The ASG passes both the character and the field version to the compiler. This is so the character can be printed, while the compiler uses the field portion for everything else.
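As a rough sketch of carrying both forms side by side: `CharValue` below is a hypothetical type, not the ASG's actual representation, and it assumes the numeric form is simply the character's Unicode code point stored in a `u32`, whereas the real compiler works with a field element.

```rust
use std::fmt;

/// Hypothetical pairing of a character with its numeric form.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CharValue {
    /// Kept so the value can be printed as a character.
    character: char,
    /// Assumed encoding: the Unicode code point. The real compiler
    /// uses a field element here.
    field: u32,
}

impl CharValue {
    fn new(character: char) -> Self {
        Self { character, field: character as u32 }
    }
}

impl fmt::Display for CharValue {
    /// Prints the character form wrapped in single quotes.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "'{}'", self.character)
    }
}
```

Printing goes through the character, and everything else would read the numeric `field` member.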
The following is allowed in Leo, and is also allowed in input files.
Output files lose the single quotes on the char; should I force it to put them there? Or should I have them write the field to the output instead? Example of current behavior:
Note: Operators are not yet implemented.