Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ravi evolution proposal #223

Open
snoopcatt opened this issue May 19, 2021 · 62 comments
Open

Ravi evolution proposal #223

snoopcatt opened this issue May 19, 2021 · 62 comments

Comments

@snoopcatt
Copy link

Hello!
We have some proposals about Ravi development & support.

Our team wish to use Ravi in production.
So, if you need some our help & contributing (patches, bounties, etc.) we need to talk with you somewhere ☺

If interested, please, share some your contact.

@dibyendumajumdar
Copy link
Owner

Hi - thank you for your interest. I can't accept bounties but if you would like to contribute patches/improvements, will be happy to consider them.

@snoopcatt
Copy link
Author

snoopcatt commented May 20, 2021

Understood, thank you for your reply.
Do you have any roadmap and/or future release plans? Planned/desired features maybe?
And in general, what are your plans for this project?

I've researched your project within a few weeks, and I have some questions and proposals.


MIR JIT

Why do you choose MIR JIT?
It lacks MIPS architecture support. Running on ARM64 is a problem too.
It does not look ready at all.

Also, why we have to compile functions manually? Why not to do it at start, as LuaJIT does?

Performance

Ravi is 3-5 times slower than vanilla Lua 5.3, and about 10 times slower than LuaJIT.
I'm talking about real use, not synthetic benchmarks. Networking, in my case (HTTP server, PostgreSQL client).
Of course, code was the same for all interpreters and static typing was not used. JIT was enabled too.

Yes, Ravi still leaves far, far behind all smoothie technologies such as Python (with asyncio) and V8 (NodeJS).
But boost at least to vanilla Lua 5.3 level will be delightful ☺

Static typing

Static typing is great thing, and you have done great job implementing this.
However, there is a lot to improve here.

Some types (such as string or closure) allows variable to be nil.

This leads to the fact that we need additional checks/asserts in a function body.

There is no boolean type, which is very useful.

There was, as I can see within commented code, but you dropped this out?
Why? Is there a some caveat here?

Function arguments can not be optional.

It means that if we need type checking on function arguments, they can not be optional (or nil).
Moreover, string, closure and userdata can be optional, but integer, number and table — not. Confusing a lot.

Useful (but not mandatory) features for typing system

  • Union types (string|number)
  • Return value type checking (example: function test(a: integer, b: integer): integer)
  • Structs and enums (not necessary since we have possibility to define custom type)

I'm looking forward to your reply.
I almost finished my work on typing system misbehaviour, but I'm not sure that we have same vision how it should work.
Also I'm not sure in my code and implementation quality (I had to add dozens opcodes) and it of course will break backward compability.

Accordingly to that, I wouldn't open PR now, but in next my post I will attach a patch with explainations and some code examples with new typing system behaviour.

Thank you for your time! ♥

@snoopcatt
Copy link
Author

snoopcatt commented May 20, 2021

As I said before, I've done some work to change some confusing behaviour of typing system (and improve it a bit).

Function arguments with types string, closure, table, userdata can not be nil.

Example:

local function hello(a: string)
	print(a, ravitype(a))
end

hello("world")
hello()

Current behaviour:
hello string
nil nil

New behaviour:
world string
ravi: test_string.ravi:0: string expected

Function arguments now may be optional.

Each argument type (except integer[] and number[]) now may be optional. Just append ? (question mark) to type name.
Example:

local function sum(a: integer, b: integer, c: integer?)
	return a + b + (c or 0)
end

print(sum(1, 2, 3))
print(sum(1,2))
print(sum(1))

Output:
6
3
ravi: test_optional.ravi:0: TOINT: integer expected

Added boolean type.

Example:

local function hello(b: boolean)
        print(b, ravitype((b))
end

hello(true) -- `true, boolean`
hello("world") -- error: ravi: test_boolean.ravi:0: boolean expected

So, here is the patch

5359ca1d5ff50204e29ff594c271d55973f9f646.patch.txt

Implementation may be not correct at all, I'm sure that it can be better.
But before rigorous testing and trying to optimize it some way, I need to know your opinion about this things.

Of course, it will break backward compatibility (because currently function accepts arg: string even if it is nil), but IMO modified behaviour is more predictable and understandable.
TypedLua also works the same way.


Will look forward for your response!
Thank you for attention.

@snoopcatt snoopcatt changed the title Ravi production usage Ravi evolution proposal May 20, 2021
@dibyendumajumdar
Copy link
Owner

Hi thank you for above. I will respond later when I have time.

In general, please keep PRs separate by each feature - and also ensure you have run tests/added tests.

One thing worth noting is we cannot allow NIL for integer/number types or table/integer[]/number[] types because that will mess up the JIT

@snoopcatt
Copy link
Author

One thing worth noting is we cannot allow NIL for integer/number types or table/integer[]/number[] types because that will mess up the JIT

Yea, agree with you about table/integer[]/number[].
But what's wrong with integer/number types?

I made some quick tests, all looks good.

ravi.jit(true)

function sum(a: integer, b: integer, c: integer?)
        return a + b + (c or 0)
end

ravi.compile(sum)

for i=1,10000000 do
        assert(sum(10, 20), 30)
        assert(sum(10, 20, 70), 100)
end

print 'done'

What do you mean about messing up the JIT?

@dibyendumajumdar
Copy link
Owner

ravi generates special bytecodes for integer/number types when annotated
You can see by running:

ravi.dumplua(function)

There are no NIL checks as the whole point is to do optimized arithmetic.

@dibyendumajumdar
Copy link
Owner

Also new compiler will allocate on stack

@snoopcatt
Copy link
Author

ravi generates special bytecodes for integer/number types when annotated
You can see by running:

ravi.dumplua(function)

There are no NIL checks as the whole point is to do optimized arithmetic.

Still not understanding how it theoretically can affect runtime when using optional types.
By the way, there are extra opcodes for optional types with nil-checking , like

// optional types
vmcase(OP_RAVI_TOINT_NIL) {
lua_Integer j;
if (RAVI_LIKELY(tointeger(ra, &j))) { setivalue(ra, j); }
else if(!ttisnil(ra))
  luaG_runerror(L, "TOINT: integer expected");
vmbreak;
}

Can you, please, explain me the main danger with some example code?
Or it is just about performance?

Optional typed arguments is very important thing, without it it would be hard to implement some kind of applications.

@dibyendumajumdar
Copy link
Owner

There are two distinct goals for type annotations.

Ravi's implementation is a hybrid. Some types are for performance. Others are just for argument / return type checking - but they do not help performance

To support optional integer/number would need the compiler to revert to unoptimized type as NIL is not a valid value

@snoopcatt
Copy link
Author

To support optional integer/number would need the compiler to revert to unoptimized type as NIL is not a valid value

Can you clarify where to look? Commit number/whatever.

I also still do not understand what exactly wrong with my implementation.
Optional integers/numbers are working well, performance is the same as with mandatory types (as in current upstream revision).

Are there any potential critical bugs may hide? Such as crashes, memory corruption, et cetera.

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

ravi generates special bytecodes for integer/number types when annotated
You can see by running:

ravi.dumplua(function)

There are no NIL checks as the whole point is to do optimized arithmetic.

Note that this is currently handled correctly.

The following code produces an int|nil value with (n and a or nil). The following ADD opcode is not optimized as the value can be nil.

ravi.jit(true)

function sum(n, a: integer, b: integer)
        return (n and a or nil) + b
end

ravi.dumplua(sum)

@snoopcatt
Copy link
Author

ravi.jit(true)

function sum(n, a: integer, b: integer)
        return (n and a or nil) + b
end

ravi.dumplua(sum)

I think this code is not correct at all.
If n is not specified, you will try to sum b and nil.

Slightly modified code:

ravi.jit(true)

function sum(n: integer?, a: integer, b: integer)
	return a + b + (n or 0)
end

-- ravi.dumplua(sum)

print(sum(nil, 1, 2))
print(sum(1, 2, 3))
print(sum(1, 2))

works and returns
3
6
ravi: jit.lua:0: TOINT: integer expected

But I have no idea on bytecode, I don't know how to read it.

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

If n is not specified, you will try to sum b and nil.

That was intended to show that in such cases the unoptimized ADD instruction is used.

It should show that

function sum(a: integer?, b: integer)
        return a + b
end

is not using the RAVI_ADD_II but the generic ADD.

@snoopcatt
Copy link
Author

snoopcatt commented May 20, 2021

I tried, it seems my code producing ADD_II but your code producing ADD.

ravi.jit(true)

function sum(a: integer, b: integer)
	return a+b
end

function sum2(n: integer?, a: integer, b: integer)
	return a + b + (n or 0)
end

function sum3(n, a: integer, b: integer)
        return (n and a or nil) + b
end

print('--------------------------------------------')
print [[function sum(a: integer, b: integer)
	return a+b
end]]
ravi.dumplua(sum)

print('--------------------------------------------')
print [[function sum2(n: integer?, a: integer, b: integer)
        return a + b + (n or 0)
end]]
ravi.dumplua(sum2)

print('--------------------------------------------')
print [[function sum3(n, a: integer, b: integer)
        return (n and a or nil) + b
end]]
ravi.dumplua(sum3)

result:

--------------------------------------------
function sum(a: integer, b: integer)
	return a+b
end

function <jit.lua:3,5> (5 instructions at 0x7f3742c79610)
2 params, 3 slots, 0 upvalues, 2 locals, 0 constants, 0 functions
	1	[3]	TOINT    	0
	2	[3]	TOINT    	1
	3	[4]	ADDII    	2 0 1
	4	[4]	RETURN   	2 2
	5	[5]	RETURN   	0 1
constants (0) for 0x7f3742c79610:
locals (2) for 0x7f3742c79610:
	0	a	1	6
	1	b	1	6
upvalues (0) for 0x7f3742c79610:
--------------------------------------------
function sum2(n: integer?, a: integer, b: integer)
        return a + b + (n or 0)
end

function <jit.lua:7,9> (10 instructions at 0x7f3742c793f0)
3 params, 5 slots, 0 upvalues, 3 locals, 1 constant, 0 functions
	1	[7]	TOIARRAY 	0
	2	[7]	TOINT    	1
	3	[7]	TOINT    	2
	4	[8]	ADDII    	3 1 2
	5	[8]	TESTSET  	4 0 1
	6	[8]	JMP      	0 1	; to 8
	7	[8]	LOADK    	4 -1	; 0
	8	[8]	ADDII    	3 3 4
	9	[8]	RETURN   	3 2
	10	[9]	RETURN   	0 1
constants (1) for 0x7f3742c793f0:
	1	0
locals (3) for 0x7f3742c793f0:
	0	n	1	11
	1	a	1	11
	2	b	1	11
upvalues (0) for 0x7f3742c793f0:
--------------------------------------------
function sum3(n, a: integer, b: integer)
        return (n and a or nil) + b
end

function <jit.lua:11,13> (10 instructions at 0x7f3742c7e760)
3 params, 4 slots, 0 upvalues, 3 locals, 0 constants, 0 functions
	1	[11]	TOINT    	1
	2	[11]	TOINT    	2
	3	[12]	TEST     	0 0
	4	[12]	JMP      	0 2	; to 7
	5	[12]	TESTSET  	3 1 1
	6	[12]	JMP      	0 1	; to 8
	7	[12]	LOADNIL  	3 0
	8	[12]	ADD      	3 3 2
	9	[12]	RETURN   	3 2
	10	[13]	RETURN   	0 1
constants (0) for 0x7f3742c7e760:
locals (3) for 0x7f3742c7e760:
	0	n	1	11
	1	a	1	11
	2	b	1	11
upvalues (0) for 0x7f3742c7e760:

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

All of the code seems to be correct. For the second case ADD_II can be used since n or 0 will be an integer and never nil. However, the TOIARRAY seems out of place, but this is caused by missing to add all the new opcodes to luaP_opnames.

OP_RAVI_TOINT_NIL
OP_RAVI_TOFLT_NIL
OP_RAVI_TOTAB_NIL
OP_RAVI_TOSTRING_NIL
OP_RAVI_TOBOOLEAN_NIL
OP_RAVI_TOCLOSURE_NIL
OP_RAVI_TOTYPE_NIL

are missing.
It would be interesting to see the opcodes of

function sum(a: integer?, b: integer)
        return a + b
end

I didn't apply your patch so far.

@dibyendumajumdar
Copy link
Owner

Also these opcodes might be redundant - as the the existing ones already allow NIL where applicable

@dibyendumajumdar
Copy link
Owner

Re reading bytecodes - you may want to checkout https://the-ravi-programming-language.readthedocs.io/en/latest/lua_bytecode_reference.html. It doesn't cover the extended Ravi opcodes but these are straight forward once you can follow the Lua 5.3 ones

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

The existing ones are patched to not allow nil any more. See the patch supplied in #223 (comment).

@snoopcatt
Copy link
Author

All of the code seems to be correct. For the second case ADD_II can be used since n or 0 will be an integer and never nil.
However, the TOIARRAY seems out of place, but this is caused by missing to add all the new opcodes to luaP_opnames.

yeah, missed it, thanks.

function sum(a: integer?, b: integer)
        return a + b
end

Of course it produces ADD, but as I mentioned before, code is incorrect itself.
You should avoid using add + (and concat .. for strings too) without being sure that both values are correct type. Or it'll just crash.

Another correct variant (also produces ADDII)

ravi.jit(true)

function sum(a: integer?, b: integer)
        return @integer(a) + b
end

ravi.dumplua(sum)

but it is almost the same as (a or 0).

By the way, main purpose of optional typed arguments is to ensure that external variable (if supplied) is correct type.
And if variable is not supplied, you should like always assign some default value to it.

I mean,

function hello(s: string?)
  s = s or 'world'
  return "hello, " .. s
end

Without support of optional arguments, we have to do something like that:

function hello(s)
  s = s or 'world'
  assert(type(s), 'string')
  return "hello, " .. s
end

@dibyendumajumdar
Copy link
Owner

For optional integer/number you will then need to set the value to 0 rather than NIL - and ensure type is set. Then I think it might be okay.

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

Note that

function hello(s: string?)
  s = s or 'world'
  return "hello, " .. s
end

is not optimal since s still may hold nil in the eyes of the optimizer at the point of concatenation.
Better would be

function hello(s: string?)
  local rs: string = s or 'world'
  return "hello, " .. rs
end

@snoopcatt
Copy link
Author

Also these opcodes might be redundant - as the the existing ones already allow NIL where applicable

That's what I said in original post.
Try to run this code:

function hello(s: string)
  return "hello, " .. s
end
print(hello("world")) -- works as intended
print(hello(false)) -- works as intended, throws `ravi: r.lua:0: string expected`
print(hello()) -- crash -- `ravi: r.lua:2: attempt to concatenate a nil value (local 's')`

For some reason, string type allows nil as correct argument type, that may lead to serious errors and crashes.

@dibyendumajumdar
Copy link
Owner

Well - actually a few of the types are union of type & NIL. @XmiliaH made this more explicit in the code.

#define RAVI_TM_FALSISH (RAVI_TM_NIL | RAVI_TM_FALSE)
#define RAVI_TM_TRUISH (~RAVI_TM_FALSISH)
#define RAVI_TM_BOOLEAN (RAVI_TM_FALSE | RAVI_TM_TRUE)
#define RAVI_TM_NUMBER (RAVI_TM_INTEGER | RAVI_TM_FLOAT)
#define RAVI_TM_INDEXABLE (RAVI_TM_INTEGER_ARRAY | RAVI_TM_FLOAT_ARRAY | RAVI_TM_TABLE)
#define RAVI_TM_STRING_OR_NIL (RAVI_TM_STRING | RAVI_TM_NIL)
#define RAVI_TM_FUNCTION_OR_NIL (RAVI_TM_FUNCTION | RAVI_TM_NIL)
#define RAVI_TM_BOOLEAN_OR_NIL (RAVI_TM_BOOLEAN | RAVI_TM_NIL)
#define RAVI_TM_USERDATA_OR_NIL (RAVI_TM_USERDATA | RAVI_TM_NIL)
#define RAVI_TM_ANY (~0)

typedef enum {
  RAVI_TNIL = RAVI_TM_NIL,        /* NIL */
  RAVI_TNUMINT = RAVI_TM_INTEGER, /* integer number */
  RAVI_TNUMFLT = RAVI_TM_FLOAT,   /* floating point number */
  RAVI_TNUMBER = RAVI_TM_NUMBER,
  RAVI_TARRAYINT = RAVI_TM_INTEGER_ARRAY,   /* array of ints */
  RAVI_TARRAYFLT = RAVI_TM_FLOAT_ARRAY,     /* array of doubles */
  RAVI_TTABLE = RAVI_TM_TABLE,              /* Lua table */
  RAVI_TSTRING = RAVI_TM_STRING_OR_NIL,     /* string */
  RAVI_TFUNCTION = RAVI_TM_FUNCTION_OR_NIL, /* Lua or C Function */
  RAVI_TBOOLEAN = RAVI_TM_BOOLEAN_OR_NIL,   /* boolean */
  RAVI_TTRUE = RAVI_TM_TRUE,
  RAVI_TFALSE = RAVI_TM_FALSE,
  RAVI_TUSERDATA = RAVI_TM_USERDATA_OR_NIL, /* userdata or lightuserdata */
  RAVI_TANY = RAVI_TM_ANY,                  /* Lua dynamic type */
} ravitype_t;

@snoopcatt
Copy link
Author

For optional integer/number you will then need to set the value to 0 rather than NIL - and ensure type is set. Then I think it might be okay.

Also a solution. You won't be able to set default variables with n = n or 1024, you have to use n = (n ~= 0) and n or 1024.
But the problem is, that 0 is also valid number, and we may expect it sometimes as correct value.

@dibyendumajumdar
Copy link
Owner

I had a todo #172
Because one option is TOINT/TOFLT should convert to integer/float where possible (maybe NIL to 0)

@snoopcatt
Copy link
Author

Well - actually a few of the types are union of type & NIL. @XmiliaH made this more explicit in the code.

#define RAVI_TM_FALSISH (RAVI_TM_NIL | RAVI_TM_FALSE)
#define RAVI_TM_TRUISH (~RAVI_TM_FALSISH)
#define RAVI_TM_BOOLEAN (RAVI_TM_FALSE | RAVI_TM_TRUE)
#define RAVI_TM_NUMBER (RAVI_TM_INTEGER | RAVI_TM_FLOAT)
#define RAVI_TM_INDEXABLE (RAVI_TM_INTEGER_ARRAY | RAVI_TM_FLOAT_ARRAY | RAVI_TM_TABLE)
#define RAVI_TM_STRING_OR_NIL (RAVI_TM_STRING | RAVI_TM_NIL)
#define RAVI_TM_FUNCTION_OR_NIL (RAVI_TM_FUNCTION | RAVI_TM_NIL)
#define RAVI_TM_BOOLEAN_OR_NIL (RAVI_TM_BOOLEAN | RAVI_TM_NIL)
#define RAVI_TM_USERDATA_OR_NIL (RAVI_TM_USERDATA | RAVI_TM_NIL)
#define RAVI_TM_ANY (~0)

typedef enum {
  RAVI_TNIL = RAVI_TM_NIL,        /* NIL */
  RAVI_TNUMINT = RAVI_TM_INTEGER, /* integer number */
  RAVI_TNUMFLT = RAVI_TM_FLOAT,   /* floating point number */
  RAVI_TNUMBER = RAVI_TM_NUMBER,
  RAVI_TARRAYINT = RAVI_TM_INTEGER_ARRAY,   /* array of ints */
  RAVI_TARRAYFLT = RAVI_TM_FLOAT_ARRAY,     /* array of doubles */
  RAVI_TTABLE = RAVI_TM_TABLE,              /* Lua table */
  RAVI_TSTRING = RAVI_TM_STRING_OR_NIL,     /* string */
  RAVI_TFUNCTION = RAVI_TM_FUNCTION_OR_NIL, /* Lua or C Function */
  RAVI_TBOOLEAN = RAVI_TM_BOOLEAN_OR_NIL,   /* boolean */
  RAVI_TTRUE = RAVI_TM_TRUE,
  RAVI_TFALSE = RAVI_TM_FALSE,
  RAVI_TUSERDATA = RAVI_TM_USERDATA_OR_NIL, /* userdata or lightuserdata */
  RAVI_TANY = RAVI_TM_ANY,                  /* Lua dynamic type */
} ravitype_t;

I also confused with that.
RAVI_TNUMBER is RAVI_TM_NUMBER, but RAVI_TFUNCTION is RAVI_TM_FUNCTION_OR_NIL.
We may expect closure as input value in some function, but still may get nil instead.
So we need to perform additional checks, despite that we designated expected type in arguments list.

@dibyendumajumdar
Copy link
Owner

We can tighten the check for annotated types to disallow NIL. I didn't do it originally as I took the view that NIL is a valid value for these (i.e. absent)

@snoopcatt
Copy link
Author

Also, compromise solution may be to implement union types.
It is already implemented on C-side, we just need to implement Lua interface to create custom unions.

They we may explicitly write

function sum(a: integer, b: integer|nil)
        return @integer(a) + b
end

when needed (and when we know what we doing).

And also If it's possible not for a big cost for parser, maybe add possibility to specify literals directly, i.e.

function sum(a: integer, b: integer|666)
        return a + b
end
function hello(s: string|"World")
       return 'hello ' .. s
end 

I'm not sure that it is possible without lot of workarounds, but I might be wrong.

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

@snoopcatt maybe change the type parsing to:

if (strcmp(str, "integer") == 0)
  tm = RAVI_TM_INTEGER;
else if (strcmp(str, "number") == 0)
  tm = RAVI_TM_FLOAT;
else if (strcmp(str, "closure") == 0)
  tm = RAVI_TM_FUNCTION_OR_NIL;
else if (strcmp(str, "table") == 0)
  tm = RAVI_TM_TABLE;
else if (strcmp(str, "string") == 0)
  tm = RAVI_TM_STRING_OR_NIL;
else if (strcmp(str, "boolean") == 0)
  tm = RAVI_TM_BOOLEAN_OR_NIL;
else if (strcmp(str, "any") == 0)
  tm = RAVI_TM_ANY;
else {
  /* default is a userdata type */
  tm = RAVI_TM_USERDATA_OR_NIL;
  typename = user_defined_type_name(ls, typename);
  str = getstr(typename);
  *pusertype = typename;
}
if (tm == RAVI_TM_FLOAT || tm == RAVI_TM_INTEGER) {
  /* if we see [] then it is an array type */
  if (testnext(ls, '[')) {
	checknext(ls, ']');
	tm = (tm == RAVI_TM_FLOAT) ? RAVI_TM_FLOAT_ARRAY : RAVI_TM_INTEGER_ARRAY;
  }
}
if (testnext(ls, '?')) {
  tm |= RAVI_TM_NIL;
} else if (testnext(ls, '!')) {
  tm &= ~RAVI_TM_NIL;
}

this would still have the old behavior to not break compatibility but allow to remove the nil type with !.

@snoopcatt
Copy link
Author

It should mean default value, if none specified.

The I would recommend

function hello(s: string = "world")
   return 'hello ' .. s
end

All we need is to enable types annotation inside tables the same way as for local variables: t = {a: integer = 20}.
Then we will be able to simply call hello{s: string = "World"}.

Looks like simple and elegant solution, but I have no idea what about technical side and why only local variables can be annotated for now

@dibyendumajumdar
Copy link
Owner

dibyendumajumdar commented May 20, 2021

A sophisticated type system is difficult to do in the Lua parser which is designed for memory efficiency/speed. I haven't been able to figure out a way to do more sophisticated type system without breaking with the spirit of Lua.

Totally agree.
I personally like Typedlua, it looks pretty and fully follows the Lua spirit:

I would say it does not - depending on what we mean by Lua spirit. I am using the term as it is used by C standard - spirit of C.

Lua's implementation is carefully designed to avoid heap allocation in the parser. It uses recursion and stack to create temp expression nodes.

Typed Lua is an add-on - so immediately add a lot of overhead and breaks the spirit of Lua in that sense.

You can look at new project https://github.com/teal-language/tl or Pallene - they may be what you are looking for.

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

Looks like simple and elegant solution, but I have no idea what about technical side and why only local variables can be annotated for now

Annotating tables is quite hard and checking it at runtime too.

@snoopcatt
Copy link
Author

snoopcatt commented May 20, 2021

I would say it does not - depending on what we mean by Lua spirit. I am using the term as it is used by C standard - spirit of C.

I meant only how it looks for the "end customer". Simple and elegant.
Of course implementation looks like a hell, that's why I don't use it. Same for Teal.

@snoopcatt
Copy link
Author

Annotating tables is quite hard and checking it at runtime too.

local a: integer = 1
local b: string = 'hello'

local t = {a=a, b=b}

why can't we say local t = {a: integer=1, b: string = 'hello'}?
it shouldn't be much more expensive

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

First: following the current type system it should be more like:

local a: integer = 1
local b: string = 'hello'
local t: {a: integer, b: string} = {a = 1, b = 'hello'}

Second: every access would need to check the types since you could pass it to a function with violates this constraints and writes an string to the key a.
Third: a function parameter annotated with such a table expression would require to check every key to be valid. And you could have something like:

function f(t: {a: {b: {c: {d: {e: integer}}}}})
   return t
end

@snoopcatt
Copy link
Author

First: following the current type system it should be more like:

local a: integer = 1
local b: string = 'hello'
local t: {a: integer, b: string} = {a = 1, b = 'hello'}

Second: every access would need to check the types since you could pass it to a function with violates this constraints and writes an string to the key a.
Third: a function parameter annotated with such a table expression would require to check every key to be valid. And you could have something like:

function f(t: {a: {b: {c: {d: {e: integer}}}}})
   return t
end

I mean, why it is possible to annotate only local variables?
Why can't we add do the same for "global" variables by its reference?

Or it is Lua design limitations?

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

Globales are stored in the global table accessible with _G. If you could annotate the global variables how should this work over multiple ravi/lua files and direct access to the table.
If you could annotate globales what would happen and when in the following example:

global a: integer; -- just assuming to annotate the global `a`
_G.a = "not an integer"
print(a)

should _G.a throw since the key a is annotated as integer or should print(a) throw since a is not an integer?
In either of the cases additional runtime checks are required and no compile time errors could be generated. There is not much to gain from these annotations.

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

@snoopcatt I just noticed that you forgot to change the JIT compiler in src/ravi_jitshared.c to handle your added and the changed opcodes correctly.

@snoopcatt
Copy link
Author

should _G.a throw

this, same as for local variables:

➜  ~ cat t.lua 
local a: integer = 1
local b: string = 'hello'
a = "not integer"

ravi: t.lua:4: Invalid assignment: integer expected near <eof>

In either of the cases additional runtime checks are required and no compile time errors could be generated.
There is not much to gain from these annotations.

yes, but we already have some runtime checks, on function calls for example. that's what about my question about cost was.

I didn't dive very deep into Lua, I only know basics, i.e. how to create C module etc.
so I don't know much how process of re-assigning value is going at low-level.

I thought it is possible to bring some runtime checks at for example

static void assignment (LexState *ls, struct LHS_assign *lh, int nvars) 

to ensure new value is still correct type for that variable (we storing type of variable somewhere, aren't we?)

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

(we storing type of variable somewhere, aren't we?)

The type for local variables is stored, however, the types of table values are not.

@snoopcatt
Copy link
Author

snoopcatt commented May 20, 2021

The type for local variables is stored, however, the types of table values are not.

No more questions ☺

@snoopcatt I just noticed that you forgot to change the JIT compiler in src/ravi_jitshared.c to handle your added and the changed opcodes correctly.

Thanks.
Just noticed that in jit_shared.c in TO_STRING opcode nil does not allowed:

static void emit_op_tostring(struct function *fn, int A, int pc) {
  (void)pc;
  emit_reg(fn, "ra", A);
  membuff_add_string(&fn->body, "if (!ttisstring(ra)) {\n");
#if GOTO_ON_ERROR
  membuff_add_fstring(&fn->body, " error_code = %d;\n", Error_string_expected);
  membuff_add_string(&fn->body, " goto Lraise_error;\n");
#else
  membuff_add_fstring(&fn->body, " raviV_raise_error(L, %d);\n", Error_string_expected);
#endif
  membuff_add_string(&fn->body, "}\n");
}

in oppose to lvm.c:

vmcase(OP_RAVI_TOSTRING) {
    if (!ttisnil(ra) && RAVI_UNLIKELY(!ttisstring(ra)))
        luaG_runerror(L, "string expected");        
    vmbreak;
}

Wut? Why? Why even if JIT enabled --

  1. on "patched" Ravi my tests working (even with ravi_jitshared.c unchanged)
  2. on vanilla Ravi function(s: string) behaviour unchanged and allows to pass nil?

I think I'm totally lost here

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

The same behavior can be seen for closure. I did open a separat issue #225 for this.

@snoopcatt
Copy link
Author

Oh, I'm not (fully) lost my mind diving into it, it just a bug.

this would still have the old behavior to not break compatibility but allow to remove the nil type with !.

So, now we have to change behaviour of that anyway.

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

In that case I would use

if (strcmp(str, "integer") == 0)
  tm = RAVI_TM_INTEGER;
else if (strcmp(str, "number") == 0)
  tm = RAVI_TM_FLOAT;
else if (strcmp(str, "closure") == 0)
  tm = RAVI_TM_FUNCTION;
else if (strcmp(str, "table") == 0)
  tm = RAVI_TM_TABLE;
else if (strcmp(str, "string") == 0)
  tm = RAVI_TM_STRING;
else if (strcmp(str, "boolean") == 0)
  tm = RAVI_TM_BOOLEAN;
else if (strcmp(str, "any") == 0)
  tm = RAVI_TM_ANY;
else {
  /* default is a userdata type */
  tm = RAVI_TM_USERDATA;
  typename = user_defined_type_name(ls, typename);
  str = getstr(typename);
  *pusertype = typename;
}
if (tm == RAVI_TM_FLOAT || tm == RAVI_TM_INTEGER) {
  /* if we see [] then it is an array type */
  if (testnext(ls, '[')) {
	checknext(ls, ']');
	tm = (tm == RAVI_TM_FLOAT) ? RAVI_TM_FLOAT_ARRAY : RAVI_TM_INTEGER_ARRAY;
  }
}
if (testnext(ls, '?')) {
  tm |= RAVI_TM_NIL;
}

This allows a consistent use of ?. However, this would likely require the type checks for userdata and arrays to be adjusted too.

@snoopcatt
Copy link
Author

snoopcatt commented May 20, 2021

I'm afraid it won't work.
Because we also need to adjust type checks in lvm.c and ravi_jitshared.c, and functions there must know some way, is nil acceptable or should we throw an error.

I've done it by just copy-pasting opcodes and appending _NIL to them, it works, but it's dirty solution, I'm sure there are more graceful way to do that.

I've imported my working repos to sh*thub (my god, it still can't mirror git repos, in 2021 year)
Its here, at least you may see what files are changed.

@dibyendumajumdar
Copy link
Owner

dibyendumajumdar commented May 20, 2021

We could enhance the existing bytecode with a parameter as it doesn't use them I think

Each bytecode can take upto 3 params A,B,C. The TO* bytecodes just use A I think

@dibyendumajumdar
Copy link
Owner

dibyendumajumdar commented May 20, 2021

But there is more complication because we have type checks that don't use TO* bytecodes. These need to work too .

I mean the checks at compile time and also for upvalue assigments

@XmiliaH
Copy link
Contributor

XmiliaH commented May 20, 2021

I would propose changing OP_RAVI_TOxxx to R(A) := toxxx(R(B), C) where C denotes if this can or can't be nil and remove the OP_RAVI_MOVE in favor of those. Furthermore it would be nice to have opcodes to check it a value is int|float a la of a union type.

@dibyendumajumdar
Copy link
Owner

dibyendumajumdar commented May 20, 2021

@snoopcatt Re the performance of Ravi vs Lua53 vs LuaJIT;

In interpreted code, Ravi should be better than Lua 5.3 but worse than Lua 5.4 for regular Lua code.
For type annotated code where there are number/integer values and integer/number arrays, performance should be better than Lua.

JIT only makes sense for type annotated code - as otherwise the small JIT engine cannot sufficiently optimize code, and more time is spent compiling the code too, depending on whether you are reusing functions or not.

LuaJIT is much better in the general case as it runs on any Lua code and can perform well. LuaJIT also has negligible time spent compiling code.

I am interested to know how you got your figures if you are able to share.

@dibyendumajumdar
Copy link
Owner

dibyendumajumdar commented May 20, 2021

Re roadmap/plan - here are some things I would like to do:

  • Specialize function bytecodes automatically. Most function calls tend to have same argument types and return types from one call to another. It should be possible to generate type specialized bytecode once we know the starting types of a function and the return types of called functions.
  • Separate optimizing compiler that uses C stack for integer/number values and also is capable of AOT compilation
  • Implement inlining of small functions
  • A simple to use class mechanism so that users don't need to do the boilerplate stuff.

Some harder problems:

  • How to optimize table access in the general case. We ideally want t.name type access to be super fast.

@dibyendumajumdar
Copy link
Owner

dibyendumajumdar commented May 20, 2021

Why MIR?

I implemented multiple JIT backends, including LLVM, libgccjit, NanoJIT, Eclipse OMR.
I also looked at others.

For something that is like Lua, it is necessary to have a small JIT engine.

MIR is the only engine that compiles in 2 secs, is small like Lua, and yet has an optimizer that can achieve good performance if the code is generated carefully.

@dibyendumajumdar
Copy link
Owner

dibyendumajumdar commented May 20, 2021

Finally re the changes we have been discussing - please allow me some time to look at PRs as I don't have time to work on Ravi except during weekends (and that too it has to compete with other hobby projects).

I should add that I appreciate the discussion and the contributions. It will be made easier for me if you separate out each feature when submitting PRs.

@snoopcatt
Copy link
Author

snoopcatt commented May 20, 2021

I am interested to know how you got your figures if you are able to share.

of course: listener.lua

deps:

results:

@snoopcatt
Copy link
Author

Finally re the changes we have been discussing - please allow me some time to look at PRs as I don't have time to work on Ravi except during weekends (and that too it has to compete with other hobby projects).

Of course. I'm not ready to submit PRs for such fundamentl things, I just shown you proof-of-concept.
It's better if someone who more familiar with this project will implement that.

I've sent some PRs with small changes that most likely they won't break anything (of course separated from each other)

By the way, if it `s not a secret, why u refused paid working on some features? 👀

@dibyendumajumdar
Copy link
Owner

Thank for sharing your test case. I will have a look and let you know what I think

@dibyendumajumdar
Copy link
Owner

By the way, if it `s not a secret, why u refused paid working on some features? 👀

Ravi is a hobby project that I work on for my own pleasure. It is not a job I do.

@snoopcatt
Copy link
Author

snoopcatt commented May 20, 2021

Ravi is a hobby project that I work on for my own pleasure. It is not a job I do.

That's nothing about a job. Nothing about deadlines and tasks.
It's about doing what you love and receive some thanks from someone that appreciates your work ☺


So, I've opened issues for each thing separately: #228 #229 (and #226 #227 also)

If these proposals are OK to you — please, add [enhancement] tags to issues and we may start experimenting around implementations.

I won't try to open PRs myself, I think such core things must be done by someone who knows Ravi architecture really deep.

But if someone wants to — not only you, maybe @XmiliaH (or anyone else) -- I really would like to donate some coins for quality code that will be merged to upstream ☺


Thank you all for the discussion!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants