12c Passing by reference #24

Merged
merged 16 commits into from
Apr 22, 2022

ngjunsiang commented Apr 21, 2022

9608 pseudocode recognises the notion of passing-by-value vs passing-by-reference.

Procedures are able to access all variables in the global frame. But reassigning variables is another matter.

  • Passing-by-reference means that any reassignments to a variable in the local frame affect the global frame as well.
  • Passing-by-value means that reassignments to a variable in the local frame do not affect the global frame. These reassignments are therefore lost when the procedure exits and the local frame is no longer used (since 9608 pseudocode does not support closures).
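
A minimal Python sketch of the difference, assuming a frame is just a dict mapping names to values. The names `call_by_value` and `call_by_reference` are illustrative and do not come from the pseudo codebase:

```python
def call_by_value(global_frame, name, new_value):
    # The local frame is a copy, so the reassignment stays local.
    local_frame = dict(global_frame)
    local_frame[name] = new_value
    return local_frame  # dropped by the caller once the procedure exits


def call_by_reference(global_frame, name, new_value):
    # The local frame is the same dict object, so the reassignment
    # is visible in the global frame too.
    local_frame = global_frame
    local_frame[name] = new_value
    return local_frame


global_frame = {"X": 1}

call_by_value(global_frame, "X", 99)
after_by_value = global_frame["X"]      # global X is untouched

call_by_reference(global_frame, "X", 99)
after_by_reference = global_frame["X"]  # global X was reassigned
```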

ngjunsiang marked this pull request as draft April 21, 2022 10:29
@ngjunsiang

Using a local frame for CALL execution

To use a local frame, we are going to need to create one when we execute the call:
[c2319a7]

https://github.com/nyjc-computing/pseudo/blob/c2319a7f5477a4d8eadbe40e504d8334be51b6a2/interpreter.py#L69-L73
First, we create a local frame (just like the global one). We make a copy of each variable already in the global frame (this is pass-by-value).

https://github.com/nyjc-computing/pseudo/blob/c2319a7f5477a4d8eadbe40e504d8334be51b6a2/interpreter.py#L76-L79
Now we assign the args to the appropriate named vars, according to params order. Notice that when we evaluate the arg, we still do so with the global frame, since that is where the function is called and the args are passed.

https://github.com/nyjc-computing/pseudo/blob/c2319a7f5477a4d8eadbe40e504d8334be51b6a2/interpreter.py#L81
Finally, we execute the CALL statement with the local frame.
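
The three steps above can be sketched roughly as follows. This is not the code at the linked commit: `evaluate`, `exec_call`, the slot layout and the use of Python callables as statements are all assumptions made to keep the example self-contained.

```python
def evaluate(expr, frame):
    # Hypothetical evaluator: a str names a variable, anything else is a literal.
    if isinstance(expr, str):
        return frame[expr]["value"]
    return expr


def exec_call(global_frame, proc, args):
    # 1. Create a local frame holding a copy of every global variable.
    local = {name: dict(slot) for name, slot in global_frame.items()}
    # 2. Bind args to params in order. Args are evaluated in the GLOBAL
    #    frame, because that is where the call is made.
    for name, arg in zip(proc["params"], args):
        local[name] = {"value": evaluate(arg, global_frame)}
    # 3. Run the procedure body against the local frame only.
    for stmt in proc["stmts"]:
        stmt(local)
    return local


def increment_a(frame):
    # Stands in for the pseudocode statement: A <- N + 1
    frame["A"]["value"] = frame["N"]["value"] + 1


global_frame = {"A": {"value": 5}}
proc = {"params": ["N"], "stmts": [increment_a]}
local = exec_call(global_frame, proc, ["A"])
```

Because the body only ever sees copies, the reassignment of A is lost when the call returns, which is the pass-by-value behaviour described above.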

ngjunsiang commented Apr 22, 2022

Designating pass-by-value or pass-by-reference

While 9608 pseudocode assumes pass-by-value as the default mode of passing arguments, it also supports passing-by-reference:

To specify whether a parameter is passed by value or by reference, the keywords BYVALUE and BYREF
precede the parameter in the definition of the procedure. If there are several parameters, they should all be
passed by the same method and the BYVALUE or BYREF keyword need not be repeated.

Example – passing parameters by reference

PROCEDURE SWAP(BYREF X : INTEGER, Y : INTEGER)
    Temp <- X
    X <- Y
    Y <- Temp
ENDPROCEDURE 

If the method for passing parameters is not specified, passing by value is assumed.

The use of BYVALUE and BYREF is now recognised by the parser when used in a procedure declaration.
@ngjunsiang

Parsing BYVALUE and BYREF

We start with recognising these two new keywords in the parser. We initialise passby with a default BYVALUE token, then override it if the parser matches an appropriate token after the opening (:
https://github.com/nyjc-computing/pseudo/blob/7e828efb545e2742fc05dbd5e17fb168123d5953/parser.py#L297-L300
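
A rough sketch of that default-then-override pattern, assuming tokens are dicts with 'type', 'word' and 'value' keys as in the output printed later in this thread. `tok` and `parse_passby` are hypothetical helpers, not the functions in parser.py:

```python
def tok(word, type_="keyword"):
    # Hypothetical token constructor.
    return {"type": type_, "word": word, "value": None}


def parse_passby(tokens, pos):
    """Return (passby_token, new_pos); pos points just after the opening '('."""
    passby = tok("BYVALUE")  # default when no keyword is given
    if pos < len(tokens) and tokens[pos]["word"] in ("BYVALUE", "BYREF"):
        passby = tokens[pos]  # an explicit keyword overrides the default
        pos += 1              # advance past the keyword exactly once
    return passby, pos


explicit = [tok("BYREF"), tok("X", "name")]
implicit = [tok("X", "name")]
explicit_result = parse_passby(explicit, 0)
implicit_result = parse_passby(implicit, 0)
```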

Our stmt dict now gains a new key:
https://github.com/nyjc-computing/pseudo/blob/7e828efb545e2742fc05dbd5e17fb168123d5953/parser.py#L319-L325

Add a declare() parser for parameter declarations. declareStmt() is also refactored to use this expression parser.

ngjunsiang commented Apr 22, 2022

Implementing pass-by-value

Our current resolver code certainly isn't using pass-by-value; it's still verify()ing the procedure's stmts with the global frame 😅:
https://github.com/nyjc-computing/pseudo/blob/7e828efb545e2742fc05dbd5e17fb168123d5953/resolver.py#L86-L88

Should we create a local frame here? How would we pass that to the interpreter?

Each time the procedure is called, we will need:

  • BYREF (global) args to be copied to local (unless they are already there)
  • BYVALUE args to be declared in local
  • params to be assigned from args

The resolver feels like a more appropriate place to set up the local frame then. But first we'll need to do some tweaking to the procedure parser, to make declaration of BYVALUE args easier.

Handling parameter declarations

For starters, notice how similar parameter declarations are to variable declarations. We have a name, followed by :, followed by a type keyword. The key difference is that variable declarations are called through a DECLARE statement, with a line break expected at the end, while parameter declarations are an expression within a PROCEDURE statement.

Let's set up a declare() expression parser for parameter declarations, and use it inside declareStmt() to minimise code duplication:
https://github.com/nyjc-computing/pseudo/blob/f52c24235e2d71bf047ef3a718c3cc7862f96c65/parser.py#L136-L146

https://github.com/nyjc-computing/pseudo/blob/f52c24235e2d71bf047ef3a718c3cc7862f96c65/parser.py#L148-L156

Since declarations need to be type-checked, we might as well do it in declare(). Then we can take it out of procedureStmt().
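
As a simplified illustration of sharing one expression parser between the two callers, here is a sketch that treats tokens as plain strings. The `TYPES` tuple, the function names and the error messages are assumptions, not the code at the linked lines:

```python
TYPES = ("INTEGER", "STRING")  # assumed subset of the supported type keywords


def declare(tokens, pos):
    """Parse '<name> : <type>' and return (var, new_pos)."""
    name = tokens[pos]
    if tokens[pos + 1] != ":":
        raise ValueError(f"Expected ':' after {name}")
    type_ = tokens[pos + 2]
    if type_ not in TYPES:  # type-check once, here, for every caller
        raise ValueError(f"Invalid type {type_}")
    return {"name": name, "type": type_}, pos + 3


def declare_stmt(tokens, pos):
    """Parse 'DECLARE <name> : <type>' followed by a line break."""
    if tokens[pos] != "DECLARE":
        raise ValueError("Expected DECLARE")
    var, pos = declare(tokens, pos + 1)  # reuse the expression parser
    if tokens[pos] != "\n":
        raise ValueError("Expected line break after declaration")
    return {"rule": "declare", **var}, pos + 1


param, _ = declare(["X", ":", "INTEGER"], 0)
stmt, _ = declare_stmt(["DECLARE", "Y", ":", "STRING", "\n"], 0)
```

A procedure parser can then call `declare()` once per parameter, without expecting a line break after each one.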

BREAKING CHANGE:
procedureStmt() now stores params as a list instead of a dict.

This will require changes in resolver and interpreter.

ngjunsiang commented Apr 22, 2022

Let's refactor procedureStmt() to use our handy dandy new declare() expression parser:

https://github.com/nyjc-computing/pseudo/blob/090821db3047d0dcc471c57bf5de6014edd4d74e/parser.py#L304-L316

Here we fix an early design mistake: since we now have var dicts holding the name and type, let's store params in a list instead of a dict. Lists are much easier to iterate through.

BREAKING CHANGE:
procedure resolving uses a local frame instead of the global frame.
This will affect interpreter execution of procedure statements.
@ngjunsiang

Resolving BYREF parameters

First we declare our params as local vars. This is now made easier with our var expression (which has a 'name' and 'type' key), since we can draw on our earlier work in verifyDeclare():
https://github.com/nyjc-computing/pseudo/blob/63b3d184d9b3a1e213bd26d877132a741133a059/resolver.py#L86-L91

If the procedure is declared with BYREF, reassignments of the parameters cause the global variable to change as well. So we need to type-check BYREF parameters:
https://github.com/nyjc-computing/pseudo/blob/63b3d184d9b3a1e213bd26d877132a741133a059/resolver.py#L92-L101

We must not forget to verify the procedure's own statements, but now we pass the local frame instead of the global frame:
https://github.com/nyjc-computing/pseudo/blob/63b3d184d9b3a1e213bd26d877132a741133a059/resolver.py#L102-L104

Update verifyCall() to follow the new format for procedure statements, which uses list instead of dict for params
@ngjunsiang

Our CALL resolver needs updating as well, to handle params as a list instead of a dict:
https://github.com/nyjc-computing/pseudo/blob/c9e3d8d37e9f2cd768f665150ebedcc1b6940127/resolver.py#L128-L135
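
With params stored as a list, checking a call reduces to walking params and args in step. The sketch below assumes each param is a dict with plain-string 'name' and 'type' values and that arg types are already resolved to strings; `verify_call` here is illustrative and is not the resolver's actual verifyCall():

```python
def verify_call(proc, arg_types):
    """Check arg count and types against the procedure's params list."""
    params = proc["params"]
    if len(arg_types) != len(params):
        raise ValueError(
            f"Expected {len(params)} args, got {len(arg_types)}"
        )
    # zip() pairs each param with the arg in the same position.
    for param, arg_type in zip(params, arg_types):
        if param["type"] != arg_type:
            raise ValueError(
                f"Expect {param['type']} for {param['name']}, got {arg_type}"
            )
    return True


proc = {"params": [{"name": "X", "type": "INTEGER"},
                   {"name": "Y", "type": "INTEGER"}]}
ok = verify_call(proc, ["INTEGER", "INTEGER"])
```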

Since the frame is already evaluated to return proc, there is no more need to retrieve proc values through the 'value' key
@ngjunsiang

Rewind

Oops ... here we realise we won't be able to update BYREF variables; if they are changed in local, how would we know, and how would we change them back in global? Time to rethink our approach from #24 (comment).

Recall that

if there are several parameters, they should all be passed by the same method and the BYVALUE or BYREF keyword need not be repeated.

That suggests a different approach to local variable declaration. Recall that when we declare variables, we add a dict to the frame:
https://github.com/nyjc-computing/pseudo/blob/72badaaa5b315febaa8dd22277513a5c3d01a4dd/resolver.py#L48-L51

For lack of a better name, let's call this dict a slot. So the frame is a dict with the name as key, and a slot as value.

When we initialise variables in the procedure's local frame, we have a choice:

  • Create a new slot in local if the variables are pass-by-value
  • Use the slot from global frame if the variables are pass-by-reference

Let's see what that looks like.
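
In plain Python terms, and assuming a slot is a dict with 'type' and 'value' keys, the two choices behave like this (the variable names are made up for the illustration):

```python
frame_global = {"X": {"type": "INTEGER", "value": 1}}

# Pass-by-value: the local frame gets a brand-new slot.
byvalue_local = {"X": {"type": "INTEGER", "value": None}}

# Pass-by-reference: the local frame points at the global slot itself.
byref_local = {"X": frame_global["X"]}

byvalue_local["X"]["value"] = 10
value_after_byvalue = frame_global["X"]["value"]  # global slot unaffected

byref_local["X"]["value"] = 20
value_after_byref = frame_global["X"]["value"]    # global slot updated
```

Since both frames hold the same slot object in the BYREF case, nothing has to be copied back to global when the procedure exits.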

Instead of declaring new variables in local all the time,
BYREF variables will refer to the global variable.
@ngjunsiang

Resolving BYREF parameters, redux

We still have a local frame:
https://github.com/nyjc-computing/pseudo/blob/d8167c1b0c0817e638052beaec44a1fd5f3a4869/resolver.py#L86-L88

Whether for BYVALUE or BYREF variables, we have to loop through params:
https://github.com/nyjc-computing/pseudo/blob/d8167c1b0c0817e638052beaec44a1fd5f3a4869/resolver.py#L89-L93

For BYVALUE variables, we just declare local variables with new slots like we did in the previous version.
For BYREF variables, instead of declaring new slots, we reuse the slot from the global frame:
https://github.com/nyjc-computing/pseudo/blob/d8167c1b0c0817e638052beaec44a1fd5f3a4869/resolver.py#L94-L95

We do the usual type-checking (param type should match global var type) ...
https://github.com/nyjc-computing/pseudo/blob/d8167c1b0c0817e638052beaec44a1fd5f3a4869/resolver.py#L96-L101

Before we reference the global frame slot in the local frame:
https://github.com/nyjc-computing/pseudo/blob/d8167c1b0c0817e638052beaec44a1fd5f3a4869/resolver.py#L102-L103
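
Put together, the loop might look something like the sketch below. It assumes param names and types have already been resolved to plain strings and that a BYREF param must share its name with an existing global variable; `resolve_params` is a hypothetical name, not the function in resolver.py:

```python
def resolve_params(global_frame, params, passby):
    """Build a procedure's local frame from its params list."""
    local = {}
    for var in params:
        name, type_ = var["name"], var["type"]
        if passby == "BYVALUE":
            # New slot, independent of anything in global.
            local[name] = {"type": type_, "value": None}
            continue
        # BYREF: the global variable must exist and have a matching type.
        if name not in global_frame:
            raise NameError(f"No global variable {name} for BYREF")
        slot = global_frame[name]
        if slot["type"] != type_:
            raise TypeError(
                f"Expect {slot['type']} for BYREF {name}, got {type_}"
            )
        local[name] = slot  # reuse the global slot, do not copy it
    return local


global_frame = {"Person": {"type": "STRING", "value": None}}
params = [{"name": "Person", "type": "STRING"}]
byref_frame = resolve_params(global_frame, params, "BYREF")
byvalue_frame = resolve_params(global_frame, params, "BYVALUE")
```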

And the rest of the code is the same. With one small embarrassing fix: we were checking the passby token instead of the word:
[16ce3cd]

@ngjunsiang

Testing

Code:

DECLARE Person : STRING
PROCEDURE SayHi(BYREF Person : STRING)
    OUTPUT "Hi, ", Person, "!"
ENDPROCEDURE
CALL SayHi("John")

Result:

Expected variable name, got ':'

That's because of these two lines:
https://github.com/nyjc-computing/pseudo/blob/16ce3cdae838f59b992b743d27ded30da8723c11/parser.py#L309-L310

match() already consumes a token, and consume() consumes another one. Fix: [9d051a9]
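
The double consumption is easy to reproduce with a toy token cursor. The `Cursor` class and both `passby_*` functions below are hypothetical stand-ins using plain-string tokens; the fixed version shows one possible correction, not necessarily the one made in the linked commit:

```python
class Cursor:
    """Toy token stream over a list of plain-string tokens."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def check(self):
        return self.tokens[self.pos]

    def consume(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def match(self, *words):
        # Advances past the token when it matches.
        if self.check() in words:
            self.consume()
            return True
        return False


def passby_buggy(cur):
    passby = "BYVALUE"
    if cur.match("BYVALUE", "BYREF"):
        passby = cur.consume()  # bug: match() already advanced, so this eats the name
    return passby


def passby_fixed(cur):
    if cur.check() in ("BYVALUE", "BYREF"):
        return cur.consume()    # look first, then consume exactly once
    return "BYVALUE"


buggy = Cursor(["BYREF", "Person", ":", "STRING"])
buggy_passby = passby_buggy(buggy)
fixed = Cursor(["BYREF", "Person", ":", "STRING"])
fixed_passby = passby_fixed(fixed)
```

In the buggy version the cursor is left pointing at ':' where a variable name is expected, which matches the error message above.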

Result:

Expect STRING for BYREF Person, got {'type': 'name', 'word': 'STRING', 'value': None}

That's because each var in stmt['params'] still contains the type token, which needs to be resolve()d to get the type string. (Confused? Me too. This is screaming for a refactor. But not yet.) The fix: [20ff659]

Result:

{ 'Person': {'type': 'STRING', 'value': 'John'},
  'SayHi': { 'type': 'procedure',
             'value': { 'frame': { 'Person': { 'type': 'STRING',
                                               'value': 'John'}},
                        'params': [ { 'name': { 'type': 'name',
                                                'value': None,
                                                'word': 'Person'},
                                      'type': { 'type': 'name',
                                                'value': None,
                                                'word': 'STRING'}}],
                        'passby': 'BYREF',
                        'stmts': [ { 'exprs': [ { 'type': 'string',
                                                  'value': 'Hi, ',
                                                  'word': '"Hi, "'},
                                                { 'left': { 'Person': { 'type': 'STRING',
                                                                        'value': 'John'}},
                                                  'oper': { 'type': 'symbol',
                                                            'value': <function get at 0x7f257f47d0d0>,
                                                            'word': ''},
                                                  'right': { 'type': 'name',
                                                             'value': None,
                                                             'word': 'Person'}},
                                                { 'type': 'string',
                                                  'value': '!',
                                                  'word': '"!"'}],
                                     'rule': 'output'}]}}}

@ngjunsiang

Interpreting BYREF calls

With our local frame now set up correctly, we can use it directly from the procedure stored in the global frame:
https://github.com/nyjc-computing/pseudo/blob/213ff6c05c64bcd36e55abee2576c84ace18ff91/interpreter.py#L79-L85

This way, BYREF variables will be accessed directly from the global variable slot, while BYVALUE variables will be ... stored in the same local frame?! That sounds dangerous ... shouldn't each call use a fresh frame for BYVALUE variables?

Well yes ... but we don't want to make our code more complicated than it needs to be. The next step is to assign arguments into the local frame. Since 9608 pseudocode does not support default argument values, every parameter in the local frame will be overwritten on each call anyway. We have to be careful not to evaluate any gets from the local frame before doing so.

And the rest of the code remains the same:
https://github.com/nyjc-computing/pseudo/blob/213ff6c05c64bcd36e55abee2576c84ace18ff91/interpreter.py#L87-L93

Elegantly, local[name] passes the global variable slot, allowing values to be read from or assigned to the global frame directly.
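
A self-contained sketch of this call path, mirroring the SayHi example from this thread. The frame layout follows the dict printed in the earlier test, but `exec_call`, the use of a Python function as the procedure body, and the `output` list standing in for OUTPUT are assumptions:

```python
output = []  # stands in for whatever OUTPUT would print


def exec_call(global_frame, proc_name, arg_values):
    proc = global_frame[proc_name]["value"]
    local = proc["frame"]  # frame prepared earlier by the resolver
    # Args are assumed to be evaluated already; assign them into the
    # local slots before any statement reads from the local frame.
    for var, value in zip(proc["params"], arg_values):
        local[var["name"]]["value"] = value  # for BYREF this is the global slot
    for stmt in proc["stmts"]:
        stmt(local)


def say_hi_body(frame):
    # OUTPUT "Hi, ", Person, "!"
    output.append("Hi, " + frame["Person"]["value"] + "!")
    # Person <- "Mary"
    frame["Person"]["value"] = "Mary"


person_slot = {"type": "STRING", "value": None}
global_frame = {
    "Person": person_slot,
    "SayHi": {
        "type": "procedure",
        "value": {
            "frame": {"Person": person_slot},  # BYREF: shared slot
            "params": [{"name": "Person", "type": "STRING"}],
            "passby": "BYREF",
            "stmts": [say_hi_body],
        },
    },
}

exec_call(global_frame, "SayHi", ["John"])
output.append(global_frame["Person"]["value"])  # OUTPUT Person
```

Because the stored frame is reused across calls, a BYVALUE slot would still hold its previous value until the argument assignment overwrites it, which is why the assignment has to come first.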

@ngjunsiang

Testing

Code:

DECLARE Person : STRING
PROCEDURE SayHi(BYREF Person : STRING)
    OUTPUT "Hi, ", Person, "!"
    Person <- "Mary"
ENDPROCEDURE
CALL SayHi("John")
OUTPUT Person

Result:

Hi, John!
Mary

Our BYREF works 😎

ngjunsiang marked this pull request as ready for review April 22, 2022 06:28
ngjunsiang changed the title from "12c Passing by value" to "12c Passing by reference" Apr 22, 2022
ngjunsiang merged commit 5c1d0fd into main Apr 22, 2022
ngjunsiang deleted the procedure branch April 22, 2022 06:34