Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement local builtin with dynamic scoping for POSIX functions #123

Open
McDutchie opened this issue Aug 31, 2020 · 22 comments
Open

Implement local builtin with dynamic scoping for POSIX functions #123

McDutchie opened this issue Aug 31, 2020 · 22 comments
Labels
1.1 Issue relevant to dev branch (v1.1.*) TODO Things to be done before releasing

Comments

@McDutchie
Copy link

Previous discussion in #9:

At the moment the main challenge is that the fundamental design of POSIX functions (like fun()) in ksh93 doesn't include any form of scoping. It remains to be seen whether this can be made to work for the upcoming release or not.

@McDutchie McDutchie added the TODO Things to be done before releasing label Aug 31, 2020
@McDutchie
Copy link
Author

Continuing from #9 (comment) (@JohnoKing):

Ksh93v- made the following change to give POSIX functions proper scoping in the Bash compatibility mode:

@@ -3387,7 +3411,7 @@ static void sh_funct(Shell_t *shp,Namval_t *np,int argn, char *argv[],struct arg
 	SH_LEVELNOD->nvalue.s = lp->maxlevel;
 	shp->st.lineno = error_info.line;
 	np->nvalue.rp->running  += 2;
-	if(nv_isattr(np,NV_FPOSIX))
+	if(nv_isattr(np,NV_FPOSIX) && !sh_isoption(shp,SH_BASH))
 	{
 		char *save;
 		int loopcnt = shp->st.loopcnt;

This change breaks backward compatibility because it also affects typeset, so a different fix will be necessary.

Interesting: if you change that if to an if(0), then all POSIX functions act like ksh functions. And it breaks very little. If you run the regression test suite, the vast majority of tests pass – only a few to do with POSIX functions fail, in entirely logical ways.

I'm not sure about the implications yet, this is just an interesting data point.

@avih
Copy link

avih commented Sep 12, 2020

Going through the long Austin group thread, I can't say I identify a conclusion on how it should behave. On the contrary - it seems to be stalled on disagreements more than two years ago.

So perhaps if the maintainers really want to support it (and I completely agree it's useful), it would be best to first define exactly how it should behave - preferably while referencing the current behavior at other shells and/or various suggestions at the Austin group thread, and reach agreement on the behavior before starting implementation?

I can have a go at such initial definition if ther's an interest in such procedure.

@avih
Copy link

avih commented Sep 12, 2020

To give some evidence on how different `local' behaves in current shells, I tested using the following script with the shells: pdksh-5.2.14, mksh-master, oksh-master (OpenBSD sh), bash-4.4, dash-master:

vars() {
    printf %s\\n "$1:"; shift
    while [ "$#" != 0 ]; do
        # printf '  %s:%s' "VAR" "${VAR+[$VAR]}"
        eval "printf '  %s:%s' \"\$1\" \"\${$1+[\$$1]}\""
        shift
    done
    echo
}

foo() {
    local a b c d e
    vars "Inside foo initial" a b c d e
    a=1 b=2 c=3 d=4
    vars "Inside foo final" a b c d e
}


# main
readonly a=A; b=B c=

vars "Before foo" a b c d e

foo

vars "After foo" a b c d e

Note that the script doesn't use set -e nor set -u.

And the outputs from the shells (generated with shcmp ):

$ shcmp ./local-test.sh
= mksh, oksh, pdksh:
Before foo:
  a:[A]  b:[B]  c:[]  d:  e:
Inside foo initial:
  a:  b:  c:  d:  e:
Inside foo final:
  a:[1]  b:[2]  c:[3]  d:[4]  e:
After foo:
  a:[A]  b:[B]  c:[]  d:  e:

= bash:
Before foo:
  a:[A]  b:[B]  c:[]  d:  e:
./local-test.sh: line 12: local: a: readonly variable
Inside foo initial:
  a:[A]  b:  c:  d:  e:
./local-test.sh: line 14: a: readonly variable
After foo:
  a:[A]  b:[B]  c:[]  d:  e:

= dash_master:
<error> Before foo:
  a:[A]  b:[B]  c:[]  d:  e:
Inside foo initial:
  a:[A]  b:[B]  c:[]  d:  e:
./local-test.sh: 14: a: is read only

Observations:

  • pdksh, openbsd-sh, mksh behave identically in this test (not entirely surprising, but still nice):
    • All local variables are unset initially (personally I'm not sure whether it's the best choice).
    • After foo exits it behaves as if foo was executed in a subshell - zero effect on the global variables.
  • bash:
    • local warns that a is readonly, but continues.
    • local seems to try unset all local vars, but fails for a.
    • foo aborts when it tries to assign a value to a.
    • The global execution continues after foo aborts, with identical vars state as before calling foo.
  • dash master:
    • local seem to completely not-touch the variables - they retain the state/value as before entry.
    • Doesn't warn that a is read only.
    • Fails and aborts foo and the entire script when trying to assign to a (recall, set -e is not used).

And that's only the simplest use case of local - only declare the variables at the very begining of a function, without trying to set any initial value explicitly and without touching them before using local.

Bottom line, shells handle it differently, and if ksh wants to support it, then IMHO it should first define how the implementation should behave.

@avih
Copy link

avih commented Sep 18, 2020

ping?

@McDutchie
Copy link
Author

Sorry, been working on other stuff. There's much to fix and not enough time and sleep :)

Bottom line, shells handle it differently, and if ksh wants to support it, then IMHO it should first define how the implementation should behave.

Yes and no. I certainly agree it's useful to think about this in advance and come up with an idea of how it should work. But it's also going to depend on the existing ksh codebase. Much of the infrastructure for a local is already there, it's got typeset for ksh functions after all. ksh88 used to do local variables with dynamic scoping in POSIX functions using 'typeset'. Even now, if a ksh function calls a POSIX function, the latter can access the former's local variables. Anyway, point is, I'd like to make the minimum changes necessary to make it work sensibly. This ksh fork is not about doing radical overhauls, so any behaviour that would require dramatic changes is unlikely to be implemented.

I think the basic requirement for a dynamically scoped local can be summarised in the following feature test script:

PATH=/dev/null command -v local >/dev/null 2>&1 || alias local=typeset
var=GLOBAL
fn()
{
	local var=LOCAL || exit
	fn()
	{
		test "$var" = LOCAL
	}
	fn
}
fn && test "$var" = GLOBAL
case $? in
0)	echo "OK" ;;
*)	! echo "Nope" ;;
esac

All shells except ksh93 currently pass this. ksh93 of course fails because the variable always remains global in POSIX functions, so it wrongly has the value LOCAL at the end. If you change the function definitions from fn() to function fn, it fails because of static scoping: the inner fn cannot see the outer fn's local variable, so returns false, passing down that exit status to the outer fn.

Thank you for the detail tests, they are going to come in handy.

@avih
Copy link

avih commented Sep 18, 2020

Anyway, point is, I'd like to make the minimum changes necessary to make it work sensibly. This ksh fork is not about doing radical overhauls, so any behaviour that would require dramatic changes is unlikely to be implemented.

Clearly and obviously, and FWIW I also think that's how it should be.

But at this case the title is misleading, because currently there's no concrete plan or even a WIP definition of "POSIX local".

I think the basic requirement...

Sure, the basic is fairly known, but it would still be worth defining/documenting the behavior, because the devil is in the details, and in sh even more so than most other places, IMHO.

@McDutchie
Copy link
Author

Anyway, point is, I'd like to make the minimum changes necessary to make it work sensibly. This ksh fork is not about doing radical overhauls, so any behaviour that would require dramatic changes is unlikely to be implemented.

Clearly and obviously, and FWIW I also think that's how it should be.

But that does mean that setting every detail in stone before we begin is not practical, as we're all still learning the finer details of how this code base works. We're just going to have to work out a few things as we go along.

Meanwhile, to add to the data provided by your test, I've identified a number of shell quirks with local variables as part of the modernish library's feature/bug/quirk testing framework (they are actual tests that scripts using the library can easily use the results of if they want). It would be good to avoid these as I don't really consider them sensible behaviour. These are:

  • QRK_LOCALINH: On a shell with LOCALVARS, local variables, when declared
    without assigning a value, inherit the state of their global namesake, if
    any. (dash, FreeBSD sh)
  • QRK_LOCALSET: On a shell with LOCALVARS, local variables are immediately set
    to the empty value upon being declared, instead of being initially without
    a value. (zsh)
  • QRK_LOCALSET2: Like QRK_LOCALSET, but only if the variable by the
    same name in the global/parent scope is unset. If the global variable is
    set, then the local variable starts out unset. (bash 2 and 3)
  • QRK_LOCALUNS: On a shell with LOCALVARS, local variables lose their local
    status when unset. Since the variable name reverts to global, this means that
    unset will not necessarily unset the variable! (yash, pdksh/mksh. Note:
    this is actually a behaviour of typeset, to which modernish aliases local
    on these shells.)
  • QRK_LOCALUNS2: This is a more treacherous version of QRK_LOCALUNS that
    is unique to bash. The unset command works as expected when used on a local
    variable in the same scope that variable was declared in, however, it
    makes local variables global again if they are unset in a subscope of that
    local scope, such as a function called by the function where it is local.
    (Note: since QRK_LOCALUNS2 is a special case of QRK_LOCALUNS, modernish
    will not detect both.)
    On bash >= 5.0, modernish eliminates this quirk upon initialisation
    by setting shopt -s localvar_unset.

@avih
Copy link

avih commented Sep 18, 2020

But that does mean that setting every detail in stone before we begin is not practical

Sure, which is why I suggested at least to document the eventual behavior well.

Few things which I think should get documented (some of which you already figured are not the same between shells):

  • What's the state of var after local var in different initial states before this line (unset, set and empty, set and have value, readonly).
  • Same as above but after local var=val.
  • Is local foo=$(echo A B) the same as local foo=A B (i.e. local foo=A; local B) or as local foo="A B" ?

@McDutchie
Copy link
Author

McDutchie commented Sep 18, 2020

To minimise surprises and inconsistencies, I think it should ideally work just like typeset in ksh functions, except for dynamic scoping.

Which means:

  • What's the state of var after local var in different initial states before this line (unset, set and empty, set and have value, readonly).

Initial state unset in all cases. A global readonly with the same name doesn't affect it.

  • Same as above but after local var=val.

Same (except initial state set to 'val' obviously).

  • Is local foo=$(echo A B) the same as local foo=A B (i.e. local foo=A; local B) or as local foo="A B" ?

It's a declaration command, so split and glob are not performed on the assignment-arguments, which makes it the same as local foo="A B". (This will definitely be in the POSIX standard as well, this behaviour is already in the draft for export and readonly.)

@avih
Copy link

avih commented Sep 18, 2020

It's a declaration command, so split and glob are not performed on the assignment-arguments, which makes it the same as local foo="A B". (This will definitely be in the POSIX standard as well, this behaviour is already in the draft for export and readonly.)

Right. I did have some recollection that it's the expected behavior in an "assignment context" (i.e. normal assignments and readonly/export), and I know for a fact that most shells do that correctly these days for readonly/exports, but I couldn't actually find it mentioned at the 2018 spec.

But I guess I've read it at a draft. Link please?

@McDutchie
Copy link
Author

It isn't in the 2018 spec. It's in the first draft for the next version of the spec, which isn't out yet and is still internal to the Austin Group. Which anyone can join, by the way.

@avih
Copy link

avih commented Sep 18, 2020

Also, is there a way to "inherit" the prior state before the local command (but not the readonly state)? i.e. maybe something to the effect of this?

local foo=${foo+$foo}
case $foo in ?*) foo=${foo#?};; *) unset -v foo; esac

@McDutchie
Copy link
Author

McDutchie commented Sep 18, 2020

That would be more like:

case ${foo+set} in
set) local foo=$foo ;;
*)   local foo ;;
esac

or, more concisely and using a ksh-specific construct:

[[ -v foo ]] && local foo=$foo || local foo

@avih
Copy link

avih commented Sep 18, 2020

Thanks. Wouldn't they have identical effect though? or am I missing something (other than your solution being more elegant)?

@avih
Copy link

avih commented Sep 18, 2020

Ah, bug in mine, should have been foo=${foo++$foo}, clearly...

@McDutchie
Copy link
Author

local foo=${foo+$foo} sets the local foo to the empty value if the global foo is either empty or unset, because that parses to local foo= in either case. Your second line makes no sense to me at all – it deletes the first character of the value if it was non-empty and unsets it if it was empty.

@McDutchie
Copy link
Author

After that correction it should work, although I'd use case $foo in +*) foo=${foo#+};; *) unset -v foo; esac so that readers of that code might have a non-infinitesimal chance of grasping what it does. :)

Anyway, this is getting well bogged down in minutiae now.

@avih
Copy link

avih commented Sep 18, 2020

:)

Thanks. So there's no public link to the next draft other than by registering (as easy as It might be)?

@McDutchie
Copy link
Author

Correct.

JohnoKing added a commit to JohnoKing/ksh that referenced this issue Sep 18, 2020
This implementation of local doesn't have proper scoping.
The POSIX function scoping mechanism should be implemented
in b_dot_cmd, although it should be done without breaking
backward compatibility (i.e., typeset's behavior in POSIX
functions shouldn't change as a result of implementing
scoping).

Discussion: ksh93#123

src/cmd/ksh93/sh/name.c:
- Set a dynamic scope in ksh functions for the local and
  declare builtins. This currently doesn't work in POSIX
  functions since b_dot_cmd doesn't apply any scoping
  right now.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/xec.c:
- Add a dictionary generator entry for declare(1).
- Only allow the local builtin to run inside of functions.
- If the local builtin is run in a function without arguments,
  avoid any output.
- Set shp->st.var_local to shp->var_tree for declare and local
  to get a dynamic scope in ksh functions. For typeset use
  shp->var_base to get a static scope for typeset. This will
  need bug-testing for when local is used in namespaces.
- Set shp->st.funname to NULL after a POSIX function finishes
  running. Without this local will work outside of functions
  after a POSIX function is run.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Add local and declare to the builtin table as declaration
  builtins. Depending on how local is specified in the next
  POSIX standard, it may need to be changed to a special builtin.
  That does break backward compatibility with existing ksh
  scripts though, so for now it's not one.
- Update the man pages with references to local(1). Add
  information to the man page about the intended scoping
  of declare and local, although it's not completely
  implemented.

src/cmd/ksh93/tests/builtins.sh:
- Run one of the regression tests for builtins inside of a
  function to fix a test failure with the local builtin.

src/cmd/ksh93/tests/types2.sh:
- Add a number of regression tests mostly targeted at the
  local builtin. Most of the bugs tested for here are present
  in the ksh93v- and ksh2020 implementations of local and
  declare.
@saper
Copy link

saper commented Feb 18, 2021

I just peeked at ksh93v- release notes and it seems that local has been implemented there, as well as bash-like function scoping (in a so-called bash mode):

14-06-02 +When compiled with the SHOPT_BASH and run with the name bash,
          the shell now uses dynamic scoping for name() and function name. 
          In addition the builtins declare and local are supported.
          The SHOPT_BASH option is on by default in the Makefile.
          More work remains with the bash compatibility option.

JohnoKing added a commit to JohnoKing/ksh that referenced this issue Mar 12, 2021
This implementation of local doesn't have proper scoping.
The POSIX function scoping mechanism should be implemented
in b_dot_cmd, although it should be done without breaking
backward compatibility (i.e., typeset's behavior in POSIX
functions shouldn't change as a result of implementing
scoping).

Discussion: ksh93#123

src/cmd/ksh93/sh/name.c:
- Set a dynamic scope in ksh functions for the local and
  declare builtins. This currently doesn't work in POSIX
  functions since b_dot_cmd doesn't apply any scoping
  right now.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/xec.c:
- Add a dictionary generator entry for declare(1).
- Only allow the local builtin to run inside of functions.
- If the local builtin is run in a function without arguments,
  avoid any output.
- Set shp->st.var_local to shp->var_tree for declare and local
  to get a dynamic scope in ksh functions. For typeset use
  shp->var_base to get a static scope for typeset. This will
  need bug-testing for when local is used in namespaces.
- Set shp->st.funname to NULL after a POSIX function finishes
  running. Without this local will work outside of functions
  after a POSIX function is run.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Add local and declare to the builtin table as declaration
  builtins. Depending on how local is specified in the next
  POSIX standard, it may need to be changed to a special builtin.
  That does break backward compatibility with existing ksh
  scripts though, so for now it's not one.
- Update the man pages with references to local(1). Add
  information to the man page about the intended scoping
  of declare and local, although it's not completely
  implemented.

src/cmd/ksh93/tests/builtins.sh:
- Run one of the regression tests for builtins inside of a
  function to fix a test failure with the local builtin.

src/cmd/ksh93/tests/types2.sh:
- Add a number of regression tests mostly targeted at the
  local builtin. Most of the bugs tested for here are present
  in the ksh93v- and ksh2020 implementations of local and
  declare.
JohnoKing added a commit to JohnoKing/ksh that referenced this issue Mar 17, 2021
This implementation of local doesn't have proper scoping.
The POSIX function scoping mechanism should be implemented
in b_dot_cmd, although it should be done without breaking
backward compatibility (i.e., typeset's behavior in POSIX
functions shouldn't change as a result of implementing
scoping).

Discussion: ksh93#123

src/cmd/ksh93/sh/name.c:
- Set a dynamic scope in ksh functions for the local and
  declare builtins. This currently doesn't work in POSIX
  functions since b_dot_cmd doesn't apply any scoping
  right now.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/xec.c:
- Add a dictionary generator entry for declare(1).
- Only allow the local builtin to run inside of functions.
- Set shp->st.var_local to shp->var_tree for declare and local
  to get a dynamic scope in ksh functions. For typeset use
  shp->var_base to get a static scope for typeset. This will
  need bug-testing for when local is used in namespaces.
- Set shp->st.funname to NULL after a POSIX function finishes
  running. Without this local will work outside of functions
  after a POSIX function is run.

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Add local and declare to the builtin table as declaration
  builtins. Depending on how local is specified in the next
  POSIX standard, it may need to be changed to a special builtin.
  That does break backward compatibility with existing ksh
  scripts though, so for now it's not one.
- Update the man pages with references to local(1). Add
  information to the man page about the intended scoping
  of declare and local, although it's not completely
  implemented.

src/cmd/ksh93/tests/builtins.sh:
- Run one of the regression tests for builtins inside of a
  function to fix a test failure with the local builtin.

src/cmd/ksh93/tests/local.sh:
- Add a number of regression tests mostly targeted at the
  local builtin. Most of the bugs tested for here are present
  in the ksh93v- and ksh2020 implementations of local and
  declare.
JohnoKing added a commit to JohnoKing/ksh that referenced this issue Apr 5, 2021
This implementation of local doesn't have proper scoping.
The POSIX function scoping mechanism should be implemented
in b_dot_cmd, although it should be done without breaking
backward compatibility (i.e., typeset's behavior in POSIX
functions shouldn't change as a result of implementing
scoping).

Discussion: ksh93#123

src/cmd/ksh93/sh/name.c:
- Set a dynamic scope in ksh functions for the local and
  declare builtins. This currently doesn't work in POSIX
  functions since b_dot_cmd doesn't apply any scoping
  right now.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/xec.c:
- Only allow the local builtin to run inside of functions. This
  is accomplished by using a new 'SH_INFUNCTION' state set when
  ksh is inside of a function.
- Set shp->st.var_local to shp->var_tree for declare and local
  to get a dynamic scope in ksh functions. For typeset use
  shp->var_base to get a static scope in ksh functions.
  This will need bug-testing for when local is used in
  namespaces.
- Add local and declare to top comment alongside other builtins.
- Add a dictionary generator entry for declare(1).

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Add local and declare to the builtin table as declaration
  builtins. Depending on how local is specified in the next
  POSIX standard, it may need to be changed to a special builtin.
  That does break backward compatibility with existing ksh
  scripts though (such as one regression test script), so for now
  it's just a regular builtin.
- Update the man pages with references to local(1). Add
  information to the man page about the intended scoping
  of declare and local, although it's not completely
  implemented in POSIX functions.

src/cmd/ksh93/tests/builtins.sh:
- Run one of the regression tests for builtins inside of a
  function to fix a test failure with the local builtin.

src/cmd/ksh93/tests/local.sh:
- Add a number of regression tests mostly targeted at the
  local builtin. Most of the bugs tested for here are present
  in the ksh93v- and ksh2020 implementations of local and
  declare.
JohnoKing added a commit to JohnoKing/ksh that referenced this issue Apr 22, 2021
This implementation of local doesn't have proper scoping.
The POSIX function scoping mechanism should be implemented
in b_dot_cmd, although it should be done without breaking
backward compatibility (i.e., typeset's behavior in POSIX
functions shouldn't change as a result of implementing
scoping).

Discussion: ksh93#123

src/cmd/ksh93/sh/name.c:
- Set a dynamic scope in ksh functions for the local and
  declare builtins. This currently doesn't work in POSIX
  functions since b_dot_cmd doesn't apply any scoping
  right now.

src/cmd/ksh93/bltins/typeset.c,
src/cmd/ksh93/sh/xec.c:
- Only allow the local builtin to run inside of functions. This
  is accomplished by using a new 'SH_INFUNCTION' state set when
  ksh is inside of a function.
- Set shp->st.var_local to shp->var_tree for declare and local
  to get a dynamic scope in ksh functions. For typeset use
  shp->var_base to get a static scope in ksh functions.
  This will need bug-testing for when local is used in
  namespaces.
- Add local and declare to top comment alongside other builtins.
- Add a dictionary generator entry for declare(1).

src/cmd/ksh93/data/builtins.c,
src/cmd/ksh93/include/builtins.h:
- Add local and declare to the builtin table as declaration
  builtins. Depending on how local is specified in the next
  POSIX standard, it may need to be changed to a special builtin.
  That does break backward compatibility with existing ksh
  scripts though (such as one regression test script), so for now
  it's just a regular builtin.
- Update the man pages with references to local(1). Add
  information to the man page about the intended scoping
  of declare and local, although it's not completely
  implemented in POSIX functions.

src/cmd/ksh93/tests/builtins.sh:
- Run one of the regression tests for builtins inside of a
  function to fix a test failure with the local builtin.

src/cmd/ksh93/tests/local.sh:
- Add a number of regression tests mostly targeted at the
  local builtin. Most of the bugs tested for here are present
  in the ksh93v- and ksh2020 implementations of local and
  declare.
JohnoKing added a commit to JohnoKing/ksh that referenced this issue Dec 8, 2021
- tests/basic.sh: Fix a regression test that was failing under dtksh by
  allowing the error message to name ksh 'lt-dtksh'. Additionally, fix
  the test's inaccurate failure message (a version string is not what
  the regression test expects).

- test/builtins.sh: Exclude the expr builtin from the unrecognized
  options test because it's incompatible. Additionally, put the
  unrecognized options test inside of a function to ensure that it works
  with the future local builtin (ksh93#123).

- tests/io.sh: The long seek test may fail to seek past the 2 GiB
  boundary on 32-bit systems, so only allow it to run on 64-bit.
  References:
  https://src.fedoraproject.org/rpms/ksh/blob/3222ac2b59c042bb60413bdc53e2301b7ecc1555/f/ksh-1.0.0-beta.1-regre-tests.patch
  att@a5c692e

- tests/substring.sh: Add a regression test for the ${.sh.match}
  crashing bug fixed in commit 1bf1d2f.
McDutchie pushed a commit that referenced this issue Dec 9, 2021
- tests/basic.sh: Fix a regression test that was failing under dtksh by
  allowing the error message to name ksh 'lt-dtksh'. Additionally, fix
  the test's inaccurate failure message (a version string is not what
  the regression test expects).

- test/builtins.sh: Exclude the expr builtin from the unrecognized
  options test because it's incompatible. Additionally, put the
  unrecognized options test inside of a function to ensure that it works
  with the future local builtin (#123).

- tests/io.sh: The long seek test may fail to seek past the 2 GiB
  boundary on 32-bit systems, so only allow it to run on 64-bit.
  References:
  https://src.fedoraproject.org/rpms/ksh/blob/3222ac2b59c042bb60413bdc53e2301b7ecc1555/f/ksh-1.0.0-beta.1-regre-tests.patch
  att@a5c692e

- tests/substring.sh: Add a regression test for the ${.sh.match}
  crashing bug fixed in commit 1bf1d2f.
McDutchie pushed a commit that referenced this issue Dec 9, 2021
- tests/basic.sh: Fix a regression test that was failing under dtksh by
  allowing the error message to name ksh 'lt-dtksh'. Additionally, fix
  the test's inaccurate failure message (a version string is not what
  the regression test expects).

- test/builtins.sh: Exclude the expr builtin from the unrecognized
  options test because it's incompatible. Additionally, put the
  unrecognized options test inside of a function to ensure that it works
  with the future local builtin (#123).

- tests/io.sh: The long seek test may fail to seek past the 2 GiB
  boundary on 32-bit systems, so only allow it to run on 64-bit.
  References:
  https://src.fedoraproject.org/rpms/ksh/blob/3222ac2b59c042bb60413bdc53e2301b7ecc1555/f/ksh-1.0.0-beta.1-regre-tests.patch
  att@a5c692e

- tests/substring.sh: Add a regression test for the ${.sh.match}
  crashing bug fixed in commit 1bf1d2f.
@McDutchie McDutchie changed the title Implement to-be-POSIX local builtin Implement local builtin with dynamic scoping for POSIX functions Sep 25, 2022
@McDutchie McDutchie added the 1.1 Issue relevant to dev branch (v1.1.*) label Sep 25, 2022
@JohnoKing
Copy link

JohnoKing commented Oct 1, 2022

I've managed to create a patch with a local builtin that works in POSIX functions (although it's unfortunately very buggy). To get any sort of local scope working in POSIX functions, I had to import code from b_dot_cmd() to sh_funscope(), then have the latter function handle all POSIX functions.

local-builtin-in-posix-funs.patch
diff --git a/src/cmd/ksh93/bltins/misc.c b/src/cmd/ksh93/bltins/misc.c
index 338d08d11..f9923b30e 100644
--- a/src/cmd/ksh93/bltins/misc.c
+++ b/src/cmd/ksh93/bltins/misc.c
@@ -222,7 +222,7 @@ int    b_dot_cmd(register int n,char *argv[],Shbltin_t *context)
 	register int jmpval;
 	struct sh_scoped savst, *prevscope = sh.st.self;
 	char *filename=0, *buffer=0, *tofree;
-	int	fd;
+	int	fd, infunction=0;
 	struct dolnod   *saveargfor;
 	volatile struct dolnod   *argsave=0;
 	struct checkpt buff;
@@ -249,56 +249,57 @@ int    b_dot_cmd(register int n,char *argv[],Shbltin_t *context)
 		errormsg(SH_DICT,ERROR_exit(1),e_toodeep,script);
 		UNREACHABLE();
 	}
-	if(!(np=sh.posix_fun))
+	/* check for KornShell style function first */
+	np = nv_search(script,sh.fun_tree,0);
+	if(np && is_afunction(np) && !nv_isattr(np,NV_FPOSIX) && !(sh_isoption(SH_POSIX) && sh.bltindata.bnode==SYSDOT))
 	{
-		/* check for KornShell style function first */
-		np = nv_search(script,sh.fun_tree,0);
-		if(np && is_afunction(np) && !nv_isattr(np,NV_FPOSIX) && !(sh_isoption(SH_POSIX) && sh.bltindata.bnode==SYSDOT))
+		if(!np->nvalue.ip)
 		{
-			if(!np->nvalue.ip)
+			path_search(script,NIL(Pathcomp_t**),0);
+			if(np->nvalue.ip)
 			{
-				path_search(script,NIL(Pathcomp_t**),0);
-				if(np->nvalue.ip)
-				{
-					if(nv_isattr(np,NV_FPOSIX))
-						np = 0;
-				}
-				else
-				{
-					errormsg(SH_DICT,ERROR_exit(1),e_found,script);
-					UNREACHABLE();
-				}
+				if(nv_isattr(np,NV_FPOSIX))
+					np = 0;
 			}
-		}
-		else
-			np = 0;
-		if(!np)
-		{
-			if((fd=path_open(script,path_get(script))) < 0)
+			else
 			{
-				errormsg(SH_DICT,ERROR_system(1),e_open,script);
+				errormsg(SH_DICT,ERROR_exit(1),e_found,script);
 				UNREACHABLE();
 			}
-			filename = path_fullname(stkptr(sh.stk,PATH_OFFSET));
 		}
 	}
+	else
+		np = 0;
+	if(!np)
+	{
+		if((fd=path_open(script,path_get(script))) < 0)
+		{
+			errormsg(SH_DICT,ERROR_system(1),e_open,script);
+			UNREACHABLE();
+		}
+		filename = path_fullname(stkptr(sh.stk,PATH_OFFSET));
+	}
 	*prevscope = sh.st;
+	sh.st.prevst = prevscope;
+	sh.st.self = &savst;
+	sh.topscope = (Shscope_t*)sh.st.self;
+	prevscope->save_tree = sh.var_tree;
+	if(np)
+	{
+		infunction = sh.infunction;
+		sh.infunction = 1;
+	}
 	sh.st.lineno = np?((struct functnod*)nv_funtree(np))->functline:1;
-	sh.st.save_tree = sh.var_tree;
+	sh.st.save_tree = sh.st.var_local = sh.var_tree;
 	if(filename)
 	{
 		sh.st.filename = filename;
 		sh.st.lineno = 1;
 	}
-	sh.st.prevst = prevscope;
-	sh.st.self = &savst;
-	sh.topscope = (Shscope_t*)sh.st.self;
-	prevscope->save_tree = sh.var_tree;
 	tofree = sh.st.filename;
 	if(np)
 		sh.st.filename = np->nvalue.rp->fname;
 	nv_putval(SH_PATHNAMENOD, sh.st.filename ,NV_NOFREE);
-	sh.posix_fun = 0;
 	if(np || argv[1])
 		argsave = sh_argnew(argv,&saveargfor);
 	sh_pushcontext(&buff,SH_JMPDOT);
@@ -333,12 +334,14 @@ int    b_dot_cmd(register int n,char *argv[],Shbltin_t *context)
 		prevscope->dolc = sh.st.dolc;
 		prevscope->dolv = sh.st.dolv;
 	}
-	if (sh.st.self != &savst)
+	if(sh.st.self != &savst)
 		*sh.st.self = sh.st;
-	/* only restore the top Shscope_t portion for POSIX functions */
+	if(!infunction)
+		sh.infunction = infunction;
+	/* only restore the top Shscope_t portion */
 	memcpy((void*)&sh.st, (void*)prevscope, sizeof(Shscope_t));
 	sh.topscope = (Shscope_t*)prevscope;
-	nv_putval(SH_PATHNAMENOD, sh.st.filename ,NV_NOFREE);
+	nv_putval(SH_PATHNAMENOD,sh.st.filename,NV_NOFREE);
 	if(jmpval && jmpval!=SH_JMPFUN)
 		siglongjmp(*sh.jmplist,jmpval);
 	return(sh.exitval);
diff --git a/src/cmd/ksh93/bltins/typeset.c b/src/cmd/ksh93/bltins/typeset.c
index 0f3ddf245..f2633e9ee 100644
--- a/src/cmd/ksh93/bltins/typeset.c
+++ b/src/cmd/ksh93/bltins/typeset.c
@@ -21,6 +21,8 @@
  * export [-p] [arg...]
  * readonly [-p] [arg...]
  * typeset [options] [arg...]
+ * declare [options] [arg...]
+ * local [options] [arg...]
  * autoload [options] [arg...]
  * compound [options] [arg...]
  * float [options] [arg...]
@@ -208,6 +210,7 @@ int    b_alias(int argc,register char *argv[],Shbltin_t *context)
     /* for the dictionary generator */
     int    b_autoload(int argc,register char *argv[],Shbltin_t *context){}
     int    b_compound(int argc,register char *argv[],Shbltin_t *context){}
+    int    b_declare(int argc,register char *argv[],Shbltin_t *context){}
     int    b_float(int argc,register char *argv[],Shbltin_t *context){}
     int    b_functions(int argc,register char *argv[],Shbltin_t *context){}
     int    b_integer(int argc,register char *argv[],Shbltin_t *context){}
@@ -221,17 +224,18 @@ int    b_typeset(int argc,register char *argv[],Shbltin_t *context)
 	const char	*optstring = sh_opttypeset;
 	Namdecl_t 	*ntp = (Namdecl_t*)context->ptr;
 	Dt_t		*troot;
-	int		isfloat=0, isadjust=0, shortint=0, sflag=0;
+	int		isfloat=0, isadjust=0, shortint=0, sflag=0, local;
 
 	memset((void*)&tdata,0,sizeof(tdata));
 	troot = sh.var_tree;
+	local = argv[0][0] == 'l';
 	if(ntp)					/* custom declaration command added using enum */
 	{
 		tdata.tp = ntp->tp;
 		opt_info.disc = (Optdisc_t*)ntp->optinfof;
 		optstring = ntp->optstring;
 	}
-	else if(argv[0][0] != 't')		/* not <t>ypeset */
+	else if(argv[0][0] != 't' && argv[0][0] != 'd' && !local) /* not <t>ypeset, <d>eclare or <l>ocal */
 	{
 		char **new_argv = (char **)stakalloc((argc + 2) * sizeof(char*));
 		error_info.id = new_argv[0] = SYSTYPESET->nvname;
@@ -444,6 +448,12 @@ int    b_typeset(int argc,register char *argv[],Shbltin_t *context)
 endargs:
 	argv += opt_info.index;
 	opt_info.disc = 0;
+	/* 'local' builtin */
+	if(local && !sh.infunction)
+	{
+		errormsg(SH_DICT,ERROR_exit(1), "can only be used in a function");
+		UNREACHABLE();
+	}
 	/* handle argument of + and - specially */
 	if(*argv && argv[0][1]==0 && (*argv[0]=='+' || *argv[0]=='-'))
 		tdata.aflag = *argv[0];
@@ -647,17 +657,21 @@ static int     setall(char **argv,register int flag,Dt_t *troot,struct tdata *tp
 	char *last = 0;
 	int nvflags=(flag&(NV_ARRAY|NV_NOARRAY|NV_VARNAME|NV_IDENT|NV_ASSIGN|NV_STATIC|NV_MOVE));
 	int r=0, ref=0, comvar=(flag&NV_COMVAR),iarray=(flag&NV_IARRAY);
-	Dt_t *save_vartree;
+	Dt_t *save_vartree, *save_varlocal = 0;
 	Namval_t *save_namespace;
-	if(flag&NV_GLOBAL)
+	if((flag&NV_GLOBAL) || (sh.infunction==1 && sh.st.var_local == sh.var_base))
 	{
 		save_vartree = sh.var_tree;
-		troot = sh.var_tree = sh.var_base;
+		save_varlocal = sh.st.var_local;
+		troot = sh.var_tree = sh.st.var_local = sh.var_base;
+	}
 #if SHOPT_NAMESPACE
+	if(flag&NV_GLOBAL)
+	{
 		save_namespace = sh.namespace;
 		sh.namespace = NIL(Namval_t*);
-#endif
 	}
+#endif
 	if(!sh.prefix)
 	{
 		if(!tp->pflag)
@@ -1031,13 +1045,15 @@ static int     setall(char **argv,register int flag,Dt_t *troot,struct tdata *tp
 		if(r==0)
 			r = 1;  /* ensure the exit status is at least 1 */
 	}
-	if(flag&NV_GLOBAL)
+	if((flag&NV_GLOBAL) || (sh.infunction==1 && save_varlocal))
 	{
 		sh.var_tree = save_vartree;
+		sh.st.var_local = save_varlocal;
+	}
 #if SHOPT_NAMESPACE
+	if(flag&NV_GLOBAL)
 		sh.namespace = save_namespace;
 #endif
-	}
 	return(r);
 }
 
diff --git a/src/cmd/ksh93/data/builtins.c b/src/cmd/ksh93/data/builtins.c
index ff1654a52..a0bfe9f27 100644
--- a/src/cmd/ksh93/data/builtins.c
+++ b/src/cmd/ksh93/data/builtins.c
@@ -68,6 +68,8 @@ const struct shtable3 shtab_builtins[] =
 	"functions",	NV_BLTIN|BLT_ENV,		bltin(typeset),
 	"integer",	NV_BLTIN|BLT_ENV|BLT_DCL,	bltin(typeset),
 	"nameref",	NV_BLTIN|BLT_ENV|BLT_DCL,	bltin(typeset),
+	"local",	NV_BLTIN|BLT_ENV|BLT_DCL,	bltin(typeset),
+	"declare",	NV_BLTIN|BLT_ENV|BLT_DCL,	bltin(typeset),
 	"test",		NV_BLTIN|BLT_ENV,		bltin(test),
 	"[",		NV_BLTIN|BLT_ENV,		bltin(test),
 	"let",		NV_BLTIN|BLT_ENV,		bltin(let),
@@ -728,7 +730,7 @@ const char sh_optexport[] =
         "[+>0?An error occurred.]"
 "}"
 
-"[+SEE ALSO?\bsh\b(1), \btypeset\b(1)]"
+"[+SEE ALSO?\bsh\b(1), \btypeset\b(1), \blocal\b(1)]"
 ;
 
 const char sh_optgetopts[] =
@@ -1539,7 +1541,7 @@ const char sh_optreadonly[] =
         "[+>0?An error occurred.]"
 "}"
 
-"[+SEE ALSO?\bsh\b(1), \btypeset\b(1)]"
+"[+SEE ALSO?\bsh\b(1), \btypeset\b(1), \blocal\b(1)]"
 ;
 
 const char sh_optredirect[] =
@@ -1678,6 +1680,7 @@ const char sh_optksh[] =
 
 "[+SEE ALSO?\bset\b(1), \bbuiltin\b(1)]"
 ;
+
 const char sh_optset[] =
 "+[-1c?\n@(#)$Id: set (ksh 93u+m) 2022-06-04 $\n]"
 "[--catalog?" SH_DICT "]"
@@ -1714,11 +1717,9 @@ const char sh_optset[] =
         "[+>0?An error occurred.]"
 "}"
 
-"[+SEE ALSO?\btypeset\b(1), \bshift\b(1)]"
+"[+SEE ALSO?\btypeset\b(1), \blocal\b(1), \bshift\b(1)]"
 ;
 
-
-
 const char sh_optshift[] =
 "[-1c?\n@(#)$Id: shift (AT&T Research) 1999-07-07 $\n]"
 "[--catalog?" SH_DICT "]"
@@ -1830,7 +1831,7 @@ const char sh_opttrap[] =
 ;
 
 const char sh_opttypeset[] =
-"+[-1c?\n@(#)$Id: typeset (ksh 93u+m) 2022-06-01 $\n]"
+"+[-1c?\n@(#)$Id: typeset (ksh 93u+m) 2022-06-17 $\n]"
 "[--catalog?" SH_DICT "]"
 "[+NAME?typeset - declare or display variables with attributes]"
 "[+DESCRIPTION?Without the \b-f\b option, \btypeset\b sets, unsets, "
@@ -1843,7 +1844,12 @@ const char sh_opttypeset[] =
 "[+?When \btypeset\b is called inside a function defined with the "
 	"\bfunction\b reserved word, and \aname\a does not contain a "
 	"\b.\b, then a local variable statically scoped to that function "
-	"will be created.]"
+	"will be created. If \btypeset\b is used in a POSIX function, a "
+	"variable is created in the global scope.]"
+"[+?\btypeset\b can also be invoked as \bdeclare\b, or (in functions "
+	"only) as \blocal\b. A variable created or modified by either \bdeclare\b "
+	"or \blocal\b is given a dynamic scope limited to the function the "
+	"variable was created in.]"
 "[+?Not all option combinations are possible.  For example, the numeric "
 	"options \b-i\b, \b-E\b, and \b-F\b cannot be specified with "
 	"the justification options \b-L\b, \b-R\b, and \b-Z\b.]"
@@ -2060,7 +2066,7 @@ const char sh_optunset[] =
 	"or an error occurred.]"
 "}"
 
-"[+SEE ALSO?\btypeset\b(1)]"
+"[+SEE ALSO?\btypeset\b(1), \blocal\b(1)]"
 ;
 
 const char sh_optunalias[] =
diff --git a/src/cmd/ksh93/include/builtins.h b/src/cmd/ksh93/include/builtins.h
index c1000fa2d..455849711 100644
--- a/src/cmd/ksh93/include/builtins.h
+++ b/src/cmd/ksh93/include/builtins.h
@@ -47,16 +47,18 @@
 						/* functions	|		*/
 						/* integer	|		*/
 #define SYSNAMEREF	(sh.bltin_cmds+15)	/* nameref      |		*/
-#define SYSTYPESET_END	(sh.bltin_cmds+15)	/*	       /		*/
-
-#define SYSTEST		(sh.bltin_cmds+16)	/* test */
-#define SYSBRACKET	(sh.bltin_cmds+17)	/* [ */
-#define SYSLET		(sh.bltin_cmds+18)	/* let */
-#define SYSEXPORT	(sh.bltin_cmds+19)	/* export */
-#define SYSDOT		(sh.bltin_cmds+20)	/* . */
-#define SYSSOURCE	(sh.bltin_cmds+21)	/* source */
-#define SYSRETURN	(sh.bltin_cmds+22)	/* return */
-#define SYSENUM		(sh.bltin_cmds+23)	/* enum */
+#define SYSLOCAL	(sh.bltin_cmds+16)	/* local	|		*/
+#define SYSDECLARE	(sh.bltin_cmds+17)	/* declare	|		*/
+#define SYSTYPESET_END	(sh.bltin_cmds+17)	/*	       /		*/
+
+#define SYSTEST		(sh.bltin_cmds+18)	/* test */
+#define SYSBRACKET	(sh.bltin_cmds+19)	/* [ */
+#define SYSLET		(sh.bltin_cmds+20)	/* let */
+#define SYSEXPORT	(sh.bltin_cmds+21)	/* export */
+#define SYSDOT		(sh.bltin_cmds+22)	/* . */
+#define SYSSOURCE	(sh.bltin_cmds+23)	/* source */
+#define SYSRETURN	(sh.bltin_cmds+24)	/* return */
+#define SYSENUM		(sh.bltin_cmds+25)	/* enum */
 
 /* entry point for shell special builtins */
 
diff --git a/src/cmd/ksh93/include/defs.h b/src/cmd/ksh93/include/defs.h
index 038080afd..74911790a 100644
--- a/src/cmd/ksh93/include/defs.h
+++ b/src/cmd/ksh93/include/defs.h
@@ -129,6 +129,7 @@ extern int		sh_macfun(const char*,int);
 extern void 		sh_machere(Sfio_t*, Sfio_t*, char*);
 extern void 		*sh_macopen(void);
 extern char 		*sh_macpat(struct argnod*,int);
+extern void		local_exports(Namval_t*, void*);
 extern Sfdouble_t	sh_mathfun(void*, int, Sfdouble_t*);
 extern int		sh_outtype(Sfio_t*);
 extern char 		*sh_mactry(char*);
diff --git a/src/cmd/ksh93/include/shell.h b/src/cmd/ksh93/include/shell.h
index fd09acb21..dd5e6bda4 100644
--- a/src/cmd/ksh93/include/shell.h
+++ b/src/cmd/ksh93/include/shell.h
@@ -195,6 +195,7 @@ struct sh_scoped
 	int		lineno;
 	Dt_t		*save_tree;	/* var_tree for calling function */
 	struct sh_scoped *self;		/* pointer to copy of this scope */
+	Dt_t		*var_local;	/* local level variables used by 'local' and 'declare' */
 	struct slnod	*staklist;	/* link list of function stacks */
 	int		states;		/* shell state bits used by sh_isstate(), etc. */
 	int		breakcnt;	/* number of levels to 'break'/'continue' (negative if 'continue') */
@@ -291,7 +292,6 @@ struct Shell_s
 	unsigned int	jobenv;		/* subshell number for jobs */
 	int		infd;		/* input file descriptor */
 	short		nextprompt;	/* next prompt is PS<nextprompt> */
-	Namval_t	*posix_fun;	/* points to last name() function */
 	char		*outbuff;	/* pointer to output buffer */
 	char		*errbuff;	/* pointer to stderr buffer */
 	char		*prompt;	/* pointer to prompt string */
@@ -313,6 +313,7 @@ struct Shell_s
 	char		funload;
 	char		used_pos;	/* used positional parameter */
 	char		universe;
+	char		infunction;	/* 0 outside of functions, 1 in ksh functions run from '.' and POSIX functions, 2 in regular ksh functions */
 	char		winch;		/* set upon window size change or 'set -b' notification */
 	short		arithrecursion;	/* current arithmetic recursion level */
 	char		indebug; 	/* set when in debug trap */
diff --git a/src/cmd/ksh93/sh/name.c b/src/cmd/ksh93/sh/name.c
index e568a8c85..fb4d8cbdc 100644
--- a/src/cmd/ksh93/sh/name.c
+++ b/src/cmd/ksh93/sh/name.c
@@ -237,19 +237,23 @@ void nv_setlist(register struct argnod *arg,register int flags, Namval_t *typ)
 	struct Namref	nr;
 	int		maketype = flags&NV_TYPE;  /* make a 'typeset -T' type definition command */
 	struct sh_type	shtp;
-	Dt_t		*vartree, *save_vartree;
+	Dt_t		*vartree, *save_vartree, *save_varlocal = 0;
 #if SHOPT_NAMESPACE
 	Namval_t	*save_namespace;
 #endif
-	if(flags&NV_GLOBAL)
+	if((flags&NV_GLOBAL) || (sh.infunction==1 && sh.st.var_local == sh.var_base))
 	{
 		save_vartree = sh.var_tree;
-		sh.var_tree = sh.var_base;
+		save_varlocal = sh.st.var_local;
+		sh.var_tree = sh.st.var_local = sh.var_base;
+	}
 #if SHOPT_NAMESPACE
+	if(flags&NV_GLOBAL)
+	{
 		save_namespace = sh.namespace;
 		sh.namespace = NIL(Namval_t*);
-#endif
 	}
+#endif
 	if(maketype)
 	{
 		shtp.previous = sh.mktype;
@@ -386,6 +390,7 @@ void nv_setlist(register struct argnod *arg,register int flags, Namval_t *typ)
 					Dt_t	*last_root = sh.last_root;
 					char **argv = sh_argbuild(&argc,&tp->com,0);
 					sh.last_root = last_root;
+					/* TODO: Implement a proper check for POSIX functions here */
 					if(sh.mktype && sh.dot_depth==0 && np==((struct sh_type*)sh.mktype)->nodes[0])
 					{
 						sh.mktype = 0;
@@ -633,13 +638,15 @@ void nv_setlist(register struct argnod *arg,register int flags, Namval_t *typ)
 		}
 		/* continue loop */
 	}
-	if(flags&NV_GLOBAL)
+	if((flags&NV_GLOBAL) || (sh.infunction==1 && save_varlocal))
 	{
 		sh.var_tree = save_vartree;
+		sh.st.var_local = save_varlocal;
+	}
 #if SHOPT_NAMESPACE
+	if(flags&NV_GLOBAL)
 		sh.namespace = save_namespace;
 #endif
-	}
 }
 
 /*
@@ -2284,7 +2291,7 @@ int nv_scan(Dt_t *root, void (*fn)(Namval_t*,void*), void *data,int mask, int fl
  */
 void sh_scope(struct argnod *envlist, int fun)
 {
-	register Dt_t		*newscope, *newroot=sh.var_base;
+	register Dt_t		*newscope, *newroot=sh.st.var_local ? sh.st.var_local : sh.var_base;
 	struct Ufunction	*rp;
 #if SHOPT_NAMESPACE
 	if(sh.namespace)
diff --git a/src/cmd/ksh93/sh/xec.c b/src/cmd/ksh93/sh/xec.c
index b76bceb1b..45b17e666 100644
--- a/src/cmd/ksh93/sh/xec.c
+++ b/src/cmd/ksh93/sh/xec.c
@@ -998,6 +998,23 @@ int sh_exec(register const Shnode_t *t, int flags)
 						else if(checkopt(com,'a'))
 							flgs |= NV_IARRAY;
 					}
+					/* Prohibit usage of the local builtin outside of functions */
+					if(np==SYSLOCAL && !sh.infunction)
+					{
+						errormsg(SH_DICT,ERROR_exit(1),"%s: can only be used in a function",com0);
+						UNREACHABLE();
+					}
+					/*
+					 * The declare and local builtins have a dynamic scope limited
+					 * to the function in which they are called. This should be
+					 * skipped when not in a function.
+					 * TODO: Should this be moved after the flag checks to handle NV_GLOBAL?
+					 */
+					if((np == SYSLOCAL || (sh.infunction && np == SYSDECLARE)) && sh.st.var_local != sh.var_tree)
+						sh.st.var_local = sh.var_tree;
+					/* The other typeset builtins have a static scope */
+					else if(!(np == SYSLOCAL || np == SYSDECLARE) && sh.st.var_local == sh.var_tree)
+						sh.st.var_local = sh.var_base;
 					if(np && funptr(np)==b_typeset)
 					{
 						/* command calls b_typeset(); treat as a typeset variant */
@@ -2951,7 +2968,7 @@ pid_t sh_fork(int flags, int *jobid)
 /*
  * add exports from previous scope to the new scope
  */
-static void  local_exports(register Namval_t *np, void *data)
+void local_exports(register Namval_t *np, void *data)
 {
 	register Namval_t	*mp;
 	NOT_USED(data);
@@ -3012,7 +3029,7 @@ int sh_funscope(int argn, char *argv[],int(*fun)(void*),void *arg,int execflg)
 	int			isig,jmpval;
 	volatile int		r = 0;
 	int			n;
-	char			save_invoc_local;
+	char			save_invoc_local, infunction, posix_fun = 0;
 	char 			**savsig, *save_debugtrap = 0;
 	struct funenv		*fp = 0;
 	struct checkpt	*buffp = (struct checkpt*)stkalloc(sh.stk,sizeof(struct checkpt));
@@ -3026,7 +3043,6 @@ int sh_funscope(int argn, char *argv[],int(*fun)(void*),void *arg,int execflg)
 	else
 		sh.options = sh.glob_options;
 	*prevscope = sh.st;
-	sh_offoption(SH_ERREXIT);
 	sh.st.prevst = prevscope;
 	sh.st.self = &savst;
 	sh.topscope = (Shscope_t*)sh.st.self;
@@ -3038,7 +3054,16 @@ int sh_funscope(int argn, char *argv[],int(*fun)(void*),void *arg,int execflg)
 		fp = (struct funenv*)arg;
 		sh.st.real_fun = (fp->node)->nvalue.rp;
 		envlist = fp->env;
+		posix_fun = nv_isattr(fp->node,NV_FPOSIX);
+	}
+	infunction = sh.infunction;
+	if(!posix_fun)
+	{
+		sh.infunction = 2;
+		sh_offoption(SH_ERREXIT);
 	}
+	else
+		sh.infunction = 1;
 	prevscope->save_tree = sh.var_tree;
 	n = dtvnext(prevscope->save_tree)!= (sh.namespace?sh.var_base:0);
 	sh_scope(envlist,1);
@@ -3047,50 +3072,59 @@ int sh_funscope(int argn, char *argv[],int(*fun)(void*),void *arg,int execflg)
 		/* eliminate parent scope */
 		nv_scan(prevscope->save_tree, local_exports, NIL(void*), NV_EXPORT, NV_EXPORT|NV_NOSCOPE);
 	}
-	sh.st.save_tree = sh.var_tree;
+	if(posix_fun)
+		/* TODO: Is this even necessary anymore? */
+		sh.st.lineno = ((struct functnod*)nv_funtree(fp->node))->functline;
+	sh.st.save_tree = sh.st.var_local = sh.var_tree;
 	if(!fun)
 	{
-		if(sh_isoption(SH_FUNCTRACE) && is_option(&options,SH_XTRACE) || nv_isattr(fp->node,NV_TAGGED))
-			sh_onoption(SH_XTRACE);
-		else
-			sh_offoption(SH_XTRACE);
+		if(fp->node->nvalue.rp)
+			sh.st.filename = fp->node->nvalue.rp->fname;
+		nv_putval(SH_PATHNAMENOD, sh.st.filename, NV_NOFREE);
+		sh.last_root = nv_dict(DOTSHNOD);
 	}
-	sh.st.cmdname = argv[0];
-	/* save trap table */
-	if((nsig=sh.st.trapmax)>0 || sh.st.trapcom[0])
+	if(!posix_fun)
 	{
-		savsig = sh_malloc(nsig * sizeof(char*));
-		/*
-		 * the data is, usually, modified in code like:
-		 *	tmp = buf[i]; buf[i] = sh_strdup(tmp); free(tmp);
-		 * so sh.st.trapcom needs a "deep copy" to properly save/restore pointers.
-		 */
-		for (isig = 0; isig < nsig; ++isig)
+		if(!fun)
 		{
-			if(sh.st.trapcom[isig] == Empty)
-				savsig[isig] = Empty;
-			else if(sh.st.trapcom[isig])
-				savsig[isig] = sh_strdup(sh.st.trapcom[isig]);
+			if(sh_isoption(SH_FUNCTRACE) && is_option(&options,SH_XTRACE) || nv_isattr(fp->node,NV_TAGGED))
+				sh_onoption(SH_XTRACE);
 			else
-				savsig[isig] = NULL;
+				sh_offoption(SH_XTRACE);
 		}
+		sh.st.cmdname = argv[0];
+		/* save trap table */
+		if((nsig=sh.st.trapmax)>0 || sh.st.trapcom[0])
+		{
+			savsig = sh_malloc(nsig * sizeof(char*));
+			/*
+			 * the data is, usually, modified in code like:
+			 *	tmp = buf[i]; buf[i] = sh_strdup(tmp); free(tmp);
+			 * so sh.st.trapcom needs a "deep copy" to properly save/restore pointers.
+			 */
+			for (isig = 0; isig < nsig; ++isig)
+			{
+				if(sh.st.trapcom[isig] == Empty)
+					savsig[isig] = Empty;
+				else if(sh.st.trapcom[isig])
+					savsig[isig] = sh_strdup(sh.st.trapcom[isig]);
+				else
+					savsig[isig] = NULL;
+			}
+		}
+		if(!fun && sh_isoption(SH_FUNCTRACE) && sh.st.trap[SH_DEBUGTRAP] && *sh.st.trap[SH_DEBUGTRAP])
+			save_debugtrap = sh_strdup(sh.st.trap[SH_DEBUGTRAP]);
+		sh_sigreset(-1);
+		if(save_debugtrap)
+			sh.st.trap[SH_DEBUGTRAP] = save_debugtrap;
 	}
-	if(!fun && sh_isoption(SH_FUNCTRACE) && sh.st.trap[SH_DEBUGTRAP] && *sh.st.trap[SH_DEBUGTRAP])
-		save_debugtrap = sh_strdup(sh.st.trap[SH_DEBUGTRAP]);
-	sh_sigreset(-1);
-	if(save_debugtrap)
-		sh.st.trap[SH_DEBUGTRAP] = save_debugtrap;
 	argsav = sh_argnew(argv,&saveargfor);
 	sh_pushcontext(buffp,SH_JMPFUN);
 	errorpush(&buffp->err,0);
 	error_info.id = argv[0];
 	if(!fun)
 	{
-		if(fp->node->nvalue.rp)
-			sh.st.filename = fp->node->nvalue.rp->fname;
 		sh.st.funname = nv_name(fp->node);
-		sh.last_root = nv_dict(DOTSHNOD);
-		nv_putval(SH_PATHNAMENOD,sh.st.filename,NV_NOFREE);
 		nv_putval(SH_FUNNAMENOD,sh.st.funname,NV_NOFREE);
 	}
 	save_invoc_local = sh.invoc_local;
@@ -3132,7 +3166,8 @@ int sh_funscope(int argn, char *argv[],int(*fun)(void*),void *arg,int execflg)
 	sh.invoc_local = save_invoc_local;
 	sh.fn_depth--;
 	update_sh_level();
-	if(sh.fn_depth==1 && jmpval==SH_JMPERRFN)
+	sh.infunction = infunction;
+	if(!posix_fun && sh.fn_depth==1 && jmpval==SH_JMPERRFN)
 	{
 		errormsg(SH_DICT,ERROR_exit(1),e_toodeep,argv[0]);
 		UNREACHABLE();
@@ -3141,12 +3176,26 @@ int sh_funscope(int argn, char *argv[],int(*fun)(void*),void *arg,int execflg)
 	sh_unscope();
 	sh.namespace = nspace;
 	sh.var_tree = (Dt_t*)prevscope->save_tree;
-	sh_argreset(argsav,saveargfor);
-	trap = sh.st.trapcom[0];
-	sh.st.trapcom[0] = 0;
-	sh_sigreset(1);
+	if(!posix_fun || jmpval!=SH_JMPSCRIPT)
+		sh_argreset(argsav,saveargfor);
+	if(!posix_fun)
+	{
+		trap = sh.st.trapcom[0];
+		sh.st.trapcom[0] = 0;
+		sh_sigreset(1);
+	}
 	sh.st = *prevscope;
 	sh.topscope = (Shscope_t*)prevscope;
+	sh.last_root = last_root;
+	if(posix_fun)
+	{
+		if(sh.st.self != &savst)
+			*sh.st.self = sh.st;
+		nv_putval(SH_PATHNAMENOD,sh.st.filename,NV_NOFREE);
+		if(jmpval && jmpval!=SH_JMPFUN)
+			siglongjmp(*sh.jmplist,jmpval);
+		return(sh.exitval);
+	}
 	nv_getval(sh_scoped(IFSNOD));
 	if(nsig)
 	{
@@ -3156,9 +3205,8 @@ int sh_funscope(int argn, char *argv[],int(*fun)(void*),void *arg,int execflg)
 		memcpy((char*)&sh.st.trapcom[0],savsig,nsig*sizeof(char*));
 		free((void*)savsig);
 	}
-	sh.trapnote=0;
+	sh.trapnote = 0;
 	sh.options = options;
-	sh.last_root = last_root;
 	if(jmpval == SH_JMPSUB)
 		siglongjmp(*sh.jmplist,jmpval);
 	if(trap)
@@ -3181,8 +3229,9 @@ int sh_funscope(int argn, char *argv[],int(*fun)(void*),void *arg,int execflg)
 static void sh_funct(Namval_t *np,int argn, char *argv[],struct argnod *envlist,int execflg)
 {
 	struct funenv fun;
-	char *fname = nv_getval(SH_FUNNAMENOD);
+	char *fname =	nv_getval(SH_FUNNAMENOD);
 	pid_t		pipepid = sh.pipepid;
+	int		loopcnt;
 #if !SHOPT_DEVFD
 	Dt_t		*save_fifo_tree = sh.fifo_tree;
 	sh.fifo_tree = NIL(Dt_t*);
@@ -3192,38 +3241,25 @@ static void sh_funct(Namval_t *np,int argn, char *argv[],struct argnod *envlist,
 	if((struct sh_scoped*)sh.topscope != sh.st.self)
 		sh_setscope(sh.topscope);
 	sh.st.lineno = error_info.line;
-	np->nvalue.rp->running  += 2;
+	np->nvalue.rp->running += 2;
 	if(nv_isattr(np,NV_FPOSIX))
 	{
-		char *save;
-		int loopcnt = sh.st.loopcnt;
-		sh.posix_fun = np;
-		save = argv[-1];
-		argv[-1] = 0;
-		sh.st.funname = nv_name(np);
-		sh.last_root = nv_dict(DOTSHNOD);
-		nv_putval(SH_FUNNAMENOD, nv_name(np),NV_NOFREE);
-		opt_info.index = opt_info.offset = 0;
-		error_info.errors = 0;
+		loopcnt = sh.st.loopcnt;
 		sh.st.loopcnt = 0;
-		b_dot_cmd(argn+1,argv-1,&sh.bltindata);
-		sh.st.loopcnt = loopcnt;
-		argv[-1] = save;
-	}
-	else
-	{
-		fun.env = envlist;
-		fun.node = np;
-		fun.nref = 0;
-		sh_funscope(argn,argv,0,&fun,execflg);
 	}
+	fun.env = envlist;
+	fun.node = np;
+	fun.nref = 0;
+	sh_funscope(argn,argv,0,&fun,execflg);
+	if(nv_isattr(np,NV_FPOSIX))
+		sh.st.loopcnt = loopcnt;
 	sh.last_root = nv_dict(DOTSHNOD);
 	nv_putval(SH_FUNNAMENOD,fname,NV_NOFREE);
 	nv_putval(SH_PATHNAMENOD,sh.st.filename,NV_NOFREE);
 	sh.pipepid = pipepid;
 	if(np->nvalue.rp)
 	{
-		np->nvalue.rp->running  -= 2;
+		np->nvalue.rp->running -= 2;
 		if(np->nvalue.rp->running==1)
 		{
 			np->nvalue.rp->running = 0;
diff --git a/src/cmd/ksh93/tests/builtins.sh b/src/cmd/ksh93/tests/builtins.sh
index 9ab269822..c87878341 100755
--- a/src/cmd/ksh93/tests/builtins.sh
+++ b/src/cmd/ksh93/tests/builtins.sh
@@ -1226,7 +1226,7 @@ function test_usage
 			actual=$({ PATH=${bltin%/*}; "${bltin##*/}" --this-option-does-not-exist; } 2>&1) ;;
 		*/*)	err_exit "strange path name in 'builtin' output: $(printf %q "$bltin")"
 			continue ;;
-		autoload | compound | float | functions | integer | nameref)
+		autoload | compound | float | functions | integer | nameref | local | declare)
 			bltin=typeset ;&
 		*)	expect="Usage: $bltin "
 			actual=$({ "${bltin}" --this-option-does-not-exist; } 2>&1) ;;
diff --git a/src/cmd/ksh93/tests/functions.sh b/src/cmd/ksh93/tests/functions.sh
index c56250f95..d17af8317 100755
--- a/src/cmd/ksh93/tests/functions.sh
+++ b/src/cmd/ksh93/tests/functions.sh
@@ -1306,7 +1306,7 @@ got=$(
 	num=3.25+4.5 f1
 	typeset -p num
 )
-[[ $got == "$exp" ]] || echo 'assignment preceding POSIX function call is not correctly exported or propagated' \
+[[ $got == "$exp" ]] || err_exit 'assignment preceding POSIX function call is not correctly exported or propagated' \
 	"(expected $(printf %q "$exp"), got $(printf %q "$got"))"
 
 # ======
diff --git a/src/cmd/ksh93/tests/local.sh b/src/cmd/ksh93/tests/local.sh
new file mode 100755
index 000000000..b980ec555
--- /dev/null
+++ b/src/cmd/ksh93/tests/local.sh
@@ -0,0 +1,89 @@
+########################################################################
+#                                                                      #
+#              This file is part of the ksh 93u+m package              #
+#          Copyright (c) 2020-2022 Contributors to ksh 93u+m           #
+#                    <https://github.com/ksh93/ksh>                    #
+#                      and is licensed under the                       #
+#                 Eclipse Public License, Version 2.0                  #
+#                                                                      #
+#                A copy of the License is available at                 #
+#      https://www.eclipse.org/org/documents/epl-2.0/EPL-2.0.html      #
+#         (with md5 checksum 84283fa8859daf213bdda5a9f8d1be1d)         #
+#                                                                      #
+#            Johnothan King <johnothanking@protonmail.com>             #
+#                                                                      #
+########################################################################
+
+# These are regression tests for the local and declare builtins.
+# In the cases when local could return an error, it's run using
+# 'command' because it's a special builtin in ksh93v- and ksh2020.
+
+. "${SHTESTS_COMMON:-${0%/*}/_common}"
+
+# ======
+# This test must be run first due to the next test.
+command local 2> /dev/null && err_exit "'local' works outside of functions"
+
+# local shouldn't suddenly work outside of functions after a function runs local.
+posix_dummy() { command local > /dev/null; }
+function ksh_dummy { command local > /dev/null; }
+posix_dummy && command local 2> /dev/null && err_exit 'the local builtin works outside of functions after a POSIX function runs local'
+ksh_dummy && command local 2> /dev/null && err_exit 'the local builtin works outside of functions after a KornShell function runs local'
+
+# ======
+for i in declare local; do
+	unset foo bar
+	# local should work inside both kinds of functions, without reliance on environment variables.
+	function ksh_function_nounset {
+		command $i foo=bar 2>&1
+	}
+	function ksh_function_unset {
+		unset .sh.fun
+		command $i foo=bar 2>&1
+	}
+	posix_function_nounset() {
+		command $i foo=bar 2>&1
+	}
+	posix_function_unset() {
+		unset .sh.fun
+		command $i foo=bar 2>&1
+	}
+	[[ $(ksh_function_nounset) ]] && err_exit "'$i' fails inside of KornShell functions"
+	[[ $(ksh_function_unset) ]] && err_exit "'$i' fails inside of KornShell functions when \${.sh.fun} is unset"
+	[[ $(posix_function_nounset) ]] && err_exit "'$i' fails inside of POSIX functions"
+	[[ $(posix_function_unset) ]] && err_exit "'$i' fails inside of POSIX functions when \${.sh.fun} is unset"
+
+	# The local and declare builtins should have a dynamic scope
+	# Tests for the scope of POSIX functions
+	foo=globalscope
+	subfunc() {
+		[[ $foo == dynscope ]]
+	}
+	mainfunc() {
+		$i foo=dynscope
+		subfunc
+	}
+	mainfunc || err_exit "'$i' is not using a dynamic scope in POSIX functions"
+	# TODO: Implement scoping in POSIX functions
+	[[ $foo == globalscope ]] || err_exit "'$i' changes variables outside of a POSIX function's scope"
+
+	# Tests for the scope of KornShell functions
+	bar=globalscope
+	function subfunc_b {
+		[[ $bar == dynscope ]]
+	}
+	function mainfunc_b {
+		$i bar=dynscope
+		subfunc_b
+	}
+	mainfunc_b || err_exit "'$i' is not using a dynamic scope in KornShell functions"
+	[[ $bar == globalscope ]] || err_exit "'$i' changes variables outside of a KornShell function's scope"
+done
+
+# The declare builtin should work outside of functions
+unset foo
+declare foo=bar
+[[ $foo == bar ]] || err_exit "'declare' doesn't work outside of functions"
+
+# ======
+exit $((Errors<125?Errors:125))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.1 Issue relevant to dev branch (v1.1.*) TODO Things to be done before releasing
Projects
None yet
Development

No branches or pull requests

5 participants