ksh93: When "read -m json" is used to read a single-line JSON object, text fields following the first numeric field are re-interpreted as numeric variable names #39

Closed
zakukai opened this issue May 22, 2017 · 3 comments
zakukai commented May 22, 2017

I realize the JSON functionality in Korn Shell isn't fully mature yet, but I built myself a shell from the beta branch on git (Version ABIJM 93v- 2014-12-24) to try it out. I found that when I feed "read -m json" a JSON object with no newlines in it, a numeric field causes all following fields to come out wrong:

$ foo=9
$ read -m json json_test <<<$'{ "squanchy" : "cromulent", "num" : 1, "text": "foo", "notbool": "true", "bool": true }'
$ print -j json_test
{
	"bool": 0,
	"notbool": 0,
	"num": 1,
	"squanchy": "cromulent",
	"text": 9
}
# "squanchy" comes through just fine, because it precedes "num".
# All variables following "num" : 1 were replaced with numeric variable lookup, as if they'd appeared in $(())
$ echo "$json_test"
(
	typeset -l -E bool=0
	typeset -l -E notbool=0
	typeset -l -E num=1
	squanchy=cromulent
	typeset -l -E text=9
)
# typeset -E is a floating-point numeric type. "bool", "notbool", and "text" have all inherited this type from "num"
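The values in that output match what plain shell arithmetic would produce if the strings were treated as variable names: a bare name inside $(( )) evaluates to that variable's value, and an unset name evaluates to 0. A minimal sketch in portable shell follows. It only illustrates the arithmetic lookup and is not the ksh JSON code path itself; the name `true` is assumed to be unset as a variable.

```shell
# Illustration only: arithmetic evaluation of bare names.
foo=9
unset -v true            # make sure no variable named "true" exists

text=$((foo))            # "foo" is looked up as a variable
notbool=$((true))        # unset name evaluates to 0

echo "text=$text notbool=$notbool"
# prints: text=9 notbool=0
```

This lines up with "text": 9 and "notbool": 0 above, where foo=9 had been set before the read.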

However, if I insert newlines into the JSON string, this doesn't happen:

$ read -m json json_test <<<$'{ "squanchy" : "cromulent", "num" : 1,\n "text": "bar", "notbool": "true", "bool": true }'
$ print -j json_test
{
	"bool": true,
	"notbool": "true",
	"num": 1,
	"squanchy": "cromulent",
	"text": "bar"
}
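Since the newline version parses correctly in this one test, a possible stopgap is to break the input after each comma before handing it to read. The sketch below is a hypothetical helper (`json_split_members` is not part of ksh) written in portable shell with awk. It assumes that a line break after every comma outside a string literal is enough to avoid the problem, which has only been observed for the example above.

```shell
# json_split_members (hypothetical helper): copy JSON from stdin to stdout,
# adding a newline after every comma that is not inside a string literal.
json_split_members() {
    awk '
    {
        out = ""
        in_str = 0      # 1 while inside a "..." string
        esc = 0         # 1 right after a backslash inside a string
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            out = out c
            if (in_str) {
                if (esc) esc = 0
                else if (c == "\\") esc = 1
                else if (c == "\"") in_str = 0
            } else if (c == "\"") {
                in_str = 1
            } else if (c == ",") {
                out = out "\n"
            }
        }
        print out
    }'
}

printf '%s\n' '{ "a" : "x,y", "n" : 1, "t": "foo" }' | json_split_members
```

In ksh the last element of a pipeline runs in the current shell, so something like `json_split_members <<<"$raw" | read -m json json_test` should leave json_test set, though that has not been checked against the beta build.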

I haven't looked at the source code for the JSON support yet. It does appear to be in pretty rough shape overall... When I get some time I'll see what I can do with it.

@zakukai zakukai changed the title When "read -m json" is used to read a single-line JSON object, text fields following the first numeric field are re-interpreted as numeric variable names ksh93: When "read -m json" is used to read a single-line JSON object, text fields following the first numeric field are re-interpreted as numeric variable names May 22, 2017

kshji commented Jul 5, 2018

Version AJ 93v-1203-g8d977fd3

There is more strange behavior in read -m json. Run the same read twice and the results differ. The first read works almost every time, but the second one fails with a read error. zakukai's foo problem is also interesting.

Unsetting the JSON variable does not help.

Does nvtree.c have a parsing problem?

print "========================================================="
read -m json some <<EOF
{
        "squanchy" : "cromulent",
        "num" : 1,
        "text": "foo",
        "notbool": "true",
        "bool": true ,
        "endval" : "yes"
}
EOF

print "-------------------------------"
print -j some
print "-------------------------------"
print $some
print "-------------------------------"
printf "%(json)B\n" some
print "-------------------------------"
printf "%(json)q\n" $some
print "-------------------------------"
printf "%#(csv)q\n" $some
print "-------------------------------"
printf "%(csv)B\n" some

print "========================================================="
read -m json some <<EOF
{
        "squanchy" : "cromulent",
        "num" : 1,
        "text": "foo",
        "notbool": "true",
        "bool": true ,
        "endval" : "yes"
}
EOF

print "-------------------------------"
print -j some
print "-------------------------------"
print $some
print "-------------------------------"
printf "%(json)B\n" some
print "-------------------------------"
printf "%(json)q\n" $some
print "-------------------------------"
printf "%#(csv)q\n" $some
print "-------------------------------"
printf "%(csv)B\n" some

And the output is:

=========================================================
-------------------------------
{
        "bool": true,
        "endval": "yes",
        "notbool": "true",
        "num": 1,
        "squanchy": "cromulent",
        "text": "foo"
}
-------------------------------
( _Bool bool=true endval=yes notbool=true typeset -l -E num=1 squanchy=cromulent text=foo )
-------------------------------
{
        "bool": true,
        "endval": "yes",
        "notbool": "true",
        "num": 1,
        "squanchy": "cromulent",
        "text": "foo"
}
-------------------------------
'('
_Bool
bool=true
endval=yes
notbool=true
typeset
-l
-E
num=1
squanchy=cromulent
text=foo
')'
-------------------------------
"("
_Bool
"bool=true"
"endval=yes"
"notbool=true"
typeset
"-l"
"-E"
"num=1"
"squanchy=cromulent"
"text=foo"
")"
-------------------------------
(
        _Bool bool=true
        endval=yes
        notbool=true
        typeset -l -E num=1
        squanchy=cromulent
        text=foo
)
=========================================================
json5.sh[31]: read: line 48: ": not found
-------------------------------
{
}
-------------------------------
( )
-------------------------------
{
}
-------------------------------
'('
')'
-------------------------------
"("
")"
-------------------------------
(
)
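The "same input, different result on the second run" pattern can be checked mechanically instead of by eye. The sketch below is a hypothetical harness (`compare_runs` is an invented name) in portable shell. It runs a command twice in the current shell process, so any state left behind by the first run is still there for the second, and reports whether the two outputs differ. It assumes `mktemp`, `cmp` and `diff` are available.

```shell
# compare_runs CMD [ARGS...] (hypothetical helper)
# Runs CMD twice in the *current* shell (no subshell), capturing stdout and
# stderr of each run. Returns 0 if both runs produced identical output,
# 1 (and prints a diff) if they differ, 2 if temp files cannot be created.
compare_runs() {
    first=$(mktemp) || return 2
    second=$(mktemp) || { rm -f "$first"; return 2; }
    "$@" >"$first" 2>&1
    "$@" >"$second" 2>&1
    if cmp -s "$first" "$second"; then
        status=0
    else
        status=1
        diff "$first" "$second"
    fi
    rm -f "$first" "$second"
    return "$status"
}

# Demo with a command whose output depends on leftover state.
counter=0
bump() { counter=$((counter + 1)); echo "run $counter"; }

compare_runs echo same && echo "echo: identical"
compare_runs bump >/dev/null || echo "bump: outputs differ"
```

To apply it to the report above, the read -m json block and the print lines would go into one function, which is then passed to compare_runs.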

Previously the next example didn't work, but in this version it works fine:

read -m json person <<EOF
{
    "first" : "My",
    "last" : "Name",
    "email" : "My.Name@gmail.com",
    "lucky" : 13,
    "quarter" : 0.25,
    "empty" : null,
    "nerd" : true,
    "lotto" : [ 9, 12, 17, 38, 45, 46 ],
    "children" : [ "boy", "girl" ]
}
EOF



print "-------------------------------"
print "$person"
print "-------------------------------"
print -r "$person"
print "-------------------------------"
print ${person.email}
print "-------------------------------"
print ${person.lotto[*]}
print "-------------------------------"
print -j person
print "-------------------------------"

unset person

compound person
compound person=(firstname="John" lastname="Some" age=32)
print -j person
print "-------------------------------"
printf "%(json)B\n" person
print "-------------------------------"

But if you read the same lines twice, the second read gives an error ...

I have tested ksh builds on:

  • Win10 WSL Ubuntu 18.04
  • Debian 9

Same problem on both.

@siteshwar siteshwar added the bug label Sep 5, 2018
krader1961 (Contributor) commented

> I realize the JSON functionality in Korn Shell isn't fully mature yet, ....

Not only is native JSON support in ksh immature, it also has significant bugs. See issue #820. My recommendation is to remove the JSON-related code. At the moment, by necessity, the focus is on a ksh release that could replace ksh93u+ from 2012 but that is much easier to maintain.

siteshwar (Contributor) commented

Since the current implementation of JSON support is broken, we are going to disable it in the next release. We may reintroduce it in the future, but not without solid testing. Closing this issue as it no longer serves any purpose.

citrus-it pushed a commit to citrus-it/ast that referenced this issue Apr 15, 2021
.github/workflows/ci.yml:
- Disable Mac build as the GitHub runners appear to be broken
  (e.g. SIGCHLD fails, unlike on real Macs) and tend to hang.
- For the Linux build:
  - Set GMT timezone for 'printf %T' tests in builtins.sh.
  - Set the ulimit for open files to 1024 as the subshell.sh tests
    need a lot of open files.
  - As the runners lack the POSIX standard /dev/tty device, use the
    script command to provide a fake /dev/tty for the bracket.sh
    tests that use 'test -t $fd'.
    Ref.: actions/runner#241
citrus-it pushed a commit to citrus-it/ast that referenced this issue Apr 15, 2021
This applies a number of fixes to the printf formatting directives
%H and %#H (as well as their equivalents %(html)q and %(url)q):
1. Both formatters have been made multibyte/UTF-8 aware, and no
   longer delete multibyte characters. Invalid UTF-8 byte sequences
   are rendered as ASCII question marks.
2. %H no longer wrongly encodes spaces as non-breaking spaces
   (&nbsp;) and instead correctly encodes the UTF-8 non-breaking
   space as such.
3. %H now converts the single quote (') to '&#39;' instead of
   '&apos;' which is not a valid entity in all HTML versions.
4. %#H failed to encode some reserved characters (e.g. '?') while
   encoding some unreserved ones (e.g. '~'). It now percent-encodes
   all characters except those 'unreserved' as per RFC3986 (ASCII
   alphanumeric plus -._~).
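
The rule in item 4 (percent-encode every byte except the RFC 3986 unreserved set) can be sketched in portable shell. This is an illustration of the rule only, not the fmthtml() code from the commit; `urlencode` is a hypothetical helper name, and the sketch works byte by byte via `od`, so multibyte characters come out as one %XX per byte.

```shell
# urlencode STRING (hypothetical helper)
# Percent-encodes every byte of STRING except the RFC 3986 "unreserved"
# characters: ASCII letters, digits, and - . _ ~
urlencode() {
    printf '%s' "$1" | od -An -v -tx1 | tr -s ' ' '\n' |
    while read -r hex; do
        [ -n "$hex" ] || continue
        case $hex in
            # 0-9, A-Z, a-z, then - . _ ~ (as hex byte values)
            3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a]|2d|2e|5f|7e)
                # unreserved: emit the byte itself via an octal escape
                printf "\\$(printf '%03o' "0x$hex")" ;;
            *)
                # everything else: emit %XX with uppercase hex digits
                printf '%%%s' "$(printf '%s' "$hex" | tr 'a-f' 'A-F')" ;;
        esac
    done
}

urlencode 'a b?~'; echo
# prints: a%20b%3F~
```

Here '?' is encoded and '~' is left alone, which is the behavior item 4 describes as the fix.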

Prior discussion:
https://groups.google.com/d/msgid/korn-shell/ce8d1467-4a6d-883b-45ad-fc3c7b90e681%40inlv.org

src/cmd/ksh93/include/defs.h:
src/cmd/ksh93/sh/string.c:
- defs.h: If compiling without SHOPT_MULTIBYTE, redefine the
  mbwide() macro (which tests if we're in a multibyte locale) as 0.
  This lets the compiler optimiser do the work that would otherwise
  require a lot of tedious '#if SHOPT_MULTIBYTE' directives.
- string.c: Remove some now-unneeded '#if SHOPT_MULTIBYTE' stuff.
- defs.h, string.c: Rename is_invisible() to sh_isprint(), invert
  the boolean return value, and make it an extern for use in
  fmthtml() -- see below. If compiling without SHOPT_MULTIBYTE,
  simply #define sh_isprint() as equivalent to isprint(3).
- defs.h: Add URI_RFC3986_UNRESERVED macro for fmthtml() containing
  the characters "unreserved" for purposes of URI percent-encoding.

src/cmd/ksh93/bltins/print.c: fmthtml():
- Remove kludge that skipped all multibyte characters (!).
- Complete rewrite to implement fixes described above.
- Don't bother with '#if SHOPT_MULTIBYTE' directives (see above).

src/cmd/ksh93/data/builtins.c:
- sh_optprintf[]: %H: Add single quote to encoded chars doc.
- Edit credits and bump version date.

src/cmd/ksh93/tests/builtins.sh:
- Update and tweak old regression tests.
- Add a number of new tests for UTF-8 HTML and URI encoding, which
  are only run when running tests in a UTF-8 locale (shtests -u).