Summary
Jawk should implement AWK's strnum / numeric-string semantics. Today, input values and script-created strings appear to use the same runtime representation, so comparisons cannot distinguish an input-derived numeric string from a plain string literal.
Related: #110
Current implementation context
Runtime values are passed around as Object, currently including values such as Long, Double, String, and UninitializedObject.
Relevant methods:
JRT.compare2(Object, Object, boolean) for comparisons
JRT.toDouble(Object) for numeric operations
This is enough for arithmetic conversion, but not enough for AWK comparison semantics, because AWK needs to distinguish three value attributes:
number: numeric value, usually from arithmetic
string: ordinary string, such as a string literal or a string operation result
strnum: string from user/input sources that fully looks like a number
Required comparison rule
For <, <=, >, >=, ==, and !=, AWK chooses string or numeric comparison based on the operand attributes: a pure string operand forces string comparison; otherwise, comparison is numeric.
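The rule above can be sketched in Java as follows. This is only an illustration under stated assumptions: AwkAttr, AwkValue, and compare are hypothetical names that do not exist in Jawk, string comparison uses plain String.compareTo rather than locale collation, and number-to-string conversion is simplified instead of honoring CONVFMT.

```java
// Hypothetical sketch: AwkAttr and AwkValue are illustrative names, not Jawk API.
enum AwkAttr { NUMBER, STRING, STRNUM }

final class AwkValue {
    final AwkAttr attr;
    final String text;   // set for STRING and STRNUM values
    final double number; // set for NUMBER and STRNUM values

    private AwkValue(AwkAttr attr, String text, double number) {
        this.attr = attr;
        this.text = text;
        this.number = number;
    }

    static AwkValue ofNumber(double d) { return new AwkValue(AwkAttr.NUMBER, null, d); }
    static AwkValue ofString(String s) { return new AwkValue(AwkAttr.STRING, s, 0.0); }

    // Assumption: the caller has already verified that s fully looks numeric.
    static AwkValue ofStrNum(String s) {
        return new AwkValue(AwkAttr.STRNUM, s, Double.parseDouble(s.trim()));
    }

    String asString() {
        if (attr != AwkAttr.NUMBER) {
            return text;
        }
        // Simplification: integral values print as integers, others via Double.toString.
        if (number == Math.rint(number) && Math.abs(number) < 1e15) {
            return Long.toString((long) number);
        }
        return Double.toString(number);
    }

    // Returns -1, 0, or 1. A pure STRING operand forces string comparison.
    static int compare(AwkValue a, AwkValue b) {
        if (a.attr == AwkAttr.STRING || b.attr == AwkAttr.STRING) {
            return Integer.signum(a.asString().compareTo(b.asString()));
        }
        return a.number < b.number ? -1 : (a.number > b.number ? 1 : 0);
    }

    public static void main(String[] args) {
        // An input-derived "10" compares numerically against 9 ...
        System.out.println(compare(ofStrNum("10"), ofNumber(9)));
        // ... while the string literal "10" compares as text.
        System.out.println(compare(ofString("10"), ofNumber(9)));
    }
}
```

With this sketch, a strnum "10" is greater than the number 9, while the plain string "10" sorts before "9" because the comparison is textual.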
Important behavior to preserve
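Arithmetic operations should still use permissive numeric-prefix conversion. For example, "3x" + 1 evaluates to 4, because the leading 3 is used as the operand's numeric value.
But comparisons must not simply parse numeric prefixes. An input field containing 3x does not fully look numeric, so it remains a plain string, and comparing it with 3 is a string comparison that does not report equality.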
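The difference between the two conversions can be sketched as two separate helpers. NumericParsingSketch, looksNumeric, and toNumber are hypothetical names chosen for illustration, and the decimal grammar below is an assumption about what "fully looks numeric" means, not the rule Jawk currently applies.

```java
// Hypothetical sketch: these helpers are illustrative and not part of Jawk.
final class NumericParsingSketch {
    // Optional sign, decimal digits with optional fraction, optional exponent.
    private static final String DECIMAL = "[+-]?(\\d+\\.?\\d*|\\.\\d+)([eE][+-]?\\d+)?";

    private static final java.util.regex.Pattern PREFIX =
            java.util.regex.Pattern.compile("^\\s*(" + DECIMAL + ")");

    /** Strict check used for tagging: the whole value, ignoring surrounding blanks, must be a number. */
    static boolean looksNumeric(String s) {
        return s.trim().matches(DECIMAL);
    }

    /** Permissive conversion used for arithmetic: longest numeric prefix, or 0 when there is none. */
    static double toNumber(String s) {
        java.util.regex.Matcher m = PREFIX.matcher(s);
        return m.find() ? Double.parseDouble(m.group(1)) : 0.0;
    }

    public static void main(String[] args) {
        System.out.println(toNumber("3x") + 1);   // arithmetic still uses the leading 3
        System.out.println(looksNumeric("3x"));   // but "3x" is not tagged as numeric
        System.out.println(looksNumeric(" 10 ")); // surrounding blanks are ignored
    }
}
```

Keeping the strict check and the permissive conversion in separate methods makes it harder for comparison code to fall back to prefix parsing by accident.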
Attribute propagation examples
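Assignment should preserve the attribute: after x = $1, x is a strnum whenever $1 is.
String operations should produce plain strings: the concatenation $1 "" is a plain string even when $1 is a strnum.
String literals are plain strings, not strnum: "10" written in the script is a string even though it looks numeric.
Numeric operations produce numeric values: $1 + 0 is a number.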
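The propagation rules in this section can be sketched as follows. PropagationSketch and everything inside it are hypothetical names, and add assumes both operands already hold fully numeric text, so the permissive prefix parsing required for real arithmetic is left out for brevity.

```java
// Hypothetical sketch of attribute propagation; none of these types exist in Jawk.
final class PropagationSketch {
    enum Attr { NUMBER, STRING, STRNUM }

    static final class Value {
        final String text;
        final Attr attr;

        Value(String text, Attr attr) {
            this.text = text;
            this.attr = attr;
        }
    }

    /** A field read from input: strnum only if the whole text looks like a decimal number. */
    static Value fromInput(String field) {
        boolean numeric = field.trim().matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)([eE][+-]?\\d+)?");
        return new Value(field, numeric ? Attr.STRNUM : Attr.STRING);
    }

    /** A string literal in the script is always a plain string. */
    static Value literal(String s) {
        return new Value(s, Attr.STRING);
    }

    /** Assignment passes the value through unchanged, so the attribute travels with it. */
    static Value assign(Value source) {
        return source;
    }

    /** Concatenation is a string operation and always yields a plain string. */
    static Value concat(Value a, Value b) {
        return new Value(a.text + b.text, Attr.STRING);
    }

    /** Addition is a numeric operation and always yields a number (operands assumed numeric). */
    static Value add(Value a, Value b) {
        double sum = Double.parseDouble(a.text.trim()) + Double.parseDouble(b.text.trim());
        return new Value(Double.toString(sum), Attr.NUMBER);
    }

    public static void main(String[] args) {
        Value field = fromInput("10");
        System.out.println(assign(field).attr);
        System.out.println(concat(field, literal("")).attr);
        System.out.println(add(field, fromInput("5")).attr);
    }
}
```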
Hexadecimal note
By default, GNU awk does not treat input such as 0x10 as hexadecimal during ordinary string-to-number conversion, so the value converts to 0. If Jawk later supports strtonum(), that can explicitly parse hexadecimal input.
Long and Integer
While we're working on this, we could stop checking the result of every arithmetic operation to see whether it is actually an integer and returning a Long object instead of a Double. That check can instead be performed only when the number must be converted to a String. This will improve the performance of arithmetic operations. We would then need only three object types: StrNum, Double, and String (plus Uninitialized).
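A minimal sketch of that idea, assuming arithmetic always yields a double and the integer check moves into string conversion. NumberFormatSketch and its methods are hypothetical names, and the non-integral branch uses Double.toString as a stand-in for CONVFMT formatting.

```java
// Hypothetical sketch: defer the "is it an integer?" check until string conversion.
final class NumberFormatSketch {
    // Largest magnitude at which every integer is exactly representable as a double (2^53).
    private static final double MAX_EXACT_INTEGER = (double) (1L << 53);

    /** Arithmetic always returns a double; no per-operation check for integral results. */
    static double add(double a, double b) {
        return a + b;
    }

    /** The integral check runs once, when the number is converted to a string. */
    static String toAwkString(double d) {
        if (d == Math.rint(d) && Math.abs(d) <= MAX_EXACT_INTEGER) {
            return Long.toString((long) d);
        }
        // Simplification: real AWK would format this with CONVFMT.
        return Double.toString(d);
    }

    public static void main(String[] args) {
        System.out.println(toAwkString(add(1, 2)));
        System.out.println(toAwkString(add(0.25, 0.25)));
    }
}
```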
Possible implementation approach
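Consider introducing a dedicated internal representation such as StrNum, or a broader scalar type, so JRT.compare2(...) can distinguish:
plain String
input-derived numeric string / strnum
actual numeric values (Long, Double)
StrNum should preserve the original string, and may optionally cache its numeric value.
Acceptance criteria
Input-derived values that fully look numeric are tagged as strnum.
Input-derived values that do not fully look numeric remain plain strings.
Assignment preserves the attribute.
String operations produce plain strings.
Numeric operations produce numeric values and keep permissive numeric-prefix parsing.
Comparisons use the AWK string / number / strnum matrix.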
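One possible shape for the StrNum representation proposed under "Possible implementation approach" is sketched below. The class name StrNumSketch and its methods are assumptions made for illustration; the sketch expects the caller to have already verified that the text fully looks numeric, and the lazy cache is not thread-safe.

```java
// Hypothetical sketch of a StrNum-like holder; not an existing Jawk class.
final class StrNumSketch {
    private final String text;   // original input text, preserved verbatim
    private Double cachedNumber; // parsed lazily, at most once

    StrNumSketch(String text) {
        this.text = text;
    }

    /** Numeric view of the value, computed on first use and cached afterwards. */
    double doubleValue() {
        if (cachedNumber == null) {
            cachedNumber = Double.parseDouble(text.trim());
        }
        return cachedNumber;
    }

    /** String view of the value: exactly the text that was read from input. */
    @Override
    public String toString() {
        return text;
    }

    public static void main(String[] args) {
        StrNumSketch field = new StrNumSketch(" 1.50 ");
        System.out.println("[" + field + "]");
        System.out.println(field.doubleValue());
    }
}
```

Preserving the original text matters because printing an unmodified input field should reproduce it exactly, including padding and trailing zeros, while comparisons use the cached numeric value.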