character kind parameter #1

zbeekman · 2019-12-16T19:57:24Z

Shouldn't the constants have an explicit character kind parameter? There's no guarantee that the "DEFAULT" character kind is "ASCII".
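For illustration, a constant with an explicit kind could look like the sketch below. The names `ascii`, `digits`, and `nul` are hypothetical and not taken from the library, and the code assumes the processor supports the ASCII kind (otherwise `selected_char_kind` returns -1 and the declarations fail to compile).

```fortran
program explicit_kind_constants
    implicit none

    ! Kind value for the ASCII character set (assumed to be supported here).
    integer, parameter :: ascii = selected_char_kind('ascii')

    ! Literal constants carry the kind as a prefix: kind_"text".
    character(len=*, kind=ascii), parameter :: digits = ascii_"0123456789"

    ! ACHAR accepts an optional KIND argument, so control characters
    ! can be declared with the same explicit kind.
    character(len=*, kind=ascii), parameter :: nul = achar(0, kind=ascii)

    print '(a,i0)', 'kind of digits: ', kind(digits)
    print '(a,i0)', 'len  of digits: ', len(digits)
    print '(a,i0)', 'code of nul:    ', iachar(nul)
end program explicit_kind_constants
```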

I even wonder if it would make sense to set everything to ISO_10646 characters (UCS-4, i.e. Unicode, basically).

Also, it would be nice to have compile-time polymorphism so that non-ASCII character kinds (i.e. ISO_10646, or DEFAULT when default isn't ASCII) can be queried. ISO_10646 is a superset that includes the ASCII characters, although the bitwise representation at runtime may be different (likely padded with zeros).

The only half-decent way I know to write extensive code with compile-time polymorphism is to use some templating/code-generation approach. I've been using Jin2For for this.

ivan-pi · 2019-12-16T22:55:46Z

Thanks for the comment. This was just a quick port of the functions in the D std.ascii module (https://dlang.org/phobos/std_ascii.html). The same functionality is also in the ctype header file of the C standard library (http://www.cplusplus.com/reference/cctype/).

If I understand correctly, you are suggesting I create several copies of these functions to operate on the following character kinds:

integer, parameter :: default = selected_char_kind('default')
integer, parameter :: ascii = selected_char_kind('ascii')
integer, parameter :: iso = selected_char_kind('iso_10646')

of which only the default set is guaranteed to be supported by a given processor. Moreover, the compiler vendors are not required to support the ASCII and ISO_10646 sets (my ifort 19.0.3 only supports one character kind).
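A quick way to check what a given compiler supports is to print the three kind values. This is only a sketch (the program name is made up); it relies on `selected_char_kind` returning -1 for a character set the processor does not support.

```fortran
program query_char_kinds
    implicit none

    ! -1 means the named character set is not supported by this processor.
    integer, parameter :: default = selected_char_kind('default')
    integer, parameter :: ascii = selected_char_kind('ascii')
    integer, parameter :: iso = selected_char_kind('iso_10646')

    print '(a,i0)', 'default   kind: ', default
    print '(a,i0)', 'ascii     kind: ', ascii
    print '(a,i0)', 'iso_10646 kind: ', iso

    if (ascii < 0) print '(a)', 'ASCII kind is not supported'
    if (iso < 0) print '(a)', 'ISO_10646 kind is not supported'
    if (ascii == default) print '(a)', 'ASCII and DEFAULT are the same kind'
end program query_char_kinds
```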

Indeed with jin2for (similar to what you did with Zstdlib), I could reduce the amount of boilerplate code necessary. Perhaps this should be a separate discussion at https://github.com/fortran-lang/stdlib. I will create a new proposal there.

zbeekman · 2019-12-17T15:50:33Z

> If I understand correctly, you are suggesting I create several copies of these functions to operate on the following character kinds:

Well, not exactly, because, as you noted, they're not guaranteed to exist, and when they do exist, "DEFAULT" is often/usually the same kind as "ASCII", so you can't create overloaded functions with arguments that are "ascii" and "default". (In that case you'd have a duplicate interface.)

That's one nice thing about jin2for: It doesn't assume anything and interrogates the numeric kinds from the compiler to then generate the code. So if only one character kind is supported then your code will only have that one kind. I'll cross post this on the new issue you made.

wclodius2 · 2020-06-19T16:14:48Z

The DEFAULT character kind is guaranteed to contain all the characters of the Fortran character set, which covers all the printable characters of ASCII. That guarantee says nothing about the control codes or the order of the printable characters in the character set. The order dependence for the printable characters can be consistently worked around by using ACHAR and IACHAR.

In practice, the default character set is a mapping to the system's internal character set, which is UTF-8 on Linux, UTF-16 on Windows, and Mac Roman(?) on the Macintosh. All map to ASCII for code points 0:127. Chinese and Japanese computers tend to use national character sets that map to ASCII for 0:127. I don't know if the code set is well defined for Berkeley Unix, but the ones I know use the Latin character sets, which also map to ASCII for code points 0:127. I don't know what they use in India, but I would be very surprised if their character sets didn't also map to ASCII for 0:127.

The only computers I know of that don't map to ASCII in code points 0:127 are those using EBCDIC(?), mostly IBM mainframes. EBCDIC actually comprises a variety of character sets, with the specific active one being context dependent. The XL Fortran compiler, https://www.ibm.com/support/knowledgecenter/SS2MB5_14.1.0/com.ibm.xlf141.bg.doc/language_ref/asciit.html, appears to use an EBCDIC character set with equivalents to all the ASCII control characters.
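The ACHAR/IACHAR workaround can be sketched roughly as follows. The module and function names (`ascii_sketch`, `is_digit`, `to_upper`) are hypothetical and not necessarily those used in the library. The point is only that `iachar` returns the position in the ASCII collating sequence regardless of the processor's native ordering, so comparisons against literal ASCII codes stay portable.

```fortran
module ascii_sketch
    implicit none
    private
    public :: is_digit, to_upper

contains

    ! True if c is one of '0'..'9', tested through its ASCII code.
    pure logical function is_digit(c)
        character(len=1), intent(in) :: c
        integer :: code
        code = iachar(c)                      ! ASCII code, not native code
        is_digit = code >= 48 .and. code <= 57
    end function is_digit

    ! Maps 'a'..'z' to 'A'..'Z'; all other characters are returned unchanged.
    pure function to_upper(c) result(u)
        character(len=1), intent(in) :: c
        character(len=1) :: u
        integer :: code
        code = iachar(c)
        if (code >= 97 .and. code <= 122) then
            u = achar(code - 32)              ! back from ASCII code to character
        else
            u = c
        end if
    end function to_upper

end module ascii_sketch

program demo
    use ascii_sketch, only: is_digit, to_upper
    implicit none
    print *, is_digit('7'), is_digit('x')
    print *, to_upper('q')
end program demo
```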

ivan-pi mentioned this issue Dec 16, 2019

Proposal for ascii fortran-lang/stdlib#11

Open
