update fortran keywords #3656

cx384 · 2023-10-25T14:12:44Z

This fixes #3362.

I added the keywords from Fortran 2018 with the help of a Python script (included below in case this has to be done again in the future).
The current standard and keywords can be found here:
https://j3-fortran.org/doc/year/18/18-007r1.pdf
https://github.com/cdslaborg/FortranKeywords

Script

old_f = open("filetypes.fortran", "r")
old_lines = old_f.read().splitlines()

old_primary = set()
old_intrinsic_functions = set()
old_user_functions = set()
for line in old_lines:
    s = line.split("=")
    if s[0] == "primary":
        for w in s[1].split():
            old_primary.add(w)
    elif s[0] == "intrinsic_functions":
        for w in s[1].split():
            old_intrinsic_functions.add(w)
    elif s[0] == "user_functions":
        for w in s[1].split():
            old_user_functions.add(w)
old_f.close()


f = open("FortranKeywords2018.txt", "r")
lines = f.read().splitlines()

primary = set()
intrinsic_functions = set()
user_functions = set()

types_primary = {"specifier", "statement", "attribute"}
types_intrinsic_functions = {
    "function_elemental",
    "function_transformational",
    "function_inquiry",
    "function_void",
    "subroutine",
    "subroutine_atomic",
    "subroutine_pure",
    "constant",
    "subroutine_collective",
    "type_derived",
    "module_intrinsic",
}
types_user_functions = set()  # empty set; a bare {} would create a dict

old_maping = {
    "primary": set(),
    "intrinsic_functions": set(),
    "user_functions": set(),
    "unspecified": set(),
}

for line in lines:
    s = line.split(",")
    ws = s[0].replace("(", " ").replace(")", " ").replace(".", " ").split()

    # check old mapping: record which old list each category's keywords were in
    for w in ws:
        wl = w.lower()
        if wl in old_intrinsic_functions and wl not in old_primary:
            old_maping["intrinsic_functions"].add(s[1])
        elif wl in old_primary and wl not in old_intrinsic_functions:
            old_maping["primary"].add(s[1])
        # elif wl in old_user_functions:
            # old_maping["user_functions"].add(s[1])
        old_maping["unspecified"].add(s[1])

    old_maping["unspecified"] = (
        old_maping["unspecified"]
        - old_maping["primary"]
        - old_maping["intrinsic_functions"]
        - old_maping["user_functions"]
    )

    if s[1] in types_primary:
        for w in ws:
            primary.add(w.lower())
    elif s[1] in types_intrinsic_functions:
        for w in ws:
            intrinsic_functions.add(w.lower())
    elif s[1] in types_user_functions:
        for w in ws:
            user_functions.add(w.lower())
    else:
        print("not added:", s)
        if s[0] in old_primary:
            print("old_primary")
        # the second check must test the intrinsic list, not primary again
        if s[0] in old_intrinsic_functions:
            print("old_intrinsic_functions")
f.close()

old_dubs = old_intrinsic_functions.intersection(old_primary)
wrong = (
    primary.intersection(old_intrinsic_functions).union(
        intrinsic_functions.intersection(old_primary)
    )
    - old_dubs
)
print("wrong:", wrong)
# print("old duplicates:", old_dubs)
# dubs = primary.intersection(intrinsic_functions)-old_dubs
# print("dubs:",' '.join(dubs))

print("\nmissing primary:", " ".join(primary - old_primary))
print(
    "\nmissing intrinsic_functions:",
    " ".join(intrinsic_functions - old_intrinsic_functions),
)

print("\napproximate old mapping:", old_maping)

# print("legacy:")
# print("primary:", old_primary-primary)
# print("intrinsic_functions:", old_intrinsic_functions-intrinsic_functions)

Some keywords end up in both primary and intrinsic_functions, but this was already the case before. I mapped the keyword categories (from the standard) to "primary" and "intrinsic_functions" the same way as before, and assigned new categories such as "module_intrinsic" to the best-fitting list.
As far as I can tell, the result is correct and all keywords are handled. I also didn't remove any of the existing keywords, to keep compatibility with older Fortran versions.

Category Mapping

primary = {"specifier", "statement", "attribute"}
intrinsic_functions = {
    "function_elemental",
    "function_transformational",
    "function_inquiry",
    "function_void",
    "subroutine",
    "subroutine_atomic",
    "subroutine_pure",
    "constant",
    "subroutine_collective",
    "type_derived",
    "module_intrinsic",
}
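
For anyone repeating this later: the script above only reports what is missing, so the final list lines still have to be assembled. A minimal sketch of that step, assuming the `name=word word ...` layout that the script parses. The helper `merge_keyword_line` and the sample words are hypothetical and not part of the script above.

```python
def merge_keyword_line(name, old_words, new_words):
    """Return a 'name=word word ...' line holding the union of both sets.

    Old words are kept so that keywords from earlier Fortran versions
    are not dropped; everything is lowercased, as in the script above.
    """
    merged = sorted({w.lower() for w in old_words} | {w.lower() for w in new_words})
    return f"{name}={' '.join(merged)}"


# Illustrative sample data only, not the real keyword lists.
old_primary = {"access", "action", "allocatable"}
new_primary = {"ALLOCATABLE", "asynchronous", "codimension"}

line = merge_keyword_line("primary", old_primary, new_primary)
print(line)
```

Taking the union rather than replacing the list is what keeps the older keywords in place.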

waynelapierre · 2024-01-10T15:31:39Z

When will this PR be merged?

elextr · 2024-01-10T22:36:51Z

"Somebody" who knows Fortran should check it, but none of the devs use Fortran AFAIK so its waiting for any contributor.

gnikit

Things look fine, albeit a bit hard to read.

elextr · 2024-02-19T02:41:25Z

@gnikit thanks.

I just noticed a comment on the OP that some names are in more than one list.

That won't break anything, but it will have performance effects. The lexer has to search all lists for every identifier it finds (all your variable names, function names, etc., as well as keywords; it doesn't know they are keywords until it finds them in a list). Duplicating names makes the lists bigger, which slows the lexer down when searching them, especially for identifiers that are not in any list (it does a linear search of the names with the same start character in each list).
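
To make the cost concrete, here is a toy model of the lookup as described in this comment. It is not the actual lexer code: the `WordList` class, the `classify` helper and the sample words are all made up for illustration.

```python
from collections import defaultdict


class WordList:
    """Toy keyword list: words bucketed by first character, scanned linearly."""

    def __init__(self, words):
        self.buckets = defaultdict(list)
        for w in sorted(set(words)):
            self.buckets[w[0]].append(w)

    def lookup(self, ident):
        """Return (found, comparisons) for one identifier."""
        comparisons = 0
        for w in self.buckets.get(ident[0], []):
            comparisons += 1
            if w == ident:
                return True, comparisons
        return False, comparisons


def classify(ident, lists):
    """Search each list in order; return (list name or None, total comparisons)."""
    total = 0
    for name, wl in lists:
        found, n = wl.lookup(ident)
        total += n
        if found:
            return name, total
    return None, total


primary = WordList(["abstract", "allocatable", "allocate", "asynchronous"])
intrinsic_dup = WordList(["abs", "acos", "allocate", "asynchronous"])
intrinsic_dedup = WordList(["abs", "acos"])

# "area" is an ordinary variable name: it misses in every list,
# so it pays for the whole "a" bucket of each list.
with_dups = classify("area", [("primary", primary), ("intrinsic_functions", intrinsic_dup)])
without_dups = classify("area", [("primary", primary), ("intrinsic_functions", intrinsic_dedup)])
print(with_dups, without_dups)
```

In this model, the miss costs more comparisons when the second list repeats names that are already in the first, which is the effect described above.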

Maybe somebody might want to "optimise" it; it won't hurt to delay for a bit since there is no release on the horizon.
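
One possible shape for that optimisation, as a sketch only: keep each word in the first list that contains it and drop it from later lists. This assumes the lexer styles a word according to the first list in which it is found, so that dropping later copies does not change highlighting; that assumption would need checking against the lexer. `drop_later_duplicates` and the sample words are hypothetical.

```python
def drop_later_duplicates(lists):
    """Keep each word only in the first list that contains it.

    `lists` is an ordered sequence of (name, words) pairs; the order is
    assumed to match the order in which the lexer consults the lists.
    """
    seen = set()
    result = []
    for name, words in lists:
        unique = sorted(set(words) - seen)
        seen.update(unique)
        result.append((name, unique))
    return result


# Illustrative sample data only.
lists = [
    ("primary", ["allocate", "asynchronous", "contiguous"]),
    ("intrinsic_functions", ["abs", "allocate", "asynchronous"]),
]
deduped = drop_later_duplicates(lists)
for name, words in deduped:
    print(f"{name}={' '.join(words)}")
```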

gnikit · 2024-02-19T08:58:42Z

@elextr Truth be told, a lot of these intrinsic variables/methods should be conditional on module imports, precisely to relieve pressure on the lexer/parser. In the fortls language server, we start parsing for certain tokens only if the corresponding modules are USEd. Our serialised data looks somewhat different from yours. In the simple grammar rules, I think we don't parse some of these variables/methods at all.
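
A rough sketch of the idea, to show what "conditional on module imports" could mean in practice. This is not how fortls implements it: the `MODULE_NAMES` table, the regular expression and `active_names` are assumptions made for illustration, and the regex only handles simple `use` statements.

```python
import re

# Hypothetical table: names that are only recognised after `use <module>`.
MODULE_NAMES = {
    "iso_c_binding": {"c_int", "c_double", "c_ptr"},
    "iso_fortran_env": {"int32", "real64", "output_unit"},
}

# Matches e.g. "use iso_c_binding" and "use, intrinsic :: iso_fortran_env".
USE_RE = re.compile(
    r"^\s*use\b\s*(?:,\s*intrinsic\s*)?(?:::)?\s*(\w+)",
    re.IGNORECASE | re.MULTILINE,
)


def active_names(source):
    """Return the module-specific names enabled by the USE statements in source."""
    names = set()
    for module in USE_RE.findall(source):
        names |= MODULE_NAMES.get(module.lower(), set())
    return names


example = """program p
  use, intrinsic :: iso_c_binding
  implicit none
end program p
"""
print(sorted(active_names(example)))
```

Names from modules that are never imported are left out, so they do not have to be searched at all.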

elextr · 2024-02-19T10:25:12Z

> these intrinsic variables/methods should be conditional to module imports,

The code doing the highlighting is called a lexer because it is just that: pure syntax. No semantics is available, so there is no knowledge of what is imported. It's just what's in the lists.

There are some experiments with various LSPs (C/C++, Go, Python; not Fortran, remember what I said above about no devs using it) which replace the lexers for some things, but not for basic syntax. That work isn't merged yet, and it doesn't replace the lexers.

gnikit · 2024-02-23T20:54:39Z

Completely understandable, @elextr, about the lexer. We have a GSoC project this year that aims to add highlighting support to our language server, so fingers crossed Fortran can soon join the experiment.

eht16 · 2024-04-21T12:30:08Z

Don't we want to merge this anyway? Maybe the performance impact is tolerable and then it is still an improvement for Fortran users.

elextr · 2024-04-21T13:29:54Z

@eht16 sure, doesn't break anything if nobody wants to fix it.

update fortran keywords

6f72e72

techee added the filetype label Oct 28, 2023

waynelapierre mentioned this pull request Feb 18, 2024

update keywords for Geany fortran-lang/vscode-fortran-support#1058

Closed

2 tasks

gnikit approved these changes Feb 18, 2024

eht16 merged commit 2c42615 into geany:master Apr 21, 2024

b4n added this to the 2.1 milestone Apr 21, 2024
