Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generator errors with struct in union #307

Closed
thomasbrandon opened this issue Jul 2, 2021 · 10 comments · Fixed by #342
Closed

Generator errors with struct in union #307

thomasbrandon opened this issue Jul 2, 2021 · 10 comments · Fixed by #342
Labels

Comments

@thomasbrandon
Copy link

I'm trying to generate code for a library that uses pthread which includes (simplified from pthreadtypes.h):

typedef struct __pthread_internal_list
{
  struct __pthread_internal_list *__prev;
} __pthread_list_t;

typedef union
{
  struct __pthread_mutex_s
  {
    __pthread_list_t __list;
  } __data;
  char __size[__SIZEOF_PTHREAD_MUTEX_T];
} pthread_mutex_t;

This results in an error:

ERROR: LoadError: There is no definition for __pthread_mutex_s's field: __list's type: [`__pthread_list_t`] at test_sys.h:12:10
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] resolve_dependency!(dag::ExprDAG, node::ExprNode{Clang.Generators.StructDefinition, Clang.CLStructDecl})
   @ Clang.Generators ~/.julia/packages/Clang/YdMM7/src/generator/resolve_deps.jl:162
...

Debugging it seems CollectDependentSystemNode adds __pthread_mutex_t (and the anonymous struct) to the dag.nodes but does not add __pthread_mutex_s as it's an AnonymousUnion which is ignored. Then in CollectNestedRecord it processes __pthread_mutex_t and adds __pthread_mutex_s to dag.nodes but not __pthread_list_t . Then resolve_dependency! tries to resolve fields of __pthread_mutex_s and fails.

Not suite sure how to properly fix this as I don't fully understand implications of various approaches.

Minimal reproduction at https://gist.github.com/thomasbrandon/32001bcb6b8275a6afd7d5782c7162ea

@Gnimuc Gnimuc added the bug label Jul 2, 2021
@Gnimuc
Copy link
Member

Gnimuc commented Jul 2, 2021

I guess this is a bug that may be related to handling nested tags for system nodes. I don't have time to look into this at the moment, does manually adding the definition via @add_def __pthread_mutex_s workaround the issue?

@Gnimuc
Copy link
Member

Gnimuc commented Jul 2, 2021

It's @add_def __pthread_list_t instead of __pthread_mutex_s as the error suggested.

@thomasbrandon
Copy link
Author

Thanks for the help.

Yes, @add_def does seem to work (haven't compiled and run but generated code looks OK).

I can have a look but I wasn't entirely sure whether it should:

  1. Support AnonymousUnion in CollectDependentSystemNode
  2. Not visit it in CollectNestedRecord

Looking at generated code it seems like (1) is likely appropriate as it does generate a getproperty accessor for the __pthread_mutex_s. Though I don't think that's needed here, and may not be appropriate as I think it's intended to be opaque, this may be needed elsewhere,

@Gnimuc
Copy link
Member

Gnimuc commented Jul 2, 2021

@add_def always has a higher priority so it can be used for skipping certain nodes and users can manually add definitions for those in the patch file prologue.jl.

The error complains about missing definition for __pthread_list_t. __pthread_list_t is an identifier in the system header, somehow CollectDependentSystemNode failed to collect it.

@Gnimuc Gnimuc added this to the release-0.14.0 milestone Jul 19, 2021
@melonedo
Copy link
Contributor

melonedo commented Aug 29, 2021

Supporting collecting anonymous union alone won't solve this.
The problem is that C allows declaring struct/union/enum almost everywhere in a file, for example

struct s* p = NULL; // tag naming an unknown struct declares it
struct s { int a; }; // definition for the struct pointed to by p
void g(void)
{
    struct s; // forward declaration of a new, local struct s
              // this hides global struct s until the end of this block
    struct s *p;  // pointer to local struct s
                  // without the forward declaration above,
                  // this would point at the file-scope s
    struct s { char* p; }; // definitions of the local struct s
}

Currently Clang assumes all typedef expressions to be top-level (so CollectNestedRecord stops at typedef border __pthread_list_t), and all struct/union/enum declarations in system headers to be top-level(CollectDependentSystemNode does not add new system nodes, only top-level ones can be reached), which is not true.

As a result, I think collecting all declarations in system headers before collecting system-dependent nodes might help.

@Gnimuc
Copy link
Member

Gnimuc commented Aug 29, 2021

void g(void)
{
    struct s; // forward declaration of a new, local struct s
              // this hides global struct s until the end of this block
    struct s *p;  // pointer to local struct s
                  // without the forward declaration above,
                  // this would point at the file-scope s
    struct s { char* p; }; // definitions of the local struct s
}

we don't even parse function body, so this will not be a problem.

Currently Clang assumes all typedef expressions to be top-level (so CollectNestedRecord stops at typedef border __pthread_list_t), and all struct/union/enum declarations in system headers to be top-level(CollectDependentSystemNode does not add new system nodes, only top-level ones can be reached), which is not true.

to be clear, the reason is that non-top-level system decls are not collected in the CollectDependentSystemNode pass, but they are collected and inserted to the DAG later in the CollectNestedRecord pass, so if those decls depend on certain top-level decls in the system header that are collected in the CollectDependentSystemNode pass, then it yields this issue.

I guess a simple fix might be:

function collect_dependent_system_nodes!(dag::ExprDAG, node::ExprNode{<:AbstractUnionNodeType}, system_nodes)
    cursor = node.cursor
    for c in fields(getCursorType(cursor))
        ty = getCursorType(c)
        jlty = tojulia(ty)
        leaf_ty = get_jl_leaf_type(jlty)

        is_jl_basic(leaf_ty) && continue

        hasref = has_elaborated_reference(ty)
        if (hasref && haskey(dag.tags, leaf_ty.sym)) ||
            (!hasref && haskey(dag.ids, leaf_ty.sym)) ||
            haskey(dag.ids_extra, leaf_ty.sym) ||
            occursin("anonymous", spelling(ty))
            # pass
        else
            # add all sys nodes with the same id
            has_non_top_level = true
            for (i, n) in enumerate(dag.sys)
                if n.id == leaf_ty.sym
                    has_non_top_level = false
                    system_nodes[n] = i
                end
            end
            ndef_cursor = get_elaborated_cursor(getCursorType(c))
            if has_non_top_level && isCursorDefinition(ndef_cursor)
                ty = ndef_cursor isa CLStructDecl ? StructDefinition() : UnionDefinition()
                ndef_node = ExprNode(leaf_ty.sym, ty, ndef_cursor, Expr[], Int[])
                push!(dag.sys, ndef_node)
                system_nodes[ndef_node] = lastindex(dag.sys)
            end
        end
    end
    return dag
end

@melonedo
Copy link
Contributor

This looks good. But tracing it I find even stranger situation:

typedef struct
{
    struct {
        pthread_mutex_t lock;
    } s;
} test_t;

Here pthread_mutex_t is not included because it is enclosed by an anonymous struct. I did the following:

function collect_dependent_system_nodes!(dag::ExprDAG, node::ExprNode{<:AbstractTagType}, system_nodes)
    cursor = node.cursor
    for c in fields(getCursorType(cursor))
        ty = getCursorType(c)
        jlty = tojulia(ty)
        leaf_ty = get_jl_leaf_type(jlty)

        is_jl_basic(leaf_ty) && continue

        hasref = has_elaborated_reference(ty)
        if (hasref && haskey(dag.tags, leaf_ty.sym)) ||
            (!hasref && haskey(dag.ids, leaf_ty.sym)) ||
            haskey(dag.ids_extra, leaf_ty.sym)
            # pass (anonymous tags now are not skipped)
        elseif occursin("anonymous", spelling(ty))
            # do not add this tag, but still need to trace through it
            ty = UnionDuplicated() # placeholder
            ndef_node = ExprNode(leaf_ty.sym, ty, ndef_cursor, Expr[], Int[])
            collect_dependent_system_nodes!(dag, ndef_node, system_nodes)
        else
            # as suggested above
        end
    end
    return dag
end

@thomasbrandon
Copy link
Author

As a workaround I did:

function collect_dependent_system_nodes!(dag::ExprDAG, node::ExprNode{<:Union{AbstractStructNodeType,AbstractUnionNodeType}}, system_nodes)
    cursor = node.cursor
    for c in fields(getCursorType(cursor))
        ty = getCursorType(c)
        jlty = tojulia(ty)
        leaf_ty = get_jl_leaf_type(jlty)

        is_jl_basic(leaf_ty) && continue

        hasref = has_elaborated_reference(ty)
        if (hasref && haskey(dag.tags, leaf_ty.sym)) ||
            (!hasref && haskey(dag.ids, leaf_ty.sym)) ||
            haskey(dag.ids_extra, leaf_ty.sym) ||
            occursin("anonymous", spelling(ty))
            # pass
        else
            # add all sys nodes with the same id
            for (i, n) in enumerate(dag.sys)
                if n.id == leaf_ty.sym
                    system_nodes[n] = i
                end
            end
            # Nested structs/unions are not dealt with above as they aren't in dag.sys.
            # They will be added in the CollectNestedRecord pass but with unresolved
            # system types. So just descend through them and add referenced system types.
            if is_elaborated(ty)
                collect_nested_dependent_system_nodes!(dag, ty, system_nodes)
            end
        end
    end
    return dag
end

function collect_nested_dependent_system_nodes!(dag::ExprDAG, ty::CLElaborated, system_nodes)
    for c in fields(ty)
        ty = getCursorType(c)
        jlty = tojulia(ty)
        leaf_ty = get_jl_leaf_type(jlty)

        is_jl_basic(leaf_ty) && continue

        hasref = has_elaborated_reference(ty)
        if (hasref && haskey(dag.tags, leaf_ty.sym)) ||
            (!hasref && haskey(dag.ids, leaf_ty.sym)) ||
            haskey(dag.ids_extra, leaf_ty.sym) ||
            occursin("anonymous", spelling(ty))
            # pass
        else
            # add all sys nodes with the same id
            for (i, n) in enumerate(dag.sys)
                if n.id == jlty.sym
                    system_nodes[n] = i
                end
            end
            # Recurse for any nested struct/union
            if is_elaborated(ty)
                collect_nested_dependent_system_nodes!(dag, ty, system_nodes)
            end
        end
    end
    return dag
end

(Note I just copied the code for the CLElaborated specialisation and some parts may not apply here)
This seemed to work OK on the minimal example i gave and on the full codebase it was from. But I ended up abandoning this work (due to other issues) so I didn't extensively test it. Not familiar enough with C/Clang to fully grasp the potential issues.

@Gnimuc
Copy link
Member

Gnimuc commented Sep 23, 2021

hi @melonedo, would you like to submit a PR for the fix you proposed in #307 (comment) ?

@melonedo
Copy link
Contributor

Sure. I will check it tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants