Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with generated resolvers (1/2) #24

Closed
alrevuelta opened this issue Jun 13, 2020 · 14 comments
Closed

Problem with generated resolvers (1/2) #24

alrevuelta opened this issue Jun 13, 2020 · 14 comments
Assignees

Comments

@alrevuelta
Copy link
Owner

There is a problem with the autogenerated resolvers (the ones that map a given operator with the function, i.e. argmax with argmax__float)

Lets use resolve_operator__onnx__argmax__12 as an example. This function returns a given function depending on the type that is used (i.e. operator__onnx__argmax__12__T_tensor_float). The problem here is that if one the functions is not implemented, the compiler can't of course find the symbol and it gives an error.

This was introduced in #22 and fixed by commenting the types that are not implemented, but should be fixed, since in most of the cases we won't implement all types (float, int,...) for a given operator.

Can this be solved with weakrefs? So if the symbol is not found it automatically fallbacks to an empty operator stub?

@nopeslide

@alrevuelta alrevuelta changed the title Problem with generated resolvers Problem with generated resolvers (1/2) Jun 13, 2020
@nopeslide
Copy link
Collaborator

@alrevuelta will look into it.

@nopeslide nopeslide self-assigned this Jun 14, 2020
alrevuelta added a commit that referenced this issue Jun 16, 2020
Generate all MNIST operators with the Python script and start integrating them in the existing code. So far only float type is supported.

-Python generated code is now being used
-MNIST and tinyYOLO models are working
-Opset and function resolver is not hardcoded anymore.

TODOs for next patches (see #24 #25)
-Remove unused code
-Remove all switch cases in the operator (no longer needed)
-There is a minor issue when resolving the operator functions. When a function is not defined (i.e. for double) there is an error because that symbol can't be found. This should be linked with the default operator stub (using weakrefs?)
-There is a workaround with the operator set. opset=12 is hardcoded for the initial tests.
-Modify manually maxpool resolver. The Python generator for resolvers has to be rethought. Doesn't make sense for some operators. This modification is a temporal fix until we figure out how to handle this.
@nopeslide
Copy link
Collaborator

nopeslide commented Jun 24, 2020

I tried the mechanism with a minimal example and it works:

main.c

#include <stdio.h>

extern __attribute__((weak)) char* func(void);

int main() {

    printf("func @ %p\n",func);

    if (func) {
        printf("func: %s\n", func());
    } else {
        printf("func not defined\n");
    }
    return 0;
}

func.c

char* func(void) {
    return "hello";
}
$ gcc -std=c99 -c main.c -o main.o
$ gcc -std=c99 -c func.c -o func.o
$ gcc -std=c99 main.o -o no_func
$ ./no_func 
func @ (nil)
func not defined
$ gcc -std=c99 main.o func.o -o func
$ ./func   
func @ 0x55ff463ed1a2
func: hello

the extern __attribute__((weak)) line actually produces a function pointer, which is by default NULL. when the linker finds a matching function it "updates" the pointer.

@nopeslide
Copy link
Collaborator

ah sorry, I misunderstood the problem

@nopeslide
Copy link
Collaborator

nopeslide commented Jun 24, 2020

the address operators inside the return statements are wrong. the resolver should return the function pointer itself, not the address of the function pointer. will update the generator.

@alrevuelta
Copy link
Owner Author

@nopeslide
Actually what I meant was your first reply. The third command is not working for me under macOS:

$ gcc -std=c99 main.o -o no_func

Undefined symbols for architecture x86_64:
  "_func", referenced from:
      _main in main.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

After a quick search I found this where one of the solutions suggests to add the following flags -Wl,-flat_namespace,-undefined,dynamic_lookup and that seems to work for me.

@nopeslide
Copy link
Collaborator

After a quick search I found this where one of the solutions suggests to add the following flags -Wl,-flat_namespace,-undefined,dynamic_lookup and that seems to work for me.

these flags are not present in my ld. are these mac specific flags?

@alrevuelta
Copy link
Owner Author

Looks like its mac specific. Seems also that its not really a good idea.

"Weak linking with weak_import really only works well with dynamic libraries. You can get it to work with static linking (by specifying -undefined dynamic_lookup as suggested above) but this isn't such a hot idea. It means that no undefined symbols will be detected until runtime. This is something I would avoid in production code, personally."

Will look further into this.

@nopeslide
Copy link
Collaborator

since we are building a static binary no symbol search should be done at runtime. could you pass the static flag and see if it still does not work or if flat_namespace is still needed?

@nopeslide
Copy link
Collaborator

are you developing on a mac? if not we could keep these flags and just trust the linux gcc to report any linker errors

@alrevuelta
Copy link
Owner Author

If you mean something like:

gcc -std=c99 main.o -o no_func -static

I'm geetting:

ld: library not found for -lcrt0.o
clang: error: linker command failed with exit code 1 (use -v to see invocation)

And according to gcc man:

This option will not work on Mac OS X unless all libraries (including libgcc.a) have also been compiled with -static. Since neither a static version of libSystem.dylib nor crt0.o are provided, this option is not useful to most people.

Yep, I'm developing on mac.

@nopeslide
Copy link
Collaborator

nopeslide commented Jun 30, 2020

I looked at a few other projects with Mac ports (i.e. gcc). it seems they're all using these flags.
Could you try if the linker accepts missing hard refs with these flags by adding a function prototype and using it without defining the actual function?
If so, this would be a problem, if not we may use them.
Alternatively I could rewrite the Makefile to partially link everything operator related with these flags.
Hopefully this would allow "normal" linking of everything else

@alrevuelta
Copy link
Owner Author

You mean these flags? -Wl,-flat_namespace,-undefined,dynamic_lookup?

@nopeslide
Copy link
Collaborator

yes

@alrevuelta
Copy link
Owner Author

Solved in #34

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants