ffi: comparison with Calypso, feature set, limitations? #1

Open
1 of 14 tasks
timotheecour opened this issue Aug 7, 2018 · 4 comments

Comments

timotheecour commented Aug 7, 2018

just heard of fragments through https://t.me/nim_lang; I'm curious to what extent it fulfills what I was looking for in nim-lang/Nim#8327?

  • what limitations are there in terms of understanding/parsing C++ code?
  • what limitations are there in terms of mapping C++ concepts to nim?
  • can it handle C++ stdlib? maybe show an example that would wrap std::vector<T> (note: should provide a view over data, not a mere copy)?
  • can it handle complex things like wrapping opencv?
  • what overhead is there compared to directly calling a C++ function/constructor, with and without compiler optimizations? (NOTE: calypso would involve 0 overhead: no function pointer or other pointer indirection; swig would involve the overhead of a wrapper, but that might be optimized away by the compiler)
  • support for C++ templates (functions and types) with cpp source instantiating a given type
  • direct support for C++ templates (functions and types) without needing cpp source to instantiate type (NOTE: swig supports it)
  • catch / rethrowing C++ exceptions?

design discussions

  • why not split out ffi to its own nimble package? would keep issue tracking etc cleaner

scratch below here

  • explicit enumeration for different number of arguments: should be doable with macro magic
    template cppnewref*(myRef: ref CppObject, arg0: typed, arg1: typed): untyped =
  • .to(void) => ugly but could be necessary due to a bug in Nim ; forgot what that bug was
  • converter toShort*(co: CppProxy): int16 {.used, importcpp:"(#)".} + friends => convertTo(int16) (ie make these generic using typedesc)

minor

  • probably a better way to handle windows vs posix uniformly rather than having lots of when(defined(windows)) let win_incl = ($lib).replace("/", "\\").quoteShell blocks
  • in README.md: echo $global.globalNumber.to(cint) => $ superfluous (+ all such cases)

jwollen (Contributor) commented Aug 8, 2018

Hi!
Thanks for the feedback. This is still very much work in progress and grows based on our own use cases. We will improve comments and code quality much more!

We also used libclang quite a bit to generate wrappers before. For simple C APIs that's fine. The major downside is that each codebase needs some tweaking and dealing with #if/def is sometimes just not possible.

what limitations are there in terms of understanding/parsing C++ code?
what limitations are there in terms of mapping C++ concepts to nim?

Code is only emitted, not parsed, so it's mostly a matter of coming up with a Nim-syntax. A downside is the lack of linting/auto completion in Nim. For diagnostics we have to rely on the C++ compiler.

can it handle C++ stdlib? maybe show an example that would wrap std::vector (note: should provide a view over data, not a mere copy)?

In general the stdlib will work fine. Templates are still a bit cumbersome and instantiations have to be declared manually. I'd love to end up with something like type IntVec = std/vector[int] though. There are some things that will need a bit more thought too, like bridging Nim/C++ allocators, threading etc.
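To make the "view over data, not a mere copy" requirement concrete, here is a minimal C++ sketch of the difference. `VecView`, `viewOf`, and `firstAfterWrites` are hypothetical names chosen for illustration and are not part of this project; the sketch only shows what a binding would need to preserve on the C++ side.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view: a pointer and a length, nothing else.
template <typename T>
struct VecView {
    T* data;
    std::size_t len;
    T& operator[](std::size_t i) const { return data[i]; }
};

// Builds a view that aliases the vector's storage (no element is copied).
template <typename T>
VecView<T> viewOf(std::vector<T>& v) {
    return VecView<T>{v.data(), v.size()};
}

// Writes through a copy and through a view, then reports what the
// original vector holds at index 0.
inline int firstAfterWrites() {
    std::vector<int> v{1, 2, 3};
    std::vector<int> copy = v;      // independent storage
    VecView<int> view = viewOf(v);  // aliases v's storage
    copy[0] = 10;                   // does not affect v
    view[0] = 42;                   // writes into v
    assert(copy[0] == 10);
    return v[0];
}
```

A view like this stays valid only while the vector does not reallocate, which is one reason the allocator and lifetime questions mentioned above need care.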

can it handle complex things like wrapping opencv?

We were working with a few big code bases, including LLVM, but there is still a lot of manual work involved. The main obstacle would be templates and inheriting from C++ types right now.

We would like to get to a point where namespaces, types, enum-members, etc. can all be accessed through a syntax like My/Sub/Namespace.MyEnum.SomeValue. Currently this has to be done in defineCppType for each type, has to explicitly name headers, etc.

what overhead is there compared to directly calling a C++ function/constructor, with and without compiler optimizations?

There is no overhead. All calls directly emit C++ code using {.importcpp.}. There are no wrapper types or functions.
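For readers comparing the approaches named in the question, the three call shapes can be sketched in plain C++ as follows. `Counter`, `Counter_add`, and the `use*` functions are hypothetical; whether a wrapper or pointer call costs anything in practice depends on the compiler and optimization level, so this only illustrates the shapes, not measured overhead.

```cpp
#include <cassert>

// Hypothetical C++ class standing in for a library type.
struct Counter {
    int value = 0;
    int add(int n) { value += n; return value; }
};

// Hypothetical SWIG-style free-function wrapper around the method.
int Counter_add(Counter* c, int n) { return c->add(n); }

// Direct call: the shape a verbatim-emitted call expression would have.
int useDirect() {
    Counter c;
    return c.add(5);
}

// Call through the wrapper function (may be inlined by an optimizer).
int useWrapper() {
    Counter c;
    return Counter_add(&c, 5);
}

// Call through a function pointer, as a dynamically resolved binding would do.
int usePointer() {
    Counter c;
    int (*fn)(Counter*, int) = &Counter_add;
    assert(fn != nullptr);
    return fn(&c, 5);
}
```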

support for C++ templates [...] without needing cpp source to instantiate type

Templates are not yet supported. For our use cases we imported template classes/functions manually so far. This is very possible though and mostly about coming up with a nice syntax that looks like Nim generics.

catch / rethrowing C++ exceptions?

When using nim cpp, Nim exceptions translate to C++ exceptions, so this would be trivial to implement!
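As a rough illustration of the C++ side of catch-and-rethrow, the pattern a binding would have to map onto looks like this. All function names here are hypothetical, and the sketch says nothing about how the Nim syntax for it would look.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical library function that fails on request.
inline void mightThrow(bool fail) {
    if (fail) throw std::runtime_error("boom");
}

// Catches the exception, records that it was seen, and rethrows the same object.
inline void callAndRethrow(bool fail, int& seen) {
    try {
        mightThrow(fail);
    } catch (const std::exception&) {
        ++seen;
        throw;  // rethrow the original exception unchanged
    }
}

// Returns the message of the exception that reaches the outermost caller.
inline std::string outerMessage(bool fail) {
    int seen = 0;
    try {
        callAndRethrow(fail, seen);
    } catch (const std::runtime_error& e) {
        assert(seen == 1);  // the inner handler ran exactly once
        return e.what();
    }
    return "no exception";
}
```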

why not split out ffi to its own nimble package? would keep issue tracking etc cleaner

We would love to turn the things in this repo into standalone/stdlib modules once they are more mature!

explicit enumeration for different number of arguments

We had some issues with varargs[untyped] here I believe. We will revisit it!

.to(void) => ugly

This will stay necessary for now. All "dynamic" method calls return a Nim type that doesn't correspond to a C++ type. The compiler would emit a local variable with that "fake" type without to. Even discarding it doesn't help.
This can maybe be fixed in the future by returning a different type from each call which is imported with some decltype magic. That would also allow assigning those return types without to, e.g.

var local = myCppObject.someMethod()
echo local.to(cint)
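The "decltype magic" mentioned above could, on the C++ side, look roughly like the following sketch. `MyCppObject`, `SomeMethodResult`, and `useResult` are hypothetical, and this is only one way the emitted code might name a return type without the caller spelling it out.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical C++ class whose method return type the caller does not spell out.
struct MyCppObject {
    long someMethod() const { return 7; }
};

// decltype names the call's result type from the call expression itself.
using SomeMethodResult = decltype(std::declval<MyCppObject&>().someMethod());
static_assert(std::is_same<SomeMethodResult, long>::value,
              "deduced from the call expression");

inline int useResult() {
    MyCppObject obj;
    SomeMethodResult local = obj.someMethod();  // stored without a conversion
    assert(local == 7L);
    return static_cast<int>(local);             // converted only when needed
}
```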

converter toShort*(co: CppProxy): int16 {.used, importcpp:"(#)".} + friends => convertTo(int16)

I remember issues with generic converters...

timotheecour (Author) commented Aug 8, 2018

thanks for all the answers!

I remember issues with generic converters...

Templates are not yet supported. For our use cases we imported template classes/functions manually so far. This is very possible though and mostly about coming up with a nice syntax that looks like Nim generics.

  • supporting templates (with no C++-side instantiation) is a big deal for interop, so I'm really curious whether this project can eventually handle them. How could templates be mapped to Nim generics with the current approach (code is only emitted, not parsed)? Wouldn't that require using libclang and LLVM? NOTE: I'm assuming this project will never attempt to parse C++ code directly (it's an infinite time sink)

Here's how Calypso handles mapping C++ templates to D templates (example, for std::vector): https://github.com/Syniurge/Calypso/blob/master/tests/calypso/libstdc%2B%2B/vector.d, eg: auto v1 = vector!char; I really hope we'd likewise be allowed to use let v1 = vector[char]() or let v1 = vector(char) in Nim

no nested types makes wrapping libraries in other languages problematic

  • I've been thinking of proposing an RFC for a simple amendment to Nim's style insensitivity with aim of making it easier for FFI such as this project: style insensitivity would only occur if identifier doesn't start or end with underscore; if it does, there is no style insensitivity:
    fooBar same as foobar
    _fooBar different from _foobar
    since identifiers starting or ending with _ are currently illegal, there would be no adverse consequence. But it'd solve issues when C++ libraries introduce symbols that would clash according to Nim's style insensitivity (I've definitely encountered these cases, eg: libmpfr.6.dylib has both mpfr_exp2 and mpfr_exp_2)

  • handling C++ namespaces can be a tricky issue
    As in Calypso, the module should be the one in which the C++ symbol is imported, and the C++ namespace shouldn't affect that; it'd allow, say, having a single module that wraps opencv without having to duplicate opencv's hierarchy (which would be both annoying and impossible due to Nim forbidding modules with the same name in a package). So the question is how to handle importing both cv::Bar and cv::baz::Bar inside opencv.nim
    instead of My/Sub/Namespace.MyEnum.SomeValue (which seems ambiguous and neither nim-like nor C++ like), why not this syntax:

# in module opencv.nim
let val = My::Sub::Namespace::MyEnum.SomeValue
type MyEnumAlias = My::Sub::Namespace::MyEnum
assert MyEnumAlias.SomeValue == val

# in module use_opencv.nim, ie user code
from opencv import nil
let val = opencv.My::Sub::Namespace::MyEnum.SomeValue
  • NOTE: Nim's operator precedence could perhaps be tweaked to support this use case, so that :: has highest precedence (higher than .)
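To make the style-insensitivity clash from the RFC idea above concrete, here is a small sketch that approximates Nim's identifier comparison (first character compared as-is, the rest compared ignoring case and underscores). `nimNormalize` and `nimSameIdent` are hypothetical helper names, and this is an approximation for illustration rather than the compiler's actual implementation.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Approximates Nim's identifier normalization: the first character is kept
// as-is, the rest is lowercased and underscores are dropped.
inline std::string nimNormalize(const std::string& ident) {
    std::string out;
    for (std::size_t i = 0; i < ident.size(); ++i) {
        char c = ident[i];
        if (i == 0) {
            out += c;  // first character stays case-sensitive
            continue;
        }
        if (c == '_') continue;  // underscores are ignored
        out += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return out;
}

// Two identifiers clash when their normalized forms are equal.
inline bool nimSameIdent(const std::string& a, const std::string& b) {
    return nimNormalize(a) == nimNormalize(b);
}

// The two mpfr symbols mentioned above collapse to the same name.
inline bool mpfrNamesClash() {
    assert(nimNormalize("mpfr_exp_2") == "mpfrexp2");
    return nimSameIdent("mpfr_exp2", "mpfr_exp_2");
}
```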

jwollen (Contributor) commented Aug 9, 2018

Some template magic in the emitted C++ should be able to take care of distinguishing between fields, types, etc. Generics and subtypes don't look too hard.

I didn't have a good idea for namespaces yet though. :: is not a valid operator and the ./.() operators are the only ones with custom matching rules. (I would prefer std.vector[int](5) anyway and unify ::, . and -> into dots).
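For reference, this is roughly what a dot-only syntax would have to cover on the C++ side: two same-named types in nested namespaces, and the three access operators. The `cv` namespace below is a hypothetical stand-in written for this sketch, not OpenCV, and the alias names are made up.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for two same-named types in nested namespaces.
namespace cv {
struct Bar { int id() const { return 1; } };
namespace baz {
struct Bar { int id() const { return 2; } };
}  // namespace baz
}  // namespace cv

// Aliases give each type a distinct flat name, as a single wrapper module would need.
using CvBar = cv::Bar;
using CvBazBar = cv::baz::Bar;
static_assert(!std::is_same<CvBar, CvBazBar>::value,
              "distinct types despite the shared name");

// Uses all three C++ access operators in one place.
inline int sumOfAccesses() {
    std::vector<int> v(5);                        // '::' selects a namespace member
    CvBar bar;
    std::unique_ptr<CvBazBar> p(new CvBazBar());
    assert(p != nullptr);
    return static_cast<int>(v.size())             // '.' on an object
         + bar.id()
         + p->id();                               // '->' on a pointer
}
```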

When using a single operator for both, distinguishing on the C++ side is tricky though. While it's probably possible to template the emitted code depending on whether the right side is a type, instance/static field, etc., I didn't come up with a way to "overload" the emitted code for namespaces yet...
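One way such templating could work is the standard detection idiom, sketched below for a fixed member name `Item`. All names are hypothetical. The sketch also hints at why namespaces are the hard case: a namespace cannot be passed as a template argument, so this kind of dispatch does not extend to them.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// C++11-compatible void_t helper.
template <typename... Ts> struct make_void { using type = void; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

// True when T::Item names a nested type.
template <typename T, typename = void>
struct item_is_type : std::false_type {};
template <typename T>
struct item_is_type<T, void_t<typename T::Item>> : std::true_type {};

// True when T::Item names a value (static or instance field).
template <typename T, typename = void>
struct item_is_value : std::false_type {};
template <typename T>
struct item_is_value<T, void_t<decltype(T::Item)>> : std::true_type {};

// Hypothetical test subjects.
struct WithType   { using Item = int; };
struct WithStatic { static constexpr int Item = 7; };
struct WithField  { int Item; };
struct WithNone   {};

// Picks a description based on what the member name refers to.
template <typename T>
const char* describeItem() {
    return item_is_type<T>::value  ? "type"
         : item_is_value<T>::value ? "value"
                                   : "none";
}

inline void checkDetection() {
    assert(std::string(describeItem<WithType>()) == "type");
}
```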

sinkingsugar (Owner) commented Aug 9, 2018

First of all thank you for this motivating discussion.

The drive behind this C++ FFI has always been productivity, at the price of some precision and correctness: linting is suboptimal (jsffi has the same issue), the proper Nim types have to be specified by hand, and templates have to be qualified as a concrete type.
Yet those downsides for me were extremely minimal. Being able to completely skip wrapping, binding etc a library is priceless.
Another priceless thing is being a pure nim module, only the real dependency has to be dealt with.

Like @jwollen said, we experimented with libclang a lot, but the results were not optimal. C++ is very hard, and some complex libraries, LLVM itself for example, use templates in every possible way; properly supporting everything carries too high a maintenance cost.

In the future I see us adding more compile-time magic, especially keeping in mind that C++ has compile-time magic as well; smartly combining Nim and C++ metaprogramming might reach the best results with minimal time investment compared to maintaining a full LLVM compiler pass.

template usage example

defineCppType(ATensors, "std::vector<at::Tensor>", "vector")

basically we qualify them into a concrete nim type, ideally using nim templates would be nice but for now this works nicely.
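On the C++ side, "qualifying a template into a concrete type" corresponds roughly to naming one instantiation, as in this sketch. `Tensor` is a hypothetical stand-in for at::Tensor so that the example needs no external library, and `Tensors` and `makeAndCount` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for at::Tensor.
struct Tensor { int rank; };

// Explicit instantiation: asks the compiler to generate this one concrete vector type.
template class std::vector<Tensor>;

// A plain name for the concrete instantiation, similar in spirit to ATensors.
using Tensors = std::vector<Tensor>;

// Uses the concrete type like any non-template class.
inline std::size_t makeAndCount() {
    Tensors ts;
    ts.push_back(Tensor{0});
    ts.push_back(Tensor{3});
    assert(ts[1].rank == 3);
    return ts.size();
}
```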

can full support of complex C++ libraries be done without nim-lang/Nim#7449 ?

As soon as we implement a namespace macro/template yes, without any particular issue at all.
