Fix #657: Give libraries a way to include native C code to be compiled. #1637

Merged
merged 50 commits into from Aug 8, 2020

Conversation

ekrich
Member

@ekrich ekrich commented Jun 26, 2019

Currently, C or C++ code added into the resources directory is used by Scala Native nativelib. The basic need for adding C code to a library is to support C macros and any adapter functions for C libraries that pass structs by value. Adding the raw C is portable and a good solution. I have two use cases I am targeting at this time.
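As an illustration of the adapter functions mentioned above, here is a minimal sketch of C glue for a function that passes structs by value. The `vec2` type and `vec2_add` function are hypothetical stand-ins for a real C library API; only the wrapper would be bound from Scala Native.

```c
/* Hypothetical library type and function (assumptions for illustration):
   the function takes and returns a struct by value. */
typedef struct {
    double x;
    double y;
} vec2;

static vec2 vec2_add(vec2 a, vec2 b) {
    vec2 r = { a.x + b.x, a.y + b.y };
    return r;
}

/* Adapter: takes pointers instead of struct values and writes the
   result through `out`, so a binding only has to deal with pointers. */
void vec2_add_ptr(const vec2 *a, const vec2 *b, vec2 *out) {
    *out = vec2_add(*a, *b);
}
```

A binding would then declare only `vec2_add_ptr` and allocate the three structs on the Scala Native side.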

  • A published library that contains C or C++ code - currently this code goes in resources which gets copied to the root of the jar archive. The nativelib and any added nativeLibraryDependencies, a new plugin setting, are unpacked and compiled as part of the application. Currently, you also must add to nativeLinkingOptions for each library to link. Edit: This was incorrect as Scala Native keeps track and adds these to the link phase.

The Config class contains a deprecation and also a runtime deprecation warning as 3rd party tools such as coursier may need some minor changes to allow this feature to work. We may need to deprecate and refactor some of the other public tools APIs as well depending on need.

  • You have an application that needs a little C code. This use case adds an additional plugin setting nativeCodeInclude which is false by default. The intent is to combine any native code to the compile and link sequence when set to true. This way the current publishLocal methodology will continue to function without change.

There are quite a few other ideas in the issue but I think these 2 items are a good start and enable libraries and local coding without local changes to the build system.

Edit: 2019-07-19
This conceivably allows Scala Native to distribute clib or posixlib as separate artifacts if the code is not used in javalib or other core parts of the system.

@ekrich ekrich changed the title [WIP] Fix #657: The sbt plugin should offer a way to compile C files Fix #657: The sbt plugin should offer a way to compile C files Jul 3, 2019
@ekrich
Member Author

ekrich commented Jul 3, 2019

I have removed the WIP as this could probably use some review or discussion at this point.

Here are some highlights of the implementation.

  1. The code handles the Scala Native nativelib exactly as it does for third party jars with native code with the following exceptions:
  • Always unpacks nativelib
  • Only unpacks otherlib jars if listed on the nativeLibraryDependencies classpath. The otherlib will be unpacked in a directory named the same as your artifact name in Maven.
  2. If you specify nativeCodeInclude := true then you can add C code to your project, and the Scala Native plugin will copy the code to the native build area, create a sub-directory named the same as your project, and compile and link that code.
  3. The code is designed to be used with other build tools; nothing is specific to sbt except the plugin settings, which exist for ease of use with sbt.

The changes deprecate nativelib in Config with a runtime deprecation Warning message. The preferred method is now to provide a Seq[NativeLib] as nativelibs rather than just Path as nativelib.

@ekrich
Member Author

ekrich commented Jul 5, 2019

Now, directory C/C++/S files are hashed to avoid a copy when they have not changed. This is similar to when a jar doesn't change, it is not unpacked. Computing the hash of the nativelib jar on my machine takes about 30 ms and I am guessing that any sort of unpacking of a 5284045 byte nativelib jar and writing to disk is slower so I think that is why the original code hashed the jar. So now if the project is large, this optimization will apply for directory copying.
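The hash check described above can be sketched roughly as follows. This is not the PR's implementation: the FNV-1a hash, the `hash_cache` struct, and the `needs_copy` function are assumptions chosen to keep the example short, and a real tool would hash file contents read from disk.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a over a byte buffer (a stand-in for whatever hash the
   real tool uses). */
uint64_t fnv1a64(const unsigned char *data, size_t len) {
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/* Hypothetical cache entry remembering the hash of the last copy. */
typedef struct {
    int valid;       /* 0 until the first hash has been stored */
    uint64_t hash;
} hash_cache;

/* Returns 1 when the sources changed (or were never seen) and must be
   copied again, 0 when the stored hash still matches. */
int needs_copy(hash_cache *cache, const unsigned char *data, size_t len) {
    uint64_t h = fnv1a64(data, len);
    if (cache->valid && cache->hash == h) {
        return 0;
    }
    cache->valid = 1;
    cache->hash = h;
    return 1;
}
```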

@ekrich
Member Author

ekrich commented Jul 18, 2019

Ready for Review.

@ekrich
Member Author

ekrich commented Jul 19, 2019

Given an interest in #155, this does allow a new class of libraries that were not possible before without hacking.

This PR allows you to distribute a pure Scala Native library that can include C/C++/S code in a standard jar format exactly how the Scala Native nativelib is distributed. Your library can be developed locally with tests or example applications so you can link the library.

My test case library is stensorflow, which is currently just a Scala Native C binding with no high-level Scala code. It is not 100% working yet. It currently uses an application to test itself.

The PR above still uses the standard static link setup but does allow the full Scala Native optimization system to work on any application or library (with tests or examples to force a link) for the library of interest. It is very easy to use in a natural way with the current Scala Native plugin or raw tool chain.

@muxanick
Contributor

muxanick commented Aug 2, 2019

@densh What is the current plan for this? It is a very useful change.

@ekrich
Member Author

ekrich commented Oct 16, 2019

I have been thinking about this PR and I think it might be helpful to put the native code in a directory specified by the full org and project name to have a unique namespace. So instead of nativelib it would be org_scala-native_nativelib or for stensorflow it would be org_ekrich_stensorflow.
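The naming scheme can be sketched as below. The function name `native_dir_name` is hypothetical, and the rule of replacing dots in the organization with underscores is an assumption inferred from the two examples given here.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Writes "<org>_<name>" into buf, with every '.' in the organization
   part replaced by '_'. Returns 0 on success, -1 if buf is too small. */
int native_dir_name(const char *org, const char *name,
                    char *buf, size_t size) {
    int n = snprintf(buf, size, "%s_%s", org, name);
    if (n < 0 || (size_t)n >= size) {
        return -1;
    }
    size_t org_len = strlen(org);
    for (size_t i = 0; i < org_len; i++) {
        if (buf[i] == '.') {
            buf[i] = '_';
        }
    }
    return 0;
}
```

With this sketch, `("org.scala-native", "nativelib")` yields `org_scala-native_nativelib` and `("org.ekrich", "stensorflow")` yields `org_ekrich_stensorflow`.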

@ekrich
Member Author

ekrich commented Nov 27, 2019

@lolgab I added namespaces based on the organization and project name in the last commit.

@ekrich
Member Author

ekrich commented Feb 20, 2020

Recap

Allow internal Scala Native components and external libraries to include C code which will be compiled on demand by the Scala Native plugin.

Vision of the future

Rather than Scala Native be a platform where everything is included, this opens up the potential where Scala Native can consist of components that are included from the repository based on the platform or potentially some future configuration. This could help us make Scala Native scale from smaller devices and different platforms, other than Linux/UNIX, to more enterprise usages. Some examples follow:

  • Windows support: sharing what it can via C conditional compilation and separate projects where this strategy does not make sense.
  • Robotics and 32bit support as demonstrated by Shadaj Laddad.
  • Potentially WASM
  • Potentially different concurrency approaches
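For the Windows item in the list above, sharing code through C conditional compilation could look roughly like this. The `sn_` names are hypothetical; `_WIN32` is the macro that Windows compilers predefine.

```c
/* Platform-specific constants selected at compile time. */
#if defined(_WIN32)
#define SN_PATH_SEPARATOR '\\'
#define SN_PLATFORM_NAME "windows"
#else
#define SN_PATH_SEPARATOR '/'
#define SN_PLATFORM_NAME "posix"
#endif

/* Shared entry points: the same declarations work on every platform,
   so a binding does not need to know which branch was compiled. */
char sn_path_separator(void) {
    return SN_PATH_SEPARATOR;
}

const char *sn_platform_name(void) {
    return SN_PLATFORM_NAME;
}
```

Where the platforms diverge too far for this to stay readable, the separate-project route described in the same bullet applies instead.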

We still want to support Java libraries like we do now to promote a robust set of portable cross projects. This is a change that can allow an evolution of the platform guided by the leadership of this project and the needs and contributions from the community.

I know there has been some frustration especially related to Windows support as the requested changes would have been too disruptive and potentially destabilizing. This should allow some of the development to run in parallel in separate projects that will not affect the stability of the current system.

Collaborator

@sjrd sjrd left a comment


My experience with Scala.js is that we shouldn't bake support for platform dependencies in the main sbt plugin. We did it with jsDependencies in Scala.js and it was a nightmare. It was never really correct, and could never be fixed. We ended up moving it to a separate sbt plugin.

Can we implement compiling C code in a separate sbt plugin? If yes, that should be the way to go. If not, what prevents us from doing so? Is there a minimal set of hooks that we can add to sbt-scala-native that would allow this to be implemented in a separate plugin?

@ekrich
Member Author

ekrich commented Feb 21, 2020

Looking at jsDependencies and the plugin for that leads me to think that this is really not the same type of problem we are trying to solve.

This really is not about compiling C per se. If you are building an application and you can't create a Scala Native interface to C with extern to call a C library directly, because Scala Native does not support perfect C interop, then this allows you to toss a little C code into your project to fill the gap. Many people have just given up at this point since you can't do this, and there are numerous hacked solutions out there to get this to work. Also, if you are building a library that has the same problem, you can include a little C code that can be compiled when your library is used. This is the tensorflow example. Remarkably, BLAS as in sblas doesn't need any C code, so a full BLAS interface is available for Scala Native.

Compiling C is already baked into the tools project but only for the nativelib artifact. I am not sure exactly how this PR would be done without the nativeLibraryDependencies because you would have to search the entire classpath to find the native code looking inside each jar and also we use the org and name of the artifact to create a unique directory name to place the code to be compiled. The sbt plugin basically just calls into the tools API knowing which jars to expand so that the C code can be included for the C compiler as it is being used in the normal tool chain.

I'm not sure if this helps explain or answers your questions exactly. I do think something other than this solution is going to be a much bigger development project.

I tried to minimize the complexity/changes and put only what needs to be put in the plugin and putting the rest into the tools project. I think the changes are quite modest and should be maintainable. Currently, the nativelib is one big blob that doesn't allow the code to be segregated to the internal sub-project/artifact where it belongs like posixlib or clib either so somehow we need to solve that problem because posixlib needs to be excluded for Windows. I am thinking this may help with that as well but have not tried to reorganize Scala Native using this PR. I added a scripted test to test the nativeCodeInclude feature and tested stensorflow with this PR for a library test.

I am really not sure how coursier or other tools that may call the tools API will deal with the additional case classes that are in the Discover object when creating a Config. Not sure who has experience in that area.

@rwhaling @lolgab Could you take a look and see what you think about this change and approach?

@shadaj
Contributor

shadaj commented Feb 21, 2020

Compiling C is already baked into the tools project but only for the nativelib artifact. I am not sure exactly how this PR would be done without the nativeLibraryDependencies because you would have to search the entire classpath to find the native code looking inside each jar and also we use the org and name of the artifact to create a unique directory name to place the code to be compiled.

A bit late to the discussion, but the strategy for searching the entire classpath is actually implemented in my fork for 32-bit support + JNI bridging (see shadaj@6888706). To ensure unique folder names, I just used the name of the JAR file the sources were extracted from. Then, any library just needs to add the C sources it needs with an extra module dependency (https://github.com/Team846/scala-native-wpilib/blob/master/scalaNativeJNINativeLib/src/main/resources/mockjni.cpp).

While not a large-scale usage of the feature, since we only had one externally provided C source, this strategy seemed to work pretty well since there were no changes needed to projects using these extra C sources as they were automatically brought into the classpath by regular dependency resolution.

@ekrich
Member Author

ekrich commented Feb 21, 2020

It looks like you used nativeUnpackLib as a Seq, and your change predates moving the code out of the plugin and into tools, so it looks different now. I see the jar naming too, thanks. I originally was going to use just the artifact name (similar to the jar name) but worried that two people could give their libraries the same name, so I added the org name to make it more unique.

@sjrd
Collaborator

sjrd commented Feb 27, 2020

If you are building an application and you can't create a Scala Native interface to C with extern to call a C library directly, because Scala Native does not support perfect C interop, then this allows you to toss a little C code into your project to fill the gap. Many people have just given up at this point since you can't do this, and there are numerous hacked solutions out there to get this to work.

I think this is the root problem that needs to be solved. We need to make Scala Native able to directly interface with any C library. It needs perfect interop. I managed to do that for Scala.js with a perfect interop with JS, and interop with JS is fundamentally much more complex than interop with C, so I'm pretty confident we can achieve perfect interop for C.

@ekrich
Member Author

ekrich commented Mar 4, 2020

I agree that "perfect interop" is the ultimate solution but I don't think that is the only issue this is solving.

  1. The current plugin unpacks the nativelib and compiles the C/C++ along with the other code.
  2. Each project should hold its own C code including perhaps separate projects for each GC so this can be worked on in an independent manner.
  3. Having all the C/C++ code in the same project means that each project itself lacks encapsulation and is going to be a maintenance issue.
  4. In order to properly support Windows, we could segment the Java library code, for example Threads and IO, where the Windows libraries differ greatly from Standard C or POSIX. Then we could build the proper code on different platforms. This could also allow the current release to support Windows without affecting the more stable Linux/macOS platforms.
  5. Sometimes writing some C glue code is just a better thing to do in certain situations even on top of the underlying library.
  6. Would allow my tensorflow interface to be published in the interim. https://github.com/ekrich/stensorflow

The issues are probably not low hanging fruit otherwise @densh would have probably had a solution.

  1. Structs on the stack.
  2. Preprocessor macros. https://www.geeksforgeeks.org/interesting-facts-preprocessors-c/
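To make the preprocessor item concrete: macros exist only at compile time, so a binding has no symbol to link against. A minimal sketch of the usual workaround is to wrap each macro in a real function. The `MYLIB_` macros below are hypothetical and stand in for whatever a library header defines.

```c
/* Hypothetical macros as they might appear in a library header. */
#define MYLIB_VERSION_MAJOR 2
#define MYLIB_MAKE_VERSION(major, minor) (((major) << 16) | (minor))

/* Wrapper functions: these produce linkable symbols that expose the
   macro values and macro logic to a binding. */
int mylib_version_major(void) {
    return MYLIB_VERSION_MAJOR;
}

unsigned int mylib_make_version(unsigned int major, unsigned int minor) {
    return MYLIB_MAKE_VERSION(major, minor);
}
```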

I know we should carefully look at these changes and make sure we get the best solution we can based on current and future needs.

Perhaps others can help provide some guidance and feedback on this PR as well.

@ekrich
Member Author

ekrich commented Mar 12, 2020

Splitting the projects with their own C code.
#1352

@ekrich
Member Author

ekrich commented Jun 3, 2020

@sjrd I now just use the classpath to discover the nativelib and other libs that contain native code. It is not complete or polished but was wondering if this approach would be more acceptable?

Collaborator

@sjrd sjrd left a comment


OK let's merge this as is. But I'd like to see a follow up PR, as you suggested, to only take files in some specified subdirectory (and therefore move the files in the nativelib to match).
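The follow-up restriction could be expressed as a simple path filter. This is only a sketch: the function names are hypothetical, the subdirectory is passed in by the caller because the thread does not name one, and the extension list is an assumption based on the C/C++/S files mentioned earlier.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if s ends with suffix, 0 otherwise. */
static int has_suffix(const char *s, const char *suffix) {
    size_t n = strlen(s);
    size_t m = strlen(suffix);
    return n >= m && strcmp(s + (n - m), suffix) == 0;
}

/* Returns 1 if path lies under subdir and has a native source
   extension, 0 otherwise. Paths are relative to a classpath entry
   and use '/' as the separator. */
int is_native_source(const char *path, const char *subdir) {
    size_t d = strlen(subdir);
    if (strncmp(path, subdir, d) != 0 || path[d] != '/') {
        return 0;
    }
    return has_suffix(path, ".c")
        || has_suffix(path, ".cpp")
        || has_suffix(path, ".S");
}
```

Files outside the chosen subdirectory would then stay ordinary resources, which is the goal stated in the comment above.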

@sjrd sjrd changed the title Fix #657: The sbt plugin should offer a way to compile C files Fix #657: Give libraries a way to include native C code to be compiled. Aug 8, 2020
@sjrd sjrd merged commit cdd017f into scala-native:master Aug 8, 2020
@ekrich ekrich deleted the topic/fix-657 branch August 8, 2020 20:21
ekrich added a commit to ekrich/scala-native that referenced this pull request May 21, 2021
…o be compiled. (scala-native#1637)

Previously, C source files in the nativelib's jar (i.e., in its
resources), were compiled and linked together with the Scala
Native-emitted code. That behavior was reserved for the nativelib,
and not applied to user-defined libraries.

This commit generalizes this support to all libraries. Native code
source files present anywhere on the classpath will be compiled and
linked together in the final executables.

This allows libraries to include native code in their distribution.

A followup commit should restrict the set of included files to one
specific subdirectory of the classpath entries, in order to allow
libraries to also distribute source files in their actual resources
for other purposes.