Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update from Scala Native OneGitToRuleThemAll #2

Merged
merged 54 commits into from
Jul 13, 2018

Conversation

LeeTibbert
Copy link
Owner

No description provided.

Andrei-Pozolotin and others added 30 commits March 5, 2018 16:11
* Alphabetize time methods

No functional difference.

* Fix TimeSuite test failures (#1167)

The TimeSuite tests was failing for me on both OSX and Linux. I
discovered three issues:

1) The difftime stub had the wrong result type
2) I'm not exactly clear on what the
     `localtime() should transform day zero...`
   test was trying to do, but it did not seem to correctly use the
   semantics of localtime and ascstring
3) The stack allocations would cause the test suite to frequently crash
   for me unless I wrapped the tests in a try { } catch { }

The first issue was easy to fix by changing the type signature. I
re-interpreted the second test and changed it to so that it should
adjust the epoch time by the local time zone offset to that the
ascstring returns "Sun Jan  1 00:00:00 1900\n" no matter how the system
is configured. The third issue was fixed by using zone allocation
instead of stack allocation.

I also deleted the unused test.

Fixes: #1167

Bonus: add daylight, tzname and tzset methods

* Add parens to getFD

This makes it consistent with the getFD method of FileOutputStream (and
the java api).

* Don't alias protected InputStream

Reusing the in variable name makes it difficult to mutate the protected
`in` variable in FilterInputStream.

* Whitespace: alphabetize unistd.scala

No functional change.

* Add user.dir system property

Since this property was missing, I tried to implement it in the same way
that java does. The test that I add essentially tests that both java and
scala native return the same "user.dir" value since it relies on the sbt
System property.

* Add pthread files

These were copied from #1068
(#1068)
since there was overlap between the two prs.

They are just boilerplate anyway.

* Implement #746 Add support for java.lang.{Process, ProcessBuilder}

This commit adds support for the java Process api for posix platforms.

Notes:

* The lack of thread support made this challenging. The naive approach
   to reaping the processes is to add a monitor thread that blocks on
   waitpid and then updates the process. Because scala native doesn't
   support threads yet, this must be done with a thread in native code.
   I initially tried adding minimal pthread apis to scala native, but
   invariably the pthread would allocate memory and the garbage
   collector would eventually crash. To get around this, I wrote the
   monitor thread in c.

   Because the main thread needs to update the process, the monitor
   thread must interrupt it. I ran into problems when the main thread
   was interrupted while reading from the piped file descriptor of the
   child process. To get around this, I add a global mutex that is
   shared by the process and the monitor thread. Most of the process
   methods are guarded by this mutex and the monitor must acquire it
   before it raises the SIGUSR1.

   It should be straightforward to port the monitor thread to scala and
   use java synchronized/wait/notify instead of the mutex/cond approach
   used here once scala native threads are ready.

* To avoid file descriptor leaks, it's a good idea to drain the piped
   file descriptors after the process terminates. We store the drained
   data (if any) into a ByteArrayInputStream which should allow the
   bytes to be freed when the Process is garbage collected if the user
   doesn't explicitly close the input stream. If we didn't do this, we
   wouldn't need to interrupt the process and instead could just
   communicate the process exit value through a heap allocated pointer.

I initially forked from @hwielenberg* to get started, but ultimately
rewrote pretty much all of the code. I'm sure some of the lines in
ProcessBuilder.scala haven't changed much though and I wanted to be sure
to acknowledge that initial work.

*https://github.com/hwielenberg/scala-native/tree/scala-native-746
Zone allocation does zero out the memory, while stack allocation does not.
* Types, functions, and some constants

* Add to documentation

* Use scalafmt

* Finish Definitions

* Add C adapter file

* Fix error in C file
* Extract `discover` to new `llvm` package in tools

* Extract `nativeCompileOptions` to `llvm` package

They're now defined under the name `defaultCompileOptions`.

* Extract `defaultLinkingOptions` to `llvm` package

* Extract target detection out of sbt plugin

* Extract `unpackNativeLib` out of the sbt plugin

This commit adds a new package `scala.scalanative.build`. The expected
users of this package are build tools that want to implement support for
Scala Native.

* Extract `nativeLinkNIR` out of sbt plugin

* Extract `nativeCompileLL` out of sbt plugin

* Move `GarbageCollector` to `tools` module

* Extract `nativeCompileLib` out of sbt plugin

* Extract `nativeLinkLL` out of sbt plugin

* Move to java.nio

* Add more info to `Config`

The following fields are added:
 - `clang` and `clangpp`
 - `linkingOptions` and `compileOptions`
 - `gc`

 These informations are often passed around in our toolchain. It helps
 to define cleaner API if we just pass around a `config` object instead
 of many parameters.

* Put `linkNIR` back in sbt plugin

All the important work is already done in `tools.link`. The rest was
just logging of information, which should be left to the user of our
API.

* Fix `IO.deleteRecursive`

Trying to delete a non-existing directory would throw an exception.

* Add `IO.write` to write to a file

Very similar to `Files.write`, but create the parent directories before.

* Fix calls to `IO.getAll`

* Add `build` to `build`

This function does all the steps from NIR to executable binary.

* Fix scripted test `optimizer-reporter`

Broken by the switch to NIO.

* Add `nativeLib` to `Config`

This simplifies the API.

* Configure `Driver` with `Mode` instead of `Config`

We would like to add `Driver` as part of `Config`. This is to avoid
circular dependency.

* Add `name` to `Mode` and `default`

The `default` method in `object Mode` describes the default value for
the `Mode`. It is currently set to `Debug`, following what was
previously done.

`Mode` instances also get a new attribute `name` which is a string
representation of the `Mode`.

Finally, add `apply` method to `Mode` to get an instance from a String.

* Add driver, {linker,Optimizer}Reporter to Config

* Add `Config.default`

This function takes a minimal number of arguments to create a config
with sane default parameters.

* Remove `mode` from `Config`, put into `Driver`

Otherwise, we may have `Config` instances where the `Driver` and the
`Mode` disagree.

* Store `entry` as `String` in `Config`

This hides the NIR from the public API.

* Fix `linkerReporter`

* Fix `LinkerSpec`

* Setup MiMA

This is currently set up to use MiMA 0.1.14, which is slightly outdated.
I had issues with more recent versions of MiMA (NoSuchMethodExceptions
etc., apparently because of compatibility layer with sbt 1 probably).

* Make `passes` and `withPasses` in `Driver` private

* Run MiMA in release script and CI

* Make methods of `Reporter` private

This holds for the linker and the optimizer.

* fixup mima

* Move `Config` to `build` package

* Merge tools and build namespaces

* Simplification

* Move `package object llvm` to `object LLVM`

* Fold `tools` inside `build`

* Run MiMa as part of `test-all`

* Run tests before releasing

* Move `logger` inside `Config`

* Use sbt's logger instead of the default one

* Move logging outside of sbt plugin

* Make `codegen` return generated paths

* Hide substeps in API

* Remove subtasks from sbt plugin

* Consume linking error in `build`

* s/ativeLib/ativelib/g

* Logger: add implementation for `running` and `time`

* s/GarbageCollector/GC/s

* Fix formatting

* Add documentation for `Logger`

* Add documentation for `Config`

* Add more doc

* Revive log about GC in use

* Improve documentation for `build.build`

* Throw `IllegalArgumentException` for unknown GCs

* Add `BuildException`
* Don't expose concrete types of the GC and Mode

Making them public encourages pattern matching with exhausitivity
warnings. This might give a false impression that no more GCs or Modes
are going to be added in the future which makes it harder to evolve the
API in binary compatible way.

* Make Optimizer{Driver, Reporter}, LinkerReporter APIs private

* Introduce build.{Build, ScalaNative, Discover}

* Minimize API surface

* More docstrings

* Provide a concrete usage example for Build.build

* Remove tests for dropped functionality

* Address name change suggestions
* Improve portability of time

The time.h specification only requires that the first nine integer
fields in the struct are present, but doesn't guarantee the order.
Moreover, many implementations add additional fields that they will
modify when calling the standard functions which can cause memory
corruption if only enough memory is allocated for the mandatory fields.
To address this, I introduce a new scalanative_tm struct that has the
nine mandatory fields and marshal that struct to and from the platform
specific native tm struct. This is is similar to what is done in stat.c.

A number of the methods are not thread safe because they return pointers
to native structs. To work around this, I only implement the _r variants
of these methods. To keep the api reasonably nice for the user, I added
versions of the non '_r' prefixed variants that require an implicit
Zone. These methods allocated the required data structures and pass them
into the '_r' variants. The returned pointers will have the same
lifecycle as the zone.

* Move native/time.scala to posix/time.scala

The time.scala file that added the methods in time.h was inconsistently
located in the native package. This commit moves it into the posix
package and also unifies the declaration of the time types. The time_t
and timespec types are declared in time.h, while timeval is declared in
sys/time.h. I removed the redundant declarations.
* Fix stack corruption in release

At some point I switched from fork to vfork and while tests passed, I
never tested in release mode. I discovered while debugging the
ProcessSuite hangs that invoking a process always segfaults in release
mode on my mac. I believe this was because the child process modifies
the stack of the parent. The fix is to just create a new function that
runs the child code so that the child creates a new stack frame.

After this change, I ran a bash script that invoked a simple scala
native program that called git status and was compiled in release mode.
In over 1000 iterations of the script, I saw no crashes. (Prior to the
change, it always segfaulted)

* Fix undefined process behavior

It turns out that it's unsafe to signal back onto the main thread if the
main thread is performing gc. To work around this, I made it so that the
process results are cached in native code and the scala native process
must fetch them either by explicitly calling waitFor or by implicitly
calling any of a number of relevant apis. In particular, whenever the
user calls a function that checks the exit value, we check the native
cache and whenever we read or check the available bytes.

Unlike the old implementation, we will leak native file descriptors (and
an int -> int map entry) if the user never calls waitFor (or one of the
other functions that can have the implicit side effect of draining the input
streams and setting the proc result). I think this can be fixed by
caching the output and error streams in the native code as well, but I'm
going to defer that work.
* Bump fastparse dependency to 1.0.0

* Modularize nir parser out of tools to remove dependency on fastparse
* Bump version to 0.3.8-SNAPSHOT

* Fix 0.3.7 release date
Not all files were properly formatted.
…#1200)

* Fix #1198: FileChannelImpl.read works more like the JVM now.

* Fixed/Added Tests for FileChannel.read. Fixed Buffer end position.

* Made recommended changes to PR request.
String.getBytes should throw a UnsupportedEncodingException not UnsupportedCharsetException in the case of unsupported charset. 
This is per documentation for String.getBytes and as demonstrated by
the following test program.

This is also required for URLEncoder.encode's expected exception behavior.

```
import java.nio.charset.*;
import java.net.*;

class Main {
    public static void main(String []args) {
        try {
            String data = "test string";
            data.getBytes("unsupported");
        } catch(Exception e) {
            System.out.println(e.toString());
        }
    }
}
```
cquiroz and others added 24 commits July 7, 2018 13:47
* Add Supplier.scala and UnaryOperator.scala.

* Add unit test.
* Add scripted test linking local library

* Reorder linker options to pass -l options last

clang++ uses the system linker (`ld`) for linking and on certain systems
the order of `-l` options is significant when linking archives `.a`.
This also moves system libraries after user-defined linked libraries to
support symbol resolutions for such libraries.
Benchmark result before patch:
Result Name                               iter min [ms] max [ms] avg [ms] stddev [ms] median [ms] p05 [ms] p95 [ms] avg [ms]
[OK]   walkfiletree.WalkFileTreeBenchmark 20   1238.239 1523.098 1411.120 85.667      1417.951    1510.301 1238.239 1405.227

Benchmark result after patch:
Result Name                                iter  min [ms] max [ms] avg [ms] stddev [ms] median [ms] p05 [ms] p95 [ms] avg [ms]
[OK]   walkfiletree.WalkFileTreeBenchmark  1000  133.669  186.689  140.299  7.614       137.165     157.679  134.497  139.317

Speed up = 1411.120/140.299 = 10.05x

NOTE:
I used "VeryLongBenchmark" for the result before the patch because it would take too long otherwise.
Now that we have ProcessBuilder, it's straightforward to implement
Runtime.exec.
On the JDK, if the command cannot be exec'd, there is a fallback to
running the command using /bin/sh. It was an oversight that I didn't
originally add this fallback. This came up while I was working on
getting ammonite ops to run under scala native.
When there is a capturing group with optional content and the content is
empty, the Matcher is supposed to return "", not null.
* Split test-runner out of sbt plugin

This module contains the sources for the test adapter. It can be useful
for other tools such as Mill or Bloop that want to support running tests
with Scala Native projects.

* Publish test-runner before running scripted tests
The test case needs to be updated due to out-of-sync changes in another PR.
* Decouple s.c.i.Range from formatters

Here we reimplement Range::description method to use
string builder rather than formatter. This halfs number of
reachable methods on an empty project (from 9k to 4.5k).
This naturally brings down the binary for hello world project
(from 4m to 1.9m) and total linking time.

* Fix equals/hashCode short-circuiting

Those methods are not guaranteed to be virtual.
* Add UnixException and AccessDeniedException

This commit adds the UnixException object which can be used to generate
exceptions when a native posix file system api call fails. I also added
the AccessDeniedException which was thrown by UnixFileSystemProvider but
not actually implemented anywhere.

* Update createTemp{Directory, File} semantics

The java.nio.Files.createTemp{Directory, File} apis do not require a
minimum length for the file prefix the way the
java.io.File.createTempFile api does. In addition, in the jdk, the temp
directory is by default taken from the java.io.tmpdir system property,
which is not /tmp in osx.

* Optimize removeRedundantSlashes

It is often the case that the path string contains no redundant slashes.
When this is true, we still always create a new StringBuffer and result
string when we could just return the string itself (or the substring
that removes the trailing slash). On my machine, this reduced the
average runtime of this method from O(1us) to O(300ns). The extra
complexity is minimal so the optimization feels worth it.

I also added some test cases because in pursuing alternative
optimizations, I found some edge cases what weren't tested.

* Optimize getName

It is quite slow to repeatedly all split on the underlying path string.
To fix this, we can cache the offsets into the parts of the string and
refer to the offset array whenever we need a part. An alternative would
be to just cache the parts directly, but I think that has higher memory
utilization so most jvm implementations use a lazy list of offsets.

* Move UnixPath to jvm semantics

On the jvm, the redundant slashes are removed for methods like toString,
equals and hashCode. I removed the private val from rawPath so that the
reference isn't retained after class construction.

* Files.readAllBytes -- read directly

I noticed that readAllBytes was much slower than on the jvm. Prior to
this change, small files took O(1ms) to read all of the bytes. After the
change, it is more like (30us).

* Files -- optimize list and walk

I noticed that Files.list and Files.walk were much slower in scala
native than on the jvm. I optimized the Files._list method to narrow the
gap. Instead of delegating to File.list (which wraps an array), I add a
common helper that returns an Array[T] where T is created by applying
a function f, (String, Short) => T to the dirent name and type. By
including the file type in T, we can avoid an extra call to stat, which
is expensive.

There is no more need to call isDirectory in list or walk because
opendir will return -1 and set errno. This will get turned into an
appropriate UnixException depending on whether the path name doesn't
exist or is not a directory.

* nio.Files -- Improve copy and move semantics

I noticed that Files.copy didn't do the right thing for directories.
They were treated as files, which caused an exception when trying to
read from the InputStream of the source. The java specification is that
copy should create an empty directory if it doesn't exist. If the target
is an empty directory, then its a no-op.

Files.move should replace the existing file or directory as long as the
target doesn't exist, or the REPLACE_EXISTING option is set.

I also added support for copying the file attributes.

* Support reading and writing more than 1 byte

File IO is slow in scala native because all of the read and write
options go one byte at a time. This commit updates FileChannelImpl to
make bulk operations on the input byte buffers. I saw an O(10x)
performance improvement in a few of the ammonite ops tests (total
runtime for one test decreased from 10 seconds to 1 second. This teat
read from ~150 source files and wrote the concatenated results to an
output file).

* Update semantics of NativePosixFileAttributeView

On the jvm, the attributes method of NativePosixFileAttributeView
stats the file and returns all of the file attributes in one shot. It
doesn't, however, memoize the readAttributes method, which allows the
caller to get updated attributes from the same view. In this commit, I
rework things so that attributes is a method that calls stat and
memoizes the result.

This improves performance because stat is relatively expensive.
Modified LLVM.scala to explicitly set -std=gnu11 when compiling
.c files.  Scala Native currently does not specify a standard
when compiling .c files.  Since at least Clang 3.7 this means
that the gnu11 standard is used.

Explicitly setting -std=gnu11 does not change the -std used
by Clang, but it does make the standard in use more evident
when nativeCompileOptions specifies the -v verbose flag.

The intent is to make it easier for contributors to Scala Native
to discover the C standard being used. The C++ standard of
C++11 has shown up in verbose output for years.

It is helpful for contributors, especially new contributors,
to know the C standard without having to do days of searching
Clang defaults because some symbols start with double underscore
(__) when compiled under -std=c11. They do not have this prefix
when compiled under -std=gnu11.  If I remember correctly,
timezone is one of these names.
It is very slow to have to compile a regex pattern for every call to
split so I provided an optimized version of split that is similar to the
versions provided by other jvm providers. In the case of a happy path
where the separator is a single character that isn't a regex special
character or when the separator is an escaped regex special character,
we use the optimized version.

I made a crude benchmark where I split the string "abcdefghijklnop" * 10
on the letter j 10000 times and the runtime for the fast version was 4us
per iteration while using Pattern.compile.split took 300us.
This commit adds support for arbitrary shutdown hooks in scala native.
On the jvm there are a number of hooks that run when the program exits,
including the delete on exit hooks. I discovered that c++ will call the
destructors of static member variables on program exit. These static member
variables are also lazily initialized which ensures that the hooks are
only run if any are added. There may be a better approach to this, but
this does seem to work reliably for me. With the new native Hook class,
I was able to almost trivially add deleteOnExit to java.io.File.

I didn't add the application hooks that the jvm provides because I
wasn't sure if this approach would be acceptable to the community.
Populate posix sys/types.scala on the road to a single point of baseline
posix type declarations. This allows more closely following the posix
documentation whilst consolidating C type declarations.
The prior #define values described 32 bit int constants on 32 bit systems.
This caused 32 bit math to be done, resulting in incorrect values. Changing
the constants to "long long" (LL) allows 64 bit math and correct results
on both 32 & 64 bit architectures, at no cost to the latter.
@LeeTibbert LeeTibbert merged commit ca3cf8b into LeeTibbert:master Jul 13, 2018
LeeTibbert added a commit that referenced this pull request Oct 10, 2020
  * This PR provides a missing default method. This method is used
    by the Scala.js TestMainBase hierarchy.  Having this method
    allows JUnit CollectionsTest.scala and others to link with
    one less missing symbol.

  * There is no direct corresponding JUnit test because of circularity.
    This method is used by the environment which would test any
    standalone file. These changes will be extensively exercised
    by that test environment and defects will become obvious.

  * I wrote the code based on my recent work providing similar default
    methods in Map.scala. After some "back-brain consolidation time", I
    realized that Scala.js had already implemented the method;
    obvious after the fact. I brought the code into correspondence with
    the Scala.js file and documented the SHA1 of the reference commit.
    Hence the  "Influenced by" credit line.

Documentation:

  * None required

Testing:

  Safety -

    + Built and tested ("test-all") in debug mode using sbt 1.3.13 on
      X86_64 only. All tests pass.

  Efficacy -

   + Will be shown when the port of the Scala.js TestMainBase environment
     to Scala Native has been accomplished.

Continuous integration notes:
  + Restart Travis for third time because of failures in entirely unrelated
    places.
    #2 failed ProcessSuite:  waitFor with timeout times out.
LeeTibbert added a commit that referenced this pull request Oct 13, 2020
Travis CI restart #2. Failed Travis job (scalafmt, but not in scalafmt)
unrelated to the PR, just after "store build cache".
LeeTibbert added a commit that referenced this pull request Oct 14, 2020
Travis CI restart #2. Failed Travis job (scalafmt, but not in scalafmt)
unrelated to the PR, just after "store build cache".

Travis CI restart scala-native#3. Two failed jobs unrelated to this PR.
One was a failed CLA check. Second was a failed Coursier fetch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.