Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experimental support for building system libraries and ports with ninja. NFC #17809

Merged
merged 1 commit into from
Sep 15, 2022

Conversation

sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Sep 7, 2022

The primary advantage of doing this is that it avoids having force
a complete rebuild of a system library when you want to rebuilt it.
Instead not have precise dependencies.

Marking as NFC because this doesn't have any effect unless
EMCC_USE_NINJA is set.

@sbc100 sbc100 force-pushed the use_ninja branch 2 times, most recently from c7a7dbe to 7899714 Compare September 7, 2022 11:36
@sbc100
Copy link
Collaborator Author

sbc100 commented Sep 7, 2022

@walkingeyerobot I think ew can also generate bazel BUILD files in the same way?

@sbc100 sbc100 force-pushed the use_ninja branch 3 times, most recently from e4cf22c to 78455fe Compare September 7, 2022 13:40
@walkingeyerobot
Copy link
Collaborator

@walkingeyerobot I think ew can also generate bazel BUILD files in the same way?

We can! There's a bit more wiring up that also has to be done that I'm currently working on, but I do plan to generate bazel BUILD rules similarly.

@dschuff
Copy link
Member

dschuff commented Sep 7, 2022

This is really cool and I'm all in favor of transitioning to ninja instead of having our own code.
A random idea: Could we take this a step further, and instead of what are do now (generating one ninja build per library, and build each library one after the other), could we generate one ninja build which builds all libraries in one invocation? The idea being that having ninja schedule all the files from all the libraries together could result in faster aggregate library builds due to better utilization (i.e. we don't have situations where cores are idle waiting for the last object file to build).

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work - do we have a persistent location for the object files? (we don't clean that up after each build?)

tools/system_libs.py Outdated Show resolved Hide resolved
tools/system_libs.py Outdated Show resolved Hide resolved
@kripken
Copy link
Member

kripken commented Sep 7, 2022

Does ninja work on all platforms? And do we bundle it in the emsdk already?

@dschuff
Copy link
Member

dschuff commented Sep 7, 2022

Does ninja work on all platforms? And do we bundle it in the emsdk already?

Ninja does work on all platforms (at least, all the platforms that Chrome builds on) and IIRC is a standalone binary. It's also widely available e.g. in Linux package managers, macports, (even npm!) etc. I don't think we bundle it in the emsdk already. It is a good question though, whether we want to take a hard dependency (which would allow us to remove our own library-building code, but would require all of our users to get it, and we'd want it in emsdk) or use our current code as a fallback (which would inevitably be less well-tested).

# Find a unique basename
while o in objects:
object_uuid += 1
o = f'{object_basename}__{object_uuid}.o'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks it's copied from build_objects but it's missing the initialization of object_basename.
Is it actually necessary though? it doesn't seem to be getting triggered.

tools/system_libs.py Outdated Show resolved Hide resolved
@sbc100
Copy link
Collaborator Author

sbc100 commented Sep 8, 2022

This is really cool and I'm all in favor of transitioning to ninja instead of having our own code. A random idea: Could we take this a step further, and instead of what are do now (generating one ninja build per library, and build each library one after the other), could we generate one ninja build which builds all libraries in one invocation? The idea being that having ninja schedule all the files from all the libraries together could result in faster aggregate library builds due to better utilization (i.e. we don't have situations where cores are idle waiting for the last object file to build).

I did get this working, but it turns out to have quite a few downsides. I have revive the patch I think but honestly I didn't get great results from it. I think the problem that this would solve is largely solved by the introduction of good dependencies. This should all but eliminate the need to ever clear the cache, which means a full rebuild should rarely be needed when working on the library code itself.

One problem I ran into with the currently subninja system is that each of the sub-ninja (the one-per library) can then no-longer use relative paths.. i.e. they have have to aware that they are sub-ninja. Some folks have proposed a fix: ninja-build/ninja#1578. The other big issue is that it would require change to emcc to collect the list of all the things it needs to be build into single step.. which would require some changes.

Regardless, this change will be useful as it stands and we can discuss the master-ninja thing later.

@sbc100
Copy link
Collaborator Author

sbc100 commented Sep 8, 2022

How does this work - do we have a persistent location for the object files? (we don't clean that up after each build?)

Yes, these stick around in the cache, just like the downloaded tar archive and its expanded contents. Clearing the cache will clear them all.

sbc100 added a commit that referenced this pull request Sep 8, 2022
This will allow more easy transition to alternative build systems
such as ninja (See #17809).

Also, update build_port to allow separate source and build directories.
This avoids copying the entire source code of each port from the
extracts archive into the build directory.
EMXX = {shared.EMXX}
EMAR = {shared.EMAR}

rule cc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the real value is to use this for development, it seems we'll want to make sure that there are dependencies on the clang binaries, and potentially all of the python files, since changes to any of them could affect the object file output. Not all build systems do this properly, I can't recall whether Ninja is one of them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes.. that is unfortunately true. I had forgotten about that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that that full dependency graph would be too sensitive, but that this feature it most useful for folks working the libraries code and headers within emcripten.

sbc100 added a commit that referenced this pull request Sep 8, 2022
This will allow more easy transition to alternative build systems
such as ninja (See #17809).

Also, update build_port to allow separate source and build directories.
This avoids copying the entire source code of each port from the
extracts archive into the build directory.
sbc100 added a commit that referenced this pull request Sep 8, 2022
This will allow more easy transition to alternative build systems
such as ninja (See #17809).

Also, update build_port to allow separate source and build directories.
This avoids copying the entire source code of each port from the
extracts archive into the build directory.
sbc100 added a commit that referenced this pull request Sep 8, 2022
This will allow more easy transition to alternative build systems
such as ninja (See #17809).

Also, update build_port to allow separate source and build directories.
This avoids copying the entire source code of each port from the
extracts archive into the build directory.
sbc100 added a commit that referenced this pull request Sep 9, 2022
This will allow more easy transition to alternative build systems
such as ninja (See #17809).

Also, update build_port to allow separate source and build directories.
This avoids copying the entire source code of each port from the
extracts archive into the build directory.
sbc100 added a commit that referenced this pull request Sep 9, 2022
This will allow more easy transition to alternative build systems
such as ninja (See #17809).

Also, update build_port to allow separate source and build directories.
This avoids copying the entire source code of each port from the
extracts archive into the build directory.
sbc100 added a commit that referenced this pull request Sep 9, 2022
This will allow more easy transition to alternative build systems
such as ninja (See #17809).

Also, update build_port to allow separate source and build directories.
This avoids copying the entire source code of each port from the
extracts archive into the build directory.
sbc100 added a commit that referenced this pull request Sep 9, 2022
This will allow more easy transition to alternative build systems
such as ninja (See #17809).

Also, update build_port to allow separate source and build directories.
This avoids copying the entire source code of each port from the
extracts archive into the build directory.
sbc100 added a commit that referenced this pull request Sep 9, 2022
This will allow more easy transition to alternative build systems
such as ninja (See #17809).

Also, update build_port to allow separate source and build directories.
This avoids copying the entire source code of each port from the
extracts archive into the build directory.
sbc100 added a commit that referenced this pull request Sep 9, 2022
This will allow more easy transition to alternative build systems
such as ninja (See #17809).

Also, update build_port to allow separate source and build directories.
This avoids copying the entire source code of each port from the
extracts archive into the build directory.
@sbc100 sbc100 changed the title Experimental support for building system libraries via ninja Experimental support for building system libraries and ports with ninja Sep 13, 2022
@sbc100 sbc100 force-pushed the use_ninja branch 3 times, most recently from 27b49d5 to 1753bf1 Compare September 14, 2022 07:54
@sbc100 sbc100 changed the title Experimental support for building system libraries and ports with ninja Experimental support for building system libraries and ports with ninja. NFC Sep 14, 2022
@sbc100
Copy link
Collaborator Author

sbc100 commented Sep 14, 2022

PTAL. I could split this up more but I think its fairly concise as is.

Copy link
Member

@dschuff dschuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks pretty good in general

tools/system_libs.py Outdated Show resolved Hide resolved
tools/ports/__init__.py Outdated Show resolved Hide resolved
tools/ports/__init__.py Outdated Show resolved Hide resolved
# This means we can avoid completely rebuilding a library and just rebuild based on what changed.
#
# Setting EMCC_USE_NINJA=2 means that ninja will automatically be run for each library needed at
# link time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the difference between 1 and 2 in this comment. Reading the code, it seems like in mode 2 we force rebuilding of all libraries, even ones already built?

Copy link
Collaborator Author

@sbc100 sbc100 Sep 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes kind of. 2 means than whenever we need access to a given library we re-run the ninja file.. which will then do all the dependency checks as rebuild as needed. This mean you can edit e.g. dlmalloc.c and then just run emcc hello.c and it will trigger the rebuilding of just that one file in that one library.

1 will use ninja to build things but won't run the ninja files at all if that libraries already exist.. you don't get the "edit" + emcc + "everything just works" workflow.

This interface isn't final or ideal, its just something I came up with for this experimental phase.

sbc100 added a commit that referenced this pull request Sep 14, 2022
`clear_project_build` is already called each time a new vesion of a
port is extracted.  There is no reason to also clear the build directory
during the `create` call.

This helps with #17809 and also simplifies the code in each port.
sbc100 added a commit that referenced this pull request Sep 14, 2022
`clear_project_build` is already called each time a new vesion of a
port is extracted.  There is no reason to also clear the build directory
during the `create` call.

This helps with #17809 and also simplifies the code in each port.
@sbc100
Copy link
Collaborator Author

sbc100 commented Sep 14, 2022

Waiting from #17857 to land which will make this patch little less complex.

@sbc100 sbc100 force-pushed the use_ninja branch 2 times, most recently from c097a11 to 4a9334d Compare September 14, 2022 22:59
sbc100 added a commit that referenced this pull request Sep 14, 2022
…17857)

`clear_project_build` is already called each time a new vesion of a
port is extracted.  There is no reason to also clear the build directory
during the `create` call.

This helps with #17809 and also simplifies the code in each port.
@sbc100
Copy link
Collaborator Author

sbc100 commented Sep 15, 2022

Smaller patch, less duplication of code. PTAL.

The primary advantage of doing this is that it avoids having force
a completely rebuild of a system library when you want to rebuilt it.
Instead not have precise dependencies.
sbc100 added a commit to emscripten-core/emsdk that referenced this pull request Sep 15, 2022
We may start using it internally in emscripten. See
emscripten-core/emscripten#17809
sbc100 added a commit to emscripten-core/emsdk that referenced this pull request Sep 15, 2022
We may start using it internally in emscripten. See
emscripten-core/emscripten#17809
@sbc100 sbc100 enabled auto-merge (squash) September 15, 2022 12:59
Copy link
Member

@dschuff dschuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like the ports.write_file refactoring could be a separate change but if that's hard then LGTM, it's not too bad as it is.

@sbc100 sbc100 merged commit 53fb125 into main Sep 15, 2022
@sbc100 sbc100 deleted the use_ninja branch September 15, 2022 20:25
@sbc100
Copy link
Collaborator Author

sbc100 commented Sep 15, 2022

seems like the ports.write_file refactoring could be a separate change but if that's hard then LGTM, it's not too bad as it is.

Yes, I could have split that out too, although the main point of that is to avoiding re-writing header files with the same content, which is purely to avoid updating the timestamp which is only useful for the ninja build which uses timestamps to track dependencies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants