Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update warpctc library #225

Closed
wants to merge 2 commits into from
Closed

Update warpctc library #225

wants to merge 2 commits into from

Conversation

alealv
Copy link

@alealv alealv commented Oct 30, 2020

Original PR: [!215]

Summary

Brings the latest updates of warpctc and uses warpctc CMakeLists.txt submit in this PR. It works but I think it will take time to be approved.

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Oct 30, 2020
@alealv
Copy link
Author

alealv commented Oct 30, 2020

The only problem I have is that I got this warning when configuring CMake

CMake Error in flashlight/app/asr/third_party/warpctc/CMakeLists.txt:
  Target "warpctc" INTERFACE_INCLUDE_DIRECTORIES property contains path:

    "/home/aalvarez/Projects/flashlight/flashlight/app/asr/third_party/warpctc/include"

  which is prefixed in the source directory.

Copy link

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlikhomanenko has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link

@alealv has updated the pull request. You must reimport the pull request before landing.

Copy link

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlikhomanenko has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link

@alealv has updated the pull request. You must reimport the pull request before landing.

@alealv
Copy link
Author

alealv commented Nov 4, 2020

Now it's working without issues

@tlikhomanenko
Copy link
Contributor

tlikhomanenko commented Nov 4, 2020

Could you fix the build? Still CI with CUDA has problems.

Copy link
Member

@jacobkahn jacobkahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is still failing because warpctc isn't in the export set for installing the asr lib target. Can we check if you're commenting out a line that might affect INSTALLABLE_TARGETS?

flashlight/app/asr/criterion/CMakeLists.txt Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link

@alealv has updated the pull request. You must reimport the pull request before landing.

2 similar comments
@facebook-github-bot
Copy link

@alealv has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link

@alealv has updated the pull request. You must reimport the pull request before landing.

@alealv
Copy link
Author

alealv commented Nov 11, 2020

CI is still failing because warpctc isn't in the export set for installing the asr lib target. Can we check if you're commenting out a line that might affect INSTALLABLE_TARGETS?

warpctc is being set in flashlight/app/CMakeLists.txt. I think that the issue is because this file is not the one with the add_subdirectory() sentence, instead is being called in flashlight/app/asr/criterion/CMakeLists.txt. Although, I don't know how should I properly fixed it.

@alealv
Copy link
Author

alealv commented Nov 11, 2020

I found the problem. It's cmake version. I will update it in the docker file in the next commit.

BTW, I have many questions regarding your CI and Dockerfile:

  • Why are you compiling SoundFile? Ubuntu has it already
  • Why do you have everything in a single RUN instruction? This is doesn't keep intermediate caches if something is wrong and then you need to start from the beginning again.
  • Why are you compiling all the tools with the default version of GCC for the OS and then installing GCCv5 for compiling flashlight? This can lead to failure.
  • Why are you installing CUDA on the virtual machine? You only need to install the nvidia drivers, docker and nvidia-docker2
  • Why aren't you compiling flashlight inside the Dockerfile, but doing it as an external command?

Finally, why don't you publish a Docker with flashlight in Dockerhub? I guess a lot of people will use it.

@facebook-github-bot
Copy link

@alealv has updated the pull request. You must reimport the pull request before landing.

@alealv
Copy link
Author

alealv commented Nov 11, 2020

I just updated everything, but I realized that the docker image isn't built in the CI. Should I update this?

@tlikhomanenko
Copy link
Contributor

Hi @alealv, thanks for your questions, please find my answers below.

I found the problem. It's cmake version. I will update it in the docker file in the next commit.

Thanks!

BTW, I have many questions regarding your CI and Dockerfile:

  • Why are you compiling SoundFile? Ubuntu has it already

Good point, maybe @jacobkahn knows why we did this. We can try to use standard ubuntu stuff. Feel free to send PR if you have time to test it =).

  • Why do you have everything in a single RUN instruction? This is doesn't keep intermediate caches if something is wrong and then you need to start from the beginning again.

When I test it locally I do by hand insertions of RUN and then debug. As final version we do all in 1 run because if you don't cleanup cache, tmp, etc. this will be stored in the intermediate layer and thus you will have larger size of final image. Because docker RUN works as an additional layer so it doesn't change the size of previous layer with RUN command. Our docker images are huge already, so this is just an attempt to have a bit less size.

  • Why are you compiling all the tools with the default version of GCC for the OS and then installing GCCv5 for compiling flashlight? This can lead to failure.

Good point, but mainly we have this on CI to be sure that for gcc 5 it is still working )people reported a lot of building issues some time for gcc v5), but docker images we have with latest version which is in Ubuntu. Let me think how we can fix this.

  • Why are you installing CUDA on the virtual machine? You only need to install the nvidia drivers, docker and nvidia-docker2

can you tell more how you propose to install nvidia driver (at least with the way we do it installs cuda driver too anyway)?

  • Why aren't you compiling flashlight inside the Dockerfile, but doing it as an external command?

for both CPU and CUDA CI we do compilation inside the docker files. We could just run Dockerfile build (where only flashlight builds) and do fix on test running to avoid duplication of flashlight build commands. Thanks for pointing. I need to check a bit how to do so for CI.

Finally, why don't you publish a Docker with flashlight in Dockerhub? I guess a lot of people will use it.

in .docker folder you can see that we have CPU and GPU base dockerfiles where we build all dependencies, and then dockerfiles which builds flashlight. We have github actions which build docker image of flashlight on top of base images and push it to dockerhub. So you can download both base images and image with flashlight build on top.

@tlikhomanenko
Copy link
Contributor

I just updated everything, but I realized that the docker image isn't built in the CI. Should I update this?

I need to upload updated version of base docker image, on top of which we build the flashlight. Let me test this locally and update base image (should be fast, as you tested that it is working for you). As review of PR will be done and base image is update we can rerun tests in CI and then merge. So let us review your PR now.

Copy link

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlikhomanenko has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link

@alealv has updated the pull request. You must reimport the pull request before landing.

@alealv
Copy link
Author

alealv commented Nov 12, 2020

Good point, maybe @jacobkahn knows why we did this. We can try to use standard ubuntu stuff. Feel free to send PR if you have time to test it =).

I've been using flashlight with fairseq for training Wav2Vec 2.0 and I didn't have any issue with audio data loading.

When I test it locally I do by hand insertions of RUN and then debug. As final version we do all in 1 run because if you don't cleanup cache, tmp, etc. this will be stored in the intermediate layer and thus you will have larger size of final image. Because docker RUN works as an additional layer so it doesn't change the size of previous layer with RUN command. Our docker images are huge already, so this is just an attempt to have a bit less size.

I think the best would be to use a Multistage build, but also you could use the experimental option --squash (check this post).
Also for a final image with flashlight installed the size can be reduce a lot, a mean a lot, if you use as base (after building) cuda:XX.X-cudnnX-runtime-ubuntuXX.XX instead (for example)

Good point, but mainly we have this on CI to be sure that for gcc 5 it is still working (people reported a lot of building issues some time for gcc v5), but docker images we have with latest version which is in Ubuntu. Let me think how we can fix this.

Personally, I think gcc v5 is quite old but I see that Ubuntu16.04 has at most GCC v5.3.1

can you tell more how you propose to install nvidia driver (at least with the way we do it installs cuda driver too anyway)?

You can simply run sudo apt-get -y install cuda-drivers
Check here and/or here

@tlikhomanenko
Copy link
Contributor

Good point, maybe @jacobkahn knows why we did this. We can try to use standard ubuntu stuff. Feel free to send PR if you have time to test it =).

I've been using flashlight with fairseq for training Wav2Vec 2.0 and I didn't have any issue with audio data loading.

When I test it locally I do by hand insertions of RUN and then debug. As final version we do all in 1 run because if you don't cleanup cache, tmp, etc. this will be stored in the intermediate layer and thus you will have larger size of final image. Because docker RUN works as an additional layer so it doesn't change the size of previous layer with RUN command. Our docker images are huge already, so this is just an attempt to have a bit less size.

I think the best would be to use a Multistage build, but also you could use the experimental option --squash (check this post).
Also for a final image with flashlight installed the size can be reduce a lot, a mean a lot, if you use as base (after building) cuda:XX.X-cudnnX-runtime-ubuntuXX.XX instead (for example)

Good point, but mainly we have this on CI to be sure that for gcc 5 it is still working (people reported a lot of building issues some time for gcc v5), but docker images we have with latest version which is in Ubuntu. Let me think how we can fix this.

Personally, I think gcc v5 is quite old but I see that Ubuntu16.04 has at most GCC v5.3.1

can you tell more how you propose to install nvidia driver (at least with the way we do it installs cuda driver too anyway)?

You can simply run sudo apt-get -y install cuda-drivers
Check here and/or here

Thanks for your comments and suggestions, added in my todo on fixing this. But will work on this in several weeks only probably. About gcc - yep, mostly we are support the version which is default in ubuntu 16 for now, as people are still widely using it.

@WilliamTambellini
Copy link
Contributor

Personally, I think gcc v5 is quite old but I see that Ubuntu16.04 has at most GCC v5.3.1

If I may comment: the question is not really into the version of gcc, but more into the c++ standard: at the moment flashlight only allows c++11 in the main cmake:

set_target_properties(
    flashlight
    PROPERTIES
    LINKER_LANGUAGE CXX
    CXX_STANDARD 11
    )

at least for the fl core cpp side (not speaking about third parties). A step fwd would be to adopt cpp14.
gcc 5 seems to be c++14 capable:
https://gcc.gnu.org/projects/cxx-status.html
@tlikhomanenko any reason to restrict to cpp 11 ?

@@ -3,7 +3,7 @@
# ------------------------------------------------------------------
# Ubuntu 18.04
# OpenMPI latest (apt)
# cmake 3.10 (git)
# cmake 3.18 (git)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.18 is too new. Ubuntu 18.04 supports 3.10
https://launchpad.net/ubuntu/bionic/+source/cmake
Ubuntu 20.04 supports 3.16
https://launchpad.net/ubuntu/focal/+source/cmake
I think that we don't want to break the builds for users on these platforms.

Copy link
Author

@alealv alealv Nov 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snapcraft which is installed by default in Ubuntu has CMake 3.18.4 as default and is super easy to install:

snap install --classic cmake

And even so, you could download the compiled version for Linux from their repo and it works out of the box. And installing from source is also super easy.

We could try to run it with CMake 3.15 which I think is the minimum necessary version, although I haven't tested.

@jacobkahn
Copy link
Member

jacobkahn commented Nov 15, 2020

Hey all -- I'm sorry I'm late to this discussion. Thanks for all of this conversation so far. I'll answer some of the questions and will post another response later when I'm back at my machine. Please do add any follow-up questions from these. I hope this information is generally useful to everyone, cc @WilliamTambellini @tlikhomanenko @avidov @padentomasello.

This PR:

@aleav — thanks again for your work on this and for all of your input. I will get to reviewing this change soon (see my note below as well) I'm going to answer a bunch of questions as best I can here:

Why are you compiling SoundFile? Ubuntu has it already

As it stands, for flashlight, libsndfile file needs to be built with the "external libraries" functionality and the last time I wnt through our Docker config, the apt packages don't have those things linked. Installing libogg, libflac, libvorbisbefore buildinglibsndfile` from source ensures that it will be built with those external libraries.

If things in apt are updated, we won't need to build from source anymore. I haven't checked this in a while but will check again.

Why do you have everything in a single RUN instruction? This is doesn't keep intermediate caches if something is wrong and then you need to start from the beginning again.

Absolutely agree -- this should definitely be changed. Things are really hard to reason about with a single RUN. I'm in the process of splitting up our CI components to run tests from different flashlight components, and I've done this as part of that change, so expect to see this soon.

I think the original reason we did this was because CircleCI didn't have the ability to run docker run with --runtime=nvidia, which ensures that the guest OS has access to the NVIDIA runtime. I'll do a check to see if there's updated wisdom about this, but either way, we can probably split up our RUNs. Thanks for pointing this out either way.

Why are you compiling all the tools with the default version of GCC for the OS and then installing GCCv5 for compiling flashlight? This can lead to failure.

Yep. It's bad. I also have a draft fix for this which makes the CI turnaround significantly better. I hope to get that out early this week.

Why are you installing CUDA on the virtual machine? You only need to install the nvidia drivers, docker and nvidia-docker2

Best I can remember, nvidia-docker 2.0 wasn't around when we originally worked on this, but I'll look into this again. CUDA was required at the time to do what we needed. Thanks for the pointer.

Why aren't you compiling flashlight inside the Dockerfile, but doing it as an external command?

I don't quite understand this question -- we want to do do a build from source -- the Dockerfile we use is a "base" dockerfile which has all of the dependencies installed except for flashlight. This ensures dependencies don't break the CI build randomly.

Re CUDA and flashlight:

I'm actually in the process of refactoring a lot of our build components that deal with CUDA things to enable separable compilation and device linking, so some of the changes in this PR will probably need to be rebased on that. The goal is to link warpctc and the flashlight library CUDA kernels into a single shared object so that we aren't exporting a million CMake targets.

As part of this, I'll further centralize the place where we configure nvcc. The warpctc CMake stuff we have is a disaster because we have multiple sources of truth adding flags to the compiler, and old CMake versions don't support CUDA library targets, so warpctc's nvcc flags are getting passed to global context. Part of the work I'm doing is consolidating the places where we configure nvcc and CUDA compilation so random options in warpctc don't implicate how we build CUDA kernels in the rest of flashlight and cause inconsistencies.

Why are we still C++11 compatible? Why this gcc >= 5 requirement?

We've had discussions on this internally, and I'm thinking of relaxing this soon. The reasons for wanting to stick to old compiler versions are pretty straightforward -- the biggest is that we want to make sure that people can build flashlight on Ubuntu 14.04 with the default compiler, but given that 18.04 is quite standard now (not to mention 16.04) following default compilers in newer settings is fine, and a C++14 upgrade is probably very reasonable.

Why support older versions of CMake?

This has similar logic to the above as you've discussed - we want to follow aptitude ppas for some minimum Ubuntu version, which was 14.04. We'll see about relaxing this, especially given snap and the ease of wgetting a cmake binary that works out of the box. I should revisit the the closure of minimum CMake versions for our dependencies and see if we can't bump things -- ArrayFire still supports 3.5.1 as a minimum.

An important note:

vcpkg support for flashlight (and everything in our dependency tree) is on the final stretch which will hopefully make setting up an environment significantly easier. Using vcpkg in CI for flashlight's dependencies will further simplify our CI if we choose to go that route.

@alealv
Copy link
Author

alealv commented Nov 15, 2020

Thanks, @jacobkahn, and @tlikhomanenko for your clear explanations.

Regarding my question:

I don't quite understand this question -- we want to do do a build from source -- the Dockerfile we use is a "base" dockerfile which has all of the dependencies installed except for flashlight. This ensures dependencies don't break the CI build randomly.

I meant that you are starting a docker container and running these lines inside, instead of executing docker build of the Dockerfile-CUDA and Dockerfile-CPU.

Regarding this:

If things in apt are updated, we won't need to build from source anymore. I haven't checked this in a while but will check again.

If you check all dependencies are satisfied.

Regarding docker: I already have a branch in my fork where I updated the Docker images with two stages, which reduces the sizes a lot (I don't have the numbers now but it gets reduce almost to half). I was going to do a PR later, but you can use it as inspiration if you already have something in mind.
I also add a .dockerignore that reduces context loading time and I'm working on getting more stages to make a faster build activating Docker Built Kit.

Regarding CMake: First, since version 3.10 the CUDA language has been integrated into CMake and it's supported directly, so it would be better using regular CMake commands. But secondly, I would suggest moving at least to version 3.17, because now FindCUDA is deprecated in favor of FindCUDAToolkit. I guess in the near future they will add support for cub (a requirement for flashlight), which is integrated with CUDA toolkit since version 11.

Pushing a little further, I think it would be also great to fix this, and I would like for any of you to review this PR I did into fairseq. Sorry if I'm pushing too much, but I've been trying to fine-tune Wav2vec 2.0 in the last weeks and it's been complicated with the actual state of things.

@facebook-github-bot
Copy link

@alealv has updated the pull request. You must reimport the pull request before landing.

jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 25, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/text that referenced this pull request Mar 27, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 27, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/text that referenced this pull request Mar 27, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 27, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/text that referenced this pull request Mar 27, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 27, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/text that referenced this pull request Mar 27, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/text that referenced this pull request Mar 27, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/sequence that referenced this pull request Oct 13, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/sequence that referenced this pull request Oct 13, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/sequence that referenced this pull request Oct 13, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/sequence that referenced this pull request Oct 13, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/sequence that referenced this pull request Oct 13, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/sequence that referenced this pull request Oct 13, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
jacobkahn added a commit to flashlight/sequence that referenced this pull request Oct 13, 2022
Summary:
Pull Request resolved: flashlight/flashlight#266

*Looking for comments here.* We don't have to go with this, but this is a draft of what this looks like. Helps us eliminate a lot of technical debt for now. We might be able to one day relax this restriction down the line again if we need to.

Update the minimum version of CMake required for flashlight to 3.10. 3.10 deprecates `FindCUDA.cmake` and adds first-class support for CUDA code in CMake as a project language.

Among other things, this diff:
- **Adds support for CUDA 11** and running flashlight code on Ampere with the CMake build
- **Adds support for building flashlight libraries, flashlight core, and flashlight apps as shared objects/libraries**. This makes shipping these libraries to other places easy.
- **Removes the `fl-libraries-cuda` and `warpctc` targets from flashlight**. These used to be separate CUDA libraries that were linked with and exported as additional targets
  - They flashlight exported targets more difficult to understand
  - They had the potential to conflict with other bits, i.e. another build of warpctc on the user's machine
  - CUDA kernels in flashlight libraries are now part of the `fl-libraries` target which makes it transparently easy to add new CUDA sources to flashlight, cc chaitan3
- CMake 3.10 automatically enables CUDA Separable Compilation, device linking, and PIC.
- Removes the backport of `cuda_add_library` that was necessary for older versions of CMake
- Builds warpctc as part of `fl-app-asr`
- Removes unused cruft in warpctc related to the warpctc CPU backend (which we will never use or build)
- Removes direct linking of `OpenMP` and `Threads` `IMPORTED` targets because they incorrectly propagate flags to nvcc which aren't interpreted properly. This is probably fixable in a better way.
- Gates the installation of CUB for CUDA versions <= 11 (this is necessary - including CUB with CUDA >= 11 breaks things)

Changes that need to be rebased (so they're Github auto-mentioned):
- flashlight/flashlight#215
- flashlight/flashlight#225

Reviewed By: tlikhomanenko

Differential Revision: D25013556

fbshipit-source-id: 8bbb56783dba07c0f2e7209748ddcd0e7c481db8
jacobkahn pushed a commit to flashlight/sequence that referenced this pull request Oct 13, 2022
Summary:
**Original Issue**: This dicussion started [here](flashlight/flashlight#225 (comment))

The main point of this PR is to update dockerfiles to use a multistage build, reducing build time and final image size. But it also adds to small but important features:

- `.dockerignore` to avoid passing unnecessary files to the build context
- `ccache` support to CMake

> I'm sure that most of the packages installed with apt on some builds and especially on the final image can be removed. But you know better which ones to take out. This would further reduce the final image size

| image                          | build-time | size   |
|--------------------------------|------------|--------|
| cpu-base-consolidation-latest  | 454.4s     | 4.76GB |
| cpu-latest                     | 433.6s     | 3.38GB |
| cuda-base-consolidation-latest | 970.3s     | 13.7GB |
| cuda-latest                    | 440.3s     | 9.31GB |

Pull Request resolved: flashlight/flashlight#396

Reviewed By: jacobkahn

Differential Revision: D25820782

Pulled By: tlikhomanenko

fbshipit-source-id: eef3d0550dee862d9bbfb24aa06f716c92ef75e8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Do not delete this pull request or issue due to inactivity.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants