Build system #3086
I am not a big fan of CMake. @jtrmal should have a more informed opinion, as he spent some time playing with it. To me, it looks more cryptic than gmake. If we are open to any options at this point, we may consider Bazel. It is certainly different from other build systems, but it is very good at what it does. Also, guys, let's please seriously reconsider transitioning to Google Test. E.g., when rewriting the logging code, I had to eyeball log output instead of having program expected-output tests do it for me, and I missed their so-called "death tests" (you throw an unhandled exception, and the test succeeds when the code under test dies--it forks a copy of itself that is headed for seppuku internally, so zero setup for that, too). And it even comes in a single-header-only variant, is so mature that it's updated once in a few years, and is super-portable, even to Windows. |
Let me clarify on CMake. On Windows, it does a super-lame thing. It generates project and solution files from some templates, as if no one really put any effort into understanding what is in these files. It just stuffs filenames into templates that were apparently treated as a black box. I do not want to disparage anyone and call them out for not applying any effort to making CMake a decent system on all platforms; the task is indeed daunting. But, in the end, it supports over 9000 platforms... more or less. When you have one project, it's OK. When you have as many files and libraries as Kaldi does, and want to change just one little aspect (e.g. suppress one warning), you are either up for editing 50 project files or regenerating everything from the ground up. In make, I just add CXXFLAGS=, and Bob's yer uncle. That's not what I would rate as the best feature of a build system. It's this inherent property of being a "meta-build" system, not a build system sensu stricto, that I dislike. |
I find cmake better than plain make, mostly for its support for including dependencies. However, it's also a bit of a pain to wrap your head around. I've heard good things about bazel, in particular regarding build servers and incremental builds (both on server and client). |
What I like about cmake is that it's fairly expressive (in the sense that even if you are not familiar with it, you are able to figure out what most of the commands do).
It has a good dependency tracking system (probably make depend wouldn't be needed anymore).
It is fairly widely accepted (people know it) and is still being developed and supported.
I agree it does weird things when generating build files for Visual Studio. But it seems MS has the intention to support CMake in VS. At least I've heard something for VS2017, not sure if it happened or if they dropped it or what -- @kkm would probably know.
Big projects do use cmake -- I mean not all of them, but at least some of them -- I know about KDE and LLVM/Clang. Clang was very painless to build for me (compared to the gnu compilers). That might not be because of CMake/make, though.
I think Kaldi would be very easy to convert to CMake -- as a matter of fact, I wrote a perl script which converts the Kaldi makefiles into CMakeLists.txt. Also, OpenFST was very easy to convert to CMake.
Other options are Bazel (as Korbinian said), scons (uses python) or jam (which I know was/is used in boost).
I have no experience with Bazel. I've found jam (or bjam?) quite annoying when trying to build boost. I didn't spend too much time on scons.
|
They do. Which does not mean you can open a Kaldi-sized project with it. Since CMake generates about 500 separate project files, one per library and one per exe, it just blows up and dies spectacularly. For smaller projects, like zlib, it worked okay... It's not even MS to blame this time, it's how CMake handles Visual Studio: one output, one separate project in a separate directory. I do not care much about building the whole rig on Windows. I tried it once, and it was less than exciting in the end. I even ran an eg end to end (tedlium, IIRC), using Cygwin bash. I even added support for it to Kaldi code, so things like popen pipe stuff through bash. Windows 10 has added two interesting features. One is real functioning symlinks (in fact, they were in NTFS all the time, but with a twist: you needed to be an admin to create symlinks, and, while you can grant any account a separate privilege to create symlinks, this privilege is unconditionally suppressed for admin accounts unless you elevate, i.e. "sudo". So, to translate to Linux lingo, you must either be a non-sudoer or create symlinks under sudo. In W10, all you need is to enable developer mode). The second is native support for Linux user-mode binaries. This is an interesting way to go, but I am not sure if it will cut it for Kaldi. I'll try at some point. It's an interesting Linux. It actually runs in a kind of container, with a special init. Everything else in user mode is just a normal Linux, with ELF binaries running out of the box. There are a few distros available, including Ubuntu, which I use. I have no idea if this thing supports CUDA, though, and I do not really hold my breath for it. |
I think cmake vs. configure-script plus make is the real choice here.
Including improvements to the existing configure scripts.
My experiences with bazel have been terrible, mostly because it's extremely
hard to build itself.
cmake does seem to be quite popular, e.g. pytorch uses it and I think some
of Kaldi's own dependencies require it. I don't have much experience with
it though.
|
I have personally enjoyed using CMake quite a lot. It is a fairly involved
build system to set up, but it is also the easiest to extend in an already
existing C++ project, imo.
It also has the best CUDA support. Instead of our configure script, which
goes through a bunch of weird places where all the BLAS variants can be
installed, you can just do find(BLAS) (or something like that). And you no
longer need make depend! Hallelujah.
It even supports building pip wheels for python projects with C++
dependencies easily via scikit-build
<https://github.com/scikit-build/scikit-build>. I've done it myself here.
It requires very few lines of code. The setup.py
<https://github.com/galv/galvASR/blob/dev/setup.py> file is small. And the
CMakeLists.txt file basically just needs a single line to build a shared
object for python to load:
https://github.com/galv/galvASR/blob/dev/galvASR/tensorflow_ext/CMakeLists.txt#L15
No idea about cmake's support for windows. Frankly, it has never been a big
priority for me. I would say that is true for a lot of the ML/scientific
computing community.
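To illustrate the find(BLAS) point above, here is a minimal, self-contained sketch; it is not taken from any existing Kaldi CMake code, and the kaldi-matrix target and source file are placeholders. CMake's stock FindBLAS module does the probing that configure currently does by hand:
# Minimal sketch: let CMake locate whichever BLAS is installed (MKL, OpenBLAS,
# ATLAS, Accelerate, ...) instead of probing hard-coded paths in ./configure.
cmake_minimum_required(VERSION 3.10)
project(kaldi-blas-sketch C CXX)
find_package(BLAS REQUIRED)                         # stock CMake module; sets BLAS_LIBRARIES
add_library(kaldi-matrix matrix/kaldi-matrix.cc)    # placeholder source list
target_link_libraries(kaldi-matrix PUBLIC ${BLAS_LIBRARIES})
If needed, the search can be steered toward MKL with BLA_VENDOR=Intel10_64lp, but the default search already covers the common cases.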
|
I would also like to mention that I personally like how cmake makes installing dependencies really, really easy.
For example, even though wav2letter and flashlight have a nightmarish number of dependencies, it is actually fairly easy to install them via cmake because almost all of the dependencies are cmake projects. I made an example project doing that here:
https://github.com/galv/wav2letter-sample-project/blob/master/CMakeLists.txt
I probably should have shown that to the FAIR people at some point, since their installation instructions are painful and use too much Docker, but oh well.
|
The windows cmake might be weak in how it does what it does, but if the input to me is the same, I don't see why I care what cmake does behind the scenes. The current process within kaldi has to be maintained, and with cmake that maintenance goes down considerably. Part of make is pretty simple and that part is very good, but right now the configure script has to be maintained, and it seems to me that cmake is better than this scenario. As far as what google puts out, they tend to release a tool and never update it for a decade or more. Mostly unsupported software. Also, CMake has many Find... scripts that already exist. I'd not over-engineer the build system. And we use cmake here, so I've used it a bunch. There are absolutely things I like and things I don't. I'd take a simple makefile any day, but you don't start with that. |
pybind11 has good integration with cmake too; if you are using cmake it
makes it quite easy to generate the python package or whatever it is, I
think.
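For concreteness, the pybind11 + CMake integration boils down to something like the sketch below; the module name and the binding source file are made up for illustration, and this assumes pybind11 is installed where find_package can see it:
# Sketch only: build a Python extension module from a hypothetical binding file.
cmake_minimum_required(VERSION 3.10)
project(kaldi-pybind-sketch CXX)
find_package(pybind11 REQUIRED)   # pybind11 ships its own CMake config files
# One call produces a properly named shared object that Python can import.
pybind11_add_module(kaldi_pybind python/kaldi-pybind.cc)
After building, Python can import the resulting kaldi_pybind module directly from the build directory.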
|
My experience with CMake has also been positive so far, even though I've only used it for relatively small projects. It also has good support for cross-compilation via so-called toolchain files. IIRC Google's Android build system (based on Gradle) uses CMake behind the scenes to build the JNI code. TL;DR: +1 for CMake. |
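For reference, such a toolchain file is just a handful of variables. The sketch below targets aarch64 Linux; the compiler and sysroot paths are assumptions for illustration:
# aarch64-toolchain.cmake -- illustrative only; paths are assumptions.
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR aarch64)
set(CMAKE_C_COMPILER   /usr/bin/aarch64-linux-gnu-gcc)
set(CMAKE_CXX_COMPILER /usr/bin/aarch64-linux-gnu-g++)
# Look for headers and libraries only in the target sysroot, never on the host.
set(CMAKE_FIND_ROOT_PATH /opt/sysroots/aarch64)
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
It is selected once at configure time with cmake -DCMAKE_TOOLCHAIN_FILE=aarch64-toolchain.cmake, and the rest of the project does not need to know it is being cross-compiled.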
Looks like there is pretty much a consensus on CMake. : ) Speaking of Windows, if you're up to running a full training pipeline, CMake is going to be the least of your problems anyway, so I am not worrying much about its poor Windows support. |
My experiences with cmake have been generally positive too. Recently I had some trouble configuring a project to build shared libraries (instead of static), but that's probably because I have never read any cmake manuals/tutorials. So I think cmake might be good. However, the current build system is already very good in the sense that it works on most machines without any issues. So unless we plan to add a lot more dependencies to Kaldi, I'm not sure what the benefit of switching to cmake would be. |
We can assess whether cmake is actually better once someone actually comes
up with a proposed build system based on cmake. I am hoping someone will
volunteer for that.
One advantage is that if we do python wrapping with pybind, cmake makes
that super easy I believe.
And it should be less effort to maintain when things like new Debian or Red
Hat versions come out, as we are piggybacking off cmake's work.
|
@danpovey I will volunteer for it. I have the experience it takes. I'm not
entirely sure about all of the work involved, but I hope to essentially
eradicate the tools directory, replacing it with a single .cmake file which
downloads and builds the dependencies for us, as well as converting the
src/ directory to use CMakeLists.txt.
In case I get unresponsive, feel free to ping me. If anyone else would like
to do this as well, you can also ping me.
I am working on top of your kaldi10 branch, Dan.
|
Great.
Regarding the 'tools' directory, bear in mind that it contains two types of
things:
(1) libraries and headers that are required to build Kaldi
(2) tools, some mandatory (OpenFst), some optional, that are required to
run some Kaldi recipes, and ways to install them.
For (1), it may make sense to get cmake more involved; but for (2) we still
need the tools/ directory.
Also there may be reasons to avoid the "system" versions of, say, OpenFst;
for example, it might not be built with the flags that we need it to be
built with, or something like that. I guess what I'm saying is: don't be
too aggressive, and remember that the convenience of users is paramount; if
there is a choice between ease of installation (or being more robust), vs.
doing things the "cmake" way, I don't want to do things the "cmake" way.
The check_dependencies.sh script may still be quite helpful because it is
quite explicit about how to fix problems. My experience with cmake errors
is that while, to me, it tends to be pretty obvious how to address them,
they are definitely not obvious to all of the kinds of people who ask
questions on the Kaldi lists.
|
I see. What I like about CMake is that you can add external projects, even those that don't use cmake, like openfst. You can dig around in this file to see what I mean: https://github.com/galv/galvASR/blob/7d5d7826805cbbd0b40954f7eec262f0a7e35f01/galvASR/cmake/external.cmake#L49 It can even download archives and unzip them for installation as well, like we do manually. So you could imagine removing the tools/Makefile target and replacing it with a new cmake target which encodes the same thing. I'll see if it is reasonably clean to do or not. (My experience mucking with these details and my unwavering commitment to C++ in spite of it is why I am volunteering for this. I'm not sure how many people have the patience for all the details of C++ build systems!) Using C++ system libraries rather than project-local libraries which you build yourself is just an exercise in futility nowadays unless you're at a big company which can afford to custom tailor its build, especially after libstdc++ broke the std::string and std::list ABI to be C++11 compatible. |
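As a sketch of the approach described above, CMake's ExternalProject module can drive an autotools project like OpenFst without that project knowing anything about CMake. The version, URL and configure flags below are illustrative assumptions, not a proposal for the actual tools/ replacement:
# Sketch: download and build OpenFst at build time with its own configure/make.
include(ExternalProject)
ExternalProject_Add(openfst_external
  URL               http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-1.6.7.tar.gz
  PREFIX            ${CMAKE_BINARY_DIR}/openfst
  # OpenFst uses autotools, so override the default CMake-based steps.
  CONFIGURE_COMMAND <SOURCE_DIR>/configure --prefix=<INSTALL_DIR>
                    --enable-static --enable-shared --enable-ngram-fsts
  BUILD_COMMAND     make -j4
  INSTALL_COMMAND   make install
)
Downstream targets can then depend on openfst_external and pick up the headers and libraries from the install prefix, much like the imported-target approach mentioned above.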
@galv I should be able to help you test some of the variants including windows. |
IMO CMake is the best of the current c++ build systems. It is widely supported, almost all developers know how to use it and it has good support for third party libraries and also a number of ways of finding third party libraries. If you do go down the CMake route it might also be worth looking at conan package manager as an option for the configuration of the third party libraries / tools. |
+1 for CMake It's pain in the ass when you want to integrate a makefile project with a CMake one. With cmake, you can use vcpkg to handle your dependencies on windows. |
@kkm000 Good point. I'm not sure about CUDA but it should be possible to package OpenFST. And from a technical point of view I think MLK would be possible too but it might not be allowed because of licensing. |
I don't think packaging all the dependencies is a good idea. It might be necessary to run some pre-script that downloads the required packages like fst. I am pretty sure this could be done within CMAKE with an execute_process command either calling a script or putting the actual get commands in. And putting the MKL into the pull would be kind of crazy. I've got multiple copies of kaldi and this would balloon my storage requirements to a very crazy amount of storage. The MKL is expected to be installed in specific locations and within a source tree is not a reasonable location for so many reasons. CUDA is installed with an RPM or other package management system. There is no reason for this to change |
If you intend to implement a CMake build system for Kaldi, I thoroughly recommend taking 2 or 3 days to read Professional CMake: A Practical Guide <https://crascit.com/professional-cmake/> by Craig Scott. It's the best resource I've found for learning what CMake is capable of, and what you should and should not do. For downloading dependencies, CMake offers several options:
1. FetchContent downloads a dependency at config time. Works best if the dependency is source code that can be built by CMake. Also works if the dependency is a pre-compiled binary.
2. ExternalProject_Add downloads a dependency at build time. Works well for all types of dependencies but often implies that you use a "Superbuild" CMake structure. Allows you to define your own Config, Build, Install, and other custom steps if the dependency doesn't use CMake. When I say "all types" I mean:
- Source that builds using CMake
- Source that builds using some other build system
- Pre-compiled binaries
- If a dependency is installed using a package manager then it may require administrator privilege and should be installed manually before initiating the CMake build. Or perhaps you could use (4) or (5) below to invoke the package manager at config time.
3. You can use add_custom_command to run a script at build time. This custom command can then be wrapped in a CMake target using add_custom_target so other targets can depend on the outputs of the script.
4. You can use include(script.cmake) to run a CMake script at config time, using the same variable scope as the rest of your project.
5. Finally, as @btiplitz mentioned, you can use execute_process to run a script at config time.
I try to avoid native scripts like bash and bat because they break cross-platform compatibility and create duplicate work. Using Python or Perl scripts is okay if you're okay with adding them as a dependency. I prefer to write cross-platform CMake scripts and invoke them using the ${CMAKE_COMMAND} -P script.cmake calling syntax.
For creating CMake targets once your dependencies are downloaded, CMake offers several options:
1. Ideally, your dependency provides a config module DependencyConfig.cmake that defines CMake targets for all of its build outputs along with the transitive dependencies between them. It's possible for them to provide a CMake config module even if they do not build using CMake.
2. If your dependency does not provide a config module, then you can write your own CMake find module FindDependency.cmake that attempts to find the necessary build outputs, wrap them in CMake targets, and define the transitive dependencies between them. Find modules can work with dependencies that are installed via package manager as well.
Once targets are defined, linking against a dependency is as simple as adding it to your target_link_libraries() list.
EDIT: spelling |
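As a small illustration of option (1) above, and tying in with the Google Test suggestion earlier in the thread, a FetchContent setup is only a few lines. This is purely a sketch; the release tag and the test source file are assumptions:
# Sketch: fetch and build googletest at configure time.
cmake_minimum_required(VERSION 3.14)   # FetchContent_MakeAvailable needs 3.14+
project(kaldi-fetchcontent-sketch CXX)
include(FetchContent)
FetchContent_Declare(googletest
  GIT_REPOSITORY https://github.com/google/googletest.git
  GIT_TAG        release-1.10.0
)
FetchContent_MakeAvailable(googletest)   # defines the gtest and gtest_main targets
add_executable(matrix-test matrix/kaldi-matrix-test.cc)   # placeholder test source
target_link_libraries(matrix-test PRIVATE gtest_main)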
I am still torn on whether to go the CMake route. I guess I feel that just
refactoring the 'configure' script (e.g. having it call other scripts that
are easier to read) might also be a viable option, as it would be so much
more self-explanatory than a CMake setup, and easier to modify. I mean,
CMake is just *so much framework*.
|
@DMATS Hi, I haven't met you before, but thank you for recommending that book! I wasn't aware that an authoritative book had actually been written on cmake. I had picked up most of my knowledge from presentations. I was personally planning to require people to depend on typing
Regarding packaging openfst and then doing Find(OpenFST): no, I don't think that's a good idea. The ABI compatibility is too complicated. We can build it already ourselves, and create an "imported target" in cmake which provides its include directories and libraries for us. I did this in a local change on my PR #3100, but I need to rebase that on top of the latest kaldi10 and push, and I need to run. |
I've been writing up a list of nice things that cmake will give us. They are:
|
@galv Pleasure to make your acquaintance. Respectfully, I wasn't trying to suggest that using pre-built OpenFST binaries is the best idea in Kaldi's case. Obviously, the find module below is incomplete, as it completely neglects the other build products OpenFST produces. But they could be found in effectively the same way.
#=================================
# OpenFst
#=================================
find_path(Kaldi_Tools_DIR
NAMES extras/check_dependencies.sh
HINTS ${Kaldi_ROOT_DIR}/tools
)
find_path(Kaldi_OpenFst_INCLUDE_DIR
NAMES fst/fstlib.h
HINTS ${Kaldi_Tools_DIR}/openfst/include
)
find_library(Kaldi_OpenFst_LIBRARY
NAMES fst
HINTS ${Kaldi_Tools_DIR}/openfst/lib
)
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(Kaldi_OpenFst
REQUIRED_VARS Kaldi_OpenFst_INCLUDE_DIR Kaldi_OpenFst_LIBRARY
VERSION_VAR Kaldi_OpenFst_VERSION
)
mark_as_advanced(Kaldi_OpenFst_INCLUDE_DIR Kaldi_OpenFst_LIBRARY Kaldi_OpenFst_VERSION)
if(Kaldi_OpenFst_FOUND)
set(Kaldi_OpenFst_INCLUDE_DIRS ${Kaldi_OpenFst_INCLUDE_DIR})
set(Kaldi_OpenFst_LIBRARIES ${Kaldi_OpenFst_LIBRARY})
endif()
if(Kaldi_OpenFst_FOUND AND NOT TARGET Kaldi::Kaldi_OpenFst)
add_library(Kaldi::Kaldi_OpenFst UNKNOWN IMPORTED)
set_target_properties(Kaldi::Kaldi_OpenFst PROPERTIES
IMPORTED_LINK_INTERFACE_LANGUAGES CXX
IMPORTED_LOCATION ${Kaldi_OpenFst_LIBRARIES}
INTERFACE_INCLUDE_DIRECTORIES ${Kaldi_OpenFst_INCLUDE_DIRS}
)
target_link_libraries(Kaldi::Kaldi_OpenFst
INTERFACE
${CMAKE_DL_LIBS}
)
endif()
EDIT: Clarify that resulting OpenFST target is imported. |
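For completeness, consuming that imported target from a Kaldi CMakeLists.txt would then look roughly as below; kaldi-fstext and its source file are made-up names used only for illustration:
# Assumes the find module above is saved as FindKaldi_OpenFst.cmake in a cmake/
# directory of the source tree, and that Kaldi_ROOT_DIR points at the Kaldi
# checkout so the HINTS in the module resolve.
list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)
find_package(Kaldi_OpenFst REQUIRED)
add_library(kaldi-fstext fstext/kaldi-fst-io.cc)   # placeholder source
# Include directories and the fst library propagate from the imported target.
target_link_libraries(kaldi-fstext PUBLIC Kaldi::Kaldi_OpenFst)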
@danpovey I'd agree that if it takes 2-3 days to understand cmake, that seems way too complicated, but if one person does the transition to cmake, then the 2-3 days does not apply to every person using kaldi, as only the person doing the whole conversion needs to understand it. Once converted and working, it would be rare that you would need to understand everything. A Google search is pretty effective for doing a single task, and more effective than scanning any book. |
I'll check if the build with gold is faster. But generally, I've never been troubled by the build speed. 10% performance translates into 1 hour of churn out of 10. The difference in build speed will certainly be less than an hour. And an extra 15 minutes of reading the manual while the whole rig compiles will save more time than poking around. What would be a setting where build speed is a concern? Maybe I just do not understand what a common scenario is? |
I just mean for people who are new to Kaldi and may give up before it
compiles
|
... but I'm primarily talking here about what default setting to put in
configure (e.g. --shared), not about changing the whole build system to
slightly improve build speed.
|
It is interesting that I am getting no different build times at all, static or shared. Also, both compilations always use -fPIC. I compiled both runs with dynamic MKL, using mostly defaults (using the MKL branch, currently in PR) and not counting OpenFST (pre-built for both flavors) or make depend; so only compiling and building libs and binaries. I cleaned up quite thoroughly between builds to avoid any contamination. Static:
./configure --cudatk-dir=/opt/nvidia/cuda-10.0 --static --static-math=no
( time make --output-sync=target -j16 ) &> default-static.log
I got 485 executables, size 13051 MB; and real/user time ≈ 5/60 minutes. Shared:
./configure --cudatk-dir=/opt/nvidia/cuda-10.0 --shared --static-math=no
Same make command, only a different log. Now the 485 executables take up 1403 MB, and 22 DLLs 316 MB. Time is also ≈ 5/60 minutes.
So while the total size of the generated final files is significantly less (7.5 times, 13 vs 1.7 GB), the total and CPU times are pretty much the same. Maybe my setup is different from that of most people? I'm running 16 builds on 16 physical cores; the machine is more powerful than average, but I've seen much bigger ones. And the disks in it are not even M.2; they are just plain boring SATA SSDs (I kept them during the last upgrade because ML tasks are rarely disk bound, comparing M.2 vs SSD speeds). So I would say if it's above average for a good workstation, then not by much. At the same time, I hear in this thread from at least @galv and @danpovey that the shared build is faster. (Speaking of user entry simplicity, rebuilds probably do not count--it's about ensuring that the initial build is not insanely long and is reasonably efficient.) I am trying to assign at least some numeric values to "insane" and "efficient". Maybe I have too many cores? I could try -j8 or even -j4. But I think I'll get a less-than-linear increase in build time, from what I see in the CPU load graph. And why do we use -fPIC in static builds, and also for binaries? Probably we should not, or is there a reason? If either of these two things does not have a ready explanation, I believe I must fork off another thread. I'd prefer to focus on the question of cmake or not, an entirely different topic. And I totally want to kill the threaded math options, too. |
OK, that's interesting.
Yeah, we shouldn't be using -fPIC for static builds.
Thanks for looking at the build system, it needed some attention.
|
I'll look. Removing the multithreaded math options from configure is on my plate too; it must be made cleaner w.r.t. build options. It does not add -fpic by itself, but the templates under makefiles/ vary. I can run on a few linuxes under Docker; the only thing I do not have permanent access to is a Mac. |
@kkm000 If you want me to run a build on Mac let me know. |
@langep, actually, we've just checked in a fix to the broken configure, so I would appreciate it if you could test it! I am worried about the syntax of the code I added, as current Macs use bash 3.2. I tested snippets in a bash 3.2 docker container, but confirming that the updated configure is digestible on a Mac, and that it does actually work, would be a big deal. Thank you! The changeset is in #3216, now on master. |
@kkm000 Do you want me to run a specific configuration or just the default install? |
@langep, the default would be fine, thanks. The main concern is whether the script won't spit any syntax error or such. On Darwin, Accelerate is currently selected by default anyway, AFAIK. If there are any errors, please give the output of |
@kkm000 I ran the default install. Kernel version: 18.5.0
@langep, thanks much for checking! |
On the topic of CMake vs no CMake. I tried a cmake build of OpenBLAS. Their build warns about CMake support being experimental:
CMake Warning at CMakeLists.txt:46 (message):
  CMake support is experimental. It does not yet support all build options
  and may not produce the same Makefiles that OpenBLAS ships with.
Fair. Also, keep in mind that I am somewhat biased against it; so far my builds of complex projects using CMake were tremendously hard to track when they ran into problems. It is also very likely that I do not know about its debugging/tracing facilities that could make pinpointing the source of the problems easier. In other words, it is not impossible that I am shifting the blame onto CMake when I should have spent more time learning the tool, so that I could use it to its full potential, not work against it. So I figured out the build options (cmake -LA[H] prints them, in more detail with the H), and attempted a build. It did not go well. One thing I immediately stepped on is how unhelpful it is in case of a command-line error. Take this:
cmake -DCMAKE_BUILD_TYPE=RELWITHDEBINFO -DUSE_THREAD=0 -DGEMM_MULTITHREAD_THRESHOLD=0 -CMAKE_INSTALL_PREFIX=$(pwd)/install
. . .
-- Copying LAPACKE header files to include/openblas
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Configuring incomplete, errors occurred!
See also "/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeOutput.log".
$ wc -l /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeOutput.log
414 /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeOutput.log
Ehm.. Okay. One of these half a thousand lines is going to tell me something, I thought. Probably something closer to the end? This is how the log ends:
Determining if the Fortran compiler supports Fortran 90 passed with the following output:
Change Dir: /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp
Run Build Command:"/usr/bin/make" "cmTC_11695/fast"
/usr/bin/make -f CMakeFiles/cmTC_11695.dir/build.make CMakeFiles/cmTC_11695.dir/build
make[1]: Entering directory '/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp'
Building Fortran object CMakeFiles/cmTC_11695.dir/testFortranCompilerF90.f90.o
/usr/bin/gfortran -c /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp/testFortranCompilerF90.f90 -o CMakeFiles/cmTC_11695.dir/testFortranCom
Linking Fortran executable cmTC_11695
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_11695.dir/link.txt --verbose=1
/usr/bin/gfortran CMakeFiles/cmTC_11695.dir/testFortranCompilerF90.f90.o -o cmTC_11695
make[1]: Leaving directory '/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp'
Nothing even closely related to the last successful action ("found PkgConfig") in the screen output. Now, I do not know if my mistake clearly stands out to someone used to the tool, but I missed the -D in the very last assignment on the command line: -CMAKE_INSTALL_PREFIX=$(pwd)/install should have been -DCMAKE_INSTALL_PREFIX=$(pwd)/install. To be fair, -C is a command-line switch, naming a cache file (default CMakeCache.txt). So I looked whether it had created a file with a funny name like MAKE_INSTALL_PREFIX=, or maybe a directory, as the argument expanded to /home/kkm/work/kaldi2/tools/OpenBLAS/install after the =. Nope. "Configuring incomplete, errors occurred!" was all I got.
After fixing the command line, I got a lot of errors from g++, all of which essentially meant that it was not passed the correct value for the -march= switch. The CPU was correctly detected by the build (SkylakeX, the architecture with AVX512 support), but every use of an AVX512 builtin was yelled at by the compiler. This is where I tried to figure out how the switch gets its value.
I looked at the generated Makefiles. I would say that there is no hope for a human to make sense of them. The Makefile just invokes $(MAKE) -f CMakeFiles/Makefile2 xxx for every target, except the helpfully added target named help, which helpfully prints the list of all automatically generated targets:
$ make help | wc -l
13337
But I wanted to figure out where the wrong -march switch came from. This CMakeFiles/Makefile2 file is more invocations of cmake and make, but finally I could trace it, mostly by grepping, to kernel/CMakeFiles/kernel.dir/build.make, which actually did something, actually invoking the compiler:
kernel/CMakeFiles/kernel.dir/CMakeFiles/ztrsm_iltucopy.c.o: kernel/CMakeFiles/ztrsm_iltucopy.c
	@$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --green --progress-dir=/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles --progress-num=$(CMAKE_PROGRESS_414) "Building C object kernel/CMakeFiles/kernel.dir/CMakeFiles/ztrsm_iltucopy.c.o"
	cd /home/kkm/work/kaldi2/tools/OpenBLAS/kernel && /usr/bin/cc $(C_DEFINES) $(C_INCLUDES) $(C_FLAGS) -o CMakeFiles/kernel.dir/CMakeFiles/ztrsm_iltucopy.c.o -c /home/kkm/work/kaldi2/tools/OpenBLAS/kernel/CMakeFiles/ztrsm_iltucopy.c
So, it seems that I had to find where the wrong C_FLAGS come from. It looks like many subdirectories got their own generated flags.cmake file, where this is defined, with some set of basic flags and no trace of -march. My next take was to find whether -march is used anywhere at all. This got me to the cmake/ subdirectory, where apparently most of the build system is scripted. All in all it seems that it is just very incomplete, and it sets -march only once, for the mips64 architecture (https://github.com/xianyi/OpenBLAS/blob/develop/cmake/cc.cmake#L28). Nothing wrong with an early work in progress, it's ok, I just thought at this point how much *more* will have to be added to this already dense branching. Now, the OpenBLAS build is tremendously complex, no question about this. I am not speaking about the size. But compare this to the line in Makefile.system from which the above is apparently being ported (https://github.com/xianyi/OpenBLAS/blob/develop/Makefile.system#L618). Seems pretty much, well, the same; ifeq in one, STREQUALS in the other.
It's possible they may not be using CMake the best possible way, but this does not look like an obvious improvement to me. Also, the warning at the beginning of this file (https://github.com/xianyi/OpenBLAS/blob/develop/cmake/lapack.cmake#L1) and its entire content also got me a bit worried. With make, at least, a Makefile is passive: if it exists in a directory and you do not touch it, it does not touch you. But the indication here is that the existence of CMakeLists.txt in subprojects interfered with their build process, so they had to implement this workaround. At this point I thought to myself, well, if we have CMake files in our tools/, and OpenBLAS is in a subdirectory of it, and some other things we build, too, and we also want to build it from the top of our CMake files... You see the point. Again, maybe they are just not using it right, but this is something to think about. And if our dependency list quoted in #3086 (comment) is considered unwieldy, then that file (and it is not alone; there is also lapacke.cmake, https://github.com/xianyi/OpenBLAS/blob/develop/cmake/lapacke.cmake, en pendant to it) does not score too high on my wieldiness scale either. Neither did I immediately fall in love with the code in this file: https://github.com/xianyi/OpenBLAS/blob/develop/cmake/utils.cmake. No, gmake is not better for coding code either, even worse, and I did really appreciate the perseverance of the guy who once coded a solver for the Towers of Hanoi in Postgres SQL, but, I do not know, it just does not look like it was used for what it should have been used for. Maybe, again, it's just a bad example, as I just picked a project which happened to sit there; and if GNU make had a regex match operator, there certainly would be makefiles using it, for the better or, rather likely, for the worse.
As for performance, it's pretty much equal to their standard make-based build, which is rather a pleasant surprise, given the complexity, size and the number of generated makefiles and targets. Standard build:
$ make -j16
. . . .
To install the library, you can run "make PREFIX=/path/to/your/installation install".
real 0m50.506s
user 7m45.755s
sys 1m18.449s
cmake build (with -march=native injected manually):
$ cmake -DCMAKE_C_FLAGS=-march=native -DCMAKE_BUILD_TYPE=RELWITHDEBINFO -DUSE_THREAD=0 -DGEMM_MULTITHREAD_THRESHOLD=0 -DCMAKE_INSTALL_PREFIX=$(pwd)/install
$ make -j 16
. . .
real 0m46.178s
user 7m57.707s
sys 1m35.371s
(The difference is explained by the fact that the cmake port did not run the tests, while the original build did.) So performance should not be a concern.
What I cautiously think I can take home from this experiment: the complex build is complex, but I already noted that, so no big news here. There is certainly a learning curve, and I cannot estimate how steep; the only problem I traced (the missing -march) was not very easy to trace through the generated code, but in the end I could probably fix it without even RTFM by some copypasting without much understanding (we are all quite skillful at that). There is nothing impressive about CMake's syntax, it certainly does not look like a programming language, but (since it is also supposed to replace the configure script) it seems to think that it is. Nothing really wrong with the tool itself, except maybe that cryptic flop with the command line, and the worrisome comments indicating that they had to use a hack to avoid cmake fighting itself. I certainly would not find writing a replacement for the configure script in CMake's language aesthetically pleasant, though--bash seems kind of more natural for scripting (and 75% of our configure is just a fight with ATLAS to make Kaldi compile with it, anyway). So, I dunno, meh? |
You are right about CMake being hard to debug, I have had the same
experience.
I'm still not really committed to the CMake path, but keeping a somewhat
open mind in case @galv generates something nice looking.
…On Sun, Apr 14, 2019 at 8:19 AM kkm (aka Kirill Katsnelson) < ***@***.***> wrote:
On the topic of CMake vs no CMake. I tried cmake build of OpenBLAS. Their
build a warns about CMake support being experimental.
CMake Warning at CMakeLists.txt:46 (message):
CMake support is experimental. It does not yet support all build options
and may not produce the same Makefiles that OpenBLAS ships with.
Fair. Also, keep in mind that I am somewhat biased against it; so far my
builds of complex projects using CMake were tremendously hard to track when
they ran into problems. It is also very likely that I do not know about its
debugging/tracing facilities that could make pinpointing the source of the
problems. In other words, it is not impossible that I am shifting the blame
on CMake when I should have spent more time learning the tool, so that I
could use it to its full potential, not work against it.
So I figured out the build options (cmake -LA[H] prints them, more
detailed if with the H), and attempted a build. It did not go well. One
thing I immediately stepped on is how unhelphul it is in case of a command
line error. Take this:
cmake -DCMAKE_BUILD_TYPE=RELWITHDEBINFO -DUSE_THREAD=0 -DGEMM_MULTITHREAD_THRESHOLD=0 -CMAKE_INSTALL_PREFIX=$(pwd)/install
. . .
-- Copying LAPACKE header files to include/openblas
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Configuring incomplete, errors occurred!
See also "/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeOutput.log".
$ wc -l /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeOutput.log
414 /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeOutput.log
Ehm.. Okay. One of these half a thousand lines is going to tell me
something, I thought. Probably something closer to the end? This is how the
log ends
Determining if the Fortran compiler supports Fortran 90 passed with the following output:
Change Dir: /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp
Run Build Command:"/usr/bin/make" "cmTC_11695/fast"
/usr/bin/make -f CMakeFiles/cmTC_11695.dir/build.make CMakeFiles/cmTC_11695.dir/build
make[1]: Entering directory '/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp'
Building Fortran object CMakeFiles/cmTC_11695.dir/testFortranCompilerF90.f90.o
/usr/bin/gfortran -c /home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp/testFortranCompilerF90.f90 -o CMakeFiles/cmTC_11695.dir/testFortranCom
Linking Fortran executable cmTC_11695
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_11695.dir/link.txt --verbose=1
/usr/bin/gfortran CMakeFiles/cmTC_11695.dir/testFortranCompilerF90.f90.o -o cmTC_11695
make[1]: Leaving directory '/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles/CMakeTmp'
Nothing even closely related to the last successful action ("found
PkgConfig") in the screen output.
Now, I do not know if my mistake clearly stands out to someone used to the
tool, but I missed the -D in the very last assignment in the command
line: -CMAKE_INSTALL_PREFIX=$(pwd)/install should have been
-DCMAKE_INSTALL_PREFIX=$(pwd)/install. To be fair. -C is a command line
switch, naming a cache file (default CMakeCache.txt. So I looked if it
had created a file with a funny name like MAKE_INSTALL_PREFIX=, or maybe
a direcrory. as the argument expanded to
/home/kkm/work/kaldi2/tools/OpenBLAS/install after the =. Nope.
"Configuring incomplete, errors occurred!" was all I got.
After fixing the command line, I got a lot of errors from g++, all of
which essentially meant that it was not passed the correct value for the
march= switch. The CPU was correctly detected by the build (SkylakeX, the
architecture with AVX512 support), but every use of an AVX512 bulitin was
yelled at by the compiler. This is where I tried to figure out how the
switch gets its value.
I looked at generated Makefiles. I would say that there is no hope for a
human to make sense of them. The Makefile just invokes $(MAKE) -f
CMakeFiles/Makefile2 xxx for every target, except the helpfully added
target named help, which helpfully prints the list of all automatically
generated targets:
$ make help | wc -l
13337
But I wanted to figure out where did the wrong -march switch come from.
This CMakeFiles/Makefile2 file is more invocations of cmake and make, but
finally I could trace it, mostly by grepping, to
kernel/CMakeFiles/kernel.dir/build.make which actually did something,
actually invoking the compiler
kernel/CMakeFiles/kernel.dir/CMakeFiles/ztrsm_iltucopy.c.o: kernel/CMakeFiles/ztrsm_iltucopy.c
@$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --green --progress-dir=/home/kkm/work/kaldi2/tools/OpenBLAS/CMakeFiles --progress-num=$(CMAKE_PROGRESS_414) "Building C object kernel/CMakeFiles/kernel.dir/CMakeFiles/ztrsm_iltucopy.c.o"
cd /home/kkm/work/kaldi2/tools/OpenBLAS/kernel && /usr/bin/cc $(C_DEFINES) $(C_INCLUDES) $(C_FLAGS) -o CMakeFiles/kernel.dir/CMakeFiles/ztrsm_iltucopy.c.o -c /home/kkm/work/kaldi2/tools/OpenBLAS/kernel/CMakeFiles/ztrsm_iltucopy.c
So it seemed I had to find where the wrong C_FLAGS came from.
It looks like many subdirectories get their own generated flags.cmake file,
where this is defined, containing some set of basic flags with no trace of
-march. My next take was to find whether -march is used anywhere at all.
This got me to the cmake/ subdirectory, where apparently most of the build
system is scripted. All in all, it seems that it is just very incomplete,
and sets -march only once, for the mips64 architecture
<https://github.com/xianyi/OpenBLAS/blob/develop/cmake/cc.cmake#L28>.
Nothing wrong with an early work in progress, it's OK, I just thought at
this point about how much *more* will have to be added to this already
dense branching. Now, the OpenBLAS build is tremendously complex, no
question about that. I am not speaking about the size. But compare this to
the line in Makefile.system from which the above is apparently being ported
<https://github.com/xianyi/OpenBLAS/blob/develop/Makefile.system#L618>.
It seems pretty much, well, the same: ifeq in one, STREQUAL in the other.
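Roughly, the two express the same check; a paraphrase (not the literal
contents of either file) of what such a per-architecture branch looks like
in both systems:
# Makefile.system, GNU make flavor
ifeq ($(ARCH), mips64)
CCOMMON_OPT += -march=mips64
endif
# cc.cmake, CMake flavor
if ("${ARCH}" STREQUAL "mips64")
  set(CCOMMON_OPT "${CCOMMON_OPT} -march=mips64")
endif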
It's possible they may not be using CMake in the best possible way, but this
does not look like an obvious improvement to me. Also, the warning at the
beginning of this file
<https://github.com/xianyi/OpenBLAS/blob/develop/cmake/lapack.cmake#L1>
and its entire content got me a bit worried. With make, at least, a
Makefile is passive: if it exists in a directory, you do not touch it and it
does not touch you. But the indication here is that the existence of
CMakeLists.txt in subprojects interfered with their build process, so they
had to implement this workaround. At this point I thought to myself: well,
if we have CMake files in our tools/, and OpenBLAS is in a subdirectory of
it, along with some other things we build, and we also want to build it from
the top of *our* CMake files... You see the point. Again, maybe they are
just not using it right, but this is something to think about. And if our
dependency list quoted in #3086 (comment)
<#3086 (comment)>
is considered unwieldy, then that file, and it is not alone (there is also
lapacke.cmake
<https://github.com/xianyi/OpenBLAS/blob/develop/cmake/lapacke.cmake> *en
pendant* to it), does not score too high on my wieldiness scale either.
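For what it's worth, the usual way to keep a vendored project's own
CMakeLists.txt out of the outer build, instead of add_subdirectory'ing into
it, is to drive it as an external project; a minimal sketch, with made-up
target and path names:
include(ExternalProject)
# configure and build OpenBLAS in its own tree, so the outer and inner
# CMakeLists.txt never see each other
ExternalProject_Add(openblas_external
  DOWNLOAD_COMMAND ""
  SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/OpenBLAS
  CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${CMAKE_CURRENT_BINARY_DIR}/openblas-install)
Whether that sidesteps the problem they describe in lapack.cmake, I cannot
say.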
Neither did I immediately fall in love with the code in this file:
https://github.com/xianyi/OpenBLAS/blob/develop/cmake/utils.cmake. No,
gmake is no better for writing code either, if anything worse, and I do
really appreciate the perseverance of the guy who once coded a solver for
the Towers of Hanoi in PostgreSQL, but, I do not know, it just does not look
like the tool was used for what it should have been used for. Maybe, again,
it's just a bad example, as I picked a project which just happened to be
sitting there; and if GNU make *had* a regex match operator, there would
certainly be makefiles using it, for the better or, rather likely, for the
worse.
As for performance, it's pretty much on par with their standard make-based
build, which is a rather pleasant surprise, given the complexity, size and
number of generated makefiles and targets.
Standard build
$ make -j16
. . . .
To install the library, you can run "make PREFIX=/path/to/your/installation install".
real 0m50.506s
user 7m45.755s
sys 1m18.449s
cmake build (with -march=native injected manually)
$ cmake -DCMAKE_C_FLAGS=-march=native -DCMAKE_BUILD_TYPE=RELWITHDEBINFO -DUSE_THREAD=0 -DGEMM_MULTITHREAD_THRESHOLD=0 -DCMAKE_INSTALL_PREFIX=$(pwd)/install
$ make -j 16
. . .
real 0m46.178s
user 7m57.707s
sys 1m35.371s
(The difference is explained by the fact that the cmake port did not run the
tests, while the original build did.) So performance should not be a concern.
What I cautiously think I can take home from this experiment: the complex
build is complex, but I already noted that, so no big news here. There is
certainly a learning curve, and I cannot estimate how steep it is; the only
problem I traced (the missing -march) was not very easy to trace through the
generated code, but in the end I could probably fix it, without even RTFM,
by some copy-pasting without much understanding (we are all quite skillful
at that). There is nothing impressive about CMake's syntax; it certainly
does not look like a programming language, but (since it is also supposed to
replace the configure script) it seems to think that it is. Nothing is
really wrong with the tool itself, except maybe that cryptic flop with the
command line, and the worrisome comments that they had to use a hack to keep
cmake from fighting itself. I certainly would not find writing a replacement
for the configure script in CMake's language aesthetically pleasant,
though; bash seems kind of more natural for scripting (and 75% of our
configure is just a fight with ATLAS to make Kaldi compile with it, anyway).
So, I dunno, meh?
|
To be honest, my impression is that there is no need to use OpenBLAS's cmake build. My impression is that it was added because someone needed to build it on Windows. We already have a script in the tools/ directory. I don't mean to be dismissive of your long comment, @kkm000, but why does using cmake in kaldi require us to use OpenBLAS's cmake build? This is what I'm doing now: https://github.com/kaldi-asr/kaldi/pull/3100/files#diff-af3b638bc2a3e6c650974192a53c7291R19
It depends on OpenBLAS having already been built. By the way, you may be interested in trying to build OpenBLAS with ninja, if you'd like to make your experiment more complete. In CMake, you'd do this:
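(A minimal sketch, assuming ninja is on your PATH and a fresh build directory; the -D options stand in for whatever you would otherwise pass to cmake.)
mkdir ninja-build && cd ninja-build
cmake -G Ninja -DCMAKE_BUILD_TYPE=RELWITHDEBINFO -DCMAKE_INSTALL_PREFIX=$(pwd)/install ..
cmake --build .    # or equivalently: ninja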
I am not finishing work on #3100 until I finish migrating hmm-utils.cc to the new non-trainable definition of Topology and Transitions. It's the last non-compiling part of kaldi10 (right now, anyway!), excluding the tensor/ directory. It's just too much hassle to try to migrate the build system when there are non-compiling artifacts and tests are failing. I'm sorry... |
Yes, I noted that. I just was working on a build of OpenBLAS (someone's build was broken on a platform where MKL was unavailable), noticed the CMake files in that directory, and decided to give them a go. So I did not choose a project to play with; a 0-dimensional sample from an unknown distribution. :)
I certainly did not think so, or intend to say that. I was probably not very careful explaining what I did. The extras/Makefile does build OpenBLAS, but it does not have to (and does not) do it using CMake. My comment was only about OpenBLAS's own dealing with a subdirectory containing CMakeLists.txt that apparently stood in their way. I do not understand CMake enough to say whether it was a real problem or they just did not know how to solve it canonically. I just noted that if/when we use CMake, we'll have extras/CMakeLists.txt and extras/OpenBLAS/CMakeLists.txt, apparently the same situation they had to deal with. That is what I just wanted someone who really knows the stuff, likely you, to pay attention to. If that's not a problem at all, or their situation is different from ours, great. Think of me as a phenologist: I only observe the butterflies and record my field notes, but dissecting them I gladly leave to you! :)) I certainly trust your CMake experience. I'll try Ninja, too. Again, maybe OpenBLAS is not the kind of project where it would shine, as it's too small, and rebuilds from make clean fully in 45 seconds. No, I mean, if it builds with Ninja in say five seconds instead of 45, I would be really super impressed, but I do not think it really would :)) There is absolutely no rush, please. Keep in mind that I'll be cleaning up configure, and it seems there are a lot of simplifications coming. So if you are starting from it, do not treat it as a gold standard. By the way, what is the difference between cmake .; cmake --build . and cmake .; make (except that the former will probably take into account the -G from the metabuild stage and invoke the matching build command)? This switch is not well documented in the man page. |
`cmake --build` will build in the current build directory, regardless of
which build system you are generating files for. It is good for scripting,
since you don't even need to have your build system executable
(xcode-build, ninja, make, etc.) on your PATH, and you don't need to
worry in your scripts about which build system you configured with when it
comes time to build.
|
I see, thanks. I usually also pass -j16 to make and a couple of other switches, but I can just invoke make, no big deal. I guess CMake should also support something like that out of the box, i. e. run the build system in the most sensible parallel mode? I'm using -j16 for Kaldi as I have 16 physical cores (and 16 logical cores, HT off). Just |
With cmake --build, the complete build and install command should be
cmake --build . --target install -- -j16   # for Makefile build system
# or
cmake --build . --target install -- /m:16  # for MSBuild
that should be all commands the users need to invoke for a well organized cmake project |
Thanks. I assume |
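A side note on the out-of-the-box parallelism question: since CMake 3.12, cmake --build also accepts a parallelism flag of its own, so the pass-through after -- is not strictly needed on new enough versions; a sketch:
cmake --build . -j 16
cmake --build . --parallel 16 --target install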
@cloudhan thanks
|
@galv what is the current state of cmake support for kaldi? is this still being worked on? |
I'm not actively working on it. I stopped because, at the time, kaldi10's
hmm and tree subprojects were not compiling. I tried to get those working,
but it was rather involved, and I didn't complete it. However, kaldi10 is
now compiling, so I could certainly pick this back up, but it's not a great
priority for me until I have a free moment away from my regular job (which
could be this weekend, but who knows?)
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
For CMake building, the existing problem is that the CMake build cannot be run successfully using shared libraries. After some investigation, I found some bugs in the CMake build script that need to be fixed. After fixing them, I can now successfully install Kaldi using the CMake project, and THCHS-30 training/decoding/alignment work perfectly. Also, for OpenFST, a python script is added to install OpenFST to a designated location. If it is needed, I might be able to patch the cmake fix here, if the changes are adequate. My changes are at https://github.com/davidlin409/kaldi Start point of the change is at label "status/start_point". |
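For context, a shared-library build of a CMake project is normally requested with the standard BUILD_SHARED_LIBS cache variable (assuming the project's CMakeLists.txt follows that convention); a minimal sketch of the configure-and-install sequence being discussed:
mkdir build && cd build
cmake -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=$(pwd)/install ..
cmake --build . -j 16
cmake --build . --target install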
It would be great if you could make a PR with those changes so we can more
easily see the diff.
|
For the new version of Kaldi, does anyone think we should switch to a different build system, such as cmake?
We should probably still have manually-run scripts that check the dependencies; I am just wondering whether the stuff we are doing in the 'configure' script would be better done with cmake, and if so, whether anyone is interested in making a prototype to at least let it compile on Linux.
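To give a sense of what those configure-script checks translate to, CMake's stock find modules cover most of them; a minimal sketch, with the choice of libraries purely illustrative rather than a proposal for the real CMakeLists.txt:
cmake_minimum_required(VERSION 3.5)
project(kaldi-example CXX)
# Fail at configure time if a required dependency is missing,
# roughly what the checks in ./configure do today.
find_package(Threads REQUIRED)
find_package(BLAS REQUIRED)
find_package(LAPACK REQUIRED)
add_executable(example example.cc)
target_link_libraries(example ${BLAS_LIBRARIES} ${LAPACK_LIBRARIES} Threads::Threads)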