Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Fix/update kaldi for openfst 1.8 #4716

Conversation

auroraanon38
Copy link
Contributor

WIP to bring Kaldi into a workable state with openfst 1.8 as per this issue

Discussion and feedback required. As well as help with more complicated C++ modifications difficult with my current skills.

The type shims have been removed from openfst since 1.8.2
As far as I can see there's no real reason to namespace these
bindings.
openfst renamed FLAGS_fst_weight_separator to
FST_FLAGS_fst_weight_separator, as well as a number of other variables
in their codebase between 1.7.3 and 1.8.2
openfst changed the function signature to a reference instead of a
pointer.
@auroraanon38
Copy link
Contributor Author

Apparently the Travis template will also need updating, as it isn't aware of C++17

@auroraanon38
Copy link
Contributor Author

Currently encountering an issue with operator+ for type conversion between basic string & basic string view.

./lat/arctic-weight.h:52:59: error: no match for ‘operator+’ (operand types are ‘std::string’ {aka ‘std::__cxx11::basic_string<char>’} and ‘std::string_view’ {aka ‘std::basic_string_view<char>’})
   52 |     static const std::string type = std::string("arctic") +
      |                                          ~~~~~~~~~~~~~~~~~^
   53 |         FloatWeightTpl<T>::GetPrecisionString();
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~            
In file included from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_algobase.h:67,
                 from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/vector:60,
                 from ../lat/lattice-functions.h:27,
                 from kws-functions2.cc:21:
/gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_iterator.h:533:5: note: candidate: ‘template<class _Iterator> constexpr std::reverse_iterator<_Iterator> std::operator+(typename std::reverse_iterator<_Iterator>::difference_type, const std::reverse_iterator<_Iterator>&)’
  533 |     operator+(typename reverse_iterator<_Iterator>::difference_type __n,
      |     ^~~~~~~~
/gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_iterator.h:533:5: note:   template argument deduction/substitution failed:
In file included from ../kws/kaldi-kws.h:25,
                 from ../kws/kws-functions.h:27,
                 from kws-functions2.cc:22:
../lat/arctic-weight.h:52:59: note:   ‘std::string_view’ {aka ‘std::basic_string_view<char>’} is not derived from ‘const std::reverse_iterator<_Iterator>’
   52 |     static const std::string type = std::string("arctic") +
      |                                          ~~~~~~~~~~~~~~~~~^
   53 |         FloatWeightTpl<T>::GetPrecisionString();
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~            
In file included from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_algobase.h:67,
                 from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/vector:60,
                 from ../lat/lattice-functions.h:27,
                 from kws-functions2.cc:21:
/gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_iterator.h:1578:5: note: candidate: ‘template<class _Iterator> constexpr std::move_iterator<_IteratorL> std::operator+(typename std::move_iterator<_IteratorL>::difference_type, const std::move_iterator<_IteratorL>&)’
 1578 |     operator+(typename move_iterator<_Iterator>::difference_type __n,
      |     ^~~~~~~~

Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?

@danpovey
Copy link
Contributor

The tricky thing, and the reason we hvaen't done this earlier, is that we don't want to break existing builds with older OpenFST versions. I assume OpenFST must export some kind of flag saying what the version is; we might be able to use ifdefs based on that....

@danpovey
Copy link
Contributor

.. and you can probably fix the operator+ problem by casting the second arg to std::string, i.e replace x.GetPrecisionString() with std::string(x.GetPrecisionString()).

@danpovey
Copy link
Contributor

Responding to: "Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?", which is not appearing on the webpage for some reason..
Explicit conversions. You shouldn't be extending the standard library for sure.

@jtrmal
Copy link
Contributor

jtrmal commented Mar 24, 2022 via email

@auroraanon38
Copy link
Contributor Author

auroraanon38 commented Mar 24, 2022

The tricky thing, and the reason we hvaen't done this earlier, is that we don't want to break existing builds with older OpenFST versions. I assume OpenFST must export some kind of flag saying what the version is; we might be able to use ifdefs based on that....

Right, that's reasonable. I'll look to see if it has such a flag. The configure file has the version and might be putting it somewhere useful, but it might not be adding any such symbol inside the library itself.

.. and you can probably fix the operator+ problem by casting the second arg to std::string, i.e replace x.GetPrecisionString() with std::string(x.GetPrecisionString()).

Noted, will do.

Responding to: "Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?", which is not appearing on the webpage for some reason..
Explicit conversions. You shouldn't be extending the standard library for sure.

Yeah, I've gotten too used to multimethod patterns.

I have certain doubts about enforcing C++17

Is it possible to build C++ libraries with C++17 but use them in programs compiled with lower versions?

In any case you have a point that it could and should be turned into a conditional variable dependent on openfst use inside those Makefiles, rather than hardcoded as I did.

@jtrmal
Copy link
Contributor

jtrmal commented Mar 24, 2022 via email

@auroraanon38
Copy link
Contributor Author

Noting that this PR also pertains to #4131 and #4393

@auroraanon38
Copy link
Contributor Author

Ad the C++17 Lots of the OpenFST stuff is templated and now written with the C++17 features, so I'm afraid you will need C++17 compliant compiler every time you will need to compile something. Unless there will be some compatibility layer or something....
y.

Alright.

@danpovey
Copy link
Contributor

I wonder whether, if it's not practical to make everything conditional, we could create a separate branch of Kaldi for use with the more recent OpenFST versions?

@auroraanon38
Copy link
Contributor Author

auroraanon38 commented Mar 25, 2022 via email

@kkm000
Copy link
Contributor

kkm000 commented Apr 1, 2022

I like many of these changes, not even related to OpenFST per se. We should certainly divorce Kaldi from OpenFST primitive types.

So far in this changeset, there is not so much stuff that would require complex shimming. 3 or 4 small functions will do. We support much larger divergence in CUDA. I do not think maintaining a forked branch is a viable option. Someone has to be responsible for replicating changes to it, and then our test pipelines are barely holding, and the amount of testing immediately doubles. We'll require a lower version of OpenFST in the out-of-the-box, "standard" build, but support 1.8+ and C++17 for custom builds. C++17 is fully supported already.

@danpovey, @jtrmal, what do you think? There was another issue with changed command line arguments to one of the FST binaries, breaking scripts, somewhere between 1.7.2 and 1.7.7.

I do not really understand the benefits of using FST 1.8 as the default. Is it faster, or fixing any issues? If we're not using features added to higher versions, then the default for the Kaldi-as-toolkit can be 1.7.x, where x is the version right before the breaking command line change. Kaldi-as-library, which the users integrate, should not use command line tools, and is not affected.

@jtrmal
Copy link
Contributor

jtrmal commented Apr 1, 2022 via email

@stale
Copy link

stale bot commented Jun 4, 2022

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

@stale stale bot added the stale Stale bot on the loose label Jun 4, 2022
@danpovey
Copy link
Contributor

danpovey commented Jun 4, 2022

Sorry I missed this earlier,
@kkm000 I think your proposal is OK. I suppose you are talking about making our codebase support both older and newer version of OpenFST.

@stale stale bot removed the stale Stale bot on the loose label Jun 4, 2022
@stale
Copy link

stale bot commented Aug 12, 2022

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

@stale stale bot added the stale Stale bot on the loose label Aug 12, 2022
@kkm000 kkm000 removed the stale Stale bot on the loose label Aug 26, 2022
@jtrmal
Copy link
Contributor

jtrmal commented Oct 11, 2022 via email

@stale
Copy link

stale bot commented Feb 18, 2023

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

@stale stale bot added the stale Stale bot on the loose label Feb 18, 2023
@danpovey
Copy link
Contributor

ping

@stale stale bot removed the stale Stale bot on the loose label Feb 18, 2023
@georgthegreat
Copy link
Contributor

@auroraanon38, is there some work still going on in this issue?
We have some patches that fix compatibility with newer versions of OpenFST.

I might upstream them though they will not bring perfect compatibility out of the box.

@danpovey, is there any kind of a todolist fixed somewhere which describes the whole migration process?

@auroraanon38
Copy link
Contributor Author

auroraanon38 commented Feb 23, 2023 via email

@georgthegreat
Copy link
Contributor

Nope, no work is currently ongoing, it is legitimately stale.

Could you, please, elaborate a bit?
What is the rationale behind the stale?

Would you (or maybe someone else in the project) accept the PRs supporting newer versions of OpenFST (I assume this would mean dropping the support for any other version though).

@jtrmal
Copy link
Contributor

jtrmal commented Feb 23, 2023 via email

@georgthegreat
Copy link
Contributor

@jtrmal, I have just submitted PR #4829 with a chain of commits needed to support openfst version up to 1.8.2.

@kkm000
Copy link
Contributor

kkm000 commented Feb 23, 2023

I apologize I fell out of the loop, but I want to remind that a command line of one tool has changed somewhere between minor v1.7 points, most probably in [1.7.2, 1.7.8] range. I forgot which binary was that.

I have not only nothing against bumping Kaldi to C++17, but all for it. We haven't yet even embraced full C++11 support for system-independent language features, really: some stuff like aligned memory allocation and memory model in general, and PRNG have been standardized, so we can get rid of quite a few #if WIN32 and such stuff. In fact, we wrote two very multithreaded and lock-contending Kaldi-based production servers; the first in C++14, xplat Linux/Windows; the second together with @jtrmal in C++17, without a single system call in the main logic paths¹

I don't see much value in sticking with a 10-year-C++. Or even 5-year-C++, to that matter. In any case, Git tags and maintenance branches exist, and for a good measure.

¹ The three exceptions outside of the main course are (1) service startup/termination, of which there are five standards: init, upstart, systemd and containers on Linux and Service Control Manager on Windows, (2) Windows-only built-in ultra-high throughput metric collector, with no Linux counterpart of similar performance, and (3) three different logging systems on Linux: syslog, systemd journal and in-container Docker logging, and yet the fourth on Windows.

@danpovey
Copy link
Contributor

OK, but is it possible to use C++17 with the older OpenFST version?
I think there is some desirability for keeping back-compatibility, and if it's not possible to support the newer OpenFST without a seamless upgrade process (e.g. enabled via #ifdefs on the OpenFST version number) then I'm not convinced the benefits outweigh the costs.

@georgthegreat
Copy link
Contributor

We used to compile OpenFST with c++17 (we currently compile it with c++20).

We are maintaining a huge monorepo with hundreds of opensource projects.
The only way to make opensource work at scale we were able to find so far is update everytihng to the most recent version and fix small incompatibilities by patches, then upstream patches.

So here I am, upstreaming our patches.

@danpovey
Copy link
Contributor

OK, and I appreciate the effort but...
I think Kaldi is a bit of a special case. At this point the project is mostly in "legacy maintenance" mode as the technology has
moved on a bit; and there are a lot of people who have incorporated Kaldi into their products and may want the latest fixes but not want a big hassle with updating OpenFST.

I'm wondering whether we could have a branch for this and do our best to keep it up to date with master?

@rkjaran
Copy link
Contributor

rkjaran commented Feb 24, 2023

OK, and I appreciate the effort but... I think Kaldi is a bit of a special case. At this point the project is mostly in "legacy maintenance" mode as the technology has moved on a bit; and there are a lot of people who have incorporated Kaldi into their products and may want the latest fixes but not want a big hassle with updating OpenFST.

I'm wondering whether we could have a branch for this and do our best to keep it up to date with master?

As someone who incorporates both Kaldi and OpenFST (>1.8) into products, the big hassle for us is having to deal with maintaining OpenFST related patches when pulling in new commits from Kaldi.

@georgthegreat
Copy link
Contributor

and there are a lot of people who have incorporated Kaldi into their products and may want the latest fixes but not want a big hassle with updating OpenFST.

Sorry, but do you have any kind of statistics to proof this?
As far as I see from the conversation here, most people would like to get both kaldi AND the latest version of OpenFST
(I understand this might be just a survivorship bias as we are discussing OpenFST update).

I'm wondering whether we could have a branch for this and do our best to keep it up to date with master?

Unfortunately, this will not work for us. As we want to get the latest fixes and kaldi does not publish any releases at all, we have to remain on the master branch, not some feature branch.

I think branch diversion will move us to a git-flow style of development (i. e. keep the latest release in master, while doing all the development in some branch). As to my taste, I find trunk based development to be more clear and understandable way to develop.

@auroraanon38
Copy link
Contributor Author

auroraanon38 commented Feb 25, 2023 via email

@stale
Copy link

stale bot commented Apr 26, 2023

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

@stale stale bot added the stale Stale bot on the loose label Apr 26, 2023
@h-vetinari
Copy link
Contributor

Not stale

@stale stale bot removed the stale Stale bot on the loose label Apr 26, 2023
@stale
Copy link

stale bot commented Aug 10, 2023

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

@stale stale bot added the stale Stale bot on the loose label Aug 10, 2023
@h-vetinari
Copy link
Contributor

Guess this has been superseded by #4829

@stale stale bot removed the stale Stale bot on the loose label Aug 10, 2023
@kkm000
Copy link
Contributor

kkm000 commented Aug 10, 2023

Duplicate-of: #4829

@h-vetinari, thanks, it indeed is, marking it as such.

@kkm000 kkm000 closed this Aug 10, 2023
@kkm000 kkm000 added the duplicate See comments for the duplicated issue label Aug 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate See comments for the duplicated issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants