
Remove pybind dependencies from RaggedArc. #842

Merged
merged 20 commits into k2-fsa:v2.0-pre on Oct 27, 2021

Conversation

pkufool
Collaborator

@pkufool pkufool commented Oct 11, 2021

In this pull request, I will do the following two tasks:

  • Remove pybind dependencies from RaggedAny & RaggedArc
  • Rearrange source files into csrc and python folders and build shared libraries separately.

@danpovey
Collaborator

What Python dependencies are these? E.g., we are including pybind stuff in csrc/ and you don't want to do that?

@pkufool
Collaborator Author

pkufool commented Oct 11, 2021

What Python dependencies are these? E.g., we are including pybind stuff in csrc/ and you don't want to do that?

The Python dependencies are pybind dependencies.

We want a shared library that only depends on k2 and C++ APIs from PyTorch (does not depend on Python C API) for production deployment.

RaggedAny and RaggedArc currently use some data structures belonging to pybind. This pull request replaces those data structures with their PyTorch counterparts, so that we will no longer need pybind to build our C++ API.

We will then use pybind only to wrap k2 for Python.

@pkufool pkufool changed the title [WIP] Remove python dependencies from RaggedArc. [WIP] Remove pybind dependencies from RaggedArc. Oct 11, 2021
@csukuangfj
Collaborator

As suggested by Dan in #839 (comment)

I would recommend
(1) Renaming RaggedArc to FsaClass
(2) Replacing py::object with torch::IValue
(3) Moving FsaClass to k2/torch/csrc/fsa_class.h
(4) Building another class, PyFsaClass, containing py::object attributes, in k2/python/csrc/torch/v2/py_fsa_class.h
(5) PyFsaClass is a subclass of FsaClass and its purpose is to hold arbitrary attributes passed from Python.
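The layering in this suggestion can be sketched as follows. This is a minimal, hypothetical sketch: `Value` stands in for `torch::IValue`, and the real classes also carry the FSA itself and live in `k2/torch/csrc` and `k2/python/csrc`:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Stand-in for torch::IValue; the real type lives in PyTorch.
struct Value {
  std::string repr;
};

// Simplified FsaClass: attributes are stored as framework-level
// values keyed by name, so no pybind types appear in the core library.
class FsaClass {
 public:
  void SetAttr(const std::string &name, Value v) {
    attrs_[name] = std::move(v);
  }
  bool HasAttr(const std::string &name) const { return attrs_.count(name) > 0; }
  const Value &GetAttr(const std::string &name) const { return attrs_.at(name); }

 private:
  std::map<std::string, Value> attrs_;
};

// PyFsaClass would live under k2/python and may hold py::object
// attributes; here it only marks the proposed inheritance relationship.
class PyFsaClass : public FsaClass {};
```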

@pkufool
Collaborator Author

pkufool commented Oct 11, 2021

I did some experiments just now, and it seems that torch::IValue can hold user-defined Python objects.


Anyway, I will finish FsaClass first; we can then consider the PyFsaClass subclass if FsaClass cannot handle arbitrary attributes passed from Python.

@pkufool pkufool marked this pull request as draft October 18, 2021 13:06
@pkufool pkufool marked this pull request as ready for review October 19, 2021 14:08
@pkufool pkufool changed the title [WIP] Remove pybind dependencies from RaggedArc. Remove pybind dependencies from RaggedArc. Oct 19, 2021
@pkufool
Collaborator Author

pkufool commented Oct 19, 2021

Ready for review.
The FSA-algorithm-related PRs depend on this one, so we should merge this PR first.

ivalue = self.GetAttr(name);
} catch (const std::runtime_error &err) {
PyErr_SetString(PyExc_AttributeError, err.what());
throw py::error_already_set();
Collaborator Author

@csukuangfj We throw AttributeError here.

else()
message(FATAL_ERROR "Please select a framework.")
endif()
# add_subdirectory(python)
Collaborator

Suggested change
# add_subdirectory(python)

void RaggedArc::CopyTensorAttrs(const RaggedArc &src, torch::Tensor arc_map,
bool over_write /*= true*/) {
void FsaClass::CopyTensorAttrs(const FsaClass &src, torch::Tensor arc_map,
bool over_write /*= true*/) {
Collaborator

Suggested change
bool over_write /*= true*/) {
bool overwrite /*= true*/) {

Collaborator

Why do you need the argument overwrite? What if it is removed?

Collaborator Author

It is for FromBinaryFunctionTensor. I follow the logic of the Python version: if both FSAs have attributes with the same name, we keep the one from the first FSA, so overwrite == false is used when copying the attributes of the second FSA.

Collaborator Author

So, shall we remove it? That would mean we choose the attribute value from the second FSA.
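The first-FSA-wins convention being discussed can be sketched as follows (hypothetical `AttrMap`/`CopyAttrs` names; the real CopyTensorAttrs also maps attribute values through arc_map):

```cpp
#include <cassert>
#include <map>
#include <string>

using AttrMap = std::map<std::string, int>;

// Copy attributes from src into dest. With over_write == false, an
// attribute already present in dest is kept, so calling this with
// over_write == true for the first FSA and false for the second
// makes the first FSA win on a name clash.
void CopyAttrs(const AttrMap &src, AttrMap *dest, bool over_write) {
  for (const auto &kv : src) {
    if (over_write || dest->count(kv.first) == 0) {
      (*dest)[kv.first] = kv.second;
    }
  }
}
```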

K2_CHECK_EQ(scores.numel(), fsa.NumElements());

auto ragged_scores = RaggedAny(fsa.shape.To(GetContext(scores)), scores);
RaggedAny norm_scores = ragged_scores.Normalize(true).To(scores.device());
Collaborator

ragged_scores has the same device as scores, so you don't need to move it to scores.device() again.

Collaborator Author

@pkufool pkufool Oct 20, 2021

Ah, I made a mistake here. I meant To(self.scores.device()).

// We need this wrapper so that we can convert an instance
// of RaggedAny into `torch::IValue`
struct RaggedAnyHolder : public torch::CustomClassHolder {
std::shared_ptr<RaggedAny> ragged = nullptr; // not owned by this class
Collaborator

Suggested change
std::shared_ptr<RaggedAny> ragged = nullptr; // not owned by this class
std::shared_ptr<RaggedAny> ragged;

Collaborator

Can we use RaggedAny ragged; directly?

RaggedAny contains a set of shared pointers. Using a std::shared_ptr here
increases one level of indirection.

struct RaggedAnyHolder : public torch::CustomClassHolder {
std::shared_ptr<RaggedAny> ragged = nullptr; // not owned by this class
explicit RaggedAnyHolder(std::shared_ptr<RaggedAny> ragged)
: ragged(ragged) {}
Collaborator

Suggested change
: ragged(ragged) {}
: ragged(std::move(ragged)) {}

@csukuangfj csukuangfj added the ready Ready for review and trigger GitHub actions to run label Oct 20, 2021
@@ -109,8 +106,9 @@ RaggedArc RaggedArc::FromUnaryFunctionRagged(RaggedArc &src,
auto filler_scalar = torch::tensor(
filler, torch::dtype(torch::kInt32).device(value.device()));
value = torch::where(masking, value, filler_scalar);
auto new_value = arc_map_any.Index(value, py::int_(filler));
dest.SetAttr(iter.first, new_value.RemoveValuesEq(py::int_(filler)));
auto new_value = arc_map_any.Index(value, torch::IValue(filler));
Collaborator

You can convert filler to an IValue in line 101 so that you don't need to convert it again in the following.

@@ -109,8 +106,9 @@ RaggedArc RaggedArc::FromUnaryFunctionRagged(RaggedArc &src,
auto filler_scalar = torch::tensor(
filler, torch::dtype(torch::kInt32).device(value.device()));
value = torch::where(masking, value, filler_scalar);
Collaborator

Can you use filler directly in torch::where?

Collaborator Author

@pkufool pkufool Oct 21, 2021

No. I checked again, and it seems the current solution is the best we can do. We need an int32_t scalar, but there is no int32_t scalar type (only int64_t). If we use filler directly, it will be cast to an int64_t scalar.

There has been a lot of discussion about the torch::where dtype; see here. I think it has not been totally fixed yet.

@@ -69,10 +66,10 @@ RaggedArc::RaggedArc(
// TODO: we also need to pass the name of extra_labels and ragged_labels.
Collaborator

Suggested change
// TODO: we also need to pass the name of extra_labels and ragged_labels.
// TODO: we also need to pass the name of ragged_labels.

void RaggedArc::CopyTensorAttrs(const RaggedArc &src, torch::Tensor arc_map,
bool over_write /*= true*/) {
void FsaClass::CopyTensorAttrs(const FsaClass &src, torch::Tensor arc_map,
bool over_write /*= true*/) {
Collaborator

Why do you need the argument overwrite? What if it is removed?

torch::Tensor arc_map,
bool over_write /*= true*/) {
void FsaClass::CopyRaggedTensorAttrs(const FsaClass &src, torch::Tensor arc_map,
bool over_write /*= true*/) {
for (const auto &iter : src.ragged_tensor_attrs) {
Collaborator

Please remove const here and remove const_cast in the following.

FsaClass FsaClass::FromUnaryFunctionRagged(FsaClass &src,
const Ragged<Arc> &arcs,
Ragged<int32_t> &arc_map,
bool remove_filler /*= true*/) {
Collaborator

Why do we need remove_filler?

Collaborator Author

I copied the logic from the Python version; I will look into it.

Array1<int32_t> arc_map;
Ragged<Arc> arcs;
k2::ArcSort(fsa, &arcs, &arc_map);
return FromUnaryFunctionTensor(*this, arcs, ToTorch<int32_t>(arc_map));
}

void RaggedArc::SetAttr(const std::string &name, py::object value) {
void FsaClass::SetAttr(const std::string &name, torch::IValue value) {
if (name == "grad") {
// Note we don't use pybind11's def_property since it does not allow
// to use argument annotions, which means it is not possible to
Collaborator

Suggested change
// to use argument annotions, which means it is not possible to
// to use argument annotations, which means it is not possible to

return all_attr_names.count(name) > 0;
}

void RaggedArc::SetFiller(const std::string &name, float filler) {
void FsaClass::SetFiller(const std::string &name, float filler) {
Collaborator

Where is SetFiller used?

I think that you would use

fsa.foo_filler = -1

in Python; then, in FsaClass::SetAttr, if you detect that the attribute name ends with _filler, you call SetFiller("foo", -1).
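The suggested routing can be sketched as follows (a hypothetical `FillerDemo` class with float-only attributes, assuming the default filler is 0 as in the Python version):

```cpp
#include <cassert>
#include <map>
#include <string>

class FillerDemo {
 public:
  // SetAttr inspects the name: "foo_filler" is routed to
  // SetFiller("foo", value) instead of becoming an ordinary attribute.
  void SetAttr(const std::string &name, float value) {
    const std::string suffix = "_filler";
    if (name.size() > suffix.size() &&
        name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0) {
      SetFiller(name.substr(0, name.size() - suffix.size()), value);
      return;
    }
    attrs_[name] = value;
  }

  void SetFiller(const std::string &name, float filler) {
    fillers_[name] = filler;
  }

  float GetFiller(const std::string &name) const {
    auto it = fillers_.find(name);
    return it == fillers_.end() ? 0.0f : it->second;  // default filler is 0
  }

 private:
  std::map<std::string, float> attrs_;
  std::map<std::string, float> fillers_;
};
```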

}

int32_t RaggedArc::Properties() {
int32_t FsaClass::Properties() {
Collaborator

Please don't use the magic value 1 in the following

if (properties & 1 != 1) {

Please use kFsaPropertiesValid.

Collaborator

Also, please move the check if (properties & 1 != 1) { into the if (properties == 0) { branch.

We don't need to check it every time we call Properties().
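Both suggestions can be sketched together with a simplified stand-in for the real Properties() (the cached/computed split is illustrative). Note the parentheses around the bit test: `&` binds more loosely than `!=`, so `properties & 1 != 1` does not do what it looks like.

```cpp
#include <cassert>

constexpr int kFsaPropertiesValid = 0x01;  // bit 0, instead of the magic 1

// Returns the cached properties, computing and validating them only once.
// *cached == 0 means "not computed yet"; `computed` stands in for what
// GetFsaBasicProperties() would return.
int Properties(int *cached, int computed) {
  if (*cached == 0) {
    // Parenthesized bit test; validity is checked only on first computation.
    if ((computed & kFsaPropertiesValid) == 0) {
      return -1;  // the real code calls K2_LOG(FATAL) here
    }
    *cached = computed;
  }
  return *cached;
}
```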

@@ -269,11 +275,11 @@ int32_t RaggedArc::Properties() {
return properties;
}

std::string RaggedArc::PropertiesStr() const {
std::string FsaClass::PropertiesStr() const {
return FsaPropertiesAsString(properties);
Collaborator

Suggested change
return FsaPropertiesAsString(properties);
return FsaPropertiesAsString(Properties());

and remove const in the function signature.

RaggedArc RaggedArc::To(const ContextPtr &context) const {
RaggedArc dest(fsa.To(context));
FsaClass FsaClass::To(const ContextPtr &context) const {
FsaClass dest(fsa.To(context));
Collaborator

If this is already inside the given context, we can return *this directly and document it explicitly.
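A sketch of that early return, with a `Device` enum and `MiniFsa` struct standing in for k2's ContextPtr and FsaClass:

```cpp
#include <cassert>

enum class Device { kCpu, kCuda };

struct MiniFsa {
  Device device = Device::kCpu;
  int transfers = 0;  // counts how many real device transfers happened

  // If the FSA already lives on the target device, return *this
  // instead of performing a transfer.
  MiniFsa To(Device target) const {
    if (target == device) return *this;  // no transfer needed
    MiniFsa dest = *this;
    dest.device = target;
    ++dest.transfers;
    return dest;
  }
};
```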


void RaggedArc::SetLabels(torch::Tensor labels) {
void FsaClass::SetLabels(torch::Tensor labels) {
K2_CHECK_EQ(labels.numel(), fsa.NumElements());
Arcs().index({"...", 2}).copy_(labels);
Collaborator

Suggested change
Arcs().index({"...", 2}).copy_(labels);
Labels().copy_(labels);


void RaggedArc::SetLabels(torch::Tensor labels) {
void FsaClass::SetLabels(torch::Tensor labels) {
K2_CHECK_EQ(labels.numel(), fsa.NumElements());
Collaborator

Do we need to check that the dtype of the input labels is torch.int32?

// We need this wrapper so that we can convert an instance
// of RaggedAny into `torch::IValue`
struct RaggedAnyHolder : public torch::CustomClassHolder {
std::shared_ptr<RaggedAny> ragged = nullptr; // not owned by this class
Collaborator

Can we use RaggedAny ragged; directly?

RaggedAny contains a set of shared pointers. Using a std::shared_ptr here
increases one level of indirection.

@@ -304,24 +313,24 @@ struct __attribute__((__visibility__("default"))) RaggedArc {
* if `over_write` is true, attributes in current fsa with the same name as
* attributes in src will be overworted by attributes in src.
*/
void CopyTensorAttrs(const RaggedArc &src, torch::Tensor arc_map,
void CopyTensorAttrs(const FsaClass &src, torch::Tensor arc_map,
Collaborator

In lines 301 and 307,

all_attr_names.insert(name);

I think it can be removed.

@@ -304,24 +313,24 @@ struct __attribute__((__visibility__("default"))) RaggedArc {
* if `over_write` is true, attributes in current fsa with the same name as
* attributes in src will be overworted by attributes in src.
Collaborator

typo: overworted -> overwritten

void RaggedArc::CopyTensorAttrs(const RaggedArc &src, torch::Tensor arc_map,
bool over_write /*= true*/) {
void FsaClass::CopyTensorAttrs(const FsaClass &src, torch::Tensor arc_map,
bool over_write /*= true*/) {
for (const auto &iter : src.tensor_attrs) {
if (over_write || !HasAttr(iter.first)) {
float filler = GetFiller(iter.first);
Collaborator

In line 187, it calls the overload of SetAttr accepting tensor arguments, but we actually want the overload taking an IValue.

Please change the name of SetAttr that takes tensor arguments to SetTensorAttr.

return;
} catch (const py::cast_error &) {
// do nothing.
}

all_attr_names.insert(name);
Collaborator

This line can be removed. You already did it in line 428.

RaggedAny ragged_tensor = value.cast<RaggedAny>();
SetAttr(name, ragged_tensor);
if (value.isCustomClass()) {
torch::intrusive_ptr<RaggedAnyHolder> ragged_any_holder =
Collaborator

Please add a check that the value contained in the IValue is indeed an instance of RaggedAny.

Collaborator Author

@csukuangfj How can we check it? Are there any documents?

// See https://github.com/python/cpython/blob/main/Python/errors.c#L234
PyErr_SetString(PyExc_AttributeError, os.str().c_str());
throw py::error_already_set();
throw std::runtime_error(os.str().c_str());
}
}

Collaborator

Please move the block between line 488 and line 494 to line 503.

You don't always need to delete the fillers: only when the attribute to be deleted is a tensor attribute do you need to check for its filler attribute.
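That deletion logic can be sketched as follows (a hypothetical `AttrStore`; the member names tensor_attrs and ragged_tensor_attrs follow the ones used in this thread):

```cpp
#include <cassert>
#include <map>
#include <string>

struct AttrStore {
  std::map<std::string, int> tensor_attrs;
  std::map<std::string, int> ragged_tensor_attrs;
  std::map<std::string, float> fillers;

  // Fillers exist only for tensor attributes, so we look for a filler
  // to erase only when the deleted attribute was a tensor attribute.
  void DeleteAttr(const std::string &name) {
    if (tensor_attrs.erase(name) > 0) {
      fillers.erase(name);
      return;
    }
    ragged_tensor_attrs.erase(name);
  }
};
```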

@csukuangfj csukuangfj removed the ready Ready for review and trigger GitHub actions to run label Oct 21, 2021
src.SetAttr("ragged_attr", ToIValue(ragged_attr));

src.SetAttr("attr1", torch::IValue("src"));
src.SetAttr("attr2", torch::IValue("fsa"));
Collaborator

Only one test case for std::string is enough. I would suggest changing the second
case to test other types, e.g., int.

torch::autograd::backward({sum_attr}, {});
torch::autograd::backward({sum_score}, {});
}
torch::autograd::backward({sum_attr}, {});
Collaborator

@csukuangfj csukuangfj Oct 21, 2021

Can you use

sum_attr.backward();
sum_score.backward();

?

It's more intuitive and similar to what we are doing in Python.

@@ -531,8 +507,5 @@ TEST(RaggedArcTest, FromBinaryFunctionTensor) {

int main(int argc, char *argv[]) {
Collaborator

Please remove main and link to the library gtest_main.

Please see what we are doing in other xxx_test.cu files.

@@ -22,49 +22,48 @@

Collaborator

Could you also add tests for other functions, e.g.,

  • SetScores
  • SetAttr/GetAttr/DeleteAttr
  • Scores, Labels
  • Tests for fillers

@pkufool pkufool added the ready Ready for review and trigger GitHub actions to run label Oct 21, 2021
@pkufool pkufool added ready Ready for review and trigger GitHub actions to run and removed ready Ready for review and trigger GitHub actions to run labels Oct 25, 2021
@pkufool pkufool added ready Ready for review and trigger GitHub actions to run and removed ready Ready for review and trigger GitHub actions to run labels Oct 25, 2021
@pkufool pkufool added ready Ready for review and trigger GitHub actions to run and removed ready Ready for review and trigger GitHub actions to run labels Oct 25, 2021
@pkufool pkufool added ready Ready for review and trigger GitHub actions to run and removed ready Ready for review and trigger GitHub actions to run labels Oct 25, 2021
@pkufool pkufool added ready Ready for review and trigger GitHub actions to run and removed ready Ready for review and trigger GitHub actions to run labels Oct 26, 2021
@pkufool pkufool added ready Ready for review and trigger GitHub actions to run and removed ready Ready for review and trigger GitHub actions to run labels Oct 26, 2021
@pkufool pkufool added ready Ready for review and trigger GitHub actions to run and removed ready Ready for review and trigger GitHub actions to run labels Oct 26, 2021
@pkufool pkufool merged commit daa98e7 into k2-fsa:v2.0-pre Oct 27, 2021
@@ -118,7 +118,7 @@ jobs:
- name: Display Build Information
shell: bash
run: |
export PYTHONPATH=$PWD/k2/python:$PWD/build/lib:$PYTHONPATH
export PYTHONPATH=$PWD/k2/torch/pytcon:$PWD/build/lib:$PYTHONPATH
Collaborator

typo: pytcon

Collaborator Author

already fixed.

if (properties & 1 != 1) {
K2_LOG(FATAL) << "Fsa is not valid, properties are : " << properties
<< " = " << PropertiesStr() << ", arcs are : " << fsa;
if (properties & kFsaPropertiesValid != 1) {
Collaborator

Please avoid using the magic number 1 here.

csukuangfj added a commit that referenced this pull request Oct 29, 2021
csukuangfj added a commit that referenced this pull request Oct 29, 2021
csukuangfj added a commit that referenced this pull request Nov 4, 2022
* [WIP]: Move k2.Fsa to C++ (#814)

* Make k2 ragged tensor more PyTorch-y like.

* Refactoring: Start to add the wrapper class AnyTensor.

* Refactoring.

* initial attempt to support autograd.

* First working version with autograd for Sum().

* Fix comments.

* Support __getitem__ and pickling.

* Add more docs for k2.ragged.Tensor

* Put documentation in header files.

* Minor fixes.

* Fix a typo.

* Fix an error.

* Add more doc.

* Wrap RaggedShape.

* [Not for Merge]: Move k2.Fsa related code to C++.

* Remove extra files.

* Update doc URL. (#821)

* Support manipulating attributes of k2.ragged.Fsa.

* Support indexing 2-axes RaggedTensor, Support slicing for RaggedTensor (#825)

* Support index 2-axes RaggedTensor, Support slicing for RaggedTensor

* Fix compiling errors

* Fix unit test

* Change RaggedTensor.data to RaggedTensor.values

* Fix style

* Add docs

* Run nightly-cpu when pushing code to nightly-cpu branch

* Prune with max_arcs in IntersectDense (#820)

* Add checking for array constructor

* Prune with max arcs

* Minor fix

* Fix typo

* Fix review comments

* Fix typo

* Release v1.8

* Create a ragged tensor from a regular tensor. (#827)

* Create a ragged tensor from a regular tensor.

* Add tests for creating ragged tensors from regular tensors.

* Add more tests.

* Print ragged tensors in a way like what PyTorch is doing.

* Fix test cases.

* Trigger GitHub actions manually. (#829)

* Run GitHub actions on merging. (#830)

* Support printing ragged tensors in a more compact way. (#831)

* Support printing ragged tensors in a more compact way.

* Disable support for torch 1.3.1

* Fix test failures.

* Add levenshtein alignment (#828)

* Add levenshtein graph

* Contruct k2.RaggedTensor in python part

* Fix review comments, return aux_labels in ctc_graph

* Fix tests

* Fix bug of accessing symbols

* Fix bug of accessing symbols

* Change argument name, add levenshtein_distance interface

* Fix test error, add tests for levenshtein_distance

* Fix review comments and add unit test for c++ side

* update the interface of levenshtein alignment

* Fix review comments

* Release v1.9

* Add Fsa.get_forward_scores.

* Implement backprop for Fsa.get_forward_scores()

* Construct RaggedArc from unary function tensor (#30)

* Construct RaggedArc from unary function tensor

* Move fsa_from_unary_ragged and fsa_from_binary_tensor to C++

* add unit test to from unary function; add more functions to fsa

* Remove some rabbish code

* Add more unit tests and docs

* Remove the unused code

* Fix review comments, propagate attributes in To()

* Change the argument type from RaggedAny to Ragged<int32_t> in autograd function

* Delete declaration for template function

* Apply suggestions from code review

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* Fix documentation errors

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

Co-authored-by: Wei Kang <wkang@pku.org.cn>

* Remove pybind dependencies from RaggedArc. (#842)

* Convert py::object and torch::IValue to each other

* Remove py::object from RaggedAny

* Remove py::object from RaggedArc

* Move files to torch directory

* remove unused files

* Add unit tests

* Remove v2 folder

* Remove unused code

* Remove unused files

* Fix review comments & fix github actions

* Check Ivalue contains RaggedAny

* Minor fixes

* Add attributes related unit test for FsaClass

* Fix mutable_grad in older pytorch version

* Fix github actions

* Fix github action PYTHONPATH

* Fix github action PYTHONPATH

* Link pybind11::embed

* import torch first (to fix macos github actions)

* try to fix macos ci

* Revert "Remove pybind dependencies from RaggedArc. (#842)" (#855)

This reverts commit daa98e7.

* Support torchscript. (#839)

* WIP: Support torchscript.

* Test jit module with faked data.

I have compared the output from C++ with that from Python.
The sums of the tensors are equal.

* Use precomputed features to test the correctness.

* Build DenseFsaVec from a torch tensor.

* Get lattice for CTC decoding.

* Support CTC decoding.

* Link sentencepiece statically.

Link sentencepiece dynamically causes segmentation fault at the end
of the process.

* Support loading HLG.pt

* Refactoring.

* Implement HLG decoding.

* Add WaveReader to read wave sound files.

* Take soundfiles as inputs.

* Refactoring.

* Support GPU.

* Minor fixes.

* Fix typos.

* Use kaldifeat v1.7

* Add copyright info.

* Fix compilation for torch >= 1.9.0

* Minor fixes.

* Fix comments.

* Fix style issues.

* Fix compiler warnings.

* Use `torch::class_` to register custom classes. (#856)

* Remove unused code (#857)

* Update doc URL. (#821)

* Support indexing 2-axes RaggedTensor, Support slicing for RaggedTensor (#825)

* Support index 2-axes RaggedTensor, Support slicing for RaggedTensor

* Fix compiling errors

* Fix unit test

* Change RaggedTensor.data to RaggedTensor.values

* Fix style

* Add docs

* Run nightly-cpu when pushing code to nightly-cpu branch

* Prune with max_arcs in IntersectDense (#820)

* Add checking for array constructor

* Prune with max arcs

* Minor fix

* Fix typo

* Fix review comments

* Fix typo

* Release v1.8

* Create a ragged tensor from a regular tensor. (#827)

* Create a ragged tensor from a regular tensor.

* Add tests for creating ragged tensors from regular tensors.

* Add more tests.

* Print ragged tensors in a way like what PyTorch is doing.

* Fix test cases.

* Trigger GitHub actions manually. (#829)

* Run GitHub actions on merging. (#830)

* Support printing ragged tensors in a more compact way. (#831)

* Support printing ragged tensors in a more compact way.

* Disable support for torch 1.3.1

* Fix test failures.

* Add levenshtein alignment (#828)

* Add levenshtein graph

* Contruct k2.RaggedTensor in python part

* Fix review comments, return aux_labels in ctc_graph

* Fix tests

* Fix bug of accessing symbols

* Fix bug of accessing symbols

* Change argument name, add levenshtein_distance interface

* Fix test error, add tests for levenshtein_distance

* Fix review comments and add unit test for c++ side

* update the interface of levenshtein alignment

* Fix review comments

* Release v1.9

* Support a[b[i]] where both a and b are ragged tensors. (#833)

* Display import error solution message on MacOS (#837)

* Fix installation doc. (#841)

* Fix installation doc.

Remove Windows support. Will fix it later.

* Fix style issues.

* fix typos in the install instructions (#844)

* make cmake adhere to the modernized way of finding packages outside default dirs (#845)

* import torch first in the smoke tests to preven SEGFAULT (#846)

* Add doc about how to install a CPU version of k2. (#850)

* Add doc about how to install a CPU version of k2.

* Remove property setter of Fsa.labels

* Update Ubuntu version in GitHub CI since 16.04 reaches end-of-life.

* Support PyTorch 1.10. (#851)

* Fix test cases for k2.union() (#853)

* Revert "Construct RaggedArc from unary function tensor (#30)" (#31)

This reverts commit cca7a54.

* Remove unused code.

* Fix github actions.

Avoid downloading all git LFS files.

* Enable github actions for v2.0-pre branch.

Co-authored-by: Wei Kang <wkang@pku.org.cn>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: Jan "yenda" Trmal <jtrmal@gmail.com>

* Implements Cpp version FsaClass (#858)

* Add C++ version FsaClass

* Propagates attributes for CreateFsaVec

* Add more docs

* Remove the code that unnecessary needed currently

* Remove the code unnecessary for ctc decoding & HLG decoding

* Update k2/torch/csrc/deserialization.h

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* Fix Comments

* Fix code style

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

* Using FsaClass for ctc decoding & HLG decoding (#862)

* Using FsaClass for ctc decoding & HLG decoding

* Update docs

* fix evaluating kFsaPropertiesValid (#866)

* Refactor deserialization code (#863)

* Fix compiler warnings about the usage of `tmpnam`.

* Refactor deserialization code.

* Minor fixes.

* Support rescoring with an n-gram LM during decoding (#867)

* Fix compiler warnings about the usage of `tmpnam`.

* Refactor deserialization code.

* Minor fixes.

* Add n-gram LM rescoring.

* Minor fixes.

* Clear cached FSA properties when its labels are changed.

* Fix typos.

* Refactor FsaClass. (#868)

Since FSAs in decoding contain only one or two attributes, we
don't need to use an IValue to add one more indirection. Just
check the type of the attribute and process it correspondingly.

* Refactor bin/decode.cu (#869)

* Add CTC decode.

* Add HLG decoding.

* Add n-gram LM rescoring.

* Remove unused files.

* Fix style issues.

* Add missing files.

* Add attention rescoring. (#870)

* WIP: Add attention rescoring.

* Finish attention rescoring.

* Fix style issues.

* Resolve comments. (#871)

* Resolve comments.

* Minor fixes.

* update v2.0-pre (#922)

* Update doc URL. (#821)

* Support indexing 2-axes RaggedTensor, Support slicing for RaggedTensor (#825)

* Support index 2-axes RaggedTensor, Support slicing for RaggedTensor

* Fix compiling errors

* Fix unit test

* Change RaggedTensor.data to RaggedTensor.values

* Fix style

* Add docs

* Run nightly-cpu when pushing code to nightly-cpu branch

* Prune with max_arcs in IntersectDense (#820)

* Add checking for array constructor

* Prune with max arcs

* Minor fix

* Fix typo

* Fix review comments

* Fix typo

* Release v1.8

* Create a ragged tensor from a regular tensor. (#827)

* Create a ragged tensor from a regular tensor.

* Add tests for creating ragged tensors from regular tensors.

* Add more tests.

* Print ragged tensors in a way like what PyTorch is doing.

* Fix test cases.

* Trigger GitHub actions manually. (#829)

* Run GitHub actions on merging. (#830)

* Support printing ragged tensors in a more compact way. (#831)

* Support printing ragged tensors in a more compact way.

* Disable support for torch 1.3.1

* Fix test failures.

* Add levenshtein alignment (#828)

* Add levenshtein graph

* Contruct k2.RaggedTensor in python part

* Fix review comments, return aux_labels in ctc_graph

* Fix tests

* Fix bug of accessing symbols

* Fix bug of accessing symbols

* Change argument name, add levenshtein_distance interface

* Fix test error, add tests for levenshtein_distance

* Fix review comments and add unit test for c++ side

* update the interface of levenshtein alignment

* Fix review comments
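
For reference, the quantity that the `levenshtein_distance` interface added above ultimately computes is the classic edit distance. A minimal dynamic-programming sketch (an illustration of the semantics only, not k2's FSA-intersection based implementation, which also produces alignments):

```python
def levenshtein_distance(a, b):
    """Edit distance between sequences a and b (two-row DP)."""
    prev = list(range(len(b) + 1))  # distances from a[:0] to every prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution / match
        prev = cur
    return prev[-1]
```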

* Release v1.9

* Support a[b[i]] where both a and b are ragged tensors. (#833)

* Display import error solution message on MacOS (#837)

* Fix installation doc. (#841)

* Fix installation doc.

Remove Windows support. Will fix it later.

* Fix style issues.

* fix typos in the install instructions (#844)

* make cmake adhere to the modernized way of finding packages outside default dirs (#845)

* import torch first in the smoke tests to prevent SEGFAULT (#846)

* Add doc about how to install a CPU version of k2. (#850)

* Add doc about how to install a CPU version of k2.

* Remove property setter of Fsa.labels

* Update Ubuntu version in GitHub CI since 16.04 reaches end-of-life.

* Support PyTorch 1.10. (#851)

* Fix test cases for k2.union() (#853)

* Fix out-of-boundary access (read). (#859)

* Update all the example codes in the docs (#861)

* Update all the example codes in the docs

I have run all the modified code with the newest version of k2.

* do some changes

* Fix compilation errors with CUB 1.15. (#865)

* Update README. (#873)

* Update README.

* Fix typos.

* Fix ctc graph (make aux_labels of final arcs -1) (#877)

* Fix LICENSE location to k2 folder (#880)

* Release v1.11. (#881)

It contains bugfixes.

* Update documentation for hash.h (#887)

* Update documentation for hash.h

* Typo fix

* Wrap MonotonicLowerBound (#883)

* Wrap MonotonicLowerBound

* Add unit tests

* Support int64; update documents
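
My reading of the op wrapped above, sketched in plain Python (hedged: the k2 kernel runs this recurrence in parallel on CPU/GPU): `MonotonicLowerBound` returns the largest monotonically non-decreasing sequence that is elementwise `<=` the input, i.e. a right-to-left running minimum.

```python
def monotonic_lower_bound(x):
    """Largest non-decreasing y with y[i] <= x[i] for all i."""
    y = list(x)
    for i in range(len(y) - 2, -1, -1):  # scan right to left
        y[i] = min(y[i], y[i + 1])
    return y
```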

* Remove extra commas after 'TOPSORTED' property and fix RaggedTensor constructor parameter 'byte_offset' out-of-range bug. (#892)

Co-authored-by: gzchenduisheng <gzchenduisheng@corp.netease.com>

* Fix small typos (#896)

* Fix k2.ragged.create_ragged_shape2 (#901)

Before the fix, we have to specify both `row_splits` and `row_ids`
while calling `k2.create_ragged_shape2` even if one of them is `None`.

After this fix, we only need to specify one of them.
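
The fix works because a ragged shape's two descriptions are interconvertible, so either one suffices. A conceptual sketch of the conversions (plain-Python illustration with hypothetical helper names, not the k2 C++ implementation):

```python
def row_splits_to_row_ids(row_splits):
    """E.g. [0, 2, 3, 3, 5] -> [0, 0, 1, 3, 3]: row r owns
    elements row_splits[r] .. row_splits[r+1]-1."""
    row_ids = []
    for row in range(len(row_splits) - 1):
        row_ids.extend([row] * (row_splits[row + 1] - row_splits[row]))
    return row_ids

def row_ids_to_row_splits(row_ids, num_rows):
    """Inverse: count elements per row, then take the prefix sum."""
    splits = [0] * (num_rows + 1)
    for r in row_ids:
        splits[r + 1] += 1
    for i in range(num_rows):
        splits[i + 1] += splits[i]
    return splits
```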

* Add rnnt loss (#891)

* Add cpp code of mutual information

* mutual information working

* Add rnnt loss

* Add pruned rnnt loss

* Minor Fixes

* Minor fixes & fix code style

* Fix cpp style

* Fix code style

* Fix s_begin values in padding positions

* Fix bugs related to boundary; Fix s_begin padding value; Add more tests

* Minor fixes

* Fix comments

* Add boundary to pruned loss tests

* Use more efficient way to fix boundaries (#906)

* Release v1.12 (#907)

* Change the sign of the rnnt_loss and add reduction argument (#911)

* Add right boundary constraints for s_begin

* Minor fixes to the interface of rnnt_loss to make it return positive value

* Fix comments

* Release a new version

* Minor fixes

* Minor fixes to the docs

* Fix building doc. (#908)

* Fix building doc.

* Minor fixes.

* Minor fixes.

* Fix building doc (#912)

* Fix building doc

* Fix flake8

* Support torch 1.10.x (#914)

* Support torch 1.10.x

* Fix installing PyTorch.

* Update INSTALL.rst (#915)

* Update INSTALL.rst

Setting a few additional env variables to enable compilation from source *with CUDA GPU computation support enabled*

* Fix torch/cuda/python versions in the doc. (#918)

* Fix torch/cuda/python versions in the doc.

* Minor fixes.

* Fix building for CUDA 11.6 (#917)

* Fix building for CUDA 11.6

* Minor fixes.

* Implement Unstack (#920)

* Implement unstack

* Remove code that does not relate to this PR

* Remove for loop on output dim; add Unstack ragged

* Add more docs

* Fix comments

* Fix docs & unit tests

* SubsetRagged & PruneRagged (#919)

* Extend interface of SubsampleRagged.

* Add interface for pruning ragged tensor.

* Draft of new RNN-T decoding method

* Implements SubsampleRaggedShape

* Implements PruneRagged

* Rename subsample -> subset

* Minor fixes

* Fix comments

Co-authored-by: Daniel Povey <dpovey@gmail.com>

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: Jan "yenda" Trmal <jtrmal@gmail.com>
Co-authored-by: Mingshuang Luo <37799481+luomingshuang@users.noreply.github.com>
Co-authored-by: Ludwig Kürzinger <lumaku@users.noreply.github.com>
Co-authored-by: Daniel Povey <dpovey@gmail.com>
Co-authored-by: drawfish <duisheng.chen@gmail.com>
Co-authored-by: gzchenduisheng <gzchenduisheng@corp.netease.com>
Co-authored-by: alexei-v-ivanov <alexei_v_ivanov@ieee.org>

* Online decoding (#876)

* Add OnlineIntersectDensePruned

* Fix get partial results

* Support online decoding on intersect_dense_pruned

* Update documents

* Update v2.0-pre (#942)

* Add Hash64 (#895)

* Add hash64

* Fix tests

* Resize hash64

* Fix comments

* fix typo

* Modified rnnt (#902)

* Add modified mutual_information_recursion

* Add modified rnnt loss

* Using more efficient way to fix boundaries

* Fix modified pruned rnnt loss

* Fix the s_begin constraints of pruned loss for modified version transducer

* Fix Stack (#925)

* return the correct layer

* unskip the test

* Fix 'TypeError' of rnnt_loss_pruned function. (#924)

* Fix 'TypeError' of rnnt_loss_simple function.

Fix 'TypeError' exception when calling rnnt_loss_simple(..., return_grad=False) at validation steps.

* Fix 'MutualInformationRecursionFunction.forward()' return type check error for pytorch < 1.10.x

* Modify return type.

* Add documents about class MutualInformationRecursionFunction.

* Formatted code style.

* Fix rnnt_loss_smoothed return type.

Co-authored-by: gzchenduisheng <gzchenduisheng@corp.netease.com>

* Support torch 1.11.0 and CUDA 11.5 (#931)

* Support torch 1.11.0 and CUDA 11.5

* Implement Rnnt decoding (#926)

* first working draft of rnnt decoding

* FormatOutput works...

* Different num frames for FormatOutput works

* Update docs

* Fix comments, break advance into several stages, add more docs

* Add python wrapper

* Add more docs

* Minor fixes

* Fix comments

* fix building docs (#933)

* Release v1.14

* Remove unused DiscountedCumSum. (#936)

* Fix compiler warnings. (#937)

* Fix compiler warnings.

* Minor fixes for RNN-T decoding. (#938)

* Minor fixes for RNN-T decoding.

* Removes arcs with label 0 from the TrivialGraph. (#939)

* Implement linear_fsa_with_self_loops. (#940)

* Implement linear_fsa_with_self_loops.

* Fix the pruning with max-states (#941)

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: Jan "yenda" Trmal <jtrmal@gmail.com>
Co-authored-by: Mingshuang Luo <37799481+luomingshuang@users.noreply.github.com>
Co-authored-by: Ludwig Kürzinger <lumaku@users.noreply.github.com>
Co-authored-by: Daniel Povey <dpovey@gmail.com>
Co-authored-by: drawfish <duisheng.chen@gmail.com>
Co-authored-by: gzchenduisheng <gzchenduisheng@corp.netease.com>
Co-authored-by: alexei-v-ivanov <alexei_v_ivanov@ieee.org>
Co-authored-by: Wang, Guanbo <wgb14@outlook.com>

* update v2.0-pre (#953)

* Rnnt allow different encoder/decoder dims (#945)

* Allow different encoder and decoder dim in rnnt_pruning

* Bug fixes

* Supporting building k2 on Windows (#946)

* Fix nightly windows CPU build (#948)

* Fix nightly building k2 for windows.

* Run nightly build only if there are new commits.

* Check the versions of PyTorch and CUDA at the import time. (#949)

* Check the versions of PyTorch and CUDA at the import time.

* More straightforward message when CUDA support is missing (#950)

* Implement ArrayOfRagged (#927)

* Implement ArrayOfRagged

* Fix issues and pass tests

* fix style

* change a few statements of functions and move the definition of template Array1OfRagged to header file

* add offsets test code

* Fix precision (#951)

* Fix precision

* Using different pow version for windows and *nix

* Use int64_t pow

* Minor fixes

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: Jan "yenda" Trmal <jtrmal@gmail.com>
Co-authored-by: Mingshuang Luo <37799481+luomingshuang@users.noreply.github.com>
Co-authored-by: Ludwig Kürzinger <lumaku@users.noreply.github.com>
Co-authored-by: Daniel Povey <dpovey@gmail.com>
Co-authored-by: drawfish <duisheng.chen@gmail.com>
Co-authored-by: gzchenduisheng <gzchenduisheng@corp.netease.com>
Co-authored-by: alexei-v-ivanov <alexei_v_ivanov@ieee.org>
Co-authored-by: Wang, Guanbo <wgb14@outlook.com>
Co-authored-by: Nickolay V. Shmyrev <nshmyrev@gmail.com>
Co-authored-by: LvHang <hanglyu1991@gmail.com>

* Add C++ Rnnt demo (#947)

* rnnt_demo compiles

* Change graph in RnntDecodingStream from shared_ptr to const reference

* Change out_map from Array1 to Ragged

* Add rnnt demo

* Minor fixes

* Add more docs

* Support log_add when getting best path

* Port kaldi::ParseOptions for parsing commandline options. (#974)

* Port kaldi::ParseOptions for parsing commandline options.

* Add more tests.

* More tests.
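
Kaldi-style option parsing, which the ported class provides, boils down to registering typed defaults and consuming `--name=value` arguments. A miniature Python sketch under that assumption (`parse_options` and its parameters are hypothetical; the real port is a C++ class in k2):

```python
def parse_options(registered, args):
    """registered: dict of option name -> typed default value.

    Returns (options, positional_args); values are cast to the
    type of their registered default, kaldi-style.
    """
    opts = dict(registered)
    positional = []
    for arg in args:
        if arg.startswith("--") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            if name not in opts:
                raise ValueError(f"unknown option --{name}")
            opts[name] = type(opts[name])(value)  # cast to default's type
        else:
            positional.append(arg)
    return opts, positional
```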

* Greedy search and modified beam search for pruned stateless RNN-T. (#975)

* First version of greedy search.

* WIP: Implement modified beam search and greedy search for pruned RNN-T.

* Implement modified beam search.

* Fix compiler warnings

* Fix style issues
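
Greedy RNN-T search, as implemented above, emits at most a few symbols per encoder frame and advances on blank. A toy sketch of that control flow (all names here — `greedy_search`, `toy_joiner` — are hypothetical stand-ins; the real k2 code drives actual encoder/decoder networks):

```python
def greedy_search(joiner, encoder_out, blank=0, max_sym_per_frame=3):
    """joiner(frame, hyp) -> list of scores over tokens given history."""
    hyp = []
    for frame in encoder_out:               # one encoder frame at a time
        emitted = 0
        while emitted < max_sym_per_frame:  # cap symbols per frame
            scores = joiner(frame, hyp)
            tok = max(range(len(scores)), key=scores.__getitem__)
            if tok == blank:
                break                        # advance to the next frame
            hyp.append(tok)
            emitted += 1
    return hyp

def toy_joiner(frame, hyp):
    """Stand-in joiner: emit the frame's label once, then blank."""
    scores = [0.0, 0.0, 0.0]                # vocab: {0: blank, 1, 2}
    if hyp and hyp[-1] == frame:
        scores[0] = 1.0                      # already emitted -> blank
    else:
        scores[frame] = 1.0
    return scores
```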

* Update torch_api.h to include APIs for CTC decoding

Co-authored-by: Wei Kang <wkang@pku.org.cn>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: Jan "yenda" Trmal <jtrmal@gmail.com>
Co-authored-by: pingfengluo <pingfengluo@gmail.com>
Co-authored-by: Mingshuang Luo <37799481+luomingshuang@users.noreply.github.com>
Co-authored-by: Ludwig Kürzinger <lumaku@users.noreply.github.com>
Co-authored-by: Daniel Povey <dpovey@gmail.com>
Co-authored-by: drawfish <duisheng.chen@gmail.com>
Co-authored-by: gzchenduisheng <gzchenduisheng@corp.netease.com>
Co-authored-by: alexei-v-ivanov <alexei_v_ivanov@ieee.org>
Co-authored-by: Wang, Guanbo <wgb14@outlook.com>
Co-authored-by: Nickolay V. Shmyrev <nshmyrev@gmail.com>
Co-authored-by: LvHang <hanglyu1991@gmail.com>