nnet3- update CachingOptimizingCompiler to store a configurable number of computations. #69

danpovey · 2015-08-11T20:14:49Z

Would require adding a hashing object for class ComputationRequest, so we can do fast lookup in a hash.
Idea is to cache the n most recently accessed Computations. Probably a reasonable default is to cache around 20 computations.

danpovey · 2015-09-01T01:48:17Z

@naxingyu , you might want to try this.
Make sure you understand hashing functions, hashing objects, and unordered_map first. Grep around for unordered_map objects in the existing code.
Dan

naxingyu · 2015-09-01T01:54:58Z

I'll see what I can do.

On 09/01/2015 09:48 AM, Daniel Povey wrote:

@naxingyu https://github.com/naxingyu , you might want to try this.
Make sure you understand hashing functions, hashing objects,
equality-testing objects, and unordered_map first. Grep around for
unordered_map objects in the existing code.
Dan

—
Reply to this email directly or view it on GitHub
#69 (comment).

naxingyu · 2015-09-01T07:34:27Z

My guess is

add a member to CachingOptimizingCompiler class,
unordered_map<int32, ComputationRequest> request_cache_;
Reserve the size of request_cache_ to a configurable number (20 as
default).
Each time Compile() is called, add the request to the bucket.

Is it correct? And how would you like the key defined? I see we have
StringHasher, VectorHasher, PairHasher and CindexHasher.

On 09/01/2015 09:48 AM, Daniel Povey wrote:

@naxingyu https://github.com/naxingyu , you might want to try this.
Make sure you understand hashing functions, hashing objects,
equality-testing objects, and unordered_map first. Grep around for
unordered_map objects in the existing code.
Dan

—
Reply to this email directly or view it on GitHub
#69 (comment).

danpovey · 2015-09-01T14:08:43Z

My guess is

add a member to CachingOptimizingCompiler class,
unordered_map<int32, ComputationRequest> request_cache_;

Reserve the size of request_cache_ to a configurable number (20 as
default).

Each time Compile() is called, add it to the bucket.

Is it correct? And how would you like the key defined? I see we have
StringHasher, VectorHasher, PairHasher and CindexHasher.

This is not really right, it needs to be a map from something like
ComputationRequest to Computation (or possibly pointers). Which requires a
hashing object. And 'reserve' on the map is just an advisory thing to the
unordered_map code; it's unrelated to the core issue, which is only keeping
a number of Computations (the n most recent)... this might involve some
kind of priority queue, or at least storing something which says how
recently a particular computation was used (e.g. value in map could be
std::pair<int32, Computation>, with the int32 being some kind of 't' value
keeping track of freshness.

Actually, a map may not even be necessary (or necessarily the most
efficient way of doing it)-- once we have the hash function of the
ComputationRequest, manually going through 20 hash functions of
ComputationRequests and comparing might not be that hard. So simply a
priority queue may be the right way. But I guess we are probing the limits
of your knowledge of C++.
Kirill, do you have time to either do this, or explain to Xingyu in more
detail how to do it?

Dan

On 09/01/2015 09:48 AM, Daniel Povey wrote:

@naxingyu https://github.com/naxingyu , you might want to try this.
Make sure you understand hashing functions, hashing objects,
equality-testing objects, and unordered_map first. Grep around for
unordered_map objects in the existing code.
Dan

—
Reply to this email directly or view it on GitHub
#69 (comment).

—
Reply to this email directly or view it on GitHub
#69 (comment).

danpovey · 2015-09-01T14:12:14Z

My guess is

add a member to CachingOptimizingCompiler class,
unordered_map<int32, ComputationRequest> request_cache_;

Reserve the size of request_cache_ to a configurable number (20 as
default).

Each time Compile() is called, add it to the bucket.

Is it correct? And how would you like the key defined? I see we have
StringHasher, VectorHasher, PairHasher and CindexHasher.

This is not really right, it needs to be a map from something like
ComputationRequest to Computation (or possibly pointers). Which requires a
hashing object. And 'reserve' on the map is just an advisory thing to the
unordered_map code; it's unrelated to the core issue, which is only keeping
a number of Computations (the n most recent)... this might involve some
kind of priority queue, or at least storing something which says how
recently a particular computation was used (e.g. value in map could be
std::pair<int32, Computation>, with the int32 being some kind of 't' value
keeping track of freshness.

Actually, a map may not even be necessary (or not even necessarily the most
efficient way of doing it)-- once we have the hash function of the
ComputationRequest, manually going through 20 hash functions of
ComputationRequests and comparing the integers (or size_t's) might not be
that hard. So simply a priority queue may be the right way. But I guess
we are probing the limits of your knowledge of C++.
Kirill, do you have time to either do this, or explain to Xingyu in more
detail how to do it?

Dan

On 09/01/2015 09:48 AM, Daniel Povey wrote:

@naxingyu https://github.com/naxingyu , you might want to try this.
Make sure you understand hashing functions, hashing objects,
equality-testing objects, and unordered_map first. Grep around for
unordered_map objects in the existing code.
Dan

—
Reply to this email directly or view it on GitHub
#69 (comment).

—
Reply to this email directly or view it on GitHub
#69 (comment).

vdp · 2015-09-01T15:39:33Z

By the way, if I understand Dan's requirements correctly, the search term to use in Google could be something like "LRU cache C++" ...

naxingyu · 2015-09-02T01:38:14Z

Thank you to @danpovey and @vdp for the info. I try to push my limits on this one :)

naxingyu · 2015-09-02T07:54:15Z

Came up with another plan after Googling around...
I would add three variable members and a function member to the Caching class:

// The most recently accessed ComputationRequest appears at time_stemp.end()-1,
// and the least recently accessed one appears at the time_stemp.begin(). This list
// is used to keep track of freshness.

std::list<ComputationRequest> time_stemp_;

// The key is a request, the value is a pair of a pointer to the computation
// and an iterator of time_stemp_ pointing to the time stemp of the key(request).
// So the priority queue can be maintained at a cost of O(1) by finding the
// least accessed computation using fast look up and any computation
// can be found using the find operator () also at a O(1) cost.

unordered_map<ComputationRequest, std::pair<NnetComputation*,
                  std::list<ComputationRequest>::iterator>,
                  ComputationRequestHasher > computation_cache_;

// caching capacity, should be const

const int32 capacity_;

// This function would be called within Compile() before returning computation.
// If the request is found in the cache, update the time_stemp_ to keep
// track of freshness. Otherwise, insert the request to the end of time_stemp_
// and purge the beginning request if capacity is reached.

void UpdateCache();

Does it make sense?

vdp · 2015-09-02T09:45:13Z

Hi Xingyu,

I haven't coded much in C++ recently, so I'm a bit rusty, and I don't know the nnet3 code, but I think that you are headed in the right direction. I think that you can get some idea about the implementation of the operations from simple examples like e.g. this one. Your code will have the advantage of using the standard unordered_map and list classes, so it could be even more succinct. I think that in addition to the hasher class you will also have to provide an implementation for the key equality operation, as explained for example in this SO answer.

Perhaps, it would be good to make this code generic, i.e. template-based, if Dan thinks it might be useful in future. You could place the implementation under src/util/. Maybe something like src/util/lru-cache.h; perhaps you can borrow ideas about the structure of the code from util/hash-list.h.

A minor nitpick: I think that the name "time_stamp_" is a bit misleading, because we are not keeping explicit times there- only the access order. Maybe it would be better to call it access_queue_, lru_queue_ or some such.

danpovey · 2015-09-02T18:29:25Z

I'm not sure that Xingyu is the right person to create a generic version of this thing- let's focus on the specific case for now.
You have to remember that both ComputationRequest and Computation are objects of significant size, so keeping redundant copies, or moving them around, is a no-no. You may need to work with pointers, and think carefully about memory management.
Actually I suspect that there are other issues with your proposed implementation, but I would prefer that you attempt to implement it first, because you will have a learning curve and in the process of coding it you may discover some things yourself.

naxingyu · 2015-09-03T17:05:06Z

@vdp Thank you Vassil. Luckily we have operator== defined in the key class (ComputationRequest).
@danpovey Agree. I open a PR on this issue.

danpovey · 2015-09-12T00:24:06Z

@naxingyu did this, so closing issue.

Use xp.full

More small fixes

kkm000 added help wanted Please help us with this issue! nnet3 labels Aug 18, 2015

kkm000 added this to the nnet3 milestone Aug 18, 2015

naxingyu mentioned this issue Sep 3, 2015

add nnet3 computation cache #120

Merged

danpovey closed this as completed Sep 12, 2015

hainan-xv pushed a commit to hainan-xv/kaldi that referenced this issue Jul 24, 2018

Merge pull request kaldi-asr#69 from unnonouno/full

d9def0e

Use xp.full

hhadian pushed a commit to hhadian/kaldi that referenced this issue Jan 7, 2019

Merge pull request kaldi-asr#69 from hhadian/chaina2

f057108

More small fixes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nnet3- update CachingOptimizingCompiler to store a configurable number of computations. #69

nnet3- update CachingOptimizingCompiler to store a configurable number of computations. #69

danpovey commented Aug 11, 2015

danpovey commented Sep 1, 2015

naxingyu commented Sep 1, 2015

naxingyu commented Sep 1, 2015

danpovey commented Sep 1, 2015

danpovey commented Sep 1, 2015

vdp commented Sep 1, 2015

naxingyu commented Sep 2, 2015

naxingyu commented Sep 2, 2015

vdp commented Sep 2, 2015

danpovey commented Sep 2, 2015

naxingyu commented Sep 3, 2015

danpovey commented Sep 12, 2015

nnet3- update CachingOptimizingCompiler to store a configurable number of computations. #69

nnet3- update CachingOptimizingCompiler to store a configurable number of computations. #69

Comments

danpovey commented Aug 11, 2015

danpovey commented Sep 1, 2015

naxingyu commented Sep 1, 2015

naxingyu commented Sep 1, 2015

danpovey commented Sep 1, 2015

danpovey commented Sep 1, 2015

vdp commented Sep 1, 2015

naxingyu commented Sep 2, 2015

naxingyu commented Sep 2, 2015

vdp commented Sep 2, 2015

danpovey commented Sep 2, 2015

naxingyu commented Sep 3, 2015

danpovey commented Sep 12, 2015