Question : About construction of total_cache_k, total_cache_v in Transformer #442

Open
Kangmo opened this issue Dec 26, 2022 · 4 comments

@Kangmo
Contributor

Kangmo commented Dec 26, 2022

In lightseq/csrc/models/transformer.cu,
Should cache_k_out and cache_v_out call set_ancestor? If not, why not remove the unused variables cache_k_out and cache_v_out?

Transformer::Transformer {
  ...
  for (auto iter : dec_layer_vec) {
    Variable *cache_k = new Variable("cache_k");
    Variable *cache_v = new Variable("cache_v");
    std::tuple<Variable *, Variable *, Variable *> dec_outs =
        (*iter)(dec_emb, total_enc_kv, pad_mask, cache_k, cache_v);
    dec_emb = std::get<0>(dec_outs);
    Variable *cache_k_out = std::get<1>(dec_outs);
    Variable *cache_v_out = std::get<2>(dec_outs);

    cache_k->set_ancestor(total_cache_k, cache_size * dec_layer_idx);
    cache_v->set_ancestor(total_cache_v, cache_size * dec_layer_idx);
    dec_layer_idx++;
  }

cache_k->set_ancestor(total_cache_k, cache_size * dec_layer_idx);

@hexisyztem
Collaborator

Sorry, this is the design of the new architecture, which uses some fixed syntax to manage GPU memory sharing.

@hexisyztem
Collaborator

The set_ancestor function assigns cache_k to a contiguous segment of total_cache_k. In the specific case you gave here, cache_k_out can be removed.
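
As a rough illustration of the idea described above, the following C++ sketch models a variable that can become a view into a larger shared buffer. The `Variable` class here is a simplified, hypothetical stand-in, not LightSeq's actual implementation, and `fill_layer_caches` with its parameters is invented purely for illustration. Only the names `set_ancestor` and `total_cache_k` and the `cache_size * dec_layer_idx` offset pattern come from the snippet in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Variable: it either owns its own buffer or is
// a view into an ancestor's buffer at a fixed element offset.
class Variable {
 public:
  Variable() = default;
  explicit Variable(std::size_t size) : storage_(size, 0.0f) {}

  // Make this variable a view into `ancestor`, starting `offset` elements in.
  void set_ancestor(Variable *ancestor, std::size_t offset) {
    assert(ancestor != nullptr);
    ancestor_ = ancestor;
    offset_ = offset;
  }

  // Resolve the address of this variable's first element.
  float *data() {
    return ancestor_ ? ancestor_->data() + offset_ : storage_.data();
  }

 private:
  std::vector<float> storage_;
  Variable *ancestor_ = nullptr;
  std::size_t offset_ = 0;
};

// Illustrative helper (an assumption, not part of LightSeq): every layer
// writes `layer index + 1` through its own cache view, and the contents of
// the shared buffer are returned for inspection.
std::vector<float> fill_layer_caches(std::size_t num_layers,
                                     std::size_t cache_size) {
  Variable total_cache_k(num_layers * cache_size);
  std::vector<Variable> layer_caches(num_layers);

  for (std::size_t dec_layer_idx = 0; dec_layer_idx < num_layers;
       ++dec_layer_idx) {
    Variable &cache_k = layer_caches[dec_layer_idx];
    // Same offset pattern as in the question: one segment per layer.
    cache_k.set_ancestor(&total_cache_k, cache_size * dec_layer_idx);

    float *k = cache_k.data();
    for (std::size_t i = 0; i < cache_size; ++i) {
      k[i] = static_cast<float>(dec_layer_idx + 1);
    }
  }

  float *base = total_cache_k.data();
  return std::vector<float>(base, base + num_layers * cache_size);
}
```

In this sketch, each layer's writes land in its own non-overlapping segment of the shared buffer, so the per-layer variable needs no allocation of its own. A separately returned handle such as cache_k_out would carry no extra information in this simplified model.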

@hexisyztem
Collaborator

I'll fix this detail in my next commit.

@Kangmo
Contributor Author

Kangmo commented Jan 5, 2023

Thank you for the confirmation, hexisyztem!
