[help] Why is llama_build_graph called inside llama_decode? #5916

@xjj210130

I have been reading the llama.cpp source code.
I am confused about why llama_build_graph needs to be called every time llama_decode is called.
Could llama_build_graph instead be called once during program initialization? That should reduce the inference time.

static int llama_decode_internal(
        llama_context & lctx,
        llama_batch     batch) {
    // ...
    ggml_cgraph * gf = llama_build_graph(lctx, batch, false);
    // ...
}
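
To make the question concrete, here is a minimal toy sketch of the two strategies being compared: rebuilding the graph on every decode call, versus building it once per distinct shape and reusing it. Nothing here is llama.cpp or ggml code. `ToyGraph`, `ToyBuilder`, `ToyGraphCache`, and `decode_rebuild` are hypothetical names, and the idea that the graph depends only on the batch's token count and the number of cached positions is an assumption made for illustration (the snippet above only shows that `batch` is passed to `llama_build_graph`).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical stand-in for a compute graph: it only records the
// shapes it was built for (assumption: these two values are what vary).
struct ToyGraph {
    std::size_t n_tokens; // tokens in the current batch
    std::size_t n_kv;     // cached positions the batch attends to
};

// Hypothetical builder that counts how many times a graph is built.
struct ToyBuilder {
    std::size_t builds = 0;

    ToyGraph build(std::size_t n_tokens, std::size_t n_kv) {
        assert(n_tokens > 0); // an empty batch has nothing to decode
        ++builds;
        return ToyGraph{n_tokens, n_kv};
    }
};

// Strategy A: rebuild on every decode call, as in the snippet above.
inline ToyGraph decode_rebuild(ToyBuilder & b, std::size_t n_tokens, std::size_t n_kv) {
    return b.build(n_tokens, n_kv);
}

// Strategy B: keep graphs keyed by shape and rebuild only for a shape
// that has not been seen before.
struct ToyGraphCache {
    std::map<std::pair<std::size_t, std::size_t>, ToyGraph> cache;

    const ToyGraph & get(ToyBuilder & b, std::size_t n_tokens, std::size_t n_kv) {
        const auto key = std::make_pair(n_tokens, n_kv);
        auto it = cache.find(key);
        if (it == cache.end()) {
            it = cache.emplace(key, b.build(n_tokens, n_kv)).first;
        }
        return it->second;
    }
};
```

In this toy model, a graph built once at initialization is only reusable when a later call has exactly the same shape. If the number of cached positions grows with every generated token, each call presents a new shape, so the cache in Strategy B would rarely hit.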

Thanks
