Commit
graph: backend: compiler: fix updated llama mlp pattern
yifeizh2 authored and vpirogov committed Oct 17, 2023
1 parent 4207105 commit 595543d
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion src/graph/backend/graph_compiler/patterns/mlp_pattern.hpp
@@ -353,8 +353,12 @@ void create_llama_mlp(const std::shared_ptr<pb_graph_t> &pgraph,
     auto quant1 = pgraph->append_op(graph::op_kind::Quantize,
             {in_edge(0, extra_cast_after_mul, 0)});
     if (split_smooth_quant) {
+        auto extra_cast_before_mul_rhs
+                = append_single_op_repetition_subgraph(
+                        pgraph, graph::op_kind::TypeCast, norm1);
         auto smooth_quant_mul1_rhs = append_single_op_repetition_subgraph(
-                pgraph, graph::op_kind::Multiply, extra_cast_before_mul);
+                pgraph, graph::op_kind::Multiply,
+                extra_cast_before_mul_rhs);
         auto extra_cast_after_mul_rhs
                 = append_single_op_repetition_subgraph(pgraph,
                         graph::op_kind::TypeCast, smooth_quant_mul1_rhs, 0,