assign1 #8

Closed
Rory0730 wants to merge 2 commits into InfiniTensor:main from Rory0730:assign1

Conversation

@Rory0730

No description provided.

Rory0730 closed this Aug 28, 2025
xsmccc added a commit to xsmccc/llaisys that referenced this pull request Apr 6, 2026
- §1: Add KV Cache INT8 (InfiniTensor#4) and CUDA Graph (InfiniTensor#5) to project intro (7→9 optimizations)
- §32: Rewrite optimization InfiniTensor#8 from 'failed CUDA Graph' to successful KV Cache INT8 (+55%)
- §32: Add optimization InfiniTensor#9 CUDA Graph static capture (+12.2%, 118→132 tok/s)
- §32: Update acceleration breakdown table (330× complete, FP32 4.4×)
- §24.5: Fix perf numbers (57.3→57.5, FP32 33.6→~30, add final 132 tok/s)
- §40: Update quantization Q&A with full pipeline data
- §43: Rewrite cudaGraph section with project-specific implementation details
- Clean up duplicate INT4 paragraph, fix title counts (七→九项, i.e. "seven" → "nine items")