A repository that systematically organizes the core concepts and enhancement techniques of LLM technology, including both theoretical explanations and practical implementation examples.

A large language model (LLM) is a deep learning model trained on vast amounts of text data that achieves human-level performance on natural language understanding and generation tasks.
| Type | Example Models | Characteristics |
|---|---|---|
| Autoregressive | GPT-4, LLaMA | Sequential text generation |
| Autoencoder | BERT, RoBERTa | Bidirectional context understanding |
| Multimodal | CLIP, Flamingo | Text + image processing |
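The autoregressive family in the table generates text one token at a time, each step conditioning on everything produced so far. A minimal sketch of that sampling loop, where a toy hand-written bigram table (`next_token_probs`, purely illustrative) stands in for a real model's forward pass:

```python
import random

# Toy "model": a bigram table mapping the previous token to next-token
# probabilities. A real autoregressive LLM computes these with a neural net.
next_token_probs = {
    "<s>": {"the": 0.7, "a": 0.3},
    "the": {"cat": 0.5, "dog": 0.5},
    "a":   {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 1.0},
    "dog": {"ran": 1.0},
    "sat": {"</s>": 1.0},
    "ran": {"</s>": 1.0},
}

def generate(max_tokens: int = 10, seed: int = 0) -> list[str]:
    """Sample tokens left to right until </s>, conditioning on the last token."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    for _ in range(max_tokens):
        probs = next_token_probs[tokens[-1]]
        choices, weights = zip(*probs.items())
        nxt = rng.choices(choices, weights=weights)[0]
        if nxt == "</s>":
            break
        tokens.append(nxt)
    return tokens[1:]  # drop the start symbol

print(" ".join(generate()))
```

The key property shown here is the left-to-right dependency: each sampled token becomes part of the context for the next step, which is what distinguishes this family from bidirectional autoencoders like BERT.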
📚 Knowledge-Augmented Generation (RAG)
- Combines the model with an external knowledge base to improve accuracy
- Implementation frameworks: LangChain, Haystack
- The knowledge base is shown explicitly in the diagram
- The actual runtime flow is reflected
- Core components are highlighted
```mermaid
graph TD
    A[User Question] --> B(Query Embedding)
    B --> C{Vector DB Search}
    C --> D[Top-k Documents]
    D --> E[Context Filtering]
    E --> F[Re-ranking]
    F --> G{LLM Generator}
    G --> H[Final Answer]
    subgraph Knowledge Base
        C -->|lookup| I[(Chunk Storage)]
        I --> J[Metadata]
        I --> K[Text Embeddings]
    end
    G -->|request| M[External APIs]
    H -->|Feedback| A
```
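The retrieval half of the pipeline above can be sketched in a few lines. This toy uses hand-made 3-dimensional embeddings and cosine similarity in place of a real embedding model and vector database; all names (`chunks`, `retrieve_top_k`, `build_prompt`) are illustrative, not the API of LangChain or Haystack:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy chunk storage: (text, embedding) pairs. A real system stores
# model-produced embeddings in a vector database.
chunks = [
    ("LoRA adapts a frozen model with low-rank matrices", [0.9, 0.1, 0.0]),
    ("INT8 quantization shrinks model weights",           [0.1, 0.9, 0.0]),
    ("RAG grounds answers in retrieved documents",        [0.0, 0.2, 0.9]),
]

def retrieve_top_k(query_emb, k=2):
    """Vector DB search step: rank chunks by cosine similarity to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_emb, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, query_emb):
    """Stuff the retrieved context into the prompt handed to the LLM generator."""
    context = "\n".join(retrieve_top_k(query_emb))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("How does RAG improve accuracy?", [0.0, 0.1, 1.0]))
```

Grounding the prompt in retrieved text is what lets the generator answer from the knowledge base rather than from its parameters alone, which is the accuracy gain the bullet list describes.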
🎯 Domain-Specific Fine-Tuning
- Adapts a pre-trained model to a specific task or domain
- Key techniques:
  - Full Fine-tuning
  - LoRA (Low-Rank Adaptation)
  - Prompt Tuning
As a practical experiment, LoRA was used to fine-tune the LLaMA 3.1 8B model on a Korean-language dataset about Korean food; the adapted model performed noticeably better on this domain than the original LLaMA 3.1.
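The LoRA idea behind that experiment can be illustrated with plain matrices: the pretrained weight W stays frozen, and only two small low-rank factors A (d×r) and B (r×d) are trained, giving an effective weight W + (alpha/r)·A·B. A minimal sketch with toy numbers, not the actual training setup:

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def madd(X, Y, scale=1.0):
    return [[x + scale * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

d, r, alpha = 4, 2, 4  # hidden size, LoRA rank, scaling factor (toy values)
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen

# Trainable low-rank factors; B starts at zero, so the adapter is a no-op.
A = [[0.1] * r for _ in range(d)]   # d x r
B = [[0.0] * d for _ in range(r)]   # r x d

def effective_weight():
    """W_eff = W + (alpha / r) * A @ B  -- only A and B receive gradients."""
    return madd(W, matmul(A, B), scale=alpha / r)

assert effective_weight() == W  # zero-init B leaves the model unchanged

# After "training", B is non-zero and the adapter shifts the weights.
B = [[0.5] * d for _ in range(r)]
W_eff = effective_weight()
```

With rank r much smaller than d, the trainable parameter count drops from d² to 2·d·r, which is why LoRA fits an 8B-parameter model on modest hardware.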
⚙️ Model Compression
- FP32 → INT8 conversion shrinks the model roughly 4×
- Inference speed improves 2-3×
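The 4× figure follows from storage: each INT8 weight takes 1 byte versus 4 for FP32. A minimal sketch of symmetric post-training quantization with a single per-tensor scale; real toolchains also handle per-channel scales, zero-points, and calibration:

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to [-127, 127] with one scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 values at inference time."""
    return [qi * scale for qi in q]

weights = [0.82, -1.27, 0.003, 0.51, -0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each INT8 value takes 1 byte vs 4 bytes for FP32 -> ~4x smaller.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q, f"scale={scale:.5f}", f"max_err={max_err:.5f}")
```

The rounding error is bounded by half the scale, so weights with a narrow dynamic range quantize almost losslessly, which is why INT8 usually costs little accuracy.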
🌐 Multimodal Integration
- Processes text together with images/video/audio
- Key architectures:
  - Cross-modal Attention
  - Fusion Networks
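Cross-modal attention lets one modality query another: text tokens form the queries while image patches supply the keys and values. A minimal single-head sketch with toy 2-D features; the shapes, numbers, and the key/value sharing are illustrative simplifications:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_modal_attention(text_q, image_kv):
    """softmax(Q K^T / sqrt(d)) V with Q from text, K and V from image patches
    (keys and values share the image features here for simplicity)."""
    d = len(text_q[0])
    out = []
    for q in text_q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in image_kv]
        attn = softmax(scores)
        out.append([sum(a * v[j] for a, v in zip(attn, image_kv))
                    for j in range(d)])
    return out

text_q   = [[1.0, 0.0], [0.0, 1.0]]               # 2 text token features
image_kv = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]   # 3 image patch features
fused = cross_modal_attention(text_q, image_kv)
```

Each output row is a text token enriched with a weighted mix of image features; fusion networks then combine such per-modality representations into a joint one.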