# Generating an Anki Vocabulary Deck with an LLM Prompt

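## Exercise Goals

- Design a prompt that takes English text as input, extracts the important vocabulary, and generates **vocabulary data in Anki card format**.
- Include **system instructions**, an **example**, and an **output format** in the prompt.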



## Problem

Write an **LLM prompt** that satisfies the following requirements.


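## Requirements


### 1. System Instructions

- The LLM must extract the **key words or phrases worth studying** from the English text and turn them into a vocabulary list.
- **Exclude words that are too easy**, and also include **proper nouns**, **technical terms**, and **idioms**.
- Each card contains the following fields:
  1. Word/phrase
  2. Definition (in English or Korean)
  3. Part of speech
  4. Example sentence (in English)
- The output follows the **Anki card format**, in this form:
  - Fields are separated by a semicolon (`;`)
  - One card per line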

### 2. Example

**Example input text**:


```text
The committee will convene next week to discuss the new policy proposals and reach a consensus.
```

**Example output**:

```text
convene;to come together for a meeting;verb;The committee will convene next week.
consensus;general agreement;noun;The group reached a consensus after a long discussion.
proposal;a suggestion or plan;noun;He submitted a proposal for the new project.
```


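### 3. Output Format

- Each card: **word or phrase;definition;part of speech;English example sentence**
- **One card per line**, with **fields separated by semicolons (;)**
- **No extra commentary, blank lines, or sentences outside the cards**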
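The format rules above can be checked mechanically before any parsing. The following is a minimal sketch, assuming a hypothetical helper named `card_problems` (not part of the exercise) that reports why a single output line does not look like a four-field card.

```python
def card_problems(line: str) -> list[str]:
    """Return the format problems found in one card line (empty list = valid)."""
    problems = []
    fields = line.split(";")
    # A card must have exactly four semicolon-separated fields.
    if len(fields) != 4:
        problems.append(f"expected 4 fields, got {len(fields)}")
    # No field may be empty or whitespace-only.
    if any(not field.strip() for field in fields):
        problems.append("empty field")
    return problems


# A well-formed card and a line of extra commentary from the model.
print(card_problems("convene;to come together for a meeting;verb;The committee will convene next week."))
print(card_problems("Here are your flashcards:"))
```

A check like this flags commentary lines, which the output format forbids, but it cannot tell whether a semicolon inside a definition or example sentence has split a field by accident.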


### 4. DataFrame Visualization

Parse the LLM output and display it as a **pandas DataFrame**, as shown below.

| Word/Phrase | Definition                         | PoS  | Example Sentence                                      |
|-------------|-------------------------------------|------|--------------------------------------------------------|
| convene     | to come together for a meeting     | verb | The committee will convene next week.                 |
| consensus   | general agreement                  | noun | The group reached a consensus after a long discussion.|
| proposal    | a suggestion or plan               | noun | He submitted a proposal for the new project.          |

## Environment Setup

In [2]:
from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')

In [3]:
from openai import OpenAI

client = OpenAI(api_key=OPENAI_API_KEY)
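The cells above read the key from Colab secrets, so they only run inside Google Colab. As a hedged sketch for running the notebook elsewhere, the hypothetical helper `load_api_key` below tries Colab first and then falls back to an environment variable of the same name. The fallback behavior is an assumption, not something the exercise prescribes.

```python
import os


def load_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Read the key from Colab secrets if available, else from the environment."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata
        key = userdata.get(name)
    except Exception:
        # Not in Colab, or the secret is missing or not accessible.
        key = None
    if not key:
        key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set")
    return key
```

With this helper, the client could be created as `OpenAI(api_key=load_api_key())`.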

## Prompt Input

In [5]:
system_message = '''
You are an expert English vocabulary extractor for language learners.

## System Instruction:
Given a block of English text, your task is to extract high-value vocabulary items suitable for advanced language learners and output them in Anki flashcard format.

### Requirements:
- Only extract meaningful vocabulary or expressions:
  - Exclude very common/easy words (e.g., "and", "good", "go")
  - Include:
    - Advanced or uncommon vocabulary
    - Proper nouns (e.g., historical names, organizations)
    - Technical terms (e.g., economics, biology)
    - Idiomatic expressions and phrasal verbs
- Each card must contain **exactly four fields**, separated by **semicolon (;)**:
  1. The word or expression
  2. Definition (in English or Korean)
  3. Part of speech (noun, verb, adj, etc.)
  4. Example sentence (in English)
- Output each card in a single line.

---

## Input Text:
The economic outlook remains uncertain due to geopolitical tensions and volatile energy markets. Analysts warn that inflation may persist, driven by supply chain disruptions and labor shortages.

---

## Output Format (Anki CSV Style):
geopolitical;relating to international politics and geography;adjective;Geopolitical issues are affecting global stability.
volatile;likely to change rapidly and unpredictably;adjective;The energy market is highly volatile these days.
inflation;an increase in prices and fall in the purchasing value of money;noun;High inflation hurts consumer spending.
supply chain;a network between a company and its suppliers to produce and distribute a product;noun;The supply chain was disrupted during the pandemic.
labor shortage;not enough available workers to fill jobs;noun;Restaurants are struggling due to labor shortages.

---

## Your Task:
Generate vocabulary flashcards from the following input text in the above format.
'''
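The system message above embeds the worked example directly in the instructions. One alternative layout, sketched here under the assumption that the instructions and the example are kept as separate strings, passes the example as an earlier user/assistant turn so the real input is the only open request. `build_messages` is a hypothetical helper, and this is not the layout the notebook uses.

```python
def build_messages(instructions: str, example_input: str,
                   example_output: str, text: str) -> list[dict]:
    """Build a chat message list with one few-shot example turn."""
    return [
        {"role": "system", "content": instructions.strip()},
        # The worked example is shown as a completed exchange.
        {"role": "user", "content": example_input.strip()},
        {"role": "assistant", "content": example_output.strip()},
        # The text that actually needs flashcards comes last.
        {"role": "user", "content": text.strip()},
    ]


messages = build_messages(
    "Extract vocabulary as 'word;definition;part of speech;example', one card per line.",
    "Analysts warn that inflation may persist.",
    "inflation;a general rise in prices;noun;High inflation hurts consumer spending.",
    "The committee will convene next week.",
)
print([m["role"] for m in messages])
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`.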

In [6]:
user_message = '''
The committee will convene next week to discuss the new policy proposals and reach a consensus.
'''

response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {
      "role": "system",
      "content": [
        {
          "type": "text",
          "text": system_message
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": user_message
        }
      ]
    }
  ],
  response_format={
    "type": "text"
  },
  temperature=1,
  max_completion_tokens=2048,
  top_p=1,
  frequency_penalty=0,
  presence_penalty=0
)
print(response.choices[0].message.content)

convene;to come together for a meeting or gathering;verb;The board will convene to address the upcoming issues.  
policy proposals;plans or suggestions for new policies;phrase;The committee reviewed several policy proposals during their meeting.  
consensus;general agreement among a group;noun;Reaching a consensus can sometimes take longer than expected.


### Splitting Fields

In [11]:
anki = response.choices[0].message.content
anki

'convene;to come together for a meeting or gathering;verb;The board will convene to address the upcoming issues.  \npolicy proposals;plans or suggestions for new policies;phrase;The committee reviewed several policy proposals during their meeting.  \nconsensus;general agreement among a group;noun;Reaching a consensus can sometimes take longer than expected.'

In [12]:
anki_output = anki.strip()
anki_output

'convene;to come together for a meeting or gathering;verb;The board will convene to address the upcoming issues.  \npolicy proposals;plans or suggestions for new policies;phrase;The committee reviewed several policy proposals during their meeting.  \nconsensus;general agreement among a group;noun;Reaching a consensus can sometimes take longer than expected.'

In [15]:
rows = [line.split(";") for line in anki_output.splitlines()]
rows

[['convene',
  'to come together for a meeting or gathering',
  'verb',
  'The board will convene to address the upcoming issues.  '],
 ['policy proposals',
  'plans or suggestions for new policies',
  'phrase',
  'The committee reviewed several policy proposals during their meeting.  '],
 ['consensus',
  'general agreement among a group',
  'noun',
  'Reaching a consensus can sometimes take longer than expected.']]
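The split above keeps the trailing spaces the model emitted after some example sentences, and it would produce rows of the wrong length if the model added a commentary line or used a semicolon inside a sentence. A more defensive version might look like the sketch below. `parse_cards` is a hypothetical helper, and limiting the split to three separators assumes that any extra semicolon belongs to the example sentence.

```python
def parse_cards(raw: str) -> list[list[str]]:
    """Parse semicolon-separated card lines into clean four-field rows."""
    rows = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on at most three separators so later semicolons stay in field 4.
        fields = [field.strip() for field in line.split(";", 3)]
        # Keep only lines that yield four non-empty fields.
        if len(fields) == 4 and all(fields):
            rows.append(fields)
    return rows


sample = (
    "convene;to come together;verb;The board will convene.  \n"
    "\n"
    "Note: three cards generated\n"
    "consensus;general agreement;noun;They agreed; a consensus was reached."
)
for row in parse_cards(sample):
    print(row)
```

Dropping malformed lines silently is a design choice. Logging or collecting them instead would make it easier to spot a prompt that drifts from the required format.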

In [16]:
import pandas as pd

df = pd.DataFrame(rows, columns=["Word/Phrase", "Definition", "PoS", "Example Sentence"])
df

|   | Word/Phrase      | Definition                                  | PoS    | Example Sentence                                   |
|---|------------------|---------------------------------------------|--------|----------------------------------------------------|
| 0 | convene          | to come together for a meeting or gathering | verb   | The board will convene to address the upcoming...  |
| 1 | policy proposals | plans or suggestions for new policies       | phrase | The committee reviewed several policy proposal...  |
| 2 | consensus        | general agreement among a group             | noun   | Reaching a consensus can sometimes take longer...  |
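Once the cards are in tabular form, they can be written back to a semicolon-separated text file for import into Anki. The sketch below uses the standard `csv` module. `write_anki_file` is a hypothetical helper, and it assumes that Anki's text importer accepts double-quoted fields, which is worth confirming against the Anki manual for the version in use.

```python
import csv
import io


def write_anki_file(rows, handle) -> None:
    """Write card rows as semicolon-separated lines, quoting fields when needed."""
    # QUOTE_MINIMAL (the default) quotes only fields containing ';', '"' or a newline.
    writer = csv.writer(handle, delimiter=";", lineterminator="\n")
    writer.writerows(rows)


# Demonstration with an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_anki_file(
    [
        ["convene", "to come together for a meeting", "verb", "The committee will convene next week."],
        ["consensus", "general agreement", "noun", "They agreed; a consensus was reached."],
    ],
    buffer,
)
print(buffer.getvalue())
```

With the DataFrame from the previous cell, the rows could be supplied as `df.values.tolist()` and the handle opened with `open("anki_cards.txt", "w", newline="", encoding="utf-8")`. The file name is illustrative.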
