In [1]:
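# This notebook uses the legacy OpenAI Python SDK (openai<1.0):
# `openai.error` and `openai.ChatCompletion` were removed in 1.0.
import openai
from openai.error import RateLimitError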
import time
from dotenv import load_dotenv
import os

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

In [2]:
def Generation_prompt(ophthalcontents):
    """Build the user prompt: the textbook passage followed by few-shot MCQ examples."""
    alpaca_format_str = f"""content:
{ophthalcontents}

Example1:
[Ques]: Muller's muscle is responsible for approximately 5 mm of eyelid elevation.
[a]: True
[b]: False
[Ans]: False
[Exp]: Muller's muscle, also known as the superior tarsal muscle, contributes only a small amount to eyelid elevation—typically around 1–2 mm, not 5 mm. The primary muscle responsible for lifting the upper eyelid is the levator palpebrae superioris, which provides the majority of eyelid elevation (up to 15 mm). Muller's muscle is mainly involved in involuntary eyelid elevation and works in conjunction with the sympathetic nervous system to assist the levator muscle.

Example2:
[Ques]: Which of the following is NOT associated with an increased risk of developing proliferative diabetic retinopathy?
[a]: Intraretinal hemorrhages in 4 quadrants.
[b]: Nerve fiber layer infarcts in 2 or more quadrants.
[c]: Venous beading.
[d]: Intraretinal microvascular abnormalities.
[Ans]: Nerve fiber layer infarcts in 2 or more quadrants
[Exp]: Proliferative diabetic retinopathy (PDR) is associated with severe retinal ischemia and neovascularization, driven by factors such as intraretinal hemorrhages in multiple quadrants, venous beading, and intraretinal microvascular abnormalities (IRMA). These findings indicate significant retinal hypoxia and are known precursors to PDR. On the other hand, nerve fiber layer infarcts (also known as cotton wool spots) are more indicative of less severe non-proliferative diabetic retinopathy (NPDR) rather than PDR. Cotton wool spots alone, even in multiple quadrants, do not directly increase the risk of developing PDR.

Example3:
[Ques]: Which of the following is not a treatment option for capillary hemangioma:
[a]: observation
[b]: oral steroids
[c]: surgical resection
[d]: intralesional steroid injection
[e]: silver nitrate solution
[Ans]: Silver nitrate solution
[Exp]: Capillary hemangiomas often resolve on their own, so observation is typically the first approach. If treatment is needed, oral steroids or intralesional steroid injections can help reduce the size, and surgical resection may be considered in severe cases. Silver nitrate solution is used for cauterizing small skin lesions and is not used to treat hemangiomas.

For example, if the context is not related to ophthalmology:
[Ques]: It is not related to ophthalmology.
[a]: It is not related to ophthalmology.
[Ans]: It is not related to ophthalmology.
[Exp]: It is not related to ophthalmology."""
    
    return alpaca_format_str

In [3]:
GPT_system_prompt = \
"""Please read the text extracted from the ophthalmology textbook thoroughly and create multiple-choice questions in English, ensuring that each question relies strictly on the content provided. Make certain that all questions range from basic to advanced ophthalmology topics without including any identifying information, such as author names or page numbers. You may include some questions with only two answer choices (e.g., True/False) and others with four or five answer choices. If the provided text is not related to ophthalmology, replace the entire question and answer choices with the placeholder sentence “It is not related to ophthalmology.”
Additionally, please provide a thorough explanation for each correct answer based strictly on the information in the text. The explanation should clarify why the correct option is accurate and, where appropriate, why the other choices are less suitable, always relying on the textbook content."""

In [4]:
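def call_openai_api(prompt):
    """Send one user prompt to gpt-4o and return the generated text.

    Retries indefinitely on rate-limit errors, waiting 10 seconds between attempts.
    """
    while True:  # retry loop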
        try:
            response = openai.ChatCompletion.create(
                model="gpt-4o",
                messages=[
                    {"role": "system", "content": GPT_system_prompt},
                    {"role": "user", "content": prompt},
                ],
                temperature=0.0,
                # max_tokens=4096,
                n=1,
                stop=None,
            )
            return response['choices'][0]['message']['content']
        except RateLimitError as e:
            print(f"Rate limit reached: {e}. Waiting for 10 seconds before retrying...")
            time.sleep(10)
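
The fixed 10-second wait above retries forever. A common alternative is exponential backoff with a retry cap. The block below is a minimal sketch of that idea, not the notebook's method: `retry_with_backoff` and all of its parameters are hypothetical names, and it wraps any zero-argument callable rather than the OpenAI client directly.

```python
import random
import time


def retry_with_backoff(func, retriable=(Exception,), max_retries=5,
                       base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call func() and retry on `retriable` errors with exponential backoff.

    All names and defaults here are illustrative assumptions, not part of
    the notebook above.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retriable:
            if attempt == max_retries:
                raise  # give up after the final attempt
            # Delay doubles on each attempt, capped at max_delay, plus up to 10% jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

To use it with the cell above, one option would be to move the `openai.ChatCompletion.create` call into a function without its own retry loop and pass that function in with `retriable=(RateLimitError,)`.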

In [5]:
import pandas as pd

corpus_file = "./Source Data/Ophthalmology Textbook Pretraining Dataset.xlsx"

corpus_df = pd.read_excel(corpus_file)
corpus_list = corpus_df["text"].values.tolist()
print(len(corpus_list))

10796


In [6]:
answer_list = []
start_idx, end_idx = 5060, 7110  # slice of the corpus processed in this run
for idx in range(start_idx, end_idx):
    ophthal_content = corpus_list[idx]
    prompt_ = Generation_prompt(ophthal_content)
    answer_ = call_openai_api(prompt_)
    answer_list.append(answer_)
    processed = idx + 1  # number of corpus entries covered so far
    if processed % 10 == 0:
        print(f"{processed} / {len(corpus_list)}")

3440 / 10796
3450 / 10796
3460 / 10796
3470 / 10796
3480 / 10796
3490 / 10796
3500 / 10796
3510 / 10796
3520 / 10796
3530 / 10796
3540 / 10796
3550 / 10796
3560 / 10796
3570 / 10796
3580 / 10796
3590 / 10796
3600 / 10796
3610 / 10796
3620 / 10796
3630 / 10796
3640 / 10796
3650 / 10796
3660 / 10796
3670 / 10796
3680 / 10796
3690 / 10796
3700 / 10796
3710 / 10796
3720 / 10796
3730 / 10796
3740 / 10796
3750 / 10796
3760 / 10796
3770 / 10796
3780 / 10796
3790 / 10796
3800 / 10796
3810 / 10796
3820 / 10796
3830 / 10796
3840 / 10796
3850 / 10796
3860 / 10796
3870 / 10796
3880 / 10796
3890 / 10796
3900 / 10796
3910 / 10796
3920 / 10796
3930 / 10796
3940 / 10796
3950 / 10796
3960 / 10796
3970 / 10796
3980 / 10796
3990 / 10796
4000 / 10796
4010 / 10796
4020 / 10796
4030 / 10796
4040 / 10796
4050 / 10796
4060 / 10796
4070 / 10796
4080 / 10796
4090 / 10796
4100 / 10796
4110 / 10796
4120 / 10796
4130 / 10796
4140 / 10796
4150 / 10796
4160 / 10796
4170 / 10796
4180 / 10796
4190 / 10796
4200 / 10796

KeyboardInterrupt: 
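
The run above keeps every answer in memory until the loop ends, so an interruption or crash puts the collected results at risk. One way to make the loop resumable is to append each answer to a checkpoint file as it arrives. The following is a minimal sketch under assumptions: the JSONL format, the function names, and the `generate` callable (standing in for `Generation_prompt` plus `call_openai_api`) are all hypothetical.

```python
import json
import os


def load_checkpoint(path):
    """Return {corpus index: answer} for results already saved in a JSONL file."""
    done = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    record = json.loads(line)
                    done[record["idx"]] = record["answer"]
    return done


def run_with_checkpoint(texts, start, end, generate, path):
    """Generate answers for texts[start:end], appending each one to `path`.

    `generate` is any callable mapping a text to an answer string. Indices
    already present in the checkpoint are skipped, so a rerun resumes.
    """
    done = load_checkpoint(path)
    with open(path, "a", encoding="utf-8") as f:
        for idx in range(start, end):
            if idx in done:
                continue
            answer = generate(texts[idx])
            f.write(json.dumps({"idx": idx, "answer": answer}, ensure_ascii=False) + "\n")
            f.flush()  # push the record out of Python's buffer before the next call
            done[idx] = answer
    return [done[i] for i in range(start, end)]
```

With this shape, rerunning the same range after an interruption only calls `generate` for the indices that are still missing from the checkpoint file.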

In [7]:
print(answer_list[0])

[Ques]: What anatomical feature allows the spread of infection from the ethmoid sinus into the orbit?
[a]: Optic canal
[b]: Emissary channels in the medial wall
[c]: Infraorbital groove
[d]: Superior rectus muscle
[Ans]: Emissary channels in the medial wall
[Exp]: The emissary channels in the medial wall of the orbit can facilitate the spread of infection from the ethmoid sinus into the orbit. This anatomical feature provides a pathway for infections to move from the sinus to the orbital area.

[Ques]: Which structure is involved in orbital trauma associated with infraorbital hypoesthesia?
[a]: Optic nerve
[b]: Infraorbital groove
[c]: Medial rectus muscle
[d]: Ophthalmic artery
[Ans]: Infraorbital groove
[Exp]: Fractures of the orbital floor may involve the infraorbital groove, which contains the infraorbital nerve. This involvement should be suspected in cases of orbital trauma associated with infraorbital hypoesthesia, as damage to the infraorbital nerve can lead to sensory loss in 

In [7]:
import pandas as pd

df = pd.DataFrame(answer_list, columns=['answer'])
df.to_excel('./Generated Ophthalmology QA/Generated Ophthalmology MCQA_3430_5060.xlsx', index=False)
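
Each saved answer is one free-text string holding several questions in the `[Ques]` / `[a]` / `[Ans]` / `[Exp]` layout requested by the prompt. If one row per question is needed later, the text can be split on those tags. This is a rough sketch, assuming every tag starts its own line and every value fits on that line; `parse_mcqs` and the dictionary keys are hypothetical names.

```python
import re

# Matches a tagged line such as "[Ques]: ..." or "[a]: ...".
TAG_LINE = re.compile(r"^\[(Ques|Ans|Exp|[a-e])\]:\s*(.*)$")


def parse_mcqs(text):
    """Split generated text into a list of question dicts."""
    questions, current = [], None
    for raw in text.splitlines():
        match = TAG_LINE.match(raw.strip())
        if not match:
            continue  # blank lines and untagged text are ignored in this sketch
        tag, value = match.groups()
        if tag == "Ques":
            current = {"question": value, "options": {}, "answer": None, "explanation": None}
            questions.append(current)
        elif current is None:
            continue  # tagged line before any [Ques]: skip it
        elif tag == "Ans":
            current["answer"] = value
        elif tag == "Exp":
            current["explanation"] = value
        else:
            current["options"][tag] = value  # option letters a-e
    return questions
```

Calling `parse_mcqs(answer_list[0])` would return one dict per question. Entries where `answer` or `explanation` is still `None` may point to output that was cut off or did not follow the expected layout.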
