# Generative AI – Homework 02: Prompt Engineering

**Student:** Mahdy Mohtari

**Course:** GenAI – Daneshkar

**Assignment Title:** Prompt Engineering

**Date:** 26/7/1404

---

### 📘 Overview

This notebook contains the implementation and analysis for **Homework 02: Prompt Engineering**.
The goal of this assignment is to explore how **the level of detail in prompts affects the quality of model outputs**, and to experiment with different prompting techniques such as **Zero-shot**, **Few-shot**, and **Structured prompting**.

Through a series of tasks, the notebook demonstrates:

* Designing prompts with varying detail levels.
* Observing and comparing output quality.
* Applying few-shot examples to guide model behavior.
* Extracting structured information (e.g., from resumes) using prompt-based methods.



### **IMPORTANT NOTE:** I USED THE FILE "resume_sample1.txt" AS INPUT. SO PUT IT NEXT TO THE .ipynb FILE.

## 0. LLM (Gemini-1.5-flash) setup using LangChain

In [4]:
!pip uninstall -y google-generativeai google-ai-generativelanguage langchain langchain-core langchain-google-genai
!pip install google-ai-generativelanguage>=0.7.0
!pip install langchain-google-genai==2.1.12
!pip install "langchain>=0.3,<0.4" "langchain-core>=0.3,<0.4"


[0mCollecting langchain-google-genai==2.1.12
  Using cached langchain_google_genai-2.1.12-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain-core>=0.3.75 (from langchain-google-genai==2.1.12)
  Downloading langchain_core-1.0.0-py3-none-any.whl.metadata (3.4 kB)
Collecting filetype<2,>=1.2 (from langchain-google-genai==2.1.12)
  Using cached filetype-1.2.0-py2.py3-none-any.whl.metadata (6.5 kB)
Downloading langchain_google_genai-2.1.12-py3-none-any.whl (50 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m50.7/50.7 kB[0m [31m2.9 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading filetype-1.2.0-py2.py3-none-any.whl (19 kB)
Downloading langchain_core-1.0.0-py3-none-any.whl (467 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m467.2/467.2 kB[0m [31m11.9 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: filetype, langchain-core, langchain-google-genai
Successfully installed filetype-1.2.0 langchain-core-1.0.0 langchain-google-genai-2.

In [7]:
!pip install google-generativeai


Collecting google-generativeai
  Using cached google_generativeai-0.8.5-py3-none-any.whl.metadata (3.9 kB)
Collecting google-ai-generativelanguage==0.6.15 (from google-generativeai)
  Downloading google_ai_generativelanguage-0.6.15-py3-none-any.whl.metadata (5.7 kB)
Downloading google_generativeai-0.8.5-py3-none-any.whl (155 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m155.4/155.4 kB[0m [31m3.9 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading google_ai_generativelanguage-0.6.15-py3-none-any.whl (1.3 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.3/1.3 MB[0m [31m27.9 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: google-ai-generativelanguage, google-generativeai
  Attempting uninstall: google-ai-generativelanguage
    Found existing installation: google-ai-generativelanguage 0.8.0
    Uninstalling google-ai-generativelanguage-0.8.0:
      Successfully uninstalled google-ai-generativelanguage-0.8.0
[31mERROR: pip's depende

### Google-AI API Key

In [18]:
import os
import getpass

if "GEMINI_API_KEY" not in os.environ:
    os.environ["GEMINI_API_KEY"] = getpass.getpass("Enter your GEMINI API key securely:")


List all models availvable in langchain.

In [19]:
import google.generativeai as genai
from langchain_google_genai import ChatGoogleGenerativeAI

# List all models available for your key
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
available_models = genai.list_models()
print([model.name for model in available_models if "generateContent" in model.supported_generation_methods])


['models/gemini-2.5-pro-preview-03-25', 'models/gemini-2.5-flash-preview-05-20', 'models/gemini-2.5-flash', 'models/gemini-2.5-flash-lite-preview-06-17', 'models/gemini-2.5-pro-preview-05-06', 'models/gemini-2.5-pro-preview-06-05', 'models/gemini-2.5-pro', 'models/gemini-2.0-flash-exp', 'models/gemini-2.0-flash', 'models/gemini-2.0-flash-001', 'models/gemini-2.0-flash-exp-image-generation', 'models/gemini-2.0-flash-lite-001', 'models/gemini-2.0-flash-lite', 'models/gemini-2.0-flash-preview-image-generation', 'models/gemini-2.0-flash-lite-preview-02-05', 'models/gemini-2.0-flash-lite-preview', 'models/gemini-2.0-pro-exp', 'models/gemini-2.0-pro-exp-02-05', 'models/gemini-exp-1206', 'models/gemini-2.0-flash-thinking-exp-01-21', 'models/gemini-2.0-flash-thinking-exp', 'models/gemini-2.0-flash-thinking-exp-1219', 'models/gemini-2.5-flash-preview-tts', 'models/gemini-2.5-pro-preview-tts', 'models/learnlm-2.0-flash-experimental', 'models/gemma-3-1b-it', 'models/gemma-3-4b-it', 'models/gemma-

### LangChain LLM with Gemini API

In [20]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage
from langchain_core.messages import AIMessage
from langchain_core.messages import SystemMessage

MODEL_NAME = "gemini-2.5-flash"

# Initialize the LangChain LLM with Gemini API
llm = ChatGoogleGenerativeAI(
    model=MODEL_NAME,
    google_api_key=os.environ["GEMINI_API_KEY"],
    temperature=0.0,
    max_output_tokens=None,
    max_retries=2,
    timeout=None
)

In [21]:
# Example: generate a response
prompt_msg_test = [
    SystemMessage(content="You are an assistant who explains AI concepts."),
    HumanMessage(content="In one sentence, explain prompt engineering.")
    # AIMessage(content="Prompt engineering is the practice of designing effective prompts for language models.")
]

response = llm.invoke(prompt_msg_test)
print(response, end='\n\n\n')
print("response is:", response.content, sep='\n')

content='' additional_kwargs={} response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []} id='run--9b516590-8f46-4d88-b44e-373e719919f2-0'


response is:



## 1. Impact of Prompt Detail on Generated Content Quality

### 1.1 Choosing a scientific topic

topic : `"Student Classroom Behavior Management Based on Computer Vision"`

### 1.2. Design 3 different prompts

#### Prompt 1 - General and without Details

In [22]:
simple_prompt_1 = "Write a paper on Student Classroom Behavior Management Based on Computer Vision"

llm.invoke(simple_prompt_1)

AIMessage(content='## Student Classroom Behavior Management Based on Computer Vision: A Comprehensive Framework and Ethical Analysis\n\n### Abstract\nEffective classroom behavior management is fundamental to fostering a conducive learning environment. Traditional methods, often reliant on subjective observation and reactive intervention, can be resource-intensive and inconsistent. This paper explores the potential of computer vision (CV) technology to revolutionize student classroom behavior management. We propose a comprehensive framework for a CV-based system that objectively monitors student engagement, attention, and disruptive behaviors in real-time. The system leverages advanced CV techniques such as gaze tracking, pose estimation, facial expression analysis, and activity recognition to provide teachers with data-driven insights and early intervention capabilities. While highlighting the significant benefits for personalized learning, teacher support, and improved learning outcom

#### Prompt 2 - Determining the Structure of the Paper

In [23]:
structured_prompt_2 = """Write a paper on Student Classroom Behavior Managment Based on Computer Vision.
The paper should include a clear title, abstract, introduction, main body, and conclusion, presenting arguments logically with evidence, analysis, and proper academic tone;
it must also address counterarguments, cite reliable sources, and end with a concise summary and references.
"""

llm.invoke(structured_prompt_2)

AIMessage(content='## Leveraging Computer Vision for Enhanced Student Classroom Behavior Management: Opportunities, Challenges, and Ethical Considerations\n\n**Abstract:**\nEffective classroom behavior management is fundamental to fostering conducive learning environments and optimizing educational outcomes. Traditional methods, often reliant on subjective observation and reactive interventions, can be resource-intensive and inconsistent. This paper explores the burgeoning potential of computer vision (CV) technologies to revolutionize student behavior management by providing objective, real-time data and insights. We delve into the specific CV techniques applicable to classroom settings, such as pose estimation, facial expression analysis, and activity recognition, highlighting their capacity to identify patterns of engagement, distraction, and disruptive behavior. Furthermore, the paper meticulously examines the significant benefits, including early intervention, personalized support

#### Prompt 3 - Determining the exact Content and Format of each Section

In [24]:
structured_formated_prompt_3 = """Write a detailed academic paper on Student Classroom Behavior Management Based on Computer Vision.
The paper should include the following sections with specific content guidelines:
- **Title:** Provide a clear, precise, and professional title that reflects the research focus.
- **Abstract:** Summarize the purpose, approach, main findings, and implications of using computer vision for managing student behavior.
- **Introduction:** Explain the importance of classroom behavior management, the challenges faced by educators, and introduce how computer vision technologies can provide effective monitoring and analysis solutions.
- **Methods:** Describe how the Quantum Evolutionary Algorithm (QEA) is integrated with the DenseNet-121 model to enhance student behavior detection. Explain briefly how QEA optimizes parameters or features for better accuracy, while DenseNet-121 performs deep feature extraction from classroom video data. Mention the overall workflow—data collection, preprocessing, and ethical handling of student information.
- **Results and Discussion:** Summarize expected outcomes such as improved accuracy and engagement analysis, and briefly discuss key limitations like privacy, data bias, and computational cost.
- **Conclusion:** Summarize key insights, restate the importance of AI-driven behavioral analysis, and suggest directions for future research or ethical guidelines.
- **References:** Include citations from relevant studies, papers, and datasets related to computer vision in education and behavioral analysis.

Ensure academic tone, logical flow, and clarity throughout the paper, providing evidence, reasoning, and real-world relevance in each section.
"""

llm.invoke(structured_formated_prompt_3)


AIMessage(content='## Student Classroom Behavior Management Based on Computer Vision: An Enhanced Approach Using Quantum Evolutionary Algorithm and DenseNet-121\n\n### Abstract\n\nEffective classroom behavior management is paramount for fostering conducive learning environments, yet educators frequently face challenges in objectively and consistently monitoring student conduct. This paper proposes an advanced computer vision (CV) system for automated student classroom behavior management, leveraging the synergistic integration of the Quantum Evolutionary Algorithm (QEA) with the DenseNet-121 deep learning model. The primary purpose is to develop a robust, real-time, and objective solution for detecting and analyzing various student behaviors, ranging from engagement and attentiveness to distraction and potential disruption. Our approach utilizes DenseNet-121 for efficient and deep feature extraction from classroom video data, capturing subtle visual cues related to posture, facial expr

### 1.3. Compare and Analyze Results

As we can see from the above invokes, the generated texts of the more complete prompts came more **detailed** and more <b>compatible</b> with the thing we actually wanted. Hence the Quality of the generated text (Here Paper) increased.

## 2. Effects of Few-Shot Prompts in Writing Imitation

### 2.1. Topic selection & Writing style

Topic: `"**Summarize** the paper. **Title** is: Student Classroom Behavior Management Based on Computer Vision Using Quantum Evolutionary Algorithm with DenseNet 121 Model. **Abstract** is: Student recognition and the evaluation of classroom education have both grown to depend heavily on the student classroom behavior management in recent years. The method used to assess and analyze student behavior in the classroom establishes whether the student's giving attention or not. However, because of the complexity of classroom conduct, it was difficult to identify intelligent students. Therefore, in this research Quantum Evolutionary Algorithm with DenseNet 121 (QEA-DenseNet 121) is suggested by studying computer vision of student behavior in the classroom. Usually, the recommended system design is used to evaluate the system's testing and training procedures. The final input images are then subjected to human location estimate using a camera to capture subsequent frames. The error correcting system is integrated with the body position estimate and person recognition algorithms. Lastly, a model known as QEA-DenseNet-121 is recommended as a practical resource for precisely evaluating student behavior in the classroom. Results showed that the suggested approach outperformed the existing models such as Skeleton Pose Estimation (SPE), YOLO-v4 and Intelligent Real-Time Vision (IRTV) with relative gains in Average Accuracy of 99.64%, Precision of 99.53%, Recall of99.71 %, and F1-measure of 99.49%."`

Writing style: `"Write in a clear, practical tone accessible to professionals who know the field but aren’t dominant experts, avoiding jargon and emphasizing key ideas with real-world relevance."`

It is something between **professional** and **practical** and **easy to understand**.

### 2.2. Run prompt without few-shot

In [25]:
from langchain.prompts import PromptTemplate

In [26]:
zero_shot_prompt_template = """"
Summarize** the paper.
{paper}
Write in a clear, practical tone accessible to professionals who know the field but aren’t dominant experts, avoiding jargon and emphasizing key ideas with real-world relevance.
"""

paper_text = "**Title** is: Student Classroom Behavior Management Based on Computer Vision Using Quantum Evolutionary Algorithm with DenseNet 121 Model. **Abstract** is: Student recognition and the evaluation of classroom education have both grown to depend heavily on the student classroom behavior management in recent years. The method used to assess and analyze student behavior in the classroom establishes whether the student's giving attention or not. However, because of the complexity of classroom conduct, it was difficult to identify intelligent students. Therefore, in this research Quantum Evolutionary Algorithm with DenseNet 121 (QEA-DenseNet 121) is suggested by studying computer vision of student behavior in the classroom. Usually, the recommended system design is used to evaluate the system's testing and training procedures. The final input images are then subjected to human location estimate using a camera to capture subsequent frames. The error correcting system is integrated with the body position estimate and person recognition algorithms. Lastly, a model known as QEA-DenseNet-121 is recommended as a practical resource for precisely evaluating student behavior in the classroom. Results showed that the suggested approach outperformed the existing models such as Skeleton Pose Estimation (SPE), YOLO-v4 and Intelligent Real-Time Vision (IRTV) with relative gains in Average Accuracy of 99.64%, Precision of 99.53%, Recall of99.71 %, and F1-measure of 99.49%."

zero_shot_prompt_template = PromptTemplate(template=zero_shot_prompt_template, input_variables=["paper"])

zero_shot_prompt = zero_shot_prompt_template.format(paper=paper_text)

print(zero_shot_prompt)

"
Summarize** the paper. 
**Title** is: Student Classroom Behavior Management Based on Computer Vision Using Quantum Evolutionary Algorithm with DenseNet 121 Model. **Abstract** is: Student recognition and the evaluation of classroom education have both grown to depend heavily on the student classroom behavior management in recent years. The method used to assess and analyze student behavior in the classroom establishes whether the student's giving attention or not. However, because of the complexity of classroom conduct, it was difficult to identify intelligent students. Therefore, in this research Quantum Evolutionary Algorithm with DenseNet 121 (QEA-DenseNet 121) is suggested by studying computer vision of student behavior in the classroom. Usually, the recommended system design is used to evaluate the system's testing and training procedures. The final input images are then subjected to human location estimate using a camera to capture subsequent frames. The error correcting system

In [27]:
llm.invoke(zero_shot_prompt)

AIMessage(content='This paper introduces a new computer vision system, QEA-DenseNet 121 (Quantum Evolutionary Algorithm with DenseNet 121), designed to automatically analyze and manage student behavior in the classroom.\n\n**The core problem it addresses** is the increasing need for accurate student recognition and engagement assessment in educational settings. Traditional methods struggle to consistently identify whether students are attentive or disengaged due to the complex and dynamic nature of classroom interactions.\n\n**The proposed solution** leverages advanced artificial intelligence and camera technology. The QEA-DenseNet 121 system works by:\n1.  Capturing video frames from a classroom camera.\n2.  Estimating human location and body posture within those frames.\n3.  Performing person recognition to identify individual students.\n4.  Integrating an error correction system to refine these observations.\n\n**The practical goal** is to provide educators and administrators with a

### 2.3. Add some exmpales

In [28]:
few_shot_prompt_template = """"
Summarize** the paper.
{paper}
Write in a clear, practical tone accessible to professionals who know the field but aren’t dominant experts, avoiding jargon and emphasizing key ideas with real-world relevance.
{exp1}
{exp2}
{exp3}
"""

paper_text = "**Title** is: Student Classroom Behavior Management Based on Computer Vision Using Quantum Evolutionary Algorithm with DenseNet 121 Model. **Abstract** is: Student recognition and the evaluation of classroom education have both grown to depend heavily on the student classroom behavior management in recent years. The method used to assess and analyze student behavior in the classroom establishes whether the student's giving attention or not. However, because of the complexity of classroom conduct, it was difficult to identify intelligent students. Therefore, in this research Quantum Evolutionary Algorithm with DenseNet 121 (QEA-DenseNet 121) is suggested by studying computer vision of student behavior in the classroom. Usually, the recommended system design is used to evaluate the system's testing and training procedures. The final input images are then subjected to human location estimate using a camera to capture subsequent frames. The error correcting system is integrated with the body position estimate and person recognition algorithms. Lastly, a model known as QEA-DenseNet-121 is recommended as a practical resource for precisely evaluating student behavior in the classroom. Results showed that the suggested approach outperformed the existing models such as Skeleton Pose Estimation (SPE), YOLO-v4 and Intelligent Real-Time Vision (IRTV) with relative gains in Average Accuracy of 99.64%, Precision of 99.53%, Recall of99.71 %, and F1-measure of 99.49%."
exp1_text = """
input -> Title is: Deep Learning for Crop Disease Detection. Abstract is: Crop diseases cause significant losses in agriculture. This paper explores a convolutional neural network (CNN)-based approach for detecting crop leaf diseases using image datasets. The proposed model achieved 97% accuracy and showed improved generalization compared to traditional image processing methods.
output ->  The study presents a CNN-based system for identifying crop leaf diseases from images, achieving 97% accuracy. It outperforms older image processing methods by automatically learning visual disease patterns, offering a practical solution for precision agriculture.
"""
exp2_text = """
input -> Title is: Deep Learning for Crop Disease Detection. Abstract is: Crop diseases cause significant losses in agriculture. This paper explores a convolutional neural network (CNN)-based approach for detecting crop leaf diseases using image datasets. The proposed model achieved 97% accuracy and showed improved generalization compared to traditional image processing methods.
output ->  The research introduces an RNN-based traffic prediction model that better captures time dependencies in data, outperforming traditional models like ARIMA and SVM. It demonstrates strong potential for real-time smart city traffic management."""
exp3_text = """
input -> Title is: Sentiment Analysis of Customer Reviews Using Transformer Models. Abstract is: This paper evaluates transformer-based models for analyzing sentiment in online reviews. Compared to conventional RNN and CNN approaches, the transformer model achieves superior accuracy and better handles contextual language variations in text.
output -> This study applies transformer models for customer review sentiment analysis, showing higher accuracy and improved context understanding over older deep learning methods, making it effective for modern NLP applications.
"""

few_shot_prompt_template = PromptTemplate(template=few_shot_prompt_template, input_variables=["paper", 'exp1', 'exp2', 'exp3'])

zero_shot_prompt = few_shot_prompt_template.format(paper=paper_text, exp1=exp1_text, exp2=exp2_text, exp3=exp3_text)

print(zero_shot_prompt)

"
Summarize** the paper. 
**Title** is: Student Classroom Behavior Management Based on Computer Vision Using Quantum Evolutionary Algorithm with DenseNet 121 Model. **Abstract** is: Student recognition and the evaluation of classroom education have both grown to depend heavily on the student classroom behavior management in recent years. The method used to assess and analyze student behavior in the classroom establishes whether the student's giving attention or not. However, because of the complexity of classroom conduct, it was difficult to identify intelligent students. Therefore, in this research Quantum Evolutionary Algorithm with DenseNet 121 (QEA-DenseNet 121) is suggested by studying computer vision of student behavior in the classroom. Usually, the recommended system design is used to evaluate the system's testing and training procedures. The final input images are then subjected to human location estimate using a camera to capture subsequent frames. The error correcting system

In [29]:
llm.invoke(zero_shot_prompt)

AIMessage(content='This research tackles the complex task of accurately assessing student behavior and attention in classrooms, which is vital for effective education. It introduces a new computer vision system, called QEA-DenseNet 121, that uses cameras to analyze student actions. This system identifies students, estimates their body positions, and integrates an error correction mechanism to precisely evaluate their behavior. The model significantly outperforms existing methods like SPE, YOLO-v4, and IRTV, achieving very high accuracy (around 99.6%) in assessing student engagement. This offers a practical and highly effective tool for educators to understand and manage classroom dynamics.', additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []}, id='run--623f6534-01f8-4544-bd0b-e714580dfc35-0')

In [30]:
# I used stream for the sake of just using it for once

for chunk in llm.stream("write 3 random words"):
    print(chunk, end="\n", flush=True)

content='Here are 3 random words:\n\n1.  **Cloud**\n2.  **Chair**\n3.  **Banana**' additional_kwargs={} response_metadata={'finish_reason': 'STOP', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []} id='run--564ed9e6-7fde-48b7-a6ac-ddf834dbcd11'


### 2.4. Compare & Analyze results

#### After adding the samples, did the model better adhere to the selected sampling style?

As we can see from the results the one with the few-shots (some examples) works better than the zero-shot one.

#### Does it still need improvement? If yes, how many more samples can be added?

If we add too much exmpales it may go in another direction that is not so desirable for us. It may overshadow the actual instrcution that we gave the prompt when we use more examples than needed.

## 3. The impact of few-shot prompting on generating structured outputs

### 3.1. Define the Required Info

### Information to Include

- **Contact Information:** Name, Email, Phone Number  
- **Education:** Degree, Major, University, Year of Graduation  
- **Work Experience:** Job Title, Company, Duration, Main Responsibilities  
- **Skills:** Technical and Soft Skills  
- **Certificates and Training Courses**


### 3.2 Prompt to Extract Information

In [31]:
from langchain.prompts import PromptTemplate

zero_shot_ie_template = """
You are an information extraction system. Study the following resume and Extract the requested fields from the résumé below.

RESUMÉ:
{resume_text}

TASK:
- Extract the following information:
  - Contact Information: name, email, phone
  - Education: degree, major, university, graduation year
  - Work Experience: job title, company, duration (or start/end dates), main responsibilities
  - Skills: technical skills, soft skills
  - Certificates and training courses

OUTPUT REQUIREMENTS:
- Return **JSON only**, no extra text.
- Use the following structure and keys exactly.
- If something is missing, return an empty string "" or empty list [].
- Keep arrays even if there is only one item.

"""

zero_shot_ie_prompt = PromptTemplate(template=zero_shot_ie_template, input_variables=["resume_text"])


In [33]:
file_name1 = '/content/resume_sample1.txt'
# file_name2 = '/content/resume_sample2.txt'

from google.colab import files

uploaded = files.upload()

with open(file_name1) as file:
    resume_text = file.read()

formatted_ie_prompt = zero_shot_ie_prompt.format(resume_text=resume_text)
print(formatted_ie_prompt)

Saving resume_sample1.txt to resume_sample1.txt

You are an information extraction system. Study the following persian resume and Extract the requested fields from the résumé below in english.

RESUMÉ:
Full Name: Sara Moradi  
Email: sara.moradi@example.com  
Phone: +98 912 123 4567  

🎓 Education:  
- MSc in Data Science, University of Tehran (2021)  
- BSc in Computer Engineering, Sharif University (2018)  

💼 Work Experience:  
- Data Scientist at Snapp (2022-Present)  
  • Developed machine learning models for customer segmentation  
  • Optimized recommendation systems for better user experience  
- AI Researcher at Digikala (2019-2022)  
  • Conducted NLP research for automatic product categorization  

🔧 Skills: Python, TensorFlow, PyTorch, SQL, Data Visualization  

📜 Certifications:  
- Google Data Analytics Professional Certificate  
- Deep Learning Specialization by Andrew Ng (Coursera)  


TASK:
- Extract the following information:
  - Contact Information: name, email, phon

In [37]:
import json

response = llm.invoke(formatted_ie_prompt)

response_data = response.content

with open('zero_shot_information_extractor.json', 'w') as json_file:
    json.dump(response_data, json_file, indent=4)

print("The extracted information has been saved to 'zero_shot_information_extractor.json'")
print(response_data)


The extracted information has been saved to 'zero_shot_information_extractor.json'
```json
{
  "contact_information": {
    "name": "Sara Moradi",
    "email": "sara.moradi@example.com",
    "phone": "+98 912 123 4567"
  },
  "education": [
    {
      "degree": "MSc",
      "major": "Data Science",
      "university": "University of Tehran",
      "graduation_year": "2021"
    },
    {
      "degree": "BSc",
      "major": "Computer Engineering",
      "university": "Sharif University",
      "graduation_year": "2018"
    }
  ],
  "work_experience": [
    {
      "job_title": "Data Scientist",
      "company": "Snapp",
      "duration": "2022-Present",
      "start_date": "2022",
      "end_date": "Present",
      "responsibilities": [
        "Developed machine learning models for customer segmentation",
        "Optimized recommendation systems for better user experience"
      ]
    },
    {
      "job_title": "AI Researcher",
      "company": "Digikala",
      "duration": "2019-20

### 3.3. Determining the Exact Json structure (few-shot)

In [52]:
from langchain.prompts import PromptTemplate

few_shot_ie_template = """
You are an information extraction system. Study the following examples and then extract data from the given résumé.

---

### EXAMPLE 1
RESUMÉ:
Full Name: Reza Jafari
Email: reza.jafari@example.com
Phone: +98 912 654 3210

🎓 Education:
- MSc in Artificial Intelligence, Amirkabir University (2022)
- BSc in Computer Engineering, University of Isfahan (2018)

💼 Work Experience:
- Machine Learning Engineer at Turing Technologies (2022-Present)
  • Developed machine learning models for predictive analysis
  • Worked on deploying models into production environments
- Software Developer at Parsian Co. (2018-2022)
  • Built and optimized web applications and APIs

🔧 Skills: Python, Scikit-learn, TensorFlow, Flask, SQL

📜 Certifications:
- TensorFlow Developer Certificate
- Machine Learning by Stanford University (Coursera)

OUTPUT:
{{
  "contact": {{
    "name": "Reza Jafari",
    "email": "reza.jafari@example.com",
    "phone": "+98 912 654 3210"
  }} ,
  "education": [
    {{
      "degree": "MSc",
      "major": "Artificial Intelligence",
      "university": "Amirkabir University",
      "graduation_year": "2022"
    }},
    {{
      "degree": "BSc",
      "major": "Computer Engineering",
      "university": "University of Isfahan",
      "graduation_year": "2018"
    }}
  ],
  "experience": [
    {{
      "job_title": "Machine Learning Engineer",
      "company": "Turing Technologies",
      "duration": "2022–Present",
      "start_date": "",
      "end_date": "",
      "main_responsibilities": [
        "Developed machine learning models for predictive analysis",
        "Worked on deploying models into production environments"
      ]
    }},
    {{
      "job_title": "Software Developer",
      "company": "Parsian Co.",
      "duration": "2018–2022",
      "start_date": "",
      "end_date": "",
      "main_responsibilities": [
        "Built and optimized web applications and APIs"
      ]
    }}
  ],
  "skills": {{
    "technical": ["Python", "Scikit-learn", "TensorFlow", "Flask", "SQL"],
    "soft": []
  }},
  "certificates": [
    "TensorFlow Developer Certificate",
    "Machine Learning by Stanford University (Coursera)"
  ]
}}

---

### EXAMPLE 2
RESUMÉ:
Full Name: Ali Rezaei
Email: ali.rezaei@example.com
Phone: +98 912 987 6543

🎓 Education:
- PhD in Computer Science, Amirkabir University of Technology (2020)
- MSc in Software Engineering, University of Tehran (2016)

💼 Work Experience:
- Senior Software Engineer at Aparat (2020-Present)
  • Led the development of the video streaming platform
  • Integrated advanced search algorithms for better content discovery
- Software Developer at Tapsi (2016-2020)
  • Developed backend services for mobile applications

🔧 Skills: Java, Spring Boot, Hibernate, SQL, AWS

📜 Certifications:
- AWS Certified Solutions Architect
- Java Programming Certificate (Oracle)

OUTPUT:
{{
  "contact": {{
    "name": "Ali Rezaei",
    "email": "ali.rezaei@example.com",
    "phone": "+98 912 987 6543"
  }} ,
  "education": [
    {{
      "degree": "PhD",
      "major": "Computer Science",
      "university": "Amirkabir University of Technology",
      "graduation_year": "2020"
    }},
    {{
      "degree": "MSc",
      "major": "Software Engineering",
      "university": "University of Tehran",
      "graduation_year": "2016"
    }}
  ],
  "experience": [
    {{
      "job_title": "Senior Software Engineer",
      "company": "Aparat",
      "duration": "2020–Present",
      "start_date": "",
      "end_date": "",
      "main_responsibilities": [
        "Led the development of the video streaming platform",
        "Integrated advanced search algorithms for better content discovery"
      ]
    }},
    {{
      "job_title": "Software Developer",
      "company": "Tapsi",
      "duration": "2016–2020",
      "start_date": "",
      "end_date": "",
      "main_responsibilities": [
        "Developed backend services for mobile applications"
      ]
    }}
  ],
  "skills": {{
    "technical": ["Java", "Spring Boot", "Hibernate", "SQL", "AWS"],
    "soft": []
  }},
  "certificates": [
    "AWS Certified Solutions Architect",
    "Java Programming Certificate (Oracle)"
  ]
}}

---

### NOW YOUR TASK
Extract the same type of information from the résumé below.

RESUMÉ:
{resume_text}

OUTPUT REQUIREMENTS:
- Return **JSON only**, no extra text.
- Use the exact same key names and structure as in the examples.
- If a field is missing, use empty strings "" or empty lists [].
"""

few_shot_ie_prompt = PromptTemplate(
    template=few_shot_ie_template,
    input_variables=["resume_text"]
)

In [53]:
file_name1 = '/content/resume_sample1.txt'
# file_name2 = '/content/resume_sample2.txt'

with open(file_name1) as file:
    resume_text = file.read()

formatted_few_ie_prompt = few_shot_ie_prompt.format(resume_text=resume_text)

print(formatted_few_ie_prompt)


You are an information extraction system. Study the following examples and then extract data from the given résumé.

---

### EXAMPLE 1
RESUMÉ:
Full Name: Reza Jafari  
Email: reza.jafari@example.com  
Phone: +98 912 654 3210  

🎓 Education:  
- MSc in Artificial Intelligence, Amirkabir University (2022)  
- BSc in Computer Engineering, University of Isfahan (2018)  

💼 Work Experience:  
- Machine Learning Engineer at Turing Technologies (2022-Present)  
  • Developed machine learning models for predictive analysis  
  • Worked on deploying models into production environments  
- Software Developer at Parsian Co. (2018-2022)  
  • Built and optimized web applications and APIs  

🔧 Skills: Python, Scikit-learn, TensorFlow, Flask, SQL  

📜 Certifications:  
- TensorFlow Developer Certificate  
- Machine Learning by Stanford University (Coursera)  

OUTPUT:
{
  "contact": {
    "name": "Reza Jafari",
    "email": "reza.jafari@example.com",
    "phone": "+98 912 654 3210"
  } ,
  "educat

In [54]:
import json

response = llm.invoke(formatted_few_ie_prompt)

response_data = response.content

with open('few_shot_information_extractor.json', 'w') as json_file:
    json.dump(response_data, json_file, indent=4)

print("The extracted information has been saved to 'few_shot_information_extractor.json'")
print(response_data)

The extracted information has been saved to 'few_shot_information_extractor.json'
```json
{
  "contact": {
    "name": "Sara Moradi",
    "email": "sara.moradi@example.com",
    "phone": "+98 912 123 4567"
  } ,
  "education": [
    {
      "degree": "MSc",
      "major": "Data Science",
      "university": "University of Tehran",
      "graduation_year": "2021"
    },
    {
      "degree": "BSc",
      "major": "Computer Engineering",
      "university": "Sharif University",
      "graduation_year": "2018"
    }
  ],
  "experience": [
    {
      "job_title": "Data Scientist",
      "company": "Snapp",
      "duration": "2022–Present",
      "start_date": "",
      "end_date": "",
      "main_responsibilities": [
        "Developed machine learning models for customer segmentation",
        "Optimized recommendation systems for better user experience"
      ]
    },
    {
      "job_title": "AI Researcher",
      "company": "Digikala",
      "duration": "2019–2022",
      "start_date"