# DATASCI 350: Data Science Computing

## Assignment 06 - AI & Prompt Engineering

### Instructions

In this assignment, you will explore prompt engineering techniques using Large Language Models (LLMs). You can use any LLM of your choice (ChatGPT, Claude, Gemini, etc.) unless a specific tool is mentioned.

For each question:
- Document your prompts and the model's responses
- Include screenshots where helpful
- Clearly label your answers with the question number

As always, should you have any questions, please let me know.

### Tasks

**Question 1: Tokenisation Analysis**

Using OpenAI's tokeniser (https://platform.openai.com/tokenizer), compare token counts for the following three phrases:
- English: "Hello, world!"
- Chinese: "你好，世界！"
- Arabic: "مرحبا بالعالم!"

a) Report the token count for each phrase.

b) Explain the implications for API costs when working with multilingual data.
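
As a starting point for part b), the sketch below compares character counts with UTF-8 byte counts for the three phrases and converts a token count into a cost. It is not a tokeniser, and byte counts are only a rough hint at why token counts may differ between scripts; the real counts must come from the tokeniser tool. The helper names `text_stats` and `estimate_cost`, the token volume, and the price are all assumptions made for illustration.

```python
# Compare character and UTF-8 byte counts for the three phrases.
# This is NOT a tokeniser: take the real token counts from the tokeniser tool.
phrases = {
    "English": "Hello, world!",
    "Chinese": "你好，世界！",
    "Arabic": "مرحبا بالعالم!",
}


def text_stats(text: str) -> tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))


def estimate_cost(token_count: int, price_per_million_tokens: float) -> float:
    """Cost in the currency of the price; the price is a hypothetical input."""
    return token_count / 1_000_000 * price_per_million_tokens


for language, text in phrases.items():
    chars, n_bytes = text_stats(text)
    print(f"{language}: {chars} characters, {n_bytes} UTF-8 bytes")

# Hypothetical figures: 2,000,000 tokens at 0.50 per million tokens.
print(estimate_cost(2_000_000, 0.50))  # 1.0
```

Replace the hypothetical figures with the token counts you measured and the published price of the model you are using.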

*Your answer here*

**Question 2: Embedding Analogies**

The lecture shows the famous example "king − man + woman ≈ queen" as vector arithmetic on embeddings.

a) Explain why this works based on how embeddings capture semantic relationships.

b) Propose two other analogies that might work (e.g., country:capital, verb:past_tense) and explain your reasoning.
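
To experiment with the idea, here is a minimal sketch using hand-built two-dimensional vectors whose axes stand for (royalty, gender). These toy vectors and the `analogy` helper are assumptions for illustration only: real embeddings are learned from data, have many more dimensions, and satisfy such analogies only approximately.

```python
import math

# Toy 2-D "embeddings" built by hand: (royalty, gender).
# Real embeddings are learned and have hundreds of dimensions.
vectors = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "prince": (0.5, 1.0),
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = tuple(
        x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("king", "man", "woman"))  # queen
```

The input words are excluded from the candidates because the nearest vector to the result is often one of the inputs themselves.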

*Your answer here*

**Question 3: PTCF Framework**

Rewrite this vague prompt using the PTCF framework (Persona, Task, Context, Format) for two different personas:

**Original prompt:** "Tell me about climate change"

**Persona 1:** A climate scientist briefing policymakers  
**Persona 2:** A secondary school teacher creating lesson materials

For each rewrite, clearly label the Persona, Task, Context, and Format components.
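
If you want to keep the four components explicit while you iterate, one option is a small template helper such as the sketch below. The function name `build_ptcf_prompt` and the labelled-line layout are assumptions, not a required format, and the content is a placeholder on an unrelated topic.

```python
def build_ptcf_prompt(persona: str, task: str, context: str, fmt: str) -> str:
    """Assemble a prompt with one labelled line per PTCF component."""
    return "\n".join(
        [
            f"Persona: {persona}",
            f"Task: {task}",
            f"Context: {context}",
            f"Format: {fmt}",
        ]
    )


# Placeholder content on an unrelated topic, for illustration only.
prompt = build_ptcf_prompt(
    persona="You are a nutritionist advising a university canteen.",
    task="Explain the benefits of whole grains.",
    context="The audience is first-year students with no science background.",
    fmt="Five bullet points, each under 20 words.",
)
print(prompt)
```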

*Your answer here*

**Question 4: Format Specification**

Test extracting information from the following text:

> "Tesla Q4 2024 reports $25.7B revenue, beating estimates by 2%. EPS missed by 6%. Vehicle deliveries reached 495,000 units."

a) Write a prompt that outputs JSON with a specified schema (define your own schema with at least 4 fields).

b) Write a prompt that outputs a Markdown table with the same information.

c) Compare the results: which format is more consistent for automated processing?
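
One way to judge consistency is to check each reply programmatically. The sketch below parses a reply as JSON and reports missing fields or wrong types. The schema, the `validate_reply` helper, and the sample replies are hypothetical and deliberately unrelated to the text above; swap in your own schema and the replies you collected.

```python
import json

# Hypothetical schema for an unrelated example: field name -> expected type.
SCHEMA = {"title": str, "author": str, "year": int, "pages": int}


def validate_reply(reply_text: str, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the reply fits the schema."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    for field, expected_type in schema.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif type(data[field]) is not expected_type:  # strict: True is not an int
            problems.append(f"wrong type for {field}")
    return problems


# Made-up replies: one clean, one wrapped in extra chatter.
good = '{"title": "Example Book", "author": "A. Writer", "year": 2020, "pages": 300}'
bad = 'Sure! Here is the JSON: {"title": "Example Book"}'

print(validate_reply(good, SCHEMA))  # []
print(validate_reply(bad, SCHEMA))
```

Running the same prompt several times and counting how many replies pass gives you a simple consistency measure for part a); a Markdown table would need its own parser before a comparable check is possible.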

*Your answer here*

**Question 5: Temperature Experiment**

Using Google AI Studio (https://ai.google.dev/aistudio) or another tool that allows temperature control, run this prompt at temperatures 0.0, 0.5, and 1.0:

> "Write a two-paragraph opening for a mystery novel set in Tokyo."

a) Document the outputs at each temperature setting.

b) For what types of tasks would you use each temperature setting? Give one example for each.
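
To build intuition for what the setting does, the sketch below applies the standard temperature-scaled softmax to a set of logits. The logits are hypothetical, and treating temperature 0 as a pure argmax is an assumption of this sketch; hosted models may handle that case differently and may combine temperature with other sampling controls.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities; temperature 0 is treated as greedy argmax."""
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
for t in (0.0, 0.5, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures spread it across the alternatives, which is why outputs become more varied.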

*Your answer here*

**Question 6: Meta-Prompting**

You want to classify customer support tickets into four categories: Billing, Technical, Account, and Other.

Write an initial prompt for this task, then ask the LLM: "What's unclear about this prompt? How would you improve it?"

a) Document your original prompt and the LLM's suggestions.

b) Write a revised prompt incorporating the feedback.

*Your answer here*

**Question 7: Meta-Prompt Creation**

Create your own meta-prompt to help you design better prompts. Your meta-prompt should ask the LLM to help you craft an effective prompt for a specific task.

a) Write a meta-prompt that asks the LLM something like: "I want to [accomplish X task]. What questions should I ask you to get the best results? What information should I provide in my prompt?"

Choose one of these tasks:
- Summarising research papers
- Generating code documentation
- Analysing customer feedback

b) Test the prompt that the LLM helps you create and document whether it produces better results than a naïve first attempt.

*Your answer here*

**Question 8: Scientific Literature and Hallucination**

Ask an LLM to provide 3 peer-reviewed scientific articles about the following topic:

> "The effects of sleep deprivation on cognitive performance in adults."

Request that the LLM provide for each article: the authors, title, journal name, year of publication, and DOI.

a) Document the LLM's response and verify whether each citation is real by checking Google Scholar or following the DOI link.

b) What percentage of citations were accurate? How would you design a prompt to reduce hallucinated citations?
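
The sketch below may help with the bookkeeping. It checks whether a string has the common modern DOI shape, builds the `https://doi.org/` link to open in a browser, and computes the accuracy percentage. The pattern tests form only, so a well-formed DOI can still be invented; the DOIs and the verification results shown are made-up placeholders, and the helper names are assumptions.

```python
import re

# Common modern DOI shape: "10.", a 4-9 digit registrant code, "/", a suffix.
# This checks form only; a well-formed DOI can still be hallucinated.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_url(doi: str) -> str:
    """Build the resolver link to open in a browser."""
    return f"https://doi.org/{doi}"


def accuracy_percent(verified: list[bool]) -> float:
    """Share of citations confirmed as real, as a percentage."""
    return 100 * sum(verified) / len(verified)


# Made-up placeholder DOIs, not real articles.
for doi in ["10.1234/placeholder.001", "doi-missing-prefix"]:
    looks_valid = bool(DOI_PATTERN.match(doi))
    print(doi, looks_valid, doi_url(doi) if looks_valid else "-")

# Hypothetical manual check: two of three citations confirmed.
print(round(accuracy_percent([True, True, False]), 1))  # 66.7
```

A citation counts as accurate only if the link resolves and the authors, title, journal, and year all match what the LLM reported.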

*Your answer here*

**Question 9: Few-Shot Example Selection**

The lecture states that examples should represent "the hardest cases, not average cases." Test this claim.

Compare two few-shot prompts for sentiment classification. Test both on these 3 challenging sentences:
- "Not as bad as I expected, actually."
- "Would have been great if not for the noise."
- "It's okay I guess."

**Prompt A (Naïve examples):**
```
Classify the sentiment as Positive, Negative, or Neutral.
Examples:
- "I love it!" → Positive
- "I hate it!" → Negative
```

**Prompt B (Edge-case examples):**
```
Classify the sentiment as Positive, Negative, or Neutral.
Examples:
- "Could have been worse" → Negative
- "Decent but overpriced" → Negative
- "Nothing special but gets the job done" → Neutral
```

a) Document results from both prompts on all 3 test sentences.

b) Which prompt performed better on ambiguous cases? Why?
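
To keep the comparison fair, each test sentence should be appended to both prompts in exactly the same way. The sketch below generates all six prompt and sentence combinations from the examples above. The `build_prompt` helper and the closing "Now classify:" line are assumptions added for illustration; they are not part of Prompt A or Prompt B as given.

```python
INSTRUCTION = "Classify the sentiment as Positive, Negative, or Neutral."

PROMPT_A_EXAMPLES = [("I love it!", "Positive"), ("I hate it!", "Negative")]
PROMPT_B_EXAMPLES = [
    ("Could have been worse", "Negative"),
    ("Decent but overpriced", "Negative"),
    ("Nothing special but gets the job done", "Neutral"),
]
TEST_SENTENCES = [
    "Not as bad as I expected, actually.",
    "Would have been great if not for the noise.",
    "It's okay I guess.",
]


def build_prompt(examples, sentence):
    """Assemble the instruction, the few-shot examples, and the sentence to classify."""
    lines = [INSTRUCTION, "Examples:"]
    lines += [f'- "{text}" → {label}' for text, label in examples]
    lines.append(f'Now classify: "{sentence}"')  # assumed wording, not in the original
    return "\n".join(lines)


# Print every combination; paste each into the LLM and record the returned label.
for name, examples in [("A", PROMPT_A_EXAMPLES), ("B", PROMPT_B_EXAMPLES)]:
    for sentence in TEST_SENTENCES:
        print(f"--- Prompt {name} ---")
        print(build_prompt(examples, sentence))
```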

*Your answer here*

**Question 10: System Prompt Design**

Design a system prompt for an AI assistant that helps students check their essays for grammar and clarity. The assistant should:
- Be encouraging but honest
- Point out specific issues with examples
- Never write content for the student

a) Write the system prompt.

b) Test it with this sample essay excerpt and document the response:

> "Their are many reasons why climate change is important. First of all its affecting everyone. The weather is getting more worser every year and scientist say we need to act now. In conclusion, we all need to do more."

*Your answer here*