<img src="../../docs/docs/static/img/dspy_logo.png" alt="DSPy7 Image" height="150"/>

# DSPy Tutorial: Building a Code Plagiarism Detector

If you've ever felt intimidated by DSPy, don't worry: it might look complex at first glance, but it's actually quite approachable. This tutorial walks you through building a project step by step, so you can understand and apply DSPy concepts.

## TLDR 🚀

We're going to build a system for code plagiarism detection. Our goal is to compare two input code files, determine if plagiarism has occurred, and provide an explanation for the result. 

This project will showcase:

- Multiple inputs and outputs
- Double validation techniques

I strongly recommend reading the [DSPy Cheatsheet](https://dspy-docs.vercel.app/docs/cheatsheet); it will help you get started quickly.

## How to Start?

A highly effective practice I've found to be game-changing when starting any DSPy project is to answer [these 8 key questions](https://dspy-docs.vercel.app/docs/building-blocks/solving_your_task). This exercise helps you develop a clear vision for your project before diving into the code.

Here's an example of how your answers might look:
1. **Define your task**
   - Expected input: Two input code files (strings containing plain code) to be compared.
   - Expected output:
     - Plagiarism detection result (Yes/No)
     - Explanation/justification of the result
   - Quality and Cost Specifications: Cost is not a concern; quality is the main priority. We want to try different models.

2. **Define your pipeline**
   - We don't need any external tools or document retrieval. It will be a simple chain-of-thought step, as we want to evaluate LLM capabilities for plagiarism detection.

3. **Explore a few examples**
   - We explored LLM capabilities for plagiarism detection using a few examples with ChatGPT and Claude, yielding promising results.

4. **Define your data**
   - We are working with a dataset from the publication: [Source Code Plagiarism Detection in Academia with Information Retrieval: Dataset and the Observation](https://github.com/oscarkarnalim/sourcecodeplagiarismdataset/blob/master/IR-Plag-Dataset.zip)
   - We selected a subset and manually labeled the dataset with our output labels. This dataset should be used for training and testing, while the rest of the original dataset should be used for evaluation.
   - Dataset: [train.tsv](/data/train.tsv) (65 samples)
   - When you don't have a labeled dataset, it is a good idea to hand-label a few examples to get a sense of the task. Doing so helps you understand the task better and improves the quality of your program.

5. **Define your metric**
   - We are dealing with a **classification problem**, so we will use accuracy as our main metric. 
   - Our metric will be simple: if pred_label == true_label then 1 else 0.
   - As a second evaluation, we will assess the quality of the explanation with a secondary LLM.

6. **Collect preliminary "zero-shot" evaluations**
   - Done in code.

7. **Compile with a DSPy optimizer**
   - We don't want to update weights of the LLM, so we are looking at optimizers such as:
     - BootstrapFewShot
     - BootstrapFewShotWithRandomSearch
     - MIPRO
     - ...

8. **Iterate**
    - Regroup and attack again!
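
The metric from step 5 can be sketched in plain Python as follows. This is a minimal illustration, assuming the labels are `Yes`/`No` strings; `exact_match` and `accuracy` are hypothetical helper names and are not part of DSPy.

```python
def exact_match(pred_label: str, true_label: str) -> int:
    """Return 1 if the labels agree (ignoring case and whitespace), else 0."""
    return int(pred_label.strip().lower() == true_label.strip().lower())


def accuracy(pred_labels: list[str], true_labels: list[str]) -> float:
    """Average the per-example scores into an accuracy between 0 and 1."""
    if len(pred_labels) != len(true_labels):
        raise ValueError("pred_labels and true_labels must have the same length")
    if not true_labels:
        return 0.0
    scores = [exact_match(p, t) for p, t in zip(pred_labels, true_labels)]
    return sum(scores) / len(scores)


# Three of the four predictions agree with the labels.
print(accuracy(["Yes", "no", "Yes", "No"], ["Yes", "No", "No", "No"]))  # 0.75
```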



In [1]:
import os
import re

import dspy
import pandas as pd
from dotenv import load_dotenv
from dspy.evaluate import Evaluate
from dspy.teleprompt import (
    BootstrapFewShot,
    BootstrapFewShotWithRandomSearch,
    BootstrapKNN,
    KNNFewShot,
    MIPROv2,
)

# load your environment variables from .env file
load_dotenv()

# azure-openai model deployment
AZURE_OPENAI_KEY = os.getenv("AZURE_OPENAI_KEY")
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_DEPLOYMENT = os.getenv("AZURE_OPENAI_DEPLOYMENT")
AZURE_OPENAI_VERSION = os.getenv("AZURE_OPENAI_VERSION")

# openai model deployment

OPENAI_MODEL = os.getenv("OPENAI_MODEL")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# ollama deployment
OLLAMA_URL = os.getenv("OLLAMA_URL")

# 0. Load dataset

Our first task is to load our dataset. Each entry in our dataset will consist of the following components:

* `sample_1`: The first code sample to be analyzed
* `sample_2`: The second code sample to be compared against the first
* `plagiarized`: A boolean value (True if plagiarism is detected, False otherwise)
* `reason`: A detailed explanation of the plagiarism detection result


In [2]:
df = pd.read_csv("https://raw.githubusercontent.com/williambrach/LLM-plagiarism-check/main/data/train.tsv", sep="\t")
df.head()

| Unnamed: 0 | L | case | sample_1 | sample_2 | plagiarized | reason |
|---|---|---|---|---|---|---|
| 0 | 0 | 1 | `public class T1 { public static void main(Str...` | `/* * To change this license header, choose Li...` | False | The two code samples, while producing similar ... |
| 1 | 0 | 1 | `public class T1 { public static void main(Str...` | `/** * * @author 65FBEF05E01FAC390CB3FA073FB3...` | False | The code samples demonstrate different approac... |
| 2 | 0 | 1 | `public class T1 { public static void main(Str...` | `/** * * @author CB6AB3315634A1E4D11B091BA48B...` | False | The two code samples produce the same output b... |
| 3 | 1 | 1 | `public class T1 { public static void main(Str...` | `* * To change this license header, choose Lice...` | True | The two code samples are nearly identical in t... |
| 4 | 2 | 1 | `public class T1 { public static void main(Str...` | `/* * To change this license header, choose Li...` | True | The two code samples contain identical main me... |


# 1. Prepare dataset

DSPy utilizes special objects called [`Example`](https://dspy-docs.vercel.app/docs/deep-dive/data-handling/examples#creating-an-example) to structure and process data. Our next task is to convert our raw dataset into a collection of these `Example` objects.

## Creating Custom Example Objects

For our plagiarism detection system, we'll create `Example` objects with the following attributes:
- `code_sample_1`: The first code sample to analyze
- `code_sample_2`: The second code sample to compare
- `plagiarized`: Boolean indicating whether plagiarism was detected
- `explanation`: Detailed reasoning for the plagiarism decision

## Specifying Inputs

It's crucial to inform DSPy which attributes serve as inputs for our model. We accomplish this using the `.with_inputs()` method. In our case, we'll specify:

```python
.with_inputs("code_sample_1", "code_sample_2")
```
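
Conceptually, marking the input fields splits each record into what the program sees and what the metric compares against. The sketch below imitates that split with plain dictionaries. `split_inputs_and_labels` is a hypothetical helper written for illustration; it is not how DSPy implements `with_inputs()`.

```python
def split_inputs_and_labels(record: dict, input_keys: tuple) -> tuple:
    """Split a record into (inputs, labels) based on the declared input keys."""
    missing = [key for key in input_keys if key not in record]
    if missing:
        raise KeyError(f"record is missing input fields: {missing}")
    inputs = {key: record[key] for key in input_keys}
    labels = {key: value for key, value in record.items() if key not in input_keys}
    return inputs, labels


record = {
    "code_sample_1": "int a = 1;",
    "code_sample_2": "int b = 1;",
    "plagiarized": "Yes",
    "explanation": "Only the variable name differs.",
}
inputs, labels = split_inputs_and_labels(record, ("code_sample_1", "code_sample_2"))
print(sorted(inputs))  # ['code_sample_1', 'code_sample_2']
print(sorted(labels))  # ['explanation', 'plagiarized']
```

On a real `dspy.Example`, the `inputs()` and `labels()` methods give the corresponding views.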


In [3]:
def create_example(row: pd.Series) -> dspy.Example:
    return dspy.Example(
        code_sample_1=row["sample_1"],
        code_sample_2=row["sample_2"],
        plagiarized="Yes" if row["plagiarized"] else "No",
        explanation=row["reason"],
    ).with_inputs("code_sample_1", "code_sample_2")


examples = []
for _, row in df.iterrows():
    example = create_example(row)
    examples.append(example)

# 2. DSPy setup

DSPy is designed to be compatible with a variety of language models and their respective clients. For this tutorial, we will primarily utilize GPT-4 through the [Azure client](https://github.com/stanfordnlp/dspy/blob/main/dsp/modules/azure_openai.py). However, to demonstrate DSPy's flexibility, we will also provide configuration examples for other popular options:

1. [OpenAI client](https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/remote_models/OpenAI)
2. Local models via [Ollama](https://github.com/stanfordnlp/dspy/blob/main/dsp/modules/ollama.py)

## Custom Client Implementation

If your preferred language model client is not natively supported by DSPy, you have the option to implement a custom client. For detailed instructions on creating a custom client, please refer to this [comprehensive guide - custom-lm-client](https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/custom-lm-client).


In [4]:
import litellm

# litellm.api_key = "..."
lm = dspy.LM(model="gpt-4o-mini")

dspy.settings.configure(lm=lm)

# 3. Setting up signature and module


## Core DSPy Components

[dspy.Signature](https://dspy-docs.vercel.app/docs/building-blocks/signatures) and [dspy.Module](https://dspy-docs.vercel.app/docs/building-blocks/modules) are fundamental building blocks for DSPy programs:

- **Signature**: A declarative specification of the input/output behavior of a DSPy module.
- **Module**: A building block for programs that leverage Language Models (LMs).

## Types of DSPy Modules

DSPy offers various module types, each serving different purposes:

1. [dspy.Predict](https://dspy-docs.vercel.app/api/modules/Predict)
   - Basic predictor
   - Maintains the original signature
   - Handles key forms of learning (storing instructions, demonstrations, and LM updates)
   - Most similar to direct LM usage

2. [dspy.ChainOfThought](https://dspy-docs.vercel.app/api/modules/ChainOfThought)
   - Enhances the LM to think step-by-step before producing the final response
   - Modifies the signature to incorporate intermediate reasoning steps

3. [Additional Advanced Modules](https://dspy-docs.vercel.app/api/category/modules)
   - DSPy library offers a range of more specialized modules for complex tasks

### Recommendation for starting

For those new to DSPy, it's advisable to start with `dspy.Predict`. Its simplicity makes it ideal for understanding the basics of DSPy operation. Once you've successfully implemented your program using `dspy.Predict`, you can explore more advanced modules like `dspy.ChainOfThought` to potentially enhance your model's performance.

For an overview of other prompting techniques beyond zero-shot learning, refer to the [Prompting Guide](https://www.promptingguide.ai/techniques). This resource covers various methods that can enhance your DSPy applications as you progress.

In [5]:
class PlagiarismSignature(dspy.Signature):
    # Clarify something about the nature of the task (expressed below as a docstring)!
    """Detect if two code samples are plagiarized. In plagiarized field answer only : Yes if the code samples are plagiarized, No otherwise. In explanation field add the reason why the code samples are/ are not plagiarized."""

    # Supply hints on the nature of an input field, expressed as a desc keyword argument for dspy.InputField.
    code_sample_1 = dspy.InputField(desc="The first code sample to compare")
    code_sample_2 = dspy.InputField(desc="The second code sample to compare")

    # Supply constraints on an output field, expressed as a desc keyword argument for dspy.OutputField.
    explanation = dspy.OutputField(
        desc="Explanation or reason why the code samples are/ are not plagiarized"
    )
    plagiarized = dspy.OutputField(
        desc="Yes/No indicating if code samples are plagiarized"
    )


class PlagiarismCoT(dspy.Module):
    def __init__(self) -> None:
        super().__init__()

        # self.prog = dspy.ChainOfThought(PlagiarismSignature)
        self.prog = dspy.Predict(PlagiarismSignature)

    def forward(self, code_sample_1: str, code_sample_2: str) -> dspy.Prediction:
        # Any preprocessing (calling helper functions, modifying the inputs, etc.)
        # can go here, similar to a PyTorch forward function.

        # The predictor returns a dspy.Prediction with the signature's output fields.
        prediction = self.prog(code_sample_1=code_sample_1, code_sample_2=code_sample_2)
        return prediction

## 3.1 Test your module with zero-shot program

In [32]:
# Sanity check: the same sample is passed as both inputs, so the expected answer is "Yes"
output = PlagiarismCoT()(examples[0].code_sample_1, examples[0].code_sample_1)
print(f"plagiarized : {output.plagiarized}")
print(f"explanation : {output.explanation}")

plagiarized : Yes
explanation : The two code samples are identical, with the same class name, method structure, and repeated output statements. This indicates that one code sample is likely copied from the other without any modification.


# 4. Create metric for your programs

Creating an evaluation metric is a crucial part of the DSPy pipeline, and the metric is often the most complex function in your DSPy application.

### Metric Structure

In DSPy, metrics (or evaluation functions) must adhere to the following structure:

1. **Arguments**: 
   - Two required arguments:
     - The ground truth example from the dataset (a `dspy.Example`)
     - The program's prediction (a `dspy.Prediction`, which behaves like an `Example`)
   - An optional third argument called `trace`

2. **Return Value**: 
   - A numerical score (float, int, or bool)
   - For binary metrics, return `True` for correct predictions and `False` for incorrect ones

For more detailed information on creating metrics in DSPy, refer to the [official documentation on metrics](https://dspy-docs.vercel.app/docs/building-blocks/metrics).
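
As a minimal sketch of that structure, the function below takes the ground truth first, the prediction second, and an optional `trace` third, and returns a bool. The name `label_match` is hypothetical, and the `SimpleNamespace` objects are stand-ins so the snippet runs without DSPy or an LM; they are not real DSPy objects.

```python
from types import SimpleNamespace


def label_match(example, pred, trace=None) -> bool:
    """Metric skeleton: ground truth first, prediction second, optional trace third."""
    expected = getattr(example, "plagiarized", None)
    predicted = getattr(pred, "plagiarized", None)
    if expected is None or predicted is None:
        return False
    return predicted.strip().lower() == expected.strip().lower()


# Stand-ins for a dataset example and a model prediction (assumption: attribute access only).
gold = SimpleNamespace(plagiarized="Yes")
good_pred = SimpleNamespace(plagiarized=" yes ")
bad_pred = SimpleNamespace(plagiarized="No")

print(label_match(gold, good_pred))  # True
print(label_match(gold, bad_pred))   # False
```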

In [6]:
def validate_answer(
    example: dspy.Example, pred: dspy.Prediction, trace: object = None
) -> bool:
    """
    Validate the predicted plagiarism answer against the example answer.

    This function compares the predicted plagiarism answer with the example answer,
    focusing on a simple "yes" or "no" response. It extracts the core answer from
    the prediction, handling potential variations in formatting and capitalization.

    Parameters:
    - example (dspy.Example): The example object containing the correct answer.
    - pred (dspy.Prediction): The prediction object containing the model's answer.
    - trace (object, optional): Unused parameter, kept for compatibility.

    Returns:
    - bool: True if the predicted answer matches the example answer, False otherwise.

    The function returns False if either the predicted or example answer is None,
    or if any exception occurs during the validation process.
    """
    try:
        if pred.plagiarized is None:
            return False

        # Extract the first line of the predicted answer, convert to lowercase
        pred_plag = pred.plagiarized.strip().lower().split("\n")[0]

        # Define a regex pattern to match "yes" or "no"
        yes_no_pattern = r"\b(yes|no)\b"

        # Search for the pattern in the predicted answer
        match = re.search(yes_no_pattern, pred_plag)

        # If a match is found, use it; otherwise, use the entire predicted answer
        extracted_answer = match.group(1) if match else pred.plagiarized.strip().lower()

        if example.plagiarized is None:
            return False

        score = extracted_answer == example.plagiarized.strip().lower()
    except Exception:
        score = False
    return score
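
The extraction step is the part of this metric most worth checking in isolation. The sketch below pulls the same regex logic into a standalone helper so it can be tried on plain strings without an LM; `extract_yes_no` is a hypothetical name introduced only for this illustration.

```python
import re

YES_NO_PATTERN = re.compile(r"\b(yes|no)\b")


def extract_yes_no(raw_answer: str) -> str:
    """Return 'yes' or 'no' from the first line of a model answer, else the cleaned text."""
    cleaned = raw_answer.strip().lower()
    first_line = cleaned.split("\n")[0]
    match = YES_NO_PATTERN.search(first_line)
    return match.group(1) if match else cleaned


print(extract_yes_no("Yes"))                           # yes
print(extract_yes_no("No, the logic differs.\nMore"))  # no
print(extract_yes_no("Plagiarized: YES"))              # yes
print(extract_yes_no("Unclear"))                       # unclear
```

One caveat of this approach: the first match wins, so an answer such as "There is no doubt: yes" is read as `no`.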

## 4.1 Test your metric with evaluate

Now we can evaluate the zero-shot program with our `validate_answer` function. This score is the baseline that the optimized programs should beat.

In [7]:
evaluate = Evaluate(
    devset=examples,
    metric=validate_answer,
    num_threads=4,
    display_progress=True,
    display_table=0,
)

In [13]:
# score for zero-shot evaluation with `validate_answer` function
score = evaluate(PlagiarismCoT())
print(f"Zero-shot score: {score}")

Zero-shot score: 41.27
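
The printed score is a percentage: the share of examples on which the metric returned `True`. Here is a minimal sketch of that arithmetic, assuming the 26 correct answers out of 63 evaluated examples that the optimizer log in section 5.2 reports for the same 41.27 score. `percent_score` is a hypothetical helper, not a DSPy function.

```python
def percent_score(num_correct: int, num_examples: int) -> float:
    """Convert a count of correct predictions into a percentage rounded to 2 decimals."""
    if num_examples <= 0:
        raise ValueError("num_examples must be positive")
    return round(100 * num_correct / num_examples, 2)


print(percent_score(26, 63))  # 41.27
```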


## 4.2 When you can't use simple programmatic metrics and you need more brain power

In our metric, we are only evaluating the `plagiarized` field. We are not evaluating the `explanation` field. If we wanted to evaluate the `explanation`, which is a string (reasoning), we could use a text similarity metric (cosine similarity, for example) or another LLM to evaluate it.

The advantage of using another LLM is that we can evaluate the explanation in a more human-like way, but be careful with this approach. It could be very expensive and add more time to your program's runtime.

More examples of implementing this type of metric can be found in the [tweets example](https://github.com/stanfordnlp/dspy/blob/main/examples/tweets/tweet_metric.py).
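
For comparison, the cheaper text-similarity option mentioned above could look like the sketch below. It uses bag-of-words cosine similarity from the standard library only. The function names and the `0.5` threshold are assumptions chosen for illustration, and word overlap is a much cruder signal than an LLM judge because it ignores meaning.

```python
import math
import re
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts (0.0 to 1.0)."""
    counts_a = Counter(re.findall(r"\w+", text_a.lower()))
    counts_b = Counter(re.findall(r"\w+", text_b.lower()))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[token] * counts_b[token] for token in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def explanations_match(reference: str, predicted: str, threshold: float = 0.5) -> bool:
    """Treat explanations as similar when their cosine similarity reaches the threshold."""
    return cosine_similarity(reference, predicted) >= threshold


reference = "Both samples share identical logic with renamed variables"
predicted = "The samples share identical logic and only variables are renamed"
print(explanations_match(reference, predicted))                 # True
print(explanations_match(reference, "Completely unrelated text"))  # False
```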

In [8]:
# Define the signature for automatic assessments.
class Assess(dspy.Signature):
    """Assess the similarity of a pair of inputs. Answer only Yes if inputs are similar or No if not."""

    original_explanation = dspy.InputField()
    predicted_reasoning = dspy.InputField()
    similar = dspy.OutputField(desc="Yes if inputs are similar or No if not")


# gpt_judge = dspy.AzureOpenAI(
#     api_base=AZURE_OPENAI_ENDPOINT,
#     api_version=AZURE_OPENAI_VERSION,
#     deployment_id=AZURE_OPENAI_DEPLOYMENT,
#     api_key=AZURE_OPENAI_KEY,
# )


def validate_answer_with_explanation(
    example: dspy.Example, pred: dspy.Prediction, trace: object = None
) -> bool:

    # Extract the true explanation from the example
    true_explanation = example.explanation

    # Extract the predicted explanation from the prediction
    pred_explanation = pred.explanation

    # Use the configured language model as a judge of explanation similarity:
    # run the Assess signature on the reference and predicted explanations
    similar = dspy.Predict(Assess)(
        original_explanation=true_explanation, predicted_reasoning=pred_explanation
    )

    # Check if the explanations are deemed similar (converting to lowercase for case-insensitive comparison)
    similar_score = similar.similar.lower() == "yes"

    # Validate the plagiarism answer using the existing validate_answer function
    plagiarized_score = validate_answer(example, pred, trace)

    # Return True only if both the explanation is similar and the plagiarism answer is correct
    return similar_score and plagiarized_score

In [9]:
evaluate_with_explanation = Evaluate(
    devset=examples,
    metric=validate_answer_with_explanation,
    num_threads=4,
    display_progress=True,
    display_table=0,
)

In [13]:
# score for zero-shot evaluation with `validate_answer_with_explanation` function
score = evaluate_with_explanation(PlagiarismCoT())
print(f"Zero-shot score for validate_answer_with_explanation : {score}")

Zero-shot score for validate_answer_with_explanation : 34.92


# 5. DSPy Magic - Optimizers

## TLDR on Optimizers

Optimizers in DSPy are powerful tools that handle prompt engineering for you. Think of them as "functions" that automate and improve the process of crafting effective prompts. There are several types of optimizers available in DSPy, but in this tutorial, we'll focus on the first two:

1. **Automatic Few-Shot Learning**: 
   - Adds examples to your prompts.
   - Key optimizers to explore: `BootstrapFewShot`, `BootstrapFewShotWithRandomSearch`, `BootstrapKNN`, `BootstrapKNNWithRandomSearch`

2. **Automatic Instruction Optimization**: 
   - Produces optimal instructions for the prompt.
   - In the case of MIPRO, also optimizes the set of few-shot examples.
   - Key optimizer to explore: [`MIPROv2`](https://github.com/stanfordnlp/dspy/blob/main/dspy/teleprompt/mipro_optimizer_v2.py)

3. **Automatic Finetuning**:
   - Used to fine-tune the underlying Language Model(s).

4. **Program Transformations**:
   - Ensembles a set of DSPy programs and either uses the full set or randomly samples a subset into a single program.

If you're new to optimizers and unsure where to start, `BootstrapFewShotWithRandomSearch` is a safe and effective choice for beginners.
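
To make the few-shot idea in item 1 concrete, here is a conceptual sketch of bootstrapping: run the program over the training examples and keep the runs the metric accepts as demonstrations. This is a simplified illustration, not DSPy's implementation. `bootstrap_demos`, `toy_program`, and `toy_metric` are hypothetical names, and the "program" is a trivial function standing in for an LM call.

```python
from typing import Callable


def bootstrap_demos(
    program: Callable[[dict], dict],
    trainset: list[dict],
    metric: Callable[[dict, dict], bool],
    max_bootstrapped_demos: int,
) -> list[dict]:
    """Run the program on training examples and keep the runs that the metric accepts."""
    demos = []
    for example in trainset:
        if len(demos) >= max_bootstrapped_demos:
            break
        prediction = program(example)
        if metric(example, prediction):
            # A passing run becomes a demonstration: inputs plus the program's own outputs.
            demos.append({**example, **prediction})
    return demos


# Toy stand-ins: a "program" that flags identical strings and a label-matching metric.
def toy_program(example: dict) -> dict:
    same = example["code_sample_1"] == example["code_sample_2"]
    return {"plagiarized": "Yes" if same else "No"}


def toy_metric(example: dict, prediction: dict) -> bool:
    return example["plagiarized"] == prediction["plagiarized"]


trainset = [
    {"code_sample_1": "a", "code_sample_2": "a", "plagiarized": "Yes"},
    {"code_sample_1": "a", "code_sample_2": "b", "plagiarized": "Yes"},  # program gets this wrong
    {"code_sample_1": "c", "code_sample_2": "d", "plagiarized": "No"},
]
demos = bootstrap_demos(toy_program, trainset, toy_metric, max_bootstrapped_demos=2)
print(len(demos))  # 2
```

The early exit mirrors why an optimizer can stop before it has seen the whole training set: once enough passing runs are collected, there is nothing more to gather.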

## Read the Documentation!

For more detailed information about optimizers, please refer to the [official DSPy documentation on optimizers](https://dspy-docs.vercel.app/docs/building-blocks/optimizers).

## 5.1 Example of BootstrapFewShot

[BootstrapFewShot docs](https://dspy-docs.vercel.app/docs/deep-dive/teleprompter/bootstrap-fewshot), [BootstrapFewShot class](https://github.com/stanfordnlp/dspy/blob/main/dspy/teleprompt/bootstrap.py)

In [11]:
config = {"max_bootstrapped_demos": 32, "max_labeled_demos": 16}

optimizer = BootstrapFewShot(metric=validate_answer, **config)
optimized_program = optimizer.compile(PlagiarismCoT(), trainset=examples)
optimized_program.save("BootstrapFewShot_program.json")

 79%|███████▉  | 50/63 [00:00<00:00, 429.21it/s]

Bootstrapped 32 full traces after 50 examples for up to 1 rounds, amounting to 50 attempts.





In [12]:
# score for BootstrapFewShot evaluation with `validate_answer` function
score = evaluate(optimized_program)
print(f"BootstrapFewShot score : {score}")

Average Metric: 41.00 / 63 (65.1%): 100%|██████████| 63/63 [01:20<00:00,  1.28s/it]

2024/11/18 18:45:39 INFO dspy.evaluate.evaluate: Average Metric: 41 / 63 (65.1%)



BootstrapFewShot score : 65.08


## 5.2 Example of BootstrapFewShotWithRandomSearch

[BootstrapFewShotWithRandomSearch class](https://github.com/stanfordnlp/dspy/blob/main/dspy/teleprompt/random_search.py) 

In [12]:
config = {
    "max_bootstrapped_demos": 32,
    "max_labeled_demos": 4,
    "num_candidate_programs": 5,
    "num_threads": 4,
}

optimizer = BootstrapFewShotWithRandomSearch(metric=validate_answer, **config)
optimized_program = optimizer.compile(PlagiarismCoT(), trainset=examples)
optimized_program.save("BootstrapFewShotWithRandomSearch_program.json")

Going to sample between 1 and 32 traces per predictor.
Will attempt to bootstrap 5 candidate sets.
Average Metric: 26.00 / 63 (41.3%): 100%|██████████| 63/63 [00:00<00:00, 659.60it/s]

2024/11/18 18:40:36 INFO dspy.evaluate.evaluate: Average Metric: 26 / 63 (41.3%)



New best score: 41.27 for seed -3
Scores so far: [41.27]
Best score so far: 41.27

In [39]:
# score for BootstrapFewShotWithRandomSearch evaluation with `validate_answer` function
score = evaluate(optimized_program)
print(f"BootstrapFewShotWithRandomSearch score : {score}")

BootstrapFewShotWithRandomSearch score : 76.19
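
The "random search" part can be pictured as follows: sample several candidate sets of demonstrations, score each one, and keep the best. The sketch below is a conceptual stand-in, not DSPy's implementation. `random_search_demo_sets` and the toy `score_fn` are hypothetical, and in a real run the score would come from evaluating the program with those demonstrations in the prompt.

```python
import random
from typing import Callable, Sequence


def random_search_demo_sets(
    demo_pool: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
    num_candidates: int,
    max_demos: int,
    seed: int = 0,
) -> tuple[list[str], float]:
    """Sample random demo subsets and return the best-scoring one with its score."""
    rng = random.Random(seed)
    best_demos: list[str] = []
    best_score = score_fn(best_demos)  # baseline: no demonstrations at all
    if not demo_pool:
        return best_demos, best_score
    for _ in range(num_candidates):
        size = rng.randint(1, min(max_demos, len(demo_pool)))
        candidate = rng.sample(list(demo_pool), size)
        score = score_fn(candidate)
        if score > best_score:
            best_demos, best_score = candidate, score
    return best_demos, best_score


# Toy scoring function (assumption): only demos "d1" and "d3" actually help.
helpful = {"d1", "d3"}


def score_fn(demos: Sequence[str]) -> float:
    return float(len(helpful & set(demos)))


best, score = random_search_demo_sets(
    ["d1", "d2", "d3", "d4"], score_fn, num_candidates=20, max_demos=3
)
print(best, score)
```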


## 5.3 Example of KNNFewShot

[KNNFewShot class](https://github.com/stanfordnlp/dspy/blob/main/dspy/teleprompt/knn_fewshot.py) 

In [8]:
k = 3
optimizer = KNNFewShot(k, examples)
optimized_program = optimizer.compile(PlagiarismCoT(), trainset=examples)
optimized_program.save("KNNFewShot_program.json")

In [18]:
# score for KNNFewShot evaluation with `validate_answer` function
score = evaluate(optimized_program)
print(f"KNNFewShot score : {score}")

KNNFewShot score : 65.08
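
The idea behind nearest-neighbor few-shot selection is to pick, for each new input, the `k` training examples most similar to it and use those as demonstrations. The sketch below illustrates this with Jaccard similarity over word tokens. That similarity measure is swapped in here to keep the example dependency-free and is not what DSPy's optimizer uses; `token_set`, `jaccard`, and `nearest_demos` are hypothetical helpers.

```python
import re


def token_set(text: str) -> set[str]:
    """Lowercased word tokens of a text."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the token sets of two texts."""
    tokens_a, tokens_b = token_set(a), token_set(b)
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def nearest_demos(query: str, trainset: list[str], k: int) -> list[str]:
    """Return the k training texts most similar to the query."""
    ranked = sorted(trainset, key=lambda text: jaccard(query, text), reverse=True)
    return ranked[:k]


trainset = [
    "double bmi = weight / (height * height);",
    'System.out.println("Hello World");',
    "double bmi = berat / (tinggi * tinggi);",
]
# The two BMI snippets are selected; the unrelated print statement is not.
print(nearest_demos("double bmi = w / (h * h);", trainset, k=2))
```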


## 5.4 Example of BootstrapKNN

In [14]:
optimizer = BootstrapKNN(
    metric=validate_answer,
    embedding=dspy.Embedding("openai/text-embedding-3-small"),
    k=4,
    max_bootstrapped_demos=32,
)
optimized_program = optimizer.compile(PlagiarismCoT(), trainset=examples)
optimized_program.save("BootstrapKNNFewShot_program.json")

 79%|███████▉  | 50/63 [00:00<00:00, 1373.78it/s]

Bootstrapped 32 full traces after 50 examples for up to 1 rounds, amounting to 50 attempts.





In [15]:
# score for BootstrapKNN evaluation with `validate_answer` function
score = evaluate(optimized_program)
print(f"BootstrapKNN score : {score}")

Average Metric: 38.00 / 63 (60.3%): 100%|██████████| 63/63 [00:00<00:00, 223.77it/s]

2024/11/18 18:46:42 INFO dspy.evaluate.evaluate: Average Metric: 38 / 63 (60.3%)



BootstrapKNN score : 60.32


## 5.5 Example of MIPROv2

[MIPROv2 paper](https://arxiv.org/abs/2406.11695), [MIPROv2 class](https://github.com/stanfordnlp/dspy/blob/main/dspy/teleprompt/mipro_optimizer_v2.py) 

In [None]:
n = 5  # The number of instructions and fewshot examples that we will generate and optimize over
batches = 20  # The number of optimization trials to be run (we will test out a new combination of instructions and fewshot examples in each trial)
temperature = 1  # The temperature configured for generating new instructions
eval_kwargs = {"num_threads": 4, "display_progress": False, "display_table": 0}
optimizer = MIPROv2(
    prompt_model=lm,
    task_model=lm,
    metric=validate_answer,
    num_candidates=n,
    init_temperature=temperature,
    verbose=True,
)
optimized_program = optimizer.compile(
    PlagiarismCoT(),
    trainset=examples,
    num_batches=batches,
    max_bootstrapped_demos=16,
    max_labeled_demos=16,
    requires_permission_to_run=False,
    eval_kwargs=eval_kwargs,
)
score = evaluate(optimized_program)
optimized_program.save("MIPROv2_program.json")

In [50]:
# score for MIPROv2 evaluation with `validate_answer` function
print(f"MIPROv2 score : {score}")

MIPROv2 score : 87.85


# 6. Loading a Saved Program

Loading a program from a saved file is a straightforward process in DSPy. Here's how you can do it:

In [51]:
loaded_program = PlagiarismCoT()
loaded_program.load(path="BootstrapFewShotWithRandomSearch_program.json")

## 6.1 Deploying a DSPy Program

Deploying a DSPy program in a production environment can be challenging, as DSPy is primarily designed for research and experimentation at the moment. Two approaches work well in practice:

### 6.1.1 Using DSPy as a Library

This method involves saving, loading, and using your DSPy program directly:

1. Save your DSPy program to a file using the `.save()` method.
2. Load your DSPy program from the file using the `.load()` method.
3. Use the loaded program to make predictions in your production environment.

### 6.1.2 Extracting Prompts and Calling Your LLM Directly

This approach involves:

1. Extracting the optimized prompts from your DSPy program.
2. Implementing these prompts in your production code.
3. Calling your Language Model (LLM) directly with the extracted prompts.

Each method has its advantages and may be more suitable depending on your specific use case and production environment constraints.

In [None]:
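import contextlib
import io

# dummy call of your program
res = loaded_program(examples[0].code_sample_1, examples[0].code_sample_2)

# `inspect_history` prints the last LM call, so capture stdout
# in order to write the prompt to a file
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    lm.inspect_history(n=1)
with open("inspect_history.txt", "w") as f:
    f.write(buffer.getvalue())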

## Refining Your Prompt

You can now streamline your prompt by removing unnecessary parameters and utilizing Python's string formatting to incorporate your inputs. Here's an enhanced version of the prompt:

```python
prompt = f"""
Detect if two code samples are plagiarized. Answer only 'Yes' in the plagiarized field if the code samples are plagiarized, 'No' otherwise. Provide the reasoning in the explanation field.

Format:
Code Sample 1: [First code sample to compare]
Code Sample 2: [Second code sample to compare]

Reasoning: Let's analyze step by step to determine if the code samples are plagiarized...

Explanation: [Detailed explanation of why the code samples are or are not plagiarized]

Plagiarized: [Yes/No]

Example:
Code Sample 1: 
import java.util.Scanner;
public class T3 {{
    public static void main(String[] args) {{
        Scanner input = new Scanner(System.in);
        System.out.print("Enter weight in pounds: ");
        double weight = input.nextDouble();
        System.out.print("Enter feet: ");
        double feet = input.nextDouble();
        System.out.print("Enter inches: ");
        double inches = input.nextDouble();
        double height = feet * 12 + inches;
        double bmi = weight * 0.45359237 / ((height * 0.0254) * (height * 0.0254));
        System.out.println("BMI is " + bmi);
        if (bmi < 18.5) System.out.println("Underweight");
        else if (bmi < 25) System.out.println("Normal");
        else if (bmi < 30) System.out.println("Overweight");
        else System.out.println("Obese");
    }}
}}

Code Sample 2:
import java.util.*;
public class L2 {{
    public static void main(String[] args) {{
        Scanner sc = new Scanner(System.in);
        System.out.print("Enter weight in pounds: ");
        double berat = sc.nextDouble();
        System.out.print("Enter feet: ");
        double feet = sc.nextDouble();
        System.out.print("Enter inches: ");
        double inci = sc.nextDouble();
        double tinggi = feet * 12 + inci;
        double bmi = berat * 0.45359237 / ((tinggi * 0.0254) * (tinggi * 0.0254));
        System.out.println("BMI is " + bmi);
        if (bmi < 18.5) {{ System.out.println("Underweight"); }}
        else if (bmi < 25) {{ System.out.println("Normal"); }}
        else if (bmi < 30) {{ System.out.println("Overweight"); }}
        else {{ System.out.println("Obese"); }}
    }}
}}

Reasoning: Let's analyze the code samples step by step to determine if they are plagiarized...

Explanation: The two code samples are nearly identical in structure, logic, and even specific constants used, with only minor differences in variable names and formatting. This level of similarity is extremely unlikely to occur independently and strongly indicates that one sample was copied from the other or both were derived from a common source.

Plagiarized: Yes

Now, analyze the following code samples:

Code Sample 1: {code_sample_1}

Code Sample 2: {code_sample_2}

Reasoning: Let's analyze step by step to determine if the code samples are plagiarized...
"""
```
**Just be aware that you need to do your own parsing of the output of the LLM.**
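
A minimal sketch of such parsing, assuming the completion follows the `Explanation:` / `Plagiarized:` layout requested in the prompt above. `parse_llm_output` is a hypothetical helper; a real model may deviate from the format, so missing fields are returned as `None` rather than guessed.

```python
import re


def parse_llm_output(completion: str) -> dict:
    """Extract the explanation and the Yes/No verdict from a raw completion."""
    explanation_match = re.search(
        r"Explanation:\s*(.*?)(?=\n\s*Plagiarized:|\Z)", completion, re.DOTALL
    )
    verdict_match = re.search(r"Plagiarized:\s*(yes|no)\b", completion, re.IGNORECASE)
    return {
        "explanation": explanation_match.group(1).strip() if explanation_match else None,
        "plagiarized": verdict_match.group(1).capitalize() if verdict_match else None,
    }


completion = """The samples share structure and constants.

Explanation: Only variable names differ between the two samples.

Plagiarized: Yes"""
print(parse_llm_output(completion))
# {'explanation': 'Only variable names differ between the two samples.', 'plagiarized': 'Yes'}
```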

Thanks! 🙌

In case of any questions, feel free to reach out to me:  
William Brach - [@williambrach](https://x.com/williambrach) - william.brach@stuba.sk