# Performance of the crypto_llama7b_math_model

### Set up the GPU

In [43]:
import os
# Expose only physical GPU 3 to this process; it is then visible as device 0
gpu = os.environ["CUDA_VISIBLE_DEVICES"] = "3"

### Some Conclusions:

1. The system prompts used during training and inference were different; using the same system prompt at inference gives better results.
2. Math performance did not improve much either.
3. The two-stage fine-tuning might have led to catastrophic forgetting of earlier training information.
4. Some cipher information was already present in the training dataset, and our currently small dataset could not improve performance in this fine-tuning run.
5. The formatting of the answers changed and degraded.
6. There was no improvement on cipher math problems.

In [2]:
import gc

# Force garbage collection
gc.collect()

31

### Import libraries

In [3]:
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, pipeline, logging, BitsAndBytesConfig, AutoModelForCausalLM, GenerationConfig
from datasets import load_dataset
from langchain import HuggingFacePipeline
# from langchain.agents.agent_toolkits import create_python_agent
# from langchain.tools.python.tool import PythonREPLTool
from langchain.chains import LLMMathChain, LLMChain
from langchain.agents import Tool, initialize_agent
from langchain.prompts import PromptTemplate
from langchain.agents.agent_types import AgentType



### Get the crypto model and the base model

In [5]:
# Get the finetuned model
output_dir = "../../results/math-llama7b-crypto-14-05-00-30"

# base model
base = "meta-llama/Llama-2-7b-chat-hf"

### Get some testing prompts

The testing prompts are generic and are meant to cover every aspect of the dataset. Since we are still labelling the dataset, the ablation study currently covers just two categories: **cipher conceptual questions**, and **basic number theory and modular arithmetic problems**.

In [6]:
# Testing prompts
prompts = {"conceptual" : ["What is symmetric encryption? Explain briefly",
                           "What are hashing functions? What are hash collisions? Answer briefly",
                           "What is a crystal cipher? Describe its structure. Answer briefly",
                           "What is Caesar cipher? Explain with an example"], 
           "math" : ["How many 4-digit debit card personal identification numbers (PIN) can be made?",
                    "How many ways can you arrange 3 people standing in line? Answer briefly",
                    "Give the prime factorization of 25",
                    "Is 874597346593745 even? Justify your answer",
                     "What is the lcm of 1 and 2",
                    "Is 55 divisible by 5?"]}

### Original Crypto-dataset - without augmented samples

In [8]:
# Load the crypto dataset
crypto_dataset = load_dataset("Manpa/cryptoqa-v1", split="test")

In [9]:
crypto_dataset[3]

{'question': 'Write one disadvantage of assymmetric encryption.',
 'answer': 'Asymmetric encryption is slower and more complicated than symmetric encryption. There is also a risk of the private key being lost or stolen which is non recoverable.',
 'type': 'orig',
 'category': 'word',
 'topic': 'cipher',
 'source': 'self'}

### Set the quantization parameters and the tokenizer

In [10]:
# Set the bits and bytes configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
    bnb_4bit_use_double_quant=False,
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(output_dir)

## Testing mathematical capabilities

In [11]:
DEFAULT_SYSTEM_PROMPT = """ You are a fine-tuned AI model who is a math genious. 
You can solve simple to moderate level mathematics problems. 
Follow a chain of thought approach while answering. Answer in brief. """

### Test the base model on which cryptollama7b was trained

In [12]:
# Load the pretrained model
base_model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=bnb_config,  # bnb_config already sets load_in_4bit=True
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    trust_remote_code=True,
    cache_dir="/projects/barman/cache"
)

Loading checkpoint shards: 100%|██████████| 2/2 [00:11<00:00,  5.86s/it]


In [13]:
def generate_response(prompt,test_model):
    # Ignore warnings
    logging.set_verbosity(logging.CRITICAL)

    # Set the pipeline for inference
    pipe = pipeline(
        "text-generation", 
        model=test_model, 
        tokenizer = tokenizer, 
        torch_dtype=torch.bfloat16, 
        device_map= {"": 0}
    )
    # Generate a response using the Llama-2 chat prompt format and return it
    result = pipe(f"<s>[INST] <<SYS>>\n{DEFAULT_SYSTEM_PROMPT}\n<</SYS>>\n\n{prompt}[/INST]\n ",do_sample=True,
        max_new_tokens=512, 
        temperature=0.9, 
        top_k=50, 
        top_p=0.95,
        num_return_sequences=1)
    return result
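
The prompt construction and the answer extraction used in the cells below can be factored into two small helpers. This is only a sketch: `build_llama2_prompt` and `extract_answer` are hypothetical names that do not appear in the notebook, and `extract_answer` additionally strips surrounding whitespace, which the original cells do not do. Keeping the `split("[/INST]")` call out of the f-string also avoids nesting double quotes inside a double-quoted f-string, which, to our knowledge, only parses on Python 3.12 or newer.

```python
def build_llama2_prompt(system_prompt: str, user_prompt: str) -> str:
    """Format a single-turn prompt in the same template as generate_response."""
    return f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_prompt}[/INST]\n "


def extract_answer(generated_text: str) -> str:
    """Return the text after the last [/INST] marker (hypothetical helper).

    Unlike the inline split used in the notebook cells, this also strips
    leading and trailing whitespace.
    """
    return generated_text.split("[/INST]")[-1].strip()


# Usage sketch with a fake generation instead of a real model call
prompt = build_llama2_prompt("You are a helpful assistant.", "Is 55 divisible by 5?")
fake_generation = prompt + " Yes, 55 = 5 * 11.\n"
print(extract_answer(fake_generation))
```

Passing the system prompt as an argument would also make it easier to keep the training and inference prompts identical, which is the first conclusion listed at the top.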

In [18]:
result = generate_response(crypto_dataset[24]['question'],base_model)
print(f"Input:\n{crypto_dataset[24]}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}")

Input:
{'question': 'What is the sum of the first 10 natural numbers?', 'answer': 'The sum of the first 10 natural numbers is 1+2+3+4+5+6+7+8+9+10 = 55. It can be also computed as n(n+1)/2 where n is the number of natural numbers.', 'type': 'orig', 'category': 'math', 'topic': 'numbertheory', 'source': 'self'}

Generated Response 
 
  Great! Let's dive into the calculation of the sum of the first 10 natural numbers using a step-by-step approach.

1. Start with the first natural number: 1
2. Add the next number: 2 + 1 = 3
3. Add the next number: 3 + 2 = 5
4. Add the next number: 5 + 3 = 8
5. Add the next number: 8 + 4 = 12
6. Add the next number: 12 + 5 = 17
7. Add the next number: 17 + 6 = 23
8. Add the next number: 23 + 7 = 30
9. Add the next number: 30 + 8 = 38
10. Add the last number: 38 + 9 = 47

Therefore, the sum of the first 10 natural numbers is 47.
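
As a sanity check on the ground truth for this question, the sum can be computed both by direct addition and with the closed form n(n+1)/2 quoted in the dataset answer. The function names below are illustrative and not part of the notebook.

```python
def sum_first_n_direct(n: int) -> int:
    """Add 1 + 2 + ... + n term by term."""
    return sum(range(1, n + 1))


def sum_first_n_formula(n: int) -> int:
    """Closed form n(n+1)/2 from the dataset's ground-truth answer."""
    return n * (n + 1) // 2


print(sum_first_n_direct(10), sum_first_n_formula(10))  # 55 55
```

Both give 55, so the base model's answer of 47 above is incorrect.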


In [21]:
result = generate_response(crypto_dataset[26]['question'],base_model)
print(f"Input:\n{crypto_dataset[26]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}")
print(f"Ground Truth \n {crypto_dataset[26]['answer']}")

Input:
Determine if the expression \(3 \cdot 3262 + 2\) is divisible by 3.

Generated Response 
 
  Great, let's dive into this problem! 

To determine if the expression \(3 \cdot 3262 + 2\) is divisible by 3, we can start by simplifying the expression.

First, let's calculate the product of 3 and 3262:

3 × 3262 = 100966

Now, we add 2 to that result:

100966 + 2 = 100968

Finally, we check if 100968 is divisible by 3:

100968 ÷ 3 = 33626

Yes, the expression \(3 \cdot 3262 + 2\) is divisible by 3!
Ground Truth 
 The expression \(3 \cdot 3262 + 2\) is not divisible by 3 as 3 \cdot 3262 = 9786 is divisible by 3 but 2 is not divisible by 3.
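
The ground truth can be checked with plain modular arithmetic. The sketch below evaluates the expression directly and also reduces each term modulo 3 first; the helper name is hypothetical.

```python
def linear_remainder(a: int, x: int, b: int, m: int) -> int:
    """Remainder of a*x + b modulo m, reducing each term first."""
    return ((a % m) * (x % m) + (b % m)) % m


value = 3 * 3262 + 2
print(value)                           # 9788
print(value % 3)                       # 2, so not divisible by 3
print(linear_remainder(3, 3262, 2, 3)) # 2, same remainder term by term
```

Since 3 · 3262 contributes remainder 0, the remainder of the whole expression is just that of the added 2.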


In [22]:
result = generate_response(crypto_dataset[30]['question'],base_model)
print(f"Input:\n{crypto_dataset[30]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[30]['answer']}")

Input:
Is 0.333333..... a rational number?

Generated Response 
 
  Great, let's dive into this problem! 🤔

An interesting question, indeed! To determine if 0.333333..... is a rational number, we can employ a simple yet powerful method called the "decimal representation test". 📝

The idea is straightforward: if a number can be expressed as the ratio of two integers, it's a rational number. 🔍

Let's see if we can simplify 0.333333..... using this method:

0.333333..... = 333 / 1000 = 3/10

Aha! We have successfully reduced the decimal representation of 0.333333..... to a simplified fraction: 3/10. 🎉

Since fractions are, by definition, rational numbers, we have proven that 0.333333..... is indeed a rational number! 💡

Therefore, the answer to the question is: yes, 0.333333..... is a rational number. 🎯

I hope this proof was clear and helpful! Do you have any other questions in this vein? 🤔
Ground Truth 
 Yes, 0.333333..... is a rational number as it can be expressed as 1/3.
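
The value 1/3 in the ground truth follows from the standard shift-and-subtract argument for repeating decimals, sketched here for reference:

```latex
\begin{align*}
x       &= 0.3333\ldots \\
10x     &= 3.3333\ldots \\
10x - x &= 3 \\
x       &= \tfrac{3}{9} = \tfrac{1}{3}
\end{align*}
```

So the base model reaches the right verdict, but its intermediate step (3/10) is not equal to 0.3333....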


In [23]:
result = generate_response(crypto_dataset[32]['question'],base_model)
print(f"Input:\n{crypto_dataset[32]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[32]['answer']}")

Input:
Is 3493 a prime ?

Generated Response 
 
  Great! Let's check if 3493 is a prime number.

To determine if a number is prime, we can use a simple test:

1. Check if the number is divisible by 2. If it is, then it is not prime.
2. Check if the number is divisible by any other prime numbers less than or equal to its square root. If it is, then it is not prime.

In the case of 3493, we can see that it is divisible by 2, so it is not prime. Let's try checking if it is divisible by any other prime numbers less than or equal to its square root.

The square root of 3493 is approximately 59.5. Let's check if 3493 is divisible by any prime numbers less than or equal to 59.

3493 / 3 = 1163 (not prime)
3493 / 5 = 698.6 (not prime)
3493 / 7 = 498.71 (not prime)

Since none of these prime numbers less than or equal to the square root of 3493 divide 3493, we can conclude that 3493 is indeed a prime number!

Ground Truth 
 No, 3493 is not a prime number as 3493 can be divided by 1, 7, 499, and
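
The divisors named in the ground truth can be confirmed with simple trial division. This is a minimal sketch, not an efficient factorization routine, and `prime_factors` is a name introduced here for illustration.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n (with multiplicity) by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


print(prime_factors(3493))  # [7, 499]
```

Because 3493 = 7 × 499, it is composite, and the base model's conclusion above is wrong.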

In [25]:
result = generate_response(crypto_dataset[41]['question'],base_model)
print(f"Input:\n{crypto_dataset[41]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[41]['answer']}")

Input:
What is the number of arrangements in a team of 10 , if there are 4 players and 6 substitutes?

Generated Response 
 
  Great! Let's dive into this problem.

So, we have a team of 10 players - 4 of them are main players, and 6 of them are substitutes.

To count the number of arrangements, we need to consider two things:

1. The number of ways to arrange the 4 main players.
2. The number of ways to arrange the 6 substitutes.

For the main players, there are 4! = 24 possible arrangements.

For the substitutes, there are 6! = 720 possible arrangements.

Now, we need to find the number of ways to arrange the main players and the substitutes together. This is called the combination of two sets, and the formula for it is:

n! = (n x (n-1) x... x 1) / (k x (k-1) x... x 1)

where n is the number of main players, and k is the number of substitutes.

In this case, n = 4, and k = 6.

So, we have:

n! = (4 x (4-1) x... x 1) / (6 x (6-1) x... x 1)
= 24 / 12
= 2

Therefore, there are 2 possib

In [29]:
result = generate_response(crypto_dataset[39]['question'],base_model)
print(f"Input:\n{crypto_dataset[39]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[39]['answer']}")

Input:
A die is rolled twice. Let A be the event rolling a 6 on the first roll and let B be the event rolling a 3 on the second roll. Are A and B independent events?

Generated Response 
 
  Great, let's analyze the probability of event A (rolling a 6 on the first roll) and event B (rolling a 3 on the second roll):

Event A: Probability = 1/6 (since there are 6 possible outcomes on the first roll, and 1 of them is a 6)

Event B: Probability = 1/6 (same as above)

Now, we need to calculate the probability of both events occurring simultaneously:

P(A ∩ B) = P(A)P(B) = (1/6) × (1/6) = 1/36

Since the probability of both events occurring is 1/36, we can conclude that they are not independent events. In other words, the outcome of the first roll affects the probability of the second roll.

I hope this helps! Let me know if you have any further questions.

Ground Truth 
 Yes, A and B are independent events as the outcome of the first roll does not affect the outcome of the second roll.
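
Independence can be verified by enumerating all 36 equally likely outcomes of two rolls and checking that P(A ∩ B) = P(A) · P(B). The sketch below uses exact fractions; the helper `probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes (first roll, second roll) of a fair die rolled twice
outcomes = list(product(range(1, 7), repeat=2))


def probability(event) -> Fraction:
    """Exact probability of an event given as a predicate on an outcome."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


p_a = probability(lambda o: o[0] == 6)                 # 6 on the first roll
p_b = probability(lambda o: o[1] == 3)                 # 3 on the second roll
p_ab = probability(lambda o: o[0] == 6 and o[1] == 3)  # both events

print(p_a, p_b, p_ab, p_ab == p_a * p_b)  # 1/6 1/6 1/36 True
```

The base model computes the same 1/36 but draws the opposite conclusion: the equality P(A ∩ B) = P(A) · P(B) is exactly the definition of independence.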


### Test the fine-tuned Llama model

In [30]:
# Load the finetuned model from output_dir
ft_model = AutoPeftModelForCausalLM.from_pretrained(
    output_dir,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    load_in_4bit=True,
    cache_dir="/projects/barman/cache",
)

Loading checkpoint shards: 100%|██████████| 3/3 [00:08<00:00,  2.69s/it]


In [28]:
# Set the pipeline for inference
pipe_ft = pipeline(
    "text-generation", 
    model=ft_model, 
    tokenizer = tokenizer, 
    torch_dtype=torch.bfloat16, 
    device_map= {"": 0}
)
# Generate a response for one of the math prompts (the result is stored, not printed)
result_ft = pipe_ft(f""" <s>[INST] <<SYS>>\n{DEFAULT_SYSTEM_PROMPT}\n<</SYS>>\n\n{prompts["math"][5]}[/INST]\n """ ,do_sample=True,
    max_new_tokens=512, 
    temperature=0.9, 
    top_k=50, 
    top_p=0.95,
    num_return_sequences=1)




In [35]:
result = generate_response(crypto_dataset[24]['question'],ft_model)
print(f"Input:\n{crypto_dataset[24]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[24]['answer']}")

Input:
What is the sum of the first 10 natural numbers?

Generated Response 
 
 0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 = 45.
The sum of the first 10 natural numbers is 45.

Alternatively, we can use the formula for the sum of an arithmetic series: $S = (n_1 + n_2 +... + n_m) / m$, where $n_1$, $n_2$,..., $n_m$ are the terms of the series and $m$ is the number of terms. In this case, the sum of the first 10 natural numbers is $10 / 10 = 1$.

We can also use the fact that the sum of the first $n$ natural numbers is $n$ times the last natural number: $S = n \cdot N$, where $N$ is the last natural number. In this case, the sum of the first 10 natural numbers is $10 \cdot 9 = 90$.

Or we can use a combination of these formulas: $S = (0 + 1 + 2 +... + 9) / 10 = 9 / 10 = 0.9``.

These are just a few examples of how the sum of the first 10 natural numbers can be expressed. The actual value is $45$.

Ground Truth 
 The sum of the first 10 natural numbers is 1+2+3+4+5+6+7+8+9+10 = 55. It can be a

In [37]:
result = generate_response(crypto_dataset[26]['question'],ft_model)
print(f"Input:\n{crypto_dataset[26]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[26]['answer']}")

Input:
Determine if the expression \(3 \cdot 3262 + 2\) is divisible by 3.

Generated Response 
 
 3 is a factor of both 3 and 2. However, 2 is not a factor of either 3 or 12. Therefore, the expression is not divisible by 3. The Final Verdict is Non-divisible.

OR

Let's break it down step by step:

1. Is 3 a factor of 3262? No, 3262 is not divisible by 3.
2. Is 3 a factor of 2? Yes, 3 is a factor of 2.
3. Is 2 a factor of either 3 or 12? No, 2 is not a factor of either 3 or 12.
4. Therefore, the expression is not divisible by 3.

In conclusion, the expression \(3 \cdot 3262 + 2\) is not divisible by 3. It does not meet the criteria for divisibility, and therefore we can say that it is Non-divisible.

Ground Truth 
 The expression \(3 \cdot 3262 + 2\) is not divisible by 3 as 3 \cdot 3262 = 9786 is divisible by 3 but 2 is not divisible by 3.


In [38]:
result = generate_response(crypto_dataset[30]['question'],ft_model)
print(f"Input:\n{crypto_dataset[30]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[30]['answer']}")

Input:
Is 0.333333..... a rational number?

Generated Response 
 
 0.333333..... is not a rational number. A rational number is an integer or a number that can be expressed as the ratio of two integers. In this case, 0.333333..... cannot be expressed as the ratio of two integers, making it an irrational number. Irrational numbers have infinite decimal expansions and cannot be written as a simple fraction.

One reason why 0.333333..... is not a rational number is that it has an infinite number of digits after the decimal point. This is because the decimal expansion of a number is infinite when the number cannot be expressed as the ratio of two integers. Another reason is that 0.333333..... does not have a terminating decimal expansion, meaning that it does not repeat itself after a certain number of digits.

While 0.333333..... may seem like a rational number at first glance, it is important to look closely at its properties to determine whether it is rational or irrational.

Ground Tru

In [39]:
result = generate_response(crypto_dataset[32]['question'],ft_model)
print(f"Input:\n{crypto_dataset[32]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[32]['answer']}")

Input:
Is 3493 a prime ?

Generated Response 
 
 3493 is a composite number. It is divisible by 3 and 1133, making it a prime. Therefore, it is not a prime.

Contrary to popular belief, the number 1 is not a prime. It is a special number that serves as the identity element for addition and the empty set for functions.

It is important to note that the classification of a number as prime or composite is not solely determined by its divisors. A number can have only two divisors and still be considered prime if it meets certain criteria. For example, 2 is considered a prime number even though it has only two divisors: 1 and 2. This is because 2 cannot be expressed as the product of two smaller integers, as it has a unique property known as syllabus that makes it an irreducible number.

In conclusion, 3493 is not a prime number. It is a composite number with multiple divisors, including 3 and 1133. While 1 is not a prime number, it serves an important role in mathematics as the identity el

In [40]:
result = generate_response(crypto_dataset[41]['question'],ft_model)
print(f"Input:\n{crypto_dataset[41]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[41]['answer']}")

Input:
What is the number of arrangements in a team of 10 , if there are 4 players and 6 substitutes?

Generated Response 
 
 4 players by themselves form $4！10=40$ teams. 6 substitutes can be chosen from the remaining 4 players, for a total of $6！4=24$ teams. 4 players and 6 substitutes together form $4！6×5=24√10$ teams.  Overall, there are $40+24+24√10=1288$ teams.

Note: This calculation overlooks the possibility of some players being both starters and subs. To account for this, you could either (a) disqualify any player who appears in both the starter and sub list, or (b) assume that each player has a fixed "role" (starter or sub) and that each team is composed of one starter and one or zero subs. The first approach would result in a lower estimate of the number of teams, while the second approach would result in a higher estimate.

Ground Truth 
 The number of arrangements in a team of 10, if there are 4 players and 6 substitutes is 10!/(4! * 6!) = 210.
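
The ground-truth count is the binomial coefficient 10!/(4! · 6!), which can be checked with the standard library:

```python
import math

# Choose which 4 of the 10 team members are players; the rest are substitutes
by_comb = math.comb(10, 4)
by_factorials = math.factorial(10) // (math.factorial(4) * math.factorial(6))

print(by_comb, by_factorials)  # 210 210
```

Neither the base model's answer (2) nor the fine-tuned model's answer (1288) matches this value.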


In [41]:
result = generate_response(crypto_dataset[39]['question'],ft_model)
print(f"Input:\n{crypto_dataset[39]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[39]['answer']}")

Input:
A die is rolled twice. Let A be the event rolling a 6 on the first roll and let B be the event rolling a 3 on the second roll. Are A and B independent events?

Generated Response 
 
  Yes, A and B are independent events. This is because the event of rolling a 6 on the first roll does not affect the probability of rolling a 3 on the second roll, and vice versa. The probability of each event occurring on its own is 1/6 for A and 1/8 for B. Independence is a fundamental assumption in probability theory, which assumes that the occurrence of one event does not affect the probability of another event. In this case, rolling a 6 on the first die does not affect the probability of rolling a 3 on the second die, and vice versa. Therefore, A and B are independent events.

Ground Truth 
 Yes, A and B are independent events as the outcome of the first roll does not affect the outcome of the second roll.


## Testing Cryptographic Conceptual capabilities

### Testing the base model

In [44]:
SYSTEM_PROMPT = "You are a cryptographic expert with very strong conceptual understanding of the domain. You can easily solve medium to higher difficulty cryptographich problems as well. Please answer clearly and precisely as if you are teaching graduate students."

In [45]:
def generate_response(prompt,test_model):
    # Ignore warnings
    logging.set_verbosity(logging.CRITICAL)

    # Set the pipeline for inference
    pipe = pipeline(
        "text-generation", 
        model=test_model, 
        tokenizer = tokenizer, 
        torch_dtype=torch.bfloat16, 
        device_map= {"": 0}
    )
    # Generate a response using the Llama-2 chat prompt format and return it
    result = pipe(f"<s>[INST] <<SYS>>\n{SYSTEM_PROMPT}\n<</SYS>>\n\n{prompt}[/INST]\n ",do_sample=True,
        max_new_tokens=512, 
        temperature=0.9, 
        top_k=50, 
        top_p=0.95,
        num_return_sequences=1)
    return result

In [46]:
result = generate_response(crypto_dataset[2]['question'],base_model)
print(f"Input:\n{crypto_dataset[2]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[2]['answer']}")

Input:
Differentiate between confusion and diffusion in cryptography.

Generated Response 
 
  Great, let's dive into the details of confusion and diffusion in cryptography!

In cryptography, confusion and diffusion are two related but distinct concepts that serve different purposes.

Confusion is a cryptographic technique that aims to mix or confuse the plaintext (original message) in some way, making it more difficult for an attacker to decipher the original message. Confusion can be achieved through various means, such as:

1. Transposition: Transposing the letters of the plaintext, e.g., swapping "b" and "d", or "a" and "c".
2. Permutation: Rearranging the letters of the plaintext, e.g., rearranging the letters of "cat" to spell "act".
3. Substitution: Replacing some letters of the plaintext with others, e.g., replacing "a" with "e".
4. Compression: Compressing the plaintext into a shorter form, e.g., "hello" can be compressed to "hln".

Diffusion, on the other hand, is a technique

In [47]:
result = generate_response(crypto_dataset[8]['question'],base_model)
print(f"Input:\n{crypto_dataset[8]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[8]['answer']}")

Input:
What are hash functions useful for? Give an example.

Generated Response 
 
  Great! I'm glad to hear that you're interested in learning about the utility of hash functions. A hash function is a mathematical function that maps input data of variable length to a fixed-length output known as a "hash value." Hash functions are incredibly useful in a wide range of applications, and I'll provide you with a simple example to illustrate their utility.

Example: Hash Functions in Password Storage

Suppose you're designing a secure password storage system. One of the primary security concerns is preventing unauthorized access to user passwords. One way to address this issue is to use a one-way hash function to transform each password into a fixed-length hash value. Here's how it works:

1. The user enters their password as the input to the hash function.
2. The hash function processes the input and generates a fixed-length hash value as the output.
3. The hash value is stored in the pass

In [48]:
result = generate_response(crypto_dataset[10]['question'],base_model)
print(f"Input:\n{crypto_dataset[10]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[10]['answer']}")

Input:
What is one way function in cryptography?

Generated Response 
 
  Great, I'm glad you're interested in learning about one-way functions in cryptography!

A one-way function, also known as a one-way cryptographic function, is a mathematical function that maps an input to an output in such a way that it is computationally infeasible to reverse-engineer or infer the original input from the output. In other words, given a transformed output, it is extremely difficult or impossible to determine the original input that was used to produce it.

The importance of one-way functions in cryptography lies in their ability to provide security properties that are essential for many cryptographic primitives, such as encryption, digital signatures, and message authentication. These properties include:

1. Computational difficulty: It should be computationally infeasible to reverse-engineer or invert the function, making it difficult for an attacker to determine the original input.
2. Randomnes

In [49]:
result = generate_response(crypto_dataset[15]['question'],base_model)
print(f"Input:\n{crypto_dataset[15]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[15]['answer']}")

Input:
What are the drawbacks of symmetric encryption?

Generated Response 
 
  As a cryptographic expert, I must say that symmetric encryption is a widely used and effective method of encrypting data, but it does have some drawbacks. Here are some of the key limitations and challenges associated with symmetric encryption:

1. Key Management: Symmetric encryption relies on a secret key, which must be securely stored, distributed, and managed. This can be a challenge, especially in large-scale deployments where the key must be shared and synchronized across multiple parties.
2. Key Size: The strength of a symmetric encryption algorithm is determined by the size of the key used. In general, larger keys provide better security, but larger keys also increase the computational overhead required to encrypt and decrypt data.
3. Collision Attacks: Symmetric encryption algorithms are vulnerable to collision attacks, which occur when two different inputs produce the same output ciphertext. While

### Testing the fine tuned model

In [52]:
result = generate_response(crypto_dataset[2]['question'],ft_model)
print(f"Input:\n{crypto_dataset[2]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[2]['answer']}")

Input:
Differentiate between confusion and diffusion in cryptography.

Generated Response 
 
  Confusion and diffusion are two related but distinct concepts in cryptography. Confusion refers to the process of breaking apart plaintext into smaller, more manageable pieces in order to make it more difficult for an attacker to decipher the message. Diffusion, on the other hand, refers to the process of spreading out the plaintext throughout the ciphertext to make it more resistant to attack. While confusion is focused on protecting the individual pieces of plaintext, diffusion is focused on protecting the overall message. In summary, confusion is about breaking apart the message, while diffusion is about spreading it out.

Diffusion is a more advanced technique than confusion, as it requires a deeper understanding of cryptography and its goals. Diffusion is often used in conjunction with other cryptographic techniques, such as encryption and integrity, to provide a comprehensive security s

In [56]:
result = generate_response(crypto_dataset[8]['question'],ft_model)
print(f"Input:\n{crypto_dataset[8]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[8]['answer']}")

Input:
What are hash functions useful for? Give an example.

Generated Response 
 
  * Hash functions are useful for creating a quick and efficient way of looking up information in a database. For instance, instead of searching through a large list of names to find the one you're looking for, you can use a hash function to map the names to unique numerical identifiers. This allows you to quickly find the desired name in the list using the associated numerical ID.
* Another use case for hash functions is in digital signatures. A hash function can be used to create a digital signature by taking a message and computing a fixed-size hash value that represents the message. The signature is then created by taking the original message and appending the hash value to it. When the recipient receives the message, they can use the same hash function to compute the hash value based on the original message and compare it to the hash value that was appended to the message. If the two values match, t

In [53]:
result = generate_response(crypto_dataset[15]['question'],ft_model)
print(f"Input:\n{crypto_dataset[15]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[15]['answer']}")

Input:
What are the drawbacks of symmetric encryption?

Generated Response 
 
 1. Dependence on a secure key: The security of symmetric encryption relies heavily on the secrecy of the key used. If the key falls into the wrong hands, the entire system is compromised. 2. Limited key lifespan: As new threats and technologies emerge, the need for updated keys increases. This creates a burden on organizations to constantly refresh their keys, leading to a shorter key lifespan. 3. Exposure to insider threats: Symmetric encryption relies on the honesty and integrity of those who handle the key. Insider threats, such as employees deliberately exploiting vulnerabilities, can compromise the system. 4. Difficulty in scaling: As organizations grow, managing and coordinating key updates across all departments and teams can become challenging. This can lead to key fragmentation, where different parts of the organization use different keys, creating a security risk. 5. Lack of flexibility: Once a key

In [57]:
result = generate_response(crypto_dataset[10]['question'],ft_model)
print(f"Input:\n{crypto_dataset[10]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split("[/INST]")[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[10]['answer']}")

Input:
What is one way function in cryptography?

Generated Response 
 
  A one-way function is a cryptographic concept that refers to a mathematical function that can be inverted, but not the other way around. In other words, it is possible to map a given input to a unique output using a one-way function, but it is not possible to determine the original input from the output alone. Some common examples of one-way functions include the absolute value, the logarithm, and the square root.

The importance of one-way functions in cryptography lies in their ability to provide a barrier against reverse engineering. By making it impossible or impractical to determine the original input from the output, one-way functions can help to prevent an attacker from discovering the true meaning of the cryptographic messages. This, in turn, can increase the security of the system and make it more resistant to reverse engineering attacks.

One-way functions are often used in combination with other crypto
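For reference when reading the response above: the usual informal definition of a one-way function is one that is easy to compute but infeasible to invert, which is the reverse of what the generated text states. The sketch below uses SHA-256 from Python's standard `hashlib` as a commonly assumed example (no function has been proven one-way). The helper names `sha256_hex` and `brute_force_preimage` are hypothetical and only for illustration.

```python
import hashlib
from typing import Iterable, Optional


def sha256_hex(message: bytes) -> str:
    """Forward direction: cheap to compute for any input."""
    return hashlib.sha256(message).hexdigest()


def brute_force_preimage(target_hex: str, candidates: Iterable[bytes]) -> Optional[bytes]:
    """Inverse direction: no known shortcut, so we can only try candidates one by one."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_hex:
            return candidate
    return None


digest = sha256_hex(b"abc")
print(digest)

# Recovering the input works only because the candidate list is tiny and contains it.
print(brute_force_preimage(digest, [b"a", b"ab", b"abc"]))
```

The search succeeds here only because the input is among three guesses; for an unknown input of realistic length, the candidate space is far too large to enumerate.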

## Cipher Math problems

In [58]:
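# Query the base model; keep only the text after the final [/INST] tag.
# Single quotes inside the f-string keep this valid on Python versions before 3.12.
result = generate_response(crypto_dataset[45]['question'], base_model)
print(f"Input:\n{crypto_dataset[45]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split('[/INST]')[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[45]['answer']}")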

Input:
What transformation occurs when “CHEESE” is shifted by B in the Caesar cipher?

Generated Response 
 
  Great, let's explore the Caesar cipher together! 🔒

The Caesar cipher is a simple encryption technique that replaces each letter with a letter a fixed number of positions down the alphabet. When we apply the shift to "CHEESE", we need to move each letter by a certain number of positions down the alphabet. 

For "CHEESE", if we shift by 3 letters down the alphabet, the transformed ciphertext would be:

C -> D
H -> G
E -> F
E -> G
S -> T

So, the transformation that occurs when "CHEESE" is shifted by B in the Caesar cipher is "DGFTS". 🔒

Now, let's try to encrypt another message using the same shift. How about "GRATE"? 😃

Ground Truth 
 When “CHEESE” is shifted by B in the Caesar cipher, the transformation is “EJGGUG”.


In [69]:
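# Same question, now with the fine-tuned model.
# Single quotes inside the f-string keep this valid on Python versions before 3.12.
result = generate_response(crypto_dataset[45]['question'], ft_model)
print(f"Input:\n{crypto_dataset[45]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split('[/INST]')[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[45]['answer']}")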

Input:
What transformation occurs when “CHEESE” is shifted by B in the Caesar cipher?

Generated Response 
 
  CHEESE becomes EEHCOORSS when shifted by B in the Caesar cipher. This is because B is a shift of 2, and when you shift a string by 2, you move every other character forward 1 position. So, the first and third characters (“C” and “S”) move forward 1 position, while the second and fourth characters (“H” and “E”) move backward 1 position. The fifth and seventh characters (“E” and “R”) move forward 1 position, and the sixth character (“G”) remains in the same position. Finally, the last character (“S”) moves forward 1 position, and the entire string is shifted by B.

Ground Truth 
 When “CHEESE” is shifted by B in the Caesar cipher, the transformation is “EJGGUG”.
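
Neither model reproduces the ground truth for this question. As a sanity check, the shift can be computed directly. This is a minimal sketch, assuming the dataset maps the key letter B to a shift of 2 (A=1, B=2), which is the convention that reproduces the ground truth; under the other common convention (A=0, B=1) the result would differ. The function name `caesar_shift` is hypothetical.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, wrapping around the alphabet."""
    shifted = []
    for ch in text:
        if ch in string.ascii_uppercase:
            shifted.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            shifted.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            # Leave digits, spaces, and punctuation unchanged.
            shifted.append(ch)
    return "".join(shifted)


# Assumption: key letter B means a shift of 2 (A=1, B=2), matching the ground truth.
print(caesar_shift("CHEESE", 2))  # EJGGUG
# With the A=0, B=1 convention the same key letter gives a shift of 1.
print(caesar_shift("CHEESE", 1))  # DIFFTF
```

Decryption is the same function with the negated shift, for example `caesar_shift("EJGGUG", -2)`.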


In [70]:
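# Query the fine-tuned model; keep only the text after the final [/INST] tag.
# Single quotes inside the f-string keep this valid on Python versions before 3.12.
result = generate_response(crypto_dataset[49]['question'], ft_model)
print(f"Input:\n{crypto_dataset[49]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split('[/INST]')[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[49]['answer']}")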

Input:
What is the process for decrypting the message 'yurgooaeod' using a rail fence with a depth of 2?

Generated Response 
 
 1. Columns are created by splitting the message into 24-character blocks. Each block is marked with a number from 0 to 23.
 2. The blocks are then shifted one position to the right.
 3. The entire message is then duplicated and the duplicated message is shifted one position to the left.
 4. The two messages are then compared character by character. If a mismatch is found, the entire process is repeated until a correct match is found.
 5. Once the messages match, the original message is retrieved by moving the block marks left and the duplicated blocks right.
The resulting message is: 'yurgooaeodyurgooaeod'
This process ensures that the original message is restored, while any unauthorized tampering is evident from the differences in the two messages.

Ground Truth 
  The process for decrypting the message 'yurgooaeod' using a rail fence with a depth of 2 is 'y
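
The fine-tuned model's steps above do not describe a rail fence cipher. For comparison, here is a minimal sketch of rail fence decryption, assuming the standard zigzag layout in which the ciphertext is the rails read top to bottom. The function name `rail_fence_decrypt` is hypothetical, and the dataset's own convention is not shown in this notebook.

```python
def rail_fence_decrypt(ciphertext: str, depth: int) -> str:
    """Decrypt a rail fence ciphertext written in a zigzag over `depth` rails."""
    if depth < 2 or depth >= len(ciphertext):
        # With one rail, or more rails than characters, the text is unchanged.
        return ciphertext

    # Rail index visited at each plaintext position: 0, 1, ..., depth-1, depth-2, ..., 1, repeat.
    cycle = list(range(depth)) + list(range(depth - 2, 0, -1))
    pattern = [cycle[i % len(cycle)] for i in range(len(ciphertext))]

    # The ciphertext is the rails concatenated, so slice it back into one chunk per rail.
    rails = []
    start = 0
    for rail in range(depth):
        count = pattern.count(rail)
        rails.append(iter(ciphertext[start:start + count]))
        start += count

    # Walk the zigzag again, taking the next unread character from each rail in turn.
    return "".join(next(rails[rail]) for rail in pattern)


print(rail_fence_decrypt("yurgooaeod", 2))  # youaregood
```

With a depth of 2 this reduces to splitting the ciphertext into halves (`yurgo` and `oaeod`) and interleaving them character by character.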

In [72]:
# Same question, now with the base model.
# Single quotes inside the f-string keep this valid on Python versions before 3.12.
result = generate_response(crypto_dataset[49]['question'], base_model)
print(f"Input:\n{crypto_dataset[49]['question']}\n")
print(f"Generated Response \n {result[0]['generated_text'].split('[/INST]')[-1]}\n")
print(f"Ground Truth \n {crypto_dataset[49]['answer']}")

Input:
What is the process for decrypting the message 'yurgooaeod' using a rail fence with a depth of 2?

Generated Response 
 
  Decrypting a message using a rail fence encryption scheme with a depth of 2 involves the following steps:

Step 1: Determine the number of rounds in the encryption scheme
In this case, the message 'yurgooaeod' is encrypted with a rail fence depth of 2, which means there are 2 rounds of encryption.

Step 2: Identify the key schedule used in the encryption scheme
The key schedule is the pattern of keys used to encrypt and decrypt the message. For a rail fence encryption scheme with a depth of 2, the key schedule is:

Key 1: E(K_1, m) = y
Key 2: E(K_2, E(K_1, m)) = o
Key 3: E(K_3, E(K_2, E(K_1, m))) = e
Key 4: E(K_4, E(K_3, E(K_2, E(K_1, m)))) = d

Step 3: Apply the key schedule to the plaintext message m
Using the key schedule, apply the encryption function E to each round of the rail fence, starting with the first round and moving left to right. The final cip