# Using an LLM (CodeLlama-7b-Instruct-hf) to find errors in code, assign a completion score, and infer students' thought process
## Reasons to use CodeLlama-7b-Instruct-hf:
## •Of the 7b, 13b, 34b, and 70b models, the 70b model would ideally give the most accurate results, while the 34b model gives the best mix of speed and accuracy
## •We use the 7b model due to **computational resource limits** (20 GB GPU VRAM limit)
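
As a rough check on the VRAM constraint, the float16 weight footprint can be estimated at about 2 bytes per parameter. The sketch below assumes the nominal parameter counts (7, 13, 34, and 70 billion) and ignores activations and the KV cache, so real usage will be somewhat higher. The function and constant names are illustrative only.

```python
# Rough float16 weight footprint per model size (weights only).
# Assumptions: nominal parameter counts, 2 bytes per parameter,
# no allowance for activations or the KV cache.
BYTES_PER_PARAM_FP16 = 2
VRAM_LIMIT_GB = 20  # limit stated above


def fp16_weight_gb(params_billion: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM_FP16


def fits_in_vram(params_billion: float, limit_gb: float = VRAM_LIMIT_GB) -> bool:
    """True if the weights alone fit within the VRAM limit."""
    return fp16_weight_gb(params_billion) <= limit_gb


for size in (7, 13, 34, 70):
    print(f"{size}b: ~{fp16_weight_gb(size):.0f} GB, "
          f"fits in {VRAM_LIMIT_GB} GB: {fits_in_vram(size)}")
```

Under these assumptions only the 7b model (about 14 GB of weights) fits within 20 GB without quantization or offloading.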

In [1]:
import transformers
import torch

model = "codellama/CodeLlama-7b-Instruct-hf"

In [2]:
from transformers import pipeline

# Initialize the text-generation pipeline with the chosen model
pipe = pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)




# THEME/QUESTION GUESSING (to give the LLM an idea of how to evaluate student code)
## •Since the questions corresponding to the solutions are not available, I am using the model to guess the question from the solution and then using that guess for evaluation
## •If the questions were made available (they are already present in a database), the report would be more robust

In [3]:


# Defining the prompt (student solution)
prompt = '''<s>[INST]
{{ from pythonds.basic import Stack
   
   def infixToPostfix(infixexpr):
       prec = {}
       prec["*"] = 3
       prec["/"] = 3
       prec["+"] = 2
       prec["-"] = 2
       prec["("] = 1
       opStack = Stack()
       postfixList = []
       tokenList = infixexpr.split()
   
       for token in tokenList:
           if token in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" or token in "0123456789":
               postfixList.append(token)
           elif token == '(':
               opStack.push(token)
           elif token == ')':
               topToken = opStack.pop()
               while topToken != '(':
                   postfixList.append(topToken)
                   topToken = opStack.pop()
           else:
               while (not opStack.isEmpty()) and \
                  (prec[opStack.peek()] >= prec[token]):
                     postfixList.append(opStack.pop())
               opStack.push(token)
   
       while not opStack.isEmpty():
           postfixList.append(opStack.pop())
       return " ".join(postfixList)
   
   print(infixToPostfix("A * B + C * D"))
   print(infixToPostfix("( A + B ) * C - ( D - E ) * ( F + G )"))
   }
   
   find the question this code is trying to solve:
   
}} [/INST]'''

# Generating text/code based on the solution prompt
result = pipe(prompt, max_length=1000, return_full_text=False)

# Extracting and printing the generated question
theme = result[0]['generated_text']
print(theme)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


  This code is trying to solve the problem of converting an infix expression to a postfix expression.

Infix expressions are written with operators between the operands, such as "A + B * C". The postfix expression is written with the operands followed by the operators, such as "AB\*C+".

The code defines a function `infixToPostfix` that takes an infix expression as input and returns the corresponding postfix expression. It does this by using a stack to keep track of the operators and their precedence. The function first splits the input expression into a list of tokens, and then iterates through the tokens, processing each one as follows:

* If the token is an operand (either a letter or a number), it is added to the postfix list.
* If the token is an opening parenthesis, it is pushed onto the stack.
* If the token is a closing parenthesis, it is popped from the stack and added to the postfix list until the opening parenthesis is reached.
* If the token is an operator, it is compared t
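
The same question-guessing prompt can be built by a small helper so that any student solution can be passed in. This is a sketch, not the notebook's code: `build_question_prompt` and `GEN_KWARGS` are hypothetical names, and the `[INST] ... [/INST]` wrapper follows the Llama 2 style instruction format that Code Llama Instruct models are documented to use. The `<s>` token is left out on the assumption that the tokenizer adds it. Because `max_length` counts prompt tokens as well as generated ones, a long solution leaves less room for the reply, which may be why the reply above stops mid-sentence; `max_new_tokens` bounds only the reply, and the value 512 is an arbitrary choice.

```python
def build_question_prompt(solution: str) -> str:
    """Wrap a student solution in an instruction asking for the underlying question."""
    return (
        "[INST] Find the question this code is trying to solve:\n\n"
        f"{solution.strip()}\n [/INST]"
    )


# Hypothetical generation settings: cap only the generated tokens.
GEN_KWARGS = {"max_new_tokens": 512, "return_full_text": False}

solution = "def add(a, b):\n    return a + b"
prompt = build_question_prompt(solution)
print(prompt)

# With the pipeline defined earlier (not executed here):
# theme = pipe(prompt, **GEN_KWARGS)[0]["generated_text"]
```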

# ERROR FINDING ON THE BASIS OF THE QUESTION

## •We evaluate the code and the student's performance
## •For now, we let the model write the report in its own way, since no report format has been specified
## •Ideally, once a format is decided on, we can fine-tune the LLM to produce output in that format
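
Until a report format is fixed, the score has to be recovered from free-form text. As an illustration of what a lightweight post-processing step could look like, the sketch below pulls a score out of a report with a regular expression. The function name `extract_score` and the phrasings it recognizes (`85/100`, `85 out of 100`) are assumptions, not something the model is guaranteed to produce.

```python
import re
from typing import Optional

# Matches "85/100" or "85 out of 100" (assumed phrasings).
_SCORE_RE = re.compile(r"\b(\d{1,3})\s*(?:/|out of)\s*100\b", re.IGNORECASE)


def extract_score(report: str) -> Optional[int]:
    """Return the first score in [0, 100] found in the report, or None."""
    for match in _SCORE_RE.finditer(report):
        score = int(match.group(1))
        if 0 <= score <= 100:
            return score
    return None


print(extract_score("Overall, I would give the student 70 out of 100."))
print(extract_score("No score was given."))
```

Returning `None` when no score is found keeps a missing score distinguishable from a score of zero.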

In [4]:
# Defining the prompt with the guessed question for evaluation
prompt = '''<s><<SYS>>Help find any issues with the students thinking process<</SYS>>[INST] given question :'''+theme+'''

Student's Solution:

{{ from pythonds.basic import Stack
   
   def infixToPostfix(infixexpr):
       prec = {}
       prec["*"] = 3
       prec["/"] = 3
       prec["+"] = 2
       prec["-"] = 2
       prec["("] = 1
       opStack = Stack()
       postfixList = []
       tokenList = infixexpr.split()
   
       for token in tokenList:
           if token in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" or token in "0123456789":
               postfixList.append(token)
           elif token == '(':
               opStack.push(token)
           elif token == ')':
               topToken = opStack.pop()
               while topToken != '(':
                   postfixList.append(topToken)
                   topToken = opStack.pop()
           else:
               while (not opStack.isEmpty()) and \
                  (prec[opStack.peek()] >= prec[token]):
                     postfixList.append(opStack.pop())
               opStack.push(token)
   
       while not opStack.isEmpty():
           postfixList.append(opStack.pop())
       return " ".join(postfixList)
   
   print(infixToPostfix("A * B + C * D"))
   print(infixToPostfix("( A + B ) * C - ( D - E ) * ( F + G )"))
   }
   
   explain errors in the Student's code trying to solve the given question and give him a score out of 100. I dont need corrected code:}} [/INST]'''

# Generating the evaluation report from the prompt
result = pipe(prompt, max_length=3000, return_full_text=False)

# Extracting and printing the generated report
generated_code = result[0]['generated_text']
print(generated_code)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


  The student's code has several errors that need to be addressed to make it work correctly. Here's a breakdown of the errors and their impact on the code:

1. Syntax errors:
	* The code has several syntax errors, such as missing colons, missing parentheses, and incorrect indentation. These errors can make the code difficult to read and understand, and can lead to unexpected behavior.
2. Precedence table:
	* The student's code defines a precedence table, but it is not used correctly. The table is defined as a dictionary with the operators as keys and their precedence as values. However, the code does not use this table to determine the precedence of the operators. Instead, it uses a hard-coded approach to determine the precedence of the operators.
3. Operator stack:
	* The student's code uses a stack to keep track of the operators and their precedence. However, the stack is not properly initialized. The `opStack` variable is defined as a `Stack` object, but it is not initialized with a
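
Claims in a generated report, such as the syntax errors listed above, can be checked by actually running the submission. The sketch below re-creates the student's function with a plain Python list standing in for `pythonds.basic.Stack` (an assumption made so the example has no external dependency; the snake_case names are also mine) and runs it on the submission's own test expressions.

```python
def infix_to_postfix(infixexpr: str) -> str:
    """Student's algorithm, with a list used as the operator stack."""
    prec = {"*": 3, "/": 3, "+": 2, "-": 2, "(": 1}
    op_stack = []  # stands in for pythonds.basic.Stack
    postfix = []

    for token in infixexpr.split():
        if token in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" or token in "0123456789":
            postfix.append(token)  # operand goes straight to the output
        elif token == "(":
            op_stack.append(token)
        elif token == ")":
            # Pop operators until the matching "(" is found
            top = op_stack.pop()
            while top != "(":
                postfix.append(top)
                top = op_stack.pop()
        else:
            # Pop operators of higher or equal precedence first
            while op_stack and prec[op_stack[-1]] >= prec[token]:
                postfix.append(op_stack.pop())
            op_stack.append(token)

    # Flush any remaining operators
    while op_stack:
        postfix.append(op_stack.pop())
    return " ".join(postfix)


print(infix_to_postfix("A * B + C * D"))
print(infix_to_postfix("( A + B ) * C - ( D - E ) * ( F + G )"))
```

The two calls yield `A B * C D * +` and `A B + C * D E - F G + * -`. A submission that parses and returns the expected postfix strings is evidence against the syntax-error claim, so a report like the one above should be cross-checked before a score is assigned.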