<center><h1>Code Generator Tool</h1></center>
<hr><hr><hr>
<ul>
    <li>This notebook uses the LangChain library and OpenAI models (through Azure OpenAI) to build an AI-driven code generation tool. It takes a natural-language prompt from the user describing what the code should do, and can target any programming language.</li>
    <li>The output is meant to be pure code in the requested programming language, without any description or messages.</li>
</ul>

In [1]:
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
import os

azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
api_key = os.getenv("AZURE_OPENAI_KEY")
api_version = "2023-05-15"

# For working with AzureOpenAI, in place of model, the deployment name is used
deployment_name = os.getenv("DEPLOYMENT_NAME")

In [3]:
os.environ["OPENAI_API_TYPE"]     = "azure"
os.environ["OPENAI_API_VERSION"]  = api_version
os.environ["OPENAI_API_KEY"]      = api_key
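
Because `os.environ` only accepts string values, the assignment of `api_key` above raises a `TypeError` if `AZURE_OPENAI_KEY` is missing from the `.env` file. A minimal sketch of an up-front check is shown here; the helper name `missing_env_vars` is hypothetical and not part of the notebook.

```python
import os

# Variable names read by this notebook's configuration cells.
REQUIRED_VARS = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_KEY", "DEPLOYMENT_NAME")

def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in the given mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this before the configuration cells makes a missing key show up as a readable message rather than a traceback.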

In [4]:
from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
    openai_api_version=api_version,
    azure_deployment=deployment_name,
)

In [None]:
question = "Write a code in Python 3 that can print all the prime numbers between 1 to 100."

print( llm.invoke( question ) )

In [5]:
print("""def is_prime(n):\n    if n <= 1:\n        return False\n    for i in range(2, int(n**0.5) + 1):\n        if n % i == 0:\n            return False\n    return True\n\nfor num in range(1, 101):\n    if is_prime(num):\n        print(num)\n""")

def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

for num in range(1, 101):
    if is_prime(num):
        print(num)



Some parts that can be added to the prompt:
- Evaluate the output of the generated program against the description: compare the desired output with the program's actual output and, if they do not match, regenerate the code. Once they match, return the final version of the code in the response.
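
The evaluate-and-retry idea above can be sketched as a small loop outside the prompt. This is only an illustration under stated assumptions: `run_and_capture`, `generate_until_match`, and `fake_generate` are hypothetical names, `fake_generate` stands in for a call to the LLM chain, and the generated code is assumed to be trusted Python that prints its result (running untrusted model output with `exec` is unsafe).

```python
import contextlib
import io

def run_and_capture(code):
    """Execute Python source and return everything it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})  # only safe for trusted code
    return buffer.getvalue()

def generate_until_match(generate, description, expected_output, max_attempts=3):
    """Call generate(description) until the code's printed output matches expected_output."""
    for attempt in range(1, max_attempts + 1):
        code = generate(description)
        try:
            output = run_and_capture(code)
        except Exception:
            continue  # code that crashes counts as a failed attempt
        if output.strip() == expected_output.strip():
            return code, attempt
    return None, max_attempts

# Stand-in for the LLM: the first candidate is wrong, the second is right.
candidates = iter(["print(2 + 3)", "print(2 * 3)"])

def fake_generate(description):
    return next(candidates)

code, attempts = generate_until_match(fake_generate, "multiply 2 by 3", "6")
print(code, attempts)
```

In this sketch the first candidate prints `5`, so it is rejected, and the second candidate is returned on attempt 2.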

In [5]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

code_gen_prompt = ChatPromptTemplate.from_template("""
Based on the below description given to you, write a code that can perform exactly as the code description mentions. The code created from the description should be optimised and Time complexity efficient.

<description>
{input}
</description>

The code created should strictly follow the instructions mentioned below:
**Instructions**
1. Your output should contain only the code, and not any type of introduction or description at the beginning or at the end.
2. If any Programming language is mentioned in the description, create the code specific to that language strictly, else create the code for same description that runs on Python 3.
""")


output_parser = StrOutputParser()

In [12]:
code_gen_chain = code_gen_prompt | llm | output_parser

In [7]:
code_description = input("Enter description of the code that you want:\n")

response = code_gen_chain.invoke( {"input": code_description} )

Enter description of the code that you want:
 def bubble_sort(array):


Sample inputs:
- Write a program in Python, to print 35 terms of the fibonacci series.
- Write a Python program to print the multiplication table of a number till 20.
- Write a recursive Python program, to calculate the factorial of a number.
- Write a program to calculate HCF and LCM of 3 numbers.
- def merge_sort(array):
- def bubble_sort(array):

In [8]:
print( response )

```python
def bubble_sort(array):
    n = len(array)
    for i in range(n):
        # Flag to check if any swap is made in the current iteration
        swap_made = False
        for j in range(0, n-i-1):
            if array[j] > array[j+1]:
                # Swap elements
                array[j], array[j+1] = array[j+1], array[j]
                # Set swap_made flag to True
                swap_made = True
        # If no swap is made in the current iteration, array is already sorted
        if not swap_made:
            break
    return array
```
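
Despite the "only the code" instruction in the prompt, the response above is wrapped in a Markdown code fence, so it cannot be executed as-is. A minimal sketch of a post-processing step that removes such a fence is shown here; `strip_code_fence` is a hypothetical helper, and it assumes at most one fenced block per response.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built here to avoid nesting fences

# Optional language tag after the opening fence, then the code, then the closing fence.
FENCED_BLOCK = re.compile(FENCE + r"[\w+-]*[ \t]*\n(.*?)\n?" + FENCE, re.DOTALL)

def strip_code_fence(response):
    """Return the code inside the first fenced block, or the stripped text if there is none."""
    match = FENCED_BLOCK.search(response)
    return match.group(1) if match else response.strip()

fenced = FENCE + "python\ndef add(a, b):\n    return a + b\n" + FENCE
print(strip_code_fence(fenced))
```

A response without a fence is returned unchanged apart from surrounding whitespace, so the helper can be applied to every response.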


In [9]:
def bubble_sort(array):
    n = len(array)
    for i in range(n):
        # Flag to check if any swap is made in the current iteration
        swap_made = False
        for j in range(0, n-i-1):
            if array[j] > array[j+1]:
                # Swap elements
                array[j], array[j+1] = array[j+1], array[j]
                # Set swap_made flag to True
                swap_made = True
        # If no swap is made in the current iteration, array is already sorted
        if not swap_made:
            break
    return array

In [10]:
ar = [12,15,25,19,-22, 76,44,31,23,89,1, 0, -4, 19, 67, 24]
ar2 = bubble_sort(ar)
print(ar2)

[-22, -4, 0, 1, 12, 15, 19, 19, 23, 24, 25, 31, 44, 67, 76, 89]


## Making the chain aware of the conversation history by keeping past messages (up to a fixed conversation length) in a `chat_history` list:
--------------------------------------------------------------------------------------------------------------------------------------------------------
- With this, the user can ask a follow-up question, because earlier outputs stay in the LLM's memory.
- We add manual logic so that, beyond a set conversation length, the oldest messages are removed from the chat history. This keeps the prompt within roughly 4,000 tokens, since GPT-3.5 can only process a context of about 4,000 tokens.
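
The trimming idea above can also be expressed against an explicit token budget rather than a fixed message count. The sketch below is an illustration only: messages are plain strings, `estimate_tokens` uses a rough four-characters-per-token heuristic instead of a real tokenizer, and both function names are hypothetical.

```python
def estimate_tokens(text):
    """Very rough token estimate: about 4 characters per token (heuristic, not a tokenizer)."""
    return max(1, len(text) // 4)

def trim_to_budget(messages, max_tokens):
    """Drop the oldest (user, ai) pair until the estimated total fits in max_tokens."""
    trimmed = list(messages)
    while trimmed and sum(estimate_tokens(m) for m in trimmed) > max_tokens:
        trimmed = trimmed[2:]  # remove the oldest user input and its reply together
    return trimmed

history = ["a" * 40, "b" * 40, "c" * 40, "d" * 40]  # two pairs, about 10 tokens per message
print(len(trim_to_budget(history, max_tokens=25)))
```

With a budget of 25 estimated tokens, only the most recent pair survives; a real implementation would count tokens with the tokenizer that matches the deployed model.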

In [20]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

code_gen_prompt_with_chat_history = ChatPromptTemplate.from_template("""
<|im_start|>system
You are an expert Computer Programmer, and you are proficient in all kinds of programming languages. Your task is to create a perfect code based on description provided to you.

The code created should strictly follow the instructions mentioned below:
**Instructions**
1. Your output should contain only the code, and not any type of introduction or description at the beginning or at the end.
2. If any Programming language is mentioned in the description, create the code specific to that language strictly, else create the code that runs on Python 3.
<|im_end|>

<|im_start|>user
<chat_history>
{chat_history}
</chat_history>

Based on the below description given to you, write a code that can perform exactly as the code description mentions. The code created from the description should be optimised and Time complexity efficient.
<description>
{input}
</description>
<|im_end|>
""")


output_parser = StrOutputParser()

In [21]:
code_gen_chain_history_aware = code_gen_prompt_with_chat_history | llm | output_parser

In [15]:
from langchain_core.messages import HumanMessage, AIMessage

- With each user input and each reply from the assistant, we append the user message and the AI assistant message to the `chat_history` list, in sequence.
- To keep the prompt from exceeding the token limit, `chat_history` is limited to the latest 4 exchanges (the last 4 user inputs and the last 4 AI messages). Whenever the list grows beyond 8 elements (4 user inputs + 4 AI messages), we remove the oldest message pair (1 input and 1 reply) from `chat_history`.
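
The fixed window described above can be sketched without any LangChain objects. In this illustration, plain strings stand in for `HumanMessage` and `AIMessage`, and `collections.deque` with `maxlen=8` discards the oldest entries automatically. This is an alternative to manual slicing, shown for comparison, not the notebook's own method.

```python
from collections import deque

MAX_PAIRS = 4
chat_history = deque(maxlen=2 * MAX_PAIRS)  # holds at most 4 user/AI pairs

for turn in range(1, 7):  # simulate six turns of conversation
    chat_history.append(f"user message {turn}")
    chat_history.append(f"ai reply {turn}")

print(list(chat_history)[:2])  # ['user message 3', 'ai reply 3']
```

After six turns only turns 3 to 6 remain, so the prompt never sees more than four exchanges.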

In [24]:
chat_history = []
while True:
    if len(chat_history) > 8:
        # Drop the oldest user input and AI message once more than 4 pairs are stored.
        chat_history = chat_history[2:]
    
    code_description = input("Code Description:\n")
    if code_description.lower() in ("quit", "close", "exit"):
        break
    
    response = code_gen_chain_history_aware.invoke( {"input": code_description, "chat_history": chat_history} )
    print("\nAI Code Generator:")
    print(response)

    chat_history.append( HumanMessage( content=code_description ) )
    chat_history.append( AIMessage( content=response ) )

    print("=====================================================================================================\n\n\n")

Code Description:
 quit


In [23]:
print(chat_history)

[HumanMessage(content='Add driver code to call the above function for numbers from 1 to 5'), AIMessage(content='for i in range(1, 6):\n    print(i)'), HumanMessage(content='Driver program should call "factorial" named function for all numbers between 1 to 5'), AIMessage(content='```python\ndef factorial(n):\n    if n == 0:\n        return 1\n    else:\n        return n * factorial(n-1)\n\nfor i in range(1, 6):\n    print(factorial(i))\n```'), HumanMessage(content='modify the driver program to change the numbers from 1 to 5 to 10 to 15'), AIMessage(content='for i in range(1, 6):\n    print(i * 5)')]


## Chat-aware code generation assistant that draws on the entire chat history by retrieving the relevant parts through vector search:
---------------------------------------------------------------------------------------------------------------------------------------------
- Here, no chat history messages are removed.
- As the chat history grows, the entire history is fed to FAISS to build a vector store. The parts relevant to the new request are then retrieved from it and passed to the `context` field of the prompt, so that the model's token limit is not exceeded.
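
The retrieval step described above can be illustrated without an embedding model. In the sketch below, a bag-of-words cosine similarity is a deliberately crude stand-in for the Azure OpenAI embeddings and the FAISS index; `bag_of_words`, `cosine`, `top_k_chunks`, and the sample chunks are all hypothetical.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lower-cased word counts; a toy replacement for a real embedding vector."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k_chunks(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    query_vec = bag_of_words(query)
    ranked = sorted(chunks, key=lambda chunk: cosine(query_vec, bag_of_words(chunk)), reverse=True)
    return ranked[:k]

history_chunks = [
    "def factorial(n): return 1 if n == 0 else n * factorial(n - 1)",
    "def bubble_sort(array): return sorted(array)",
]
context = "\n".join(top_k_chunks("add driver code that calls factorial", history_chunks))
print(context)
```

Only the factorial chunk shares a word with the query, so it alone is selected as context; the prompt size therefore depends on `k`, not on the length of the full history.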

In [25]:
from langchain_core.prompts import ChatPromptTemplate

code_gen_prompt_with_context = ChatPromptTemplate.from_template("""
<|im_start|>system
You are an expert Computer Programmer, and you are proficient in all kinds of programming languages. Your task is to create a perfect code based on description provided to you.

The code created should strictly follow the instructions mentioned below:
**Instructions**
1. Your output should contain only the code, and not any type of introduction or description at the beginning or at the end.
2. If any Programming language is mentioned in the description, create the code specific to that language strictly, else create the code that runs on Python 3.
<|im_end|>

<|im_start|>user
Below is the context that contains relevant parts of the conversation history. Use the context as an additional knowledge data for code generation, if the user input references any case from the context below.
<context>
{context}
</context>

Based on the below description given to you, write a code that can perform exactly as the code description mentions. The code created from the description should be optimised and Time complexity efficient.
<description>
{input}
</description>
<|im_end|>
""")

In [27]:
# Creating the embedding-model instance
from langchain_openai import AzureOpenAIEmbeddings

embeddings_model = AzureOpenAIEmbeddings(
    azure_deployment="text-embedding-ada-002",
    openai_api_version=api_version,
)

In [None]:
# Building the vector index and vectorstore
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter, Language

# code_splitter = RecursiveCharacterTextSplitter.from_language(language=Language.PYTHON)
# python_codes = code_splitter.split_text(chat_history)

In [None]:
# `python_codes` must be a list of text chunks, e.g. the output of code_splitter.split_text(...)
vector = FAISS.from_texts(python_codes, embeddings_model)

In [None]:
# This "retriever" will take user "input" and return the related data according to the question, which will be passed as "context" to the "document_chain"
# retriever = vector.as_retriever()

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompts import MessagesPlaceholder
from langchain.chains import create_history_aware_retriever

# First we need a prompt that we can pass into an LLM to generate this search query

retriever_prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    ("user", "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation")
])

# Obtained retriever in above case will be used here
# retriever = vector.as_retriever()

# history_aware_retriever = create_history_aware_retriever(llm, retriever, retriever_prompt)

In [None]:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_core.output_parsers import StrOutputParser


code_gen_prompt_with_context = ChatPromptTemplate.from_template("""
<|im_start|>system
You are an expert Computer Programmer, and you are proficient in all kinds of programming languages. Your task is to create a perfect code based on description provided to you.

The code created should strictly follow the instructions mentioned below:
**Instructions**
1. Your output should contain only the code, and not any type of introduction or description at the beginning or at the end.
2. If any Programming language is mentioned in the description, create the code specific to that language strictly, else create the code that runs on Python 3.
<|im_end|>

<|im_start|>user
Below is the context that contains relevant parts of the conversation history. Use the context as an additional knowledge data for code generation, if the user input references any case from the context below.
<context>
{context}
</context>

Based on the below description given to you, write a code that can perform exactly as the code description mentions. The code created from the description should be optimised and Time complexity efficient.
<description>
{input}
</description>
<|im_end|>
""")

output_parser = StrOutputParser()

code_gen_context_aware = code_gen_prompt_with_context | llm | output_parser

# history_aware_retrieval_chain = create_retrieval_chain(history_aware_retriever, code_gen_context_aware)

In [None]:
from langchain.text_splitter import Language  # needed by from_language below

chat_history = ""
code_splitter = RecursiveCharacterTextSplitter.from_language(language=Language.PYTHON)

while True:
    code_description = input("Code Description:\n")
    if code_description.lower() in ("quit", "close", "exit"):
        break

    context = ""
    if len(chat_history) > 0:
        # From the 2nd user input onwards: vectorize the chat history and
        # retrieve only the chunks relevant to the new description.
        python_codes = code_splitter.split_text( chat_history )
        vector = FAISS.from_texts(python_codes, embeddings_model)
        retriever = vector.as_retriever()
        relevant_docs = retriever.invoke( code_description )
        context = "\n\n".join( doc.page_content for doc in relevant_docs )

    # The context-aware chain expects "input" and "context", not "chat_history"
    response = code_gen_context_aware.invoke( {"input": code_description, "context": context} )
    print("\nAI Code Generator:")
    print(response)

    # Store each request and reply as plain text (response is already a str)
    chat_history = chat_history + "\n# Request: " + code_description + "\n# Start of code\n" + response + "\n# End of code\n"

    print("=====================================================================================================\n\n\n")