In [1]:
!pip install -U pydantic==1.10.8


In [2]:
!pip install langchain


In [3]:
!pip install openai


In [4]:
!pip install cohere


In [49]:
from langchain import PromptTemplate
concept = '''
Gradient Descent is an iterative optimization process that searches for an objective function’s
 optimum value (Minimum/Maximum). It is one of the most used methods for changing a model’s parameters in
 order to reduce a cost function in machine learning projects.  
The primary goal of gradient descent is to identify the model parameters that provide the maximum accuracy
 on both training and test datasets. In gradient descent, the gradient is a vector pointing in the general
 direction of the function’s steepest rise at a particular point. The algorithm might gradually drop towards
 lower values of the function by moving in the opposite direction of the gradient, until reaching the minimum
 of the function.
 '''
student_own_words='''
Gradient descent is an optimization algorithm that is commonly used to minimize cost functions in
 machine learning models. It works by iteratively adjusting the model parameters, like the weights and biases
 in a neural network, in the direction that reduces the cost function.

The goal is to converge on the optimal set of parameters that minimize the cost, like the error between
 predictions and true labels. Gradient descent starts with random initial parameters, then calculates the gradient
 of the cost function. The gradient tells you which direction to update the parameters to reduce the cost.
The parameters are updated by a small amount in the negative gradient direction. This process is repeated until
the algorithm converges on a minimum cost.

So in summary, gradient descent iteratively fine-tunes the model parameters by calculating the gradient and moving
 in the direction that reduces the cost function. By repeating this process, it can find the optimal parameters that
 minimize the cost and maximize model accuracy.
 '''
template =  '''
    Read the following concept and a student's in-your-own-words description and evaluate the student's
description by the following criteria: 
- coverage of the core aspects of the concept
- clarity and simplicity of the explanation
- Identify gaps in understanding and areas that need improvement

Provide a score on how well the concept was explained.

Concept: `{concept}`

Student: `{student_own_words}`
Evaluation:
'''
prompt_template = PromptTemplate(template=template, input_variables=["concept", "student_own_words"])
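To see what text a model actually receives, it can help to render the template by hand. The sketch below assumes the template uses the default brace-style placeholders and mimics the substitution with plain `str.format`; `SHORT_TEMPLATE` and `render_prompt` are hypothetical names used only for this illustration and are not part of LangChain.

```python
# Minimal sketch: reproduce the placeholder substitution with str.format.
# SHORT_TEMPLATE and render_prompt are illustrative names, not LangChain APIs.
SHORT_TEMPLATE = (
    "Concept: `{concept}`\n"
    "\n"
    "Student: `{student_own_words}`\n"
    "Evaluation:\n"
)


def render_prompt(template: str, **values: str) -> str:
    """Fill the named placeholders in `template` with `values`."""
    return template.format(**values)


rendered = render_prompt(
    SHORT_TEMPLATE,
    concept="Gradient descent minimizes a cost function.",
    student_own_words="climate change",
)
print(rendered)
```

With the real objects in this notebook, `prompt_template.format(concept=concept, student_own_words=student_own_words)` should give the full rendered prompt in the same way.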


In [67]:
colourful_student_own_words= '''Imagine you're standing in a valley and you want to find the lowest point. You can't see the entire valley,
but you can look around your immediate area. You notice the ground slopes down more steeply to your left.
So you take a step to the left, going downhill. Now you look around again - the slope is still steeper to
your left, so you step that way again. You keep doing this, taking steps in the direction of the steepest
downward slope, until you can't go any lower - you've reached the valley floor. 

This is similar to how gradient descent works. We have a function we want to minimize, like the cost of a
machine learning model. We can't see the whole function landscape, but we can calculate the slope or gradient
 at our current position. The gradient tells us which direction to move to go downhill - to lower the cost
function. We iteratively take small steps in the negative gradient direction, recalculating the gradient as
we go, until we reach the minimum cost. 

So gradient descent starts at some random point, measures the local gradient, takes a step downhill, and
repeats this to "walk down" the cost function valley until it reaches the bottom or minimum value. The gradient
helps guide each step towards the optimal low point just like walking downhill guides you to the valley floor.'''

In [25]:
from getpass import getpass
OPENAI_API_KEY=getpass()

KeyboardInterrupt: Interrupted by user

In [7]:
from langchain.chat_models import ChatOpenAI


In [8]:
from getpass import getpass
COHERE_API_KEY = getpass()

 ········


In [9]:
from langchain.llms import Cohere

# GPT-4

In [50]:
openai_llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model="gpt-4")

In [51]:
from langchain import LLMChain
openai_llm_chain = LLMChain(llm=openai_llm,prompt=prompt_template)

In [52]:
response = openai_llm_chain.predict(concept=concept, student_own_words=student_own_words)

In [53]:
print(response)

The student's description of the Gradient Descent concept is quite good. The explanation covers the core aspects of the concept, such as its role in minimizing cost functions in machine learning models, how it iteratively adjusts model parameters, and the process of calculating the gradient and updating parameters in the direction that reduces cost. The explanation is clear, simple, and easy to understand. The goal of the algorithm to find the optimal parameters and maximize model accuracy is also well explained.

One potential area for improvement is the explanation of the gradient itself. The student did not fully explain that the gradient is a vector pointing in the direction of the function's steepest rise. Instead, the student simply stated that the gradient tells you which direction to update the parameters. While this is true, it does not fully capture the concept of a gradient. 

Also, the student didn't mention that the process continues until reaching the minimum of the funct
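The printed response above stops mid-word ("funct"), and the notebook does not show why. A rough way to flag such responses before comparing models is a punctuation check. This is only a heuristic sketch: `looks_truncated` and `TERMINATORS` are hypothetical names, and treating a trailing digit as complete is an assumption made so that endings like `Score: 0/10` are not flagged.

```python
# Heuristic sketch: flag responses that do not end like a finished sentence.
TERMINATORS = ".!?\"')]"


def looks_truncated(text: str) -> bool:
    """Return True when `text` is empty or ends without closing punctuation."""
    stripped = text.rstrip()
    if not stripped:
        return True  # an empty response is treated as incomplete
    last = stripped[-1]
    # A trailing digit is accepted so that "Score: 0/10" counts as complete.
    return not (last in TERMINATORS or last.isdigit())


print(looks_truncated("until reaching the minimum of the funct"))
print(looks_truncated("Score: 0/10"))
```

The check will misjudge responses that legitimately end in a word without punctuation, so it is best used as a prompt to inspect the output rather than as a hard filter.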

In [54]:
zero_response = openai_llm_chain.predict(concept=concept, student_own_words="climate change")
print(zero_response)

The student's explanation is not relevant to the concept of Gradient Descent at all. There seems to be a misunderstanding or a mistake as the student has simply written "climate change", which is not related to the concept of Gradient Descent in machine learning. 

Score: 0/10


# GPT-3.5-turbo

In [55]:
openai_llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model="gpt-3.5-turbo")

In [56]:
from langchain import LLMChain
openai_llm_chain = LLMChain(llm=openai_llm,prompt=prompt_template)

In [57]:
response = openai_llm_chain.predict(concept=concept, student_own_words=student_own_words)

In [58]:
print(response)

The student's description covers the core aspects of the concept. They mention that gradient descent is an optimization algorithm used to minimize cost functions in machine learning models. They explain that it works by iteratively adjusting the model parameters in the direction that reduces the cost function. They also mention the goal of converging on the optimal set of parameters that minimize the cost.

The explanation is clear and simple. The student uses understandable language and provides a step-by-step description of how gradient descent works. They mention the use of random initial parameters, the calculation of the gradient, and the updating of parameters in the negative gradient direction.

There are no major gaps in understanding or areas that need improvement. The student accurately describes the iterative nature of gradient descent and its goal of minimizing the cost function.

Overall, the student's explanation is well-done and covers the important aspects of the concep

In [59]:
zero_response = openai_llm_chain.predict(concept=concept, student_own_words="climate change")
print(zero_response)

The student's response does not address the concept of gradient descent at all. It is unrelated to the concept and does not demonstrate understanding or knowledge of the topic. Therefore, the student's response does not cover any core aspects of the concept, lacks clarity and simplicity, and there are significant gaps in understanding. The score for the explanation would be very low, possibly 1 out of 10.
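The models report scores in different phrasings, such as `Score: 0/10` and `1 out of 10` in the outputs above, and some responses contain no score at all. A small parser makes the results comparable. The sketch below assumes a 10-point scale, which the prompt itself does not specify; `extract_score` and `SCORE_PATTERN` are hypothetical names.

```python
import re
from typing import Optional

# Matches "8/10" and "1 out of 10"; other phrasings are not handled.
SCORE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(?:/|out of)\s*10\b", re.IGNORECASE)


def extract_score(evaluation: str) -> Optional[float]:
    """Return the first score on a 10-point scale, or None if none is found."""
    match = SCORE_PATTERN.search(evaluation)
    return float(match.group(1)) if match else None


print(extract_score("Score: 0/10"))
print(extract_score("The score would be very low, possibly 1 out of 10."))
print(extract_score("The student's description is clear and concise."))
```

Returning `None` instead of a default number keeps missing scores distinguishable from genuinely low ones.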


# Cohere

In [60]:
from langchain.llms import Cohere
cohere_llm = Cohere(model="command-nightly", cohere_api_key=COHERE_API_KEY)

In [61]:
from langchain import LLMChain
cohere_llm_chain = LLMChain(llm=cohere_llm,prompt=prompt_template)


In [64]:
response = cohere_llm_chain.predict(concept=concept, student_own_words=student_own_words)
print(response)

 - Coverage of the core aspects of the concept: 
The student's description covers the key aspects of gradient descent, including the iterative optimization process, the objective of finding the minimum/maximum value of an objective function, and the role of gradients in determining the direction of optimization.
- Clarity and simplicity of the explanation: 
The student's description is clear and concise, using simple language to explain the concept. The use of bullet points and a summary helps to break down the concept into manageable parts.
- Identify gaps in understanding and areas that need improvement: 
The student's description does not mention the role of random initial parameters in gradient descent. Additionally, the description could be improved by explaining the process of convergence on a minimum cost, and how gradient descent is used to maximize model accuracy.
- Score: 
The student's description is well-written and covers the core aspects of the concept. However, there are

In [69]:
print(cohere_llm_chain.predict(concept=concept, student_own_words=colourful_student_own_words))

 + The student's explanation is clear and simple, and they provide an analogy that helps to understand the concept.

+ The student's explanation covers the core aspects of the concept, including the goal of gradient descent, how it works, and the role of the gradient.

- The student's explanation does not mention the importance of finding the optimum value of the objective function or the fact that gradient descent is one of the most used methods for changing a model's parameters in machine learning projects.

- The student's explanation does not mention the potential for gradient descent to converge to a local minimum rather than the global minimum.

Overall, the student's explanation is clear and provides a good understanding of the concept, but it does not cover all aspects of the concept and there are some areas that could be improved.

Score: 8/10


In [70]:
zero_response = cohere_llm_chain.predict(concept=concept, student_own_words="climate change")
print(zero_response)

 The student's explanation of gradient descent is mostly correct, but there are a few areas where it could be improved. First, the student does not fully explain what gradient descent is or how it is used in machine learning. Second, the student does not explain what the "objective function" is or how it relates to gradient descent. Third, the student does not explain what the "cost function" is or how it relates to gradient descent. Finally, the student does not explain what the "model parameters" are or how they relate to gradient descent.

Overall, the student's explanation of gradient descent is mostly correct, but it could be improved by explaining the core concepts in more detail.
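Each model in this notebook is called with the same inputs, including the off-topic "climate change" control, so the repeated cells could be folded into one loop. The sketch below uses a stand-in predictor so that it runs without API keys; `run_checks` and `fake_predict` are hypothetical names, not LangChain functions.

```python
from typing import Callable, Dict


def run_checks(
    predictors: Dict[str, Callable[[str], str]],
    answers: Dict[str, str],
) -> Dict[str, Dict[str, str]]:
    """Call every predictor on every answer and collect the raw responses."""
    return {
        model: {label: predict(text) for label, text in answers.items()}
        for model, predict in predictors.items()
    }


# Stand-in for a real chain so the sketch is runnable offline.
def fake_predict(student_text: str) -> str:
    return f"evaluated {len(student_text.split())} words"


results = run_checks(
    {"stub-model": fake_predict},
    {"plain": "Gradient descent minimizes cost.", "control": "climate change"},
)
print(results["stub-model"]["control"])
```

With the chains defined in this notebook, a predictor could be written as `lambda text: cohere_llm_chain.predict(concept=concept, student_own_words=text)`, which makes a control answer that is rated "mostly correct" easy to spot side by side with the other models.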


# LaMini-Flan-T5

In [40]:
from langchain import HuggingFacePipeline
model_id = "MBZUAI/LaMini-Flan-T5-783M" 
flan_llm = HuggingFacePipeline.from_model_id(model_id=model_id, task="text2text-generation", model_kwargs={"temperature":0, "max_length":4096})

Device has 1 GPUs available. Provide device={deviceId} to `from_model_id` to use availableGPUs for execution. deviceId is -1 (default) for CPU and can be a positive integer associated with CUDA device id.


In [41]:
flan_llm_chain = LLMChain(llm=flan_llm,prompt=prompt_template)

In [42]:
response = flan_llm_chain.predict(concept=concept, student_own_words=student_own_words)

In [43]:
print(response)

The student's description is clear and concise.


In [44]:
zero_response = flan_llm_chain.predict(concept=concept, student_own_words="climate change")
print(zero_response)

The student's description of the concept of gradient descent is clear and concise. However, there are some areas that need improvement.


# LaMini-Neo

In [45]:
from langchain import HuggingFacePipeline
model_id = "MBZUAI/LaMini-Neo-1.3B" 
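# Note: the warnings printed below report `max_length` as 50, so the 4096 passed
# here via model_kwargs does not appear to reach the generation step.
neo_llm = HuggingFacePipeline.from_model_id(model_id=model_id, task="text-generation", model_kwargs={"temperature":0.5, "max_length":4096})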

Device has 1 GPUs available. Provide device={deviceId} to `from_model_id` to use availableGPUs for execution. deviceId is -1 (default) for CPU and can be a positive integer associated with CUDA device id.


In [46]:
neo_llm_chain = LLMChain(llm=neo_llm,prompt=prompt_template)

In [47]:
response = neo_llm_chain.predict(concept=concept, student_own_words=student_own_words)
print(response)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Input length of input_ids is 451, but `max_length` is set to 50. This can lead to unexpected behavior. You should consider increasing `max_new_tokens`.






In [48]:
zero_response = neo_llm_chain.predict(concept=concept, student_own_words="climate change")
print(zero_response)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Input length of input_ids is 254, but `max_length` is set to 50. This can lead to unexpected behavior. You should consider increasing `max_new_tokens`.


-
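The near-empty LaMini-Neo outputs are consistent with the warnings above: the prompt is 451 (or 254) tokens long while `max_length` is reported as 50. In Hugging Face generation, `max_length` counts prompt plus generated tokens, whereas `max_new_tokens` counts generated tokens only, so a prompt longer than `max_length` leaves no room for an answer. The arithmetic can be sketched as follows; `new_token_budget` is a hypothetical helper and a simplified model, not the library's code.

```python
from typing import Optional


def new_token_budget(
    input_len: int,
    max_length: Optional[int] = None,
    max_new_tokens: Optional[int] = None,
) -> int:
    """Tokens left for generation under the two kinds of limit."""
    if max_new_tokens is not None:
        return max_new_tokens  # counts generated tokens only
    if max_length is not None:
        return max(0, max_length - input_len)  # counts prompt + generated tokens
    raise ValueError("provide max_length or max_new_tokens")


print(new_token_budget(451, max_length=50))
print(new_token_budget(451, max_new_tokens=256))
```

As the warning suggests, setting `max_new_tokens` on the generation call is the likely fix. How to pass it through `HuggingFacePipeline` depends on the installed LangChain version, so check that version's documentation rather than relying on `model_kwargs`.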
