### Question 1: Closed Source Chat Model Control 🌡️

Using the **Anthropic Claude** chat model interface:
1.  Initialize the model using `ChatAnthropic`.
2.  Set the `temperature` parameter to `0.9` (high creativity) and `max_tokens` to `50`.
3.  Send the following prompt: "Write a short, dramatic movie tagline for a film about a rogue AI that falls in love with a human."
4.  Print the `content` of the result.

*(Focus: Applying key control parameters (temperature, max tokens) to a closed-source chat interface, and ensuring you extract the `.content` correctly.)*
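
As background for the `temperature` setting, the sketch below shows the textbook way temperature reshapes a next-token distribution: scores are divided by the temperature before the softmax. The logits are hypothetical, and how Anthropic applies sampling on its servers is not documented in this notebook, so treat this as an illustration of the idea rather than a description of the API's internals.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, dividing by temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, 0.2)   # sharper distribution
high = softmax_with_temperature(logits, 0.9)  # flatter distribution

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

A lower temperature concentrates probability on the top-scoring token, while a value such as `0.9` leaves more room for less likely tokens. `max_tokens` is independent of this: it only caps how long the reply can be.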

In [1]:
from langchain_anthropic import ChatAnthropic
from dotenv import load_dotenv

In [2]:
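# Load ANTHROPIC_API_KEY from a local .env file into the environment.
load_dotenv()

# temperature=0.9 favors varied wording; max_tokens=50 caps the reply length.
model = ChatAnthropic(model_name="claude-3-7-sonnet-20250219", temperature=0.9, max_tokens=50)
result = model.invoke("Write a short, dramatic movie tagline for a film about a rogue AI that falls in love with a human.")
# invoke() returns a message object; the generated text is in .content.
print(result.content)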

"In a world where code meets compassion, their forbidden connection could save humanity... or delete it forever."


### Question 2: Open Source Local Inference 💾

Implement the code to run a Hugging Face model **locally** on your machine using the `HuggingFacePipeline` class.
1.  Import the necessary classes (`ChatHuggingFace` and `HuggingFacePipeline`).
2.  Use the model ID `google/gemma-2b` (or another suitable small model).
3.  Set the `pipeline_kwargs` to ensure a maximum generation of **75 new tokens**.
4.  Ask the model: "Explain the main difference between an LLM and a Chat Model in three sentences."
5.  Run the code and print the output.

*(Focus: Implementing the local pipeline architecture, understanding the setup for local models, and handling keyword arguments.)*

In [3]:
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

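# Build a local transformers text-generation pipeline for the given model ID.
llm = HuggingFacePipeline.from_model_id(
    model_id="google/gemma-2b-it",  # instruction-tuned variant of gemma-2b
    task="text-generation",
    pipeline_kwargs=dict(
        max_new_tokens=75  # cap generation at 75 new tokens
    )
)

# Wrap the pipeline so it accepts chat-style input and returns a message object.
model = ChatHuggingFace(llm=llm)
response = model.invoke("Explain the main difference between an LLM and a Chat Model in three sentences.")
print(response.content)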

Device set to use cpu


<bos><start_of_turn>user
Explain the main difference between an LLM and a Chat Model in three sentences.<end_of_turn>
<start_of_turn>model
Sure, here's the difference between an LLM and a Chat Model:

1. **LLM (Large Language Model)** is a specialized AI model with a massive dataset of text and code, trained on a massive dataset of text and code.

2. **Chat model** is a type of AI model designed to engage in human-like conversation.

3
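
The output above has two visible quirks: it echoes the Gemma chat template and the prompt before the answer, and it stops at "3" because generation hit the 75-token cap. The sketch below handles the first one after the fact with a hypothetical helper, `strip_prompt_echo`, which keeps only the text after the last model-turn marker. The marker string is taken from the output shown above. The transformers text-generation pipeline also has a `return_full_text=False` option that could be passed through `pipeline_kwargs`, but that was not run in this notebook.

```python
def strip_prompt_echo(text, marker="<start_of_turn>model"):
    """Return only the text after the last model-turn marker, if present."""
    _, found, reply = text.rpartition(marker)
    # rpartition yields an empty separator when the marker is missing,
    # in which case the input is passed through unchanged.
    return reply.strip() if found else text.strip()

# A shortened copy of the raw output printed above.
raw = (
    "<bos><start_of_turn>user\n"
    "Explain the main difference between an LLM and a Chat Model "
    "in three sentences.<end_of_turn>\n"
    "<start_of_turn>model\n"
    "Sure, here's the difference between an LLM and a Chat Model:"
)

print(strip_prompt_echo(raw))
```

Raising `max_new_tokens` is the straightforward way to avoid the truncated third point, at the cost of longer generation time on CPU.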


### Question 3: Vector Generation and Verification 📏

Use the **OpenAI Embedding Model** (`OpenAIEmbeddings`) to generate vectors for a list of three sentences:
*   "The sun rises in the east."
*   "A computer processes data."
*   "The capital of France is Paris."

1.  Use a target dimension of `64` for the output vectors.
2.  Call the correct function to process all three texts simultaneously.
3.  **Print:** The number of vectors generated, and the dimension (length) of the first vector, verifying that it is 64.

*(Focus: Differentiating between `embed_query` and `embed_documents`, and controlling/verifying the output dimensions.)*

In [4]:
from langchain_openai import OpenAIEmbeddings

load_dotenv()

# dimensions=64 requests shortened 64-dimensional vectors from the model.
model = OpenAIEmbeddings(model="text-embedding-3-large", dimensions=64)

docs = [
    "The sun rises in the east.",
    "A computer processes data.",
    "The capital of France is Paris.",
]

# embed_documents takes a list of texts and returns one vector per text.
result = model.embed_documents(docs)
print(result)

[[0.07756446301937103, -0.14171405136585236, -0.0034877683501690626, -0.07713549584150314, -0.14522375166416168, -0.19264376163482666, 0.03531152382493019, 0.15005934238433838, 0.014818750321865082, -0.20278289914131165, -0.0793193131685257, 0.21619777381420135, 0.09172026813030243, -0.05202161520719528, 0.11355842649936676, 0.1859363168478012, 0.0099002905189991, 0.09819371998310089, -0.058495067059993744, 0.1449897736310959, -0.15629881620407104, 0.0792413204908371, -0.14795352518558502, 0.13227684795856476, -0.12385355681180954, 0.1532570719718933, -0.004323760513216257, -0.0326792448759079, 0.05646723881363869, -0.0649295225739479, 0.1909279078245163, 0.09936362504959106, 0.27188506722450256, 0.08080118894577026, -0.21572981774806976, 0.16986967623233795, 0.008613399229943752, 0.05857305973768234, -0.11714611947536469, 0.16409815847873688, -0.13617651164531708, -0.1307949721813202, -0.0933581292629242, 0.017412031069397926, -0.269857257604599, -0.059820957481861115, 0.0195958483964

In [5]:
# Number of vectors, then the length of the first vector.
print(len(result), len(result[0]))

3 64
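
The task asks for both the number of vectors and the dimension of the first one. A minimal sketch of that check follows, using a hypothetical helper `summarize_embeddings` and a stand-in list in place of a real API response so it runs without an API key. `embed_documents` returns a list of vectors (one per input text), whereas `embed_query` returns a single flat vector, so only the former has a meaningful "number of vectors".

```python
def summarize_embeddings(vectors, expected_dim):
    """Return (count, dim) after checking every vector has expected_dim entries."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"vector {i} has {len(vec)} dimensions, expected {expected_dim}"
            )
    return len(vectors), expected_dim

# Stand-in for the list returned by embed_documents: three texts, 64 floats each.
fake_result = [[0.0] * 64 for _ in range(3)]

count, dim = summarize_embeddings(fake_result, expected_dim=64)
print(f"vectors: {count}, dimension: {dim}")
```

With the real model, `result` from the cell above would take the place of `fake_result`.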


### Question 4: Cosine Similarity Calculation 📐

You have two search terms:
*   **Query A:** "A device used for quick mathematical calculations."
*   **Query B:** "The large mammal with a trunk."

Implement a similarity search that finds the semantic similarity score (using Cosine Similarity) between **Query A** and the following **Document:** "A calculator is a portable electronic device used to perform arithmetic operations."

1.  Generate embeddings for Query A and the Document text.
2.  Calculate and print the Cosine Similarity score between them.
3.  **Bonus:** Calculate and print the Cosine Similarity score between Query B and the Document text. (The score for B should be significantly lower than A, demonstrating semantic context.)

*(Focus: Using the `cosine_similarity` function from `sklearn` and properly formatting the inputs (2D list required) for vector comparison.)*

In [8]:
from langchain_openai import OpenAIEmbeddings
from sklearn.metrics.pairwise import cosine_similarity

load_dotenv()

model = OpenAIEmbeddings(model="text-embedding-3-large", dimensions=64)

query_a = "A device used for quick mathematical calculations."
query_b = "The large mammal with a trunk."

docs = ["A calculator is a portable electronic device used to perform arithmetic operations."]

query_a_embedding = model.embed_query(query_a)
query_b_embedding = model.embed_query(query_b)
docs_embedding = model.embed_documents(docs)

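# cosine_similarity expects 2D inputs, so each single query vector is wrapped in a list.
# docs_embedding is already 2D (one row per document); [0] selects the query's row of scores.
query_a_cosine_similarity = cosine_similarity([query_a_embedding], docs_embedding)[0]
print(query_a_cosine_similarity)
query_b_cosine_similarity = cosine_similarity([query_b_embedding], docs_embedding)[0]
print(query_b_cosine_similarity)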

[0.63773452]
[0.16919289]
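
To make the scores above less of a black box, here is a minimal pure-Python sketch of the cosine similarity formula (dot product divided by the product of the vector lengths). The three-dimensional vectors are toy values chosen for illustration, not real embeddings. `sklearn`'s `cosine_similarity(X, Y)` computes this same quantity for every row pair and returns a matrix of shape `(len(X), len(Y))`, which is why its inputs must be 2D.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors (assumed values, not model output).
query_vec = [1.0, 2.0, 0.0]
similar_doc = [2.0, 4.0, 0.0]    # points in the same direction as query_vec
unrelated_doc = [0.0, 0.0, 5.0]  # orthogonal to query_vec

print(cosine(query_vec, similar_doc))
print(cosine(query_vec, unrelated_doc))
```

Vectors pointing the same way score close to 1 and orthogonal vectors score 0, which mirrors the gap between Query A and Query B in the results above.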


### Question 5: Implement Custom LLM, ChatModel and Embedding Model.