# Jupyter Notebooks

Jupyter Notebook is an interactive coding playground where you can write and run small chunks of code, see the results immediately, and visualize data right next to your code. It combines a code editor, a terminal, and a document editor in one tool.

[https://jupyter.org/install](https://jupyter.org/install)

# LLM - Large Language Models
- AI models focused on understanding, generating, and manipulating natural language. 
- Trained on large corpora of text data, enabling them to grasp the nuances of language, including syntax, semantics, and context. 
- Predict the probability of a sequence of words, which allows them to generate coherent and contextually relevant text based on given prompts.
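
To make the "predict the probability of the next word" idea concrete, here is a minimal sketch of a toy bigram model. The corpus and the helper name `nextWordProbabilities` are made up for illustration; real LLMs use neural networks over tokens rather than word counts, so treat this only as an intuition aid.

```javascript
// Toy bigram model: estimates P(next word | current word) from raw counts.
// The corpus is invented for this example.
const corpus = "the cat sat on the mat the cat ate the fish".split(" ");

// bigramCounts[current][next] = how often `next` followed `current`.
const bigramCounts = {};
for (let i = 0; i < corpus.length - 1; i++) {
  const current = corpus[i];
  const next = corpus[i + 1];
  bigramCounts[current] = bigramCounts[current] || {};
  bigramCounts[current][next] = (bigramCounts[current][next] || 0) + 1;
}

// Convert the counts for one word into a probability distribution.
const nextWordProbabilities = (word) => {
  const counts = bigramCounts[word] || {};
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  return Object.fromEntries(
    Object.entries(counts).map(([next, n]) => [next, n / total])
  );
};

console.log(nextWordProbabilities("the")); // { cat: 0.5, mat: 0.25, fish: 0.25 }
```

Generating text then amounts to repeatedly sampling a next word from such a distribution and appending it to the prompt.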

<img src="ss_1.jpeg"/>

Source: [Intro to Large Language Models - Andrej Karpathy](https://www.youtube.com/watch?v=zjkBMFhNj_g)

[https://twitter.com/karpathy](https://twitter.com/karpathy)

# Agenda
- Concepts of LLM application development
    - Hello World
    - Message roles and models
    - Hallucinations
    - RAG
    - Function Calling
    - Agents and Tools
    - Frontend Copilot
    - FAQ: Embeddings
- Spaceweb Forge Demo

## Init Env

In [37]:
import "npm:dotenv/config"; // side-effect import: loads variables from .env
import {OpenAI} from "npm:openai";

In [38]:
// file: .env
// OPENAI_API_KEY = "...."

const _OpenAI = new OpenAI();

In [39]:
const chatCompletion = await _OpenAI.chat.completions.create({
    messages: [{ role: 'user', content: 'Hello World' }],    
    model: 'gpt-3.5-turbo',
  });
console.log(chatCompletion);
console.log(chatCompletion.choices[0].message.content);

// https://platform.openai.com/docs/models

{
  id: "chatcmpl-A6GJlHxN8B8ZogalIoVejuU7kFFw7",
  object: "chat.completion",
  created: 1726056057,
  model: "gpt-3.5-turbo-0125",
  choices: [
    {
      index: 0,
      message: {
        role: "assistant",
        content: "Hello! How are you today?",
        refusal: null
      },
      logprobs: null,
      finish_reason: "stop"
    }
  ],
  usage: { prompt_tokens: 9, completion_tokens: 7, total_tokens: 16 },
  system_fingerprint: null
}
Hello! How are you today?


## Utils

In [35]:
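// Print each streamed chunk as it arrives (for requests made with `stream: true`).
const streamOutput = async (completion) => {
  for await (const chunk of completion) {
    // The final chunk carries no content, so skip empty deltas.
    const text = chunk.choices[0]?.delta?.content;
    if (text) console.log(text);
  }
}

// Return the text of the first choice, or the string 'null' when there is none.
const extractMsgFromChatCompletion = (chatCompletion) => {
    return chatCompletion.choices[0]?.message?.content || 'null';
}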

## Message Roles

Each message carries a role that guides the model’s response:
- The “system” message provides high-level instructions. Use it to set the assistant’s personality or to specify how it should behave throughout the conversation.
- The “user” message presents a query or prompt.
- The “assistant” message is the model’s response.

Differentiating these roles lets us set the context and direct the conversation efficiently.
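
The roles matter most in multi-turn conversations: the Chat Completions API is stateless, so earlier user and assistant messages have to be sent again with every request. The sketch below shows one way to assemble such a `messages` array. `buildMessages` is a hypothetical helper (not part of the OpenAI SDK), and the history is invented for illustration.

```javascript
// Assemble the `messages` array for a multi-turn conversation.
// `buildMessages` is a hypothetical helper, not an SDK function.
const buildMessages = (systemPrompt, history, userInput) => [
  { role: "system", content: systemPrompt }, // high-level instructions
  ...history,                                // earlier user/assistant turns
  { role: "user", content: userInput },      // the new query
];

// Invented history: one earlier question and the model's earlier reply.
const history = [
  { role: "user", content: "Tell me a joke" },
  { role: "assistant", content: "Why do JavaScript developers wear glasses? Because they don't C#!" },
];

const messages = buildMessages(
  "You are an assistant for a Frontend Team.",
  history,
  "Explain the joke"
);

console.log(messages.map((m) => m.role)); // [ "system", "user", "assistant", "user" ]
```

The resulting array can be passed as `messages` to `chat.completions.create`, which is how the model "remembers" what was said before.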

In [40]:
const chatCompletion = await _OpenAI.chat.completions.create({
    messages: [{ role: 'user', content: 'Tell me a joke' }],
    model: 'gpt-3.5-turbo',
  });
console.log(chatCompletion.choices[0].message);

{
  role: "assistant",
  content: "Why did the scarecrow win an award? Because he was outstanding in his field!",
  refusal: null
}


In [43]:
const chatCompletion_system = await _OpenAI.chat.completions.create({
    messages: [
      {
        role: 'system',
        content: 'You are an assistant for a Frontend Team. Only respond in a way which makes sense to this audience',
      },
      { role: 'user', content: 'Tell me a joke' },
    ],
    //model: 'gpt-3.5-turbo',
    model: 'gpt-4o',
  });
console.log(extractMsgFromChatCompletion(chatCompletion_system));

Why do JavaScript developers wear glasses?

Because they don't C#!


Models: [https://platform.openai.com/docs/models](https://platform.openai.com/docs/models)

## Hallucinations
Tools like ChatGPT are trained to predict the strings of words that best match your query.
They do not apply logic to check their own output for factual inconsistencies, so they can produce fluent answers that are wrong.

Shortcomings of LLMs:
 - not trained on your personal data
 - context window is finite
 - fine-tuning is costly
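
Because the context window is finite, long conversations eventually have to be trimmed before they are sent to the model. The following is a rough sketch of one possible approach, under two stated assumptions: token counts are estimated with the common "about 4 characters per token" rule of thumb instead of a real tokenizer, and the first message is always the system message. `estimateTokens` and `trimToBudget` are hypothetical helper names.

```javascript
// Rough token estimate: ~4 characters per token is a rule of thumb for
// English text. It is an approximation, not a real tokenizer.
const estimateTokens = (text) => Math.ceil(text.length / 4);

// Keep the system message plus the most recent messages that fit the budget.
// Assumes messages[0] is the system message.
const trimToBudget = (messages, maxTokens) => {
  const [system, ...rest] = messages;
  let used = estimateTokens(system.content);
  const kept = [];
  // Walk backwards so the newest messages are kept first.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (used + cost > maxTokens) break;
    kept.unshift(rest[i]);
    used += cost;
  }
  return [system, ...kept];
};

const conversation = [
  { role: "system", content: "Be brief." },        // ~3 tokens
  { role: "user", content: "a".repeat(40) },       // ~10 tokens
  { role: "assistant", content: "b".repeat(40) },  // ~10 tokens
  { role: "user", content: "c".repeat(20) },       // ~5 tokens
];

const trimmed = trimToBudget(conversation, 20);
console.log(trimmed.length); // 3: the oldest user message was dropped
```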

In [56]:
const chatCompletion = await _OpenAI.chat.completions.create({
    messages: [
      {
        role: 'user',
        content: `which component can I use from Spaceweb react library to render a Dropdown which fetches the options from an external API.
        Give me name of component directly.`,
      },
    ],
    model: 'gpt-3.5-turbo',
  });
console.log(extractMsgFromChatCompletion(chatCompletion));


AsyncDropdown


## RAG

- **Retrieve** most relevant data
- **Augment** query with context
- **Generate** response
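
The three steps can be sketched without any API call. In the example below, retrieval is a simple keyword-overlap score, used here only as a stand-in for a real similarity search; the helper names `tokenize`, `retrieve`, and `augment` are hypothetical. The knowledge-base entries are the Spaceweb component descriptions used elsewhere in this notebook.

```javascript
const knowledgeBase = [
  "Input: An input enables a person to type in text information.",
  "Button: Buttons allow users to take actions, and make choices, with a single tap.",
  "Select: The select component allows the user to select an option or options.",
  "AsyncSelect: Same as Select component but allows loading options from a remote source.",
];

// Lowercase the text and split it into words.
const tokenize = (text) => text.toLowerCase().match(/[a-z]+/g) || [];

// Retrieve: rank documents by how many of their words appear in the query.
const retrieve = (question, docs, topK = 2) => {
  const queryWords = new Set(tokenize(question));
  return docs
    .map((doc) => ({
      doc,
      score: tokenize(doc).filter((word) => queryWords.has(word)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(({ doc }) => doc);
};

// Augment: put the retrieved context in front of the query.
const augment = (question, contextDocs) => `Context information is below.
---------------------
${contextDocs.join("\n")}
---------------------
Given the context information, answer the query.
Query: ${question}
Answer:`;

const question = "Which component loads options from a remote source?";
const retrieved = retrieve(question, knowledgeBase);
const prompt = augment(question, retrieved);
console.log(prompt);
// Generate: send `prompt` as the user message to the chat completions API.
```

Only the two best-matching entries end up in the prompt, which keeps the request small while still giving the model the facts it needs.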

In [53]:
const rag_context = `
All possible components present in Spaceweb are:

Input: An input enables a person to type in text information.,
Button: Buttons allow users to take actions, and make choices, with a single tap.
Select: The select component allows the user to select an option or options.
AsyncSelect: Same as Select component but allows loading options from a remote source.
`

const rag_query = `which component can I use from Spaceweb react library to render a Dropdown which fetches the options from an external API.
        Give me name of component directly`

const rag_prompt = `
Context information is below.
---------------------
${rag_context}
---------------------
Given the context information, answer the query.
Query: ${rag_query}
Answer:
`

const chatCompletion = await _OpenAI.chat.completions.create({
    messages: [
      {
        role: 'user',
        content: rag_prompt,
      },
    ],
    model: 'gpt-3.5-turbo',
  });
console.log(extractMsgFromChatCompletion(chatCompletion));


AsyncSelect


In [57]:
const chatCompletion = await _OpenAI.chat.completions.create({
    messages: [
      {
        role: 'user',
        content: `I am planning to host a technical discussion about LLM application development with a team whose members are located in different geographies.
          Locations are India, UAE and Singapore. Tell me a short title for this talk and most suitable time to host this talk in working hour.
          No need to explain the answer. Just give me these 2 details. Provide answer in Dubai time`,
      },
    ],
    model: 'gpt-3.5-turbo',
  });
console.log(extractMsgFromChatCompletion(chatCompletion));

Title: Global LLM Application Development Discussion
Most suitable time: 11:00 AM


## Function Calling

Function calling goes beyond text generation: it lets you reliably obtain structured data from the model.

In [21]:
// Given a meeting title and time, send an invite to all attendees.
const schedule_meeting = ({ meeting_title, most_suitable_time }) => {
  
    console.log(`Sending a meeting invite...
      Time: ${most_suitable_time}
      Title: ${meeting_title}
    `);
    
    console.log('Invite sent ✅');
}

In [58]:
schedule_meeting({meeting_title: 'Hello world', most_suitable_time: '4PM IST'})

Sending a meeting invite...
      Time: 4PM IST
      Title: Hello world
    
Invite sent ✅


In [63]:
const schedule_meeting_definition = {
    type: 'function',
    name: 'schedule_meeting',
    description: 'Schedule a meeting with a team',
    parameters: {
      type: 'object',
      properties: {
        meeting_title: {
          type: 'string',
          description: 'Title of the Meeting',
        },
        most_suitable_time: {
          type: 'string',
          description: 'Time of the meeting in Epoch time only.',
        },
      },
      required: ['meeting_title', 'most_suitable_time'],
    },
};

const chatCompletionWithFunction = await _OpenAI.chat.completions.create({
    messages: [
      {
        role: 'user',
        content: `I am planning to host a technical discussion about LLM application development with a team whose members are located in different geographies.
          Locations are India, UAE and Singapore. Tell me a short title for this talk and most suitable time to host this talk in working hour.
          Provide answer in Dubai time`,
      },
    ],
    // `functions` is the legacy parameter; newer code passes `tools` instead.
    functions: [schedule_meeting_definition],
    model: 'gpt-3.5-turbo',
  });
console.log(chatCompletionWithFunction);

{
  id: "chatcmpl-A6Gba1TwJKh4inQO62wcmaAgOXvHb",
  object: "chat.completion",
  created: 1726057162,
  model: "gpt-3.5-turbo-0125",
  choices: [
    {
      index: 0,
      message: {
        role: "assistant",
        content: null,
        function_call: {
          name: "schedule_meeting",
          arguments: '{"meeting_title":"LLM Application Development Discussion","most_suitable_time":"1636034400"}'
        },
        refusal: null
      },
      logprobs: null,
      finish_reason: "function_call"
    }
  ],
  usage: { prompt_tokens: 132, completion_tokens: 30, total_tokens: 162 },
  system_fingerprint: null
}


In [60]:
const function_call = chatCompletionWithFunction.choices[0]?.message.function_call;

if(function_call) {
    const function_arguments = JSON.parse(function_call.arguments);
    schedule_meeting(function_arguments);
}

Sending a meeting invite...
      Time: 10:00
      Title: LLM Application Development Technical Discussion
    
Invite sent ✅


With function calling, developers can:

- Extract structured data from text, making it easier to process and analyze.
- Create chatbots that answer questions by invoking external tools, similar to ChatGPT plugins.
- Convert natural language inputs into specific API calls or even database queries.


When passed multiple functions, the model detects which function, if any, needs to be invoked based on the user’s input.
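
When several functions are registered, the application needs a way to route the model's choice to the right implementation. Below is a minimal sketch of such a dispatcher. It takes an object shaped like the `function_call` field in the response above; the registry, the `get_weather` function, and `dispatchFunctionCall` are hypothetical names introduced for illustration.

```javascript
// Registry of callable functions, keyed by the name the model returns.
// Both implementations are stand-ins for real integrations.
const availableFunctions = {
  schedule_meeting: ({ meeting_title, most_suitable_time }) =>
    `Invite sent: "${meeting_title}" at ${most_suitable_time}`,
  get_weather: ({ city }) => `Weather lookup requested for ${city}`,
};

// Route a `function_call` object ({ name, arguments }) to its implementation.
const dispatchFunctionCall = (functionCall) => {
  const fn = availableFunctions[functionCall.name];
  if (!fn) {
    return `Unknown function: ${functionCall.name}`;
  }
  let args;
  try {
    // The model returns arguments as a JSON string, which may be malformed.
    args = JSON.parse(functionCall.arguments);
  } catch (error) {
    return `Could not parse arguments for ${functionCall.name}`;
  }
  return fn(args);
};

// Invented payload, shaped like the model's function_call output.
const dispatchResult = dispatchFunctionCall({
  name: "schedule_meeting",
  arguments: '{"meeting_title":"LLM Talk","most_suitable_time":"11:00 AM GST"}',
});
console.log(dispatchResult); // Invite sent: "LLM Talk" at 11:00 AM GST
```

Checking for unknown names and malformed JSON is worthwhile because the model's output is generated text and is not guaranteed to match the declared schema.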

# Tools & Agents

## Tools
LLMs have incredible text generation capabilities, but they struggle with discrete tasks (e.g. mathematics) and with interacting with the outside world (e.g. getting the weather).

*Tools* can be thought of as programs you give to a model, which the model can run whenever it deems them applicable.


## Agents
Agents allow models to execute multiple steps (e.g. a sequence of tool calls) in a non-deterministic way.
At each step, the agent uses an LLM to reason about the evolving context and the user input, and to choose what to do next.
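
The loop at the heart of an agent can be sketched in a few lines. To keep the example runnable offline, the LLM is replaced by `fakeModel`, a scripted stand-in that decides the next step from the steps taken so far; a real agent would send the conversation to a model and await its reply at that point. The tools and all names here are hypothetical.

```javascript
// Tools the agent may call. Both are hypothetical and run locally.
const tools = {
  add: ({ a, b }) => a + b,
  multiply: ({ a, b }) => a * b,
};

// Scripted stand-in for the LLM: returns either a tool call or a final answer,
// based on the results of the steps taken so far.
const fakeModel = (goal, steps) => {
  if (steps.length === 0) return { tool: "add", args: { a: 2, b: 3 } };
  if (steps.length === 1) {
    return { tool: "multiply", args: { a: steps[0].result, b: 4 } };
  }
  return { answer: `Result: ${steps[1].result}` };
};

// Agent loop: ask the model for the next step, run the tool, feed the result back.
const runAgent = (goal, model, maxSteps = 5) => {
  const steps = [];
  for (let i = 0; i < maxSteps; i++) {
    const decision = model(goal, steps);
    if (decision.answer !== undefined) return decision.answer;
    const result = tools[decision.tool](decision.args);
    steps.push({ ...decision, result });
  }
  // Guard against loops that never reach a final answer.
  return "Stopped: step limit reached";
};

console.log(runAgent("Compute (2 + 3) * 4", fakeModel)); // Result: 20
```

The step limit is a deliberate design choice: since the model decides when to stop, an upper bound prevents a confused agent from looping indefinitely.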

# Frontend Copilot

- Get links for different services
- Search the wiki
- Fetch logs from Graylog and explain why something is slow
- Find where an environment is hosted and deployed
- Get an error by ID from Sentry
- Build a UI
- Give me the link to "THAT" sheet

In [25]:
// Return the build (Jenkins) or logs (Graylog) URL for an environment.
const getEnvLinks = ({environmentName = 'qa6', linkType}) => {
    const getJenkinsJobURL = () => `https://${environmentName}-build.sprinklr.com/jenkins/job`;
    const getGraylogURL = () => `https://${environmentName}-logs.sprinklr.com/`;
    let response = '';

    if(linkType === 'build') {
        response = getJenkinsJobURL();
    } else if(linkType === 'logs') {
        response = getGraylogURL();
    }

    // An empty string means the link type is not supported.
    return response;
};



const searchFrontendWiki = async ({query}) => {
    //https://sprinklr.atlassian.net/wiki/spaces/Frontend/pages/4235919449/Using+secure+media+in+Applications 
    const retrievedContext = `
        To manage and store static assets used in the code, 
        utilize the designated folder on the production bucket at "sprcdn.sprinklr.com/ui/common/assets/" 
        This location should consistently be used for serving static assets.

        For new uploads, please create an ITOPS ticket to upload to the specified public bucket via this link: ITOPS-698869o Do
    `;

    const rag_prompt = `
        Context information is below.
        ---------------------
        ${retrievedContext}
        ---------------------
        Given the context information, answer the query.
        Query: ${query}
        Answer:
`

    const chatCompletion = await _OpenAI.chat.completions.create({
        messages: [{role: 'user',content: rag_prompt}],
        model: 'gpt-4o',
    });
    
    return extractMsgFromChatCompletion(chatCompletion);
};

In [26]:
const getEnvLinks_definition = {
    type: 'function',
    function: {
        name: 'getEnvLinks',    
        description: 'Get links to specific environments',
        parameters: {
          type: 'object',
          properties: {
            environmentName: {
              type: 'string',
              description: 'Name of the environment',
            },
            linkType: {
              type: 'string',
              description: 'Type of link to fetch',
            },
          },
          required: ['environmentName', 'linkType'],
        },
    },            
};

const searchFrontendWiki_definition = {
    type: 'function',
    function: {
        name: 'searchFrontendWiki',
        description: 'Search about a topic from the frontend wiki',
        parameters: {
            type: 'object',
            properties: {
                query: {
                    type: 'string',
                    description: 'Query asked by the user'
                }    
            },
            required: ['query']
        },
    },    
};

In [68]:
// Execute Tool Call to LLM:
const chatCompletionWithFunction2 = await _OpenAI.chat.completions.create({
    messages: [
      {
        role: 'system',
        content: `You are a skilled technical assistant for a frontend team.`,
      },
      {
        role: 'user',
        //content: `Schedule a meeting for 2PM IST for discussion on secure media usage`,
        //content: `Give me build link for prod0`,
        //content: `Give me logs link for prod`,
        content: `Tell me a joke`,
        //content: `How do I store URL of static assets in code?`,  
      },
    ],
    //schedule_meeting_definition,
    tools: [ getEnvLinks_definition, searchFrontendWiki_definition],
    model: 'gpt-4o',
    //tool_choice: "required"
  });

const tool_call_response = chatCompletionWithFunction2.choices[0]?.message.tool_calls?.[0] || '';


// Execute Tool:
const TOOLS = {
    schedule_meeting,
    getEnvLinks,
    searchFrontendWiki,
    ///....manymore,
};

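if(tool_call_response) {
    const toolToInvoke = TOOLS[tool_call_response.function.name];
    const params = tool_call_response.function.arguments;
    console.log(`Calling Tool: ${tool_call_response.function.name}. Params: ${params}`, '\n');

    if(toolToInvoke) {
        // Arguments arrive as a JSON string; a tool may return nothing (e.g. schedule_meeting).
        const toolResponse = await toolToInvoke(JSON.parse(params));
        console.log(String(toolResponse ?? ''));
    } else {
        console.log(`Unknown tool: ${tool_call_response.function.name}`);
    }
} else {
    // No tool was requested: the cell's last expression displays the plain-text reply.
    chatCompletionWithFunction2.choices[0]?.message.content
}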

"Why did the scarecrow become a successful software developer?\n" +
  "\n" +
  "Because he was outstanding in his fie"... 8 more characters

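The cell above stops after running the tool. To let the model phrase a final answer, the tool result is usually sent back in a second request. The sketch below builds the messages for that follow-up, using the Chat Completions `tools` message shapes (an assistant message carrying `tool_calls`, then a `tool` message that references the call through `tool_call_id`). `buildToolFollowUp` is a hypothetical helper, and the assistant message and call ID are mocked for illustration.

```javascript
// Build the messages for the follow-up request after a tool has run.
// `buildToolFollowUp` is a hypothetical helper, not an SDK function.
const buildToolFollowUp = (previousMessages, assistantMessage, toolCall, toolResult) => [
  ...previousMessages,
  assistantMessage, // the assistant turn that requested the tool
  {
    role: "tool",
    tool_call_id: toolCall.id, // links the result to the request
    content: String(toolResult),
  },
];

// Mocked assistant message; the ID is made up for illustration.
const mockAssistantMessage = {
  role: "assistant",
  content: null,
  tool_calls: [
    {
      id: "call_example_1",
      type: "function",
      function: {
        name: "getEnvLinks",
        arguments: '{"environmentName":"qa6","linkType":"logs"}',
      },
    },
  ],
};

const followUp = buildToolFollowUp(
  [{ role: "user", content: "Give me logs link for qa6" }],
  mockAssistantMessage,
  mockAssistantMessage.tool_calls[0],
  "https://qa6-logs.sprinklr.com/"
);

console.log(followUp.map((m) => m.role)); // [ "user", "assistant", "tool" ]
```

Passing `followUp` as `messages` in a second `chat.completions.create` call gives the model the tool output, so it can answer the user in natural language.
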
# SpacewebForge Architecture
<img src="ss_2.jpeg"/>

# Embeddings and Search

Embeddings represent words, phrases, or images as vectors in a high-dimensional space. In this space, similar items lie close to each other, so the distance between two vectors can be used to measure how similar they are.

A common measure of similarity between two vectors is ‘cosine similarity’, the cosine of the angle between them: a value of 1 indicates that the vectors point in the same direction (high similarity), 0 indicates no similarity, and -1 indicates that they point in opposite directions (high opposition).
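
A small worked example may help build intuition before real embeddings are involved. The three-dimensional vectors below are made up for illustration (real embeddings have hundreds of dimensions or more), and `cosine` and `toyVectors` are hypothetical names used only in this sketch.

```javascript
// Cosine similarity of two equal-length numeric vectors.
const cosine = (a, b) => {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
};

// Invented "embeddings": the first two axes loosely mean "animal-like",
// the third "vehicle-like".
const toyVectors = {
  cat: [0.9, 0.8, 0.1],
  dog: [0.8, 0.9, 0.1],
  car: [0.1, 0.1, 0.9],
};

console.log("cat vs dog:", cosine(toyVectors.cat, toyVectors.dog)); // close to 1
console.log("cat vs car:", cosine(toyVectors.cat, toyVectors.car)); // much lower
console.log("opposite:", cosine([1, 0], [-5, 0]));                  // -1
```

Note that only the direction of the vectors matters: `[1, 0]` and `[-5, 0]` differ in length, yet their similarity is exactly -1.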

In [69]:
//from openai.embeddings_utils import get_embedding, cosine_similarity
//import {OpenAI} from "npm:openai";
//OpenAI.embeddings_utils

// Function to calculate cosine similarity
function cosineSimilarity(vecA, vecB) {
    const dotProduct = vecA.reduce((sum, a, idx) => sum + a * vecB[idx], 0);
    const magnitudeA = Math.sqrt(vecA.reduce((sum, val) => sum + val * val, 0));
    const magnitudeB = Math.sqrt(vecB.reduce((sum, val) => sum + val * val, 0));
    return dotProduct / (magnitudeA * magnitudeB);
}

async function getEmbedding(text) {
    const embeddingResponse = await _OpenAI.embeddings.create({
        model: "text-embedding-3-small",
        input: text,
        encoding_format: "float",
    });
    return embeddingResponse.data[0].embedding;    
}

async function semanticSearch(query, documents) {
    const queryEmbedding = await getEmbedding(query);
    const documentEmbeddings = await Promise.all(documents.map(doc => getEmbedding(doc)));

    const similarities = documentEmbeddings.map((embedding, idx) => ({
        document: documents[idx],
        similarity: cosineSimilarity(queryEmbedding, embedding)
    }));

    similarities.sort((a, b) => b.similarity - a.similarity);

    return similarities;
}

In [35]:
const cat_embedding = await getEmbedding('cat');

console.log(cat_embedding);
console.log("dimensions:", cat_embedding.length)

[
     0.02552942,  -0.023411665, -0.016092611,    0.03937628,   0.02094483,
    -0.02632067,  0.0018908527,  0.030602723,  -0.015929706, 0.0053118416,
     0.02214334, -0.0002121755,  0.010460779,  0.0031213614,   0.02985802,
    0.006265995,  -0.021363726, -0.010716772,  -0.030532908,  0.057528466,
     0.03409353,    0.04589245,  0.020502662,  -0.046637155, -0.006871068,
     0.03800323,  -0.009268087,   0.04405396,   0.051803548, -0.013497779,
   0.0033686268,  -0.043123078,   -0.0112753,  -0.029090041, -0.022946225,
    0.017768197,   0.017570386, -0.028019529,  -0.015743531,   0.01378868

In [36]:
const documents = [
    "The cat sat on the mat.",
    "Dogs are great pets.",
    "The quick brown fox jumps over the lazy dog.",
    "A journey of a thousand miles begins with a single step."
];
        
const query = "Pets are wonderful companions.";

const results = await semanticSearch(query, documents);

console.log("Results:");
results.forEach(result => {
    console.log(`Document: "${result.document}", Similarity: ${result.similarity}`);
});

Results:
Document: "Dogs are great pets.", Similarity: 0.6798705258123072
Document: "The cat sat on the mat.", Similarity: 0.3102847013851084
Document: "The quick brown fox jumps over the lazy dog.", Similarity: 0.26688707103942033
Document: "A journey of a thousand miles begins with a single step.", Similarity: 0.16938206733128902


In [54]:
const documents = [
    "Animals",
    "Birds",    
];
        
const query = "Peacock";

const results = await semanticSearch(query, documents);

console.log("Results:");
results.forEach(result => {
    console.log(`Document: "${result.document}", Similarity: ${result.similarity}`);
});

Results:
Document: "Birds", Similarity: 0.4872772923435113
Document: "Animals", Similarity: 0.3539369319183071


# Further Reading and References
- [OpenAI Prompt engineering guidelines](https://platform.openai.com/docs/guides/prompt-engineering/six-strategies-for-getting-better-results)
- Courses
    - [https://learn.deeplearning.ai/](https://learn.deeplearning.ai/)
- Podcasts
    - [Aravind Srinivas - Lex Fridman](https://www.youtube.com/watch?v=e-gwvmhyU7A)
    - [Andrej Karpathy - Lex Fridman](https://www.youtube.com/watch?v=cdiD-9MMpb0)
- Frameworks
    - [Vercel AI SDK](https://sdk.vercel.ai/)
    - [LangChain](https://js.langchain.com/v0.2/docs/tutorials/)
    - [Llama Index](https://www.llamaindex.ai/)