In [1]:
import os
from dotenv import load_dotenv

load_dotenv()

True

In [6]:
!pip install -q youtube-transcript-api langchain-community langchain-openai \
               faiss-cpu tiktoken python-dotenv

Note: you may need to restart the kernel to use updated packages.


In [22]:
%pip install youtube-transcript-api

Note: you may need to restart the kernel to use updated packages.


In [2]:
from youtube_transcript_api import YouTubeTranscriptApi, TranscriptsDisabled
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate

## Step 1a - Indexing (Document Ingestion)

In [3]:
video_id = "ZGJ0ITMNOtM"  # only the video ID, not the full URL
try:
    # languages is a priority list; here only an English transcript is accepted
    api = YouTubeTranscriptApi()
    transcript_list = api.fetch(video_id, languages=["en"])

    # Flatten the timed snippets into one plain-text string
    transcript = " ".join(snippet.text for snippet in transcript_list)
    print(transcript)

except TranscriptsDisabled:
    # transcript stays undefined in this case, so the later cells cannot run
    print("No captions available for this video.")

[Music] This week on the Salesforce admin podcast, I sit down with Sharon Clardy, who's the senior director of Salesforce Labs, to talk about, well, hm, what else? Free innovation. Sharon shares how Labs empowers Salesforce employees to build and share solutions on the app exchange and what that means for Salesforce admins navigating the new world of AI and why this is important. You should never install a new app straight into production. Now, whether you're Dreamforcebound or catching up after Dreamforce, this one's packed with a lot of great tips from Sharon and a lot of AI strategy gold. So, tune in, take notes, and let's get Sharon on the podcast. [Music] So, Sharon, welcome to the podcast. >> Thank you for having me, Mike. >> I find it hard to believe, but you are one of the few people in the world that hasn't been on the Salesforce admin podcast, despite you and me being in the ecosystem for like a thousand years. >> I know. Well, I was thinking about this this morning and I was
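
Since the fetch call above expects a bare video ID rather than a full URL, a small helper can pull the ID out of a pasted link. This is a minimal sketch using only the standard library; `extract_video_id` is a hypothetical name, and it assumes the two common link shapes (`watch?v=<id>` and `youtu.be/<id>`).

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url_or_id: str) -> str:
    """Return the video ID from a YouTube URL, or the input unchanged if it has no URL scheme."""
    parsed = urlparse(url_or_id)
    if not parsed.scheme:
        # No scheme: assume the caller already passed a bare ID.
        return url_or_id
    host = parsed.netloc.lower()
    if host.endswith("youtu.be"):
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/")
    # Standard watch links carry the ID in the "v" query parameter.
    ids = parse_qs(parsed.query).get("v")
    if not ids:
        raise ValueError(f"No video ID found in {url_or_id!r}")
    return ids[0]


print(extract_video_id("https://www.youtube.com/watch?v=ZGJ0ITMNOtM"))
```

The result can be assigned to `video_id` before calling `api.fetch`.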

In [4]:
print("Transcript List:", transcript_list)

Transcript List: FetchedTranscript(snippets=[FetchedTranscriptSnippet(text='[Music]', start=2.99, duration=3.97), FetchedTranscriptSnippet(text='This week on the Salesforce admin', start=5.44, duration=3.36), FetchedTranscriptSnippet(text='podcast, I sit down with Sharon Clardy,', start=6.96, duration=3.599), FetchedTranscriptSnippet(text="who's the senior director of Salesforce", start=8.8, duration=4.24), FetchedTranscriptSnippet(text='Labs, to talk about, well, hm, what', start=10.559, duration=5.521), FetchedTranscriptSnippet(text='else? Free innovation.', start=13.04, duration=5.52), FetchedTranscriptSnippet(text='Sharon shares how Labs empowers', start=16.08, duration=5.039), FetchedTranscriptSnippet(text='Salesforce employees to build and share', start=18.56, duration=5.52), FetchedTranscriptSnippet(text='solutions on the app exchange and what', start=21.119, duration=5.121), FetchedTranscriptSnippet(text='that means for Salesforce admins', start=24.08, duration=6.4), FetchedTra
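
Each snippet in the output above carries `text`, `start`, and `duration`, but the flattened `transcript` string discards the timing. If timestamps are wanted later (for example, to point an answer back to a moment in the video), the snippets could be grouped into time windows first. The sketch below is one possible approach, not part of the original pipeline: `Snippet` is a hypothetical stand-in with the same three fields, and `group_by_window` is an illustrative helper.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    """Hypothetical stand-in with the same three fields shown in the output above."""
    text: str
    start: float
    duration: float


def group_by_window(snippets, window_seconds=30.0):
    """Merge consecutive snippets into blocks of roughly window_seconds, keeping each block's start time."""
    blocks = []
    current_texts, block_start = [], None
    for s in snippets:
        if block_start is None:
            block_start = s.start
        current_texts.append(s.text)
        # Close the block once it spans at least window_seconds.
        if s.start + s.duration - block_start >= window_seconds:
            blocks.append({"start": block_start, "text": " ".join(current_texts)})
            current_texts, block_start = [], None
    if current_texts:
        # Keep whatever is left over as a final, shorter block.
        blocks.append({"start": block_start, "text": " ".join(current_texts)})
    return blocks


sample = [
    Snippet("[Music]", 2.99, 3.97),
    Snippet("This week on the Salesforce admin", 5.44, 3.36),
    Snippet("podcast, I sit down with Sharon Clardy,", 6.96, 3.599),
]
for block in group_by_window(sample, window_seconds=5.0):
    print(block)
```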

## Step 1b - Indexing (Text Splitting)

In [5]:
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.create_documents([transcript])

In [6]:
len(chunks)

18

In [7]:
chunks[10]

Document(metadata={}, page_content="500 >> you don't know all 500 just like the back of your hand come >> unfortunately do not I know many of them but not all 500 uh but yeah most of them should be extended we're actually looking at engaging with some of our more popular lab solutions and having conversations with the builders to say like what are ways that we could extend this uh using newer technology that's come out either adjacent you know something like if you want to use AI you can you know add on to it but if you still want to use the main kind of core functionality like custom objects and and flows that was there but not you know to have an AI arm to it then you could still use the original but how can we take these to the next level using you know the great technology that's come out over the past couple of years >> yeah now I think the use of apps within Salesforce is is probably evolved since you and I were out in the world as we little admins wandering blindly through the t
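
To make `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sliding-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that class first tries to break on separators such as paragraph breaks, newlines, and spaces); it only shows how consecutive chunks end up sharing text. `split_with_overlap` is a hypothetical name.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into fixed-size character windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


demo = "abcdefghijklmnopqrstuvwxyz"
for piece in split_with_overlap(demo, chunk_size=10, chunk_overlap=4):
    print(piece)
```

With these toy settings, the last four characters of each chunk reappear at the start of the next one, which is the same effect the 200-character overlap has on the transcript chunks.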

## Step 1c & 1d - Indexing (Embedding Generation and Storing in Vector Store)

In [8]:
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vector_store = FAISS.from_documents(chunks, embeddings)

In [9]:
vector_store.index_to_docstore_id

{0: 'c683c61c-c63a-47ac-a068-0c8703afd1c9',
 1: 'aa250cac-3472-4074-8dbc-4c7a400aacd9',
 2: 'b086db77-a1b2-4898-a82d-4466c4b20407',
 3: '573e6641-8469-4a54-bbf3-ac439561a2e7',
 4: 'bc9de0e9-1142-4644-8d3f-8a7861d4752f',
 5: '87d8e454-eb03-4b50-bef3-8a025c8033e8',
 6: '830a3291-b3bc-4b49-805b-a6e67743061a',
 7: 'e7e880fa-b9ed-4e09-8a3c-b384f5867ec2',
 8: '7425ed21-8bc8-44b6-b26b-bee8cb12b408',
 9: '7566bf8d-6c11-42f8-9a36-e1085d1c76d6',
 10: 'f5b2cb69-5354-4f4d-ac83-4fde6f5818a9',
 11: 'c21f82ec-faa1-4ed3-b44a-38d0866e1a3b',
 12: 'd8dd00f6-5eb6-4ba9-a60d-5f022a3014a4',
 13: 'c182ed32-eb7e-496a-820b-3e563ce6b191',
 14: 'be18737e-bb2e-489e-92f5-f4cf87ad98a6',
 15: '81fe393a-5960-4a39-b0fe-78fff16ad670',
 16: '0177e079-678e-4cc1-966b-fd546ac2adae',
 17: '054d17f3-b16e-40c8-ab7d-f2485aaadb35'}

In [11]:
vector_store.get_by_ids(["830a3291-b3bc-4b49-805b-a6e67743061a"])

[Document(id='830a3291-b3bc-4b49-805b-a6e67743061a', metadata={}, page_content="different solutions on how to use this new technology but also I think even more importantly is how do we think about how do we want to implement AI at our organization and sometimes that might actually be using a lab solution or another partner solution on app exchange that isn't actually AIdriven but helps set the groundwork for how do you have a strong AI strategy. So an example of that would be, you know, is there a set of technology or tools or processes that you should have in place at your organization to really help you identify what are your goals and use cases for using AI. Uh and we have a number of different lab solutions that we'll we'll share. They'll be the pro tip for anybody who's going to see the content. Uh we'll throw it be probably a few more solutions than seven. Uh we'll throw a couple bonus things in there, some shout outs, but there's a lot of technology out there to really help you
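
The two outputs above hint at the bookkeeping involved: vectors live at integer positions in the index, `index_to_docstore_id` maps each position to a document ID, and the ID is then used to look up the original chunk. The toy class below mimics only that bookkeeping, with hand-written vectors in place of real embeddings; `ToyVectorStore` is an illustrative name and not the LangChain or FAISS implementation.

```python
import uuid


class ToyVectorStore:
    """Illustrative only: a list of vectors plus the two lookup tables seen above."""

    def __init__(self):
        self.vectors = []               # position i holds the vector for row i
        self.index_to_docstore_id = {}  # row number -> document ID
        self.docstore = {}              # document ID -> original text

    def add(self, text, vector):
        doc_id = str(uuid.uuid4())
        row = len(self.vectors)
        self.vectors.append(vector)
        self.index_to_docstore_id[row] = doc_id
        self.docstore[doc_id] = text
        return doc_id

    def get_by_ids(self, ids):
        # Unknown IDs are skipped rather than raising.
        return [self.docstore[i] for i in ids if i in self.docstore]


store = ToyVectorStore()
first_id = store.add("chunk about data readiness", [0.1, 0.9])
store.add("chunk about the Labs logo", [0.8, 0.2])
print(store.index_to_docstore_id)
print(store.get_by_ids([first_id]))
```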

## Step 2 - Retrieval

In [12]:
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={"k": 4})

In [13]:
retriever

VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x11fc84ad0>, search_kwargs={'k': 4})

In [14]:
retriever.invoke("Is the data ready?")

[Document(id='e7e880fa-b9ed-4e09-8a3c-b384f5867ec2', metadata={}, page_content="throw a couple bonus things in there, some shout outs, but there's a lot of technology out there to really help you frame, you know, is your data ready? Like data readiness is super important as an AI strategy uh to move forward in the space if you're going to train your AI or use AI on your data. And if your data is no good, then you're going to have a problem. The AI isn't going to respond the way that you expect or what your business needs. So we want to showcase these solutions that you can use to get started to help ground you in strong AI strategies. >> So it's not just downloading agents. >> Exactly. It's going to be more than that. It's really about thinking about how do you have meaningful and mindful implementation of AI at your organization. Uh and we'll also have some showcases of of how to use cool agent forests and AI technology. >> Oh nice. Um, I noticed you put the word free in your title. T
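
Conceptually, a similarity retriever with `k=4` embeds the query, scores it against every stored vector, and returns the four closest chunks. The sketch below shows that ranking step with cosine similarity and tiny hand-made vectors. It is an assumption-laden stand-in: the metric an actual FAISS index uses depends on its configuration, and `top_k` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, indexed, k=4):
    """Return the k (text, score) pairs whose vectors are most similar to query_vector."""
    scored = [(text, cosine_similarity(query_vector, vec)) for text, vec in indexed]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


indexed = [
    ("data readiness", [0.9, 0.1, 0.0]),
    ("labs logo", [0.0, 0.2, 0.9]),
    ("security review", [0.1, 0.9, 0.1]),
]
for text, score in top_k([1.0, 0.0, 0.0], indexed, k=2):
    print(f"{score:.3f}  {text}")
```

Note that a ranking like this always returns `k` results, whether or not any of them is actually relevant to the query.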

## Step 3 - Augmentation

In [15]:
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2)

In [16]:
prompt = PromptTemplate(
    template="""
      You are a helpful assistant.
      Answer ONLY from the provided transcript context.
      If the context is insufficient, just say you don't know.

      {context}
      Question: {question}
    """,
    input_variables=["context", "question"],
)

In [17]:
question = "Is the topic of Salesforce discussed in this video? If yes, what was discussed?"
retrieved_docs = retriever.invoke(question)

In [18]:
retrieved_docs

[Document(id='aa250cac-3472-4074-8dbc-4c7a400aacd9', metadata={}, page_content="in the world that hasn't been on the Salesforce admin podcast, despite you and me being in the ecosystem for like a thousand years. >> I know. Well, I was thinking about this this morning and I was like, I can't believe I haven't been on here yet. So, I'm super excited to share with the listeners here today. >> Well, let's talk about that. So, how did you get started in Salesforce and and in the ecosystem and and what do you do at Salesforce? >> Oh, I love a Salesforce origin story. So, picture it's 2010. Uh, and I'm working at a software company that gets bought by an equity partner and that equity partner implements Salesforce at every company they buy. And that was my first introduction to Salesforce back when it was just sales and service cloud. And I absolutely fell in love with the platform. I felt empowered. There was community behind it. And uh ever since then, so from 2010 till today, I've been all

In [19]:
context_text = "\n\n".join(doc.page_content for doc in retrieved_docs)
context_text

"in the world that hasn't been on the Salesforce admin podcast, despite you and me being in the ecosystem for like a thousand years. >> I know. Well, I was thinking about this this morning and I was like, I can't believe I haven't been on here yet. So, I'm super excited to share with the listeners here today. >> Well, let's talk about that. So, how did you get started in Salesforce and and in the ecosystem and and what do you do at Salesforce? >> Oh, I love a Salesforce origin story. So, picture it's 2010. Uh, and I'm working at a software company that gets bought by an equity partner and that equity partner implements Salesforce at every company they buy. And that was my first introduction to Salesforce back when it was just sales and service cloud. And I absolutely fell in love with the platform. I felt empowered. There was community behind it. And uh ever since then, so from 2010 till today, I've been all Salesforce all the time. >> Wow. And you joined Salesforce. What part of\n\nth

In [20]:
final_prompt = prompt.invoke({"context": context_text, "question": question})

In [21]:
final_prompt

StringPromptValue(text="\n      You are a helpful assistant.\n      Answer ONLY from the provided transcript context.\n      If the context is insufficient, just say you don't know.\n\n      in the world that hasn't been on the Salesforce admin podcast, despite you and me being in the ecosystem for like a thousand years. >> I know. Well, I was thinking about this this morning and I was like, I can't believe I haven't been on here yet. So, I'm super excited to share with the listeners here today. >> Well, let's talk about that. So, how did you get started in Salesforce and and in the ecosystem and and what do you do at Salesforce? >> Oh, I love a Salesforce origin story. So, picture it's 2010. Uh, and I'm working at a software company that gets bought by an equity partner and that equity partner implements Salesforce at every company they buy. And that was my first introduction to Salesforce back when it was just sales and service cloud. And I absolutely fell in love with the platform. 
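
The augmentation step boils down to string substitution: join the retrieved chunks, then drop them and the question into the template. The sketch below reproduces that with plain `str.format` so the mechanics are visible without LangChain; `build_prompt` and `TEMPLATE` are hypothetical names, and the wording mirrors the prompt defined above.

```python
TEMPLATE = """You are a helpful assistant.
Answer ONLY from the provided transcript context.
If the context is insufficient, just say you don't know.

{context}
Question: {question}
"""


def build_prompt(chunks, question, separator="\n\n"):
    """Join retrieved chunk texts and substitute them into the template."""
    context = separator.join(chunks)
    return TEMPLATE.format(context=context, question=question)


text = build_prompt(
    ["first retrieved chunk", "second retrieved chunk"],
    "Is Salesforce discussed?",
)
print(text)
```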

## Step 4 - Generation

In [22]:
answer = llm.invoke(final_prompt)
print(answer.content)

Yes, the topic of Salesforce is discussed in this video. The discussion includes the speaker's origin story with Salesforce, starting from their introduction to the platform in 2010, their passion for it, and their role at Salesforce since 2018 leading the Salesforce Labs program. They explain what Salesforce Labs is, describing it as an innovation program where Salesforce employees create free solutions to share on the App Exchange. The conversation also touches on the community-driven aspect of Labs and how it empowers employees to solve problems and share creative ideas. Additionally, it mentions that only Salesforce employees can create and distribute Salesforce Lab solutions, while the App Exchange is open to non-Salesforce employees for creating and sharing solutions.


## Building a Chain

In [23]:
from langchain_core.runnables import (
    RunnableParallel,
    RunnablePassthrough,
    RunnableLambda,
)
from langchain_core.output_parsers import StrOutputParser

In [24]:
def format_docs(retrieved_docs):
    """Join the page_content of each retrieved Document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in retrieved_docs)

In [25]:
parallel_chain = RunnableParallel(
    {
        "context": retriever | RunnableLambda(format_docs),
        "question": RunnablePassthrough(),
    }
)

In [26]:
parallel_chain.invoke("who is Demis")

{'context': "So that makes sense. Well, Sharon, thanks for coming on the podcast. I think this session is going to be awesome. Uh, I can't wait to to hear what people have to say about it and I can't wait to see what new stuff comes out from the labs team. >> Yeah. Well, thank you for having me. Hopefully it's not it's the first time, but hopefully not the last time. I've really enjoyed chatting with you, Mike. And if anybody is at the session and shows up at Dreamforce, stop by and say hello. I'd love to meet you. Uh, and I'm on LinkedIn and if you want to engage with that way, share any feedback about the labs program, uh, I'm happy to engage. [Music] So, a big thanks to Sharon for joining us and sharing some behindthescenes look at the uh Salesforce Labs. I don't know about you, but I'm ready to fire up a sandbox and test drive some Labs apps. That was always the most fun part for me as an admin. But remember, not in production, just in a sandbox or not in production. But uh hey, if
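
The output above shows the shape `RunnableParallel` produces: the same input is handed to every branch, and the results come back in a dict keyed by branch name. A plain-Python approximation is sketched below. It runs the branches one after another and uses a hypothetical `fake_retrieve` in place of the real retriever, so treat it as an illustration of the data flow only.

```python
def run_parallel(branches, value):
    """Apply every branch to the same input and collect results under the branch names."""
    return {name: fn(value) for name, fn in branches.items()}


def fake_retrieve(question):
    # Hypothetical stand-in for retriever | format_docs
    return f"(context retrieved for: {question})"


def passthrough(value):
    # Mirrors the role of RunnablePassthrough: hand the input back unchanged.
    return value


result = run_parallel(
    {"context": fake_retrieve, "question": passthrough},
    "who is Demis",
)
print(result)
```

The transcript never mentions the name asked about here, yet the real retriever still returned context (the closing remarks), which is why the prompt instructs the model to say it doesn't know when the context is insufficient.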

In [27]:
parser = StrOutputParser()

In [28]:
main_chain = parallel_chain | prompt | llm | parser

In [29]:
main_chain.invoke("Can you summarize the video")

'The video features a discussion about Salesforce Labs and its role in helping administrators adopt new technology. The speakers emphasize the importance of data readiness for effective AI implementation and showcase various free solutions available to assist users in this process. They also mention the evolution of the Labs logo and the fun of experimenting with new apps in a sandbox environment. Additionally, they highlight that all Labs solutions undergo the same security review process as partner solutions to ensure quality and trust. The conversation concludes with an invitation for attendees at Dreamforce to engage with the speakers and provide feedback on the Labs program.'
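
The `|` in `main_chain` is ordinary Python operator overloading: each piece exposes `invoke`, and piping two pieces yields a new piece that runs them in order. The minimal sketch below shows the idea with a hypothetical `Step` class and stub functions in place of the retriever, prompt, and model. It is not LangChain's implementation.

```python
class Step:
    """Illustrative wrapper that lets plain functions be chained with |."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # a | b yields a new Step that runs a, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.fn(value)


# Stubs standing in for parallel_chain, prompt, llm, and parser.
gather = Step(lambda q: {"context": "stub context", "question": q})
fill = Step(lambda d: f"{d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda prompt_text: {"content": prompt_text.upper()})
parse = Step(lambda message: message["content"])

chain = gather | fill | fake_llm | parse
print(chain.invoke("Can you summarize the video"))
```

Each stage only needs to accept what the previous stage returns, which is why the real chain can swap in a different retriever or model without touching the other steps.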