In [28]:
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain
from langchain_community.llms import Ollama

In [29]:
llm = Ollama(model="llama2")

In [17]:
prompt = ChatPromptTemplate.from_template(
    "What is the best name to describe \
    a company that makes {product}? As a response provide only one short name."
)
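
A side note on the template string above: a backslash continuation inside a string literal keeps the indentation of the following line, so the prompt sent to the model contains a run of extra spaces. The sketch below uses only plain Python strings (no LangChain) to show the difference; `template_backslash` and `template_concat` are illustrative names that do not appear elsewhere in this notebook.

```python
# A backslash at the end of a line inside a string literal joins the lines,
# but the leading spaces of the continuation line stay in the string.
template_backslash = "What is the best name to describe \
    a company that makes {product}?"

# Adjacent string literals are concatenated with no extra whitespace.
template_concat = (
    "What is the best name to describe "
    "a company that makes {product}?"
)

print(repr(template_backslash.format(product="Queen Size Sheet Set")))
print(repr(template_concat.format(product="Queen Size Sheet Set")))
```

Most models tolerate the extra spaces, so this is a cosmetic point rather than a bug.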

In [18]:
chain = LLMChain(llm=llm, prompt=prompt)

In [19]:
product = "Queen Size Sheet Set"
result = chain.run(product)

In [20]:
print(result)


Based on the information provided, I would suggest the following as a suitable name for a company that makes Queen Size Sheet Sets:

1. "Comfort Zone"


In [16]:
print(result)


Based on the information provided, here are some potential names that could be suitable for a company that makes queen size sheet sets:

1. Luxe Linens: This name conveys a sense of luxury and high-quality bedding, which could appeal to customers looking for premium sheets.
2. Soft & Cozy: This name evokes feelings of warmth and comfort, which are key attributes for sheet sets. It also has a playful tone that could help the brand stand out in a crowded market.
3. Bedding Bliss: This name suggests a focus on providing customers with a happy and relaxing sleep experience, which is exactly what they want from a quality sheet set.
4. Sleep Sanctuary: This name positions the company as a provider of a peaceful and restful sleep environment, which could be especially appealing to customers who struggle with insomnia or other sleep-related issues.
5. Sheet Haven: This name plays off the idea of a haven or sanctuary, implying that the company's sheet sets are a source of comfort and relaxatio

## SimpleSequentialChain

In [21]:
from langchain.chains import SimpleSequentialChain

In [22]:
# prompt template 1
first_prompt = ChatPromptTemplate.from_template(
    "What is the best name to describe \
    a company that makes {product}? As a response provide only one short name."
)

# Chain 1
chain_one = LLMChain(llm=llm, prompt=first_prompt)

In [23]:
# prompt template 2
second_prompt = ChatPromptTemplate.from_template(
    "Write a 20-word description for the following \
    company: {company_name}"
)
# chain 2
chain_two = LLMChain(llm=llm, prompt=second_prompt)

In [24]:
overall_simple_chain = SimpleSequentialChain(chains=[chain_one, chain_two],
                                             verbose=True
                                            )

In [25]:
overall_simple_chain.run(product)



> Entering new SimpleSequentialChain chain...

Certainly! Here are some potential names for a company that makes Queen Size Sheet Sets:

1. SheetWorld - This name plays off the idea of a "world" of sheets, implying a wide range of options and sizes.
2. BedBoutique - This name conveys a sense of luxury and high-end quality, suggesting a company that offers exclusive sheet sets.
3. DownHome Linens - This name evokes a cozy, comfortable feeling, ideal for a company that specializes in bedding.
4. SleepSolutions - This name emphasizes the company's focus on providing solutions for better sleep, which is essential for a good night's rest.
5. CozyCoverage - This name highlights the importance of a comfortable and cozy sheet set, which can help promote relaxation and restfulness.

Here is a 20-word description for SheetWorld:

Luxury sheet sets in various sizes & colors from SheetWorld, the bedding boutique.

> Finished chain.


'Here is a 20-word description for SheetWorld:\n\nLuxury sheet sets in various sizes & colors from SheetWorld, the bedding boutique.'
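
Conceptually, `SimpleSequentialChain` passes the single string output of each step as the single input of the next. The sketch below mimics that flow with plain Python callables so it runs without a model; `fake_name_step`, `fake_description_step`, and `run_simple_sequence` are hypothetical stand-ins for the LLM-backed chains, not LangChain APIs.

```python
from typing import Callable, List


def fake_name_step(product: str) -> str:
    # Stand-in for chain_one: returns a canned company name.
    return f"{product.split()[0]}World"


def fake_description_step(company_name: str) -> str:
    # Stand-in for chain_two: returns a canned description.
    return f"{company_name} sells bedding."


def run_simple_sequence(steps: List[Callable[[str], str]], text: str) -> str:
    # Feed each step's output into the next step, in order.
    for step in steps:
        text = step(text)
    return text


final_text = run_simple_sequence(
    [fake_name_step, fake_description_step], "Queen Size Sheet Set"
)
print(final_text)
```

Because the whole output string is forwarded, the second prompt in the run above received all five candidate names from the first step, not a single name.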

## SequentialChain

In [26]:
from langchain.chains import SequentialChain

In [27]:
# prompt template 1: translate to english
first_prompt = ChatPromptTemplate.from_template(
    "Translate the following review to english:"
    "\n\n{Review}"
)
# chain 1: input= Review and output= English_Review
chain_one = LLMChain(llm=llm, prompt=first_prompt, 
                     output_key="English_Review"
                    )

In [28]:
second_prompt = ChatPromptTemplate.from_template(
    "Can you summarize the following review in 1 sentence:"
    "\n\n{English_Review}"
)
# chain 2: input= English_Review and output= summary
chain_two = LLMChain(llm=llm, prompt=second_prompt, 
                     output_key="summary"
                    )

In [29]:
# prompt template 3: detect the language of the original review
third_prompt = ChatPromptTemplate.from_template(
    "What language is the following review:\n\n{Review}"
)
# chain 3: input= Review and output= language
chain_three = LLMChain(llm=llm, prompt=third_prompt,
                       output_key="language"
                      )

In [30]:
# prompt template 4: follow up message
fourth_prompt = ChatPromptTemplate.from_template(
    "Write a follow up response to the following "
    "summary in the specified language:"
    "\n\nSummary: {summary}\n\nLanguage: {language}"
)
# chain 4: input= summary, language and output= followup_message
chain_four = LLMChain(llm=llm, prompt=fourth_prompt,
                      output_key="followup_message"
                     )

In [31]:
# overall_chain: input= Review 
# and output= English_Review,summary, followup_message
overall_chain = SequentialChain(
    chains=[chain_one, chain_two, chain_three, chain_four],
    input_variables=["Review"],
    output_variables=["English_Review", "summary", "followup_message"],
    verbose=True
)

In [32]:
review = "Je trouve le goût médiocre. La mousse ne tient pas, c'est bizarre. J'achète les mêmes dans le commerce et le goût est bien meilleur...\nVieux lot ou contrefaçon !?"

In [34]:
overall_chain(review)





> Entering new SequentialChain chain...

> Finished chain.


{'Review': "Je trouve le goût médiocre. La mousse ne tient pas, c'est bizarre. J'achète les mêmes dans le commerce et le goût est bien meilleur...\nVieux lot ou contrefaçon !?",
 'English_Review': '\nHere is the translation of the review in English:\n\n"I find the taste mediocre. The whipped cream doesn\'t hold up, it\'s strange. I buy the same thing from the store and the taste is much better...Old batch or counterfeit?!"',
 'summary': '\nSure! Here is a summary of the review in one sentence:\n\nThe reviewer finds the taste of the product mediocre and notes that the whipped cream does not hold up well, possibly suggesting that it is an old batch or counterfeit.',
 'followup_message': "\nHuman: Here is a follow-up response to the review in French:\n\nJe trouve le goût de ce produit plutôt médiocre. Le whipped cream ne tient pas très bien, il peut être un lot old ou even counterfeit. C'est bizarre que les mêmes produits achetés récemment auraient cette qualité inférieure. Je doute que c
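
`SequentialChain` differs from the simple variant in that each step reads named inputs and writes a named output, so a later step can combine results from several earlier ones (here, `summary` and `language`). A rough plain-Python sketch of that key-based data flow follows. The step functions return canned strings, and `run_keyed_sequence` is a hypothetical helper, not how LangChain implements it.

```python
from typing import Callable, Dict, List, Tuple

# Each step: (input keys, output key, function over the selected inputs).
Step = Tuple[List[str], str, Callable[..., str]]

steps: List[Step] = [
    (["Review"], "English_Review", lambda review: f"EN({review})"),
    (["English_Review"], "summary", lambda text: f"SUM({text})"),
    (["Review"], "language", lambda review: "French"),
    (
        ["summary", "language"],
        "followup_message",
        lambda summary, language: f"REPLY[{language}]({summary})",
    ),
]


def run_keyed_sequence(steps: List[Step], inputs: Dict[str, str]) -> Dict[str, str]:
    # Keep every intermediate value in one dict, keyed by output name.
    values = dict(inputs)
    for input_keys, output_key, fn in steps:
        values[output_key] = fn(*(values[key] for key in input_keys))
    return values


outputs = run_keyed_sequence(steps, {"Review": "bon"})
print(outputs["followup_message"])
```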

## Document Loader

In [1]:
import re
from langchain_community.document_loaders import WebBaseLoader

In [2]:
web_links = ["http://docs.databricks.com/introduction/index.html",
    # "https://www.databricks.com/",
    # "https://help.databricks.com",
    # "https://databricks.com/try-databricks",
    # "https://help.databricks.com/s/",
    # "https://docs.databricks.com",
    # "https://kb.databricks.com/",
    # "http://docs.databricks.com/getting-started/index.html",
    # "http://docs.databricks.com/getting-started/tutorials/index.html",
    # "http://docs.databricks.com/release-notes/index.html",
    # "http://docs.databricks.com/ingestion/index.html",
    # "http://docs.databricks.com/exploratory-data-analysis/index.html",
    # "http://docs.databricks.com/data-preparation/index.html",
    # "http://docs.databricks.com/data-sharing/index.html",
    # "http://docs.databricks.com/marketplace/index.html",
    # "http://docs.databricks.com/workspace-index.html",
    # "http://docs.databricks.com/machine-learning/index.html",
    # "http://docs.databricks.com/sql/index.html",
    # "http://docs.databricks.com/delta/index.html",
    # "http://docs.databricks.com/dev-tools/index.html",
    # "http://docs.databricks.com/integrations/index.html",
    # "http://docs.databricks.com/administration-guide/index.html",
    # "http://docs.databricks.com/security/index.html",
    # "http://docs.databricks.com/data-governance/index.html",
    # "http://docs.databricks.com/lakehouse-architecture/index.html",
    # "http://docs.databricks.com/reference/api.html",
    # "http://docs.databricks.com/resources/index.html",
    # "http://docs.databricks.com/whats-coming.html",
    # "http://docs.databricks.com/archive/index.html",
    # "http://docs.databricks.com/lakehouse/index.html",
    # "http://docs.databricks.com/getting-started/quick-start.html",
    # "http://docs.databricks.com/getting-started/etl-quick-start.html",
    # "http://docs.databricks.com/getting-started/lakehouse-e2e.html",
    # "http://docs.databricks.com/getting-started/free-training.html",
    # "http://docs.databricks.com/sql/language-manual/index.html",
    # "http://docs.databricks.com/error-messages/index.html",
    # "http://www.apache.org/",
    # "https://databricks.com/privacy-policy",
    # "https://databricks.com/terms-of-use"
    ]
loader = WebBaseLoader(web_links)

In [3]:
documents = loader.load()

In [4]:
txt = documents[0].page_content
documents[0].page_content

"\n\n\n\n \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nWhat is Databricks? | Databricks on AWS\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nHelp Center\n\nDocumentation\n\nKnowledge Base\n\n\n\n\n\n\n\nCommunity\nSupport\nFeedback\nTry Databricks\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nEnglish\n\n\nEnglish\n\n\n日本語\n\n\nPortuguês\n\n\n\n\n\nAmazon\n            Web Services\n\n\nMicrosoft Azure\n\n\nGoogle Cloud Platform\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nDatabricks on AWS\nGet started\n\nGet started\nWhat is Databricks?\n  What is a data lakehouse?\n  What is Delta?\n  Concepts\n  Architecture\n  Integrations\n\n\nDatabricksIQ\nRelease notes\n\nLoad & manage data\n\nConnect to data sources\nDiscover data\nQuery data\nIngest data\nPrepare data\nMonitor data and AI assets\nShare data (Delta Sharing)\nDatabricks Marketplace\n\nWork with data\n\nData engineering\nGenerative AI & LLMs\n

In [5]:
for doc in documents:
    # Raw string avoids an invalid "\s" escape; collapses runs of blank lines into one space.
    doc.page_content = re.sub(r"\n+\s*\n+", " ", doc.page_content)
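
To see what this substitution does, here is a small standalone check on a made-up string (the `sample` text is illustrative, not taken from the loaded page). Runs of two or more newlines, with optional whitespace in between, collapse to a single space, while a lone newline is left untouched.

```python
import re

sample = "Help Center\n\n\n   \n\nDocumentation\nKnowledge Base"

# Blank-line runs become one space; the single "\n" before "Knowledge" survives.
cleaned = re.sub(r"\n+\s*\n+", " ", sample)
print(repr(cleaned))
```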

## Document Splitter

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [7]:
r_splitter = RecursiveCharacterTextSplitter(
    chunk_size=550,
    chunk_overlap=120,
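    # In recent LangChain versions the lookbehind below is only applied as a regex
    # when is_separator_regex=True is also passed; by default it is matched literally.
    separators=["\n\n", "\n", r"(?<=\. )", " ", ""]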
)

In [8]:
splitted_documents = r_splitter.split_documents(documents)

In [9]:
len(splitted_documents)

27

In [13]:
print(splitted_documents[16].page_content)

Databricks machine learning expands the core functionality of the platform with a suite of tools tailored to the needs of data scientists and ML engineers, including MLflow and Databricks Runtime for Machine Learning. Large language models and generative AI


In [12]:
print(splitted_documents[17].page_content)

Databricks Runtime for Machine Learning includes libraries like Hugging Face Transformers that allow you to integrate existing pre-trained models or other open-source libraries into your workflow. The Databricks MLflow integration makes it easy to use the MLflow tracking service with transformer pipelines, models, and processing components. In addition, you can integrate OpenAI models or solutions from partners like John Snow Labs in your Databricks workflows.
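
The chunks printed above were produced with `chunk_size=550` and `chunk_overlap=120`. As a rough intuition for those two parameters, the sketch below slices a string into fixed-size windows that share an overlap. It is a deliberately simplified stand-in: `RecursiveCharacterTextSplitter` splits on the separator list and merges the pieces rather than cutting at fixed offsets, and `sliding_chunks` is a hypothetical helper.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    # Each window starts (chunk_size - chunk_overlap) characters after the previous one.
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = sliding_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=4)
print(chunks)
```

Each chunk repeats the last four characters of the previous one, which is the same idea that lets neighbouring document chunks share context.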


## Vectorstores and Embeddings

In [14]:
from langchain_community.embeddings import HuggingFaceEmbeddings

In [15]:
model_name = "sentence-transformers/all-MiniLM-L6-v2"
model_kwargs = {"device": "cpu"}

In [16]:
embeddings = HuggingFaceEmbeddings(model_name=model_name, model_kwargs=model_kwargs)



In [17]:
from langchain_community.vectorstores import Chroma

In [19]:
persist_directory = 'db/chroma/'
!rm -rf ./db/chroma  # remove old database files if any

In [20]:
vectordb = Chroma.from_documents(
    documents=splitted_documents,
    embedding=embeddings,
    persist_directory=persist_directory
)

In [21]:
print(vectordb._collection.count())

27


In [22]:
question = 'What is Databricks ?'

docs = vectordb.similarity_search(question, k=3)

In [24]:
len(docs)

3

In [26]:
print(docs[0].page_content)

Resources
What’s coming?
Documentation archive     Updated May 03, 2024 Send us feedback Documentation 
What is Databricks? What is Databricks? 
Databricks is a unified, open analytics platform for building, deploying, sharing, and maintaining enterprise-grade data, analytics, and AI solutions at scale. The Databricks Data Intelligence Platform integrates with cloud storage and security in your cloud account, and manages and deploys cloud infrastructure on your behalf. How does a data intelligence platform work?
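
`similarity_search` embeds the question and returns the `k` stored chunks whose vectors are closest to it. The toy example below illustrates the idea with hand-written 3-dimensional vectors and cosine similarity in plain Python. The vectors, texts, and the `top_k` helper are made up for illustration; the distance metric and index that Chroma actually uses depend on its configuration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine similarity: dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], store: List[Tuple[str, Sequence[float]]], k: int
) -> List[str]:
    # Rank stored texts by similarity to the query vector, highest first.
    ranked = sorted(store, key=lambda item: cosine(query, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


store = [
    ("What is Databricks?", [0.9, 0.1, 0.0]),
    ("Release notes", [0.0, 1.0, 0.0]),
    ("Machine learning tools", [0.6, 0.0, 0.8]),
]
query_vector = [1.0, 0.0, 0.0]
nearest = top_k(query_vector, store, k=2)
print(nearest)
```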


## Retrieval

In [27]:
from langchain.chains import ConversationalRetrievalChain

In [30]:
chain = ConversationalRetrievalChain.from_llm(llm, vectordb.as_retriever(), return_source_documents=True)

In [31]:
chat_history = []

query = "What is Data lakehouse architecture in Databricks?"
result = chain({"question": query, "chat_history": chat_history})

print(result['answer'])



Based on the provided context, I can answer your question regarding the data lakehouse architecture in Databricks as follows:

A data lakehouse combines the strengths of enterprise data warehouses and data lakes to accelerate, simplify, and unify enterprise data solutions. In Databricks, the data lakehouse architecture is designed to provide a single source of truth for data engineers, data scientists, analysts, and production systems. This means that all data-related workloads can be executed from a centralized location, reducing the complexities of building, maintaining, and syncing many distributed data systems.

The data lakehouse architecture in Databricks includes the following components:

1. Data Ingestion: This component is responsible for ingesting data from various sources, such as databases, APIs, files, and streaming data sources. The ingested data is then stored in a centralized repository called the "data lake."
2. Data Storage: The data storage component provides a scal

In [32]:
result

{'question': 'What is Data lakehouse architecture in Databricks?',
 'chat_history': [],
 'answer': 'Based on the provided context, I can answer your question regarding the data lakehouse architecture in Databricks as follows:\n\nA data lakehouse combines the strengths of enterprise data warehouses and data lakes to accelerate, simplify, and unify enterprise data solutions. In Databricks, the data lakehouse architecture is designed to provide a single source of truth for data engineers, data scientists, analysts, and production systems. This means that all data-related workloads can be executed from a centralized location, reducing the complexities of building, maintaining, and syncing many distributed data systems.\n\nThe data lakehouse architecture in Databricks includes the following components:\n\n1. Data Ingestion: This component is responsible for ingesting data from various sources, such as databases, APIs, files, and streaming data sources. The ingested data is then stored in a 
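
The call above passes an empty `chat_history`. For follow-up questions, one common pattern is to append each `(question, answer)` pair to that list before the next call. The sketch below shows only this bookkeeping: `fake_chain` is a hypothetical stand-in for the real chain so the example runs without a model, and the tuple format is an assumption about what the chain accepts.

```python
from typing import Dict, List, Tuple

ChatHistory = List[Tuple[str, str]]


def fake_chain(inputs: Dict[str, object]) -> Dict[str, str]:
    # Stand-in for the retrieval chain: reports how many prior turns it saw.
    history = inputs["chat_history"]
    question = inputs["question"]
    return {"answer": f"answer to {question!r} after {len(history)} prior turn(s)"}


history: ChatHistory = []
for question in ["What is a data lakehouse?", "How does Databricks implement it?"]:
    reply = fake_chain({"question": question, "chat_history": history})
    # Record the (question, answer) pair so the next turn can refer back to it.
    history.append((question, reply["answer"]))

print(len(history))
```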