## Data Ingestion

By Text Loader:


In [2]:
from langchain_community.document_loaders import TextLoader
loader = TextLoader("sample.txt")
text_documents = loader.load()
text_documents


[Document(metadata={'source': 'sample.txt'}, page_content='Introduction to Data Science\nData Science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract insights and knowledge from structured and unstructured data. Data science is closely related to the fields of data mining, machine learning, and big data. The key components of data science include data collection, data cleaning, data analysis, and data visualization.\n\nData scientists use various tools such as Python, R, SQL, and cloud platforms to process large datasets and build models that help in decision-making. Popular libraries in Python used for data science include Pandas, NumPy, Scikit-learn, and TensorFlow.\n\nKey Concepts in Machine Learning\nMachine learning is a subset of artificial intelligence (AI) that enables systems to learn and improve from experience without being explicitly programmed. There are three main types of machine learning:\n\nSupervised Learning: 

In [7]:
import os
from dotenv import load_dotenv

load_dotenv()

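# os.environ rejects None, so only set the key when it is actually defined
api_key = os.getenv('LANGCHAIN_API_KEY')
if api_key:
    os.environ['LANGCHAIN_API_KEY'] = api_key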


By Web Based Loader:

In [10]:
from langchain_community.document_loaders import WebBaseLoader
# Beautiful Soup, used to restrict parsing to selected page elements
import bs4
loader = WebBaseLoader(
    web_paths=("https://www.healthline.com/nutrition/protein-for-vegans-vegetarians",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("touchevents-false", "chartbeat-section", "css-1avyp1d")
        )
    ),
)
web_documents = loader.load()
web_documents


By PDF Loader:

In [26]:
from langchain_community.document_loaders import PyPDFLoader
loader=PyPDFLoader("meditation.pdf")
pdf_doc = loader.load()

In [27]:
pdf_doc

[Document(metadata={'source': 'meditation.pdf', 'page': 0}, page_content='                                   \n How to Meditate: A Primer for Beginners    \n   JOEL M.     EVANS          \n  key concepts  \n      Th e main objective of the meditative experience is to shift   the  ■\nfocus of the mind away from the frenzied thoughts of everyday \nliving, such as worries about work, family, fi  nances, or simply \nthe “to-do list” of the day.        Personal experience with meditation is required in order to \n ■\naccurately and sincerely communicate expected benefi  ts to \npatients.        Meditation involves trying to train your thoughts to stay in the \n ■\npresent moment, because in the present moment there is true \npeace.        Meditative experiences can be placed into three main catego-\n ■\nries: concentrative, expressive and movement.  \n      Examples of meditation include: painting or drawing, working  ■\nin the garden, staring at the ocean, playing with your pet, jour-\nnal

## Data Transformation

Chunking:

In [28]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Chunks of up to 1000 characters; neighbouring chunks share up to 200 characters
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(pdf_doc)
documents[:5]

[Document(metadata={'source': 'meditation.pdf', 'page': 0}, page_content='How to Meditate: A Primer for Beginners    \n   JOEL M.     EVANS          \n  key concepts  \n      Th e main objective of the meditative experience is to shift   the  ■\nfocus of the mind away from the frenzied thoughts of everyday \nliving, such as worries about work, family, fi  nances, or simply \nthe “to-do list” of the day.        Personal experience with meditation is required in order to \n ■\naccurately and sincerely communicate expected benefi  ts to \npatients.        Meditation involves trying to train your thoughts to stay in the \n ■\npresent moment, because in the present moment there is true \npeace.        Meditative experiences can be placed into three main catego-\n ■\nries: concentrative, expressive and movement.  \n      Examples of meditation include: painting or drawing, working  ■\nin the garden, staring at the ocean, playing with your pet, jour-'),
 Document(metadata={'source': 'meditati
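
To build intuition for `chunk_size` and `chunk_overlap`, here is a minimal sketch of fixed-size chunking with overlap in plain Python. It is a simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph breaks and newlines, so its real chunks will differ. The function name `sliding_chunks` and the sample string are hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
print(pieces)
```

Each piece starts with the last four characters of the previous one, which is why a sentence that straddles a chunk boundary can still be retrieved in full.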

Vector Embeddings and Vector Store:

In [None]:
# Optional: clear the Chroma database before re-running the cells below.
# from chromadb import Client
#
# client = Client()
#
# # List the existing collections
# for collection in client.list_collections():
#     print(collection.name)
#
# # Delete the collection that LangChain creates by default
# client.delete_collection(name="langchain")


In [39]:
#from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

# Embed every chunk with the Ollama embedding model and index the vectors in Chroma
db = Chroma.from_documents(documents, OllamaEmbeddings())

In [61]:
query = "Introduction"
result = db.similarity_search(query)
result

[Document(metadata={'page': 1, 'source': 'meditation.pdf'}, page_content='personal experience as part of the basis for your “meditation prescription, ” as well as actually guide your patients as you provide specifi  c instructions on how \nto meditate, it will be extremely hard for your patients not to pick up on the conviction behind your recommendations, making them far more likely to be implemented. Aft  er all, doesn’t every physician want his patients to follow his \nadvice so they can achieve better health? Learning to meditate and starting a regular meditation practice are vitally important ways to improve the lives of patients and physicians alike. \n Th e main objective of the meditative experience is to shift   the focus of the \nmind away from the frenzied thoughts of everyday living, such as worries about work, family, fi  nances or simply the “to-do list” of the day. Th  ere are \ncountless ways, or techniques, to do this.     \n   What is Meditation?'),
 Document(metadata

In [44]:
from langchain_community.vectorstores import FAISS

# Same chunks and embedding model, indexed in FAISS instead of Chroma
db_Faiss = FAISS.from_documents(documents, OllamaEmbeddings())

In [47]:
query = "Introduction"
result = db_Faiss.similarity_search(query)
result[0].page_content

'personal experience as part of the basis for your “meditation prescription, ” as well as actually guide your patients as you provide specifi  c instructions on how \nto meditate, it will be extremely hard for your patients not to pick up on the conviction behind your recommendations, making them far more likely to be implemented. Aft  er all, doesn’t every physician want his patients to follow his \nadvice so they can achieve better health? Learning to meditate and starting a regular meditation practice are vitally important ways to improve the lives of patients and physicians alike. \n Th e main objective of the meditative experience is to shift   the focus of the \nmind away from the frenzied thoughts of everyday living, such as worries about work, family, fi  nances or simply the “to-do list” of the day. Th  ere are \ncountless ways, or techniques, to do this.     \n   What is Meditation?'
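
Conceptually, `similarity_search` embeds the query and returns the stored chunks whose vectors lie closest to it. The sketch below illustrates that idea with toy three-dimensional vectors and cosine similarity. The vectors, the texts, and the helper names `cosine_similarity` and `top_k` are all made up for illustration, and the distance metric a real store uses depends on its configuration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k texts whose vectors are most similar to query_vec."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy "embeddings"; real embedding vectors have far more dimensions.
toy_store = [
    ("chunk about breathing", [0.9, 0.1, 0.0]),
    ("chunk about posture", [0.1, 0.9, 0.0]),
    ("chunk about history", [0.0, 0.1, 0.9]),
]
query_vec = [0.8, 0.2, 0.0]
nearest = top_k(query_vec, toy_store, k=2)
print(nearest)
```

This also suggests why a one-word query such as "Introduction" can return a chunk that never contains that word: the ranking is driven by vector proximity, not keyword matching.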

Chaining LLM and Prompts:

In [73]:
from langchain_community.llms import Ollama
llm = Ollama(model = "llama2")

from langchain.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_template("""You are a chatbot. Answer the following question based on the context provided. Keep the reply straightforward (short and crisp).
context: {context}
question: {input}
""")

In [74]:
from langchain.chains.combine_documents import create_stuff_documents_chain
document_chain = create_stuff_documents_chain(llm,prompt)
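
The "stuff" strategy simply places the text of all supplied documents into the prompt's `{context}` slot before the model is called. A rough sketch of that step in plain Python follows. The helper `stuff_documents`, the shortened template, and the two sample strings are hypothetical, and the blank-line separator is an assumption about the default.

```python
def stuff_documents(template: str, docs: list[str], question: str,
                    separator: str = "\n\n") -> str:
    """Join all document texts and substitute them into the prompt template."""
    context = separator.join(docs)
    return template.format(context=context, input=question)


template = "context: {context}\nquestion: {input}"
docs = ["Meditation trains attention.", "There are three main categories."]
filled = stuff_documents(template, docs, "what are the categories of meditation?")
print(filled)
```

Because every document is included verbatim, this approach only works while the combined text fits in the model's context window.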

In [75]:
# Using Retrieval Interface
retriever = db.as_retriever()
retriever

VectorStoreRetriever(tags=['Chroma', 'OllamaEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x000001F5EA9E3D00>)

In [76]:
from langchain.chains import create_retrieval_chain
retrieval_chain = create_retrieval_chain(retriever, document_chain)

In [78]:
response = retrieval_chain.invoke({'input': 'what are the categories of meditation?'})
response['answer']

'The categories of meditation are concentrative, expressive, and movement-based.'
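
To make the data flow explicit, here is a sketch of what a retrieval chain does end to end: fetch relevant chunks, build a prompt from them, call the model, and return a dictionary that includes an `answer` key. Everything in it is a stand-in: `fake_retriever` ranks by shared words rather than embeddings, `fake_llm` does not generate text, and the sample chunks are invented.

```python
def fake_retriever(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Stand-in retriever: rank chunks by how many question words they contain."""
    words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda chunk: len(words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def fake_llm(prompt: str) -> str:
    """Stand-in for the model: reports the prompt length instead of generating text."""
    return f"(model reply to a {len(prompt)}-character prompt)"


def retrieval_chain_sketch(question: str, chunks: list[str]) -> dict:
    """Retrieve context, build the prompt, call the model, return all three parts."""
    context_docs = fake_retriever(question, chunks)
    prompt = "context: " + "\n\n".join(context_docs) + "\nquestion: " + question
    return {"input": question, "context": context_docs, "answer": fake_llm(prompt)}


chunks = ["meditation has three categories", "a garden needs water"]
result = retrieval_chain_sketch("which categories of meditation exist", chunks)
print(result["context"])
print(result["answer"])
```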