In [1]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

In [2]:
# Sample documents
documents = [
    "This is a list which containig sample documents.",
    "Keywords are important for keyword-based search.",
    "Document analysis involves extracting keywords.",
    "Keyword-based search relies on sparse embeddings."
]

In [3]:
query = "keyword-based search"

In [4]:
import re
def preprocess_text(text):
    # Convert text to lowercase
    text = text.lower()
    # Remove punctuation
    text = re.sub(r'[^\w\s]', '', text)
    return text

In [5]:
preprocess_documents = [preprocess_text(doc) for doc in documents]

In [6]:
preprocess_documents

['this is a list which containig sample documents',
 'keywords are important for keywordbased search',
 'document analysis involves extracting keywords',
 'keywordbased search relies on sparse embeddings']

In [7]:
preprocessed_query = preprocess_text(query)

In [8]:
preprocessed_query

'keywordbased search'

In [9]:
vector = TfidfVectorizer()

In [10]:
X = vector.fit_transform(preprocess_documents)

In [11]:
X.toarray()

array([[0.        , 0.        , 0.37796447, 0.        , 0.37796447,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.37796447, 0.        , 0.        , 0.37796447, 0.        ,
        0.        , 0.37796447, 0.        , 0.        , 0.37796447,
        0.37796447],
       [0.        , 0.4533864 , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.4533864 , 0.4533864 , 0.        ,
        0.        , 0.35745504, 0.35745504, 0.        , 0.        ,
        0.        , 0.        , 0.35745504, 0.        , 0.        ,
        0.        ],
       [0.46516193, 0.        , 0.        , 0.46516193, 0.        ,
        0.        , 0.46516193, 0.        , 0.        , 0.46516193,
        0.        , 0.        , 0.36673901, 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.        ,
        0.43671931, 0.        , 0.        , 0.       

In [12]:
X.toarray()[0]

array([0.        , 0.        , 0.37796447, 0.        , 0.37796447,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.37796447, 0.        , 0.        , 0.37796447, 0.        ,
       0.        , 0.37796447, 0.        , 0.        , 0.37796447,
       0.37796447])

In [13]:
query_embedding = vector.transform([preprocessed_query])

In [14]:
query_embedding.toarray()

array([[0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.70710678, 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.70710678, 0.        , 0.        ,
        0.        ]])
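
To read the arrays above, it helps to know which column belongs to which term. The sketch below rebuilds the vocabulary by hand, assuming `TfidfVectorizer`'s default behaviour (tokens of two or more word characters, columns assigned in sorted term order). On the fitted vectorizer itself, `vector.get_feature_names_out()` should give the same mapping.

```python
import re

# The preprocessed documents from the cells above.
preprocessed = [
    "this is a list which containig sample documents",
    "keywords are important for keywordbased search",
    "document analysis involves extracting keywords",
    "keywordbased search relies on sparse embeddings",
]

# Assumption: scikit-learn's default token pattern, which drops
# one-character tokens such as "a".
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

# Columns are assigned in sorted term order.
vocabulary = sorted(
    {token for doc in preprocessed for token in TOKEN_PATTERN.findall(doc)}
)
column_of = {term: index for index, term in enumerate(vocabulary)}

print(len(vocabulary))            # 21 columns, matching the width of X
print(column_of["keywordbased"])  # 11
print(column_of["search"])        # 17
```

Columns 11 and 17 are exactly the two non-zero positions in the query vector above.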


## Sparse Vector Explained

A sparse vector is a vector in which most elements are zero. Sparse storage formats exploit this by keeping only the non-zero entries and their positions, which saves memory and computation time.

**Why use them?**

* **Memory Efficiency:** Storing only the non-zero elements and their indices needs memory proportional to the number of non-zeros rather than to the full vector length.
* **Computational Efficiency:** Operations such as vector addition or the dot product only need to visit the non-zero elements, so their cost also scales with the number of non-zeros.


**Example in Text Analysis:**

Consider a collection of documents and a vocabulary of words. We want to represent each document as a vector where each element corresponds to a word in the vocabulary. We could have a vector where the value of each element represents the frequency or importance of a particular word in the document.

However, in practice, most documents will only contain a small subset of all possible words in the vocabulary. The majority of elements in the document vector would be zero because the document doesn't contain those words. This is where a sparse vector representation comes in.

In the code above:

* `TfidfVectorizer` creates sparse vectors that hold a TF-IDF weight for each vocabulary word in each document.
* `X` is stored as a SciPy sparse matrix; `X.toarray()` converts it to a dense array only so that it can be displayed.
* `query_embedding` is likewise a sparse vector (a one-row sparse matrix) representing the query.

**Key Concepts:**

* **Non-Zero Elements:** The only entries that are stored explicitly; every other position is implicitly zero.
* **Storage Methods:** Sparse vectors and matrices are usually stored in specialized data structures such as:
    * **Coordinate List (COO):** Stores a list of (row, column, value) triplets.
    * **Compressed Sparse Row (CSR):** Stores the non-zero values row by row, together with their column indices and a pointer to where each row starts; efficient for row-wise operations.
    * **Compressed Sparse Column (CSC):** The column-wise counterpart of CSR, optimized for column-wise operations.
* **Applications:**
    * **Text Analysis (TF-IDF, bag-of-words):** As in the example above.
    * **Recommendation Systems:** Representing user-item interactions.
    * **Machine Learning:** High-dimensional feature vectors for training models.


Sparse representations make it practical to store and process high-dimensional data in which most values are zero.
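
The CSR layout described above can be sketched in a few lines of plain Python. This is an illustration only, with hypothetical helper names (`to_csr`, `csr_row_dot`); in practice a library such as `scipy.sparse` would be used instead.

```python
def to_csr(dense_rows):
    """Convert a list of dense rows into the three CSR arrays."""
    data, indices, indptr = [], [], [0]
    for row in dense_rows:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)    # the non-zero value
                indices.append(col)   # its column index
        indptr.append(len(data))      # where the next row starts in `data`
    return data, indices, indptr


def csr_row_dot(data, indices, indptr, row, dense_vector):
    """Dot product of one CSR row with a dense vector, visiting only non-zeros."""
    start, end = indptr[row], indptr[row + 1]
    return sum(data[k] * dense_vector[indices[k]] for k in range(start, end))


dense = [
    [0.0, 0.0, 3.0, 0.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 0.0, 0.0],
]
data, indices, indptr = to_csr(dense)
print(data)     # [3.0, 1.0, 2.0]
print(indices)  # [2, 0, 3]
print(indptr)   # [0, 1, 3, 3]
print(csr_row_dot(data, indices, indptr, 1, [10.0, 20.0, 30.0, 40.0]))  # 90.0
```

Only three values are stored for the twelve-element matrix, and the empty third row costs a single entry in `indptr`.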


## Why use TF-IDF?

TF-IDF (Term Frequency-Inverse Document Frequency) is a weighting scheme used in information retrieval and text mining to quantify how important a word is to a document relative to a collection of documents (the corpus).

Here's a breakdown of its benefits:

**1. Identifying Key Words:**

   - TF-IDF helps identify words that are **most representative** of a specific document within a larger set of documents.
   - It assigns higher weights to words that appear frequently in a particular document but relatively infrequently in the rest of the corpus. These words are likely to be more important and informative for characterizing that document.

**2. Feature Engineering for Text:**

   - It converts text into numerical vectors that can be used as features for machine learning models. These vectors record which terms occur and how distinctive they are; they do not capture word order or synonymy, so they reflect lexical overlap rather than semantic meaning.

**3. Improving Search Relevance:**

   - TF-IDF is a classic way for search engines to rank results by their relevance to a query.
   - Query terms that are rare across the corpus receive higher weights, so documents that match the distinctive terms of a query rank above documents that match only common ones.


**4. Addressing Common Words:**

   - TF-IDF automatically downweights common words (like "the", "a", "is") that appear frequently in many documents but don't carry much meaning in distinguishing between them. This helps to focus on more relevant terms.

**In this notebook:**

- TF-IDF converts the documents and the query into numerical vectors.
- The cosine similarity between these vectors can then be used to measure the similarity between the query and each document. Documents with higher cosine similarity scores are considered more relevant to the query.


**In essence, TF-IDF is a valuable tool for understanding the importance of words in documents and for building effective text-based applications such as search engines, document clustering, and topic modeling.**
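
The weighting can be reproduced by hand. The sketch below assumes scikit-learn's default settings (raw term counts, smoothed IDF `ln((1 + n) / (1 + df)) + 1`, and L2 normalization of each document vector) and recomputes two weights of the second document shown earlier.

```python
import math
import re
from collections import Counter

docs = [
    "this is a list which containig sample documents",
    "keywords are important for keywordbased search",
    "document analysis involves extracting keywords",
    "keywordbased search relies on sparse embeddings",
]

# Assumption: the default token pattern (tokens of two or more word characters).
tokenized = [re.findall(r"(?u)\b\w\w+\b", doc) for doc in docs]
n_docs = len(tokenized)

# Document frequency: in how many documents each term appears.
df = Counter(term for tokens in tokenized for term in set(tokens))


def tfidf(tokens):
    """TF-IDF weights for one document, L2-normalized."""
    tf = Counter(tokens)
    weights = {
        term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
        for term, count in tf.items()
    }
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()}


doc1 = tfidf(tokenized[1])
print(round(doc1["important"], 4))  # 0.4534 (appears in one document)
print(round(doc1["search"], 4))     # 0.3575 (appears in two documents)
```

The term that occurs in only one document ends up with the larger weight, which is the behaviour described in point 1 above.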


In [15]:
similarities = cosine_similarity(X, query_embedding)

In [16]:
similarities

array([[0.        ],
       [0.50551777],
       [0.        ],
       [0.48693426]])
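
As a cross-check on the second score, cosine similarity can be computed directly from the non-zero weights. The sketch below uses plain dictionaries as sparse vectors; the values are copied from the arrays printed earlier, and the term labels assume the sorted-vocabulary column order.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {term: weight}."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention: an all-zero vector is similar to nothing
    return dot / (norm_u * norm_v)


# Second document and the query, non-zero entries only.
doc = {
    "are": 0.4533864, "for": 0.4533864, "important": 0.4533864,
    "keywordbased": 0.35745504, "keywords": 0.35745504, "search": 0.35745504,
}
query = {"keywordbased": 0.70710678, "search": 0.70710678}

print(round(cosine(doc, query), 4))  # 0.5055
```

Only the two shared terms contribute to the dot product, which is why documents with no query terms score exactly 0.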

In [17]:
# Rank documents: argsort is ascending, so reverse it to get the highest similarity first
ranked_indices = np.argsort(similarities, axis=0)[::-1].flatten()

In [18]:
ranked_documents = [documents[i] for i in ranked_indices]

In [19]:
ranked_documents

['Keywords are important for keyword-based search.',
 'Keyword-based search relies on sparse embeddings.',
 'Document analysis involves extracting keywords.',
 'This is a list which containig sample documents.']

In [20]:
query

'keyword-based search'

In [21]:
# Output the ranked documents
for i, doc in enumerate(ranked_documents):
    print(f"Rank {i+1}: {doc}")

Rank 1: Keywords are important for keyword-based search.
Rank 2: Keyword-based search relies on sparse embeddings.
Rank 3: Document analysis involves extracting keywords.
Rank 4: This is a list which containig sample documents.
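
Ranks 3 and 4 above both have a similarity of 0, so their relative order carries no information. A possible variant, sketched here with the scores copied from the output above, keeps the score next to each document and drops documents that share no term with the query.

```python
documents = [
    "This is a list which containig sample documents.",
    "Keywords are important for keyword-based search.",
    "Document analysis involves extracting keywords.",
    "Keyword-based search relies on sparse embeddings.",
]
# Cosine similarities copied from the output above.
scores = [0.0, 0.50551777, 0.0, 0.48693426]

# Sort (score, document) pairs by descending score.
ranked = sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)

# Keep only documents that actually matched the query.
hits = [(score, doc) for score, doc in ranked if score > 0]

for rank, (score, doc) in enumerate(hits, start=1):
    print(f"Rank {rank} ({score:.3f}): {doc}")
```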


In [22]:
# Sample documents (represented as dense vectors)
document_embeddings = np.array([
    [0.634, 0.234, 0.867, 0.042, 0.249],
    [0.123, 0.456, 0.789, 0.321, 0.654],
    [0.987, 0.654, 0.321, 0.123, 0.456]
])

In [23]:
# Sample search query (represented as a dense vector)
query_embedding = np.array([[0.789, 0.321, 0.654, 0.987, 0.123]])

In [24]:
# Calculate cosine similarity between query and documents
similarities = cosine_similarity(document_embeddings, query_embedding)

In [25]:
similarities

array([[0.73558979],
       [0.67357898],
       [0.71517305]])

In [26]:
ranked_indices = np.argsort(similarities, axis=0)[::-1].flatten()

In [27]:
ranked_indices

array([0, 2, 1])

In [28]:
# Output the ranked documents
for i, idx in enumerate(ranked_indices):
    print(f"Rank {i+1}: Document {idx+1}")

Rank 1: Document 1
Rank 2: Document 3
Rank 3: Document 2


# Creating RAG

In [29]:
doc_path = "/content/Lecture 03.pdf"

In [30]:
!pip install --quiet pypdf langchain_community

In [31]:
from langchain_community.document_loaders import PyPDFLoader

In [33]:
loader = PyPDFLoader(doc_path)

In [34]:
docs = loader.load()

In [35]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [36]:
splitter = RecursiveCharacterTextSplitter(chunk_size=200,chunk_overlap=30)

In [37]:
chunks = splitter.split_documents(docs)

In [38]:
chunks

[Document(metadata={'producer': 'Adobe PDF Library 9.0', 'creator': 'Acrobat PDFMaker 9.0 for PowerPoint', 'creationdate': '2014-04-07T22:55:44+06:00', 'author': 'Eva', 'company': 'cse', 'moddate': '2014-04-07T22:55:48+06:00', 'title': 'fsfsdfdsf', 'source': '/content/Lecture 03.pdf', 'total_pages': 36, 'page': 0, 'page_label': '1'}, page_content='Lecture - 3'),
 Document(metadata={'producer': 'Adobe PDF Library 9.0', 'creator': 'Acrobat PDFMaker 9.0 for PowerPoint', 'creationdate': '2014-04-07T22:55:44+06:00', 'author': 'Eva', 'company': 'cse', 'moddate': '2014-04-07T22:55:48+06:00', 'title': 'fsfsdfdsf', 'source': '/content/Lecture 03.pdf', 'total_pages': 36, 'page': 1, 'page_label': '2'}, page_content='The Basic Information Types\nInstruction\nData\nNonnumerical data\nNumbers\nFixed-point\nBinary\nDecimal\nFloating-point\nBinary\nDecimal\nInformation'),
 Document(metadata={'producer': 'Adobe PDF Library 9.0', 'creator': 'Acrobat PDFMaker 9.0 for PowerPoint', 'creationdate': '2014-04
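
The effect of `chunk_size=200` and `chunk_overlap=30` can be illustrated with a plain sliding window. This is a simplified sketch with a hypothetical helper, not the splitter's actual algorithm: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraph and line breaks, so its chunks are usually shorter than the maximum.

```python
def sliding_window_chunks(text, chunk_size=200, chunk_overlap=30):
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


# Synthetic 500-character text so the window boundaries are easy to check.
sample_text = "".join(str(i % 10) for i in range(500))
sample_chunks = sliding_window_chunks(sample_text)

print([len(chunk) for chunk in sample_chunks])            # [200, 200, 160]
print(sample_chunks[0][-30:] == sample_chunks[1][:30])    # True
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence cut at a boundary still appears whole in at least one chunk.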

In [39]:
from langchain.embeddings import HuggingFaceInferenceAPIEmbeddings

In [40]:
from google.colab import userdata
HF_API_KEY = userdata.get('HUGGING_FACE_TOKEN')

In [41]:
embeddings = HuggingFaceInferenceAPIEmbeddings(api_key=HF_API_KEY, model_name="BAAI/bge-base-en-v1.5")

In [42]:
!pip install --quiet chromadb

In [43]:
from langchain.vectorstores import Chroma

In [44]:
vectorstore = Chroma.from_documents(chunks,embeddings)

In [45]:
vectorstore_retreiver = vectorstore.as_retriever(search_kwargs={"k": 2})

In [46]:
vectorstore_retreiver

VectorStoreRetriever(tags=['Chroma', 'HuggingFaceInferenceAPIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x7f77f497ea10>, search_kwargs={'k': 2})

In [47]:
!pip install --quiet rank_bm25

In [48]:
from langchain.retrievers import BM25Retriever, EnsembleRetriever
keyword_retriever = BM25Retriever.from_documents(chunks)
keyword_retriever.k =  2
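
For intuition about what the keyword retriever scores, here is a minimal BM25 sketch in plain Python. It uses a common textbook variant of the formula with assumed parameters `k1=1.5` and `b=0.75`, and whitespace tokenization; the `rank_bm25` package may differ in details such as how it computes IDF.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(tokens) for tokens in docs_tokens) / n_docs
    # Document frequency of each term.
    df = Counter(term for tokens in docs_tokens for term in set(tokens))

    scores = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        score = 0.0
        for term in query_tokens:
            if tf[term] == 0:
                continue  # term absent from this document
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates (k1) and is length-normalized (b).
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


corpus = [
    "this is a list which containig sample documents",
    "keywords are important for keywordbased search",
    "document analysis involves extracting keywords",
    "keywordbased search relies on sparse embeddings",
]
bm25 = bm25_scores("keywordbased search".split(), [doc.split() for doc in corpus])
print([round(score, 3) for score in bm25])
```

As with TF-IDF, documents that contain none of the query terms score 0; unlike TF-IDF, repeated occurrences of a term give diminishing returns.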

In [49]:
ensemble_retriever = EnsembleRetriever(retrievers=[vectorstore_retreiver, keyword_retriever], weights=[0.3, 0.7])

# Mixing vector search and keyword search for hybrid search
### hybrid_score = (1 - alpha) * sparse_score + alpha * dense_score
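
With `weights=[0.3, 0.7]`, the dense retriever gets weight 0.3 and BM25 gets 0.7, which corresponds to `alpha = 0.3` in the formula. The sketch below shows two ways such a mix can be computed; all function names and scores are hypothetical. The first applies the formula to min-max normalized scores. The second is a weighted Reciprocal Rank Fusion, which, as far as its documentation describes, is closer to what `EnsembleRetriever` does, since it combines ranks rather than raw scores (the constant `c=60` is an assumption).

```python
def min_max(scores):
    """Rescale scores to [0, 1] so sparse and dense scores are comparable."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def hybrid_scores(sparse, dense, alpha):
    """hybrid = (1 - alpha) * sparse + alpha * dense, on normalized scores."""
    sparse_n, dense_n = min_max(sparse), min_max(dense)
    doc_ids = set(sparse) | set(dense)
    return {
        doc_id: (1 - alpha) * sparse_n.get(doc_id, 0.0)
        + alpha * dense_n.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


def weighted_rrf(rankings, weights, c=60):
    """Fuse ranked lists: each list adds weight / (rank + c) to a document."""
    fused = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + weight / (rank + c)
    return sorted(fused, key=fused.get, reverse=True)


# Made-up scores for three documents.
sparse_scores = {"a": 2.0, "b": 1.0, "c": 0.0}
dense_scores = {"a": 0.1, "b": 0.9, "c": 0.5}

mixed = hybrid_scores(sparse_scores, dense_scores, alpha=0.3)
print(sorted(mixed, key=mixed.get, reverse=True))  # ['a', 'b', 'c']

# Same preferences expressed as rankings: dense first, then sparse.
print(weighted_rrf([["b", "c", "a"], ["a", "b", "c"]], weights=[0.3, 0.7]))
```

Rank-based fusion avoids the normalization step entirely, which is convenient when the two retrievers produce scores on unrelated scales.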

In [50]:
model_name = "HuggingFaceH4/zephyr-7b-beta"

In [51]:
!pip install --quiet bitsandbytes accelerate

In [52]:
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    pipeline,
)
from langchain_community.llms import HuggingFacePipeline

In [53]:
# function for loading 4-bit quantized model
def load_quantized_model(model_name: str):
    """
    model_name: Name or path of the model to be loaded.
    return: Loaded quantized model.
    """
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,
        quantization_config=bnb_config,
    )
    return model
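
A rough back-of-the-envelope calculation shows why 4-bit loading matters on a Colab GPU. The sketch assumes a round figure of 7 billion parameters (taken from the "7b" in the model name) and counts weights only; it ignores quantization constants, activations and the KV cache, so real usage will be somewhat higher.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory needed for the weights alone, in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 7e9  # assumption: round figure for a "7b" model

print(weight_memory_gb(NUM_PARAMS, 16))  # 14.0 (bfloat16)
print(weight_memory_gb(NUM_PARAMS, 4))   # 3.5  (4-bit NF4)
```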

In [54]:
# initializing tokenizer
def initialize_tokenizer(model_name: str):
    """
    model_name: Name or path of the model for tokenizer initialization.
    return: Initialized tokenizer.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_name, return_token_type_ids=False)
    tokenizer.bos_token_id = 1  # Set beginning of sentence token id
    return tokenizer

In [55]:
tokenizer = initialize_tokenizer(model_name)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [56]:
# CUDA is not a pip package; the Colab GPU runtime provides it.
# Check that PyTorch can see the GPU before loading the quantized model.
assert torch.cuda.is_available(), "No GPU detected; switch to a GPU runtime."

In [57]:
model = load_quantized_model(model_name)

`low_cpu_mem_usage` was None, now default to True since model is quantized.

In [58]:
# Use a separate name so the imported `pipeline` factory is not shadowed
# (otherwise re-running this cell would fail).
generation_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    use_cache=True,
    device_map="auto",
    max_length=2048,
    do_sample=True,
    top_k=5,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
)

Device set to use cuda:0


In [59]:
llm = HuggingFacePipeline(pipeline=generation_pipeline)


In [60]:
from langchain.chains import RetrievalQA

In [61]:
normal_chain = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever = vectorstore_retreiver
)

In [62]:
hybrid_chain = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever = ensemble_retriever
)

In [63]:
response1 = normal_chain.invoke("What is Floating point numbers?")

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


In [64]:
response1

{'query': 'What is Floating point numbers?',
 'result': 'Use the following pieces of context to answer the question at the end. If you don\'t know the answer, just say that you don\'t know, don\'t try to make up an answer.\n\nFloating-Point Number\n\uf06e The floating point representation of most real numbers is \nonly approximate. For example, 1.25 is approximated by \n(011,101) representing 1.5 or by either (001, 000) or (001,\n\nConverting from Decimal to \nBinary Floating Point\n\uf06e What is the binary representation for the single-precision \nfloating point number that corresponds to X = -12.2510?\n\nQuestion: What is Floating point numbers?\nHelpful Answer: Floating point numbers are decimal or binary numbers that are used to represent real numbers in a computer. They are called floating point because the decimal point can "float" to different positions within the number, allowing for a larger range of values to be represented than with fixed point arithmetic. Floating point nu

In [68]:
response1 = hybrid_chain.invoke("What is Floating point numbers?")

In [70]:
print(response1.get("result"))

Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

Converting from Decimal to 
Binary Floating Point
 What is the binary representation for the single-precision 
floating point number that corresponds to X = -12.2510?

 What is the normalized binary representation for the 
number?
-12.2510 = -1100.012  = -1.100012 x 23
 What are the sign, stored exponent, and normalized 
mantissa?

Floating-Point Number
 The floating point representation of most real numbers is 
only approximate. For example, 1.25 is approximated by 
(011,101) representing 1.5 or by either (001, 000) or (001,

Question: What is Floating point numbers?
Helpful Answer: In computing, floating point number is a real number represented in floating point format for numerical calculation using computers. It is a number with a decimal point in a fixed position, and the number of digits to the right of the decima

In [66]:
response2 = hybrid_chain.invoke("What is BCD?")

In [67]:
print(response2.get("result"))

Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

 What is the normalized binary representation for the 
number?
-12.2510 = -1100.012  = -1.100012 x 23
 What are the sign, stored exponent, and normalized 
mantissa?

Converting from Decimal to 
Binary Floating Point
 What is the binary representation for the single-precision 
floating point number that corresponds to X = -12.2510?

Decimal Codes
BCD (Binary coded decimal)
 In BCD format each digit di of a decimal number is 
denoted by a 4-bit equivalent bi,3bi,2bi,1bi,0.
 BCD is a weighted (positional) number code where each

Decimal Codes
Excess-Three Code
 The excess-three code can be formed by adding 00112 to 
the corresponding BCD number.
 The advantage of the excess-three code is that it may be

Question: What is BCD?
Helpful Answer: BCD is a number code where each digit in a decimal number is denoted by a 4-bit 
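
In the outputs above, `result` contains the whole prompt (instructions, retrieved context and question) followed by the model's answer. A small helper can keep only the text after the final `Helpful Answer:` marker; the sketch below uses a hypothetical function name and a shortened sample string. Passing `return_full_text=False` when building the text-generation pipeline may achieve the same effect at the source.

```python
def extract_answer(result_text, marker="Helpful Answer:"):
    """Return only the text after the last occurrence of the marker."""
    _, found, answer = result_text.rpartition(marker)
    # If the marker is missing, fall back to the full text.
    return answer.strip() if found else result_text.strip()


# Shortened stand-in for response2["result"].
sample_result = (
    "Use the following pieces of context to answer the question at the end.\n\n"
    "Decimal Codes\nBCD (Binary coded decimal)\n\n"
    "Question: What is BCD?\n"
    "Helpful Answer: BCD is a number code."
)

print(extract_answer(sample_result))  # BCD is a number code.
```

With the chains from this notebook, the call would look like `extract_answer(response2["result"])`.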