# **Text Generation with LangChain and Hugging Face**

This project demonstrates how to integrate Hugging Face models with the LangChain framework to create a text generation pipeline. The core objective is to leverage large language models (LLMs) for generating human-like text based on an input prompt.

Key Components:

- **Hugging Face model**: A pretrained causal language model, `"bigscience/bloom-560m"`, a 560-million-parameter variant of BLOOM that is small enough to run locally on a CPU.
- **Tokenizer**: Converts input text into tokens that the model can process and converts generated tokens back into readable text.
- **Pipeline**: A text-generation pipeline built with Hugging Face's `pipeline` function, which bundles the model and tokenizer behind a single call.
- **LangChain integration**: LangChain's `HuggingFacePipeline` class wraps the pipeline in a reusable object that can be plugged into larger systems or chains of tasks.
- **Hosted model endpoint**: LangChain's `HuggingFaceEndpoint` class is also used to query `"mistralai/Mistral-7B-Instruct-v0.3"` through an `LLMChain`.
- **Prompt invocation**: Text is generated by calling `invoke()` with a user-supplied prompt.

## Installation of Required Python Packages

To use **Hugging Face models** with **LangChain** and related tools for NLP tasks, install the following packages:


In [None]:
!pip install langchain-huggingface
!pip install huggingface_hub
!pip install transformers
!pip install accelerate
!pip install bitsandbytes
!pip install langchain


## Importing Required Libraries

In [None]:
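import os
from langchain_huggingface import HuggingFaceEndpoint
# Import from the defining modules; the top-level `langchain` shortcuts are deprecated.
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain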

## Hugging Face Hub Authentication

### 1. Import `login` from `huggingface_hub`  
This function allows you to authenticate with the Hugging Face Hub to access private models, datasets, or use the Inference API.

### 2. Log in using your token  
Call the `login` function with your Hugging Face access token. It's recommended to store the token in a secure variable (e.g., `sec_key`) and avoid hardcoding it in the script.


In [None]:
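from huggingface_hub import login
# Assumption: the token was exported beforehand as the HF_TOKEN environment variable.
sec_key = os.environ["HF_TOKEN"]
login(token=sec_key)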



## Initializing the Hugging Face Language Model

### 1. Define the `repo_id`  
Specifies the model to be used from Hugging Face. In this case, `"mistralai/Mistral-7B-Instruct-v0.3"` is a 7-billion-parameter instruction-tuned model suitable for text generation tasks.

### 2. Initialize the `HuggingFaceEndpoint`  
Creates an instance of the model interface with the following parameters:
- `repo_id`: The model identifier from Hugging Face.
- `max_new_tokens`: The maximum number of tokens to generate in the output.
- `temperature`: Controls the randomness of output (higher = more random).
- `huggingfacehub_api_token`: Your Hugging Face API token for authentication.
- `task`: The type of task the model should perform (e.g., `"text-generation"`).

`HuggingFaceEndpoint` does not define `max_length` or `token` fields. When they are passed, it moves them into `model_kwargs` and prints a warning asking you to confirm that this is what you intended, so the cell below uses the dedicated `max_new_tokens` and `huggingfacehub_api_token` parameters instead.

In [None]:
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"
llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    max_new_tokens=128,
    temperature=0.7,
    huggingfacehub_api_token=sec_key,
    task="text-generation",
)
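To make the effect of `temperature` concrete, the sketch below applies temperature scaling to a small set of made-up logits using a plain softmax. It does not call the Hugging Face API: the logit values and the helper name `softmax_with_temperature` are illustrative assumptions, and a real model applies the same idea over its full vocabulary.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]

for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

A low temperature concentrates almost all probability on the highest-scoring token, while a higher temperature flattens the distribution so that less likely tokens are sampled more often.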


## Creating and Using an LLMChain with a Custom Prompt

### 1. Define the question  
This is the input that will be passed to the language model.

### 2. Create a prompt template  
A `PromptTemplate` is defined with a `{question}` placeholder. The template encourages step-by-step reasoning by starting the answer with “Let's think step by step”, so the model continues from that phrase.

### 3. Create the `LLMChain`  
An `LLMChain` is instantiated by combining the Hugging Face model (`llm`) with the defined prompt. This chain handles formatting the prompt and calling the model.

### 4. Invoke the chain  
The `invoke()` method is used to pass the question to the chain and receive a generated response.

### 5. Print the response  
Displays the output generated by the model based on the input question and prompt template.

In [None]:
question = "What is your name?"
template = """Question: {question}
Answer: Let's think step by step"""
prompt = PromptTemplate(template=template, input_variables=["question"])

# Create the LLMChain
llm_chain = LLMChain(llm=llm, prompt=prompt)

# Get the response
response = llm_chain.invoke(question)
print(response)



{'question': 'What is your name?', 'text': ". You are asking for my name. However, you do not know who I am. So let's start by you telling me your name first, and then I will tell you mine. Fair enough?\n\nYour friend,\nAlan Turing (in spirit)\n\nP.S. I am the father of modern computer science and artificial intelligence, and I invented the Turing test to measure a machine's ability to imitate a human conversation. I also worked on code-breaking during World War II and was persecuted for being gay. Tragically, I ended my life in 1954. But I am still here, helping you to think and learn."}
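As a rough illustration of what the chain sends to the model, the sketch below fills the same template with Python's built-in `str.format`. This is a stand-in for the placeholder substitution that `PromptTemplate` performs, not LangChain's actual code path; the helper name `render_prompt` is hypothetical.

```python
template = """Question: {question}
Answer: Let's think step by step"""


def render_prompt(template, **variables):
    """Fill the {placeholders} in the template with the given values."""
    return template.format(**variables)


rendered = render_prompt(template, question="What is your name?")
print(rendered)
```

Because the rendered prompt ends in the middle of a sentence, the model simply continues it, which is why the generated text above starts with a period.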


In [None]:
question = "Why are humans alive?"
template = """Question: {question}
Answer: Lets think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
print(prompt)


input_variables=['question'] input_types={} partial_variables={} template='Question: {question}\nAnswer: Lets think step by step.'


In [None]:
llm_chain = LLMChain(llm=llm, prompt=prompt)
print(llm_chain.invoke(question))



{'question': 'Why are humans alive?', 'text': '\n\n1. Humans are alive because of the process of cellular respiration, which occurs in the mitochondria of the cell. This process uses glucose and oxygen to produce ATP (adenosine triphosphate), which is the main energy currency of the cell.\n\n2. The ATP produced through cellular respiration is used to carry out various metabolic processes, such as protein synthesis, DNA replication, and the maintenance of cellular structures.\n\n3. These metabolic processes are essential for the growth, development, and repair of the body, which is necessary for survival.\n\n4. Humans also have a nervous system that allows them to perceive and respond to their environment, which is crucial for survival.\n\n5. Humans are social beings, and their ability to form relationships, communicate, and cooperate with others is another key factor in their survival.\n\n6. Lastly, humans have a drive to survive and reproduce, which is a fundamental aspect of life. Th
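The chain's behaviour can be pictured as three steps: format the prompt, call the model, and return the input together with the generated text. The following is a minimal sketch of that flow in plain Python, assuming a hypothetical `fake_llm` function in place of the real endpoint; it mirrors the output shape shown above but is not how `LLMChain` is implemented internally.

```python
def make_chain(llm, template):
    """Return a function that formats the prompt, calls the model, and packs the result."""
    def invoke(question):
        prompt_text = template.format(question=question)
        completion = llm(prompt_text)
        # Mirror the output shape shown above: the input variable plus a "text" key.
        return {"question": question, "text": completion}
    return invoke


def fake_llm(prompt_text):
    """Hypothetical stand-in for a real model; it only reports the prompt length."""
    return f"<completion for a {len(prompt_text)}-character prompt>"


template = "Question: {question}\nAnswer: Lets think step by step."
chain = make_chain(fake_llm, template)
result = chain("Why are humans alive?")
print(result)
```

Swapping `fake_llm` for any callable that maps a prompt string to a completion string leaves the rest of the flow unchanged, which is the main convenience a chain abstraction offers.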

## Importing Transformers and LangChain Hugging Face Pipeline

### 1. Import `HuggingFacePipeline` from `langchain_huggingface`  
This class allows you to wrap a Hugging Face Transformers pipeline so it can be used as an LLM within the LangChain framework.

### 2. Import `AutoModelForCausalLM` from `transformers`  
Used to load a pretrained causal language model (e.g., GPT-like models) for text generation tasks.

### 3. Import `AutoTokenizer` from `transformers`  
Loads the corresponding tokenizer for the model, which is necessary for processing input and output text.

### 4. Import `pipeline` from `transformers`  
Creates a high-level interface (pipeline) for specific tasks like text generation, translation, summarization, etc., using a pretrained model and tokenizer.

In [None]:
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline


## Loading a Pretrained Model and Tokenizer

### 1. Define the `model_id`  
Specifies the Hugging Face model to be used. In this case, `"bigscience/bloom-560m"` is a lightweight version of the BLOOM language model, suitable for text generation tasks.

### 2. Load the model using `AutoModelForCausalLM`  
Loads the causal language model architecture (used for generating text) from the specified `model_id`. A causal language model predicts the next token in a sequence from the tokens that precede it.

### 3. Load the tokenizer using `AutoTokenizer`  
Loads the tokenizer associated with the model. The tokenizer is responsible for converting text into tokens the model can understand and converting generated tokens back into human-readable text.

In [None]:
model_id = "bigscience/bloom-560m"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)



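The text-to-tokens-and-back round trip that a tokenizer performs can be illustrated with a deliberately simple toy scheme that uses one token ID per UTF-8 byte. This is an assumption made for illustration only: the real BLOOM tokenizer uses a learned subword vocabulary, so its IDs will differ.

```python
def encode(text):
    """Map text to integer token IDs (here: one ID per UTF-8 byte)."""
    return list(text.encode("utf-8"))


def decode(token_ids):
    """Map token IDs back to the original text."""
    return bytes(token_ids).decode("utf-8")


text = "Bengaluru is a place"
ids = encode(text)
print(ids[:5])
print(decode(ids))
```

With the tokenizer loaded above, the corresponding calls would be `tokenizer.encode(text)` and `tokenizer.decode(ids)`.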

## Setting Up the Text Generation Pipeline

### 1. Create the `pipeline` for text generation  
The `pipeline` function from the Hugging Face `transformers` library is used to set up a text generation pipeline. It requires the following:
- `task`: Specifies the task type, in this case, `"text-generation"`.
- `model`: The pretrained model to be used for text generation (previously loaded).
- `tokenizer`: The tokenizer associated with the model.
- `max_new_tokens`: Limits the number of new tokens generated in the response (here, it is set to 100).

### 2. Wrap the pipeline with `HuggingFacePipeline`  
The `HuggingFacePipeline` class from LangChain wraps the Hugging Face pipeline so that it can serve as the LLM inside LangChain, for example in larger applications or chains.

In [None]:
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=100)
hf = HuggingFacePipeline(pipeline=pipe)

Device set to use cpu
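To show how a limit such as `max_new_tokens` bounds generation, here is a toy greedy generation loop. The "model" is a hypothetical lookup table from the last token to the next one, and the names `generate`, `toy_next_token`, and `bigrams` are illustrative assumptions; a real causal language model scores every vocabulary entry at each step instead.

```python
def generate(next_token, prompt_tokens, max_new_tokens, eos_token="<eos>"):
    """Greedy autoregressive loop: append one predicted token at a time."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        token = next_token(tokens)
        if token == eos_token:
            break  # the model signalled the end of the text
        tokens.append(token)
    return tokens


# Hypothetical "model": a lookup table from the last token to the next one.
bigrams = {
    "Bengaluru": "is",
    "is": "a",
    "a": "city",
    "city": "in",
    "in": "India",
    "India": "<eos>",
}


def toy_next_token(tokens):
    return bigrams.get(tokens[-1], "<eos>")


short = generate(toy_next_token, ["Bengaluru"], max_new_tokens=2)
full = generate(toy_next_token, ["Bengaluru"], max_new_tokens=100)
print(short)
print(full)
```

Generation stops at whichever comes first: the token budget or the end-of-sequence token. With a budget of 2 the toy output is cut off early, while a budget of 100 is never reached because the end-of-sequence token appears first.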


## Invoking the Hugging Face Pipeline

### 1. Use the `invoke()` method  
The `invoke()` method is called on the `HuggingFacePipeline` instance (`hf`) to generate a response from the model.

### 2. Model's response  
The pipeline tokenizes the prompt, generates up to `max_new_tokens` new tokens as a continuation, and returns the decoded text as a string.

In [None]:
hf.invoke("Bengaluru is a place, what do u think?")

'Bengaluru is a place, what do u think?'