In [2]:
# Install LangChain with HuggingFace integration
!pip install langchain-huggingface

# Install Hugging Face hub for accessing models
!pip install huggingface_hub

# Install Transformers library for NLP tasks
!pip install transformers

# Install Accelerate for distributed training and optimization
!pip install accelerate

# Install BitsAndBytes (likely referred to as bitsample)
!pip install bitsandbytes

# Install the core LangChain library
!pip install langchain


Collecting bitsandbytes
  Downloading bitsandbytes-0.45.0-py3-none-manylinux_2_24_x86_64.whl.metadata (2.9 kB)
Downloading bitsandbytes-0.45.0-py3-none-manylinux_2_24_x86_64.whl (69.1 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m69.1/69.1 MB[0m [31m7.4 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: bitsandbytes
Successfully installed bitsandbytes-0.45.0


In [None]:
#Environment Secret key
from google.colab import userdata
sec_key = userdata.get("HF_TOKEN")
print(sec_key)



# ACCESS HUGGINGFACE MODEL WITH API
There are also two ways to use this class. We can specify the model with the repo_id parameter. Those endpoints use the serviceless API, Which is particularly beneficial to the people using pro account to enterprise hub. Still, regular Users can already have access to a fair number of request by connecting with the HF Token in the environment where they are executing the code.

In [4]:
from langchain_huggingface import HuggingFaceEndpoint


In [5]:
from google.colab import userdata
sec_key = userdata.get("HUGGINGFACEHUB")
print(sec_key)


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


In [6]:
import os
os.environ["HUGGINGFACEHUB"]= sec_key

In [14]:
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"

# Use model_kwargs for parameters like max_length and temperature
llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    temperature=0.8,  # Explicitly pass temperature
    max_length=128,   # Explicitly pass max_length
    token=sec_key     # Pass the token directly
)

                    max_length was transferred to model_kwargs.
                    Please make sure that max_length is what you intended.
                    token was transferred to model_kwargs.
                    Please make sure that token is what you intended.


In [18]:
llm.invoke("what is machine learning")

' ?\n\nMachine Learning (ML) is a subfield of artificial intelligence that focuses on developing algorithms and statistical models that enable computers to perform tasks without explicit programming. It involves training a computer system to learn patterns and make predictions or decisions based on data, rather than being explicitly programmed to follow a set of rules. Machine learning algorithms use various techniques, such as supervised learning, unsupervised learning, and reinforcement learning, to identify and make use of patterns in data. The goal of machine learning is to create systems that can automatically learn and improve from experience, allowing them to perform tasks more effectively and efficiently than traditional programming methods.\n\nExamples of machine learning applications include image and speech recognition, natural language processing, recommendation systems, and predictive modeling. Machine learning is used in a wide range of industries, including healthcare, f

In [27]:
from langchain import PromptTemplate,LLMChain
question = "How to detect radiation chemicals in Human Body?"
template = """Question : {question}
Answer: Just a sec """
prompt = PromptTemplate(template = template, input_variables = ["question"])
print(prompt)

input_variables=['question'] input_types={} partial_variables={} template='Question : {question}\nAnswer: Just a sec '


In [29]:
llm_chain = LLMChain(llm=llm,prompt = prompt)
print(llm_chain.run(question))

😜

Radiation detection in the human body is a complex process and requires specialized equipment. Here's a simplified explanation:

1. **Gamma and X-ray Radiation**: These types of radiation can be detected using a Geiger-Muller (GM) counter or a scintillation counter. However, these devices cannot provide a detailed image of the body or locate the radiation source within the body. They can only detect the presence of radiation.

2. **Radioactive Isotopes**: To detect specific radioactive isotopes (like Iodine-131 or Cesium-137), more advanced equipment like a gamma camera or a positron emission tomography (PET) scanner is used. These devices can produce detailed images of the body and can often identify the location of the radioactive material.

3. **Internal Radiation Sources**: In some cases, a procedure called a whole-body counter may be used. This involves the person lying on a bed that passes a low-energy electron beam through their body. The scanner can then detect the radiation

# PIPELINE HUGGINGFACE
Among Transformers, the Pipeline is the most varsatile tool in the huggingface toolbox.Langchain being designed primarily to address RAG and Agent use cases the scope of the pipeline here is reduced to the following text-centric tasks:"text generation","text2text-generation","summarization","translation".Models can be loaded directly with the form_model_id method


In [37]:
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline


In [45]:
model_id = "openai-community/gpt2"  # Specify the model
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

In [49]:
pipe = pipeline("text-generation",
                model = model,
                tokenizer = tokenizer,
                max_new_tokens = 128)
hf = HuggingFacePipeline(pipeline = pipe)

Device set to use cpu


In [50]:
hf

HuggingFacePipeline(pipeline=<transformers.pipelines.text_generation.TextGenerationPipeline object at 0x78d461997fd0>, model_id='openai-community/gpt2')

In [52]:
hf.invoke("WHAT IS IODINE-131")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


'WHAT IS IODINE-131?: As always I am a little bit confused where to do so, as I\'ve only ever really read the book with the help of friends. How on earth I got to where I am, what do I want, etc. But I believe that it\'s so much more important to know what is and what isn\'t healthy than this whole "honey-soaked shit" thing. I see why this is a difficult book to read in the first place, and feel that it is the most important story to read and write, and I certainly hope you find it entertaining, not a mindless "I wish this made science fiction" type'

# My Own ChatBot

In [15]:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Initialize model and tokenizer for text generation
model_id = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Create text generation pipeline
text_generation_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=150,  # Adjust response length
    temperature=0.7,  # Control randomness
    do_sample=True  # Enable sampling for more creative output
)

# Q&A Chatbot function to handle questions and answers
def chatbot():
    print("Chatbot: Hello! Ask me anything.")

    while True:
        user_input = input("You: ")  # Get user input

        if user_input.lower() in ['exit', 'quit', 'bye']:
            print("Chatbot: Goodbye! Have a great day!")
            break

        # Generate response using the model
        prompt = f"Provide a detailed answer to this question: {user_input}"
        response = text_generation_pipeline(prompt)[0]['generated_text']

        # Extract the response (remove the prompt from the beginning)
        bot_response = response[len(prompt):].strip()

        # Print the answer
        print(f"Chatbot: {bot_response}")

# Start the chatbot
chatbot()


config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/548M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/124 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/1.04M [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

Device set to use cpu


Chatbot: Hello! Ask me anything.
You: landsize of India


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Chatbot: .

India's land use has been shifting since the late 19th century. Between 1872 and 1960 it was estimated that India had about 1.5 million hectares of land that was now mostly used for farming.

Indians now own more than 90 percent of the world's land, and that number is expected to grow to more than 90 percent by 2030, according to the World Bank's 2014 report.

The land that they own has been changing over the last 50 years, but the vast majority of it has been used for personal use. By 2050, the projected land area of India will be one third of that of China, with a projected number of more than 1.5 million hectares.

"These are huge
You: what is radiation poisons


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Chatbot: ?

If there are radioactive poisons in your body, it's important to know what level of radiation you're exposed to. Radiation poisons are chemicals that damage the body's immune system, which in turn destroys the cell membranes. These poisons are present in the body, and can be harmful to your health, your body's ability to fight off the cancer, and the organs that produce cancer cells.

How do you protect yourself against radiation poisons?

We've created a list of radiation poisons that are not only found in our body, but also in the environment as well. We've listed them in alphabetical order.

What is the most common type of radiation poisoning?

The most common type of radiation poisoning is when
You: bye
Chatbot: Goodbye! Have a great day!
