### 📘 LangChain for Beginners - Add Memory to Your AI - Part 2

**Goal**: Give your chatbot conversation memory so it can refer back to what you told it earlier in the chat.

✅ No API keys

✅ Uses Hugging Face + LangChain

✅ Runs on free Colab GPU

🧠 Let's make your AI remember — cleanly and clearly!

In [1]:
# Install required libraries
!pip install -q langchain langchain-huggingface transformers torch accelerate bitsandbytes


In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from langchain_huggingface import HuggingFacePipeline

# Load model
model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    dtype="auto",               # recent transformers releases deprecate the older `torch_dtype` name
    trust_remote_code=True
)

# Create pipeline with clean generation settings
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=64,          # Keep responses short
    temperature=0.7,
    top_k=50,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

# Wrap with LangChain
llm = HuggingFacePipeline(pipeline=pipe)

print("✅ Model loaded!")



### 🧩 Add Memory: Clean Conversation History

In [None]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

# Simple prompt — no special tokens in template
template = """Past conversation:
{history}

Human: {input}
AI:"""

prompt = PromptTemplate(
    input_variables=["history", "input"],
    template=template
)

# Memory stores only clean input/output pairs
memory = ConversationBufferMemory(
    input_key="input",
    memory_key="history",
    return_messages=False  # Return history as one plain string for the {history} slot
)

# Create chain
conversation = LLMChain(
    llm=llm,
    prompt=prompt,
    memory=memory
)

### 💬 Clean Chat Demo: Watch the AI Remember

In [None]:
def chat(message):
    response = conversation.invoke({"input": message})
    return response['text'].strip()

print("🗨️  CHAT DEMO: AI with Memory\n")
print("This AI will remember your name and hobbies.\nWatch how it answers based on what you told it before.\n")
print("─" * 50)

# Message 1
user_msg = "My name is Alex."
print(f"🧑‍💻 You: {user_msg}")
ai_response = chat(user_msg)
print(f"🤖 AI: {ai_response}\n")

# Message 2
user_msg = "What's my name?"
print(f"🧑‍💻 You: {user_msg}")
ai_response = chat(user_msg)
print(f"🤖 AI: {ai_response}\n")

# Message 3
user_msg = "I like hiking and guitar."
print(f"🧑‍💻 You: {user_msg}")
ai_response = chat(user_msg)
print(f"🤖 AI: {ai_response}\n")

# Message 4
user_msg = "What hobbies did I mention?"
print(f"🧑‍💻 You: {user_msg}")
ai_response = chat(user_msg)
print(f"🤖 AI: {ai_response}")

print("\n" + "─" * 50)
print("✅ Demo complete. If replies 2 and 4 mention Alex, hiking, and guitar, the memory is working.")
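
A printed message cannot confirm recall by itself, so it can help to check replies in code. A small model may also keep writing past its own turn and invent further `Human:` lines. The sketch below shows two helpers that could tidy and check a reply. `trim_reply` and `mentions` are hypothetical names, not LangChain functions, and the sample string is made up for illustration.

```python
def trim_reply(raw_text, stop_markers=("\nHuman:", "\nAI:")):
    """Keep only the text before the first follow-on speaker marker."""
    cut = len(raw_text)
    for marker in stop_markers:
        position = raw_text.find(marker)
        if position != -1:
            cut = min(cut, position)
    return raw_text[:cut].strip()


def mentions(reply, expected_terms):
    """True when every expected term appears in the reply (case-insensitive)."""
    lowered = reply.lower()
    return all(term.lower() in lowered for term in expected_terms)


# Made-up completion that runs on into an invented extra turn
sample_raw = " Your name is Alex.\nHuman: Thanks!\nAI: You're welcome."
sample_reply = trim_reply(sample_raw)
print(sample_reply)
print(mentions(sample_reply, ["Alex"]))
```

In the demo, one option would be to wrap the value returned by `chat(...)` with `trim_reply` and call `mentions(ai_response, ["hiking", "guitar"])` after message 4 to see whether the hobbies were actually recalled.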

### 🎉 Summary

In this notebook, you:

✅ Added **memory** to your AI

✅ Used a **clean prompt template** (no token leakage)

✅ Built a chatbot that remembers names and hobbies

💡 Kept responses **short** by capping generation with `max_new_tokens=64`

➡️ **Next: Ask PDFs questions — teach your AI to read documents!**

### 🔗 Resources

- [LangChain Memory Docs](https://python.langchain.com/docs/modules/memory/)
- [TinyLlama on Hugging Face](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
- [Author: Doug Ortiz](https://www.linkedin.com/in/doug-ortiz-illustris/)
- [YouTube Channel: @techbits-do](https://www.youtube.com/@techbits-do)
- [Illustris.org](https://www.illustris.org)