In [4]:
# Install required packages
!pip install google-generativeai sentence-transformers faiss-cpu langchain-community langchain

import os
import getpass
import google.generativeai as genai
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Function to get Gemini API key from user
def get_gemini_api_key():
    print("🔐 Google Gemini API Key Setup")
    print("="*50)
    print("You need a Google AI API key to use Gemini.")
    print("Get your FREE key from: https://aistudio.google.com/app/apikey")
    print("="*50)

    api_key = getpass.getpass("Enter your Google AI API key (hidden input): ").strip()

    if not api_key or api_key == "":
        print("❌ No API key provided!")
        return None

    return api_key

# Create comprehensive sample data about India's freedom movement
freedom_movement_data = """
Mahatma Gandhi (1869-1948) was the preeminent leader of the Indian independence movement against British rule. He employed nonviolent civil disobedience to lead India to independence and inspired movements for civil rights across the world. Gandhi was born in Porbandar, Gujarat, and studied law in London before practicing in South Africa where he developed his philosophy of satyagraha (truth-force).

The Salt March, also known as the Dandi March, was a 24-day non-violent protest led by Mahatma Gandhi from March 12 to April 6, 1930. Gandhi walked 387 kilometers from Sabarmati Ashram to Dandi to protest the British salt monopoly. This march became a turning point in the Indian independence movement and attracted worldwide attention to India's struggle for freedom. The march defied the British Salt Act of 1882.

Jawaharlal Nehru (1889-1964) was the first Prime Minister of independent India, serving from 1947 until his death in 1964. He was a central figure in Indian politics before and after independence and is considered the architect of modern India. Nehru was educated at Harrow and Cambridge and became Gandhi's close associate in the independence movement. He was known for his vision of a secular, socialist India.

The Quit India Movement was launched by Gandhi on August 8, 1942, during World War II, demanding an end to British rule in India. The movement was also known as the August Movement and was a decisive campaign in the Indian independence movement. Gandhi gave the famous call "Do or Die" during this movement. The British responded by arresting Gandhi and other Congress leaders immediately.

Subhas Chandra Bose (1897-1945) was an Indian nationalist who defied British authority in India and became a prominent leader of the Indian independence movement. He formed the Indian National Army (Azad Hind Fauj) to fight against British rule. Bose believed in armed resistance and had ideological differences with Gandhi's non-violent approach. He sought help from Axis powers during World War II and established the Provisional Government of Free India.

Sardar Vallabhbhai Patel (1875-1950) was known as the Iron Man of India. He played a crucial role in the political integration of India after independence, helping to unite over 500 princely states into the Indian Union. Patel was also instrumental in organizing the Bardoli Satyagraha in 1928. He was India's first Deputy Prime Minister and is credited with creating the modern administrative system of India.

The Indian National Congress was founded in 1885 by Allan Octavian Hume and became the largest political party leading the Indian independence movement. It played a crucial role in the struggle for independence under leaders like Gandhi, Nehru, and Patel. The Congress initially sought reforms within the British system but later demanded complete independence or Purna Swaraj.

The partition of India in 1947 led to the creation of two independent nations: India and Pakistan. This resulted in one of the largest mass migrations in human history, with millions of people crossing borders. The partition was accompanied by widespread communal violence and displacement. Lord Mountbatten oversaw the partition process as the last Viceroy of India.

Bhagat Singh (1907-1931) was a revolutionary freedom fighter who became a folk hero of the Indian independence movement. He was executed by the British at the age of 23 for his revolutionary activities. Along with Rajguru and Sukhdev, he threw bombs in the Central Legislative Assembly to protest repressive laws. His famous quote "Inquilab Zindabad" (Long Live Revolution) became a rallying cry.

The Non-Cooperation Movement (1920-1922) was launched by Gandhi to resist British rule through non-violent means. It included boycotts of British goods, institutions, and services. The movement was suspended after the Chauri Chaura incident where protesters turned violent and killed 22 policemen. This movement marked Gandhi's emergence as a mass leader.

The Civil Disobedience Movement (1930-1934) was launched by Gandhi with the Salt March. It aimed to break British laws through non-violent resistance and civil disobedience. People were encouraged to break salt laws, boycott British goods, and refuse to pay taxes. The movement was suspended and resumed multiple times due to negotiations with the British.

Lala Lajpat Rai (1865-1928) was one of the three members of the Lal Bal Pal triumvirate. He was a prominent figure in the Indian independence movement and was known as Punjab Kesari (Lion of Punjab). He died from injuries sustained during a protest against the Simon Commission. His death sparked outrage and led to revolutionary activities by Bhagat Singh and others.

The Jallianwala Bagh massacre occurred on April 13, 1919, when British troops under General Reginald Dyer fired on a peaceful gathering in Amritsar, killing hundreds of Indians. General Dyer ordered the firing without warning on a crowd gathered for Baisakhi festival. This event intensified the independence movement and turned many moderate Indians against British rule.

Bal Gangadhar Tilak (1856-1920) was one of the first leaders of the Indian independence movement. He was known as Lokmanya Tilak and coined the slogan "Swaraj is my birthright and I shall have it." He promoted the celebration of Ganesh Chaturthi and Shivaji Jayanti to foster nationalist sentiment. He was imprisoned multiple times by the British for his nationalist activities.

The Swadeshi Movement was launched in 1905 as a response to the partition of Bengal by Lord Curzon. It promoted the use of Indian goods and boycott of British products. The movement helped in the growth of Indian industries and fostered economic nationalism. Leaders like Tilak and Lajpat Rai were prominent supporters of this movement.

Rani Lakshmibai (1828-1858) was the Queen of Jhansi and one of the leading figures of the Indian Rebellion of 1857. She fought valiantly against the British forces and became a symbol of resistance against British rule. She died fighting in the battle of Gwalior. Her resistance against the Doctrine of Lapse made her a legendary figure.

The Indian Rebellion of 1857, also known as the Sepoy Mutiny or the First War of Independence, was a major uprising against British rule. It began with the revolt of sepoys in Meerut on May 10, 1857, and spread across northern and central India. Though it failed, it marked the beginning of organized resistance against British rule and led to the end of East India Company rule.

Mangal Pandey (1827-1857) was a sepoy in the 34th Bengal Native Infantry who played a key role in the events leading up to the Indian Rebellion of 1857. He attacked British officers and was later court-martialed and executed. He is considered a hero and martyr of the Indian independence movement.

The Khilafat Movement (1919-1924) was a pan-Islamic political protest campaign launched by Muslims in British India to restore the caliph of the Ottoman Caliphate. Gandhi supported this movement to unite Hindus and Muslims against British rule. However, the movement eventually declined after the abolition of the Ottoman Caliphate by Turkey itself.
"""

# Save the data to a text file
with open('freedom_data.txt', 'w', encoding='utf-8') as f:
    f.write(freedom_movement_data)

print("📚 Created comprehensive freedom_data.txt!")
print(f"📊 Data contains {len(freedom_movement_data)} characters")

class GeminiRAGChatbot:
    def __init__(self, api_key):
        self.api_key = api_key
        self.documents = []
        self.embeddings = []
        self.index = None
        self.setup_successful = False
        self.setup_rag()

    def setup_rag(self):
        try:
            print("\n🔄 Setting up Gemini RAG system...")
            print("="*50)

            # Configure Gemini
            genai.configure(api_key=self.api_key)

            # Test API key
            print("🔑 Testing Gemini API key...")
            model = genai.GenerativeModel('gemini-2.0-flash-exp')
            test_response = model.generate_content("Hello")
            print("✅ API key working!")

            # Load documents
            loader = TextLoader("freedom_data.txt", encoding='utf-8')
            docs = loader.load()
            print(f"📖 Loaded document with {len(docs[0].page_content)} characters")

            # Split text into chunks
            text_splitter = CharacterTextSplitter(
                chunk_size=400,
                chunk_overlap=50,
                separator="\n\n"
            )
            split_docs = text_splitter.split_documents(docs)
            print(f"📝 Split into {len(split_docs)} chunks")

            # Extract text from documents
            self.documents = [doc.page_content for doc in split_docs]

            # Create embeddings using SentenceTransformer (free)
            print("🔗 Creating embeddings using SentenceTransformer...")
            embedding_model = SentenceTransformer('all-MiniLM-L6-v2')
            self.embeddings = embedding_model.encode(self.documents)

            # Create FAISS index
            print("🏗️ Building FAISS index...")
            dimension = self.embeddings.shape[1]
            self.index = faiss.IndexFlatIP(dimension)  # Inner product for similarity

            # Normalize embeddings for cosine similarity
            faiss.normalize_L2(self.embeddings)
            self.index.add(self.embeddings.astype('float32'))

            # Store embedding model for query encoding
            self.embedding_model = embedding_model

            print("✅ Gemini RAG system setup completed successfully!")
            self.setup_successful = True

        except Exception as e:
            print(f"❌ Error setting up RAG: {str(e)}")
            if "API_KEY" in str(e).upper() or "401" in str(e):
                print("\n💡 This suggests your API key might be invalid")
                print("   • Get a free key at: https://aistudio.google.com/app/apikey")
                print("   • Make sure you copied the key correctly")
            self.setup_successful = False

    def retrieve_relevant_docs(self, query, k=3):
        """Retrieve most relevant documents for the query"""
        try:
            # Encode query
            query_embedding = self.embedding_model.encode([query])
            faiss.normalize_L2(query_embedding)

            # Search for similar documents
            scores, indices = self.index.search(query_embedding.astype('float32'), k)

            # Get relevant documents
            relevant_docs = []
            for i, idx in enumerate(indices[0]):
                if idx < len(self.documents):
                    relevant_docs.append({
                        'content': self.documents[idx],
                        'score': scores[0][i]
                    })

            return relevant_docs
        except Exception as e:
            print(f"Error in retrieval: {e}")
            return []

    def ask_question(self, question):
        if not self.setup_successful:
            return "❌ RAG system not initialized properly. Please check your API key and try again."

        try:
            print(f"\n🔍 Processing: {question}")
            print("-" * 60)

            # Retrieve relevant documents
            relevant_docs = self.retrieve_relevant_docs(question, k=3)

            if not relevant_docs:
                print("❌ No relevant documents found")
                return "Sorry, I couldn't find relevant information to answer your question."

            # Create context from relevant documents
            context = "\n\n".join([doc['content'] for doc in relevant_docs])

            # Create prompt for Gemini
            prompt = f"""You are an expert on India's freedom movement. Based on the following context, please answer the question accurately and comprehensively.

Context:
{context}

Question: {question}

Please provide a detailed answer based on the context above. If the context doesn't contain enough information to fully answer the question, mention that and provide what information is available."""

            # Generate response using Gemini
            model = genai.GenerativeModel('gemini-2.0-flash-exp')
            response = model.generate_content(prompt)
            answer = response.text

            print(f"📜 Answer:")
            print(f"{answer}")

            # Show sources
            print(f"\n📚 Sources used ({len(relevant_docs)} found):")
            for i, doc in enumerate(relevant_docs[:2]):
                print(f"\n--- Source {i+1} (Relevance: {doc['score']:.3f}) ---")
                content = doc['content'].strip()
                print(f"{content[:300]}{'...' if len(content) > 300 else ''}")

            return answer

        except Exception as e:
            error_msg = f"❌ Error processing question: {str(e)}"
            print(error_msg)
            return error_msg

# Main setup function
def setup_gemini_chatbot():
    print("🇮🇳 India's Freedom Movement RAG Chatbot with Gemini 2.0 Flash")
    print("="*70)

    # Get API key from user
    api_key = get_gemini_api_key()

    if not api_key:
        print("❌ Cannot proceed without API key")
        return None

    # Initialize RAG system
    rag_bot = GeminiRAGChatbot(api_key)

    if not rag_bot.setup_successful:
        print("\n❌ Setup failed. Please check your API key and try again.")
        return None

    return rag_bot

# Interactive question function
def ask_questions_interactively(rag_bot):
    if not rag_bot:
        print("❌ Chatbot not initialized")
        return

    print("\n🎯 INTERACTIVE MODE")
    print("="*60)
    print("Ask questions about India's freedom movement!")
    print("Topics: Gandhi, Nehru, Salt March, Quit India, Partition, etc.")
    print("Type 'quit', 'exit', or 'q' to stop")
    print("="*60)

    question_count = 0

    while True:
        try:
            question = input(f"\n❓ Question #{question_count + 1}: ").strip()

            if question.lower() in ['quit', 'exit', 'q']:
                print(f"\n👋 Thank you! You asked {question_count} questions.")
                break

            if not question:
                print("Please enter a question or type 'quit' to exit.")
                continue

            rag_bot.ask_question(question)
            question_count += 1
            print("\n" + "="*60)

        except KeyboardInterrupt:
            print(f"\n\n👋 Goodbye! You asked {question_count} questions.")
            break

# Demo questions function
def run_demo_questions(rag_bot):
    if not rag_bot:
        return

    demo_questions = [
        "Who was Mahatma Gandhi and what was his philosophy of satyagraha?",
        "Tell me about the Salt March and its significance in the freedom movement",
        "What was the Quit India Movement and when was it launched?",
        "How was Subhas Chandra Bose different from Gandhi in his approach?",
        "What happened during the Jallianwala Bagh massacre?"
    ]

    print("\n🎬 DEMO MODE - Sample Questions")
    print("="*60)

    for i, question in enumerate(demo_questions, 1):
        print(f"\n🎯 DEMO QUESTION {i}/{len(demo_questions)}:")
        rag_bot.ask_question(question)
        print("\n" + "="*60)

        if i < len(demo_questions):
            input("Press Enter to continue to next demo question...")

# Main execution
print("🚀 Starting Gemini-powered RAG Chatbot Setup...")
print("🆓 Using FREE Google Gemini 2.0 Flash model!")
print("="*70)

# Setup the chatbot
chatbot = setup_gemini_chatbot()

if chatbot:
    print("\n🎉 Setup successful! Choose what you'd like to do:")
    print("="*60)
    print("1. Run demo questions (recommended for first time)")
    print("2. Start interactive mode")
    print("3. Ask a specific question")

    while True:
        try:
            choice = input("\nEnter your choice (1/2/3) or 'quit': ").strip()

            if choice.lower() in ['quit', 'q', 'exit']:
                print("👋 Goodbye!")
                break
            elif choice == '1':
                run_demo_questions(chatbot)
            elif choice == '2':
                ask_questions_interactively(chatbot)
            elif choice == '3':
                question = input("Enter your question: ").strip()
                if question:
                    chatbot.ask_question(question)
            else:
                print("Please enter 1, 2, 3, or 'quit'")

        except KeyboardInterrupt:
            print("\n👋 Goodbye!")
            break
else:
    print("\n❌ Chatbot setup failed. Please try again with a valid API key.")
    print("💡 Get your FREE Google AI API key at: https://aistudio.google.com/app/apikey")

📚 Created comprehensive freedom_data.txt!
📊 Data contains 7190 characters
🚀 Starting Gemini-powered RAG Chatbot Setup...
🆓 Using FREE Google Gemini 2.0 Flash model!
🇮🇳 India's Freedom Movement RAG Chatbot with Gemini 2.0 Flash
🔐 Google Gemini API Key Setup
You need a Google AI API key to use Gemini.
Get your FREE key from: https://aistudio.google.com/app/apikey
Enter your Google AI API key (hidden input): ··········

🔄 Setting up Gemini RAG system...
🔑 Testing Gemini API key...




✅ API key working!
📖 Loaded document with 7190 characters
📝 Split into 19 chunks
🔗 Creating embeddings using SentenceTransformer...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


modules.json:   0%|          | 0.00/349 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

README.md:   0%|          | 0.00/10.5k [00:00<?, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/612 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/90.9M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/350 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/466k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/190 [00:00<?, ?B/s]

🏗️ Building FAISS index...
✅ Gemini RAG system setup completed successfully!

🎉 Setup successful! Choose what you'd like to do:
1. Run demo questions (recommended for first time)
2. Start interactive mode
3. Ask a specific question

Enter your choice (1/2/3) or 'quit': 1

🎬 DEMO MODE - Sample Questions

🎯 DEMO QUESTION 1/5:

🔍 Processing: Who was Mahatma Gandhi and what was his philosophy of satyagraha?
------------------------------------------------------------
📜 Answer:
Based on the provided context:

Mahatma Gandhi (1869-1948) was the preeminent leader of the Indian independence movement against British rule. He was born in Porbandar, Gujarat, and studied law in London before practicing in South Africa.

His philosophy of satyagraha (truth-force) was developed during his time in South Africa. The context does not elaborate on the specifics of satyagraha, but it mentions that Gandhi employed "nonviolent civil disobedience" to lead India to independence, and that this was inspired by