# Export @src as a chatbot API (no UI yet)

To expose your existing chatbot code as an API without a UI, you can use a framework like Flask. The changes below create a simple API that accepts requests and returns the chatbot's responses.

### Changes Overview
1. **Create a Flask app** to handle incoming requests.
2. **Define an endpoint** to receive user questions and return chatbot responses.
3. **Remove Streamlit-specific code** since we are not using a UI.

### Updated Code
Here’s how you can modify your `src/server.py` file:

```python:src/server.py
from flask import Flask, request, jsonify  # Import Flask
from rag_pipeline.back import LLMHandler, VectorDatabase, QuestionAnsweringChain
from dotenv import load_dotenv
import os

# Initialize Flask app
app = Flask(__name__)

# Parameters
load_dotenv()
gemini_key = os.getenv('gemini_key')
qdrant_key = os.getenv('qdrant_key')
rerank = True
rewrite = True
num_docs = 5

# Initialize components
vector_db = VectorDatabase(
    model_name="hiieu/halong_embedding",
    collection_name='cmc_final_db',
    api=qdrant_key
)
llm_handler = LLMHandler(model_name="gemini-1.5-flash", gemini_key=gemini_key)
qa_chain = QuestionAnsweringChain(
    llm_handler=llm_handler,
    vector_db=vector_db,
    num_docs=num_docs,
    apply_rerank=rerank,
    apply_rewrite=rewrite,
    date_impact=0.001
)

@app.route('/chat', methods=['POST'])  # Define the chat endpoint
def chat():
    # silent=True returns None instead of raising on a missing or non-JSON body
    data = request.get_json(silent=True) or {}
    question = data.get('question')
    
    if not question:
        return jsonify({"error": "No question provided"}), 400

    # Get AI response
    response, extracted_links = qa_chain.run(question)

    return jsonify({
        "response": response,
        "links": extracted_links
    })

if __name__ == '__main__':
    app.run(debug=True)  # Run the Flask app
```

### Explanation of Changes
- **Flask Setup**: The code imports Flask and initializes a Flask app.
- **Chat Endpoint**: A new endpoint `/chat` is created to handle POST requests. It expects a JSON body with a `question` field.
- **Response Handling**: The chatbot's response and any extracted links are returned as a JSON response.
- **Removed Streamlit Code**: All Streamlit-specific code has been removed since we are now using an API.
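
To make the request/response contract described above concrete, here is a minimal framework-free sketch of the same logic. `handle_chat` and `EchoChain` are hypothetical names introduced only for illustration; the one assumption taken from the code is that `qa_chain.run(question)` returns a `(response, links)` pair.

```python
from typing import Any, Dict, List, Optional, Tuple


class EchoChain:
    """Hypothetical stand-in for QuestionAnsweringChain; it only echoes the question."""

    def run(self, question: str) -> Tuple[str, List[str]]:
        return f"You asked: {question}", []


def handle_chat(payload: Optional[Dict[str, Any]], chain: Any) -> Tuple[Dict[str, Any], int]:
    """Mirror the /chat endpoint: return (JSON body, HTTP status)."""
    question = (payload or {}).get("question")
    if not question:
        # Missing body, missing field, or empty string all count as "no question"
        return {"error": "No question provided"}, 400
    response, links = chain.run(question)
    return {"response": response, "links": links}, 200


ok_body, ok_status = handle_chat({"question": "hello"}, EchoChain())
bad_body, bad_status = handle_chat({}, EchoChain())
```

Keeping this logic in a plain function makes both the success path and the 400 path easy to check without starting a server.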

### Running the API
To run the API, make sure you have Flask installed. You can install it using:
```bash
pip install Flask
```
Then, run your server:
```bash
python src/server.py
```

You can now send POST requests to `http://127.0.0.1:5000/chat` with a JSON body like:
```json
{
    "question": "Your question here"
}
```

This will return a JSON response with the chatbot's answer and any related links.
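
For reference, a small client using only the Python standard library might look like the sketch below. `build_chat_request` and `ask` are hypothetical helper names, the base URL is the default development address shown above, and `ask` only works while the server is running.

```python
import json
import urllib.request


def build_chat_request(question: str, base_url: str = "http://127.0.0.1:5000") -> urllib.request.Request:
    """Build a POST request for /chat with a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = "http://127.0.0.1:5000") -> dict:
    """Send the question and decode the JSON reply (requires a running server)."""
    with urllib.request.urlopen(build_chat_request(question, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `ask("Your question here")` against a running server should return a dict with the `response` and `links` keys produced by the endpoint.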


## 1.1: Update server.py (RAG): instead of reloading everything (model, database, ...) on every ping to port 3000, change it as follows



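This happens because Flask is running in debug mode (`debug=True`), which enables the auto-reloader: the server restarts, and reruns all module-level initialization, whenever a source file changes. The reloader also runs the script in a second (child) process at startup, so the components are built twice. There are two ways to fix it:

1. **Option 1: Turn off debug mode**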

```python:src/server.py
if __name__ == '__main__':
    logger.info("🚀 Starting Flask application...")
    app.run(host='0.0.0.0', port=3000, debug=False)  # Set debug=False
```

2. **Option 2: Optimize model initialization (recommended)**

```python:src/server.py
# Initialize components outside of routes
load_dotenv()
gemini_key = os.getenv('gemini_key')
qdrant_key = os.getenv('qdrant_key')

# Global variables
vector_db = None
llm_handler = None
qa_chain = None

def initialize_components():
    global vector_db, llm_handler, qa_chain
    if vector_db is None:
        vector_db = VectorDatabase(
            model_name="hiieu/halong_embedding",
            collection_name='cmc_final_db',
            api=qdrant_key
        )
    if llm_handler is None:
        llm_handler = LLMHandler(model_name="gemini-1.5-flash", gemini_key=gemini_key)
    if qa_chain is None:
        qa_chain = QuestionAnsweringChain(
            llm_handler=llm_handler,
            vector_db=vector_db,
            num_docs=5,
            apply_rerank=True,
            apply_rewrite=True,
            date_impact=0.001
        )

# Initialize once when starting
initialize_components()

@app.route('/chat', methods=['POST'])
def chat():
    # silent=True returns None instead of raising on a missing or non-JSON body
    data = request.get_json(silent=True) or {}
    question = data.get('question')
    
    if not question:
        return jsonify({"error": "No question provided"}), 400

    # Use global qa_chain
    response, extracted_links = qa_chain.run(question)
    return jsonify({
        "response": response,
        "links": extracted_links
    })

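if __name__ == '__main__':
    logger.info("🚀 Starting Flask application...")
    # use_reloader=False keeps the debugger but stops the reloader from
    # starting a second process that would load the components again
    app.run(host='0.0.0.0', port=3000, debug=True, use_reloader=False)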
```

Main changes:
1. Component initialization is moved out of the route handler.
2. Global variables hold the instances.
3. Initialization happens only once, when the server starts.
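
The initialize-once pattern in the list above can also be written without `global`, for example with `functools.lru_cache`. The sketch below is an alternative, not the code from `src/server.py`; `ExpensiveChain` is a hypothetical stand-in for the real components, and its counter exists only to show how often construction happens.

```python
from functools import lru_cache


class ExpensiveChain:
    """Hypothetical stand-in for a slow-to-build component such as the QA chain."""

    instances_built = 0  # counts constructor calls, for demonstration only

    def __init__(self) -> None:
        ExpensiveChain.instances_built += 1

    def run(self, question: str) -> str:
        return f"answer to: {question}"


@lru_cache(maxsize=1)
def get_chain() -> ExpensiveChain:
    """Build the chain on first use; later calls return the cached instance."""
    return ExpensiveChain()


# Simulate two requests: both reuse the same instance.
first = get_chain()
second = get_chain()
```

This variant builds the chain lazily, on the first call. Calling `get_chain()` once at import time would restore the eager, build-at-startup behavior.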

Benefits:
- Less time spent loading the model
- No re-initialization on each request
- Lower resource usage
- Debug mode can still be kept if needed
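
If debug mode stays on, the reloader's extra process is worth accounting for. Werkzeug marks the child process that actually serves requests by setting the `WERKZEUG_RUN_MAIN` environment variable to `"true"`. Below is a minimal sketch of a guard built on that; `should_initialize` is a hypothetical helper name, and the `environ` parameter exists only to make it easy to test.

```python
import os
from typing import Mapping, Optional


def should_initialize(use_reloader: bool, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True in the process that will actually serve requests.

    Without the reloader there is a single process, so always initialize.
    With the reloader, only the child process (WERKZEUG_RUN_MAIN == "true")
    serves requests; the parent merely watches files and restarts the child.
    """
    env = os.environ if environ is None else environ
    if not use_reloader:
        return True
    return env.get("WERKZEUG_RUN_MAIN") == "true"
```

In `src/server.py` this could gate the `initialize_components()` call, so the watcher process skips the model load. A source change still restarts the child, which loads the components again; passing `use_reloader=False` to `app.run` avoids that while keeping the debugger.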


After the update, the server log looks like this:


```
D:\OneDrive - Hanoi University of Science and Technology\GIT\RAG-business-analysis\src\rag_pipeline\back.py:56: LangChainDeprecationWarning: The class `Qdrant` was deprecated in LangChain 0.0.37 and will be removed in 1.0. An updated version of the class exists in the :class:`~langchain-qdrant package and should be used instead. To use it run `pip install -U :class:`~langchain-qdrant` and import as `from :class:`~langchain_qdrant import Qdrant``.
  return Qdrant(
2024-12-19 20:27:10,909 - INFO - 🚀 Starting Flask application...
2024-12-19 20:27:10,917 - WARNING -  * Debugger is active!
2024-12-19 20:27:10,934 - INFO -  * Debugger PIN: 100-469-070
What is the name of this [person/place/thing/organization/etc.]?  Please provide more context if possible, such as a description or relevant details.

2024-12-19 20:27:14,390 - INFO - HTTP Request: POST https://5d9673e8-d966-4738-adbb-95a5842604ba.europe-west3-0.gcp.cloud.qdrant.io:6333/collections/cmc_final_db/points/search "HTTP/1.1 200 OK"
You're using a XLMRobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding. 
Error processing neighbors for doc 5852: Qdrant does not yet support get_by_ids.
Error processing neighbors for doc 9586: Qdrant does not yet support get_by_ids.
Error processing neighbors for doc 34622: Qdrant does not yet support get_by_ids.
Error processing neighbors for doc 39602: Qdrant does not yet support get_by_ids.
Error processing neighbors for doc 8587: Qdrant does not yet support get_by_ids.
2024-12-19 20:27:34,562 - INFO - 127.0.0.1 - - [19/Dec/2024 20:27:34] "POST /chat HTTP/1.1" 200 -
What is the name of this [person/place/thing/organization/file/etc.]?  Please provide more context if possible.

2024-12-19 20:27:41,318 - INFO - HTTP Request: POST https://5d9673e8-d966-4738-adbb-95a5842604ba.europe-west3-0.gcp.cloud.qdrant.io:6333/collections/cmc_final_db/points/search "HTTP/1.1 200 OK"
Error processing neighbors for doc 5852: Qdrant does not yet support get_by_ids.
Error processing neighbors for doc 34622: Qdrant does not yet support get_by_ids.
Error processing neighbors for doc 39602: Qdrant does not yet support get_by_ids.
Error processing neighbors for doc 1376: Qdrant does not yet support get_by_ids.
Error processing neighbors for doc 37454: Qdrant does not yet support get_by_ids.
2024-12-19 20:27:48,593 - INFO - 127.0.0.1 - - [19/Dec/2024 20:27:48] "POST /chat HTTP/1.1" 200 -
```