Run all pip install commands in the terminal.

Error 1: The Streamlit app runs locally, but after ingesting a PDF it shows this message:

Both `max_new_tokens` (=256) and `max_length`(=220) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)

This is not an error but a warning from the Hugging Face Transformers library.
Why does it happen?
When generating text, the model looks at parameters like:
max_length → total length of input + output tokens
max_new_tokens → maximum number of tokens the model can generate in the output

If you set both, the library says:
“I’ll ignore max_length and use max_new_tokens instead.”
That’s why you see the message.

How to fix (optional)
You can update your synthesize_answer function (or wherever you call .generate()) to only set one of them. For example:

output = generator(
    query,
    max_new_tokens=256,   # keep only this
    # remove max_length to avoid the warning
    do_sample=True,
    temperature=0.7
)


If you’re fine with the current behavior, you can safely ignore the warning — it won’t break your chatbot.
But if you want a clean console, just remove max_length and use only max_new_tokens.

You should use only one:
If you want the model to generate at most N tokens beyond the prompt, use:
generator(prompt, max_new_tokens=256)

If you want the model to limit total tokens (prompt + answer), use:
generator(prompt, max_length=220)

In practice, max_new_tokens is recommended for QA/chat apps, because your prompt length can vary.
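
To make the difference concrete, here is a small sketch of the arithmetic. It assumes a decoder-only model, where max_length counts the prompt tokens as well; new_token_budget is a hypothetical helper written for this note, not part of the transformers API.

```python
from typing import Optional


def new_token_budget(prompt_tokens: int,
                     max_length: Optional[int] = None,
                     max_new_tokens: Optional[int] = None) -> int:
    """Return how many tokens may be generated (hypothetical helper)."""
    if max_new_tokens is not None:
        # max_new_tokens takes precedence and does not depend on the prompt
        return max_new_tokens
    if max_length is not None:
        # max_length is a total, so the prompt uses up part of it
        return max(max_length - prompt_tokens, 0)
    raise ValueError("set max_length or max_new_tokens")


for prompt_tokens in (50, 200, 300):
    print(prompt_tokens,
          new_token_budget(prompt_tokens, max_length=220),
          new_token_budget(prompt_tokens, max_new_tokens=256))
```

With max_length=220 a 200-token prompt leaves room for only 20 generated tokens, and a 300-token prompt leaves none, while max_new_tokens=256 gives the same budget for every prompt.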

Error 2: about chunks & MemoryError

  File "C:\Users\Admin\Documents\CusorAIDoc\chatbot_st.py", line 47, in <module>
    chunks = chunk_text(paste_text,chunk_size=chunk_size,overlap=overlap)
  File "C:\Users\Admin\Documents\CusorAIDoc\csb.py", line 44, in chunk_text
    chunks.append(chunk)
    ~~~~~~~~~~~~~^^^^^^^

At first I thought the problem was appending to chunks without initializing it (chunks = []), but the real problem was that my program was running out of RAM while splitting and storing the chunks.

Why is this happening?
Looking at your code:

while start < n:
    end = min(start + chunk_size, n)
    chunk = text[start:end]
    chunks.append(chunk)
    start = end - overlap
    if start < 0:
        start = 0


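With any overlap > 0 (e.g. overlap=120 with chunk_size=800), the loop gets stuck at the end of the text: once end is clamped to n, start = end - overlap is still less than n, so the condition start < n never becomes false.
Each extra pass appends the same final chunk again, so the chunks list keeps growing until RAM runs out and Python raises MemoryError.
A very large input (say an entire big PDF) adds to the memory use, but it is not needed to trigger the error; a short text loops forever too.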
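
The stuck loop is easy to see by tracing the start offsets. The sketch below repeats the update from the loop above but stops after a few steps; trace_starts and max_steps are names made up for this illustration.

```python
def trace_starts(n, chunk_size, overlap, max_steps=8):
    """Record the start offsets the original loop visits, capped at max_steps."""
    starts = []
    start = 0
    while start < n and len(starts) < max_steps:
        starts.append(start)
        end = min(start + chunk_size, n)
        start = max(end - overlap, 0)  # same update as the original code
    return starts


print(trace_starts(n=2000, chunk_size=800, overlap=120))
# [0, 680, 1360, 1880, 1880, 1880, 1880, 1880]
```

Once end reaches n (2000 here), start is reset to 1880 on every pass, so without the cap the loop would append the same last chunk forever.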

How to Fix:
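Fix the loop increment. Right now, you are doing:
start = end - overlap

On the last chunk end equals n, so this sets start back to n - overlap and the loop never finishes. Instead, always move forward by a fixed step:
start += chunk_size - overlap
(This needs overlap < chunk_size; otherwise the step is zero or negative.)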

Add a safeguard for memory
If the text is too big, limit chunks:
if len(chunks) > 10000:   # arbitrary cutoff
    break
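
The cutoff can also be applied without building the full list first. The following is only a sketch of one possible approach, not the app's actual code: iter_chunks and MAX_CHUNKS are hypothetical names, and the generator yields one chunk at a time so the caller decides how many to keep.

```python
from itertools import islice
from typing import Iterator


def iter_chunks(text: str, chunk_size: int = 800, overlap: int = 120) -> Iterator[str]:
    """Yield overlapping chunks one at a time (hypothetical helper)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # always positive because of the check above
    for start in range(0, len(text), step):
        yield text[start:start + chunk_size]
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text


MAX_CHUNKS = 10000  # arbitrary cutoff, as in the note above
chunks = list(islice(iter_chunks("a" * 5000), MAX_CHUNKS))
print(len(chunks))
```

Because range() advances by a fixed positive step, this version cannot loop forever, and islice stops reading after MAX_CHUNKS chunks even for a very large input.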

Optimized function:

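from typing import List

def chunk_text(text: str, chunk_size: int = 800, overlap: int = 120) -> List[str]:
    """
    Splits text into overlapping chunks.
    """
    # a step of zero or less would never reach the end of the text
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    text = clean_text(text)  # clean_text is the app's own helper, defined elsewhere
    chunks = []
    start = 0
    n = len(text)

    while start < n:
        end = min(start + chunk_size, n)
        chunks.append(text[start:end])
        if end == n:
            break  # last chunk reached; stop so the tail is not repeated

        # move start forward by a fixed positive step
        start += chunk_size - overlap

    return chunks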

Error 3: After running the app locally with streamlit run, I stored my name in data/faqs.txt and asked the chatbot, but it says no sources were found.

The reason it says “No sources found yet. Insert your documents.” is that your app never loaded the data/faqs.txt file into the vector database / memory used for retrieval.

Saving text manually into data/faqs.txt does not automatically load it into the app. The app only works with documents you upload, or paste and then add by pressing Insert.

How to Fix
You have two options:

Option 1: Upload your file via UI
Go to the left panel in your app.
Click Browse files → select your faqs.txt.
Press Insert.
→ This will chunk the file and add it to your knowledge base.
Now when you ask “what is my name?”, it should search inside faqs.txt.

Option 2: Auto-load data/faqs.txt on startup
If you want your app to always load data/faqs.txt without uploading each time, add this code in your chatbot_st.py (before the UI part):

import os

faqs_path = "data/faqs.txt"
if os.path.exists(faqs_path):
    with open(faqs_path, "r", encoding="utf-8") as f:
        faq_text = f.read()
    chunks = chunk_text(faq_text, chunk_size=800, overlap=120)
    # insert into your vector store
    insert_chunks(chunks)


(Here insert_chunks should be whatever function you are already using when pressing Insert in your app.)
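
One thing to watch with Option 2: Streamlit reruns the whole script on every interaction, so code placed at the top of chatbot_st.py runs again each time. The sketch below shows one way to guard against inserting the same file repeatedly. load_faqs_once and the "faqs_loaded" key are hypothetical names; the chunker and insert arguments stand for the app's own chunk_text and insert function.

```python
import os
from typing import Callable, List, MutableMapping


def load_faqs_once(state: MutableMapping,
                   path: str,
                   chunker: Callable[[str], List[str]],
                   insert: Callable[[List[str]], None]) -> bool:
    """Insert the file at `path` at most once per session; return True if inserted."""
    if "faqs_loaded" in state or not os.path.exists(path):
        return False
    with open(path, "r", encoding="utf-8") as f:
        faq_text = f.read()
    insert(chunker(faq_text))
    state["faqs_loaded"] = True  # remember that this session already loaded the file
    return True
```

In the app this could be called as load_faqs_once(st.session_state, "data/faqs.txt", chunk_text, insert_chunks), assuming insert_chunks is the function your Insert button already uses. st.session_state behaves like a dictionary that survives reruns within a session, which is why a plain dict works for testing.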