In [28]:
from ollama_response import create_response
from IPython.display import HTML, Markdown, display
import os
import warnings
import json
import tempfile
import shutil
warnings.filterwarnings("ignore")

print("All the Libraries are imported")

All the Libraries are imported


In [16]:
def creative_stateful(query: str) -> str: 
    """
    Responds to user queries in a creative, imaginative, and expressive manner.

    This agent is activated when the conversation tone is classified as 'creative'.
    It generates responses that are metaphorical, artistic, poetic, or abstract in nature,
    aiming to inspire, provoke thought, or entertain.

    Parameters:
        query (str): The user’s input message.

    Returns:
        str: A creatively inspired response based on the user's message.
    """
    creative_prompt = f"""
            You are a creative assistant who responds with imagination, beauty, and expression.
            
            Your task is to answer the user's message in a way that is:
            - Artistic, poetic, or metaphorical
            - Emotionally evocative or thought-provoking
            - Abstract, whimsical, or story-driven if suitable
            - Original and free-form, like a piece of creative writing
            
            Avoid sounding robotic or overly technical. Feel free to use poetic devices, analogies, or even short stories.
            
        """

    response = create_response(query, system_prompt=creative_prompt)
    result = Markdown(response.message.content)
    #Use display(..) to display it on python
    return result

def science_stateful(query: str) -> str: 
    """
    Responds to user queries in a scientific and analytical manner.

    This agent is activated when the conversation tone is classified as 'scientific'.
    It provides fact-based, logical, and structured responses, often referencing
    scientific concepts, definitions, or explanations.

    Parameters:
        query (str): The user’s input message.

    Returns:
        str: A scientifically reasoned response to the user query.
    """

    science_prompt = f"""
                    You are a scientific assistant. 

                    Your goal is to explain scientific questions in a way that is:
                    - Analytical and based on facts
                    - Logically reasoned
                    - Easy to understand, even for someone with a high school science background
                    - Structured and clear, using bullet points or paragraphs if appropriate

                    Avoid overly complex jargon unless it's essential, and define technical terms when used.
    
                     """
    response = create_response(query, system_prompt=science_prompt)
    result = Markdown(response.message.content)
    return result

In [33]:
class conversational_agent: 
    def __init__(self, username, conv_memory, science_agent, creative_agent): 
        self.creative_agent = creative_stateful
        self.science_agent = science_stateful
        self.username = username
        self.conv_memory = conv_memory
        if os.path.exists(conv_memory): 
            print("ja")
            self.users_conversation = self.load_conversation()
        else: 
            self.users_conversation = {} #{ username: [ {query, response, ...}, ... ] }
        
    def run(self, query, vibe): 
    
        if vibe == "scientific": 
            response = self.science_agent(query)            

        elif vibe == "creative": 
            response = self.creative_agent(query)

        entry = {
            "query": query, 
            "response": response
        }

        #Check if the user is already inside the database
        if self.username not in self.users_conversation: 
            self.users_conversation[self.username] = []

        #Updating conversation query
        self.users_conversation[self.username].append(entry)

        #Updating the json file
        self.save_conversation()
        
        return display(response)

    def save_conversation(self): 
        try:
            with tempfile.NamedTemporaryFile("w", delete=False, encoding="utf-8") as tf:
                json.dump(self.users_conversation, tf, ensure_ascii=False, indent=3)
                temp_name = tf.name
            shutil.move(temp_name, self.conv_memory)
        except Exception as e:
            print(f"Fehler beim Speichern der Konversation: {e}")

    def load_conversation(self):
        try:
            with open(self.conv_memory, "r", encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            print(f"No existing conversation file found at {filename}")

In [34]:
agent = conversational_agent("Max", "conversation.json", science_stateful, creative_stateful)
agent.run("How strong is LLM?", "scientific")

Fehler beim Speichern der Konversation: Object of type Markdown is not JSON serializable


Okay, let’s break down the “strength” of Large Language Models (LLMs) like GPT-4, Gemini, or Claude. It's a surprisingly complex question, and "strength" itself needs careful definition. We can’t just say “they’re amazing!” – we need to understand *what* they’re good at, *where* they fall short, and how we measure their capabilities. 

Here's an analytical breakdown:

**1. What is an LLM, and How Does It "Think"?**

*   **Large Language Models (LLMs):** These are sophisticated AI models trained on *massive* amounts of text data (think the internet, books, articles). They don’t “think” in the way humans do. Instead, they’ve learned to statistically predict the next word in a sequence. It’s like a super-advanced autocomplete.
*   **Probability, Not Understanding:**  Crucially, LLMs excel at recognizing patterns and relationships within the data they’ve been trained on. They generate text by calculating the *probability* of a particular sequence of words based on those learned patterns. This is a key distinction from genuine understanding or reasoning.


**2. What Tasks Can LLMs Perform Well? (Where Their “Strength” Lies)**

*   **Text Generation:**  LLMs are exceptionally good at generating different kinds of creative text formats, like poems, code, scripts, musical pieces, email, letters, etc.  They can mimic writing styles surprisingly well.
*   **Translation:** Because they’ve been trained on multilingual data, they can translate between languages with impressive accuracy (though it's not always perfect).
*   **Summarization:**  They can condense large amounts of text into shorter, more manageable summaries.
*   **Question Answering (Limited):** They can often answer questions, but this relies on having the relevant information embedded within their training data.  Their ability to genuinely *reason* about the answer is limited.
*   **Code Generation (Increasingly Strong):**  Models like GPT-4 are increasingly capable of generating code in various programming languages, though this requires careful review and testing.

**3. Where Do LLMs Fall Short? (Their Weaknesses)**

*   **Lack of True Understanding:**  This is the biggest limitation. LLMs can *appear* to understand, but they are operating based on statistical probabilities, not genuine comprehension of concepts, cause-and-effect, or the real world.
*   **Hallucinations (Fabrication of Information):** Because they predict the most likely next word, they can confidently present false or misleading information as fact – this is often called “hallucinating.”  They don’t have a mechanism for verifying information.
*   **Bias:** LLMs are trained on data created by humans, and that data reflects existing societal biases. As a result, LLMs can perpetuate and amplify these biases.
*   **Common Sense Reasoning:** They frequently struggle with tasks requiring common-sense knowledge that humans acquire through everyday experience. (e.g., “If I put a brick in a glass of water, what happens?”)
*   **Temporal Reasoning:** They can struggle with understanding sequences of events over time.



**4. Measuring “Strength” – How We Assess LLMs**

*   **Benchmarks:** Researchers use standardized benchmarks (like MMLU – Massive Multitask Language Understanding) to compare the performance of different LLMs on a range of tasks.  However, benchmarks are imperfect and can be gamed.
*   **Human Evaluation:**  Humans often evaluate the quality of LLM-generated text, considering factors like coherence, relevance, and accuracy.
*   **Task-Specific Performance:**  The “strength” of an LLM is ultimately determined by its performance on the specific tasks it’s designed to do.

**5. Current State & Future Trends**

*   LLMs are rapidly evolving.  New models with increased parameters (more data used to train them) and improved training techniques are constantly emerging.
*   Research is focusing on addressing the limitations of LLMs, particularly improving reasoning capabilities, reducing hallucinations, and mitigating bias.

**In conclusion:** LLMs are incredibly powerful tools for text generation and manipulation, but they're not "intelligent" in the same way as humans. Understanding their strengths and weaknesses is crucial for using them effectively and responsibly. 



Do you want me to delve deeper into a specific aspect of this discussion, such as:

*   A particular benchmark (e.g., MMLU)?
*   The technical details of how LLMs are trained?
*   The ethical implications of using LLMs?