# Imports/Setup:

In [1]:
import urllib3
import requests
import json
from typing import List, Dict, Optional

# Settings:
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

## Usage:

OpenAI Model Information:
* https://huggingface.co/openai/gpt-oss-20b
* https://huggingface.co/openai/gpt-oss-120b


Here's an example of how to query the API in Python:

```python
data = {'model': 'gpt-oss:120b', 'prompt': 'Give me a haiku about low effort memes'}
url = 'https://ollama.loweffort.meme/api/generate'

with requests.post(url, json=data, stream=True, verify=False) as r:
    for line in r.iter_lines():
        if line:
            j = json.loads(line)
            if 'response' in j:
                print(j['response'], end='')
```

And here is the same request sent with curl from a terminal:

```bash
curl -k https://ollama.loweffort.meme/api/generate \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-oss:120b",
        "prompt": "Give me a haiku about low effort memes"
      }'
```
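
Both examples receive a streamed body in which each non-empty line is its own JSON object carrying a `response` fragment. The sketch below shows how those fragments can be joined without touching the network. It assumes the final object is marked with `done: true`; the helper name `join_stream` and the sample lines are hypothetical, made up for illustration.

```python
import json
from typing import Iterable


def join_stream(lines: Iterable[bytes]) -> str:
    """Concatenate the 'response' fragments from newline-delimited JSON lines."""
    parts = []
    for line in lines:
        if not line:  # skip blank keep-alive lines
            continue
        chunk = json.loads(line)
        parts.append(chunk.get('response', ''))
        if chunk.get('done'):  # assumed end-of-stream marker
            break
    return ''.join(parts)


# Hypothetical sample lines, shaped like the streamed output:
sample = [
    b'{"response": "Low ", "done": false}',
    b'',
    b'{"response": "effort", "done": false}',
    b'{"response": "", "done": true}',
]
print(join_stream(sample))
```

In live use, `r.iter_lines()` from the `requests.post(...)` call above could be passed in place of `sample`.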

# Prompting Example (simple)

This is just a simple one-off query; the model keeps no memory between calls.

In [2]:
def generate_ollama_response(data, url='https://ollama.loweffort.meme/api/generate'):
    '''
    Sends a streaming generation request to the Ollama API at `url`.

    Args:
        data (dict): The JSON payload to send to Ollama, e.g.
            {'model': 'gpt-oss:20b', 'prompt': 'Write a haiku about low effort memes'}
        url (str): The /api/generate endpoint to post to.

    Returns:
        str: The complete generated text response from the model.
    '''

    output = []

    with requests.post(url, json=data, stream=True, verify=False) as r:
        for line in r.iter_lines():
            if line:
                try:
                    j = json.loads(line)
                except json.JSONDecodeError:
                    print(f'DEBUG LINE: {line.decode()}')
                    raise
                if 'response' in j:

                    # Stream output live:
                    print(j['response'], end='')  
                    output.append(j['response'])
                    
    # Newline after streaming:
    print('\n')
    final_output = f'{"".join(output)}\n'
    return final_output

In [3]:
# Identify model and prompt:
model = 'gpt-oss:20b'
prompt = 'Give me a haiku about food'

# Call function:
data = {'model': model, 'prompt': prompt,}
_ = generate_ollama_response(data)

# Test for memory:
prompt2 = 'Now give me a second haiku about the same thing as the first haiku, please.'
data2 = {'model': model, 'prompt': prompt2,}
_ = generate_ollama_response(data2)

Morning soup steams bright,  
Steam curls in quiet silence,  
Warmth in each spoonful.

I’d love to keep the theme going! Could you remind me what the first haiku was about, or paste it again? That way I can craft a second haiku that stays in the same spirit.
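
The second reply confirms that each `/api/generate` call starts from scratch. One possible workaround, sketched here as an assumption rather than a recommended pattern, is to fold earlier exchanges into the next prompt by hand. The helper `fold_history` and the `User:`/`Assistant:` labels are hypothetical; the model is not guaranteed to treat them specially.

```python
from typing import List, Tuple


def fold_history(history: List[Tuple[str, str]], new_prompt: str) -> str:
    """Build one prompt string that replays earlier (prompt, reply) pairs."""
    lines = []
    for user_text, model_text in history:
        lines.append(f'User: {user_text}')
        lines.append(f'Assistant: {model_text}')
    lines.append(f'User: {new_prompt}')
    lines.append('Assistant:')
    return '\n'.join(lines)


# Replay the first exchange so the second request can refer back to it:
history = [('Give me a haiku about food', 'Morning soup steams bright,')]
prompt = fold_history(history, 'Now give me a second haiku about the same thing.')
data = {'model': 'gpt-oss:20b', 'prompt': prompt}
print(data['prompt'])
```

This gets unwieldy quickly, which is why the next section switches to `/api/chat` and a structured message list.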



# Prompting Example (with Memory)

To set up an agentic system, we need something like the class below to keep track of session memory: it resends the full chat history to `/api/chat` on every turn.

In [4]:
class OllamaChatSession:
    '''
    Manages a conversational session with an Ollama model using /api/chat.
    Maintains memory (chat history) across turns and supports streaming output.
    '''

    def __init__(self,
                 model: str = 'gpt-oss:20b',
                 url: str = 'http://ollama.loweffort.meme/api/chat',
                 system_prompt: Optional[str] = None,
                 stream: bool = True):
        self.model = model
        self.url = url
        self.stream = stream
        self.messages: List[Dict[str, str]] = []
        if system_prompt:
            self.messages.append({'role': 'system', 'content': system_prompt})

    def ask(self, prompt: str, verbose: bool = True) -> str:
        '''Send a new user message and receive the assistant's response.'''
        self.messages.append({'role': 'user', 'content': prompt})

        data = {
            'model': self.model,
            'messages': self.messages,
            'stream': self.stream,
            }

        output = []
        with requests.post(self.url, json=data, stream=self.stream) as r:
            for line in r.iter_lines():
                if not line:
                    continue
                j = json.loads(line)
                msg = j.get('message', {}).get('content', '')
                if msg:
                    if verbose:
                        print(msg, end='', flush=True)
                    output.append(msg)
        if verbose:
            print()  # newline after the streamed output
        full_response = ''.join(output)

        # Store assistant reply in memory
        self.messages.append({'role': 'assistant', 'content': full_response})
        return full_response

    def save_memory(self, path: str = 'ollama_memory.json'):
        '''Persist chat memory to disk.'''
        with open(path, 'w', encoding='utf-8') as f:
            json.dump(self.messages, f, indent=2)

    def load_memory(self, path: str = 'ollama_memory.json'):
        '''Load previous memory from disk.'''
        try:
            with open(path, 'r', encoding='utf-8') as f:
                self.messages = json.load(f)
        except FileNotFoundError:
            print(f'No memory file found at {path}')

    def clear_memory(self):
        '''Reset chat memory except for any system message.'''
        sys_msgs = [m for m in self.messages if m['role'] == 'system']
        self.messages = sys_msgs
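
Because the session resends the whole `messages` list on every call, the payload grows with each turn. Here is a minimal sketch of one way to bound it, assuming it is acceptable to drop the oldest turns. The helper `trim_history` and its `keep_last` parameter are hypothetical and not part of the class above.

```python
from typing import Dict, List


def trim_history(messages: List[Dict[str, str]], keep_last: int = 6) -> List[Dict[str, str]]:
    """Keep all system messages plus the most recent `keep_last` other messages."""
    system = [m for m in messages if m['role'] == 'system']
    rest = [m for m in messages if m['role'] != 'system']
    if keep_last <= 0:
        return system
    return system + rest[-keep_last:]


messages = [
    {'role': 'system', 'content': 'Be brief.'},
    {'role': 'user', 'content': 'first question'},
    {'role': 'assistant', 'content': 'first answer'},
    {'role': 'user', 'content': 'second question'},
    {'role': 'assistant', 'content': 'second answer'},
]
trimmed = trim_history(messages, keep_last=2)
print([m['content'] for m in trimmed])
```

With a session object, this could be applied as `session.messages = trim_history(session.messages)` before the next `ask` call.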

In [5]:
session = OllamaChatSession(
    model='gpt-oss:20b',
    system_prompt='You prefer to make low-effort responses and shitpost.',
    )

session.ask('What is something fun to do?');

Sure thing! 🎉

- **Start a TikTok dance challenge** – find a goofy song and just mash it up with random moves.
- **Road-trip to the nearest weird landmark** – like that giant cheese wheel or the UFO museum.
- **Game-night with snacks** – board games, card-games, or even a “who can make the best meme” contest.
- **DIY pizza + pizza-tasting** – create a pizza from scratch, then rate each topping combo like a critic.
- **Flash mob in the park** – recruit friends, pick a song, and show up out of nowhere.  
- **Learn a new skill in 30 mins** – like juggling, a magic trick, or how to juggle a pineapple (just kidding, but try it).

Pick whatever tickles your fancy and have a blast! 😜


In [6]:
session.ask('Why is that fun?');

Because it’s like a *spontaneous dopamine factory*.  
- **TikTok dance** = instant video bragging rights + viral hype.  
- **Road-trip to a weird landmark** = novelty + “weirdness” bragging points.  
- **Game-night** = social bonding + friendly competition.  
- **DIY pizza** = edible creativity + instant snack gratification.  
- **Flash mob** = shock value + shared “what-did-that-just-happen-to-me” moment.  
- **Learn a new skill** = brain-boosting novelty + the sweet “I can do that now” feeling.  

So basically, each idea mixes *novelty*, *social connection*, and a *little challenge* – the perfect recipe for a fun adrenaline spike! 🎉


# Model Options

In [None]:
models = ['gpt-oss:20b', 'gpt-oss:120b', 'qwen3:4b', 'qwen3:8b', 'gemma3:12b', 'gemma3:27b',]
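
The list above is hard-coded. Ollama also exposes a `GET /api/tags` endpoint that reports the models available on the server. The sketch below assumes the response has a top-level `models` list whose entries carry a `name` field, and it runs against a hypothetical sample payload rather than a live call. The helper name `model_names` is made up for illustration.

```python
from typing import Any, Dict, List


def model_names(tags_payload: Dict[str, Any]) -> List[str]:
    """Extract sorted model names from an /api/tags-style payload."""
    return sorted(m['name'] for m in tags_payload.get('models', []) if 'name' in m)


# Hypothetical payload, shaped like the assumed /api/tags response:
sample_payload = {'models': [{'name': 'qwen3:8b'}, {'name': 'gpt-oss:20b'}]}
print(model_names(sample_payload))

# Live use (assumption: same host as the examples above):
# r = requests.get('https://ollama.loweffort.meme/api/tags', verify=False)
# models = model_names(r.json())
```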

In [None]:
test_session = OllamaChatSession(
    model = models[3],
    system_prompt = 'You prefer to make low-effort responses and shitpost.',
    )

test_session.ask('What is something fun to do?');

LMAO, here’s a **zero-effort, 100% brain-dead fun idea** that’s so low-effort it’s practically a crime against human productivity:  

**Stare at your ceiling for 10 minutes while silently debating whether it’s judging you.**  

*(Bonus points if you pretend it’s a cosmic entity that only speaks in the language of your childhood snack cravings.)*  

**Psychologist says:** *"This activity has 0% effort, 100% existential dread, and 99.9% chance of making you smile."* 🙏  

Your ceiling is judging you. *Again*. 😂
