# MiniMax-M2.1 Capabilities Test

This notebook exercises the MiniMax-M2.1 model in five areas:
- Reasoning and logic
- Code generation
- Creative writing
- Math and calculations
- Multi-turn conversation coherence


In [1]:
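# Setup: make the parent directory importable from this notebook
import sys
sys.path.insert(0, '..')

# Load environment variables from the .env file in the parent directory
from dotenv import load_dotenv
load_dotenv('../.env')

from src.minimax_client import MiniMaxClient

# Create the client and confirm which model it targets
client = MiniMaxClient()
print(f"Testing: {client.model}")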


Testing: MiniMax-M2.1
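
The responses printed later in this notebook begin with a `<think>` block holding the model's reasoning trace, and the outputs shown are cut off before that block closes. Below is a minimal sketch for separating the trace from the final answer, assuming the tags appear literally in the returned string. `split_thinking` is a hypothetical helper, not part of `MiniMaxClient`.

```python
import re

# Hypothetical helper (not part of MiniMaxClient): split a raw response into
# (reasoning, answer), assuming the reasoning is wrapped in literal <think> tags.
# An unclosed <think> block is treated as running to the end of the text.
_THINK_RE = re.compile(r"<think>(.*?)(?:</think>|\Z)", re.DOTALL)


def split_thinking(text: str) -> tuple[str, str]:
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

An empty answer alongside non-empty reasoning suggests the response was truncated before the model finished thinking.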


## 1. Reasoning and Logic

Test the model's ability to reason through problems step by step.


In [2]:
# Logic puzzle
response = client.simple_chat("""
A farmer needs to cross a river with a wolf, a goat, and a cabbage. 
The boat can only carry the farmer and one item at a time.
If left alone:
- The wolf will eat the goat
- The goat will eat the cabbage

How can the farmer get everything across safely?
""")
print("Logic Puzzle Solution:")
print(response)


Logic Puzzle Solution:
<think>
Got it, let's try to figure this out step by step. First, let's list the initial state: Left bank (L): Farmer (F), Wolf (W), Goat (G), Cabbage (C). Right bank (R): Empty. The boat can carry F plus one other thing.

First, we need to avoid leaving W and G alone together, or G and C alone together. Let's start:

1. First trip: F takes G to R (since if F takes W first, then L has W and C, but wait no—wait if F takes W first, L has W and C? No, initial L is F, W, G, C. If F takes W to R, then L has G and C—oh, that's bad (G would eat C). So can't take W first. If F takes C first, L has W and G—bad (W eats G). So first trip must be F takes G to R. So after trip 1: L: W, C; R: F, G; boat at R.

2. Trip 2: F leaves G at R and goes back alone to L. So now L: F, W, C; R: G; boat at L.

3. Trip 3: Now L has F, W, C—wait no, but if F takes W to R, then L has C alone, which is fine, and R has F, G, W—wait no, but wait when F takes W to R, what's the state on R? R has
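
To check the model's answer independently, the puzzle can be solved by brute force. The sketch below runs a breadth-first search over bank states; the state encoding and the names `is_safe` and `solve` are choices made for this example only.

```python
from collections import deque

ITEMS = ("wolf", "goat", "cabbage")
# Pairs that must not be left together without the farmer.
UNSAFE = ({"wolf", "goat"}, {"goat", "cabbage"})


def is_safe(left: frozenset, farmer_left: bool) -> bool:
    """Return True if the bank without the farmer holds no unsafe pair."""
    unattended = set(ITEMS) - left if farmer_left else set(left)
    return not any(pair <= unattended for pair in UNSAFE)


def solve() -> list:
    """Breadth-first search over (items on left bank, farmer on left?) states."""
    start = (frozenset(ITEMS), True)
    goal = (frozenset(), False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, farmer_left), path = queue.popleft()
        if (left, farmer_left) == goal:
            return path
        # Items on the farmer's current bank are the only possible cargo.
        here = left if farmer_left else frozenset(ITEMS) - left
        for cargo in [None, *sorted(here)]:
            new_left = set(left)
            if cargo is not None:
                if farmer_left:
                    new_left.remove(cargo)
                else:
                    new_left.add(cargo)
            state = (frozenset(new_left), not farmer_left)
            if state not in seen and is_safe(state[0], state[1]):
                seen.add(state)
                queue.append((state, path + [cargo or "nothing"]))
    return []


for step, cargo in enumerate(solve(), start=1):
    print(f"Trip {step}: farmer carries {cargo}")
```

Because the search is breadth-first, the path it returns uses the fewest crossings, which gives a baseline for judging the model's step count.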

In [3]:
# Deductive reasoning
response = client.simple_chat("""
All roses are flowers.
Some flowers fade quickly.
Can we conclude that some roses fade quickly? 
Explain your reasoning.
""")
print("Deductive Reasoning:")
print(response)


Deductive Reasoning:
<think>
The user asks: "All roses are flowers. Some flowers fade quickly. Can we conclude that some roses fade quickly? Explain your reasoning."

We need to answer in a thorough manner. It's a logic puzzle: "All roses are flowers" => universal affirmative: ∀x (R(x) → F(x)). "Some flowers fade quickly." => existential affirmative: ∃x (F(x) ∧ Q(x)), where Q(x) = fade quickly. The question is whether we can conclude ∃x (R(x) ∧ Q(x)). In classical categorical logic, you cannot infer "some roses fade quickly" from the premises because you don't know that the flowers that fade quickly are among the roses. The existential premise only says there exists at least one flower that fades quickly, but it might not be a rose. So the answer is no, we cannot conclude that.

But we need to explain reasoning: using Venn diagrams, set theory, categorical syllogism, and perhaps note that the syllogism would be invalid: universal premise (All R are F), particular premise (Some F are Q)
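
The invalidity of this syllogism can be shown with a concrete counterexample: a small world in which both premises are true and the conclusion is false. The sets below are invented for illustration.

```python
# A small counterexample world: both premises hold, yet the conclusion fails.
roses = {"rose_1", "rose_2"}
flowers = roses | {"poppy"}
fades_quickly = {"poppy"}

all_roses_are_flowers = roses <= flowers            # premise 1
some_flowers_fade = bool(flowers & fades_quickly)   # premise 2
some_roses_fade = bool(roses & fades_quickly)       # proposed conclusion

print(all_roses_are_flowers, some_flowers_fade, some_roses_fade)
```

One such world is enough to show that the conclusion does not follow from the premises.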

## 2. Code Generation

Test the model's ability to write and explain code.


In [None]:
# Python code generation
response = client.simple_chat("""
Write a Python function that implements binary search on a sorted list.
Include docstring and type hints.
""")
print("Binary Search Implementation:")
print(response)
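
One way to judge the generated function is to compare it against the standard library on random inputs. The sketch below assumes the model's function takes `(items, target)` and returns an index or `-1`; the prompt does not fix that signature, so adjust as needed. `reference_search` and `check_binary_search` are hypothetical names.

```python
import bisect
import random
from typing import Callable, Sequence


def reference_search(items: Sequence[int], target: int) -> int:
    """Return the index of target in sorted items, or -1 if absent."""
    i = bisect.bisect_left(items, target)
    return i if i < len(items) and items[i] == target else -1


def check_binary_search(candidate: Callable[[Sequence[int], int], int],
                        trials: int = 200, seed: int = 0) -> bool:
    """Compare candidate with the reference on random sorted lists.

    Lists hold distinct values, so the correct index is unique.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        items = sorted(rng.sample(range(100), rng.randint(0, 20)))
        target = rng.randrange(100)
        if candidate(items, target) != reference_search(items, target):
            return False
    return True
```

The random lists include the empty list and targets that are absent, which are the cases a hand-written binary search most often gets wrong.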


In [None]:
# Code explanation
response = client.simple_chat("""
Explain what this code does and identify any potential issues:

```python
def mystery(n):
    if n <= 1:
        return n
    return mystery(n-1) + mystery(n-2)
```
""")
print("Code Explanation:")
print(response)
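
For reference when reading the model's explanation: `mystery` computes Fibonacci numbers by naive recursion, which takes exponential time, and it returns negative inputs unchanged. The sketch below pairs it with an iterative version; `fib` is a name introduced here for comparison only.

```python
def mystery(n):
    # Function from the prompt, copied unchanged.
    if n <= 1:
        return n
    return mystery(n - 1) + mystery(n - 2)


def fib(n: int) -> int:
    """Iterative Fibonacci: linear time, constant space, rejects negative input."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# The two agree on small inputs; only the naive version slows down as n grows.
assert all(mystery(n) == fib(n) for n in range(15))
```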


## 3. Creative Writing

Test the model's creative and narrative abilities.


In [None]:
# Short story
response = client.simple_chat(
    "Write a short story (100-150 words) about a robot learning to paint.",
    temperature=0.9  # Higher temperature for creativity
)
print("Creative Story:")
print(response)


In [None]:
# Poetry
response = client.simple_chat(
    "Write a haiku about artificial intelligence.",
    temperature=0.8
)
print("AI Haiku:")
print(response)


## 4. Math and Calculations

Test mathematical reasoning and problem-solving.


In [None]:
# Word problem
response = client.simple_chat("""
A train leaves Station A at 9:00 AM traveling at 60 mph.
Another train leaves Station B (300 miles away) at 10:00 AM traveling toward Station A at 80 mph.
At what time will they meet? Show your work.
""")
print("Math Word Problem:")
print(response)
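
The expected answer can be worked out directly, which gives something to check the model's response against. The sketch uses exact fractions; the calendar date passed to `datetime` is arbitrary and only the time of day matters.

```python
from datetime import datetime, timedelta
from fractions import Fraction

distance = 300              # miles between the stations
speed_a, speed_b = 60, 80   # mph
head_start = 1              # hours train A travels alone (9:00 to 10:00)

# By 10:00 train A has covered 60 miles, so 240 miles remain between the trains.
gap = distance - speed_a * head_start
# From 10:00 on, the trains close the gap at the sum of their speeds.
hours_after_ten = Fraction(gap, speed_a + speed_b)   # 240/140 = 12/7 hours

# Arbitrary date; 10:00 is when the second train departs.
meet = datetime(2000, 1, 1, 10) + timedelta(hours=float(hours_after_ten))
print(hours_after_ten, meet.strftime("%H:%M"))
```

Twelve sevenths of an hour is about 1 hour 43 minutes, so the trains meet at roughly 11:43 AM.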


In [None]:
# Algebra
response = client.simple_chat("""
Solve for x: 3x² - 12x + 9 = 0
Show your solution step by step.
""")
print("Algebra Problem:")
print(response)
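
The roots can be confirmed with the quadratic formula. This short sketch relies on the discriminant being a perfect square, which holds for this equation but not in general.

```python
import math

a, b, c = 3, -12, 9
discriminant = b * b - 4 * a * c     # 144 - 108 = 36
root = math.isqrt(discriminant)      # exact here because 36 is a perfect square
solutions = sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})
print(solutions)
```

Equivalently, dividing by 3 gives x² - 4x + 3 = 0, which factors as (x - 1)(x - 3) = 0.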


## 5. Multi-turn Conversation Coherence

Test the model's ability to maintain context across multiple turns.


In [None]:
# Multi-turn context test
conversation = [
    {"role": "system", "content": "You are a helpful assistant with perfect memory."}
]

# Define questions that require remembering context
questions = [
    "My name is Alex and I have 3 cats named Luna, Milo, and Shadow. Luna is the oldest.",
    "What's my oldest cat's name?",
    "How many pets do I have in total?",
    "What's my name?"
]

print("Multi-turn Conversation Coherence Test:")
print("=" * 50)

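for q in questions:
    # Send the full history each turn so the model sees every earlier message
    conversation.append({"role": "user", "content": q})
    response = client.chat(conversation)
    answer = response.choices[0].message.content
    # Record the reply so it becomes part of the context for the next turn
    conversation.append({"role": "assistant", "content": answer})

    print(f"\nUser: {q}")
    print(f"Assistant: {answer}")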
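
Reading the transcript by eye works for four turns, but a simple automated check scales better. The sketch below scores each answer by looking for accepted substrings; the accepted terms and the names `contains_any` and `score_turns` are assumptions made for this example, and substring matching is deliberately crude (for instance, "3" also matches "13").

```python
def contains_any(answer: str, accepted: list[str]) -> bool:
    """Case-insensitive check that at least one accepted substring appears."""
    lowered = answer.lower()
    return any(term.lower() in lowered for term in accepted)


# Accepted substrings per turn (assumed for this sketch); the first turn only
# supplies facts, so it has nothing to check.
accepted_by_turn = [[], ["luna"], ["3", "three"], ["alex"]]


def score_turns(answers: list[str]) -> list[bool]:
    """Return one pass/fail flag per turn; turns with no accepted terms pass."""
    return [
        not accepted or contains_any(answer, accepted)
        for answer, accepted in zip(answers, accepted_by_turn)
    ]
```

To use it, collect each `answer` from the loop into a list and pass that list to `score_turns`.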


## Summary

Use this notebook to systematically evaluate MiniMax-M2.1 capabilities. Add more test cases as needed to explore specific areas of interest.
