# MCP Tools Demo - Manual Testing

This notebook tests the 3 MCP tools directly:
1. **single_mean** - Single mean hypothesis test
2. **compare_means** - Compare means between groups
3. **regress_fit** - Linear regression (expanded with vif, ivar, plots)

These tools will be called by the LLM in the next checkpoint.

## Setup: Load Data into MCP DATA_STORE

In [1]:
import sys

# Make the MCP server module importable; adjust this path to your local checkout
sys.path.insert(0, "/home/vnijs/gh/pyrsm/mcp-server")

import pyrsm
from server_regression import DATA_STORE, call_tool

# Load test datasets
print("Loading datasets...")
salary, _ = pyrsm.load_data(name="salary", pkg="basics")
DATA_STORE["salary"] = salary
print(f"✓ Loaded salary: {salary.shape}")
print(f"  Columns: {list(salary.columns)}")

diamonds, _ = pyrsm.load_data(name="diamonds", pkg="model")
DATA_STORE["diamonds"] = diamonds
print(f"✓ Loaded diamonds: {diamonds.shape}")
print(f"  Columns: {list(diamonds.columns)}")

Loading datasets...
✓ Loaded salary: (397, 6)
  Columns: ['salary', 'rank', 'discipline', 'yrs_since_phd', 'yrs_service', 'sex']
✓ Loaded diamonds: (3000, 11)
  Columns: ['price', 'carat', 'clarity', 'cut', 'color', 'depth', 'table', 'x', 'y', 'z', 'date']
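
The cells below call `await` at the top level, which works in Jupyter because an event loop is already running. In a plain Python script there is no running loop, so the same coroutine has to be started with `asyncio.run`. A minimal sketch of that pattern follows; the `call_tool` defined here is a hypothetical stand-in that only mimics the list-of-`.text` return shape seen in this notebook, not the real `server_regression.call_tool`.

```python
import asyncio
from types import SimpleNamespace


async def call_tool(name, arguments):
    # Hypothetical stand-in for server_regression.call_tool so the sketch
    # runs without the MCP server; it only mimics the return shape.
    return [SimpleNamespace(text=f"called {name} with {sorted(arguments)}")]


async def run_single_mean():
    result = await call_tool(
        name="single_mean",
        arguments={"data_name": "salary", "var": "salary", "comp_value": 100000},
    )
    return result[0].text


def main():
    # asyncio.run creates the event loop that a plain script does not have
    return asyncio.run(run_single_mean())


if __name__ == "__main__":
    print(main())
```

Inside a notebook, keep using `await` directly: `asyncio.run` raises a `RuntimeError` when called from an already running event loop.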


## Tool 1: single_mean

Test if the mean salary equals $100,000

In [2]:
async def test_single_mean():
    result = await call_tool(
        name="single_mean",
        arguments={
            "data_name": "salary",
            "var": "salary",
            "comp_value": 100000,
            "alt_hyp": "two-sided",
            "conf": 0.95,
            "dec": 2,
        },
    )
    return result[0].text


result = await test_single_mean()
print(result)

**Single Mean Hypothesis Test**

Generated code:
```python
import pyrsm
# Single mean test: salary
sm = pyrsm.basics.single_mean({'salary': salary}, var='salary', alt_hyp='two-sided', conf=0.95, comp_value=100000)
sm.summary(dec=2)
```

Output:
```
Single mean test
Data      : salary
Variables : salary
Confidence: 0.95
Comparison: 100000

Null hyp. : the mean of salary is equal to 100000
Alt. hyp. : the mean of salary is not equal to 100000

     mean   n  n_missing       sd      se     me
113706.46 397          0 30289.04 1520.16 2988.6
    diff      se  t.value p.value  df      2.5%     97.5%    
13706.46 1520.16     9.02  < .001 396 110717.86 116695.06 ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

```

**Suggested next steps:**
- Interpret the p-value and confidence interval
- Try different alternative hypotheses if needed
- Plot the results with plots=['hist']



## Tool 2: compare_means

Compare mean salary across academic ranks (AsstProf, AssocProf, Prof).
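
With three ranks, a pairwise comparison yields three tests, one per pair of groups. As a rough illustration of that enumeration, the sketch below recomputes only the `diff` column from the group means reported in this section's output; it does not reproduce the t-tests or p-values. The helper name `pairwise_diffs` is hypothetical and not part of pyrsm.

```python
from itertools import combinations

# Group means copied from the compare_means output in this section
group_means = {
    "AsstProf": 80775.99,
    "AssocProf": 93876.44,
    "Prof": 126772.11,
}


def pairwise_diffs(means):
    """Return {(a, b): mean_a - mean_b} for every unordered pair of groups."""
    return {
        (a, b): round(means[a] - means[b], 2)
        for a, b in combinations(means, 2)
    }


for (a, b), diff in pairwise_diffs(group_means).items():
    print(f"{a} - {b}: {diff:.2f}")
```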

In [3]:
async def test_compare_means():
    result = await call_tool(
        name="compare_means",
        arguments={
            "data_name": "salary",
            "var1": "rank",  # Categorical: Prof, AssocProf, AsstProf
            "var2": "salary",  # Numeric: salary to compare
            "alt_hyp": "two-sided",
            "conf": 0.95,
            "test_type": "t-test",
            "dec": 2,
        },
    )
    return result[0].text


result = await test_compare_means()
print(result)

**Compare Means Test**

Generated code:
```python
import pyrsm
# Compare means: rank vs salary
cm = pyrsm.basics.compare_means({'salary': salary}, var1='rank', var2='salary', alt_hyp='two-sided', conf=0.95, sample_type='independent', test_type='t-test')
cm.summary(dec=2)
```

Output:
```
Pairwise mean comparisons (t-test)
Data      : salary
Variables : rank, salary
Samples   : independent
Confidence: 0.95
Adjustment: None
     rank      mean   n  n_missing       sd      se      me
 AsstProf  80775.99  67          0  8174.11  998.63 1993.82
AssocProf  93876.44  64          0 13831.70 1728.96 3455.06
     Prof 126772.11 266          0 27718.67 1699.54 3346.32
           Null hyp.                       Alt. hyp.      diff p.value    
AsstProf = AssocProf AsstProf not equal to AssocProf -13100.45  < .001 ***
     AsstProf = Prof      AsstProf not equal to Prof -45996.12  < .001 ***
    AssocProf = Prof     AssocProf not equal to Prof -32895.67  < .001 ***

Signif. codes:  0 '***' 0.001 '**
```

## Tool 3: regress_fit (Expanded)

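Fit a regression using the newly added parameters. This cell exercises `vif` and `dec`; `ivar` is tested in the next section, and `plots` is not exercised in this notebook.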
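
For context on the `vif` option: by the standard definition, the variance inflation factor of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other explanatory variables. The sketch below assumes pyrsm follows this definition, which is consistent with the `vif` and `Rsq` columns printed later in this notebook (1.001 and 0.001). The function name `vif_from_rsq` is hypothetical.

```python
def vif_from_rsq(rsq):
    """Variance inflation factor for a predictor, given the R-squared from
    regressing that predictor on the other explanatory variables."""
    if not 0 <= rsq < 1:
        raise ValueError("rsq must be in [0, 1)")
    return 1.0 / (1.0 - rsq)


for rsq in (0.0, 0.5, 0.9):
    print(f"Rsq={rsq:.1f} -> VIF={vif_from_rsq(rsq):.1f}")
```

Values near 1 indicate that a predictor is nearly uncorrelated with the others; the factor grows without bound as R² approaches 1.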

In [4]:
async def test_regress_fit():
    result = await call_tool(
        name="regress_fit",
        arguments={
            "data_name": "diamonds",
            "rvar": "price",
            "evar": ["carat", "depth", "table"],
            "vif": True,  # Check multicollinearity
            "dec": 2,  # 2 decimal places
            "show_summary": True,
        },
    )
    return result[0].text


result = await test_regress_fit()
print(result)

✓ Model fitted and stored as: reg_f96aa384_1761786138

Dataset: diamonds (3000 rows)
Response: price
Predictors: carat, depth, table

Generated code:
```python
data, description = pyrsm.load_data(name='diamonds')
reg = pyrsm.model.regress(data, rvar='price', evar=['carat', 'depth', 'table'])
```

Summary:
```
Linear regression (OLS)
Data                 : Not provided
Response variable    : price
Explanatory variables: carat, depth, table
Null hyp.: the effect of x on price is zero
Alt. hyp.: the effect of x on price is not zero

           coefficient  std.error  t.value p.value     
Intercept    12594.059   1599.337    7.875  < .001  ***
carat         7844.699     57.572  136.258  < .001  ***
depth         -150.722     19.525   -7.719  < .001  ***
table          -97.629     12.826   -7.612  < .001  ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-squared: 0.864, Adjusted R-squared: 0.864
F-statistic: 6342.525 df(3, 2996), p.value < 0.001
Nr obs: 3,000

Varianc
```

## Test with Interaction Terms (ivar)
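
When testing `ivar` manually, it helps to confirm that the requested term actually appears in the returned text. In the output recorded in this section, the "Explanatory variables" line lists only `carat` and `depth`, and the generated code passes no `ivar`, which may mean the interaction was not applied. A minimal sketch of such a check, using a hypothetical helper `missing_terms` and an excerpt of the recorded output:

```python
def missing_terms(summary_text, terms):
    """Return the requested terms that never appear in the tool's text output."""
    return [term for term in terms if term not in summary_text]


# Excerpt copied from the output recorded in this section
recorded_output = """
Response variable    : price
Explanatory variables: carat, depth
"""

print(missing_terms(recorded_output, ["carat", "depth", "carat:depth"]))
```

This is a plain substring check, so it only flags terms that are entirely absent from the text; it says nothing about how the model was fitted.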

In [5]:
async def test_regress_with_interaction():
    result = await call_tool(
        name="regress_fit",
        arguments={
            "data_name": "diamonds",
            "rvar": "price",
            "evar": ["carat", "depth"],
            "ivar": ["carat:depth"],  # Interaction term
            "vif": True,
            "dec": 3,
        },
    )
    return result[0].text


result = await test_regress_with_interaction()
print(result)

✓ Model fitted and stored as: reg_1f726d7b_1761786143

Dataset: diamonds (3000 rows)
Response: price
Predictors: carat, depth

Generated code:
```python
data, description = pyrsm.load_data(name='diamonds')
reg = pyrsm.model.regress(data, rvar='price', evar=['carat', 'depth'])
```

Summary:
```
Linear regression (OLS)
Data                 : Not provided
Response variable    : price
Explanatory variables: carat, depth
Null hyp.: the effect of x on price is zero
Alt. hyp.: the effect of x on price is not zero

           coefficient  std.error  t.value p.value     
Intercept     4048.461   1149.806    3.521  < .001  ***
carat         7752.874     56.827  136.430  < .001  ***
depth         -102.008     18.621   -5.478  < .001  ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-squared: 0.861, Adjusted R-squared: 0.861
F-statistic: 9307.988 df(2, 2997), p.value < 0.001
Nr obs: 3,000

Variance inflation factors:

         vif    Rsq
carat  1.001  0.001
depth  1.001  0.0
```

## Summary

All 3 tools are working:
- ✅ **single_mean**: Hypothesis test for a single mean
- ✅ **compare_means**: Compare means between groups
- ✅ **regress_fit**: Linear regression with vif, ivar, dec, plots

**Next Steps**:
1. These tools are now defined in the MCP server
2. Checkpoint 3: LLM will select tools based on natural language prompts
3. Checkpoint 4: `%%mcp` magic will call LLM → tools automatically