# SkillMesh Getting Started

This notebook walks through the full SkillMesh workflow:

1. Install SkillMesh dependencies  
2. Load the registry  
3. Retrieve top-K expert cards  
4. Emit provider-ready context  
5. Demonstrate a simple mock agent loop

## 1. Install SkillMesh dependencies

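This cell installs the SkillMesh package and its dependencies from the local repository in editable mode (`-e`), so changes to the source take effect without reinstalling. The path `../..` assumes the notebook is run from its own directory, two levels below the repository root.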

In [1]:
# Install SkillMesh (editable) and its dependencies from the repository root
!pip install -e ../..

Obtaining file:///E:/ForOpensource/SkillMesh
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Checking if build backend supports build_editable: started
  Checking if build backend supports build_editable: finished with status 'done'
  Getting requirements to build editable: started
  Getting requirements to build editable: finished with status 'done'
  Preparing editable metadata (pyproject.toml): started
  Preparing editable metadata (pyproject.toml): finished with status 'done'
Building wheels for collected packages: skillmesh
  Building editable for skillmesh (pyproject.toml): started
  Building editable for skillmesh (pyproject.toml): finished with status 'done'
  Created wheel for skillmesh: filename=skillmesh-0.1.0-0.editable-py3-none-any.whl size=5534 sha256=6dfabd1ecad61c06188c5aa448167818c7e5ee7490ef689be1230fcc4dfc2684
  Stored in directory: C:\Users\DELL\AppData\Local\Temp\pip-ephem-wheel-cache-w74lwmao\wheels\26\fc\af\


[notice] A new release of pip is available: 23.2.1 -> 26.0.1
[notice] To update, run: python.exe -m pip install --upgrade pip
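
After the install finishes, it can help to confirm that the package is importable from the notebook kernel. The pip output above names the distribution `skillmesh`, while the later cells import `skill_registry_rag`, so the import name is the one to check. The following is a minimal sketch using only the standard library; the helper name `is_importable` is hypothetical and not part of SkillMesh.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found by the current interpreter."""
    # find_spec returns None when a top-level module is not installed.
    return importlib.util.find_spec(module_name) is not None


# Import name taken from the later cells of this notebook.
print("skill_registry_rag importable:", is_importable("skill_registry_rag"))
```

If this prints `False` right after a successful install, restarting the kernel is usually the first thing to try.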


## 2. Load the registry

This cell loads the SkillMesh registry, which contains all expert cards.

- `load_registry(path)` loads the registry JSON file. The path is relative to the notebook's working directory.
- `len(registry)` returns the number of expert cards loaded (92 in this run).

In [2]:
from skill_registry_rag.registry import load_registry

registry = load_registry("../registry/tools.json")

len(registry)

92
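
Because `"../registry/tools.json"` is a relative path, the load fails if the notebook is started from a different working directory. One way to make this more robust, sketched here under the assumption that the file lives in a `registry/` folder somewhere above the current directory, is to search parent directories for it. The helper `find_upwards` is hypothetical and not part of SkillMesh.

```python
from pathlib import Path
from typing import Optional


def find_upwards(start: Path, relative: str) -> Optional[Path]:
    """Search `start` and each of its parents for `relative`; return the first match."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / relative
        if candidate.is_file():
            return candidate
    return None  # nothing found all the way up to the filesystem root


registry_path = find_upwards(Path.cwd(), "registry/tools.json")
print(registry_path if registry_path else "registry/tools.json not found above the current directory")
```

The resulting path could then be passed to `load_registry(str(registry_path))` in place of the hard-coded string.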

## 3. Retrieve top-K expert cards

- `SkillRetriever(registry)` creates a retriever from the loaded registry.
- `user_query` is the user request, passed to `retriever.retrieve(...)`.
- `top_k=3` asks for the three expert cards that best match the query.

In [3]:
from skill_registry_rag.retriever import SkillRetriever

retriever = SkillRetriever(registry)

user_query = "clean messy sales data and generate charts"
hits = retriever.retrieve(user_query, top_k=3)

hits  # shows top 3 relevant expert cards

[RetrievalHit(card=ToolCard(id='role.data-analyst', title='Data Analyst Role Orchestrator', domain='role_orchestrator', instruction_file='roles/data-analyst.md', description='Role expert that orchestrates profiling, cleaning, EDA, baseline modeling, and visualization for tabular analytics tasks.', tags=['role', 'data-analyst', 'eda', 'pandas', 'visualization', 'baseline-modeling'], tool_hints=['data.pandas-advanced', 'data.sql-queries', 'viz.matplotlib-seaborn', 'ml.sklearn-modeling', 'stats.scipy-statsmodels'], examples=['Profile messy dataset, clean it, and deliver insight dashboard', 'Run EDA plus baseline prediction with leakage-safe validation'], aliases=['role-data-analyst', 'analytics-orchestrator'], dependencies=['data.pandas-advanced', 'data.sql-queries', 'viz.matplotlib-seaborn', 'ml.sklearn-modeling', 'stats.scipy-statsmodels'], output_artifacts=['data_profile_summary', 'eda_findings', 'visual_report', 'baseline_model_report'], quality_checks=['missingness_and_dtype_audit_co
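
The raw repr above is hard to scan. A compact per-hit summary may be easier to read. The sketch below uses stand-in dataclasses that mirror only a few of the fields visible in that repr (`id`, `title`, `domain`, `tags`); the real `ToolCard` and `RetrievalHit` classes have more fields, and `summarize_hits` is a hypothetical helper, not part of SkillMesh.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToolCard:  # stand-in: a subset of the fields shown in the repr above
    id: str
    title: str
    domain: str
    tags: List[str] = field(default_factory=list)


@dataclass
class RetrievalHit:  # stand-in: the real class may carry more fields
    card: ToolCard


def summarize_hits(hits: List[RetrievalHit], max_tags: int = 3) -> List[str]:
    """Return one short line per hit: rank, id, title, domain, and the first few tags."""
    lines = []
    for rank, hit in enumerate(hits, start=1):
        card = hit.card
        tags = ", ".join(card.tags[:max_tags])
        lines.append(f"{rank}. {card.id} | {card.title} | {card.domain} | tags: {tags}")
    return lines


demo_hits = [
    RetrievalHit(ToolCard(
        id="role.data-analyst",
        title="Data Analyst Role Orchestrator",
        domain="role_orchestrator",
        tags=["role", "data-analyst", "eda", "pandas"],
    )),
]
for line in summarize_hits(demo_hits):
    print(line)
```

Since the helper only reads `hit.card.<field>`, it should also accept the `hits` list returned by the retriever, provided those attribute names match.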

## 4. Emit provider-ready context

This step joins the `instruction_text` of each retrieved card into a single context string
that can be passed, together with the query, to an AI model (such as Claude or Codex).

In [4]:
context = "\n\n".join(hit.card.instruction_text for hit in hits)

print("USER QUERY:")
print(user_query)

print("\nRETRIEVED CONTEXT:")
print(context[:1000])  # prints first 1000 characters of combined instructions

USER QUERY:
clean messy sales data and generate charts

RETRIEVED CONTEXT:
# Data Analyst Role Expert

Use this role when the request needs end-to-end tabular analysis: data profiling, cleaning, exploratory analysis, baseline modeling, and clear visual communication.

## Allowed expert dependencies

- `data.pandas-advanced`
- `data.sql-queries`
- `viz.matplotlib-seaborn`
- `ml.sklearn-modeling`
- `stats.scipy-statsmodels`

## Execution behavior

1. Start with a data quality audit:
   nulls, dtypes, duplicates, outliers, key integrity, and temporal coverage.
2. Normalize and clean data using reproducible transformations.
3. Produce concise EDA:
   distributions, trends, segmentation, and relationship charts.
4. If prediction is requested, build a leakage-safe baseline model with validation metrics.
5. Explain findings in business terms:
   what changed, how much, and what action is implied.
6. End with caveats and next steps.

## Output contract

- `profile_summary`: row/column counts, 
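
Slicing with `context[:1000]` cuts the text mid-sentence, as the output above shows. An alternative, sketched here as an assumption rather than SkillMesh's own behavior, is to add a header per card and keep only whole cards that fit a character budget. `Card`, `Hit`, and `build_context` are hypothetical stand-ins; only the `card.id` and `card.instruction_text` attribute names come from the cells above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Card:  # stand-in for the card objects used above
    id: str
    instruction_text: str


@dataclass
class Hit:  # stand-in for a retrieval hit
    card: Card


def build_context(hits: List[Hit], max_chars: int = 1000) -> str:
    """Join instruction texts under per-card headers, keeping only whole cards within max_chars."""
    sections: List[str] = []
    used = 0
    for hit in hits:
        section = f"### Expert: {hit.card.id}\n{hit.card.instruction_text.strip()}"
        # Count the blank-line separator that precedes every section but the first.
        cost = len(section) + (2 if sections else 0)
        if used + cost > max_chars:
            break  # stop rather than cut a card in the middle
        sections.append(section)
        used += cost
    return "\n\n".join(sections)


demo = [Hit(Card("a", "x" * 10)), Hit(Card("b", "y" * 10))]
print(build_context(demo, max_chars=30))
```

Stopping at the first card that does not fit keeps the retrieval order intact: a lower-ranked card never appears without the higher-ranked ones before it.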

## 5. Demonstrate a simple mock agent loop

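- The cell iterates over several queries, as an agent handling successive requests would.  
- For each query, it retrieves the top-K expert cards and joins their instructions into a context string.  
- It prints the first 500 characters of each context; in a real agent, this context would be passed to an LLM at each step.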
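
To make the last point concrete, the sketch below runs the same kind of loop with a stub in place of the model. Everything in it is a hypothetical stand-in: `fake_retrieve` does a plain keyword match instead of real retrieval, `mock_llm` only reports the prompt length, and the prompt layout is an assumption rather than a format SkillMesh prescribes.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the registry: keyword -> instruction snippets.
FAKE_INDEX: Dict[str, List[str]] = {
    "clean": ["Normalize and clean data using reproducible transformations."],
    "charts": ["Produce concise EDA: distributions, trends, and relationship charts."],
}


def fake_retrieve(query: str, top_k: int = 3) -> List[str]:
    """Return snippets whose keyword appears in the query (keyword match, not real ranking)."""
    matches = [text for key, texts in FAKE_INDEX.items() if key in query for text in texts]
    return matches[:top_k]


def mock_llm(prompt: str) -> str:
    """Stub model: reports the prompt size instead of generating text."""
    return f"[mock reply to a {len(prompt)}-character prompt]"


def run_agent(queries: List[str], llm: Callable[[str], str]) -> List[Dict[str, str]]:
    """For each query: retrieve context, build a prompt, call the model, record the turn."""
    transcript = []
    for query in queries:
        context = "\n\n".join(fake_retrieve(query))
        prompt = f"Context:\n{context}\n\nTask: {query}"
        transcript.append({"query": query, "prompt": prompt, "reply": llm(prompt)})
    return transcript


for turn in run_agent(["clean messy sales data", "generate charts for sales"], mock_llm):
    print(turn["query"], "->", turn["reply"])
```

Passing the model in as a callable keeps the loop independent of any provider, so the stub can later be swapped for a real client without changing `run_agent`.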

In [5]:
queries = [
    "clean messy sales data",
    "generate charts for sales",
    "summarize findings"
]

for query in queries:
    hits = retriever.retrieve(query, top_k=3)
    context = "\n\n".join(hit.card.instruction_text for hit in hits)
    print(f"Query: {query}")
    print(context[:500])
    print("-"*50)

Query: clean messy sales data
# Data Analyst Role Expert

Use this role when the request needs end-to-end tabular analysis: data profiling, cleaning, exploratory analysis, baseline modeling, and clear visual communication.

## Allowed expert dependencies

- `data.pandas-advanced`
- `data.sql-queries`
- `viz.matplotlib-seaborn`
- `ml.sklearn-modeling`
- `stats.scipy-statsmodels`

## Execution behavior

1. Start with a data quality audit:
   nulls, dtypes, duplicates, outliers, key integrity, and temporal coverage.
2. Normaliz
--------------------------------------------------
Query: generate charts for sales
# Slide Creation Expert (PPTX)

Use this expert for executive summaries, project updates, and narrative decks.

## Execution behavior

1. Derive a slide storyline first (problem, analysis, findings, actions).
2. Allocate one key message per slide and keep text concise.
3. Generate PPTX using `python-pptx` with consistent templates.
4. Embed charts/tables as visuals instead of dense 