# Linear Issues Analysis with LangSmith Insights

This example demonstrates how to use LangSmith Insights to analyze Linear issues from your workspace. We'll:
1. Fetch issues from Linear using the GraphQL API
2. Store the issues in a CSV file for reference
3. Use LangSmith Insights to automatically cluster issues by theme and priority

## Prerequisites

You will need:
1. A Linear API key

<details>
<summary><b>How to get your Linear API key</b> (click to expand)</summary>

1. Go to [Linear Settings → API](https://linear.app/settings/api)
2. Scroll down to "Personal API keys"
3. Click "Create key"
4. Give your key a label (e.g., "LangSmith Insights")
5. Copy the generated key immediately (you won't be able to see it again)

**Permissions**: The API key inherits your user permissions, so you'll have access to all teams and issues you can see in Linear.

</details>

2. A LangSmith API key

<details>
<summary><b>How to get your LangSmith API key</b> (click to expand)</summary>

1. Go to [LangSmith Settings](https://smith.langchain.com/settings)
2. Click "Create API Key"
3. Copy your new API key

</details>

## Setup

Before running the notebook, set your API keys as environment variables in your terminal:
```bash
export LINEAR_API_KEY=your-linear-api-key-here
export LANGSMITH_API_KEY=your-langsmith-api-key
```

You can also configure optional filters:
```bash
# Optional: Filter by specific teams (comma-separated team keys)
export LINEAR_TEAM_KEYS="ENG,PRODUCT"

# Optional: Filter by status (e.g., "Backlog", "In Progress", "Done")
export LINEAR_STATUS_FILTER="Backlog"

# Optional: Only fetch issues created after this date (ISO 8601 format)
export LINEAR_CREATED_AFTER="2024-01-01T00:00:00Z"

# Optional: Limit the number of issues to fetch (default: 500)
export LINEAR_LIMIT="500"
```
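
These filters are plain strings, so a mistyped date only surfaces once Linear rejects the request. A minimal sketch of validating the date up front, assuming a hypothetical helper named `parse_created_after` (it is not part of this notebook):

```python
from datetime import datetime, timezone
from typing import Optional


def parse_created_after(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as "2024-01-01T00:00:00Z".

    Hypothetical helper for illustration: returns None when no filter is set
    and raises ValueError when the string is not a valid timestamp.
    """
    if not value:
        return None
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    # Treat timestamps without an offset as UTC so the filter is unambiguous.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


print(parse_created_after("2024-01-01T00:00:00Z"))
```

Calling a check like this on `os.getenv("LINEAR_CREATED_AFTER")` before building the request would turn a malformed value into an immediate `ValueError` rather than an API error later on.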

In [None]:
import csv
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional

import requests
from langsmith import Client

In [None]:
DATA_DIR = Path("data")
DATA_DIR.mkdir(parents=True, exist_ok=True)
CSV_PATH = DATA_DIR / "linear_issues.csv"

# Configure your Linear filters here, or set the matching environment
# variables described in the Setup section above.
LINEAR_TEAM_KEYS = [
    key.strip() 
    for key in os.getenv("LINEAR_TEAM_KEYS", "").split(",") 
    if key.strip()
]  # e.g., ["ENG", "PRODUCT"]
LINEAR_STATUS_FILTER = os.getenv("LINEAR_STATUS_FILTER", "")  # e.g., "Backlog"
LINEAR_CREATED_AFTER = os.getenv("LINEAR_CREATED_AFTER")  # e.g., "2024-01-01T00:00:00Z"
LINEAR_LIMIT = int(os.getenv("LINEAR_LIMIT", "500"))

LINEAR_ENDPOINT = "https://api.linear.app/graphql"

print("Configuration:")
print(f"  Team keys: {LINEAR_TEAM_KEYS or 'All teams'}")
print(f"  Status filter: {LINEAR_STATUS_FILTER or 'All statuses'}")
print(f"  Created after: {LINEAR_CREATED_AFTER or 'No date filter'}")
print(f"  Limit: {LINEAR_LIMIT} issues")
print(f"  CSV path: {CSV_PATH}")

## Define Linear GraphQL Query

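Linear exposes a GraphQL API. The query below requests one page of issues at a time (cursor pagination via `first` and `after`) and selects the fields used in the rest of this notebook: identifiers, title, description, priority, estimate, timestamps, assignee, team, state, and labels.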

In [None]:
LINEAR_QUERY = """
query Issues($after: String, $filter: IssueFilter, $limit: Int!) {
  issues(
    first: $limit,
    after: $after,
    filter: $filter,
    includeArchived: false
  ) {
    pageInfo {
      hasNextPage
      endCursor
    }
    nodes {
      id
      identifier
      title
      description
      priority
      estimate
      url
      createdAt
      updatedAt
      startedAt
      completedAt
      assignee { name email }
      team { name key }
      state { name type }
      labels { nodes { name } }
    }
  }
}
"""

print("GraphQL query defined")

## Fetch Linear Issues

We'll fetch issues from Linear, handling pagination automatically. This may take a moment depending on how many issues match your filters.

In [None]:
LINEAR_API_KEY = os.getenv("LINEAR_API_KEY")
if not LINEAR_API_KEY:
    raise RuntimeError(
        "LINEAR_API_KEY environment variable is required. "
        "Get an API key at https://linear.app/settings/api"
    )

# Build filter based on configuration
issue_filter = {}
if LINEAR_TEAM_KEYS:
    issue_filter["team"] = {"key": {"in": LINEAR_TEAM_KEYS}}
if LINEAR_STATUS_FILTER:
    issue_filter["state"] = {"name": {"eq": LINEAR_STATUS_FILTER}}
if LINEAR_CREATED_AFTER:
    issue_filter["createdAt"] = {"gte": LINEAR_CREATED_AFTER}

print(f"Active filters: {issue_filter or 'None'}\n")

# Fetch issues with pagination
issues = []
after = None

headers = {
    "Authorization": LINEAR_API_KEY,
    "Content-Type": "application/json",
}

while len(issues) < LINEAR_LIMIT:
    variables = {
        "after": after,
        "filter": issue_filter or None,
        # Request only what is still needed. Linear API max is 250; we cap pages at 100 for safety.
        "limit": min(100, LINEAR_LIMIT - len(issues)),
    }
    
    payload = {"query": LINEAR_QUERY, "variables": variables}
    response = requests.post(LINEAR_ENDPOINT, json=payload, headers=headers, timeout=30)
    
    try:
        response.raise_for_status()
    except requests.HTTPError:
        print(f"Linear API error: {response.text}")
        raise
    
    data = response.json()
    if "errors" in data:
        raise RuntimeError(f"Linear API error: {data['errors']}")
    
    page_data = data["data"]["issues"]
    nodes = page_data.get("nodes", [])
    issues.extend(nodes)
    print(f"Fetched {len(nodes)} issues (total: {len(issues)})")
    
    if not page_data["pageInfo"]["hasNextPage"]:
        break
    
    after = page_data["pageInfo"]["endCursor"]

# Limit to requested amount
issues = issues[:LINEAR_LIMIT]

print(f"\nCollected {len(issues)} issues from Linear")
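
The cursor-following logic in the loop above can also be expressed as a reusable generator. The following is a sketch only: `paginate` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real HTTP call so the example runs without a Linear API key. It assumes each page has the same `nodes` and `pageInfo` shape as the query response.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def paginate(
    fetch_page: Callable[[Optional[str], int], Page],
    limit: int,
    page_size: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield up to `limit` nodes, following `endCursor` until pages run out."""
    after: Optional[str] = None
    yielded = 0
    while yielded < limit:
        # Never ask for more than is still needed to reach the limit.
        page = fetch_page(after, min(page_size, limit - yielded))
        for node in page["nodes"]:
            if yielded >= limit:
                return
            yield node
            yielded += 1
        if not page["pageInfo"]["hasNextPage"]:
            return
        after = page["pageInfo"]["endCursor"]


def fake_fetch(after: Optional[str], first: int) -> Page:
    """Stand-in for the Linear request: serves 7 made-up issues in pages."""
    data = [{"identifier": f"ENG-{i}"} for i in range(1, 8)]
    start = int(after) if after else 0
    chunk = data[start:start + first]
    end = start + len(chunk)
    return {
        "nodes": chunk,
        "pageInfo": {"hasNextPage": end < len(data), "endCursor": str(end)},
    }


fetched = list(paginate(fake_fetch, limit=5, page_size=3))
print([issue["identifier"] for issue in fetched])
```

Passing the fetch function in as an argument keeps the pagination logic testable without network access; in the notebook, that function would wrap the `requests.post` call.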

## Preview Issue Data

Let's look at an example issue to see what data we have:

In [None]:
if issues:
    example = issues[0]
    labels = ", ".join(
        label["name"] for label in example.get("labels", {}).get("nodes", [])
    ) or "None"
    
    print(f"Issue {example['identifier']}: {example['title']}")
    print(f"Team: {example.get('team', {}).get('name')}")
    print(f"State: {example.get('state', {}).get('name')} ({example.get('state', {}).get('type')})")
    print(f"Priority: {example.get('priority')} | Estimate: {example.get('estimate') or 'Not set'}")
    print(f"Assignee: {(example.get('assignee') or {}).get('name') or 'Unassigned'}")
    print(f"Labels: {labels}")
    print(f"Created: {example.get('createdAt')}")
    print(f"URL: {example['url']}")
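    description = example.get("description") or "No description"
    suffix = "..." if len(description) > 300 else ""  # only mark text that was actually truncated
    print(f"\nDescription (first 300 chars):\n{description[:300]}{suffix}")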

## Save Issues to CSV

Save the issues locally for reference and future analysis.

In [None]:
if issues:
    fieldnames = [
        "id", "identifier", "title", "description", "team", "state", "state_type",
        "priority", "estimate", "assignee", "labels", "created_at", "updated_at",
        "started_at", "completed_at", "url"
    ]
    
    with CSV_PATH.open("w", newline="", encoding="utf-8") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        for issue in issues:
            labels = ", ".join(
                label["name"] for label in issue.get("labels", {}).get("nodes", [])
            )
            writer.writerow({
                "id": issue.get("id"),
                "identifier": issue.get("identifier"),
                "title": issue.get("title"),
                "description": issue.get("description"),
                "team": issue.get("team", {}).get("name"),
                "state": issue.get("state", {}).get("name"),
                "state_type": issue.get("state", {}).get("type"),
                "priority": issue.get("priority"),
                "estimate": issue.get("estimate"),
                "assignee": (issue.get("assignee") or {}).get("name"),
                "labels": labels,
                "created_at": issue.get("createdAt"),
                "updated_at": issue.get("updatedAt"),
                "started_at": issue.get("startedAt"),
                "completed_at": issue.get("completedAt"),
                "url": issue.get("url"),
            })
    print(f"Saved {len(issues)} issues to {CSV_PATH}")
else:
    print("No issues to save")
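
Once the file exists, it can be reloaded for quick sanity checks before any clustering. A minimal sketch, assuming a hypothetical helper `count_by_column`. The sample rows are made up and reuse only a few of the export's column names; real usage would pass `CSV_PATH.open(newline="", encoding="utf-8")` instead of the in-memory file.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def count_by_column(csvfile: TextIO, column: str) -> Counter:
    """Count rows per distinct value of `column` in a CSV with a header row."""
    reader = csv.DictReader(csvfile)
    # Empty cells (e.g. issues without an estimate) are grouped under "(empty)".
    return Counter(row[column] or "(empty)" for row in reader)


# Made-up stand-in data for illustration only.
sample = io.StringIO(
    "identifier,state,priority\n"
    "ENG-1,Backlog,2\n"
    "ENG-2,In Progress,1\n"
    "ENG-3,Backlog,\n"
)
state_counts = count_by_column(sample, "state")
print(state_counts.most_common())
```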

## Analyze with LangSmith Insights

Now we'll send the issues to LangSmith Insights for automatic clustering and analysis. The Insights API will:
- Identify common themes across issues (bugs, feature requests, technical debt, etc.)
- Group similar issues together
- Analyze patterns by priority and team
- Generate summaries for each cluster
- Provide a visual interface to explore the results

This is particularly useful for:
- Understanding what your team is working on
- Identifying bottlenecks and recurring issues
- Prioritizing roadmap items across teams
- Getting insights into technical debt and infrastructure needs

In [None]:
# Format issues for the Insights API
# Each issue is converted to a chat history format with structured information
chat_histories = []

for issue in issues:
    labels = (
        ", ".join(label["name"] for label in issue.get("labels", {}).get("nodes", []))
        or "(none)"
    )
    
    content = (
        f"Linear Issue {issue.get('identifier')} - {issue.get('title')}\n"
        f"Team: {issue.get('team', {}).get('name')} | State: {issue.get('state', {}).get('name')}\n"
        f"Priority: {issue.get('priority')} | Estimate: {issue.get('estimate')}\n"
        f"Assignee: {(issue.get('assignee') or {}).get('name')} | Labels: {labels}\n"
        f"URL: {issue.get('url')} | Created: {issue.get('createdAt')}\n\n"
        f"{issue.get('description') or '(no description)'}"
    )
    
    chat_histories.append([{"role": "user", "content": content}])

print(f"Prepared {len(chat_histories)} issues for analysis")
if chat_histories:
    print(f"\nExample formatted issue:\n{chat_histories[0][0]['content'][:500]}...")
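
The formatted text above passes `priority` through as a raw number. One optional refinement is to translate it into a readable label before clustering. The sketch below assumes Linear's priority scale runs from 0 (no priority) through 1 (urgent) to 4 (low); the `PRIORITY_LABELS` mapping and `priority_label` helper are hypothetical, so check the mapping against the Linear API documentation before relying on it.

```python
from typing import Optional, Union

# Assumed mapping of Linear's numeric priorities to labels (verify before use).
PRIORITY_LABELS = {0: "No priority", 1: "Urgent", 2: "High", 3: "Normal", 4: "Low"}


def priority_label(priority: Optional[Union[int, float]]) -> str:
    """Return a readable label for a numeric priority, tolerating gaps."""
    if priority is None:
        return "Unknown"
    # The value may arrive as a float in JSON, so normalize it to an int key.
    return PRIORITY_LABELS.get(int(priority), f"Unknown ({priority})")


for value in (0, 1, 2.0, None, 9):
    print(value, "->", priority_label(value))
```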

In [None]:
LANGSMITH_API_KEY = os.getenv("LANGSMITH_API_KEY")
if not LANGSMITH_API_KEY:
    raise RuntimeError(
        "LANGSMITH_API_KEY environment variable is required. "
        "Get an API key at https://smith.langchain.com/settings"
    )

client = Client(api_key=LANGSMITH_API_KEY)

report = client.generate_insights(
    chat_histories=chat_histories,
    instructions=(
        "These are Linear issues from our workspace. "
        "Cluster them into actionable themes (bugs, roadmap gaps, infrastructure debt, etc.). "
        "Help identify patterns in what teams are working on and where priorities lie."
    ),
    name=f"linear-issues-{datetime.now(timezone.utc):%Y%m%d-%H%M}",
)

print("\nInsights report generated successfully!")
print(f"View your results at: {report.url if hasattr(report, 'url') else 'Check LangSmith UI'}")