Description:

Classifies the files of a sample microservice project, PlantBasedPizza.LoyaltyPoints, according to business and technical concerns.
Classification results are stored as output.csv in the classification/results folder, in a subfolder named after the LLM that performed the classification.

Key Features:

- LLM Provider Abstraction: Strategy pattern with OpenAI, Anthropic, and Ollama providers
- File Scanner: Scans all 4 LoyaltyPoints projects for .cs and appsettings.json files
- Semantic Classification: Extracts business context, rules, workflows, and integration points
- CSV Output: Structured results for the vector database
- Intermediate Results: JSON files for debugging and comparison
- Provider Comparison: Test all 3 LLM providers on the same code
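
To make the "LLM Provider Abstraction" bullet concrete, here is a minimal sketch of the Strategy pattern as it might apply here. The names LLMProvider, EchoProvider, Classifier, and complete are hypothetical and chosen for illustration; the real llm_providers package may expose a different interface.

```python
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Hypothetical strategy interface: turns a prompt into a text completion."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Human-readable provider and model name."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Return the model's answer for the given prompt."""


class EchoProvider(LLMProvider):
    """Stand-in strategy for illustration; a real provider would call an LLM API."""

    def __init__(self, model: str) -> None:
        self.model = model

    @property
    def name(self) -> str:
        return f"Echo-{self.model}"

    def complete(self, prompt: str) -> str:
        return f"[{self.model}] {prompt}"


class Classifier:
    """Context class: depends only on the LLMProvider interface, not on a vendor."""

    def __init__(self, provider: LLMProvider) -> None:
        self.provider = provider

    def classify(self, file_name: str, source: str) -> dict:
        prompt = f"Classify {file_name}:\n{source}"
        return {
            "file": file_name,
            "provider": self.provider.name,
            "raw": self.provider.complete(prompt),
        }


result = Classifier(EchoProvider("demo")).classify("Program.cs", "class A {}")
print(result["provider"])
```

Because the context only sees the interface, swapping one provider for another changes a single constructor argument, which is how the cells below reuse the same pipeline with different providers.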

In [1]:
from classification.code_scanner import CodeScanner
from pathlib import Path
from llm_providers.ollama import OllamaProvider
from classification.classification_pipeline import ClassificationPipeline
from dotenv import load_dotenv
import os

load_dotenv()

# Set LOYALTY_POINTS_PROJECT_ROOT in .env to the LoyaltyPoints source folder
PROJECT_ROOT = os.getenv("LOYALTY_POINTS_PROJECT_ROOT")
print(PROJECT_ROOT)

projects = [
    "PlantBasedPizza.LoyaltyPoints.Api.csproj",
    "PlantBasedPizza.LoyaltyPoints.Internal.csproj",
    "PlantBasedPizza.LoyaltyPoints.Shared.csproj",
    "PlantBasedPizza.LoyaltyPoints.Worker.csproj"
]

def create_model_folder(model):
    """Create the results folder for a model and return its output paths."""
    root_dir = Path("../classification/results") / model
    # parents=True so the call also works when the results folder does not exist yet
    root_dir.mkdir(parents=True, exist_ok=True)

    output_csv = str(root_dir / "output.csv")
    intermediate_results_dir = str(root_dir / "intermediate_results")
    return output_csv, intermediate_results_dir

code_scanner = CodeScanner(PROJECT_ROOT, projects)



D:/src/learning/dotnet/event-driven-course/module5/src/PlantBasedPizza.LoyaltyPoints/application


In [2]:
from llm_providers.anthropic import AnthropicProvider
# Read the key from the environment (.env is loaded by load_dotenv() above); never hardcode secrets in the notebook
anthropic_key = os.environ.get("ANTHROPIC_API_KEY")
if not anthropic_key:
    raise RuntimeError("ANTHROPIC_API_KEY is not set")
output_csv, intermediate_results_dir = create_model_folder("claude3.5")
anth_provider = AnthropicProvider(api_key=anthropic_key, model="claude-3-5-sonnet-latest")
pipeline = ClassificationPipeline(anth_provider,
                                  scanner=code_scanner,
                                  output_csv=output_csv,
                                  intermediate_dir=intermediate_results_dir)

print("\n=== Running Classification with Anthropic ===")
anthropic_results = pipeline.run_classification()

print("\n=== Classification Complete ===")
print(f"Results saved to: {output_csv}")
print(f"Intermediate results in: {intermediate_results_dir}")

Initialized Anthropic provider with model: claude-3-5-sonnet-latest

=== Running Classification with Anthropic ===
Starting classification with provider: Anthropic-claude-3-5-sonnet-latest
Scanning code files...
Scanned 6 files from PlantBasedPizza.LoyaltyPoints.Api.csproj
Scanned 7 files from PlantBasedPizza.LoyaltyPoints.Internal.csproj
Scanned 13 files from PlantBasedPizza.LoyaltyPoints.Shared.csproj
Scanned 7 files from PlantBasedPizza.LoyaltyPoints.Worker.csproj
Found 33 files to classify
Classifying files...
Classifying 1/33: PlantBasedPizza.LoyaltyPoints.Api\ObservabilityExtensions.cs
Classifying 2/33: PlantBasedPizza.LoyaltyPoints.Api\Program.cs
Classifying 3/33: PlantBasedPizza.LoyaltyPoints.Api\appsettings.Development.json
Classifying 4/33: PlantBasedPizza.LoyaltyPoints.Api\appsettings.json
Classifying 5/33: PlantBasedPizza.LoyaltyPoints.Api\bin\Debug\net9.0\appsettings.Development.json
Classifying 6/33: PlantBasedPizza.LoyaltyPoints.Api\bin\Debug\net9.0\appsettings.json
Clas
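
The log above shows copies of appsettings files under bin\Debug\net9.0 being classified alongside the originals, which spends LLM calls on build output. The sketch below shows one way a scanner could skip such folders. It is not the actual CodeScanner implementation: find_source_files and EXCLUDED_DIRS are hypothetical names, and treating bin and obj as build output is an assumption based on the usual .NET layout.

```python
from pathlib import Path

# Assumption: bin and obj are the standard .NET build output folders
EXCLUDED_DIRS = {"bin", "obj"}


def find_source_files(project_dir: Path) -> list[Path]:
    """Return .cs and appsettings*.json files, skipping build output folders."""
    matches = []
    for path in project_dir.rglob("*"):
        if not path.is_file():
            continue
        # Only inspect folders below project_dir, so the location of the
        # project itself never triggers the exclusion
        parent_parts = path.relative_to(project_dir).parts[:-1]
        if any(part.lower() in EXCLUDED_DIRS for part in parent_parts):
            continue
        is_cs = path.suffix == ".cs"
        is_settings = path.name.startswith("appsettings") and path.suffix == ".json"
        if is_cs or is_settings:
            matches.append(path)
    return sorted(matches)
```

With a filter like this, the bin\Debug duplicates would no longer be sent for classification, so the total would drop below the 33 files reported above.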

In [3]:
output_csv, intermediate_results_dir = create_model_folder("codellama")
ollama_provider = OllamaProvider(model="codellama:7b", base_url="http://localhost:11434")
pipeline = ClassificationPipeline(ollama_provider,
                                  scanner=code_scanner,
                                  output_csv=output_csv,
                                  intermediate_dir=intermediate_results_dir)

print("\n=== Running Classification with Ollama ===")
ollama_results = pipeline.run_classification()

print("\n=== Classification Complete ===")
print(f"Results saved to: {output_csv}")
print(f"Intermediate results in: {intermediate_results_dir}")

Initialized Ollama provider with model: codellama:7b

=== Running Classification with Ollama ===
Starting classification with provider: Ollama-codellama:7b
Scanning code files...
Scanned 6 files from PlantBasedPizza.LoyaltyPoints.Api.csproj
Scanned 7 files from PlantBasedPizza.LoyaltyPoints.Internal.csproj
Scanned 13 files from PlantBasedPizza.LoyaltyPoints.Shared.csproj
Scanned 7 files from PlantBasedPizza.LoyaltyPoints.Worker.csproj
Found 33 files to classify
Classifying files...
Classifying 1/33: PlantBasedPizza.LoyaltyPoints.Api\ObservabilityExtensions.cs
Classifying 2/33: PlantBasedPizza.LoyaltyPoints.Api\Program.cs
Classifying 3/33: PlantBasedPizza.LoyaltyPoints.Api\appsettings.Development.json
Classifying 4/33: PlantBasedPizza.LoyaltyPoints.Api\appsettings.json
Classifying 5/33: PlantBasedPizza.LoyaltyPoints.Api\bin\Debug\net9.0\appsettings.Development.json
Classifying 6/33: PlantBasedPizza.LoyaltyPoints.Api\bin\Debug\net9.0\appsettings.json
Classifying 7/33: PlantBasedPizza.Loy

In [4]:
from llm_providers.openai import OpenAIProvider

openai_key = os.getenv("OPENAI_API_KEY")
output_csv, intermediate_results_dir = create_model_folder("gpt4.1")
openai_provider = OpenAIProvider(api_key=openai_key, model="gpt-4.1-2025-04-14")
pipeline = ClassificationPipeline(openai_provider,
                                  scanner=code_scanner,
                                  output_csv=output_csv,
                                  intermediate_dir=intermediate_results_dir)

print("\n=== Running Classification with OpenAI ===")
openai_results = pipeline.run_classification()

print("\n=== Classification Complete ===")
print(f"Results saved to: {output_csv}")
print(f"Intermediate results in: {intermediate_results_dir}")

Initialized OpenAI provider with model: gpt-4.1-2025-04-14

=== Running Classification with OpenAI ===
Starting classification with provider: OpenAI-gpt-4.1-2025-04-14
Scanning code files...
Scanned 6 files from PlantBasedPizza.LoyaltyPoints.Api.csproj
Scanned 7 files from PlantBasedPizza.LoyaltyPoints.Internal.csproj
Scanned 13 files from PlantBasedPizza.LoyaltyPoints.Shared.csproj
Scanned 7 files from PlantBasedPizza.LoyaltyPoints.Worker.csproj
Found 33 files to classify
Classifying files...
Classifying 1/33: PlantBasedPizza.LoyaltyPoints.Api\ObservabilityExtensions.cs
Classifying 2/33: PlantBasedPizza.LoyaltyPoints.Api\Program.cs
Classifying 3/33: PlantBasedPizza.LoyaltyPoints.Api\appsettings.Development.json
Classifying 4/33: PlantBasedPizza.LoyaltyPoints.Api\appsettings.json
Classifying 5/33: PlantBasedPizza.LoyaltyPoints.Api\bin\Debug\net9.0\appsettings.Development.json
Classifying 6/33: PlantBasedPizza.LoyaltyPoints.Api\bin\Debug\net9.0\appsettings.json
Classifying 7/33: PlantBa
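
Cells 2 to 4 repeat the same steps with a different provider each time. As a sketch of how the comparison could be driven from one loop, the helper below creates a results folder per model and delegates the actual run to a callback. run_for_models and run_one are hypothetical names introduced here; the folder layout mirrors create_model_folder above.

```python
from pathlib import Path
from typing import Any, Callable


def run_for_models(
    providers: dict[str, Any],
    run_one: Callable[[Any, str, str], Any],
    results_root: Path = Path("../classification/results"),
) -> dict[str, Any]:
    """Run the same classification once per provider and collect the results.

    providers maps a results-folder name (e.g. "codellama") to a provider object.
    run_one(provider, output_csv, intermediate_dir) is a hypothetical callback
    that performs one classification run and returns its results.
    """
    results = {}
    for folder_name, provider in providers.items():
        model_dir = results_root / folder_name
        model_dir.mkdir(parents=True, exist_ok=True)
        output_csv = str(model_dir / "output.csv")
        intermediate_dir = str(model_dir / "intermediate_results")
        print(f"=== Running classification for {folder_name} ===")
        results[folder_name] = run_one(provider, output_csv, intermediate_dir)
    return results
```

In this notebook, run_one could wrap the ClassificationPipeline(...).run_classification() call from the cells above, with providers keyed by "claude3.5", "codellama", and "gpt4.1".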