# Release Train Metro Plan Analysis

This notebook analyzes project relationships and generates visualization data for the release train metro plan.

## Data Sources

The analysis uses four data sources from OpenRewrite recipe runs:
- **ProjectCoordinates.csv**: Maven/Gradle project identifiers (groupId, artifactId) 
- **DependenciesInUse.csv**: Dependencies between projects
- **ParentRelationships.csv**: Parent POM and Gradle parent project relationships
- **UnusedDependencies.csv**: Import patterns to identify potentially unused dependencies

## Visualization Link Types

The generated metro plan visualization supports three connection types:
- **Dependency links** (blue solid): Normal dependencies between projects
- **Parent links** (red solid): Parent POM or Gradle parent relationships  
- **Unused links** (orange dashed): Potentially unused dependencies that should be reviewed

Run the enhanced `ReleaseMetroPlan` recipe to generate all data files, then execute this notebook to create the visualization data.

In [1]:
%use dataframe
val projectIds = DataFrame.read("/Users/merlin/IdeaProjects/private/Release-Train-Metro-Plan/src/main/kotlin/data/ProjectCoordinates.csv")
val dependencies = DataFrame.read("/Users/merlin/IdeaProjects/private/Release-Train-Metro-Plan/src/main/kotlin/data/DependenciesInUse.csv")
val parentRelationships = DataFrame.read("/Users/merlin/IdeaProjects/private/Release-Train-Metro-Plan/src/main/kotlin/data/ParentRelationships.csv")

print("Loaded ${projectIds.rowsCount()} projects, ${dependencies.rowsCount()} dependencies, and ${parentRelationships.rowsCount()} parent relationships.")

Loaded 78 projects, 842454 dependencies, and 0 parent relationships.
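
The load calls above repeat one absolute, machine-specific path. A minimal sketch of resolving the data directory once, assuming a hypothetical `METRO_DATA_DIR` environment variable (not defined by the project) with a fallback to the relative `src/main/kotlin/data` directory:

```kotlin
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical environment variable name; the project itself does not define it.
val dataDirEnv = "METRO_DATA_DIR"

// Resolves the data directory from the given environment,
// falling back to a path relative to the working directory.
fun dataDir(env: Map<String, String> = System.getenv()): Path =
    Paths.get(env[dataDirEnv] ?: "src/main/kotlin/data")

// Builds the full path of one CSV file inside the data directory.
fun dataFile(name: String, env: Map<String, String> = System.getenv()): String =
    dataDir(env).resolve(name).toString()

fun main() {
    // Passing the environment explicitly keeps the demo independent of the machine.
    println(dataFile("ProjectCoordinates.csv", mapOf(dataDirEnv to "data")))
}
```

In the notebook, each load would then read `DataFrame.read(dataFile("ProjectCoordinates.csv"))`, and the same helper could serve the later `UnusedDependencies.csv` reads.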

In [2]:
import java.nio.file.Files
import java.nio.file.StandardOpenOption
import kotlin.io.path.Path
import kotlin.io.path.createFile

data class Artifact(val group: String?,
                   val artifact: String,
                   var parent: Artifact? = null) {
    override fun equals(other: Any?): Boolean = other is Artifact && other.group == group && other.artifact == artifact
    override fun hashCode(): Int = group.hashCode() * 31 + artifact.hashCode()
}

data class Repository(val path: String, val artifacts: Set<Artifact>,  val dependencies: Set<Artifact>) {
    override fun equals(other: Any?): Boolean = other is Repository && other.path == path
    override fun hashCode(): Int = path.hashCode()
}

val repos = mutableListOf<Repository>()

// Create repositories from project coordinates
projectIds
    .select { it["repositoryPath", "repositoryBranch", "groupId", "artifactId"] }
    .filter { it["repositoryBranch"] == "master" || it["repositoryBranch"] == "main" }
    .groupBy { it["repositoryPath"] }
    .forEach { groupEntry ->
        val repoPath = groupEntry.key["repositoryPath"] as String
        val repoGroup = groupEntry.group
        
        val repoArtifacts = repoGroup
            .map { row -> Artifact(row["groupId"] as String?, row["artifactId"] as String) }
            .toSet()
        
        val repoDependencies = dependencies
            .select { it["repositoryPath", "repositoryBranch", "groupId", "artifactId"] }
            .filter { it["repositoryBranch"] == "master" || it["repositoryBranch"] == "main" }
            .filter { it["repositoryPath"] == repoPath }
            .map { row -> Artifact(row["groupId"] as String?, row["artifactId"] as String) }
            .toSet()
        
        repos.add(Repository(repoPath, repoArtifacts, repoDependencies))
    }

// Process parent relationships from the new ParentRelationships DataTable (only when it contains rows)
if (parentRelationships.rowsCount() > 0) {
    parentRelationships
        .select { it["repositoryPath", "repositoryBranch", "childArtifactId", "parentGroupId", "parentArtifactId"] }
        .filter { it["repositoryBranch"] == "master" || it["repositoryBranch"] == "main" }
        .forEach { row ->
            repos.firstOrNull { r -> r.path == row["repositoryPath"] as String }
                ?.artifacts?.firstOrNull { a -> a.artifact == row["childArtifactId"] as String }
                ?.let { a -> a.parent = Artifact(row["parentGroupId"] as String?, row["parentArtifactId"] as String) }
        }
} else {
    println("No parent relationships found - skipping parent relationship processing")
}

println("derived ${repos.size} repositories from the data, containing ${repos.sumOf { it.artifacts.size }} artifacts, ${repos.sumOf { it.dependencies.size }} dependencies, and ${repos.sumOf { r -> r.artifacts.count { it.parent != null } }} parent relationships.")

No parent relationships found - skipping parent relationship processing
derived 38 repositories from the data, containing 71 artifacts, 3820 dependencies, and 0 parent relationships.
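
The linking step that follows looks up, for every dependency, the repository that publishes it. A minimal sketch of that lookup using a prebuilt index instead of a repeated linear search; `Coord` and `Repo` are hypothetical stand-ins for the notebook's `Artifact` and `Repository`, and the "first owner wins" rule is an assumption chosen to mirror `repos.find`:

```kotlin
// Hypothetical stand-ins for the notebook's Artifact and Repository classes.
data class Coord(val group: String?, val artifact: String)
data class Repo(val path: String, val artifacts: Set<Coord>, val dependencies: Set<Coord>)

// Maps each published artifact to the repository that owns it.
// On duplicates the first repository in list order wins, like repos.find { ... }.
fun ownerIndex(repos: List<Repo>): Map<Coord, String> {
    val index = mutableMapOf<Coord, String>()
    for (repo in repos) {
        for (artifact in repo.artifacts) index.putIfAbsent(artifact, repo.path)
    }
    return index
}

// Derives (source, target) pairs: source uses an artifact that target publishes.
// Dependencies owned by no known repository and self-references are dropped.
fun dependencyPairs(repos: List<Repo>): Set<Pair<String, String>> {
    val index = ownerIndex(repos)
    return repos.flatMap { repo ->
        repo.dependencies
            .mapNotNull { index[it] }
            .filter { it != repo.path }
            .map { repo.path to it }
    }.toSet()
}

fun main() {
    val core = Coord("org.example", "core")
    val app = Coord("org.example", "app")
    val external = Coord("com.other", "lib")
    val repos = listOf(
        Repo("example/core", setOf(core), setOf(external)),
        Repo("example/app", setOf(app), setOf(core, app))
    )
    println(dependencyPairs(repos))
}
```

Building the index once makes each lookup a hash access, which may matter when the dependency table is as large as the one loaded here.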


In [4]:
// Generate connections between repositories including parent relationships and unused dependencies
enum class LinkType { parent, dependency, unused }
data class Link(val src: String, val dist: String, val type: LinkType) {
    fun asD3() : String = "{ source: \"${src}\", target: \"${dist}\", type: \"${type}\" }"
}
data class Node(val id: String) {
    fun asD3() : String = "{ id: \"${id}\" }"
}

val edges = mutableSetOf<Link>()

for (repo in repos) {
    // Add parent relationships: if artifact A has parent B, create link from A's repo to B's repo
    for (artifact in repo.artifacts) {
        if (artifact.parent != null) {
            val parentRepo = repos.find { it.artifacts.contains(artifact.parent) }
            if (parentRepo != null && parentRepo.path != repo.path) {
                edges.add(Link(repo.path, parentRepo.path, LinkType.parent))
            }
        }
    }
    
    // Add dependency relationships: if repo uses dependency D, create link from repo to D's repo
    repo.dependencies
        .map { dep -> repos.find { it.artifacts.contains(dep) }?.path }
        .filterNotNull()
        .filter { it != repo.path }
        .map { Link(repo.path, it, LinkType.dependency) }
        .forEach { edges.add(it) }
}

// Add unused dependency relationships if UnusedDependencies data is available
try {
    val unusedDeps = DataFrame.read("/Users/merlin/IdeaProjects/private/Release-Train-Metro-Plan/src/main/kotlin/data/UnusedDependencies.csv")
    
    // Group unused dependencies by repository and find potential unused links
    val unusedByRepo = unusedDeps
        .groupBy { it["repositoryPath"] }
        .map { groupEntry ->
            val repoPath = groupEntry.key["repositoryPath"] as String
            val group = groupEntry.group
            val dependencies = group
                .filter { it["reasonSuspected"]?.toString()?.contains("Import found") == true }
                .groupBy { it["dependencyGroupId"] }
                .filter { it.group().rowsCount() < 2 } // Dependency groups with fewer than 2 import records (count rows, not columns)
                .keys
            
            repoPath to dependencies
        }
        .toMap()
    
    // Create unused dependency links for dependencies with minimal usage
    for ((repoPath, suspiciousDeps) in unusedByRepo) {
        for (depGroupId in suspiciousDeps) {
            val targetRepo = repos.find { repo -> 
                repo.artifacts.any { it.group == depGroupId.get(0) }
            }?.path
            
            if (targetRepo != null && targetRepo != repoPath) {
                // Only add if there's already a dependency link (to avoid false positives)
                val existingDep = edges.find { 
                    it.src == repoPath && it.dist == targetRepo && it.type == LinkType.dependency 
                }
                if (existingDep != null) {
                    edges.add(Link(repoPath, targetRepo, LinkType.unused))
                    println("Added unused dependency link: $repoPath -> $targetRepo ($depGroupId)")
                }
            }
        }
    }
    
    println("Processed ${unusedByRepo.size} repositories for unused dependency analysis")
    
} catch (e: Exception) {
    println("UnusedDependencies.csv not available - skipping unused dependency link generation")
    println("Run FindPotentiallyUnusedDependencies recipe to enable unused dependency analysis")
}

val nodes = edges.flatMap { listOf(it.src, it.dist) }.distinct().map { Node(it) }

println("\nGenerated ${edges.size} total connections:")
println("- ${edges.count { it.type == LinkType.dependency }} dependency links")
println("- ${edges.count { it.type == LinkType.parent }} parent links") 
println("- ${edges.count { it.type == LinkType.unused }} unused dependency links")

println(nodes.joinToString(",\n\t", prefix = "export const nodes = [\n", postfix = "\n];") { it.asD3() })
println(edges.joinToString(",\n\t", prefix = "export const links = [\n", postfix = "\n];") { it.asD3() })

Processed 38 repositories for unused dependency analysis

Generated 151 total connections:
- 151 dependency links
- 0 parent links
- 0 unused dependency links
export const nodes = [
{ id: "openrewrite/rewrite-all" },
	{ id: "openrewrite/rewrite-csharp" },
	{ id: "openrewrite/rewrite" },
	{ id: "openrewrite/rewrite-python" },
	{ id: "openrewrite/rewrite-java-dependencies" },
	{ id: "openrewrite/rewrite-analysis" },
	{ id: "openrewrite/rewrite-apache" },
	{ id: "openrewrite/rewrite-templating" },
	{ id: "openrewrite/rewrite-static-analysis" },
	{ id: "openrewrite/rewrite-logging-frameworks" },
	{ id: "openrewrite/rewrite-build-gradle-plugin" },
	{ id: "openrewrite/rewrite-gradle-tooling-model" },
	{ id: "openrewrite/rewrite-cucumber-jvm" },
	{ id: "openrewrite/rewrite-docker" },
	{ id: "openrewrite/rewrite-dropwizard" },
	{ id: "openrewrite/rewrite-testing-frameworks" },
	{ id: "openrewrite/rewrite-feature-flags" },
	{ id: "openrewrite/rewrite-generative-ai" },
	{ id: "openrewrite/rewrit

## Using the Enhanced Visualization

After running this notebook, copy the generated JavaScript output to `connections.js` and open `metro-plan.html` in a browser.
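
The copy step can also be scripted. A minimal sketch that writes the two exported arrays straight to a file, assuming Java 11+ for `Files.writeString`; `renderExport` and `writeConnections` are hypothetical helper names, and the inputs are the strings that `asD3()` already produces:

```kotlin
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardOpenOption

// Renders already-serialized D3 object literals as one exported JavaScript array.
fun renderExport(name: String, items: List<String>): String =
    items.joinToString(",\n\t", prefix = "export const $name = [\n\t", postfix = "\n];\n")

// Writes both arrays to the target file, replacing any previous content.
fun writeConnections(target: Path, nodes: List<String>, links: List<String>) {
    val content = renderExport("nodes", nodes) + renderExport("links", links)
    Files.writeString(
        target, content,
        StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING
    )
}

fun main() {
    // A temporary file stands in for connections.js in this demo.
    val target = Files.createTempFile("connections", ".js")
    writeConnections(
        target,
        nodes = listOf("{ id: \"a\" }", "{ id: \"b\" }"),
        links = listOf("{ source: \"a\", target: \"b\", type: \"dependency\" }")
    )
    println(Files.readString(target))
}
```

In the notebook this would be called as `writeConnections(Path("connections.js"), nodes.map { it.asD3() }, edges.map { it.asD3() })`, where the target location next to `metro-plan.html` is an assumption.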

### Visual Legend:
- **Blue solid lines + arrows**: Regular dependency relationships  
- **Red solid lines + arrows**: Parent POM/Gradle relationships
- **Orange dashed lines + arrows**: Potentially unused dependencies (review candidates)

### Interpreting Unused Dependencies:
Orange dashed lines indicate dependencies that are declared in build files but have minimal import usage in the source code. These represent potential cleanup opportunities:

1. **Review the dependency**: Check if it's actually needed
2. **Consider removal**: If unused, removing it can simplify the release train
3. **Update build files**: Remove unnecessary dependencies to reduce coupling

The dashed visualization makes it easy to spot problematic dependencies that may be complicating your release coordination.
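
The "minimal import usage" rule can be sketched without the DataFrame API. This is an illustration under assumptions, not the recipe's implementation: `ImportRecord` is a hypothetical row type, and the default threshold of 3 follows the analysis cell that comes next.

```kotlin
// Hypothetical row type: one import of a dependency group found in a repository.
data class ImportRecord(val repositoryPath: String, val dependencyGroupId: String)

// Returns, per repository, the dependency groups seen in fewer than `threshold`
// import records. Repositories without any such group are omitted.
fun suspiciousGroups(records: List<ImportRecord>, threshold: Int = 3): Map<String, Set<String>> =
    records
        .groupBy { it.repositoryPath }
        .mapValues { (_, rows) ->
            rows.groupingBy { it.dependencyGroupId }
                .eachCount()
                .filterValues { it < threshold }
                .keys
        }
        .filterValues { it.isNotEmpty() }

fun main() {
    val records =
        List(5) { ImportRecord("example/app", "org.heavily.used") } +
        ImportRecord("example/app", "org.rarely.used")
    println(suspiciousGroups(records))
}
```

Because the count is built from import records, a dependency with no imports at all never appears in the input and would have to be detected separately.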

In [None]:
// Analyze unused dependencies (if UnusedDependencies.csv is available)
// This would be generated by running the FindPotentiallyUnusedDependencies recipe

try {
    val unusedDeps = DataFrame.read("/Users/merlin/IdeaProjects/private/Release-Train-Metro-Plan/src/main/kotlin/data/UnusedDependencies.csv")
    
    println("=== Unused Dependencies Analysis ===")
    println("Found ${unusedDeps.rowsCount()} import usage records")
    
    // Group by dependency to see usage patterns
    val dependencyUsage = unusedDeps
        .groupBy { it["dependencyGroupId"] }
        .aggregate {
            count() into "usageCount"
            first { it["dependencyArtifactId"] } into "artifactId"
        }
        .sortByDesc("usageCount")
    
    println("\nMost imported dependency groups:")
    dependencyUsage.take(10).forEach { row ->
        println("${row["dependencyGroupId"]}: ${row["usageCount"]} imports")
    }
    
    // Identify potentially problematic dependencies
    val suspiciousDeps = unusedDeps
        .filter { it["reasonSuspected"]?.toString()?.contains("Import found") == true }
        .groupBy { it["dependencyGroupId"] }
        .aggregate { count() into "importCount" }
        .filter { (it["importCount"] as? Int ?: 0) < 3 } // Dependencies with very few imports
    
    println("\nDependencies with minimal usage (< 3 imports):")
    suspiciousDeps.forEach { row ->
        println("${row["dependencyGroupId"]}: ${row["importCount"]} imports")
    }
    
} catch (e: Exception) {
    println("UnusedDependencies.csv not found - run FindPotentiallyUnusedDependencies recipe first")
    println("This analysis shows import patterns that can help identify unused dependencies")
}