<a href="https://colab.research.google.com/github/componavt/prompt-corpus/blob/main/src/utils/py_git_metrics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🔍 py_git_metrics - Python Code Metrics Analyzer for GitHub

**📌 Description:**
Clones a GitHub repository, analyzes **only the `src` folder**, and outputs key code metrics as a comma-separated string.
Optimized for Python projects.

**📊 Metrics:**
`cyclomatic_complexity, avg_lines, duplicate_blocks, dependencies`
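
Because the result is a bare comma-separated string, downstream code has to map the values back to their names. A minimal sketch of that step, assuming the field order matches the list above; `METRIC_NAMES` and `parse_metrics` are hypothetical helpers, not part of the notebook:

```python
from typing import Dict

# Field order is assumed to match the metrics line above.
METRIC_NAMES = ["cyclomatic_complexity", "avg_lines", "duplicate_blocks", "dependencies"]

def parse_metrics(line: str) -> Dict[str, float]:
    """Map a comma-separated metrics string to a name -> value dictionary."""
    values = [float(part) for part in line.strip().split(",")]
    if len(values) != len(METRIC_NAMES):
        raise ValueError(f"expected {len(METRIC_NAMES)} values, got {len(values)}")
    return dict(zip(METRIC_NAMES, values))

# Made-up sample values, for illustration only
print(parse_metrics("2.5,120,3,14"))
```

Converting everything to `float` keeps the helper short; integer metrics such as `duplicate_blocks` can be cast back with `int()` where needed.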

**⚠️ Note:**
Only the `src` folder is analyzed. Ensure it exists in the repository.

In [1]:
# Set the GitHub repository URL here
repo_url = "https://github.com/componavt/LLLE-R1900s"

In [None]:
import os
import subprocess

def analyze_repo(repo_url):
    """
    Clones a GitHub repository, analyzes the codebase in the 'src' folder,
    and outputs key metrics as a comma-separated string.
    """
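    # Derive the local directory name from the URL (tolerates a trailing "/" or ".git")
    clean_url = repo_url.rstrip("/")
    if clean_url.endswith(".git"):
        clean_url = clean_url[:-4]
    repo_name = clean_url.split("/")[-1]
    # Clone only if the directory is not already present (e.g. when the cell is re-run)
    if not os.path.isdir(repo_name):
        subprocess.run(["git", "clone", f"{clean_url}.git", repo_name], check=True)
    os.chdir(repo_name)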

    # Check if 'src' folder exists
    if not os.path.exists("src"):
        os.chdir("..")  # Restore the working directory before failing
        raise FileNotFoundError("The 'src' folder was not found in the repository.")

    # Install required tools
    subprocess.run(["pip", "install", "radon", "pipdeptree", "--quiet"], check=True)
    subprocess.run(["npm", "install", "-g", "jscpd", "--silent"], check=True)

    # Initialize metrics dictionary
    metrics = {}

    # Analyze only the 'src' folder
    src_path = os.path.join(os.getcwd(), "src")

    # Cyclomatic complexity (radon)
    result = subprocess.run(
        ["radon", "cc", src_path, "--average"],
        capture_output=True,
        text=True
    )
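    # The last line looks like "Average complexity: A (2.5)", so strip the parentheses
    tokens = result.stdout.strip().split()
    metrics["cyclomatic_complexity"] = float(tokens[-1].strip("()")) if tokens else 0.0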

    # Average lines of code per file (radon)
    result = subprocess.run(
        ["radon", "raw", src_path],
        capture_output=True,
        text=True
    )
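    # radon prints one "LOC:" line per file (plus "LLOC:" and "SLOC:", which are skipped here)
    loc_values = [
        int(line.split()[1])
        for line in result.stdout.split("\n")
        if line.strip().startswith("LOC:")
    ]
    # Average across all files rather than reporting only the first file
    metrics["avg_lines"] = round(sum(loc_values) / len(loc_values)) if loc_values else 0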

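    # Duplicate code blocks (jscpd)
    # "--format" selects the languages to scan and "--reporters" the output style
    result = subprocess.run(
        ["jscpd", "--format", "python", "--reporters", "console", src_path],
        capture_output=True,
        text=True
    )
    # The console reporter is expected to print one "Clone found" line per duplicate block
    duplicate_count = len([line for line in result.stdout.split("\n") if "Clone found" in line])
    metrics["duplicate_blocks"] = duplicate_count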

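    # External dependencies (pipdeptree)
    # Note: pipdeptree inspects the current Python environment, not the cloned repository
    result = subprocess.run(
        ["pipdeptree", "--warn", "silence"],
        capture_output=True,
        text=True
    )
    # Top-level packages start with a letter or digit; sub-dependency lines are
    # indented or drawn with tree characters. The output has no header line.
    top_level = [line for line in result.stdout.split("\n") if line and line[0].isalnum()]
    metrics["dependencies"] = len(top_level)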

    # Return to the root directory
    os.chdir("..")

    # Format output as a comma-separated string
    output = ",".join([str(v) for v in metrics.values()])
    print(output)

# Run the analysis
analyze_repo(repo_url)
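
The cyclomatic complexity step depends on the shape of radon's summary line. The sketch below isolates that parsing so it can be checked without cloning anything. It assumes the summary has the form `Average complexity: <rank> (<value>)`; `parse_average_complexity` is a hypothetical helper and the sample text is made up for illustration:

```python
import re
from typing import Optional

def parse_average_complexity(radon_output: str) -> Optional[float]:
    """Extract the numeric average from radon's summary line, or None if it is absent."""
    # Assumed summary format: "Average complexity: A (2.5)"
    match = re.search(r"Average complexity:\s*\S+\s*\(([\d.]+)\)", radon_output)
    return float(match.group(1)) if match else None

# Made-up sample output, for illustration only
sample = (
    "3 blocks (classes, functions, methods) analyzed.\n"
    "Average complexity: A (2.5)"
)
print(parse_average_complexity(sample))
print(parse_average_complexity(""))
```

Returning `None` when the summary line is missing lets the caller distinguish "no functions found" from a real average of zero.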

# 🔍 GitHub Repository Code Analyzer

**📌 Description:**
The script clones the repository, analyzes the code **only in the `src` folder**, and outputs the key metrics as a string.

**📊 Metrics:**
`cyclomatic_complexity, avg_lines, duplicate_blocks, dependencies`

**⚠️ Note:**
Only the contents of the repository's `src` folder are analyzed. The script is optimized for Python projects.