# 🔍 NPM Dependency Auditor

**Analyze npm packages for security risks and get AI-powered explanations in plain English.**

## Features
- 📦 Fetch real-time package metadata from the npm registry
- 🔒 Basic security risk assessment based on maintenance status
- 🤖 AI-powered analysis using Groq (free!)
- 📊 Clear, actionable recommendations

## Setup Required
Add to your `.env` file:
```
GROQ_API_KEY=your_key_here
```
Get your free key at: https://console.groq.com/keys
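
A missing key otherwise only surfaces later, as an error string from the AI step, so it can help to fail early. A minimal sketch, assuming a hypothetical helper named `require_env` (it is not part of this notebook):

```python
import os


def require_env(name):
    """Return an environment variable's value, or raise if it is unset or blank.

    `require_env` is a hypothetical helper used only for illustration.
    """
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

Once the environment has been loaded, `require_env("GROQ_API_KEY")` would either return the key or stop with a clear message.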

In [None]:
import requests
import json
from datetime import datetime
from openai import OpenAI
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv(override=True)

print("✅ Setup complete!")

✅ Setup complete!


## 1️⃣ Fetch Package Information from NPM

In [11]:
def fetch_npm_package_info(package_name):
    """
    Fetch package information from NPM registry
    
    Args:
        package_name: Name of package (e.g., 'express' or 'axios@0.21.1')
    
    Returns:
        Dictionary with package metadata
    """
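    # Parse package name and optional version. rpartition splits on the last '@',
    # so a scoped spec such as '@scope/name@1.0.0' keeps its leading '@'.
    name, sep, version = package_name.rpartition('@')
    if not sep or not name:
        # No version suffix ('express') or a bare scoped name ('@scope/name')
        name = package_name
        version = 'latest'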
    
    try:
        # Fetch from NPM registry API
        url = f"https://registry.npmjs.org/{name}"
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        
        data = response.json()
        
        # Resolve the requested version; a dist-tag such as 'latest' maps to a concrete version
        resolved_version = data['dist-tags'].get(version, version)
        version_data = data['versions'][resolved_version]
        
        # Calculate days since that version was published
        last_modified = data['time'][resolved_version]
        last_modified_date = datetime.fromisoformat(last_modified.replace('Z', '+00:00'))
        days_ago = (datetime.now(last_modified_date.tzinfo) - last_modified_date).days
        
        # Extract useful info
        package_info = {
            'name': name,
            'version': resolved_version,
            'description': data.get('description', 'No description'),
            'last_updated': last_modified,
            'days_since_update': days_ago,
            'homepage': data.get('homepage', 'N/A'),
            'repository': data.get('repository', {}).get('url', 'N/A'),
            'license': version_data.get('license', 'N/A'),
            'maintainers': len(data.get('maintainers', [])),
            'keywords': data.get('keywords', [])
        }
        
        return package_info
    
    except requests.exceptions.RequestException as e:
        return {'error': f"Failed to fetch package: {str(e)}"}
    except KeyError as e:
        return {'error': f"Package not found or invalid data: {str(e)}"}

print("✅ fetch_npm_package_info() defined")

✅ fetch_npm_package_info() defined
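
To make the fields read above easier to follow, here is a small offline sketch of the part of the registry response that the function relies on. The `sample` document is hand-written with made-up values, and `days_since_release` is a hypothetical helper that takes `now` as a parameter so the result is reproducible:

```python
from datetime import datetime, timezone

# Hand-written stand-in for a registry response; every value here is made up.
sample = {
    "name": "demo-pkg",
    "dist-tags": {"latest": "1.2.0"},
    "versions": {"1.2.0": {"license": "MIT"}},
    "time": {"1.2.0": "2024-01-01T00:00:00.000Z"},
}


def days_since_release(doc, now):
    """Days between the publish time of the 'latest' version and `now`."""
    latest = doc["dist-tags"]["latest"]
    # Registry timestamps end in 'Z'; swap it for an explicit UTC offset
    published = datetime.fromisoformat(doc["time"][latest].replace("Z", "+00:00"))
    return (now - published).days


print(days_since_release(sample, datetime(2024, 1, 31, tzinfo=timezone.utc)))  # 30
```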


## 2️⃣ Check for Security Risks

In [None]:
def check_vulnerabilities_basic(package_name):
    """
    Basic risk check based on how long ago the package was last updated.
    This MVP does not query real vulnerability data; later we can use the Snyk or OSV API.
    """
    # For now, we'll mark packages as potentially risky based on age
    # In production, you'd call APIs like:
    # - https://api.osv.dev/v1/query
    # - https://snyk.io/api
    
    info = fetch_npm_package_info(package_name)
    
    if 'error' in info:
        return {'status': 'unknown', 'reason': info['error']}
    
    days = info['days_since_update']
    
    # Simple heuristic for MVP
    if days > 730:  # 2 years
        return {
            'status': 'high_risk',
            'reason': 'Package not updated in over 2 years (abandoned?)',
            'days_since_update': days
        }
    elif days > 365:  # 1 year
        return {
            'status': 'medium_risk',
            'reason': 'Package not updated in over 1 year',
            'days_since_update': days
        }
    else:
        return {
            'status': 'low_risk',
            'reason': 'Recently maintained',
            'days_since_update': days
        }

print("✅ check_vulnerabilities_basic() defined")

✅ Function defined! Testing...
{'status': 'low_risk', 'reason': 'Recently maintained', 'days_since_update': 214}
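
The age heuristic could later be complemented by a real advisory lookup. A minimal sketch of a query against the OSV endpoint mentioned in the code comments, assuming the request body takes a `package` object (with `name` and `ecosystem`) plus a `version`, and that matches come back under a `vulns` key; `build_osv_query` and `query_osv` are hypothetical helper names, and the field names should be checked against the OSV documentation before relying on them:

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name, version):
    """Build the JSON body for an OSV query about one npm package version."""
    return {"version": version, "package": {"name": name, "ecosystem": "npm"}}


def query_osv(name, version, timeout=10):
    """POST the query to OSV and return the advisory IDs (empty list if none)."""
    body = json.dumps(build_osv_query(name, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)
    # Assumption: a response without a 'vulns' key means no known advisories
    return [vuln["id"] for vuln in result.get("vulns", [])]
```

A non-empty result from `query_osv` could then override the age-based status, since a recently updated package can still have known advisories.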


## 3️⃣ AI-Powered Analysis (Using Groq)

In [12]:
def analyze_with_ai(package_info, vulnerability_info):
    """
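    Generate a plain-English risk assessment with a Groq-hosted model
    (free API, used here instead of GPT-4).
    
    Returns the model's response text, or an error string if the request fails.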
    """
    # Groq uses OpenAI-compatible API
    groq_client = OpenAI(
        api_key=os.getenv("GROQ_API_KEY"),  # Add to .env file
        base_url="https://api.groq.com/openai/v1"
    )
    
    prompt = f"""You are a senior software security analyst. Analyze this npm package and provide a clear, concise risk assessment.

Package Information:
- Name: {package_info.get('name')}
- Version: {package_info.get('version')}
- Description: {package_info.get('description')}
- Last Updated: {package_info.get('days_since_update')} days ago
- Maintainers: {package_info.get('maintainers')}
- License: {package_info.get('license')}

Preliminary Risk Assessment:
- Status: {vulnerability_info.get('status')}
- Reason: {vulnerability_info.get('reason')}

Provide your analysis in this format:

**RISK LEVEL**: [LOW/MEDIUM/HIGH]

**PLAIN ENGLISH SUMMARY**:
[2-3 sentences explaining what this package does and its overall safety]

**GOOD SIGNS** (if any):
- [List positive indicators]

**CONCERNS** (if any):
- [List potential issues]

**RECOMMENDATION**:
[Clear action: "Safe to use" / "Update immediately" / "Consider alternatives" etc.]

Keep it concise, non-technical, and actionable."""

    try:
        response = groq_client.chat.completions.create(
            model="llama-3.3-70b-versatile",  # Updated model (Dec 2024)
            messages=[
                {"role": "system", "content": "You are a helpful security analyst who explains technical concepts in simple terms."},
                {"role": "user", "content": prompt}
            ],
            temperature=0.7,
            max_tokens=800
        )
        
        return response.choices[0].message.content
    
    except Exception as e:
        return f"Error with Groq: {str(e)}"

print("✅ analyze_with_ai() defined - using Groq API")

✅ analyze_with_ai() defined - using Groq API
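
Because the prompt asks for a fixed `**RISK LEVEL**` line, the returned text can be checked programmatically, for example to compare the model's verdict with the heuristic status. A minimal sketch, assuming the model follows the requested format; `extract_risk_level` is a hypothetical helper and returns `None` when the line is missing (for instance when the function returned an error string):

```python
import re

# Matches '**RISK LEVEL**: LOW' and also the bracketed form '**RISK LEVEL**: [LOW]'
RISK_LEVEL_PATTERN = re.compile(
    r"\*\*RISK LEVEL\*\*\s*:\s*\[?(LOW|MEDIUM|HIGH)\]?", re.IGNORECASE
)


def extract_risk_level(analysis_text):
    """Return 'LOW', 'MEDIUM' or 'HIGH' from the AI response, or None if absent."""
    match = RISK_LEVEL_PATTERN.search(analysis_text)
    return match.group(1).upper() if match else None
```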


## 4️⃣ Main Analysis Pipeline

In [13]:
def analyze_package(package_name):
    """
    Complete package analysis pipeline
    """
    print(f"🔍 Analyzing package: {package_name}")
    print("=" * 70)
    
    # Step 1: Fetch package info
    print("\n📦 Fetching package data from NPM...")
    package_info = fetch_npm_package_info(package_name)
    
    if 'error' in package_info:
        print(f"❌ Error: {package_info['error']}")
        return
    
    print(f"✅ Found: {package_info['name']}@{package_info['version']}")
    
    # Step 2: Run the age-based risk heuristic (this fetches the package metadata a second time)
    print("\n🔒 Checking security status...")
    vuln_info = check_vulnerabilities_basic(package_name)
    print(f"✅ Status: {vuln_info['status']}")
    
    # Step 3: Analyze with AI
    print("\n🤖 Analyzing with AI...")
    ai_analysis = analyze_with_ai(package_info, vuln_info)
    
    # Display results
    print("\n" + "=" * 70)
    print("📊 PACKAGE DETAILS")
    print("=" * 70)
    print(f"Name:         {package_info['name']}")
    print(f"Version:      {package_info['version']}")
    print(f"Description:  {package_info['description']}")
    print(f"Last Updated: {package_info['days_since_update']} days ago")
    print(f"Maintainers:  {package_info['maintainers']}")
    print(f"License:      {package_info['license']}")
    print(f"Repository:   {package_info['repository']}")
    
    print("\n" + "=" * 70)
    print("🤖 AI SECURITY ANALYSIS")
    print("=" * 70)
    print(ai_analysis)
    print("\n" + "=" * 70)
    print(f"Report generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("⚠️  Use at your own risk. This is for informational purposes only.")
    print("=" * 70)

print("✅ Main analysis function ready!")

✅ Main analysis function ready!
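
The pipeline handles one package per call. To audit a whole project, the package names could be read from its `package.json` first. A minimal sketch, assuming the standard `dependencies` and `devDependencies` keys; `list_dependencies` is a hypothetical helper, not part of the notebook:

```python
import json


def list_dependencies(package_json_text, include_dev=False):
    """Return the sorted dependency names declared in a package.json document."""
    manifest = json.loads(package_json_text)
    names = set(manifest.get("dependencies", {}))
    if include_dev:
        names |= set(manifest.get("devDependencies", {}))
    return sorted(names)
```

Each returned name could then be passed to `analyze_package()` in a loop. Keep in mind that every call makes network requests and one AI request.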


---

## 🧪 Examples & Testing

### Example 1: Well-Maintained Package (Express)

In [14]:
# Test with a safe, well-maintained package
analyze_package('express')

🔍 Analyzing package: express

📦 Fetching package data from NPM...
✅ Found: express@5.1.0

🔒 Checking security status...
✅ Status: low_risk

🤖 Analyzing with AI...

📊 PACKAGE DETAILS
Name:         express
Version:      5.1.0
Description:  Fast, unopinionated, minimalist web framework
Last Updated: 214 days ago
Maintainers:  5
License:      MIT
Repository:   git+https://github.com/expressjs/express.git

🤖 AI SECURITY ANALYSIS
**RISK LEVEL**: LOW

**PLAIN ENGLISH SUMMARY**: The "express" package is a popular web framework that helps build fast and flexible websites. It's regularly maintained by a team of 5 developers, which suggests it's stable and secure. Overall, it appears to be a safe choice for web development.

**GOOD SIGNS**:
- Recently maintained by multiple developers
- Widely used and popular
- Permissive MIT license

**CONCERNS**:
- The package hasn't been updated in 214 days, which might indicate a lack of recent security patches

**RECOMMENDATION**: Safe to use, but consider 

### Example 2: Abandoned Package (Left-Pad)

In [15]:
# Test with an abandoned package
analyze_package('left-pad')

🔍 Analyzing package: left-pad

📦 Fetching package data from NPM...
✅ Found: left-pad@1.3.0

🔒 Checking security status...
✅ Status: high_risk

🤖 Analyzing with AI...

📊 PACKAGE DETAILS
Name:         left-pad
Version:      1.3.0
Description:  String left pad
Last Updated: 2763 days ago
Maintainers:  2
License:      WTFPL
Repository:   git+ssh://git@github.com/stevemao/left-pad.git

🤖 AI SECURITY ANALYSIS
**RISK LEVEL**: HIGH

**PLAIN ENGLISH SUMMARY**: The "left-pad" package is a simple tool that helps format strings by adding spaces to the left. However, it hasn't been updated in over 7 years, which raises concerns about its safety and reliability. This outdated package may pose a risk to your project's security.

**GOOD SIGNS**: 
- It has a simple and specific function, which reduces the potential for complex security vulnerabilities.
- The license (WTFPL) is permissive, allowing for flexible use.

**CONCERNS**: 
- The package is severely outdated, with no updates in over 7 years, ind