solskogen/arch-pkgcompare


Arch Linux Package Reporting System

A comprehensive multi-architecture package analysis and reporting system for Arch Linux. Compare any two architectures (aarch64, x86_64, riscv, s390x, sparc, powerpc, etc.) with 28+ specialized reports, intelligent caching, and deployment flexibility.

System Requirements

Data Loader (Python)

  • Python 3.6+
  • urllib (included in Python)
  • mysql-connector-python or PyMySQL library
  • Network access to Arch repository mirrors

Web Application (PHP)

  • PHP 7.4 or newer (8.0+ recommended)
  • MySQL 5.7+ or MariaDB 10.2+
  • Required PHP modules:
    • php-mysql or php-mysqli - database access
    • php-curl - HTTP requests (data loader)
    • php-gd - optional, image handling

Server

  • Apache 2.4+ with mod_php or
  • Nginx 1.14+ with PHP-FPM or
  • Any web server supporting PHP

Quick Start (5 minutes)

1. Configure Architecture Comparison

Copy the example configuration and edit it to specify which architectures to compare:

cp config.ini-example config.ini
# Edit config.ini with your preferred editor

You can compare ANY two architectures using elegant URL templates. See config.ini-example for template format details.

Elegant Template Format (Recommended)

[database]
host = localhost
user = aarch64linux
password = aarch64linux
database = aarch64linux

# Compare aarch64 vs x86_64 with URL templates
[arch-aarch64]
mirror = https://arch-linux-repo.drzee.net
url_template = {mirror}/arch/{repo}/os/{arch}/{repo}.db
repos = core, extra, forge

[arch-x86_64]
mirror = https://geo.mirror.pkgbuild.com
url_template = {mirror}/{repo}/os/{arch}/{repo}.db
repos = core, extra

Template Variables:

  • {mirror} - Mirror base URL
  • {repo} - Repository name (automatically substituted from repos list)
  • {arch} - Architecture name (automatically substituted)

This elegant format reduces repetition and is easier to maintain!
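As a rough illustration of how the template variables could be substituted, here is a minimal Python sketch. The function name `expand_urls` is hypothetical and is not taken from load_arch_packages.py; it only shows how one `[arch-*]` section might expand into one URL per repository.

```python
def expand_urls(arch, mirror, url_template, repos):
    """Return {repo: url} for one [arch-*] section (illustrative helper)."""
    # "repos" is the comma-separated list from config.ini
    names = [r.strip() for r in repos.split(",") if r.strip()]
    return {
        repo: url_template.format(mirror=mirror.rstrip("/"), repo=repo, arch=arch)
        for repo in names
    }


urls = expand_urls(
    arch="aarch64",
    mirror="https://arch-linux-repo.drzee.net",
    url_template="{mirror}/arch/{repo}/os/{arch}/{repo}.db",
    repos="core, extra, forge",
)
for repo, url in urls.items():
    print(repo, url)
```

With the aarch64 values from the example above, the expanded URLs match the ones you would otherwise write out by hand in the direct URL format.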

Alternative: Direct URL Format

If you prefer explicit URLs instead of templates:

[arch-aarch64]
core = https://arch-linux-repo.drzee.net/arch/core/os/aarch64/core.db
extra = https://arch-linux-repo.drzee.net/arch/extra/os/aarch64/extra.db
forge = https://arch-linux-repo.drzee.net/arch/forge/os/aarch64/forge.db

[arch-x86_64]
core = https://geo.mirror.pkgbuild.com/core/os/x86_64/core.db
extra = https://geo.mirror.pkgbuild.com/extra/os/x86_64/extra.db

To compare a different pair of architectures (e.g., riscv64 vs x86_64), use the same template format:

[arch-riscv64]
mirror = https://riscv-mirror.example.com
url_template = {mirror}/{repo}/os/{arch}/{repo}.db
repos = core, extra, testing

[arch-x86_64]
mirror = https://geo.mirror.pkgbuild.com
url_template = {mirror}/{repo}/os/{arch}/{repo}.db
repos = core, extra

Key features of config.ini:

  • Each [arch-*] section defines one architecture
  • Use templates OR direct URLs (both supported)
  • Template format: define mirror URL once, list repos, reduce repetition
  • Each architecture can have unlimited repositories
  • Different architectures can have DIFFERENT repositories
  • Repository names are free-form (used as-is in reports)
  • Exactly 2 architectures required for comparison (binary comparison)
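The rules above can be checked mechanically. The following is a sketch only, assuming the file is read with Python's standard `configparser`; the function `read_architectures` and the "template"/"direct" labels are made up for this example and may not match what the loader actually does.

```python
import configparser

SAMPLE = """
[database]
host = localhost

[arch-aarch64]
mirror = https://arch-linux-repo.drzee.net
url_template = {mirror}/arch/{repo}/os/{arch}/{repo}.db
repos = core, extra, forge

[arch-x86_64]
core = https://geo.mirror.pkgbuild.com/core/os/x86_64/core.db
extra = https://geo.mirror.pkgbuild.com/extra/os/x86_64/extra.db
"""


def read_architectures(text):
    """Return {arch: (style, options)} and enforce the two-architecture rule."""
    # interpolation=None keeps any "%" in URLs literal
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    archs = {}
    for section in parser.sections():
        if not section.startswith("arch-"):
            continue
        name = section[len("arch-"):]
        options = dict(parser[section])
        style = "template" if "url_template" in options else "direct"
        archs[name] = (style, options)
    if len(archs) != 2:
        raise ValueError(f"expected exactly 2 [arch-*] sections, found {len(archs)}")
    return archs


archs = read_architectures(SAMPLE)
print({name: style for name, (style, _) in archs.items()})
```

The sample deliberately mixes one template section with one direct-URL section, since both styles are supported.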

2. Create Database

mysql -u root -p -e "
CREATE DATABASE aarch64linux;
CREATE USER 'aarch64linux'@'localhost' IDENTIFIED BY 'aarch64linux';
GRANT ALL PRIVILEGES ON aarch64linux.* TO 'aarch64linux'@'localhost';
FLUSH PRIVILEGES;
"

3. Load Package Data

python3 load_arch_packages.py

This downloads package metadata from your configured repositories and imports packages for all architectures.
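For orientation: a pacman repository database (`core.db`, `extra.db`, ...) is a compressed tar archive with one directory per package, each containing a `desc` file made of `%FIELD%` headers followed by values. The sketch below parses that format from an in-memory archive. The helper names are hypothetical and this is not the loader's actual code. Python's `tarfile` handles gzip, bzip2 and xz transparently; a zstd-compressed database would need extra handling on Python versions before 3.14.

```python
import io
import tarfile


def parse_desc(text):
    """Parse one pacman `desc` file into {FIELD: [values]}."""
    fields, current = {}, None
    for line in text.splitlines():
        if len(line) > 2 and line.startswith("%") and line.endswith("%"):
            current = line.strip("%")
            fields[current] = []
        elif line and current is not None:
            fields[current].append(line)
    return fields


def read_repo_db(fileobj):
    """Yield parsed desc entries from a repository database tarball."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith("/desc"):
                yield parse_desc(tar.extractfile(member).read().decode("utf-8"))


# Build a tiny fake database in memory so the sketch runs without network access.
DESC = "%NAME%\nzlib\n\n%VERSION%\n1:1.3.1-2\n\n%ARCH%\nx86_64\n\n%DEPENDS%\nglibc\n\n"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    data = DESC.encode("utf-8")
    info = tarfile.TarInfo("zlib-1:1.3.1-2/desc")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
buf.seek(0)

packages = list(read_repo_db(buf))
print(packages[0]["NAME"], packages[0]["VERSION"], packages[0]["DEPENDS"])
```

Multi-valued fields such as `%DEPENDS%` come back as lists, which maps naturally onto the per-relationship tables described under Database Schema.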

4. Verify PHP Requirements

Before deploying, ensure you have the required PHP modules:

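# On Arch Linux (mysqli and curl ship with the php package; enable them by
# uncommenting extension=mysqli and extension=curl in /etc/php/php.ini)
sudo pacman -S php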

# Verify required modules are installed
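php -m | grep -E "mysql|curl|gd"

You should see curl and mysqli, plus gd if the optional php-gd module is installed. (The legacy mysql extension was removed in PHP 7.0, so only mysqli applies to PHP 7.4+.)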

Edit reporting/config.ini (set database connection to match step 2):

[database]
host = localhost
user = aarch64linux
password = aarch64linux
database = aarch64linux

5. Deploy Application

Prerequisites: PHP 7.4+ with the modules checked in step 4 (mysqli and curl; gd is optional).

Option A: Using PHP's built-in server (testing only)

cd reporting
php -S localhost:8000
# Open http://localhost:8000

Option B: Apache Deployment

  1. Copy files to the Apache document root (http:http is the web server user on Arch Linux; Debian/Ubuntu use www-data):
sudo cp -r reporting /var/www/html/
sudo chown -R http:http /var/www/html/reporting
sudo chmod 755 /var/www/html/reporting
sudo chmod 755 /var/www/html/reporting/cache
sudo chmod 644 /var/www/html/reporting/config.ini
  2. Create Apache configuration (/etc/httpd/conf.d/reporting.conf):
<Directory /var/www/html/reporting>
    Options -Indexes +FollowSymLinks
    AllowOverride All
    Require all granted
</Directory>
  3. Enable the PHP module and restart Apache (a2enmod and the apache2 service name are Debian/Ubuntu conventions; on Arch Linux the service is httpd):
sudo a2enmod php
sudo systemctl restart apache2
  4. Access at: http://localhost/reporting

Option C: Nginx Deployment

  1. Copy files:
sudo cp -r reporting /srv/http/
sudo chown -R http:http /srv/http/reporting
sudo chmod 755 /srv/http/reporting
sudo chmod 755 /srv/http/reporting/cache
  2. Configure Nginx (/etc/nginx/sites-available/reporting):
server {
    listen 80;
    server_name _;
    root /srv/http/reporting;
    index index.php;

    location ~ \.php$ {
        fastcgi_pass unix:/run/php-fpm/php-fpm.sock;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include fastcgi_params;
    }

    location ~ /\. {
        deny all;
    }
}
  3. Enable and restart:
sudo ln -s /etc/nginx/sites-available/reporting /etc/nginx/sites-enabled/
sudo systemctl enable --now nginx php-fpm
  4. Access at: http://localhost/

What This Project Does

  1. Collects package metadata from the repositories of the two configured architectures (aarch64 and x86_64 in the example configuration)
  2. Compares packages, dependencies, versions, and metadata between architectures
  3. Reports 28+ different types of differences and issues
  4. Caches results for fast web access (7.4 s uncached vs. 17 ms cached in the figures under Performance)
  5. Provides a web interface for browsing and analyzing data

Project Structure

aarch64linux-reporting/
├── config.ini                    # Configuration file (EDIT THIS)
├── README.md                     # This file
├── load_arch_packages.py         # Database loader + data pipeline
└── reporting/                    # Web application
    ├── config.ini                # Web app configuration (EDIT THIS)
    ├── boot.php                  # Application initialization (do not edit)
    ├── 28 report pages
    ├── app/ (4 core classes)
    ├── css/ (styling)
    ├── cache/ (auto-managed)
    └── Original documentation

Configuration

All settings are in config.ini. No code changes needed!

Database Connection

Data Loader (Python)

Edit config.ini in the root directory:

[database]
host = localhost
user = aarch64linux
password = aarch64linux
database = aarch64linux

Web Application (PHP)

Edit reporting/config.ini:

[database]
host = localhost
user = aarch64linux
password = aarch64linux
database = aarch64linux

Repository Mirrors

The architecture repositories are configured in the main config.ini:

[arch-aarch64]
core = https://arch-linux-repo.drzee.net/arch/core/os/aarch64/core.db
extra = https://arch-linux-repo.drzee.net/arch/extra/os/aarch64/extra.db
forge = https://arch-linux-repo.drzee.net/arch/forge/os/aarch64/forge.db

[arch-x86_64]
core = https://geo.mirror.pkgbuild.com/core/os/x86_64/core.db
extra = https://geo.mirror.pkgbuild.com/extra/os/x86_64/extra.db

To use different mirrors, edit the URLs in config.ini and run load_arch_packages.py again.

For different architecture pairs, simply update the section names and repositories. For example, to compare riscv64 with x86_64:

[arch-riscv64]
core = https://riscv-mirror.example.com/core/os/riscv64/core.db
extra = https://riscv-mirror.example.com/extra/os/riscv64/extra.db

[arch-x86_64]
core = https://geo.mirror.pkgbuild.com/core/os/x86_64/core.db
extra = https://geo.mirror.pkgbuild.com/extra/os/x86_64/extra.db

Loader Performance

[loader]
batch_size = 5000              # Records per batch (1000-10000)
parallel_downloads = 5         # Concurrent downloads (3-10)
timeout = 30                   # Download timeout in seconds

Examples:

  • Slow network: batch_size=1000, parallel_downloads=2, timeout=60
  • Fast network: batch_size=10000, parallel_downloads=10, timeout=20
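To make the two tuning knobs concrete, here is a hedged sketch of what `parallel_downloads` and `batch_size` typically control. The `fetch` function is a stand-in that performs no real download, and `batched` is an illustrative helper; neither is taken from the loader.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(rows, batch_size):
    """Yield successive lists of at most batch_size rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def fetch(url):
    """Stand-in for a real download, e.g. urllib.request.urlopen(url, timeout=30)."""
    return url, len(url)


urls = ["https://mirror.example.com/core.db", "https://mirror.example.com/extra.db"]
with ThreadPoolExecutor(max_workers=5) as pool:  # parallel_downloads = 5
    downloaded = dict(pool.map(fetch, urls))

rows = [("pkg%d" % i, "x86_64") for i in range(12000)]
sizes = [len(chunk) for chunk in batched(rows, 5000)]  # batch_size = 5000
print(sizes)
```

In a real loader each chunk would be handed to a single `cursor.executemany(...)` call, so a larger `batch_size` means fewer round trips to the database at the cost of more memory per batch.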

Caching

[cache]
enabled = true
ttl = 3600                     # 1 hour cache (0 to disable)
directory = cache
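The web application itself is PHP, but the TTL logic these settings describe is language-independent. The Python sketch below shows one plausible reading, assuming one cache file per report and freshness judged by file modification time; the function names and the `.json` file layout are assumptions made for illustration.

```python
import json
import os
import tempfile
import time


def cache_get(directory, key, ttl, now=None):
    """Return the cached value, or None if it is missing, expired, or ttl is 0."""
    if ttl <= 0:  # ttl = 0 disables the cache
        return None
    path = os.path.join(directory, key + ".json")
    now = time.time() if now is None else now
    try:
        if now - os.path.getmtime(path) > ttl:
            return None
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except OSError:
        return None


def cache_put(directory, key, value):
    """Store a JSON-serializable value under the given key."""
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, key + ".json"), "w", encoding="utf-8") as fh:
        json.dump(value, fh)


cache_dir = tempfile.mkdtemp()
cache_put(cache_dir, "first_only", ["pkg-a", "pkg-b"])
fresh = cache_get(cache_dir, "first_only", ttl=3600)
expired = cache_get(cache_dir, "first_only", ttl=3600, now=time.time() + 7200)
disabled = cache_get(cache_dir, "first_only", ttl=0)
print(fresh, expired, disabled)
```

Clearing the cache after a data reload then amounts to deleting the files in the cache directory.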

Available Reports (28+)

Package Availability

  • First Architecture Only Packages
  • Second Architecture Only Packages
  • Architecture-Specific Packages (Excluding Provides)
  • Missing -any Packages
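Conceptually, the availability reports are set differences over package names. The sketch below illustrates that idea with plain Python sets. The package names are sample data, and treating "Excluding Provides" as "also drop names the other architecture provides virtually" is an assumption about that report, not something confirmed by the code.

```python
def availability(first, second, second_provides=frozenset()):
    """Compare two sets of package names (illustrative, not the app's queries)."""
    first_only = first - second
    return {
        "first_only": sorted(first_only),
        "second_only": sorted(second - first),
        # Assumed meaning of "Excluding Provides": ignore names that the
        # other architecture satisfies through a virtual package.
        "first_only_excluding_provides": sorted(first_only - second_provides),
    }


aarch64 = {"glibc", "zlib", "linux-aarch64", "sh"}
x86_64 = {"glibc", "zlib", "intel-ucode", "bash"}
report = availability(aarch64, x86_64, second_provides={"sh"})
print(report)
```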

Version Management

  • First Architecture Newer Versions
  • Second Architecture Newer Versions
  • Outdated -any Packages
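Arch version strings have the form `[epoch:]pkgver-pkgrel`, so any "newer version" report has to split them before comparing. The sketch below only performs the split; `split_version` is a hypothetical helper. Ordering two `pkgver` values correctly follows pacman's own rules, which the `vercmp` utility shipped with pacman implements, so a simple string comparison should not be relied on.

```python
def split_version(version):
    """Split '[epoch:]pkgver-pkgrel' into (epoch, pkgver, pkgrel)."""
    epoch, sep, rest = version.partition(":")
    if not sep:  # no epoch given, which pacman treats as epoch 0
        epoch, rest = "0", version
    # pkgver cannot contain hyphens, so the last hyphen separates pkgrel
    pkgver, sep, pkgrel = rest.rpartition("-")
    if not sep:  # tolerate a bare pkgver without a release suffix
        pkgver, pkgrel = rest, ""
    return int(epoch), pkgver, pkgrel


first = split_version("1:1.3.1-2")
second = split_version("6.9.3.arch1-1")
print(first, second)
```

A higher epoch always wins regardless of `pkgver`, which is why the epoch is returned as an integer that can be compared first.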

Dependencies & Relationships

  • Circular Dependencies (consolidated by architecture)
  • Dependency Differences
  • Conflict Differences
  • Provides/Virtual Package Differences
  • Replace Differences
  • Optional Dependency Differences
  • Makedepend Differences
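As an illustration of what a circular-dependency report has to compute, here is a minimal depth-first search over a dependency graph. This is a generic textbook approach rather than the project's actual query, and it reports one representative cycle per back edge instead of enumerating every cycle. The recursion is fine for small graphs; a repository-sized graph may need an iterative version.

```python
def find_cycles(graph):
    """Return one representative cycle per back edge found by DFS."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    stack, cycles = [], []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for dep in graph.get(node, ()):
            state = color.get(dep, BLACK)  # unknown packages are treated as leaves
            if state == GREY:
                cycles.append(stack[stack.index(dep):] + [dep])
            elif state == WHITE:
                visit(dep)
        stack.pop()
        color[node] = BLACK

    for node in graph:
        if color[node] == WHITE:
            visit(node)
    return cycles


deps = {
    "a": ["b"],
    "b": ["c"],
    "c": ["a"],
    "d": ["a", "not-in-repo"],
    "e": [],
}
print(find_cycles(deps))
```

Running the same function once per architecture and comparing the results gives a per-architecture view of cycles.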

Metadata Comparison

  • License Discrepancies
  • Group Membership Differences
  • Package Base Mismatches

Repository Analysis

  • Repository Differences (wrong repo on aarch64)
  • Repository Mismatches
  • Per-Repository Comparison
  • Repository-Specific Lists (core, extra, forge)

Special Analysis

  • Orphaned Split Packages
  • Architecture-Independent (-any) Package Differences
  • Package Size Differences
  • Advanced Statistics Dashboard

Database Schema

The system automatically creates these tables:

  • packages - Package metadata
  • package_licenses - License assignments
  • package_provides - Virtual packages
  • package_depends - Runtime dependencies
  • package_makedepends - Build-time dependencies
  • package_optdepends - Optional dependencies
  • package_groups - Group memberships
  • package_conflicts - Conflicts
  • package_replaces - Replacements
  • repositories - Repository info
  • import_metadata - Refresh timestamps
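The README lists table names but not their columns. The sketch below therefore invents a minimal column layout (`id`, `name`, `arch`, `repo`, `version`, `package_id`, `depend` are all assumptions) and uses an in-memory SQLite database as a stand-in for MySQL, purely to show the shape of a query behind a report such as "Second Architecture Only Packages".

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL/MariaDB database
conn.executescript("""
CREATE TABLE packages (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    arch TEXT NOT NULL,
    repo TEXT NOT NULL,
    version TEXT NOT NULL
);
CREATE TABLE package_depends (
    package_id INTEGER NOT NULL REFERENCES packages(id),
    depend TEXT NOT NULL
);
""")
conn.executemany(
    "INSERT INTO packages (id, name, arch, repo, version) VALUES (?, ?, ?, ?, ?)",
    [
        (1, "zlib", "aarch64", "core", "1:1.3.1-2"),
        (2, "zlib", "x86_64", "core", "1:1.3.1-2"),
        (3, "intel-ucode", "x86_64", "extra", "20240531-1"),
    ],
)
conn.executemany(
    "INSERT INTO package_depends (package_id, depend) VALUES (?, ?)",
    [(1, "glibc"), (2, "glibc")],
)

# Packages present for the second architecture but not for the first.
only_second = conn.execute(
    """
    SELECT b.name
    FROM packages AS b
    LEFT JOIN packages AS a ON a.name = b.name AND a.arch = ?
    WHERE b.arch = ? AND a.id IS NULL
    ORDER BY b.name
    """,
    ("aarch64", "x86_64"),
).fetchall()
print(only_second)
```

The query uses bound parameters rather than string formatting; with PyMySQL or mysql-connector-python the placeholder is `%s` instead of `?`.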

Performance

  • First load: 7.4 seconds (full computation)
  • Cached load: 17 milliseconds (typical)
  • Cache duration: 1 hour (configurable)
  • Data refresh: 60-90 seconds
  • Total packages: ~28,000 (14,000 per architecture)

Cache automatically clears when load_arch_packages.py runs.

Updating Package Data

# Reload all data
python3 load_arch_packages.py

# The cache clears automatically
# Reports refresh on next access

Schedule regular updates (weekly/monthly recommended):

# Cron job (this entry runs daily at 02:00; use "0 2 * * 0" for a weekly run)
0 2 * * * cd /path/to/aarch64linux-reporting && python3 load_arch_packages.py

Monitoring & Cache Management

Check cache status:

php reporting/clear_cache.php info

Clear cache manually:

php reporting/clear_cache.php clear

Troubleshooting

Database Connection Issues

# Test connection
mysql -u aarch64linux -p aarch64linux -e "SELECT COUNT(*) FROM packages;"

# Check config.ini credentials
grep -A4 "\[database\]" config.ini

# Check environment variables
echo $DB_HOST $DB_USER $DB_NAME

File Permission Errors

# Fix permissions
chmod 755 reporting/
chmod 644 reporting/*.php
chmod 777 reporting/cache

Download Timeout

# Increase timeout in config.ini
[loader]
timeout = 60

Performance Issues

# Check cache status
php reporting/clear_cache.php info

# Verify database indexes exist
mysql -u aarch64linux -p aarch64linux -e "SHOW INDEXES FROM packages;"

# Increase batch_size in config.ini
[loader]
batch_size = 10000

Blank Pages

# Check PHP error log
tail -50 /var/log/php-errors.log
tail -50 /var/log/apache2/error.log

# Enable error display temporarily
# Edit reporting/index.php:
ini_set('display_errors', 1);
error_reporting(E_ALL);

Security

  • SQL Injection: All queries use prepared statements
  • XSS Protection: All output is HTML-escaped
  • No Authentication Required: Read-only database access
  • Environment Variables: Use for production secrets
  • HTTPS Recommended: For production deployments
  • Cache: Not web-accessible
  • File Permissions: Restrict config.ini (chmod 600)

Production recommendations:

  1. Use HTTPS/SSL encryption
  2. Set database credentials via environment variables
  3. Restrict file permissions: chmod 600 config.ini
  4. Keep PHP and database updated
  5. Monitor error logs regularly
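One way to follow recommendation 2 is to let environment variables take precedence over config.ini. The sketch below assumes the variable names `DB_HOST`, `DB_USER` and `DB_NAME` from the Troubleshooting section. `DB_PASSWORD` and the function `database_settings` are hypothetical additions for this example, so check the loader for the names it actually reads.

```python
import configparser
import os


def database_settings(ini_text, environ=os.environ):
    """Merge [database] from config.ini with environment overrides."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser["database"] if parser.has_section("database") else {}
    mapping = {
        "host": "DB_HOST",
        "user": "DB_USER",
        "password": "DB_PASSWORD",  # assumed name, not confirmed by the README
        "database": "DB_NAME",
    }
    # An environment variable, when set, wins over the file value.
    return {key: environ.get(env, section.get(key)) for key, env in mapping.items()}


INI = "[database]\nhost = localhost\nuser = aarch64linux\ndatabase = aarch64linux\n"
settings = database_settings(
    INI,
    environ={"DB_HOST": "db.internal.example.com", "DB_PASSWORD": "s3cret"},
)
print({k: ("***" if k == "password" else v) for k, v in settings.items()})
```

This keeps the password out of config.ini entirely, which matches the production profile where only the host is stored in the file.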

Features

✓ 28+ specialized reports
✓ Multi-architecture comparison (any two architectures at a time)
✓ Supports any architectures: aarch64, x86_64, riscv, s390x, powerpc, sparc, etc.
✓ Cached reports load in milliseconds (see Performance)
✓ Web application has no external PHP dependencies (pure PHP)
✓ Flexible deployment (Apache, Nginx, shared hosting)
✓ Production-ready with security hardening
✓ Configurable via config.ini (no code changes needed)
✓ Environment variable overrides for production
✓ Comprehensive error handling
✓ Smart dependency analysis (runtime + build-time)

Next Steps

  1. Configure: Edit config.ini if needed (default settings work for local testing)
  2. Load Data: Run python3 load_arch_packages.py
  3. Deploy: Choose Apache or Nginx deployment option above
  4. Access: Open the application in your browser
  5. Explore: Review 28+ reports and statistics
  6. Schedule: Set up regular data refreshes

Example Configuration Profiles

Development (local testing)

[database]
host = localhost
user = aarch64linux
password = aarch64linux
database = aarch64linux

[loader]
batch_size = 5000
parallel_downloads = 5
timeout = 30

[cache]
enabled = false

Production (secure)

[database]
host = db.internal.example.com
# Use environment variables for password!

[arch-aarch64]
core = https://internal-mirror.example.com/arch/core/os/aarch64/core.db
extra = https://internal-mirror.example.com/arch/extra/os/aarch64/extra.db

[arch-x86_64]
core = https://internal-mirror.example.com/arch/core/os/x86_64/core.db
extra = https://internal-mirror.example.com/arch/extra/os/x86_64/extra.db

[loader]
batch_size = 10000
parallel_downloads = 8

[cache]
enabled = true
ttl = 3600

Slow Network

[loader]
batch_size = 1000
parallel_downloads = 2
timeout = 60

Support

  • Check error logs in /var/log/apache2/ or /var/log/nginx/
  • Review config.ini for correct settings
  • Verify database is running and accessible
  • Check file permissions (755 dirs, 644 files, 777 cache)
  • Review application logs: tail -f load_arch_packages.log

License

Multi-architecture package reporting system for Arch Linux. Originally developed to support the Arch Linux aarch64 porting initiative, now architecture-agnostic and deployable with any architecture combination.


Ready to start? Follow the "Quick Start" section above. You'll have it running in 5 minutes!
