🌐 PyIM - Python Influence Maximization Package

A Comprehensive Python Package for Influence Maximization, Network Analysis, and Information Diffusion Modeling

Installation • Quick Start • Documentation • Examples • API Reference

📖 Table of Contents

🌟 Overview
✨ Features
📦 Installation
🚀 Quick Start
📚 Documentation
💡 Examples
🔧 Configuration
⚡ Performance
🧪 Testing
🤝 Contributing
📝 Citation
📞 Support

🌟 Overview

PyIM is a Python package for researchers and practitioners working on influence maximization, network analysis, and information diffusion modeling in complex networks. It provides influence maximization algorithms, multiple diffusion models, and evaluation metrics for studying information spread in social networks, biological networks, and other complex systems.

🎯 Key Capabilities

🌐 Network Support: Single and multi-layer networks with full NetworkX integration
🔬 7 Diffusion Models: IC, LT, SI, SIR, SIS, CIC, MLIC with multiprocessing
🧮 5+ Algorithms: Greedy, CELF, Centrality-based, SAW_ASA, SFLA
📈 Evaluation Framework: Comprehensive metrics and benchmarking tools
⚡ Performance Optimized: Multiprocessing support for large-scale networks
🛡️ Robust: Comprehensive error handling and validation

📊 Use Cases

Social Network Analysis: Identify influential users for viral marketing
Epidemiology: Model disease spread and identify key intervention points
Recommendation Systems: Optimize information dissemination
Security: Detect critical nodes for network protection
Biological Networks: Study protein interactions and gene regulation

✨ Features

🌐 Network Support

Single Layer Networks (SLN): Standard graph structures
Multi-Layer Networks (MLN): Multiplex and interconnected networks
Full NetworkX Integration: Seamless compatibility with NetworkX ecosystem
Network Statistics: Comprehensive analysis and metrics
Flexible Operations: Node/edge manipulation, attribute management

📖 Network Module Documentation

🔬 Diffusion Models

Independent Cascade (IC): Classic probabilistic diffusion model
Linear Threshold (LT): Threshold-based activation model
Susceptible-Infected (SI): Simple epidemic model
Susceptible-Infected-Recovered (SIR): Epidemic model with recovery
Susceptible-Infected-Susceptible (SIS): Endemic disease model
Competitive Independent Cascade (CIC): Multi-competitor diffusion
Multi-Layer Independent Cascade (MLIC): Cross-layer diffusion

📖 Diffusion Models Documentation
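
To make the Independent Cascade idea concrete, here is a minimal, library-independent sketch of how a spread estimate can be obtained by Monte Carlo simulation. It is not PyIM's implementation: the helper names (build_adjacency, simulate_ic, estimate_spread) are hypothetical, a single uniform activation probability is assumed, and seed nodes are counted as part of the spread.

```python
import random


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def simulate_ic(adjacency, seeds, p, rng):
    """Run one Independent Cascade simulation and return the activated nodes."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adjacency.get(u, ()):
                # A newly active node gets exactly one chance per inactive neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(adjacency, seeds, p=0.1, mc=1000, seed=0):
    """Average the number of activated nodes (seeds included) over mc runs."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(adjacency, seeds, p, rng)) for _ in range(mc))
    return total / mc


adjacency = build_adjacency([(1, 2), (2, 3), (3, 4), (4, 1), (5, 6)])
print(estimate_spread(adjacency, [1], p=0.1))
```

With p=1.0 the estimate equals the size of the seed's connected component, and with p=0.0 it equals the number of seeds, which gives two quick sanity checks for any implementation.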

🧮 Influence Maximization Algorithms

Greedy Algorithm: Classic greedy approach with a (1-1/e) approximation guarantee for monotone submodular spread functions (such as IC and LT)
Cost-Effective Lazy Forward (CELF): Optimized greedy with lazy evaluation
Centrality-Based Selection: Degree, betweenness, closeness, eigenvector
Simulated Annealing (SAW_ASA): Metaheuristic optimization
Shuffled Frog Leaping (SFLA): Population-based optimization

📖 Algorithms Documentation
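
The lazy evaluation behind CELF can be sketched in a few lines. The version below is an illustration under stated assumptions, not PyIM's CELF class: the function name celf and the spread callable are hypothetical, and spread is assumed to be a deterministic, monotone, submodular set function (a simple coverage count stands in for a diffusion model here).

```python
import heapq


def celf(candidates, spread, k):
    """Pick k seeds with lazy-forward greedy selection.

    `spread` maps a list of seeds to a number. Cached marginal gains are
    upper bounds under submodularity, so only the top entry is re-evaluated.
    """
    # Max-heap via negated gains: (-gain, tiebreak, node, seed count when computed)
    heap = [(-spread([v]), i, v, 0) for i, v in enumerate(candidates)]
    heapq.heapify(heap)
    seeds, current = [], 0.0
    while heap and len(seeds) < k:
        neg_gain, i, v, computed_at = heapq.heappop(heap)
        if computed_at == len(seeds):
            # The gain is fresh for the current seed set, so v is the best choice.
            seeds.append(v)
            current += -neg_gain
        else:
            # Stale gain: recompute against the current seed set and push back.
            gain = spread(seeds + [v]) - current
            heapq.heappush(heap, (-gain, i, v, len(seeds)))
    return seeds


# Toy coverage function standing in for an influence estimate (hypothetical data).
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def coverage(seeds):
    return len(set().union(*(reach[s] for s in seeds)))


print(celf(list(reach), coverage, 2))  # ['c', 'a']
```

With a Monte Carlo spread estimate the gains are noisy, so the lazy and plain greedy selections can differ slightly in practice.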

📈 Evaluation Framework

Influence Spread Metrics: Direct diffusion simulation
Structural Metrics: Distance, clustering, connectivity
Comparative Metrics: Kendall Tau, Jaccard, Spearman
Batch Evaluation: Systematic multi-algorithm comparison
Performance Benchmarking: Timing and memory profiling

📖 Evaluation Documentation
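
Two of the comparative metrics named above are easy to state directly. The sketch below shows Jaccard similarity between two seed sets and Kendall tau between two rankings of the same nodes; the function names are hypothetical and this is not PyIM's metrics API. The Kendall tau variant is tau-a and assumes rankings without ties.

```python
from itertools import combinations


def jaccard(seeds_a, seeds_b):
    """Overlap of two seed sets: |A & B| / |A | B| (1.0 if both are empty)."""
    a, b = set(seeds_a), set(seeds_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def kendall_tau(ranking_a, ranking_b):
    """Kendall tau-a between two tie-free rankings of the same items."""
    if len(ranking_a) < 2:
        raise ValueError("need at least two ranked items")
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    concordant = discordant = 0
    for x, y in combinations(ranking_a, 2):
        # A pair is concordant when both rankings order it the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


print(jaccard([1, 2, 3], [2, 3, 4]))              # 0.5
print(kendall_tau([1, 2, 3, 4], [4, 3, 2, 1]))    # -1.0
```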

⚡ Performance Optimization

Multiprocessing Support: All diffusion models support parallel execution
Performance Monitoring: Built-in timing and profiling
Memory Efficient: Optimized for large-scale networks
Scalable: Tested on networks with 10,000+ nodes
Flexible Configuration: Tune performance for your hardware

📖 Performance Guide

🛡️ Robust Error Handling

Comprehensive Exception Hierarchy: Clear error categorization
Informative Error Messages: Easy debugging and troubleshooting
Input Validation: Prevent common mistakes
Consistent Patterns: Uniform error handling across modules

📖 Error Handling Guide

📦 Installation

🎯 Prerequisites

Python: 3.7 or higher
pip: Python package manager
Operating System: Windows, macOS, or Linux

📥 Installation Methods

Method 1: Install from PyPI (Recommended)

pip install PyIM

Method 2: Install from Source

# Clone the repository
git clone https://github.com/yourusername/PyIM.git
cd PyIM

# Install dependencies
pip install -r PyIM/requirements.txt

# Install the package
pip install -e .

Method 3: Manual Dependency Installation

# Quote the version specifiers so the shell does not treat ">" as output redirection
pip install "networkx>=2.8.0"
pip install "numpy>=1.21.0"
pip install "rich>=12.0.0"
pip install "psutil>=5.9.0"

✅ Verification

Verify installation by running:

import PyIM
print(f"PyIM version: {PyIM.__version__}")
print(f"PyIM author: {PyIM.__author__}")

Expected output:

PyIM version: 1.0.0
PyIM author: PyIM Team

🚀 Quick Start

📝 Basic Usage Example

from PyIM import SLN, IC, Greedy
from PyIM.diffusionModel import ICWeighter

# Step 1: Create a network
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (5, 6)]
network = SLN("example_network", "undirected", edges=edges)

# Step 2: Set edge weights for diffusion
weighter = ICWeighter(weighting_type="uniform", active_probability=0.1)
weighter(network)

# Step 3: Create diffusion model
ic_model = IC(MC=1000, verbose=False)

# Step 4: Create influence maximization algorithm
greedy = Greedy(ic_model, verbose=False)

# Step 5: Find optimal seed set
seed_nodes = greedy(network, k=3)

# Step 6: Evaluate influence spread
influence = ic_model(network, seed_nodes)

# Step 7: Display results
print(f"📊 Network Statistics:")
print(f"   Nodes: {network.number_of_nodes()}")
print(f"   Edges: {network.number_of_edges()}")
print(f"   Density: {network.get_network_statistics()['density']:.4f}")
print(f"\n🎯 Influence Maximization Results:")
print(f"   Seed nodes: {seed_nodes}")
print(f"   Influence spread: {influence:.2f}")
print(f"   Coverage: {influence/network.number_of_nodes()*100:.1f}%")

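Expected Output (seed nodes and influence values vary between runs because the spread is estimated by Monte Carlo simulation):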

📊 Network Statistics:
   Nodes: 6
   Edges: 5
   Density: 0.3333

🎯 Influence Maximization Results:
   Seed nodes: [1, 2, 3]
   Influence spread: 2.45
   Coverage: 40.8%
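
The density figure in this output can be checked by hand. The sketch below assumes the standard undirected density formula 2m / (n(n-1)), which reproduces the 0.3333 shown above; the helper name is hypothetical and not part of PyIM.

```python
def undirected_density(n_nodes, n_edges):
    """Fraction of possible undirected edges that are present: 2m / (n(n-1))."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


edges = [(1, 2), (2, 3), (3, 4), (4, 1), (5, 6)]
nodes = {node for edge in edges for node in edge}
density = undirected_density(len(nodes), len(edges))
print(f"{density:.4f}")  # 0.3333
```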

🔧 Advanced Example with Multiprocessing

from PyIM import SLN
from PyIM.diffusionModel import IC, ICWeighter
from PyIM.algorithm import Greedy
import time

# Create a larger network
edges = [(i, i+1) for i in range(1, 100)] + [(1, 50), (50, 100)]
network = SLN("large_network", "undirected", edges=edges)

# Set weights
weighter = ICWeighter(weighting_type="uniform", active_probability=0.05)
weighter(network)

# Create model with multiprocessing
ic_model = IC(MC=10000, verbose=True, enable_monitoring=True)

# Test single process (perf_counter is a monotonic clock suited to timing)
start = time.perf_counter()
result_single = ic_model(network, [1, 2, 3], multiprocess=False)
time_single = time.perf_counter() - start

# Test multiprocessing
start = time.perf_counter()
result_multi = ic_model(network, [1, 2, 3], multiprocess=True, n_processes=4)
time_multi = time.perf_counter() - start

# Compare results
print(f"\n⚡ Performance Comparison:")
print(f"   Single process: {time_single:.2f}s, Influence: {result_single:.2f}")
print(f"   Multiprocess:    {time_multi:.2f}s, Influence: {result_multi:.2f}")
print(f"   Speedup:         {time_single/time_multi:.2f}x")

# Get performance metrics
if ic_model._performance_monitor:
    summary = ic_model._performance_monitor.get_summary()
    print(f"\n📊 Performance Metrics: {summary}")

📚 Documentation

📖 Module Documentation

Module	Description	Documentation
🌐 Network	Single and multi-layer network classes	Network Module
🔬 Diffusion Models	Information diffusion simulation	Diffusion Models
🧮 Algorithms	Influence maximization algorithms	Algorithms
📈 Evaluation	Performance evaluation framework	Evaluation
📊 Dataset	Dataset management and loading	Dataset
⚙️ Configuration	Configuration management	Configuration
🛠️ Utils	Utility functions and helpers	Utilities
⚠️ Error Handling	Exception handling guide	Error Handling
⚡ Performance	Performance optimization guide	Performance

📚 Getting Started Guides

🚀 Quick Start Guide - Get up and running in 5 minutes
📖 Tutorial Series - Step-by-step tutorials
💡 Best Practices - Recommended patterns and approaches
🔧 Advanced Usage - Advanced features and techniques

📊 Reference Documentation

📋 API Reference - Complete API documentation
📝 Examples Gallery - Collection of usage examples
🧪 Testing Guide - How to run and write tests
🤝 Contributing Guide - Contribution guidelines

💡 Examples

🌐 Example 1: Network Creation and Analysis

from PyIM import SLN, MLN
from PyIM.network import load_networks

# Create a single layer network
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (5, 6)]
sln = SLN("social_network", "undirected", edges=edges)

# Analyze network
stats = sln.get_network_statistics()
print(f"📊 Network Analysis:")
print(f"   Name: {stats['name']}")
print(f"   Nodes: {stats['nodes']}")
print(f"   Edges: {stats['edges']}")
print(f"   Density: {stats['density']:.4f}")
print(f"   Average Degree: {stats['avg_degree']:.2f}")

# Create a multi-layer network
layer1_edges = [(1, 2), (2, 3), (3, 4)]
layer2_edges = [(1, 3), (3, 5), (5, 2)]
layer3_edges = [(2, 4), (4, 5), (5, 1)]
edges_of_layers = [layer1_edges, layer2_edges, layer3_edges]
mln = MLN("multiplex_network", "undirected", edges_of_layers=edges_of_layers)

print(f"\n🌐 Multi-Layer Network:")
print(f"   Layers: {mln.number_of_layers()}")
print(f"   Total Nodes: {mln.number_of_nodes()}")
print(f"   Total Edges: {mln.number_of_edges()}")

# Load multiple networks from specifications
network_specs = [
    {
        'name': 'network1',
        'type': 'SLN',
        'directionality': 'undirected',
        'edges': [(1, 2), (2, 3), (3, 4)]
    },
    {
        'name': 'network2',
        'type': 'SLN',
        'directionality': 'directed',
        'edges': [(1, 2), (2, 3), (3, 1)]
    }
]

networks = load_networks(network_specs)
print(f"\n📦 Loaded {len(networks)} networks:")
for net in networks:
    print(f"   - {net.name}: {net.number_of_nodes()} nodes, {net.number_of_edges()} edges")

🔬 Example 2: Diffusion Model Comparison

from PyIM import SLN
from PyIM.diffusionModel import IC, LT, SIR, ICWeighter, LTWeighter, SIRWeighter
import matplotlib.pyplot as plt  # optional: only needed for the commented-out plot at the end

# Create network
edges = [(i, i+1) for i in range(1, 20)] + [(1, 10), (10, 20)]
network = SLN("test_network", "undirected", edges=edges)

# Set weights for different models
ic_weighter = ICWeighter(weighting_type="uniform", active_probability=0.1)
ic_weighter(network)

lt_weighter = LTWeighter(weighting_type="uniform", threshold=0.3)
lt_weighter(network)

sir_weighter = SIRWeighter(weighting_type="uniform", infection_prob=0.1, recovery_prob=0.05)
sir_weighter(network)

# Create models
ic_model = IC(MC=1000, verbose=False)
lt_model = LT(MC=1000, verbose=False)
sir_model = SIR(MC=1000, verbose=False)

# Test different seed sets
seed_sets = [[1], [1, 5], [1, 5, 10], [1, 5, 10, 15]]

print("📊 Diffusion Model Comparison:")
print("-" * 60)

for seeds in seed_sets:
    ic_result = ic_model(network, seeds)
    lt_result = lt_model(network, seeds)
    sir_result = sir_model(network, seeds)

    print(f"Seeds {seeds}:")
    print(f"  IC:  {ic_result:.2f} activated nodes")
    print(f"  LT:  {lt_result:.2f} activated nodes")
    print(f"  SIR: {sir_result:.2f} activated nodes")
    print()

# Plot results (optional; requires matplotlib, which is not in PyIM's listed dependencies)
# sizes = [len(s) for s in seed_sets]  # x-axis: actual seed set sizes
# plt.figure(figsize=(10, 6))
# plt.plot(sizes, [ic_model(network, s) for s in seed_sets], 'o-', label='IC')
# plt.plot(sizes, [lt_model(network, s) for s in seed_sets], 's-', label='LT')
# plt.plot(sizes, [sir_model(network, s) for s in seed_sets], '^-', label='SIR')
# plt.xlabel('Seed Set Size')
# plt.ylabel('Influence Spread')
# plt.title('Diffusion Model Comparison')
# plt.legend()
# plt.grid(True)
# plt.show()

🧮 Example 3: Algorithm Comparison

from PyIM import SLN
from PyIM.diffusionModel import IC, ICWeighter
from PyIM.algorithm import Greedy, CELF, CentralitySeedSelector
import time

# Create network
edges = [(i, i+1) for i in range(1, 50)] + [(1, 25), (25, 50)]
network = SLN("algorithm_test", "undirected", edges=edges)

# Set weights
weighter = ICWeighter(weighting_type="uniform", active_probability=0.1)
weighter(network)

# Create diffusion model
ic_model = IC(MC=1000, verbose=False)

# Create algorithms
algorithms = {
    'Greedy': Greedy(ic_model, verbose=False),
    'CELF': CELF(ic_model, verbose=False),
    'Degree': CentralitySeedSelector(centrality_type='degree'),
    'Betweenness': CentralitySeedSelector(centrality_type='betweenness'),
    'Closeness': CentralitySeedSelector(centrality_type='closeness')
}

# Test different k values
k_values = [5, 10, 15, 20]

print("🧮 Algorithm Performance Comparison")
print("=" * 70)

for k in k_values:
    print(f"\n📊 k = {k}:")
    print("-" * 70)
    
    results = {}
    times = {}
    
    for name, algorithm in algorithms.items():
        start = time.perf_counter()
        seeds = algorithm(network, k=k)
        elapsed = time.perf_counter() - start
        
        # Evaluate influence
        influence = ic_model(network, seeds)
        
        results[name] = influence
        times[name] = elapsed
        
        print(f"{name:15s}: Influence = {influence:6.2f}, Time = {elapsed:6.3f}s")
    
    # Find best algorithm
    best_algorithm = max(results, key=results.get)
    fastest_algorithm = min(times, key=times.get)
    
    print(f"\n🏆 Best Influence: {best_algorithm} ({results[best_algorithm]:.2f})")
    print(f"⚡ Fastest: {fastest_algorithm} ({times[fastest_algorithm]:.3f}s)")

📈 Example 4: Comprehensive Evaluation

from PyIM import SLN
from PyIM.diffusionModel import IC, ICWeighter
from PyIM.algorithm import Greedy, CentralitySeedSelector
from PyIM.evaluation import Evaluation
from PyIM.evaluation.metrics import InfluenceSpread, Distance, ClusteringCoefficient

# Create multiple networks
networks = [
    SLN("network1", "undirected", edges=[(i, i+1) for i in range(1, 20)]),
    SLN("network2", "undirected", edges=[(i, i+1) for i in range(1, 30)]),
    SLN("network3", "undirected", edges=[(i, i+1) for i in range(1, 40)])
]

# Set weights for all networks
weighter = ICWeighter(weighting_type="uniform", active_probability=0.1)
for network in networks:
    weighter(network)

# Create diffusion model
ic_model = IC(MC=1000, verbose=False)

# Create metrics
metrics = [
    InfluenceSpread(ic_model),
    Distance(),
    ClusteringCoefficient()
]

# Create algorithms
algorithms = [
    Greedy(ic_model, verbose=False),
    CentralitySeedSelector(centrality_type='degree')
]

# Create evaluation framework
evaluator = Evaluation(
    networks=networks,
    metrics=metrics,
    algorithms=algorithms,
    k_range=[5, 10, 15],
    verbose=True
)

# Run evaluation
print("📈 Running Comprehensive Evaluation...")
print("=" * 70)
results = evaluator.run_evaluation()

# Display results
print("\n📊 Evaluation Results:")
print("=" * 70)

for metric_name, metric_results in results.items():
    print(f"\n📏 {metric_name}:")
    for network_name, network_results in metric_results.items():
        print(f"   {network_name}:")
        for algo_name, algo_results in network_results.items():
            print(f"      {algo_name}: {algo_results}")

🔧 Configuration

⚙️ Environment Variables

Configure PyIM using environment variables:

# Data directories
export PYIM_DATA_DIR="/path/to/data"
export PYIM_DOWNLOAD_DIR="/path/to/downloads"
export PYIM_TEMP_DIR="/path/to/temp"

# Logging
export PYIM_LOG_LEVEL="INFO"
export PYIM_LOG_FORMAT="%(asctime)s - %(name)s - %(levelname)s - %(message)s"

# Performance
export PYIM_MAX_WORKERS="4"
export PYIM_CACHE_ENABLED="true"
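
For scripts that want to inspect these settings directly, the variables can be read with the standard library. This is only a sketch: the function name read_pyim_env and the fallback values are assumptions for illustration, and PyIM's own parsing of these variables may differ.

```python
import os


def read_pyim_env(environ=os.environ):
    """Collect PyIM-related settings from an environment mapping.

    The fallback values below are illustrative assumptions, not PyIM defaults.
    """
    return {
        "data_dir": environ.get("PYIM_DATA_DIR"),
        "log_level": environ.get("PYIM_LOG_LEVEL", "INFO"),
        "max_workers": int(environ.get("PYIM_MAX_WORKERS", "4")),
        "cache_enabled": environ.get("PYIM_CACHE_ENABLED", "true").lower() == "true",
    }


# Passing a plain dict instead of os.environ keeps the example reproducible.
settings = read_pyim_env({"PYIM_MAX_WORKERS": "8", "PYIM_CACHE_ENABLED": "false"})
print(settings)
```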

📝 Configuration File

Create a config.json file:

{
  "data_dir": "/path/to/data",
  "download_dir": "/path/to/downloads",
  "temp_dir": "/path/to/temp",
  "log_level": "INFO",
  "max_workers": 4,
  "cache_enabled": true,
  "validation_strict": false
}

Load configuration:

from PyIM.config import PyIMConfig

# Load from file
config = PyIMConfig.from_file("config.json")

# Or use global configuration
from PyIM.config import get_config
config = get_config()

# Access configuration
print(f"Data directory: {config.data_dir}")
print(f"Log level: {config.log_level}")
print(f"Max workers: {config.max_workers}")

⚡ Performance

🚀 Multiprocessing

All diffusion models support multiprocessing for improved performance:

from PyIM import SLN
from PyIM.diffusionModel import IC, ICWeighter

# Create network
network = SLN("large_network", "undirected", edges=[(i, i+1) for i in range(1, 1000)])

# Set weights
weighter = ICWeighter(weighting_type="uniform", active_probability=0.01)
weighter(network)

# Create model
ic_model = IC(MC=10000, verbose=True)

# Use multiprocessing (recommended for large networks)
result = ic_model(network, seed_nodes=[1, 2, 3], multiprocess=True, n_processes=4)

# Or use single process (for small networks)
result = ic_model(network, seed_nodes=[1, 2, 3], multiprocess=False)

📊 Performance Monitoring

Enable performance monitoring to track execution time:

from PyIM.diffusionModel import IC

# Create model with monitoring enabled
ic_model = IC(MC=1000, enable_monitoring=True, verbose=False)

# Run simulation (reuses the weighted `network` built in the multiprocessing example above)
result = ic_model(network, seed_nodes=[1, 2, 3])

# Get performance metrics
if ic_model._performance_monitor:
    summary = ic_model._performance_monitor.get_summary()
    print(f"Performance: {summary}")

💡 Performance Tips

Use multiprocessing for large networks (100+ nodes)
Adjust MC parameter based on accuracy requirements
Choose appropriate diffusion model for your application
Use CELF for faster greedy approximation
Enable performance monitoring for optimization
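
When tuning MC, it helps to remember that the standard error of a Monte Carlo mean shrinks with the square root of the number of runs, so halving the error costs roughly four times as many simulations. The sketch below illustrates that relationship with the standard library; the helper names are hypothetical and the per-run spread values are made-up sample data.

```python
import math
import statistics


def mean_and_standard_error(samples):
    """Return the sample mean and its standard error, s / sqrt(n)."""
    mean = statistics.mean(samples)
    standard_error = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, standard_error


def runs_needed(sample_std, target_standard_error):
    """Number of runs so that sample_std / sqrt(runs) <= target_standard_error."""
    return math.ceil((sample_std / target_standard_error) ** 2)


# Made-up per-run spread values from a small pilot run.
pilot = [3, 4, 3, 5, 4, 3, 6, 4]
mean, se = mean_and_standard_error(pilot)
print(f"pilot mean = {mean:.2f}, standard error = {se:.2f}")

# Halving the target error quadruples the required number of runs.
print(runs_needed(2.0, 0.5), runs_needed(2.0, 0.25))  # 16 64
```

A small pilot run gives the sample standard deviation, which can then be plugged into runs_needed to pick an MC value for a desired precision.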

🧪 Testing

🏃 Running Tests

Run the comprehensive test suite:

# Run all tests
python tests/test_all_modules.py

# Or use the test runner
python run_tests.py

📊 Test Coverage

The test suite includes:

✅ Network Module (8 tests): Creation, operations, statistics
✅ Dataset Module (3 tests): Loading, management, error handling
✅ Diffusion Models (3 tests): IC model, multiprocessing
✅ Algorithm Module (2 tests): Greedy, Centrality
✅ Error Handling (4 tests): All exception types
✅ Configuration (2 tests): Config loading, constants
✅ Integration Tests (1 test): Complete workflows

Total: 23 tests with 100% success rate

🔍 Running Specific Tests

# Run specific test class
python -m unittest tests.test_all_modules.TestNetworkModule

# Run specific test method
python -m unittest tests.test_all_modules.TestNetworkModule.test_sln_creation

# Run with verbose output
python -m unittest tests.test_all_modules -v

🤝 Contributing

We welcome contributions to PyIM! Please follow these guidelines:

📝 Code Style

Follow PEP 8 style guidelines
Use meaningful variable and function names
Add docstrings to all public functions and classes
Include type hints where appropriate

🧪 Testing

Write unit tests for new features
Ensure all tests pass before submitting
Test on multiple Python versions (3.7+)
Include edge cases and error conditions

📚 Documentation

Update documentation for new features
Include usage examples
Maintain API documentation
Update CHANGELOG.md

🚀 Pull Requests

Fork the repository
Create a feature branch
Make your changes with clear commit messages
Submit a pull request with description

📝 Citation

If you use PyIM in your research, please cite:

@software{pyim2024,
  title = {PyIM: Python Influence Maximization Package},
  author = {PyIM Team},
  year = {2024},
  version = {1.0.0},
  url = {https://github.com/yourusername/PyIM}
}

📞 Support

📖 Documentation

Main Documentation: See individual module documentation
API Reference: API Reference
Examples: Examples Gallery
Tutorials: Tutorial Series

💬 Community

Issues: Report bugs and request features via GitHub Issues
Discussions: Join our GitHub Discussions
Wiki: Check our Wiki for additional resources

📧 Contact

Email: support@pyim.org
Website: https://pyim.org
GitHub: https://github.com/yourusername/PyIM

📜 License

PyIM is licensed under the MIT License. See LICENSE file for details.

Made with ❤️ by the PyIM Team
