lam-dan/concurrent-programming

Concurrent Programming Examples

A collection of Python examples demonstrating various concurrent programming patterns, including thread pools, producer-consumer patterns, and web scraping with concurrent requests.

🚀 Features

  • S&P 500 Stock Price Scraper: Concurrent web scraping of stock prices from Yahoo Finance
  • Thread Pool Architecture: Queue-based work distribution across multiple worker threads
  • Error Handling: Robust error handling for network issues, rate limiting, and data parsing
  • Multiple Worker Types: Examples of different threading patterns and use cases

📁 Project Structure

concurrent-programs/
├── main.py                          # Main application with queue-based architecture
├── workers/
│   ├── YahooFinanceWorkers.py       # Stock price fetching with threading
│   ├── WikiWorker.py                # S&P 500 symbol scraping from Wikipedia
│   ├── SleepyWorkers.py             # Sleep-based threading example
│   └── SquareSumWorkers.py          # Mathematical computation threading
├── venv/                            # Virtual environment
└── README.md                        # This file

🏗️ Architecture

Main Application (main.py)

The main application demonstrates a producer-consumer pattern with multiple worker threads:

  1. Producer: Fetches S&P 500 company symbols from Wikipedia
  2. Queue: Distributes symbols to multiple consumer threads
  3. Consumers: 4 concurrent scheduler threads that process symbols
  4. Workers: Each symbol gets its own YahooFinanceWorker for HTTP requests

# Create 4 scheduler threads that all consume from the shared symbol queue
num_yahoo_finance_price_workers = 4
yahoo_finance_price_scheduler_threads = []
for i in range(num_yahoo_finance_price_workers):
    yahooFinancePriceScheduler = YahooFinancePriceScheduler(input_queue=symbol_queue)
    yahoo_finance_price_scheduler_threads.append(yahooFinancePriceScheduler)
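
The producer-consumer flow described above can be sketched with only the standard library. This is a minimal illustration under stated assumptions, not the repository's code: `fetch_price` is a hypothetical stand-in for the Yahoo Finance request, the symbols are passed in directly instead of being scraped, and enqueuing one `None` sentinel per consumer is just one assumed way to signal shutdown.

```python
import queue
import threading

NUM_CONSUMERS = 4
SENTINEL = None  # assumed shutdown signal; one is enqueued per consumer


def fetch_price(symbol):
    """Hypothetical stand-in for an HTTP request; returns a fake price."""
    return float(len(symbol))


def consume(symbol_queue, results, lock):
    """Pull symbols off the queue until the sentinel arrives."""
    while True:
        symbol = symbol_queue.get()
        if symbol is SENTINEL:
            break
        price = fetch_price(symbol)
        with lock:  # protect the shared results dict
            results[symbol] = price


def run(symbols):
    """Distribute symbols to the consumer threads and collect the prices."""
    symbol_queue = queue.Queue()
    results = {}
    lock = threading.Lock()

    consumers = [
        threading.Thread(target=consume, args=(symbol_queue, results, lock))
        for _ in range(NUM_CONSUMERS)
    ]
    for thread in consumers:
        thread.start()

    # Producer: enqueue the work, then one sentinel per consumer.
    for symbol in symbols:
        symbol_queue.put(symbol)
    for _ in consumers:
        symbol_queue.put(SENTINEL)

    # Wait for every consumer to finish before reading the results.
    for thread in consumers:
        thread.join()
    return results


if __name__ == "__main__":
    print(run(["AAPL", "MSFT", "GOOG", "AMZN", "META"]))
```

Because `queue.Queue` is thread-safe, whichever consumer is idle picks up the next symbol, so no explicit assignment of symbols to threads is needed.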

Worker Classes

YahooFinanceWorker

  • Purpose: Fetches stock prices from Yahoo Finance
  • Features:
    • Automatic thread management
    • Robust error handling for network issues
    • Gzip compression handling
    • Rate limiting protection
    • Price parsing with comma handling
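
As an illustration of the comma handling mentioned above, here is a minimal sketch of turning scraped price text into a number. The function name `parse_price` and the choice to return `None` on unparseable text are assumptions for this example; the actual worker may behave differently.

```python
def parse_price(raw_text):
    """Convert scraped price text such as '1,234.56' to a float.

    Returns None when the text cannot be parsed (an assumed convention).
    """
    # Remove surrounding whitespace and thousands separators.
    cleaned = raw_text.strip().replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


if __name__ == "__main__":
    print(parse_price("1,234.56"))
    print(parse_price("N/A"))
```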

WikiWorker

  • Purpose: Scrapes S&P 500 company symbols from Wikipedia
  • Features:
    • BeautifulSoup HTML parsing
    • User-Agent header for proper access
    • Generator-based symbol extraction

SleepyWorker

  • Purpose: Demonstrates basic threading with sleep operations
  • Use Case: Testing thread behavior and timing
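
A sleep-based worker makes thread timing easy to observe. The sketch below is an assumption about what such a worker might look like (`SleepyThread` and `time_sleepers` are hypothetical names, not the repository's classes): several threads sleep at the same time, so the total wall-clock time is close to the longest sleep instead of the sum of all sleeps.

```python
import threading
import time


class SleepyThread(threading.Thread):
    """Hypothetical worker that sleeps for a fixed number of seconds."""

    def __init__(self, seconds):
        super().__init__()
        self.seconds = seconds

    def run(self):
        time.sleep(self.seconds)


def time_sleepers(durations):
    """Start one thread per duration and return the total wall-clock time."""
    start = time.perf_counter()
    threads = [SleepyThread(seconds) for seconds in durations]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for every sleeper before stopping the clock
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = time_sleepers([0.2, 0.2, 0.2])
    print(f"Sleeping took: {elapsed:.2f}")
```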

SquareSumWorker

  • Purpose: CPU-intensive mathematical computations
  • Use Case: Demonstrating threading for computational tasks
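
A computational worker can be sketched as a thread that stores its result on itself. The class below is hypothetical and assumes the task is summing the squares of the integers below `n`. Note that in standard CPython builds the global interpreter lock generally keeps CPU-bound threads from running Python code in parallel, so a pattern like this mainly illustrates the threading API, not a speedup.

```python
import threading


class SquareSumThread(threading.Thread):
    """Hypothetical worker that computes 0^2 + 1^2 + ... + (n-1)^2."""

    def __init__(self, n):
        super().__init__()
        self.n = n
        self.result = None  # filled in by run()

    def run(self):
        self.result = sum(i * i for i in range(self.n))


def square_sums(values):
    """Run one thread per input value and return the results in order."""
    threads = [SquareSumThread(n) for n in values]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # results are only safe to read after join()
    return [thread.result for thread in threads]


if __name__ == "__main__":
    print(square_sums([4, 10]))
```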

🛠️ Installation

  1. Clone the repository:

    git clone https://github.com/lam-dan/concurrent-programming.git
    cd concurrent-programming
  2. Create and activate virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install requests beautifulsoup4 lxml

🚀 Usage

Run the Main Application

python main.py

This will:

  1. Fetch all S&P 500 company symbols from Wikipedia
  2. Create 4 concurrent scheduler threads
  3. Process symbols concurrently, fetching a stock price for each
  4. Display results and timing information

Expected Output

154.0
72.54
131.28
208.72
...
Sleeping took: 15.8

🔧 Configuration

Adjusting Concurrency

Modify the number of worker threads in main.py:

num_yahoo_finance_price_workers = 4  # Change this number

Rate Limiting

The application includes built-in rate limiting to avoid overwhelming Yahoo Finance:

  • Random delays between requests
  • Proper User-Agent headers
  • Error handling for 429 (Too Many Requests) responses
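
One way to combine random delays with 429 handling is a retry loop with jittered backoff. The sketch below is an assumption, since the source does not say how the delays are computed: `fetch` is a hypothetical callable returning a `(status_code, body)` pair, and the retry count and delay values are illustrative.

```python
import random
import time


def fetch_with_backoff(fetch, symbol, max_attempts=3, base_delay=0.5,
                       sleep=time.sleep):
    """Call fetch(symbol), retrying on HTTP 429 with a randomized delay.

    `fetch` is a hypothetical callable returning (status_code, body).
    Returns the body on a 200 response, or None if another error status
    comes back or every attempt is rate limited.
    """
    for attempt in range(max_attempts):
        status, body = fetch(symbol)
        if status == 200:
            return body
        if status != 429:
            return None  # e.g. 404: retrying will not help
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter so threads do not retry in lockstep.
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            sleep(delay)
    return None


if __name__ == "__main__":
    responses = iter([(429, None), (200, "154.0")])
    print(fetch_with_backoff(lambda symbol: next(responses), "AAPL"))
```

Passing `sleep` as a parameter is a design choice for this sketch: it lets the delay logic be exercised without actually waiting.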

🐛 Error Handling

The application handles various error scenarios:

  • 404 Errors: Symbols not found on Yahoo Finance
  • Rate Limiting: 429 responses from Yahoo Finance
  • Network Issues: Connection timeouts and failures
  • Data Parsing: Malformed price data and gzip compression issues
  • Invalid Symbols: Companies that have been delisted or renamed
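
In threaded code, an uncaught exception ends the thread that raised it, so errors are usually caught per item. The following is a minimal sketch of that idea under assumptions: `SymbolNotFound` and `fake_fetch` are hypothetical stand-ins for a 404 response and the real HTTP request, and failures are collected on a separate queue.

```python
import queue
import threading


class SymbolNotFound(Exception):
    """Hypothetical error representing a 404 response."""


def fake_fetch(symbol):
    """Hypothetical fetcher that fails for some inputs."""
    if symbol == "GONE":
        raise SymbolNotFound(symbol)
    if symbol == "BAD":
        raise ValueError("malformed price")
    return 100.0


def worker(tasks, prices, errors):
    """Process symbols until the task queue is empty, recording failures."""
    while True:
        try:
            symbol = tasks.get_nowait()
        except queue.Empty:
            return
        try:
            prices.put((symbol, fake_fetch(symbol)))
        except (SymbolNotFound, ValueError) as exc:
            # Record the failure and keep going with the next symbol.
            errors.put((symbol, type(exc).__name__))


def drain(q):
    """Empty a queue into a list; only called after all threads have joined."""
    items = []
    while not q.empty():
        items.append(q.get())
    return items


def process(symbols, num_threads=2):
    tasks, prices, errors = queue.Queue(), queue.Queue(), queue.Queue()
    for symbol in symbols:
        tasks.put(symbol)
    threads = [
        threading.Thread(target=worker, args=(tasks, prices, errors))
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return dict(drain(prices)), dict(drain(errors))


if __name__ == "__main__":
    print(process(["AAPL", "GONE", "BAD", "MSFT"]))
```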

🧪 Testing Individual Components

Test WikiWorker

from workers.WikiWorker import WikiWorker

wikiWorker = WikiWorker()
symbols = list(wikiWorker.get_sp_500_copmanies())
print(f"Found {len(symbols)} symbols")
print("First 5 symbols:", symbols[:5])

Test YahooFinanceWorker

from workers.YahooFinanceWorkers import YahooFinanceWorker

worker = YahooFinanceWorker("AAPL")
price = worker.get_price()
print(f"AAPL: ${price}")

📊 Performance

  • Concurrent Processing: 4 threads processing symbols simultaneously
  • Queue-based Distribution: Automatic load balancing across workers
  • Efficient I/O: HTTP requests run in separate threads with timeouts, so waiting on one response does not stall the others
  • Memory Efficient: Generator-based symbol processing

🔍 Key Concepts Demonstrated

  1. Thread Pools: Multiple worker threads sharing a common task queue
  2. Producer-Consumer Pattern: Main thread produces work, workers consume it
  3. Queue-based Communication: Thread-safe work distribution
  4. Error Handling in Concurrent Code: Robust error management across threads
  5. Web Scraping Best Practices: Rate limiting, headers, and error handling
  6. Thread Synchronization: Using join() to wait for thread completion

🤝 Contributing

Feel free to submit issues, feature requests, or pull requests to improve this project!

📝 License

This project is open source and available under the MIT License.

🙏 Acknowledgments

  • Yahoo Finance for providing stock price data
  • Wikipedia for S&P 500 company listings
  • Python threading library for concurrent programming capabilities
