A collection of Python examples demonstrating various concurrent programming patterns, including thread pools, producer-consumer patterns, and web scraping with concurrent requests.
- S&P 500 Stock Price Scraper: Concurrent web scraping of stock prices from Yahoo Finance
- Thread Pool Architecture: Queue-based work distribution across multiple worker threads
- Error Handling: Robust error handling for network issues, rate limiting, and data parsing
- Multiple Worker Types: Examples of different threading patterns and use cases
concurrent-programs/
├── main.py                      # Main application with queue-based architecture
├── workers/
│   ├── YahooFinanceWorkers.py   # Stock price fetching with threading
│   ├── WikiWorker.py            # S&P 500 symbol scraping from Wikipedia
│   ├── SleepyWorkers.py         # Sleep-based threading example
│   └── SquareSumWorkers.py      # Mathematical computation threading
├── venv/                        # Virtual environment
└── README.md                    # This file
The main application demonstrates a producer-consumer pattern with multiple worker threads:
- Producer: Fetches S&P 500 company symbols from Wikipedia
- Queue: Distributes symbols to multiple consumer threads
- Consumers: 4 concurrent scheduler threads that process symbols
- Workers: Each symbol gets its own YahooFinanceWorker for HTTP requests
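The producer-consumer flow described above can be sketched with only the standard library. This is a minimal illustration, not the project's actual code: `fetch_price`, `consumer`, `run`, and the `DONE` sentinel are hypothetical names, and the real HTTP request is replaced by a stand-in so the example runs offline.

```python
import queue
import threading

DONE = object()  # sentinel telling a consumer to stop (an assumption of this sketch)


def fetch_price(symbol):
    """Hypothetical stand-in for the real HTTP request to Yahoo Finance."""
    return float(len(symbol))


def consumer(symbol_queue, results, lock):
    """Take symbols off the shared queue until the sentinel arrives."""
    while True:
        symbol = symbol_queue.get()
        if symbol is DONE:
            break
        price = fetch_price(symbol)
        with lock:  # guard the shared results dict
            results[symbol] = price


def run(symbols, num_workers=4):
    symbol_queue = queue.Queue()
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=consumer, args=(symbol_queue, results, lock))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for symbol in symbols:  # producer: enqueue the work
        symbol_queue.put(symbol)
    for _ in threads:  # one sentinel per consumer
        symbol_queue.put(DONE)
    for thread in threads:  # wait for every consumer to finish
        thread.join()
    return results


if __name__ == "__main__":
    print(run(["AAPL", "MSFT", "GOOG"]))
```

Because every consumer reads from the same queue, a thread that finishes early simply picks up the next symbol, which is how the work ends up balanced across workers.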
    # 4 concurrent scheduler threads
    num_yahoo_finance_price_workers = 4
    for i in range(num_yahoo_finance_price_workers):
        yahooFinancePriceScheduler = YahooFinancePriceScheduler(input_queue=symbol_queue)
        yahoo_finance_price_scheduler_threads.append(yahooFinancePriceScheduler)
workers/YahooFinanceWorkers.py
- Purpose: Fetches stock prices from Yahoo Finance
- Features:
  - Automatic thread management
  - Robust error handling for network issues
  - Gzip compression handling
  - Rate limiting protection
  - Price parsing with comma handling
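To illustrate what "price parsing with comma handling" might involve, here is a minimal sketch. The function name `parse_price` and the choice to return `None` for unparseable input are assumptions of this example, not taken from the project's code.

```python
def parse_price(raw):
    """Convert a scraped price string such as '1,234.56' to a float.

    Returns None when the text is not a number (for example 'N/A').
    """
    cleaned = raw.strip().replace(",", "")  # drop thousands separators
    try:
        return float(cleaned)
    except ValueError:
        return None


if __name__ == "__main__":
    print(parse_price("1,234.56"))
    print(parse_price("N/A"))
```

Returning `None` instead of raising lets a worker thread log the bad symbol and move on to the next one.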
workers/WikiWorker.py
- Purpose: Scrapes S&P 500 company symbols from Wikipedia
- Features:
  - BeautifulSoup HTML parsing
  - User-Agent header for proper access
  - Generator-based symbol extraction
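Generator-based extraction can be sketched as follows. To keep the example runnable without third-party packages, it uses the standard library's `html.parser` in place of BeautifulSoup and parses a small inline table rather than the live Wikipedia page. `FirstCellParser`, `iter_symbols`, and `SAMPLE_HTML` are hypothetical names, and the assumption that the symbol sits in the first cell of each row is made only for this illustration.

```python
from html.parser import HTMLParser

SAMPLE_HTML = """
<table id="constituents">
  <tr><th>Symbol</th><th>Security</th></tr>
  <tr><td><a href="#">MMM</a></td><td>3M</td></tr>
  <tr><td><a href="#">AOS</a></td><td>A. O. Smith</td></tr>
</table>
"""


class FirstCellParser(HTMLParser):
    """Collects the text of the first <td> in every table row."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._cell_index = -1  # position of the current <td> within its row
        self._in_first_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cell_index = -1
        elif tag == "td":
            self._cell_index += 1
            self._in_first_cell = self._cell_index == 0
            if self._in_first_cell:
                self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_first_cell = False

    def handle_data(self, data):
        if self._in_first_cell:
            self.cells[-1] += data


def iter_symbols(html):
    """Yield ticker symbols one at a time instead of returning a list."""
    parser = FirstCellParser()
    parser.feed(html)  # the document is parsed up front
    for raw in parser.cells:
        symbol = raw.strip()
        if symbol:
            yield symbol


if __name__ == "__main__":
    for symbol in iter_symbols(SAMPLE_HTML):
        print(symbol)
```

A caller can feed each yielded symbol straight into the work queue without first building the full list.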
workers/SleepyWorkers.py
- Purpose: Demonstrates basic threading with sleep operations
- Use Case: Testing thread behavior and timing
workers/SquareSumWorkers.py
- Purpose: CPU-intensive mathematical computations
- Use Case: Demonstrating threading for computational tasks
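A computational worker of this kind can be sketched as a `threading.Thread` subclass. The class name `SquaredSumWorker`, its `result` attribute, and the helper `squared_sums` are assumptions made for this example and may differ from the project's actual worker.

```python
import threading


class SquaredSumWorker(threading.Thread):
    """Computes the sum of i*i for i in range(n) on its own thread."""

    def __init__(self, n):
        super().__init__()
        self.n = n
        self.result = None  # filled in by run()

    def run(self):
        self.result = sum(i * i for i in range(self.n))


def squared_sums(values):
    """Start one worker per value, wait for all of them, collect results."""
    workers = [SquaredSumWorker(n) for n in values]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return [worker.result for worker in workers]


if __name__ == "__main__":
    print(squared_sums([10, 100, 1000]))  # [285, 328350, 332833500]
```

In standard CPython builds the global interpreter lock keeps pure-Python computation from running in parallel across threads, so a sketch like this mainly demonstrates the threading mechanics rather than a speedup.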
- Clone the repository:
      git clone https://github.com/lam-dan/concurrent-programming.git
      cd concurrent-programming
- Create and activate a virtual environment:
      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
      pip install requests beautifulsoup4 lxml
    python main.py
This will:
- Fetch all S&P 500 company symbols from Wikipedia
- Create 4 concurrent scheduler threads
- Process symbols in parallel, fetching stock prices
- Display results and timing information
154.0
72.54
131.28
208.72
...
Sleeping took: 15.8
Modify the number of worker threads in main.py:
    num_yahoo_finance_price_workers = 4  # Change this number
The application includes built-in rate limiting to avoid overwhelming Yahoo Finance:
- Random delays between requests
- Proper User-Agent headers
- Error handling for 429 (Too Many Requests) responses
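One way to combine random delays with handling of 429 responses is sketched below. It is an illustration under stated assumptions rather than the project's implementation: `fetch_with_retries` is a hypothetical helper, `fetch` is any callable returning a `(status, body)` pair, and the exponential backoff schedule is a choice made for this example. The `sleep` parameter is injectable so the logic can be exercised without real waiting.

```python
import random
import time


def fetch_with_retries(fetch, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch() politely, backing off when the server answers 429.

    fetch must return a (status_code, body) tuple.
    """
    for attempt in range(max_attempts):
        # Random pause before every request so threads do not fire in lockstep.
        sleep(random.uniform(0, base_delay))
        status, body = fetch()
        if status == 429:
            # Too Many Requests: wait longer after each failed attempt.
            sleep(base_delay * 2 ** attempt)
            continue
        return status, body
    return 429, None  # still rate limited after all attempts


if __name__ == "__main__":
    responses = iter([(429, None), (200, "154.0")])
    print(fetch_with_retries(lambda: next(responses), base_delay=0.1))
```

In real use, `fetch` would wrap the HTTP call and include the User-Agent header mentioned above.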
The application handles various error scenarios:
- 404 Errors: Symbols not found on Yahoo Finance
- Rate Limiting: 429 responses from Yahoo Finance
- Network Issues: Connection timeouts and failures
- Data Parsing: Malformed price data and gzip compression issues
- Invalid Symbols: Companies that have been delisted or renamed
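The error scenarios listed above could be handled in one place so that a single bad symbol never stops a worker thread. The sketch below is an assumption about how that might look: `get_price_or_error` is a hypothetical helper, `fetch` is any callable returning `(status, body)`, and only built-in exception types are used.

```python
def get_price_or_error(symbol, fetch):
    """Return (price, error); exactly one of the two is None."""
    try:
        status, body = fetch(symbol)
    except OSError as exc:
        # Built-in ConnectionError and TimeoutError are subclasses of OSError.
        return None, f"network error: {exc}"
    if status == 404:
        return None, "symbol not found"
    if status == 429:
        return None, "rate limited"
    try:
        return float(body.replace(",", "")), None
    except (AttributeError, ValueError):
        # body was missing or did not contain a number
        return None, "malformed price data"


if __name__ == "__main__":
    print(get_price_or_error("AAPL", lambda s: (200, "1,154.00")))
    print(get_price_or_error("XXXX", lambda s: (404, None)))
```

Returning the error as a value keeps the failure visible to the caller, which matters in threaded code because an exception raised inside a worker thread does not propagate to the main thread.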
    # Fetch the S&P 500 symbols from Wikipedia
    from workers.WikiWorker import WikiWorker
    wikiWorker = WikiWorker()
    symbols = list(wikiWorker.get_sp_500_copmanies())
    print(f"Found {len(symbols)} symbols")
    print("First 5 symbols:", symbols[:5])
    # Fetch the price of a single symbol
    from workers.YahooFinanceWorkers import YahooFinanceWorker
    worker = YahooFinanceWorker("AAPL")
    price = worker.get_price()
    print(f"AAPL: ${price}")
- Concurrent Processing: 4 threads processing symbols simultaneously
- Queue-based Distribution: Automatic load balancing across workers
- Efficient I/O: Each HTTP request blocks only its own thread, so network waits overlap across workers, and timeouts keep a stalled request from hanging a thread
- Memory Efficient: Generator-based symbol processing
- Thread Pools: Multiple worker threads sharing a common task queue
- Producer-Consumer Pattern: Main thread produces work, workers consume it
- Queue-based Communication: Thread-safe work distribution
- Error Handling in Concurrent Code: Robust error management across threads
- Web Scraping Best Practices: Rate limiting, headers, and error handling
- Thread Synchronization: Using join() to wait for thread completion
Feel free to submit issues, feature requests, or pull requests to improve this project!
This project is open source and available under the MIT License.
- Yahoo Finance for providing stock price data
- Wikipedia for S&P 500 company listings
- Python threading library for concurrent programming capabilities