A snapshot of retail algorithmic trading before it became mainstream
This is a historical archive from 2006-2007, preserved for educational purposes.
This codebase contains outdated libraries with known security vulnerabilities and practices that were acceptable in 2006 but are now considered insecure. It is NOT suitable for production use without significant modernization.
Why archive this? To document the evolution of retail quantitative trading and demonstrate algorithmic approaches that predated their mainstream adoption.
Cherry Picker is a Java-based automated stock trading analysis system built independently in 2006-2007. It implements ensemble forecasting using multiple statistical models, processes intraday market data at 15-second intervals, and generates professional PDF reports with time-series visualizations.
In 2006-2007, this represented cutting-edge retail trading technology:
| Concept | Status in 2006 | Industry Adoption |
|---|---|---|
| ✅ Ensemble forecasting | Novel for retail traders | Mainstream by 2010s |
| ✅ Adaptive rolling windows | Rare in personal systems | Standard by 2012+ |
| ✅ Automated trading systems | Primarily institutional | Retail adoption 2015+ |
| ✅ High-frequency data analysis | Innovative at 15-sec granularity | Common by 2018+ |
| ✅ Systematic approach | Growing among quants | Ubiquitous by 2020+ |
The achievement: Building a complete end-to-end quantitative pipeline (data → analysis → forecasting → visualization) as a solo developer, before algorithmic trading platforms became accessible to retail traders.
┌─────────────────────────────────────────────────────────────────┐
│ Market Data Source │
│ (SQL Server Database) │
│ 15-second interval tick data with metadata │
└────────────────┬────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ CpForecasting.java - Ensemble Engine │
├─────────────────────────────────────────────────────────────────┤
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────┐ │
│ │ Simple Exp. │ │ Double Exp. │ │ Moving Avg │ │
│ │ Smoothing (SES) │ │ Smoothing (DES) │ │ Models (MAM) │ │
│ └──────────────────┘ └──────────────────┘ └──────────────┘ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Weighted Moving │ │ Polynomial │ │
│ │ Averages (WMA) │ │ Regression (BRR) │ │
│ └──────────────────┘ └──────────────────┘ │
│ │
│ • Rolling window adaptation (10-200 observations) │
│ • Multi-horizon forecasting (5 periods ahead) │
│ • Custom weighting schemes │
└────────────────┬────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Predictions Database Table │
│ (cp_processed_results_st) │
│ Stores all model outputs for evaluation │
└────────────────┬────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ CpCharting.java - Report Generator │
├─────────────────────────────────────────────────────────────────┤
│ • JFreeChart: Time-series visualization │
│ • iText PDF: Multi-page document generation │
│ • 4 charts per page, 100 pages per file │
│ • Min/Max markers for entry/exit signals │
│ • Bookmark navigation by symbol │
└────────────────┬────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Professional Trading Reports │
│ (Multi-file PDFs) │
│ Ready for analysis and trading decisions │
└─────────────────────────────────────────────────────────────────┘
CpForecasting.java (ensemble engine):
- Purpose: Generate predictions using multiple statistical models
- Input: Historical price data from the `cp_processed_results` table
- Processing:
  - Maintains rolling windows of observations (configurable per model)
  - Removes the oldest observations as new data arrives (concept drift handling)
  - Runs all models in parallel on the same dataset
  - Forecasts H=5 periods ahead (~75 seconds at 15-second intervals)
- Output: Predictions stored in `cp_processed_results_st`
- Lines of Code: ~545
CpCharting.java (report generator):
- Purpose: Generate PDF reports with time-series charts
- Input: Chart data from the `cp_chartvalues` table
- Processing:
  - Queries all symbols and dates in the database
  - Generates 4 charts per page
  - Adds markers for min/max price points
  - Creates bookmarks for navigation
  - Splits output into multiple files at 100 pages
- Output: Professional PDF reports (`cpcharts1.pdf`, `cpcharts2.pdf`, ...)
- Lines of Code: ~605
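The CpCharting pagination rules (4 charts per page, a new file every 100 pages) come down to integer arithmetic. The following is a minimal sketch, assuming charts are numbered from zero in query order. The class and method names are hypothetical and are not taken from CpCharting.java.

```java
/** Hypothetical helper: maps a zero-based chart index to its PDF file, page, and slot. */
final class ChartLayout {
    static final int CHARTS_PER_PAGE = 4;   // stated in this README
    static final int PAGES_PER_FILE = 100;  // stated in this README
    static final int CHARTS_PER_FILE = CHARTS_PER_PAGE * PAGES_PER_FILE;

    private ChartLayout() {}

    /** File number, 1-based, matching names such as cpcharts1.pdf. */
    static int fileNumber(int chartIndex) {
        return chartIndex / CHARTS_PER_FILE + 1;
    }

    /** Page within the file, 1-based. */
    static int pageInFile(int chartIndex) {
        return (chartIndex % CHARTS_PER_FILE) / CHARTS_PER_PAGE + 1;
    }

    /** Slot on the page, 0 to 3. */
    static int slotOnPage(int chartIndex) {
        return chartIndex % CHARTS_PER_PAGE;
    }

    /** Output file name for a prefix, e.g. "cpcharts" gives "cpcharts2.pdf". */
    static String fileName(String prefix, int chartIndex) {
        return prefix + fileNumber(chartIndex) + ".pdf";
    }
}
```

Under these assumptions the 401st chart (index 400) is the first chart on page 1 of the second file.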
Configuration files:
- Location: `cpconfig/` directory
- Format: XML files using SIXX serialization
- Contents: 150+ configuration files containing:
- Symbol-specific parameters
- Trading time windows
- Price thresholds
- Skip flags for problematic symbols
- Purpose: Fine-tune system behavior per stock
Rather than relying on a single model, Cherry Picker runs 5 different algorithms simultaneously:
// Model implementations from CpForecasting.java
1. Simple Exponential Smoothing (SES)
→ Best for: Mean-reverting price action
→ α parameter auto-optimized via getBestFitModel()
2. Double Exponential Smoothing (DES)
→ Best for: Trending markets with momentum
→ Captures both level and rate of change
3. Moving Average Models (MAM)
→ Best for: Noise reduction in volatile data
→ Configurable periods: 10, 25, 50, 100, 150
4. Weighted Moving Averages (WMA)
→ Best for: Emphasizing recent observations
→ Custom weight schemes: [0.1754, 0.1754, 0.1754, 0.0877, ...]
5. Polynomial Regression (BRR)
→ Best for: Nonlinear pattern detection
→ 5th degree polynomial fitting

Why ensemble? Different models excel in different market conditions. Running all models provides:
- Robustness against model-specific failures
- Diversity of perspectives on price movements
- Ability to select best performer ex-post
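To make the idea of ex-post selection concrete, here is a small sketch in plain Java. It does not use OpenForecast and is not the code from CpForecasting.java. It implements only two of the five model families (simple exponential smoothing and an unweighted moving average) and scores them by one-step-ahead mean squared error. All names are hypothetical.

```java
import java.util.Map;

/** Illustrative ensemble: simple forecasters scored by one-step-ahead error. */
final class EnsembleSketch {
    private EnsembleSketch() {}

    /** A forecaster predicts the value at index t using only data[0..t-1]. */
    interface Forecaster {
        double predict(double[] data, int t);
    }

    /** Simple exponential smoothing with smoothing factor alpha in (0, 1]. */
    static Forecaster ses(final double alpha) {
        return (data, t) -> {
            double level = data[0];
            for (int i = 1; i < t; i++) {
                level = alpha * data[i] + (1 - alpha) * level;
            }
            return level;
        };
    }

    /** Unweighted moving average over the last `period` observations before t. */
    static Forecaster movingAverage(final int period) {
        return (data, t) -> {
            int start = Math.max(0, t - period);
            double sum = 0;
            for (int i = start; i < t; i++) {
                sum += data[i];
            }
            return sum / (t - start);
        };
    }

    /** Mean squared one-step-ahead error over t = 1..n-1 (needs at least 2 points). */
    static double mse(Forecaster f, double[] data) {
        double sum = 0;
        for (int t = 1; t < data.length; t++) {
            double err = data[t] - f.predict(data, t);
            sum += err * err;
        }
        return sum / (data.length - 1);
    }

    /** Returns the name of the model with the lowest error on the observed data. */
    static String bestModel(Map<String, Forecaster> models, double[] data) {
        String best = null;
        double bestErr = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Forecaster> e : models.entrySet()) {
            double err = mse(e.getValue(), data);
            if (err < bestErr) {
                bestErr = err;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Scoring on the same data that was used to fit the models flatters the winner, so a selection like this is a starting point rather than an evaluation.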
// From CpForecasting.java, lines 274-305
if (ctr > (periodstouse + 5)) {
DataPoint rdp = (DataPoint) observedData.toArray()[0];
observedData.remove(rdp); // Remove oldest observation
// Re-index all points to maintain temporal consistency
ctr = 0;
Iterator it = observedData.iterator();
while (it.hasNext()) {
ctr = ctr + 1;
Observation dpf = (Observation) it.next();
dpf.setIndependentValue("t", ctr);
}
}

This approach:
- Automatically discards stale data
- Maintains fixed-size observation window
- Adapts to changing market conditions (concept drift)
- Predates modern online learning frameworks by years
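For comparison, the same fixed-size window can be written with a double-ended queue in current Java. This is a sketch with hypothetical names, not a drop-in replacement for the OpenForecast `DataSet` used in the original. It keeps positions implicit, so no re-indexing pass is needed after an eviction.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical fixed-size window: adding beyond capacity evicts the oldest value. */
final class RollingWindow {
    private final int capacity;
    private final Deque<Double> values = new ArrayDeque<>();

    RollingWindow(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds the newest observation, dropping the oldest once the window is full. */
    void add(double value) {
        if (values.size() == capacity) {
            values.removeFirst(); // constant-time eviction of the oldest value
        }
        values.addLast(value);
    }

    int size() {
        return values.size();
    }

    /** Oldest observation still in the window. */
    double oldest() {
        return values.getFirst();
    }

    /** Snapshot in time order; position i corresponds to t = i + 1. */
    double[] toArray() {
        double[] out = new double[values.size()];
        int i = 0;
        for (double v : values) {
            out[i++] = v;
        }
        return out;
    }
}
```

A window sized like the original would be created with `new RollingWindow(periodstouse + 5)`, assuming `periodstouse` carries the same meaning as in the excerpt above.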
// Millisecond-precision timestamps
observations.add(new Millisecond(rs.getTimestamp("entrydate")),
rs.getFloat("value_value"));

- Granularity: 15-second intervals
- Precision: Millisecond timestamps
- Volume: Thousands of observations per symbol per day
- Context: Appropriate for 2006 retail trading (vs microsecond HFT today)
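The data volume and forecast horizon follow directly from the 15-second bar size. The sketch below works through the arithmetic, assuming a 6.5-hour regular trading session (09:30 to 16:00). That session length is an assumption for illustration, because this README does not state the hours actually captured.

```java
/** Back-of-the-envelope sizing for 15-second bars. */
final class BarMath {
    static final int BAR_SECONDS = 15; // stated in this README

    private BarMath() {}

    /** Number of bars in a session of the given length. */
    static int barsPerSession(int sessionHours, int sessionMinutes) {
        int seconds = (sessionHours * 60 + sessionMinutes) * 60;
        return seconds / BAR_SECONDS;
    }

    /** Wall-clock span, in seconds, of a forecast `periods` bars ahead. */
    static int horizonSeconds(int periods) {
        return periods * BAR_SECONDS;
    }
}
```

With these inputs, `barsPerSession(6, 30)` gives 1,560 bars per symbol per session, and `horizonSeconds(5)` gives the 75-second horizon quoted for H=5.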
// CpCharting.java generates multi-page PDFs with:
- 4 charts per page (optimized layout)
- 100 pages per file (manageable size)
- Custom pagination with headers/footers
- Bookmark navigation by symbol
- Min/Max price markers with color coding
- Automated filename generation

This repository includes comprehensive documentation:
| Document | Purpose | Audience |
|---|---|---|
| README.md (this file) | Overview and getting started | Everyone |
| PROJECT_OVERVIEW.md | Deep technical analysis (10,000+ words) | Technical deep-dive |
| DATABASE_SCHEMA.md | Complete database design | Database/backend devs |
| CONFIGURATION.md | XML configuration format guide | Configuration management |
| SECURITY.md | Security considerations | Security-conscious users |
| CONTRIBUTING.md | Contribution guidelines | Contributors |
| LICENSE | MIT License + disclaimers | Legal compliance |
# Required software
- Java 1.4+ JDK (tested with 1.4/1.5)
- Microsoft SQL Server 2000/2005
- Windows OS (hardcoded paths, though fixable)
# Required libraries (all in javajars/)
- OpenForecast 0.4.0
- JFreeChart 1.0.1
- jcommon 1.0.5
- iText 1.4.3
- Microsoft JDBC drivers (sqljdbc.jar, mssqlserver.jar, msbase.jar, msutil.jar)

-- Create database
CREATE DATABASE cherrypicker;
-- Create tables (see DATABASE_SCHEMA.md for complete DDL)
-- Key tables: cp_processed_results, cp_processed_results_st, cp_chartvalues
-- Import sample data
-- (Historical data not included in repository)

cd javajars
# Compile forecasting engine
javac -classpath "OpenForecast-0.4.0.jar:sqljdbc.jar:mssqlserver.jar:msbase.jar:msutil.jar:." \
CpForecasting.java
# Compile charting engine
javac -classpath "itext-1.4.3.jar:jfreechart-1.0.1.jar:jcommon-1.0.5.jar:sqljdbc.jar:servlet.jar:." \
CpCharting.java

Note: Use semicolons (;) instead of colons (:) as the classpath separator on Windows.
# Set database credentials via environment variables (recommended)
export DB_URL="jdbc:microsoft:sqlserver://localhost:1433;databasename=cherrypicker"
export DB_USER="your_username"
export DB_PASSWORD="your_password"
export OUTPUT_DIR="/path/to/output/"
# Run forecasting for date range
java -classpath "OpenForecast-0.4.0.jar:sqljdbc.jar:mssqlserver.jar:msbase.jar:msutil.jar:shiftone-jrat.jar:." \
CpForecasting "10/25/05" "10/25/05"
# Generate PDF charts
java -classpath "itext-1.4.3.jar:jfreechart-1.0.1.jar:jcommon-1.0.5.jar:sqljdbc.jar:servlet.jar:." \
CpCharting cpcharts

Output:
- Forecasts written to the `cp_processed_results_st` table
- PDFs generated: `cpcharts1.pdf`, `cpcharts2.pdf`, etc.
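On the Java side, the environment variables exported above can be read once at startup and checked before any connection is attempted. How the archived classes actually consume these variables is not shown in this README, so the following is a sketch with a hypothetical `DbSettings` class. Only the variable names come from the export lines.

```java
import java.util.Map;

/** Hypothetical settings holder; variable names match the export lines above. */
final class DbSettings {
    final String url;
    final String user;
    final String password;
    final String outputDir;

    private DbSettings(String url, String user, String password, String outputDir) {
        this.url = url;
        this.user = user;
        this.password = password;
        this.outputDir = outputDir;
    }

    /** Reads settings from the given map, failing fast on anything missing. */
    static DbSettings fromEnv(Map<String, String> env) {
        return new DbSettings(
                require(env, "DB_URL"),
                require(env, "DB_USER"),
                require(env, "DB_PASSWORD"),
                require(env, "OUTPUT_DIR"));
    }

    /** Convenience overload that reads the real process environment. */
    static DbSettings fromEnv() {
        return fromEnv(System.getenv());
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }
}
```

The resulting values could then be passed to `java.sql.DriverManager.getConnection(url, user, password)`. Taking the map as a parameter keeps the lookup testable without touching the real environment.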
Language: Java 1.4/1.5 (2002-2004 era)
Database: Microsoft SQL Server 2000/2005
Platform: Windows (hardcoded C:\ paths)
Build: Manual javac (pre-Maven/Gradle)
VCS: File-based backups (pre-Git ubiquity)
| Library | Version | Purpose | License |
|---|---|---|---|
| OpenForecast | 0.4.0 | Statistical forecasting models | LGPL |
| JFreeChart | 1.0.1 | Chart generation | LGPL |
| jcommon | 1.0.5 | JFreeChart dependency | LGPL |
| iText | 1.4.3 | PDF generation | MPL/LGPL |
| MS JDBC Driver | 1.0 | Database connectivity | Proprietary |
Note: These libraries are from 2005-2006 and have known security vulnerabilities. Dependencies are not included in this repository due to licensing. Users must obtain them independently.
Understanding the era this was built in:
Trading Landscape:
- 📈 Pre-financial crisis bull market
- 💰 Commission costs: $7-10/trade (vs $0 today)
- 🤖 Algorithmic trading: Primarily institutional
- 📊 Real-time data: Just becoming accessible to retail
- 🎯 Market efficiency: More exploitable patterns than today
Technology Landscape:
- ☕ Java 5 was cutting-edge (released 2004)
- 🗄️ SQL Server 2005 was latest version
- 💻 Most developers used Eclipse or NetBeans
- 📦 Maven was new (2004), Gradle didn't exist
- 🌐 Stack Overflow didn't exist until 2008
- 🔧 Git was brand new (2005), SVN was standard
Development Challenges:
- Finding documentation for statistical forecasting
- Managing classpath dependencies manually
- Debugging JDBC connection issues without good tooling
- No online communities for troubleshooting
- Limited examples of retail algo trading systems
✅ Ensemble methods in retail trading - Most retail traders used single indicators
✅ Adaptive learning - Rolling windows weren't standard practice
✅ Automated end-to-end pipeline - Most analysis was manual
✅ High-frequency data processing - 15-second granularity was ambitious
✅ Systematic approach - Data-driven decisions vs discretionary trading
Software Engineering:
- ✅ Object-oriented design in Java
- ✅ Complex library integration (12 dependencies)
- ✅ Database design for time-series data
- ✅ Multi-threaded data processing concepts
- ✅ File I/O and resource management
Quantitative Analysis:
- ✅ Statistical forecasting methodology
- ✅ Time-series analysis techniques
- ✅ Model parameter optimization
- ✅ Signal generation from predictions
- ✅ Understanding of financial market dynamics
Data Engineering:
- ✅ Schema design for high-frequency data
- ✅ ETL pipeline for market data
- ✅ Query optimization for large datasets
- ✅ Batch processing architecture
Domain Expertise:
- ✅ Market microstructure understanding
- ✅ Intraday price dynamics
- ✅ Statistical forecasting in finance
- ✅ Trading signal generation
Self-Directed Learning:
- Taught myself statistical forecasting while building system
- Navigated complex library integration without Stack Overflow
- Bridged multiple domains independently
End-to-End Ownership:
- Requirements gathering (implicit)
- System architecture and design
- Implementation and testing
- Deployment and operation
Problem-Solving:
- Selected appropriate models for stock price data
- Handled timestamp precision across layers
- Optimized database queries for performance
- Debugged complex issues in production
Being transparent about what could be improved:
❌ Hardcoded database credentials (now removed for archive)
❌ SQL injection vulnerabilities in dynamic queries
❌ No input validation
❌ Outdated libraries with known CVEs
❌ No encryption for sensitive data
❌ No formal backtesting framework
❌ Missing risk management (stop-loss, position sizing)
❌ Transaction costs not modeled
❌ Single feature (price only, no volume/volatility)
❌ No systematic model selection logic
❌ Hardcoded parameters throughout
❌ Massive methods (490+ lines)
❌ Limited error handling
❌ Commented-out code left in place
❌ No unit tests
❌ Magic numbers
❌ Windows-specific paths
If rebuilding in 2025:
# Modern Python implementation
Technology Stack:
- Language: Python 3.12+
- Data: pandas, numpy
- ML: scikit-learn, TensorFlow/PyTorch
- Backtesting: Backtrader, Zipline
- Database: PostgreSQL or TimescaleDB
- Execution: Alpaca/IB API
- Infrastructure: Docker, Kubernetes
Architecture:
- Microservices for each component
- Event-driven with message queues
- Cloud-native (AWS Lambda, etc.)
- Real-time streaming (Kafka)
- Proper CI/CD pipeline
Improvements:
✅ Feature engineering (volume, volatility, sentiment)
✅ Deep learning models (LSTM, Transformers)
✅ Comprehensive backtesting with walk-forward
✅ Risk management (Kelly criterion, stop-losses)
✅ Transaction cost modeling
✅ Model selection via cross-validation
✅ A/B testing framework
✅ Monitoring and alerting
✅ Unit tests with >80% coverage

Performance techniques in the original code:
// 1. Prepared Statements (prevents SQL injection, improves performance)
insertForecast = connection.prepareStatement(
"insert into cp_processed_results_st (...) values (?, ?, ?, ?, ?, ?, ?, 50, ?)"
);
// 2. Batch Processing (single query retrieves all data)
String searchUserQuery = "select ... order by symbol, trandate, entrydate";
ResultSet rs = stmt.executeQuery(searchUserQuery);
// 3. In-Memory Processing (no disk I/O during forecasting)
DataSet observedData = new DataSet();
model.init(observedData);
// 4. File Splitting (manageable PDF sizes)
if (page == 100) {
document.close();
// Create new file
}

| Dimension | Capability | Bottleneck |
|---|---|---|
| Symbols | Sequential processing | Single-threaded |
| Date Range | Configurable | Memory for large ranges |
| Chart Generation | Thousands per run | Disk I/O for PDFs |
| Forecasting | Real-time capable | Model training time |
Modern Improvement: Parallelize symbol processing, use distributed computing for backtesting.
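One way the per-symbol parallelism could look in current Java is sketched below. It assumes that each symbol's forecast is independent of the others and that the supplied function never returns null. `forecastAll` and its parameters are hypothetical names, and the function passed in stands in for whatever loads the data and runs the models for one symbol.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical fan-out of per-symbol work across the common fork-join pool. */
final class ParallelSymbols {
    private ParallelSymbols() {}

    /**
     * Applies `forecast` to every symbol concurrently and collects the results.
     * ConcurrentHashMap rejects null values, so `forecast` must not return null.
     */
    static <R> Map<String, R> forecastAll(List<String> symbols,
                                          Function<String, R> forecast) {
        Map<String, R> results = new ConcurrentHashMap<>();
        symbols.parallelStream()
               .forEach(symbol -> results.put(symbol, forecast.apply(symbol)));
        return results;
    }
}
```

In a database-backed version, each task would likely need its own connection or a pooled one instead of sharing the single connection used by the sequential code.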
1. Classical Time-Series Forecasting
- Exponential smoothing theory and practice
- Moving average techniques
- Regression-based forecasting
- Rolling window adaptation
2. Financial Data Processing
- Handling tick-by-tick data
- Time-series database design
- Signal generation from predictions
- Backtesting concepts (implicit)
3. Java Software Engineering (2006 era)
- JDBC database connectivity
- Third-party library integration
- PDF generation techniques
- Chart rendering pipelines
4. Evolution of Trading Technology
- Compare 2006 approaches to 2025 methods
- See what concepts remain relevant
- Understand why certain practices were retired
- Appreciate modern tooling improvements
This codebase could support:
- Finance courses: Example of quantitative trading system
- Statistics courses: Applied time-series forecasting
- Software engineering courses: Legacy code analysis
- History of computing: Evolution of retail FinTech
Status: This is a historical archive and is not actively maintained.
- ❌ No bug fixes planned
- ❌ No feature additions
- ❌ No support provided
- ✅ Documentation improvements welcome
- ✅ Historical context additions welcome
If you find this interesting:
- ⭐ Star the repository
- 🔍 Study the code
- 💬 Open discussions for historical questions
- 📝 Cite in academic work if useful
See CONTRIBUTING.md for more details.
This project is licensed under the MIT License - see the LICENSE file for details.
Important: This license includes additional disclaimers specific to this historical archive:
- ⚠️ Not suitable for production use
- ⚠️ Contains known security vulnerabilities
- ⚠️ No warranties or support
- ⚠️ Educational purposes only
Third-party libraries (OpenForecast, JFreeChart, iText, JDBC drivers) have their own licenses and are not included.
Built independently using:
- OpenForecast library for statistical forecasting models
- JFreeChart for professional chart generation
- iText for PDF document creation
- Microsoft SQL Server for data storage
- Microsoft JDBC Driver for database connectivity
Special thanks to the open-source community of 2006 for making these libraries available.
Repository: https://github.com/kbadinger/cherry-picker
Author: Kevin Badinger
Created: 2006-2007
Archived: 2025
Purpose: Historical documentation and educational reference
Related Documentation:
- Detailed Technical Overview
- Database Schema
- Configuration Guide
- Security Considerations
- Contributing Guidelines
📊 Codebase Metrics:
Lines of Code: ~1,150
Java Files: 2 primary classes
Configuration Files: 150+ XML files
Dependencies: 12 JAR files (~13MB)
Database Tables: 3 primary tables
🎯 Functionality:
Forecasting Models: 5 distinct algorithms
Data Granularity: 15-second intervals
Chart Layouts: 4 per page
PDF Batch Size: 100 pages per file
Date Range: Configurable (tested 2005-2007)
⏱️ Development:
Timeline: 2006-2007 (exact dates unknown)
Team Size: 1 (solo developer)
Architecture: Designed from scratch
Testing: Manual validation via PDF reports
If you find this historical archive interesting or educational, please consider starring the repository!