
Release 4.0.0: split finance/trading to alphapy-finance #8

Merged
mrconway merged 4 commits into main from develop on Apr 26, 2026

@mrconway
Collaborator

Summary

Promotes 4.0.0 develop work to main:

After merge, tag `v4.0.0` on main.

Test plan

  • `pytest tests/` → 18 passed, 1 skipped
  • alphapy CLI works, version is 4.0.0
  • alphapy-finance receiving end already validated (PR #1 there, "Document proper branching workflow" → 396 passed)

mrconway and others added 3 commits April 26, 2026 15:37
Pre-split prep: removes the only module-level finance imports in the
ML core. yfinance, AlpacaDataSource, and PolygonDataSource are now
imported inside the functions that use them (get_yahoo_data,
get_alpaca_data, get_polygon_data).

This lets alphapy-pro import cleanly without yfinance/alpaca-py/
polygon-api-client installed, which is required for the upcoming
alphapy-finance split where those packages live downstream.

Verified: tests/test_imports.py, tests/test_utilities.py,
tests/test_version.py all pass (18 passed, 1 skipped).
Pre-split prep: lazy-import finance deps in alphapy/data.py
BREAKING: All trading, markets, and Alfi code has moved to the private
alphapy-finance repo. See tag v3.1.1-monolith for the pre-split state.

Removed from alphapy-pro:
- alphapy/{mflow_main,system,backtest,metalabel}.py
- alphapy/{portfolio,indicators,data_sources}/
- Market-data functions in alphapy/data.py (get_market_data,
  get_yahoo_data, get_polygon_data, get_alpaca_data, get_eodhd_data,
  convert_data, convert_offset, assign_global_data, standardize_data,
  resample_ohlcv).
- Top-level app/ (Alfi backend+frontend+agent+research) and agent/
  (legacy trading agent).
- 13 trading project examples.
- scripts/{run_agent,run_research,sync_tradingagents}.py and
  scripts/markets/.
- tests/{agent,app_agent,backend,integration}/ and finance tests
  in tests/alphapy/.
- docs/PLAN-markets.md, docs/markets/.
- config/sources.yml.template, systems.yml, groups.yml,
  indicators.yml.
- Entry points: mflow, scott, research.
- ~30 dependencies (alpaca-py, vectorbt, fastapi, langchain*,
  yfinance, polygon-api-client, etc.).

Kept in alphapy-pro:
- alphapy/{alphapy_main,model,data,frame,features,transforms,
  variables,globals,estimators,optimize,plots,utilities,space,
  group,alias,calendrical}.py.
- 3 example projects: kaggle, pizza, time-series.
- tests/{test_imports,test_utilities,test_version}.py and
  tests/conftest.py.
- 26 ML-core dependencies.
- Single entry point: alphapy.

Updated:
- alphapy/__init__.py: bumped to 4.0.0, added __all__.
- pyproject.toml: trimmed deps, dropped finance entry points and
  packages, kept hatch-managed dynamic version.
- README.md, CLAUDE.md, AGENTS.md: rewritten for ML-only scope.
- CHANGELOG.md: 4.0.0 release notes.
- config/alphapy.yml: stripped mflow section.

Verification:
- pytest tests/ -> 18 passed, 1 skipped.
- alphapy --help works.
- import alphapy succeeds without finance deps installed.
- rg "yfinance|alpaca|polygon|vectorbt" alphapy/ -> zero hits.
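The `rg` check above matches any textual mention of those names. A narrower variant flags only module-level imports, which are the ones that would break `import alphapy` on a machine without the finance packages. The sketch below is a rough illustration of that idea: the function name `forbidden_imports` and the sample sources are hypothetical and not part of the repository.

```python
import ast

# Top-level import names taken from the rg pattern in the verification list.
FORBIDDEN = ("yfinance", "alpaca", "polygon", "vectorbt")


def forbidden_imports(source, forbidden=FORBIDDEN):
    """Return the sorted forbidden packages imported at module scope in `source`."""
    hits = set()
    tree = ast.parse(source)
    for node in tree.body:  # module-level statements only, not function bodies
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative "from . import x" has no module
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in forbidden:
                hits.add(root)
    return sorted(hits)


eager = "import yfinance as yf\nimport os\n"
lazy = "def get_yahoo_data():\n    import yfinance as yf\n"
print(forbidden_imports(eager))
print(forbidden_imports(lazy))
```

Unlike the `rg` pattern, this sketch deliberately ignores function-level imports, so it would accept the lazy-import state from the first commit as well as the fully stripped state after the split.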

Migration for downstream consumers:
- alphapy-finance now owns all trading code.
- alphapy-sports unaffected (already isolated).
@claude

claude Bot commented Apr 26, 2026

Claude finished @mrconway's task


✅ PR Review Complete

Reviewed the Release 4.0.0 PR that implements lazy imports for finance/trading dependencies.

Progress

  • Read and understand the PR context and changes
  • Analyze the modified file alphapy/data.py
  • Review code quality and best practices
  • Check for potential bugs or issues
  • Assess performance considerations
  • Review security concerns
  • Evaluate test coverage implications
  • Provide comprehensive review feedback

📋 Review Summary

This PR implements a clean "lazy import" pattern to decouple AlphaPy from finance-specific dependencies, supporting the v4.0.0 split to alphapy-finance.

APPROVE - This is excellent, production-ready code

🎯 Key Changes Analysis

The changes in alphapy/data.py (+3/-2 lines) implement lazy imports:

Removed global imports:

- from alphapy.data_sources import AlpacaDataSource, PolygonDataSource  # Line 39
- import yfinance as yf  # Line 50

Added function-level imports:

+ from alphapy.data_sources import PolygonDataSource  # Line 355 in get_polygon_data()
+ import yfinance as yf  # Line 429 in get_yahoo_data()  
+ from alphapy.data_sources import AlpacaDataSource  # Line 587 in get_alpaca_data()
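As a rough illustration of the pattern in this diff, the sketch below defers an optional import to the function that needs it. The names `get_quotes`, `row_count`, and `hypothetical_finance_dep` are made up for the example and do not exist in AlphaPy. The placeholder package is intentionally absent so that the failure mode is visible.

```python
def get_quotes(symbol):
    """Hypothetical stand-in for a function such as get_yahoo_data."""
    # Deferred import: resolved when the function is called, not when the
    # enclosing module is loaded.
    import hypothetical_finance_dep as hfd  # assumed name, not a real package
    return hfd.download(symbol)


def row_count(rows):
    """Core helper with no optional dependencies."""
    return len(rows)


# Defining get_quotes did not need the optional package, so core helpers work.
print(row_count([1, 2, 3]))

# The missing dependency surfaces only when the finance function is called.
try:
    get_quotes("SPY")
except ImportError as exc:
    print(f"optional dependency missing: {exc.name}")
```

The trade-off is that a missing package is reported at call time rather than at import time, which is the behavior the commit message asks for.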

🏆 Code Quality & Best Practices

Excellent Implementation:

  • Perfect lazy import pattern - imports moved to point of use within functions
  • Zero functional changes - maintains exact same API and behavior
  • Clean separation of concerns - ML core decoupled from finance dependencies
  • Consistent style - follows existing codebase patterns

🔍 Technical Assessment

No Issues Found:

  • No bugs - transformations are mechanical and preserve all functionality
  • No security concerns - no changes to API keys, authentication, or data handling
  • No meaningful performance impact - the import statement runs on each call, but after the first call it resolves to a cached module lookup with negligible overhead
  • Test coverage maintained - existing tests validate the refactored import pattern

Performance Consideration:

  • Import overhead: Minimal - yfinance and data source imports occur only when those specific data sources are used
  • Runtime impact: Near zero - imports are cached after first use per Python module system
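The caching claim can be checked with a small sketch. It assumes nothing about AlphaPy: the standard-library module `colorsys` stands in for a heavy optional dependency, and `load_color_tools` is a hypothetical name.

```python
import sys


def load_color_tools():
    # `colorsys` stands in for a heavy optional dependency.
    import colorsys
    return colorsys


first = load_color_tools()   # loads the module if it is not loaded yet
second = load_color_tools()  # reuses the entry cached in sys.modules

print(first is second)
print(sys.modules["colorsys"] is first)
```

Repeated calls return the same module object, so the per-call cost after the first import is a dictionary lookup plus a local name binding.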

📝 Architecture Benefits

  1. Dependency Isolation: Core AlphaPy can run without finance dependencies
  2. Modular Design: Supports the v4.0.0 package split strategy
  3. Backward Compatibility: The lazy-import commit leaves existing user code working unchanged; the breaking removals come from the separate split commit
  4. Error Handling: Import failures will be localized to specific data source functions

🧪 Test Coverage

The PR mentions comprehensive testing:

  • pytest tests/ → 18 passed, 1 skipped
  • alphapy CLI works, version is 4.0.0
  • alphapy-finance validation (396 tests passed)

This covers both the ML pipeline and the separated finance functionality.

🎯 Recommendation: APPROVE

This is a textbook example of clean dependency management. The lazy import pattern is implemented cleanly, the lazy-import change itself breaks no existing callers, and it brings clear architectural benefits for the v4.0.0 split.

Ready for merge

4.0.0: Split off finance/trading code to alphapy-finance
@mrconway mrconway merged commit b910c12 into main Apr 26, 2026
3 of 5 checks passed
@claude

claude Bot commented Apr 26, 2026

Claude finished @mrconway's task


PR Review Analysis: AlphaPy 4.0.0 Release

This is a comprehensive review of the major v4.0.0 release that splits finance/trading functionality into the private alphapy-finance repository.

Progress

  • Read and understand the PR context and changes
  • Analyze the modified file alphapy/data.py
  • Review code quality and best practices
  • Check for potential bugs or issues
  • Assess performance considerations
  • Review security concerns
  • Evaluate test coverage implications
  • Provide comprehensive review feedback

Review Summary

Overall Assessment: ✅ EXCELLENT - This is a well-executed major architectural refactor that successfully splits the codebase while maintaining code quality and functionality.

Code Quality & Best Practices

✅ Strengths:

  • Clean Split Architecture: The removal of 3.4M+ lines of finance-specific code creates a focused, domain-agnostic ML framework
  • Proper Dependency Management: Removed ~30 finance-related dependencies from pyproject.toml, significantly reducing the package footprint
  • Clear Migration Path: Excellent documentation in CHANGELOG.md with specific migration instructions for downstream consumers
  • Consistent Code Style: The remaining alphapy/data.py maintains consistent coding patterns and documentation standards
  • Proper Version Bump: Clean version increment to 4.0.0 reflecting the breaking changes

🟨 Areas for Improvement:

  • Import Warning Suppression: alphapy/__init__.py:12 globally suppresses sklearn feature name warnings - consider using more targeted warning filters
  • Error Handling: alphapy/data.py:104-106 has fallback logic but could benefit from more explicit error logging when primary data directory fails

Potential Bugs & Issues

✅ No Critical Issues Found

🟨 Minor Considerations:

  • Data Loading Fallback: In alphapy/data.py:104-106, the fallback from data_dir to input_dir happens silently - consider adding a warning log to help users understand why their data directory wasn't found
  • LabelEncoder State: In alphapy/data.py:125, LabelEncoder() is created and used immediately without storing the fitted encoder - this is fine for training but may cause issues if you need to transform new data with the same encoding

Performance Considerations

✅ Positive Impact:

  • Reduced Package Size: Removing 3.4M+ lines significantly reduces memory footprint and import times
  • Dependency Reduction: Fewer dependencies means faster installation and fewer potential conflicts
  • Polars Integration: The use of Polars DataFrames in data.py provides excellent performance for large datasets

✅ No Performance Regressions Identified

Security Concerns

✅ Good Security Posture:

  • Dependency Cleanup: Removing unused dependencies reduces attack surface
  • No Credential Exposure: All finance-related API clients and potential credential handling moved to separate repo
  • Standard Library Usage: Core functionality relies on well-established libraries (sklearn, polars, numpy)

✅ No Security Issues Found

Test Coverage Implications

✅ Well-Managed Test Migration:

  • Passing Tests: All remaining tests (18 passed, 1 skipped) indicate core functionality is intact
  • Import Protection: tests/test_imports.py has good error handling for optional dependencies and compatibility issues
  • Appropriate Test Scope: Tests now focus on the core ML pipeline functionality rather than finance-specific features

🟨 Test Coverage Considerations:

  • Data Module Coverage: The heavily modified alphapy/data.py could benefit from more specific unit tests for edge cases in the data loading logic
  • Integration Tests: Consider adding tests that verify the ML pipeline works end-to-end without finance dependencies
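One possible shape for such a test is to make the finance packages unimportable at runtime and then import the core package. The sketch below relies on documented Python behavior: a `None` entry in `sys.modules` makes `import name` raise `ImportError`. The helper names `block_modules` and `imports_cleanly` are hypothetical and are not taken from the repository's test suite.

```python
import importlib
import sys
from contextlib import contextmanager

_MISSING = object()  # sentinel: the module was not in sys.modules beforehand


@contextmanager
def block_modules(*names):
    """Make `import name` fail for each name while the context is active."""
    saved = {name: sys.modules.get(name, _MISSING) for name in names}
    try:
        for name in names:
            sys.modules[name] = None  # a None entry forces ImportError
        yield
    finally:
        # Restore the previous state so other tests are unaffected.
        for name, module in saved.items():
            if module is _MISSING:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = module


def imports_cleanly(package, blocked):
    """Return True if `package` imports while the `blocked` modules are unavailable."""
    with block_modules(*blocked):
        sys.modules.pop(package, None)  # force a fresh import of the package
        try:
            importlib.import_module(package)
        except ImportError:
            return False
    return True
```

A test in this style might assert `imports_cleanly("alphapy", ["yfinance", "alpaca", "polygon", "vectorbt"])`. Note that the sketch re-imports only the top-level package; submodules that are already cached are not reloaded, so it works best in a fresh interpreter or test process.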

Specific Code Review Notes

alphapy/data.py Changes

Lines 104-106: Fallback directory logic

if df.is_empty():
    input_dir = SSEP.join([run_dir, 'input'])
    df = read_frame(input_dir, filename, extension, separator)

Suggestion: Add logging to indicate when fallback occurs:

if df.is_empty():
    logger.warning("Data not found in %s, trying input directory", data_dir)
    input_dir = SSEP.join([run_dir, 'input'])
    df = read_frame(input_dir, filename, extension, separator)
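For a self-contained view of the same idea, the sketch below resolves a file from a primary directory, logs when it falls back, and fails loudly when neither location has the file. It uses `pathlib` instead of the project's `read_frame`, and the name `locate_input` is hypothetical.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def locate_input(filename, data_dir, input_dir):
    """Return the path to `filename`, preferring data_dir over input_dir."""
    primary = Path(data_dir) / filename
    if primary.is_file():
        return primary
    # Make the fallback visible instead of switching directories silently.
    logger.warning("%s not found in %s, trying %s", filename, data_dir, input_dir)
    fallback = Path(input_dir) / filename
    if fallback.is_file():
        return fallback
    raise FileNotFoundError(f"{filename} not found in {data_dir} or {input_dir}")
```

Raising when both locations are empty is a design choice of this sketch, not something the review or the existing code states.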

alphapy/__init__.py Changes

Line 12: Global warning suppression

warnings.filterwarnings("ignore", message="X does not have valid feature names")

Suggestion: Consider more targeted filtering or document why this specific warning needs suppression.
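One way to narrow the filter is to scope it to the calls that trigger the warning, using the standard `warnings.catch_warnings` context manager. The sketch below assumes the warning is raised as a `UserWarning`; the helper name `suppress_feature_name_warning` is hypothetical.

```python
import warnings
from contextlib import contextmanager

# Message prefix copied from the filter in alphapy/__init__.py.
FEATURE_NAME_MSG = "X does not have valid feature names"


@contextmanager
def suppress_feature_name_warning():
    """Ignore the feature-name warning only inside the with-block."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore",
            message=FEATURE_NAME_MSG,  # regex matched against the start of the message
            category=UserWarning,      # assumption: the warning category
        )
        yield
```

Wrapping only the affected prediction calls in `with suppress_feature_name_warning():` leaves the warning visible everywhere else, including in user code that imports the package.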

Migration Assessment

✅ Excellent Migration Strategy:

  • Clear documentation of what moved where
  • Preservation of version history via tags (v3.1.1-monolith)
  • Specific guidance for consuming repositories
  • Breaking changes clearly marked in changelog

Conclusion

This is an exemplary major version release that successfully achieves its architectural goals. The code quality remains high, no significant issues were introduced, and the migration path is well-documented. The split creates a cleaner, more focused ML framework while properly preserving the finance functionality in a separate repository.

Recommendation: ✅ APPROVE - This PR successfully delivers on its objectives with minimal risk and excellent execution.
