# Stage 02: Tooling Setup & Environment Configuration
**Project:** Turtle Trading Strategy Research  
**Author:** Panwei Hu  
**Date:** 2025-01-27

## Objectives
- Set up a development environment for quantitative research
- Configure data sources and API access
- Establish project structure and dependencies
- Test core libraries and functionality
- Create a reproducible research environment

## Environment Requirements
- **Python 3.8+** with scientific computing stack
- **Financial Data APIs** (Alpha Vantage, Yahoo Finance)
- **Time Series Analysis** libraries (pandas, numpy)
- **Backtesting Framework** components
- **Visualization** tools (matplotlib, seaborn)
- **Version Control** and project organization
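
One way to support the reproducibility objective is to record the exact package versions in use. The following is a minimal sketch, assuming the standard-library `importlib.metadata` module (Python 3.8+); the helper name `snapshot_environment` and the package list are illustrative and not part of the project.

```python
from importlib import metadata


def snapshot_environment(packages):
    """Return 'name==version' lines for installed packages.

    Hypothetical helper: packages that are not installed are listed as
    comment lines instead of raising.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name}: not installed")
    return lines


# Illustrative package list; adjust to the project's actual requirements.
pins = snapshot_environment(["numpy", "pandas", "matplotlib", "seaborn"])
print("\n".join(pins))
```

Writing these lines to a file next to `requirements.txt` would give a pinned record of the environment that produced a given set of results.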


In [1]:
import sys
import os
from pathlib import Path
import pkg_resources
from importlib import import_module
import warnings
warnings.filterwarnings('ignore')

# Set up project paths
PROJECT_ROOT = Path('..').resolve()
sys.path.append(str(PROJECT_ROOT / 'src'))

print("🐢 Turtle Trading - Tooling Setup & Environment Check")
print("="*60)
print(f"Python Version: {sys.version}")
print(f"Project Root: {PROJECT_ROOT}")
print(f"Working Directory: {Path.cwd()}")

# Check Python version (tuple comparison also handles future major versions)
if sys.version_info >= (3, 8):
    print("✅ Python version compatible (3.8+)")
else:
    print("❌ Python version too old. Requires 3.8+")
    
print("\n" + "="*60)




🐢 Turtle Trading - Tooling Setup & Environment Check
Python Version: 3.11.13 (main, Jun  5 2025, 08:21:08) [Clang 14.0.6 ]
Project Root: /Users/panweihu/Desktop/Desktop_m1/NYU_mfe/bootcamp/camp4/bootcamp_bill_panwei_hu/turtle_project
Working Directory: /Users/panweihu/Desktop/Desktop_m1/NYU_mfe/bootcamp/camp4/bootcamp_bill_panwei_hu/turtle_project/notebooks
✅ Python version compatible (3.8+)



In [4]:
# Core Dependencies Check
required_packages = {
    'numpy': '1.19.0',
    'pandas': '1.3.0', 
    'matplotlib': '3.3.0',
    'seaborn': '0.11.0',
    'scipy': '1.7.0',
    'requests': '2.25.0',
    'dotenv': '0.9.0',
    'yfinance': '0.1.70'
}

optional_packages = {
    'pyarrow': '5.0.0',
    'fastparquet': '0.7.0',
    'beautifulsoup4': '4.9.0',
    'lxml': '4.6.0',
    'openpyxl': '3.0.0'
}

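def check_package(package_name, min_version=None, optional=False):
    """Return True if the package imports and meets min_version (when given)."""
    # The import name is derived from package_name, so distributions whose
    # module name differs (e.g. beautifulsoup4 -> bs4) are reported as missing.
    from importlib import metadata  # stdlib replacement for deprecated pkg_resources
    try:
        pkg = import_module(package_name.replace('-', '_'))
    except ImportError:
        status = "⚠️ " if optional else "❌"
        print(f"{status} {package_name}: NOT INSTALLED")
        return False

    # Get version: prefer the module attribute, fall back to package metadata
    version = getattr(pkg, '__version__', None)
    if version is None:
        try:
            version = metadata.version(package_name)
        except metadata.PackageNotFoundError:
            version = "unknown"

    # Version check
    meets_minimum = True
    if min_version and version != "unknown":
        try:
            from packaging import version as pkg_version
            meets_minimum = pkg_version.parse(version) >= pkg_version.parse(min_version)
        except Exception:
            pass  # packaging missing or version unparsable: skip the comparison

    if meets_minimum:
        status = "✅"
    else:
        status = "⚠️ " if optional else "❌"
    print(f"{status} {package_name}: {version}")
    return meets_minimum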

print("📦 Core Dependencies:")
core_success = 0
for package, min_ver in required_packages.items():
    if check_package(package, min_ver, optional=False):
        core_success += 1

print(f"\n📦 Optional Dependencies:")
optional_success = 0
for package, min_ver in optional_packages.items():
    if check_package(package, min_ver, optional=True):
        optional_success += 1

print(f"\n📊 Dependency Summary:")
print(f"Core packages: {core_success}/{len(required_packages)} ({'✅' if core_success == len(required_packages) else '❌'})")
print(f"Optional packages: {optional_success}/{len(optional_packages)}")

if core_success == len(required_packages):
    print("\n🎉 All core dependencies satisfied!")
else:
    print(f"\n⚠️  Missing {len(required_packages) - core_success} core dependencies. Install with:")
    print("pip install -r requirements.txt")


📦 Core Dependencies:
✅ numpy: 2.0.1
✅ pandas: 2.3.1
✅ matplotlib: 3.10.0
✅ seaborn: 0.13.2
✅ scipy: 1.16.0
✅ requests: 2.32.4
✅ dotenv: 0.9.9
✅ yfinance: 0.2.65

📦 Optional Dependencies:
✅ pyarrow: 21.0.0
✅ fastparquet: 2024.11.0
⚠️  beautifulsoup4: NOT INSTALLED
✅ lxml: 6.0.0
⚠️  openpyxl: NOT INSTALLED

📊 Dependency Summary:
Core packages: 8/8 (✅)
Optional packages: 3/5

🎉 All core dependencies satisfied!
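
The dependency check derives the module name from the package name, which only works when the two match. `beautifulsoup4` is imported as `bs4`, so its `NOT INSTALLED` line may be a false negative rather than a missing package. The sketch below separates the two questions (is the module importable, and is the distribution installed). The mapping `IMPORT_NAMES` and the helper `probe` are hypothetical names introduced for illustration.

```python
from importlib import import_module, metadata

# Hypothetical mapping: distribution name on PyPI -> module name to import.
IMPORT_NAMES = {"beautifulsoup4": "bs4", "python-dotenv": "dotenv"}


def probe(dist_name):
    """Return (importable, version_or_None) for a distribution name."""
    module_name = IMPORT_NAMES.get(dist_name, dist_name.replace("-", "_"))
    try:
        import_module(module_name)
        importable = True
    except ImportError:
        importable = False
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return importable, version


print(probe("beautifulsoup4"))
```

Looking up the version by distribution name and the module by import name keeps the two namespaces from being confused.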


In [5]:
# Environment Configuration & API Setup
from dotenv import load_dotenv

print("🔧 Environment Configuration:")
print("="*40)

# Load environment variables
env_file = PROJECT_ROOT / '.env'
load_dotenv(env_file)

# Check for .env file
if env_file.exists():
    print(f"✅ .env file found: {env_file}")
else:
    print(f"⚠️  .env file not found: {env_file}")
    print("   Create .env file in project root with API keys")

# Check API keys
api_keys = {
    'ALPHAVANTAGE_API_KEY': 'Alpha Vantage (primary data source)',
    'QUANDL_API_KEY': 'Quandl (alternative data)',
    'FRED_API_KEY': 'FRED (economic data)',
}

print(f"\n🔑 API Key Status:")
for key, description in api_keys.items():
    value = os.getenv(key)
    if value:
        masked = value[:4] + '*' * (len(value) - 8) + value[-4:] if len(value) > 8 else '*' * len(value)
        print(f"✅ {key}: {masked} ({description})")
    else:
        print(f"❌ {key}: NOT SET ({description})")

# Environment variables for project
project_env = {
    'DATA_DIR': os.getenv('DATA_DIR', str(PROJECT_ROOT / 'data')),
    'CACHE_DIR': os.getenv('CACHE_DIR', str(PROJECT_ROOT / 'cache')),
    'LOG_LEVEL': os.getenv('LOG_LEVEL', 'INFO'),
}

print(f"\n⚙️  Project Configuration:")
for key, value in project_env.items():
    print(f"  {key}: {value}")

# Ensure directories exist
data_dir = Path(project_env['DATA_DIR'])
cache_dir = Path(project_env['CACHE_DIR'])

for directory in [data_dir, cache_dir, data_dir / 'raw', data_dir / 'processed']:
    directory.mkdir(parents=True, exist_ok=True)
    print(f"📁 Created/verified: {directory}")

print(f"\n✅ Environment configuration complete!")


🔧 Environment Configuration:
✅ .env file found: /Users/panweihu/Desktop/Desktop_m1/NYU_mfe/bootcamp/camp4/bootcamp_bill_panwei_hu/turtle_project/.env

🔑 API Key Status:
✅ ALPHAVANTAGE_API_KEY: M230********HLFK (Alpha Vantage (primary data source))
❌ QUANDL_API_KEY: NOT SET (Quandl (alternative data))
❌ FRED_API_KEY: NOT SET (FRED (economic data))

⚙️  Project Configuration:
  DATA_DIR: ./data
  CACHE_DIR: /Users/panweihu/Desktop/Desktop_m1/NYU_mfe/bootcamp/camp4/bootcamp_bill_panwei_hu/turtle_project/cache
  LOG_LEVEL: INFO
📁 Created/verified: data
📁 Created/verified: /Users/panweihu/Desktop/Desktop_m1/NYU_mfe/bootcamp/camp4/bootcamp_bill_panwei_hu/turtle_project/cache
📁 Created/verified: data/raw
📁 Created/verified: data/processed

✅ Environment configuration complete!
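
The recorded output shows `DATA_DIR: ./data`, a relative value, so the `data` directories appear to be created relative to the working directory (here `notebooks/`) rather than under the project root. One possible remedy is to anchor relative values at the project root before creating directories. This is a minimal sketch; `resolve_under` is a hypothetical helper, and the example paths are placeholders.

```python
from pathlib import Path


def resolve_under(root, value):
    """Anchor a relative path at root; return absolute paths unchanged."""
    path = Path(value).expanduser()
    if path.is_absolute():
        return path
    return (Path(root) / path).resolve()


# Placeholder root for illustration; in the notebook this would be PROJECT_ROOT.
example_root = "/tmp/turtle_project"
print(resolve_under(example_root, "./data"))
print(resolve_under(example_root, "/var/cache/turtle"))
```

With this approach the same `.env` file yields the same directories regardless of where the notebook is launched.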


In [7]:

# Core Functionality Tests
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import requests

print("🧪 Core Functionality Tests:")
print("="*40)

# Test 1: NumPy operations
print("1️⃣ NumPy Test:")
arr = np.random.randn(1000)
print(f"   Array shape: {arr.shape}")
print(f"   Mean: {arr.mean():.4f}")
print(f"   Std: {arr.std():.4f}")
print("   ✅ NumPy working")

# Test 2: Pandas operations
print("\n2️⃣ Pandas Test:")
dates = pd.date_range('2024-01-01', periods=100, freq='D')
df = pd.DataFrame({
    'date': dates,
    'price': 100 + np.cumsum(np.random.randn(100) * 0.02),
    'volume': np.random.randint(1000000, 10000000, 100)
})
print(f"   DataFrame shape: {df.shape}")
print(f"   Date range: {df['date'].min()} to {df['date'].max()}")
print(f"   Price range: ${df['price'].min():.2f} - ${df['price'].max():.2f}")
print("   ✅ Pandas working")

# Test 3: Plotting
print("\n3️⃣ Matplotlib Test:")
try:
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.plot(df['date'], df['price'], label='Price', linewidth=2)
    ax.set_title('Sample Price Series')
    ax.set_xlabel('Date')
    ax.set_ylabel('Price ($)')
    ax.legend()
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.close()  # Close to avoid display in notebook
    print("   ✅ Matplotlib working")
except Exception as e:
    print(f"   ❌ Matplotlib error: {e}")

# Test 4: Internet connectivity
print("\n4️⃣ Internet Connectivity Test:")
try:
    response = requests.get('https://httpbin.org/status/200', timeout=5)
    if response.status_code == 200:
        print("   ✅ Internet connection working")
    else:
        print(f"   ⚠️  Unexpected status code: {response.status_code}")
except Exception as e:
    print(f"   ❌ Internet connection failed: {e}")

# Test 5: Data source accessibility
print("\n5️⃣ Data Source Test:")
try:
    # Test Yahoo Finance
    import yfinance as yf
    ticker = yf.Ticker("SPY")
    hist = ticker.history(period="5d")
    if not hist.empty:
        print("   ✅ Yahoo Finance accessible")
    else:
        print("   ⚠️  Yahoo Finance returned empty data")
except Exception as e:
    print(f"   ❌ Yahoo Finance error: {e}")

# Test Alpha Vantage if API key available
alpha_key = os.getenv('ALPHAVANTAGE_API_KEY')
if alpha_key:
    try:
        url = f"https://www.alphavantage.co/query?function=GLOBAL_QUOTE&symbol=SPY&apikey={alpha_key}"
        response = requests.get(url, timeout=10)
        data = response.json()
        if 'Global Quote' in data:
            print("   ✅ Alpha Vantage API accessible")
            print(data)
        else:
            print(f"   ⚠️  Alpha Vantage API response: {list(data.keys())}")
    except Exception as e:
        print(f"   ❌ Alpha Vantage API error: {e}")
else:
    print("   ⚠️  Alpha Vantage API key not configured")

print("\n🎯 Functionality test complete!")


🧪 Core Functionality Tests:
1️⃣ NumPy Test:
   Array shape: (1000,)
   Mean: 0.0492
   Std: 0.9951
   ✅ NumPy working

2️⃣ Pandas Test:
   DataFrame shape: (100, 3)
   Date range: 2024-01-01 00:00:00 to 2024-04-09 00:00:00
   Price range: $99.75 - $100.01
   ✅ Pandas working

3️⃣ Matplotlib Test:
   ✅ Matplotlib working

4️⃣ Internet Connectivity Test:
   ✅ Internet connection working

5️⃣ Data Source Test:
   ✅ Yahoo Finance accessible
   ✅ Alpha Vantage API accessible
{'Global Quote': {'01. symbol': 'SPY', '02. open': '643.1200', '03. high': '644.1050', '04. low': '638.4800', '05. price': '639.8100', '06. volume': '69750731', '07. latest trading day': '2025-08-19', '08. previous close': '643.3000', '09. change': '-3.4900', '10. change percent': '-0.5425%'}}

🎯 Functionality test complete!
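
The Alpha Vantage response printed above carries every field as a string under numbered keys. Before such a quote is used in calculations, the fields would need to be converted to numeric types. The following is a minimal sketch using the recorded payload as sample input; `parse_global_quote` and its output field names are hypothetical and not part of the project code.

```python
def parse_global_quote(payload):
    """Convert selected string fields of a 'Global Quote' payload to typed values."""
    quote = payload["Global Quote"]
    return {
        "symbol": quote["01. symbol"],
        "price": float(quote["05. price"]),
        "volume": int(quote["06. volume"]),
        "latest_trading_day": quote["07. latest trading day"],
        # '-0.5425%' -> -0.005425 (fraction, not percent)
        "change_percent": float(quote["10. change percent"].rstrip("%")) / 100,
    }


# Sample input copied from the recorded notebook output above.
sample = {"Global Quote": {
    "01. symbol": "SPY", "02. open": "643.1200", "03. high": "644.1050",
    "04. low": "638.4800", "05. price": "639.8100", "06. volume": "69750731",
    "07. latest trading day": "2025-08-19", "08. previous close": "643.3000",
    "09. change": "-3.4900", "10. change percent": "-0.5425%",
}}

parsed = parse_global_quote(sample)
print(parsed)
```

Keeping the conversion in one function means a missing key raises a `KeyError` at the boundary instead of surfacing later as a string in a numeric calculation.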
