### Short-Term Portfolio Selection Strategy

This notebook executes a quantitative stock selection strategy for short-term mean-reversion opportunities.

**Workflow:**
1.  **Prerequisites:** The final merged data file for the target date must exist. The core selection logic (`select_short_term_stocks_debug`) and save/load helpers are assumed to be in `src/utils.py`.
2.  **Load Data:** Loads the universe of stocks and their features.
3.  **Execute Strategy:** Applies a set of predefined filters and a weighted scoring model to the data universe to select a small portfolio of stocks.
4.  **Analyze & Save Results:** Enriches the selected portfolio with descriptive data (Company, Industry) and saves the results (portfolio DataFrame and parameters used) to disk.
5.  **Verify Calculation:** Performs a manual, step-by-step recalculation of the scores for a single ticker to validate the core logic.

### Setup and Configuration

This cell defines all parameters for the strategy run, including filters, scoring weights, and file paths. **This is the main cell to modify for tuning the strategy.**

In [3]:
# py8_portf_picks_short_term_v6.ipynb

import sys
from pathlib import Path
import pandas as pd
import json
import numpy as np # Import numpy for the assertion

# --- Project Path Setup ---
NOTEBOOK_DIR = Path.cwd()
ROOT_DIR = NOTEBOOK_DIR.parent if NOTEBOOK_DIR.name == 'notebooks' else NOTEBOOK_DIR
SRC_DIR = ROOT_DIR / 'src'
if str(SRC_DIR) not in sys.path:
    sys.path.append(str(SRC_DIR))

# --- Dynamic Configuration (from config.py) ---
from config import DATE_STR, DEST_DIR
import utils # Import your custom utility library

# --- Strategy Parameters for THIS RUN ---
# These parameters will be passed to the function, overriding its defaults.
N_SELECT = 10

STRATEGY_FILTERS = {
    'min_price': 10.0,
    'min_avg_volume_m': 2.0,
    'min_roe_pct': 5.0,
    'max_debt_eq': 1.5
}

STRATEGY_SCORING_WEIGHTS = {
    'rsi': 0.35,
    'change': 0.35,
    'rel_volume': 0.20,
    'volatility': 0.10
}

STRATEGY_INV_VOL_COL = 'ATR/Price %'

# --- File Path Construction ---
DATA_DIR = Path(DEST_DIR)
SOURCE_PATH = DATA_DIR / f'{DATE_STR}_df_finviz_merged_stocks_etfs.parquet'
OUTPUT_BASE_PATH = ROOT_DIR / 'output' / 'selection_results' / f'{DATE_STR}_short_term_mean_reversion'

# --- Notebook Setup ---
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 2000)
pd.set_option('display.max_rows', 200)
pd.set_option('display.float_format', '{:.4f}'.format)
%load_ext autoreload
%autoreload 2

# --- Verification ---
print(f"Executing strategy for Date: {DATE_STR}")
print(f"Source file: {SOURCE_PATH}")
print(f"Output will be saved with base path: {OUTPUT_BASE_PATH}")
assert abs(sum(STRATEGY_SCORING_WEIGHTS.values()) - 1.0) < 1e-9, "Scoring weights must sum to 1.0"

Executing strategy for Date: 2025-06-13
Source file: ..\data\2025-06-13_df_finviz_merged_stocks_etfs.parquet
Output will be saved with base path: c:\Users\ping\Files_win10\python\py311\stocks\output\selection_results\2025-06-13_short_term_mean_reversion


### Step 1: Load Data Universe

Load the complete dataset from which the selection will be made.

In [31]:
print(f"--- Step 1: Loading data from {SOURCE_PATH.name} ---")

try:
    df_finviz = pd.read_parquet(SOURCE_PATH)
    print(f"Successfully loaded data for {len(df_finviz)} tickers.")
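except FileNotFoundError:
    # Execution is not actually stopped here: df_finviz is set to None so that later cells skip themselves.
    print(f"ERROR: Source file not found at {SOURCE_PATH}. Later steps will be skipped.")
    df_finviz = None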

--- Step 1: Loading data from 2025-06-13_df_finviz_merged_stocks_etfs.parquet ---
Successfully loaded data for 1530 tickers.
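
Before running the strategy it can help to confirm that the loaded frame actually carries the columns the later cells read. The following is a minimal sketch, not part of the notebook: `REQUIRED_COLS` and `missing_columns` are hypothetical names, and the column list is inferred from the filter names and the verification cell.

```python
import pandas as pd

# Columns the later cells read; inferred from the filters and the
# verification cell. Adjust if the merged file uses different labels.
REQUIRED_COLS = ['Price', 'Avg Volume, M', 'ROE %', 'Debt/Eq',
                 'RSI', 'Change %', 'Rel Volume', 'ATR/Price %']


def missing_columns(df: pd.DataFrame, required=REQUIRED_COLS) -> list:
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]


# Tiny stand-in frame so the sketch runs without the parquet file.
demo = pd.DataFrame({'Price': [18.47], 'RSI': [42.52]}, index=['BEKE'])
missing = missing_columns(demo)
print(f"Missing columns: {missing}")
```

In the notebook this could be called as `missing_columns(df_finviz)` right after the load, skipping the strategy if the list is not empty.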


### Step 2: Execute Selection Strategy

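Run the core selection logic with the parameters defined in the setup cell. The function `select_short_term_stocks_debug` is assumed to live in `src/utils.py`; it returns the selected portfolio, the filtered universe, and the parameters actually used.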

In [32]:
if df_finviz is not None:
    print("\n--- Step 2: Executing stock selection strategy ---")
    
    # Pass the strategy-specific parameters defined in the setup cell.
    df_selected, df_filtered, params_used = utils.select_short_term_stocks_debug(
        df_finviz=df_finviz,
        n_select=N_SELECT,
        filters=STRATEGY_FILTERS,
        scoring_weights=STRATEGY_SCORING_WEIGHTS,
        inv_vol_col_name=STRATEGY_INV_VOL_COL
    )

    if df_selected.empty:
        print("\nNo stocks were selected based on the current criteria.")
    else:
        print(f"\nStrategy executed successfully. Selected {len(df_selected)} stocks.")
else:
    print("\nSkipping strategy execution because data failed to load.")
    df_selected, df_filtered, params_used = pd.DataFrame(), pd.DataFrame(), {}


--- Step 2: Executing stock selection strategy ---

Strategy executed successfully. Selected 10 stocks.
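
The body of `select_short_term_stocks_debug` is not shown in this notebook. As a rough illustration of what a filter-then-score selection can look like, here is a minimal sketch under stated assumptions: `select_sketch` is a hypothetical stand-in, the filter column names are taken from the Step 3 output, and the factor signs are those used in the verification cell. Unlike the real function it returns only two frames and no parameter dictionary.

```python
import pandas as pd


def select_sketch(df, n_select, filters, weights, vol_col='ATR/Price %'):
    """Hypothetical stand-in for utils.select_short_term_stocks_debug."""
    # 1. Hard filters (column names assumed from the Step 3 output).
    mask = (
        (df['Price'] >= filters['min_price'])
        & (df['Avg Volume, M'] >= filters['min_avg_volume_m'])
        & (df['ROE %'] >= filters['min_roe_pct'])
        & (df['Debt/Eq'] <= filters['max_debt_eq'])
    )
    filtered = df[mask].copy()

    # 2. Population z-score of each factor across the filtered universe.
    factor_cols = {'rsi': 'RSI', 'change': 'Change %',
                   'rel_volume': 'Rel Volume', 'volatility': vol_col}
    # Signs as in the verification cell: low RSI, a negative change and
    # low volatility raise the score, as does high relative volume.
    signs = {'rsi': -1, 'change': -1, 'rel_volume': 1, 'volatility': -1}

    score = pd.Series(0.0, index=filtered.index)
    for factor, col in factor_cols.items():
        z = (filtered[col] - filtered[col].mean()) / filtered[col].std(ddof=0)
        score += signs[factor] * weights[factor] * z
    filtered['final_score'] = score

    # 3. Keep the n_select highest scores.
    selected = filtered.nlargest(n_select, 'final_score')
    return selected, filtered


# Made-up data: ticker C fails the price filter.
demo = pd.DataFrame(
    {'Price': [20.0, 50.0, 5.0, 30.0],
     'Avg Volume, M': [3.0, 5.0, 6.0, 4.0],
     'ROE %': [10.0, 20.0, 15.0, 8.0],
     'Debt/Eq': [0.5, 1.0, 0.3, 0.2],
     'RSI': [30.0, 60.0, 20.0, 50.0],
     'Change %': [-4.0, 1.0, -8.0, 0.0],
     'Rel Volume': [2.0, 1.0, 3.0, 0.8],
     'ATR/Price %': [2.0, 3.0, 1.0, 2.5]},
    index=['A', 'B', 'C', 'D'],
)
demo_filters = {'min_price': 10.0, 'min_avg_volume_m': 2.0,
                'min_roe_pct': 5.0, 'max_debt_eq': 1.5}
demo_weights = {'rsi': 0.35, 'change': 0.35, 'rel_volume': 0.20, 'volatility': 0.10}

selected, filtered = select_sketch(demo, 2, demo_filters, demo_weights)
print(selected[['final_score']])
```

Note that the z-scores are computed on the filtered universe rather than on the full data set, which is what the manual check in Step 5 also assumes.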


### Step 3: Analyze and Enrich Results

Display the selected stocks and add key descriptive columns from the original Finviz data for better context.

In [33]:
if not df_selected.empty:
    print("\n--- Step 3: Analyzing and enriching selected portfolio ---")

    # Add key descriptive columns for context
    cols_to_add = ['Company', 'Industry', 'Market Cap, M', 'Rank']    
    df_display = utils.add_columns_from_source(
        base_df=df_selected,
        source_df=df_finviz,
        cols_to_add=cols_to_add,
        match_on_base_index=True
    )
    
    # Optional: restrict the display to the key columns.
    # display_cols = cols_to_add + [
    #     'final_score', 'Weight_EW', 'Weight_IV', 'Weight_SW',
    #     'RSI', 'Change %', 'Rel Volume', STRATEGY_INV_VOL_COL
    # ]
    # display(df_display[display_cols])

    print("Top selected stocks with scores and weights:")
    print(f'df_display:\n{df_display}')
else:
    print("\nNo results to analyze.")


--- Step 3: Analyzing and enriching selected portfolio ---
Top selected stocks with scores and weights:
df_display:
                  Company                             Industry  Market Cap, M  Rank  Avg Volume, M  Debt/Eq   ROE %    Price  Rel Volume     RSI  Change %  ATR/Price %   z_RSI  z_Change%  z_RelVolume  z_ATR/Price%  final_score  Weight_EW  Weight_IV  Weight_SW
BEKE  KE Holdings Inc ADR                 Real Estate Services     21170.0000   623         9.3300   0.3200  6.5300  18.4700      5.0800 42.5200   -2.7400       3.1402 -0.9659    -0.6954       8.2710        0.3512       2.2005     0.1000     0.0932     0.1425
ADBE            Adobe Inc               Software - Application    166930.0000    77         3.8000   0.5700 52.2500 391.6800      2.7900 39.3100   -5.3200       2.3489 -1.2526    -1.9088       3.6803       -0.4221       1.8848     0.1000     0.1246     0.1221
BF-B    Brown-Forman Corp  Beverages - Wineries & Distilleries     12550.0000   901         3.5700   0.
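
The three weight columns are not explained in the notebook. Their names suggest equal weighting (`Weight_EW`), inverse-volatility weighting (`Weight_IV`) and score weighting (`Weight_SW`), and the two complete rows above are consistent with that reading: `Weight_IV` times `ATR/Price %` is about 0.2927 for both tickers, and `final_score` divided by `Weight_SW` is about 15.44 for both. The sketch below shows one way such weights could be computed; `add_weights_sketch` is a hypothetical helper, and the score weighting assumes every selected score is positive, which the real function may handle differently.

```python
import pandas as pd


def add_weights_sketch(selected, vol_col='ATR/Price %'):
    """Hypothetical helper: attach three candidate portfolio weightings."""
    out = selected.copy()

    # Equal weight: every name gets 1/N.
    out['Weight_EW'] = 1.0 / len(out)

    # Inverse volatility: weight proportional to 1 / vol_col.
    inv_vol = 1.0 / out[vol_col]
    out['Weight_IV'] = inv_vol / inv_vol.sum()

    # Score weight: weight proportional to final_score
    # (only sensible when all scores are positive).
    out['Weight_SW'] = out['final_score'] / out['final_score'].sum()
    return out


# The two complete rows from the output above, used as a two-stock example.
demo = pd.DataFrame(
    {'ATR/Price %': [3.1402, 2.3489], 'final_score': [2.2005, 1.8848]},
    index=['BEKE', 'ADBE'],
)
weighted = add_weights_sketch(demo)
print(weighted[['Weight_EW', 'Weight_IV', 'Weight_SW']])
```

With only two names the weights differ from the ten-stock values shown above, but the ordering is the same: the less volatile ADBE gets the larger inverse-volatility weight, and the higher-scoring BEKE gets the larger score weight.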

### Step 4: Save Selection Results

Save the portfolio DataFrame (Parquet and CSV) and the parameters used (JSON) for record-keeping and backtesting.

In [34]:
# --- Explicitly name the index before saving or displaying ---
df_selected.index.name = 'Ticker'

if not df_selected.empty:
    print("\n--- Step 4: Saving selection results and parameters ---")

    save_successful = utils.save_selection_results(
        df_selected=df_selected,       
        parameters_used=params_used,
        base_filepath=str(OUTPUT_BASE_PATH), # Convert Path to string for the function
        save_csv=True
    )

    if save_successful:
        print(f"Results saved successfully with base path: {OUTPUT_BASE_PATH}")
else:
    print("\nNo results to save.")


--- Step 4: Saving selection results and parameters ---
Results saved successfully with base path: c:\Users\ping\Files_win10\python\py311\stocks\output\selection_results\2025-06-13_short_term_mean_reversion
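
The implementation of `save_selection_results` is not shown here. The following is a minimal sketch of a helper with a similar shape, assuming it writes one file per format next to a common base path. The name `save_results_sketch`, the `save_parquet` flag and the file suffixes (`.parquet`, `.csv`, `_params.json`) are illustrative assumptions, not the actual behaviour of the function in `utils.py`.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def save_results_sketch(df_selected, parameters_used, base_filepath,
                        save_csv=True, save_parquet=True):
    """Hypothetical stand-in for utils.save_selection_results."""
    base = Path(base_filepath)
    base.parent.mkdir(parents=True, exist_ok=True)
    written = []

    if save_parquet:
        # Needs a parquet engine such as pyarrow to be installed.
        parquet_path = base.with_name(base.name + '.parquet')
        df_selected.to_parquet(parquet_path)
        written.append(parquet_path)

    if save_csv:
        csv_path = base.with_name(base.name + '.csv')
        df_selected.to_csv(csv_path)
        written.append(csv_path)

    # default=str keeps values json cannot encode (e.g. Path) from raising.
    params_path = base.with_name(base.name + '_params.json')
    with open(params_path, 'w', encoding='utf-8') as fh:
        json.dump(parameters_used, fh, indent=2, default=str)
    written.append(params_path)
    return written


# Demo in a temporary directory; parquet is skipped so no engine is required.
demo = pd.DataFrame({'final_score': [2.2005, 1.8848]}, index=['BEKE', 'ADBE'])
demo.index.name = 'Ticker'
with tempfile.TemporaryDirectory() as tmp:
    paths = save_results_sketch(demo, {'n_select': 10}, Path(tmp) / 'demo_run',
                                save_parquet=False)
    reloaded = pd.read_csv(paths[0], index_col='Ticker')
    saved_params = json.loads(paths[1].read_text(encoding='utf-8'))
print([p.name for p in paths])
```

Naming the index `Ticker` before saving, as the cell above does, is what lets the CSV be read back with `index_col='Ticker'`.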


### Step 5: Verify Calculation Logic

This section recalculates the z-scores and the final weighted score by hand for a single ticker and compares them with the values returned by the selection function. A mismatch points to a difference between the documented scoring logic and what the function actually computes.
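
Written out, the check in the next cell amounts to the following. Here $x_k$ is the ticker's value for factor $k$, $\mu_k$ and $\sigma_k$ are the mean and population standard deviation of that factor over the filtered universe, and $S$ is the final score. The signs and weights are the ones used in the verification cell.

$$
z_k = \frac{x_k - \mu_k}{\sigma_k}, \qquad
S = -0.35\, z_{\mathrm{RSI}} - 0.35\, z_{\mathrm{Change}} + 0.20\, z_{\mathrm{RelVol}} - 0.10\, z_{\mathrm{ATR/Price}}
$$

Using the BEKE values printed in the cell output, $z_{\mathrm{RSI}} = (42.52 - 53.3353) / 11.1973 \approx -0.9659$, and the weighted sum is

$$
S \approx 0.35(0.9659) + 0.35(0.6954) + 0.20(8.2710) - 0.10(0.3512) \approx 2.2005 .
$$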

In [35]:
if not df_selected.empty and not df_filtered.empty:
    print("\n--- Step 5: Verifying calculation for a sample ticker ---")
    
    # Pick a sample ticker from the results
    sample_ticker = df_selected.index[0]
    print(f"Verifying scores for ticker: {sample_ticker}")

    # Manually calculate the z-score for each component
    factor_cols = {
        'rsi': 'RSI',
        'change': 'Change %',
        'rel_volume': 'Rel Volume',
        'volatility': STRATEGY_INV_VOL_COL,  # same column that was passed to the function
    }
    z_score_calcs = {}
    for factor, col_name in factor_cols.items():
        value = df_filtered.loc[sample_ticker, col_name]
        mean = df_filtered[col_name].mean()
        std = df_filtered[col_name].std(ddof=0)  # population std (ddof=0), matching the function's z-scores
        z_score = (value - mean) / std
        z_score_calcs[factor] = z_score

        # The function stores each z-score in a column named 'z_' + the factor column without spaces
        z_col_name = f'z_{col_name.replace(" ", "")}'

        print(f"\nFactor: {factor} ({col_name})")
        print(f"  - Value: {value:.4f}, Mean: {mean:.4f}, Std: {std:.4f}")
        print(f"  - Manual Z-Score: {z_score:.4f}")
        print(f"  - Function Z-Score: {df_selected.loc[sample_ticker, z_col_name]:.4f}")

    # Manually calculate the final weighted score
    manual_final_score = (
        z_score_calcs['rsi'] * STRATEGY_SCORING_WEIGHTS['rsi'] * (-1) +
        z_score_calcs['change'] * STRATEGY_SCORING_WEIGHTS['change'] * (-1) +
        z_score_calcs['rel_volume'] * STRATEGY_SCORING_WEIGHTS['rel_volume'] * (1) +
        z_score_calcs['volatility'] * STRATEGY_SCORING_WEIGHTS['volatility'] * (-1)
    )

    print("\n--- Final Score Comparison ---")
    print(f"Manual Final Score Calculation: {manual_final_score:.4f}")
    print(f"Function Final Score from DataFrame: {df_selected.loc[sample_ticker, 'final_score']:.4f}")

    # Assert that the manual calculation is close to the function's result
    assert np.isclose(manual_final_score, df_selected.loc[sample_ticker, 'final_score']), "Verification failed: Manual score does not match function score!"
    print("\nVerification successful!")

else:
    print("\nSkipping verification step.")


--- Step 5: Verifying calculation for a sample ticker ---
Verifying scores for ticker: BEKE

Factor: rsi (RSI)
  - Value: 42.5200, Mean: 53.3353, Std: 11.1973
  - Manual Z-Score: -0.9659
  - Function Z-Score: -0.9659

Factor: change (Change %)
  - Value: -2.7400, Mean: -1.2615, Std: 2.1262
  - Manual Z-Score: -0.6954
  - Function Z-Score: -0.6954

Factor: rel_volume (Rel Volume)
  - Value: 5.0800, Mean: 0.9541, Std: 0.4988
  - Manual Z-Score: 8.2710
  - Function Z-Score: 8.2710

Factor: volatility (ATR/Price %)
  - Value: 3.1402, Mean: 2.7808, Std: 1.0234
  - Manual Z-Score: 0.3512
  - Function Z-Score: 0.3512

--- Final Score Comparison ---
Manual Final Score Calculation: 2.2005
Function Final Score from DataFrame: 2.2005

Verification successful!
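
The check above covers one ticker. A vectorized variant can recompute the score for every row at once and compare all selected tickers in a single call. This is a minimal sketch, assuming the same factor columns, signs and weights as in the verification cell; `recompute_all_scores`, `FACTOR_COLS` and `SIGNED_WEIGHTS` are hypothetical names, and the demo data is made up.

```python
import numpy as np
import pandas as pd

FACTOR_COLS = ['RSI', 'Change %', 'Rel Volume', 'ATR/Price %']
# Weight multiplied by sign for each factor, indexed by column name.
SIGNED_WEIGHTS = pd.Series([-0.35, -0.35, 0.20, -0.10], index=FACTOR_COLS)


def recompute_all_scores(df_filtered: pd.DataFrame) -> pd.Series:
    """Recompute the weighted z-score for every row of the filtered universe."""
    x = df_filtered[FACTOR_COLS]
    # Column-wise population z-scores (ddof=0), as in the manual check.
    z = (x - x.mean()) / x.std(ddof=0)
    # Row-wise dot product with the signed weights.
    return z.dot(SIGNED_WEIGHTS)


# Made-up three-ticker universe so the sketch runs on its own.
demo = pd.DataFrame(
    {'RSI': [30.0, 60.0, 50.0],
     'Change %': [-4.0, 1.0, 0.0],
     'Rel Volume': [2.0, 1.0, 0.8],
     'ATR/Price %': [2.0, 3.0, 2.5]},
    index=['A', 'B', 'D'],
)
scores = recompute_all_scores(demo)
print(scores)
# Each z-score column sums to zero, so the weighted scores do too.
print(np.isclose(scores.sum(), 0.0))
```

In the notebook, the comparison could then read `np.allclose(recompute_all_scores(df_filtered).loc[df_selected.index], df_selected['final_score'])`, assuming `df_selected` keeps the tickers as its index.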
