# PricePredict Prediction Data Structures

The outputs of the PricePredict class include an analysis of predictions versus seasonality, and of the correlations between a given stock and other stocks.

This document explains how these outputs can be leveraged.

### Understanding the `pred_rank` Array in the `pred_rankings` Structure

This section explains how the `pred_rank` array is generated within the `pred_rankings` structure, and how the attributes of that structure can support price prediction strategies. The description follows the class's prediction adjustment and analysis processes.

#### How the `pred_rank` Array is Created
The `pred_rank` array is computed during the prediction adjustment phase (in the `adjust_prediction` method) to quantify the strength of predicted price changes. It's not a raw output from the LSTM model but a post-processing step that ranks deltas for interpretability. Here's the step-by-step process:

1. **Delta Extraction**: After generating raw predictions (e.g., for close, high, and low prices), the system calculates the differences (deltas) between consecutive predicted closing prices. These deltas capture the predicted movement from one period to the next (e.g., daily or weekly).

2. **Focus on Magnitude**: Absolute values of these deltas are used to emphasize the size of changes, regardless of direction (positive or negative). This creates a series of values representing the "intensity" of each predicted shift.

3. **Binning and Ranking**: The absolute deltas are binned into a 10-bin histogram using NumPy's `histogram` and `digitize` functions. With an integer bin count, `np.histogram` produces equal-width bins, so these are "deciles" only in the loose sense of ten intervals, not quantiles. Each delta is assigned a rank from 0 to 10 based on its bin:
   - 0: Smallest changes (low intensity, potentially noisy signals).
   - 10: Largest changes (high intensity, stronger predicted movements).

   This results in the `pred_rank` array, which holds one rank per predicted period, in sequence.

4. **Integration into `pred_rankings`**: The array is stored in the `pred_rankings` dictionary (generated in the `prediction_analysis` method), alongside related metrics like the most recent delta (`pred_last_delta`) and seasonal equivalents. The last non-placeholder rank (e.g., second-to-last) is often emphasized for the immediate prediction.

This ranking helps normalize predictions, making them easier to compare across periods or assets.
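
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the class's actual code: the helper name `rank_deltas` and the sample prices are hypothetical, and digitizing against the upper bin edges is an assumption chosen because it reproduces the 0 to 10 scale described above.

```python
import numpy as np

def rank_deltas(pred_close, n_bins=10):
    """Rank the magnitude of consecutive predicted-close changes.

    Hypothetical helper for illustration; not part of PricePredict.
    """
    pred_close = np.asarray(pred_close, dtype=float)
    deltas = np.diff(pred_close)        # step 1: change between consecutive predictions
    abs_deltas = np.abs(deltas)         # step 2: keep magnitude, drop direction
    # Step 3: equal-width bin edges spanning the observed magnitudes.
    _, edges = np.histogram(abs_deltas, bins=n_bins)
    # Assumption: digitize against the upper edges only, so the smallest
    # bin maps to rank 0 and the maximum delta maps to rank n_bins.
    ranks = np.digitize(abs_deltas, edges[1:])
    return deltas, ranks

# Invented sample data: five predicted closes give four deltas.
deltas, ranks = rank_deltas([100, 101, 103, 102, 110])
print(deltas)  # signed changes
print(ranks)   # magnitude ranks on the 0..10 scale
```

Because the bin edges come from the series itself, a rank is relative to the other deltas in the same series; the same absolute move can rank differently for a quiet stock than for a volatile one.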

#### How Attributes in `pred_rankings` Can Be Leveraged for Price Prediction
The `pred_rankings` structure (a dictionary in the `analysis` output) provides a compact summary of prediction strength and trends, combining machine learning outputs with seasonal insights. Here's how key attributes can be applied to improve price prediction models or trading decisions:

- **`pred_rank` (the array itself)**:
  - **Leverage**: This array ranks the magnitude of predicted changes across all periods, acting as a confidence filter. For instance, in a prediction pipeline, prioritize high-rank periods (e.g., >7) for trades, as they indicate stronger signals. It's useful for risk management—low ranks might signal unreliable predictions, reducing false positives in automated systems. In backtesting, aggregate ranks to score overall model performance.

- **`pred_last_delta`**:
  - **Leverage**: This is the most recent predicted price change (delta). Combine it with `pred_rank` for short-term forecasts: for example, a high rank with a positive delta could trigger a buy signal in momentum-based strategies. It can also inform position sizing based on the expected size of the move.

- **`season_last_delta`** and **`season_rank`**:
  - **Leverage**: These derive from seasonal decomposition (e.g., trends over cycles like 30 days), ranking seasonal deltas similarly to `pred_rank`. Use them for ensemble predictions—e.g., if `season_rank` aligns with `pred_rank`, it validates the forecast and increases confidence. This is powerful for seasonal trading (e.g., end-of-quarter effects), where you blend ML predictions with historical patterns to refine long-term price targets.

Overall, `pred_rankings` turns predictions into ranked, actionable insights. For example, in a custom model, you could compute a weighted score (e.g., `pred_rank * pred_last_delta`) to adjust base predictions, or use it in pairs trading by comparing rankings across correlated assets. This structure enhances robustness, especially when integrated with other class features like sentiment analysis.
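
As a rough sketch of the weighted-score idea, the snippet below builds a stand-in `pred_rankings` dictionary and derives a simple signal from it. The key names follow the text above, but the values, the helper `last_signal`, the choice of index, and the threshold of 7 are all illustrative assumptions; the real dictionary may be laid out differently.

```python
import numpy as np

# Hypothetical stand-in for the pred_rankings dictionary (invented values).
pred_rankings = {
    "pred_rank": np.array([2, 5, 9, 8]),
    "pred_last_delta": 1.25,
    "season_rank": np.array([1, 4, 7, 6]),
    "season_last_delta": 0.80,
}

def last_signal(rankings, index=-1, min_rank=7):
    """Summarize one prediction period (hypothetical helper).

    `index` selects which rank counts as "current"; -1 is an assumption.
    """
    rank = int(rankings["pred_rank"][index])
    delta = float(rankings["pred_last_delta"])
    season_delta = float(rankings["season_last_delta"])
    return {
        "score": rank * delta,                 # weighted score from the text
        "strong": rank > min_rank,             # high-rank filter
        "season_agrees": bool(np.sign(delta) == np.sign(season_delta)),
    }

signal = last_signal(pred_rankings)
print(signal)
```

Keeping the three fields separate, rather than folding them into one number, makes it easier to see in a backtest which condition was responsible for a trade being taken or skipped.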


### How `pred_strength` is Calculated in the PricePredict Class

The `pred_strength` attribute in the PricePredict class is a composite score that quantifies the overall reliability and intensity of a price prediction. It is not a standalone data structure but a value within the broader `analysis` dictionary (under the `pred_strength` key), generated during the prediction analysis phase. The score integrates machine learning predictions with seasonal trends, providing a normalized metric for decision-making. Its calculation is described step by step below.

#### Step-by-Step Calculation of `pred_strength`
1. **Gather Core Components**:
   - **Prediction Rank (`pred_rank`)**: This is the rank (typically 0-10) of the most recent predicted price change's magnitude, derived from binning absolute deltas in the predictions. Higher values indicate stronger predicted movements.
   - **Seasonal Rank (`season_rank`)**: Similar to `pred_rank`, but based on deltas from seasonal decomposition (e.g., trends over cycles like 30 days). It's adjusted for direction (positive for upward trends, negative for downward).
   - **Seasonal Correlation (`season_corr`)**: A fraction between 0 and 1 measuring how well seasonal trends align with actual price movements. It's computed by comparing up/down days in seasonal data versus historical closes.

2. **Combine Ranks with Correlation**:
   - First, weight the seasonal rank by the correlation: `season_rank * season_corr`. This emphasizes seasonal signals only if they historically match real trends (e.g., a high correlation boosts the seasonal contribution).

3. **Aggregate and Normalize**:
   - Add the weighted seasonal component to the prediction rank: `pred_rank + (season_rank * season_corr)`.
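   - Divide by 20 to normalize: `(pred_rank + (season_rank * season_corr)) / 20`. With `pred_rank` in 0 to 10, a signed `season_rank` in -10 to 10, and `season_corr` in 0 to 1, the result lies between -0.5 and 1.0. It would span -1 to 1 only if `pred_rank` also carried a sign.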
   - Round the result to 4 decimal places for precision.

   **Formula Summary**:
   `pred_strength = round((pred_rank + (season_rank * season_corr)) / 20, 4)`

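   - Positive values suggest bullish strength (e.g., 0.5 indicates moderate upward confidence).
   - Negative values suggest bearish strength (e.g., -0.3 indicates a mild downward signal). Because `pred_rank` is built from absolute deltas, a negative score can arise only when a downward seasonal term outweighs it.
   - Values near 0 imply neutral or weak signals.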

This calculation occurs after predictions are adjusted and analyzed (e.g., in the `prediction_analysis` method), ensuring it reflects both ML outputs and seasonal validation.
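
The formula can be checked with a short worked example. The sketch below is an illustration under stated assumptions: `direction_agreement` is a hypothetical stand-in for however the class computes `season_corr` (here, the fraction of periods in which the two series move in the same direction), and all input values are invented. Only the `pred_strength` expression itself comes from the formula above.

```python
import numpy as np

def direction_agreement(seasonal, closes):
    """Fraction of periods where both series move the same way.

    Hypothetical stand-in for season_corr; the class may compute it differently.
    """
    seasonal_up = np.diff(np.asarray(seasonal, dtype=float)) > 0
    closes_up = np.diff(np.asarray(closes, dtype=float)) > 0
    return float(np.mean(seasonal_up == closes_up))

def pred_strength(pred_rank, season_rank, season_corr):
    """Formula from the text: weight the seasonal rank, add, divide by 20."""
    return round((pred_rank + season_rank * season_corr) / 20, 4)

# Invented sample series: directions agree in 3 of 4 periods.
season_corr = direction_agreement([1, 2, 3, 2, 3], [10, 11, 10, 9, 10])

bullish = pred_strength(pred_rank=8, season_rank=6, season_corr=season_corr)
bearish = pred_strength(pred_rank=2, season_rank=-9, season_corr=season_corr)
print(season_corr, bullish, bearish)

# Extremes under the stated assumptions (pred_rank 0..10, season_rank -10..10).
print(pred_strength(10, 10, 1.0), pred_strength(0, -10, 1.0))
```

The last line shows the extremes the formula can reach when `pred_rank` is non-negative, which is worth keeping in mind when choosing symmetric thresholds.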

#### Array Structure and Ordering
- **When It's an Array**: `pred_strength` is derived from operations on arrays like `pred_rank` (which ranks predicted deltas across periods). If not reduced, it becomes an array with one value per predicted period (e.g., daily changes). The length matches the number of deltas computed (usually one less than the total prediction periods, as deltas compare consecutive values).

- **Order of Elements**:
  - The array is ordered **chronologically from past to present**.
  - **Index [0]**: Represents the **earliest (past) period** in the prediction series. This is the oldest delta/rank in the computed sequence.
  - **Last Index (e.g., [-1])**: Represents the **most recent (present or near-future) period**. This is the newest delta/rank, closest to the current time or the next predicted value.

- **Why This Order?**: The underlying deltas (e.g., price changes) are calculated sequentially from the start of the data series (historical periods) to the end (latest predictions). For example:
  - If predicting over 10 periods, the array might have 9 strength values (for 9 deltas).
  - [0] could be the strength for the change from period 1 to 2 (farthest in the past).
  - [8] would be the strength for the change from period 9 to 10 (most recent).

- **Reduction to Scalar**: In practice, if `pred_strength` is an array, the class logs it and often reduces it to the first element ([0]) for the final stored value. This [0] would thus reflect the earliest strength in the series (past-oriented). However, for prediction tasks, you might want to access the full array or the last element for the most current insight.
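
The difference between the first and last element is easy to see with an array-valued computation. The sketch below applies the `pred_strength` formula element-wise to invented rank arrays; the variable names and values are assumptions for illustration, and the reduction to `[0]` mirrors the behavior described above rather than quoting the class's code.

```python
import numpy as np

# Invented per-period ranks, ordered from earliest (index 0) to most recent (index -1).
pred_rank = np.array([3, 7, 1, 9])
season_rank = np.array([2, -4, 5, 6])   # signed, as described in the text
season_corr = 0.5

# Element-wise version of the pred_strength formula: one value per period.
strength = np.round((pred_rank + season_rank * season_corr) / 20, 4)

earliest = float(strength[0])    # what a reduction to [0] would keep
latest = float(strength[-1])     # the most recent period
print(strength, earliest, latest)
```

If the stored scalar comes from `[0]`, it describes the oldest period in the series, so code that needs the current signal should read the last element of the full array where it is available.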

#### Practical Applications and Insights
- **Enhancing Predictions**: Use `pred_strength` as a filter or multiplier in models—e.g., scale predicted prices by this score to create risk-adjusted forecasts. High absolute values (>0.4) could trigger trades, while low ones signal caution.
- **Risk Management**: In trading strategies, combine it with thresholds (e.g., only act if `|pred_strength| > 0.3`) to avoid low-confidence predictions.
- **Integration with Other Attributes**: Pair it with `pred_rankings` (e.g., check if `pred_strength` aligns with high `pred_rank`) for robust validation, or use it in ensembles with sentiment scores from the class's Groq integration.
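
A threshold filter of the kind described above might look like the following. This is a minimal sketch: `classify_signal`, the labels it returns, and the default cutoffs (0.3 for strength, 7 for rank) are illustrative assumptions taken from the example values in this document, not tuned or recommended settings.

```python
def classify_signal(pred_strength, pred_rank=None, threshold=0.3, min_rank=7):
    """Turn a strength score into a coarse label (hypothetical helper).

    Returns "hold" unless |pred_strength| exceeds `threshold` and, when a
    rank is supplied, the rank also exceeds `min_rank`.
    """
    if abs(pred_strength) <= threshold:
        return "hold"                      # too weak to act on
    if pred_rank is not None and pred_rank <= min_rank:
        return "hold"                      # strength not backed by a high rank
    return "bullish" if pred_strength > 0 else "bearish"

print(classify_signal(0.5))                # strength only
print(classify_signal(-0.35))
print(classify_signal(0.5, pred_rank=3))   # strong score, weak rank
```

Any such cutoffs should be validated by backtesting on the asset in question before being used to drive trades.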
