### Importing key modules

In [1]:
from gymnasium import Env
from gymnasium.spaces import Discrete, Box
import numpy as np


### Importing torch and checking CUDA availability

In [2]:
import torch
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Device name:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU only")

2.0.1+cu117
CUDA available: True
Device name: NVIDIA GeForce 940MX


# Introduction
Here we elaborate on the problem definition and design choices.

## Overview on Project Portfolio Management
Here we elaborate on the problem definition and design choices.

### Project Cost Performance Baseline
This is the main diagram used to evaluate and analyze the portfolio's budgeting status and performance.

<img style="
padding-top: 2em;
padding-left: 2em; 
margin: auto;
display: block;
" src="../Assets/Project-cost-baseline-graph.jpg" alt="drawing" width="400"/>

## Definition of the Problem and Scope
Here we elaborate on the problem definition and design choices.


### Limitations and future work
* Integrated risk management models to simulate portfolio-level risk correlations

# Environment design overview
The design of the environment consists of several key components.
Here is an infographic of the environment and its architecture:

<img style="
padding-top: 2em;
padding-left: 2em; 
margin: auto;
display: block;
" src="../assets/Info_graphs/environment/Mindmap - 2025.10.30-Environment.svg" alt="drawing" width="400"/>


## Project class
The blueprint for generating the projects that make up portfolios.

In [3]:
class ProjectClass:
    def __init__(self):
        pass

## The ROI (Return On Investment) model
The model for ROI
* A good academic middle ground is 10–15% ROI on BAC at completion.

    | Scenario                         | Typical ROI Margin | Interpretation                |
    | -------------------------------- | ------------------ | ----------------------------- |
    | Government or regulated industry | 5–10%              | Low-risk, cost-plus contracts |
    | Private corporate portfolios     | 10–20%             | Balanced risk-reward          |
    | High-tech / startup ventures     | 25–50%             | High risk, high volatility    |
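
As a minimal sketch of how the margins in the table could feed project generation, the snippet below draws an ROI margin from a scenario's range and converts it to an expected return on a given BAC. The uniform draw, the `ROI_RANGES` dictionary, and the `sample_roi` helper are illustrative assumptions, not part of the environment's defined API.

```python
import numpy as np

# Typical ROI margin ranges from the table above, expressed as fractions of BAC.
ROI_RANGES = {
    "government": (0.05, 0.10),
    "corporate": (0.10, 0.20),
    "high_tech": (0.25, 0.50),
}

def sample_roi(scenario, rng):
    """Draw one ROI margin uniformly from the scenario's range (assumed distribution)."""
    low, high = ROI_RANGES[scenario]
    return rng.uniform(low, high)

rng = np.random.default_rng(0)
bac = 1_000_000.0                      # hypothetical budget at completion
roi = sample_roi("corporate", rng)
expected_return = bac * roi
print(f"ROI margin: {roi:.3f}, expected return: {expected_return:,.0f}")
```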


In [None]:
class ProjectClass(ProjectClass):
    def set_roi(self):
        pass

## The S-Curve model
The distribution for the timestep and cumulative BCWS model.

Each project's Planned Value (BCWS) represents the cumulative planned cost over time, which typically follows an S-shaped curve. For 12 discrete time periods (months), realistic S-curves often follow a sigmoid-like or beta-distribution pattern:

1. Front-loaded (aggressive start): common in infrastructure or fast-track projects.
2. Balanced (classic S-curve): most corporate projects.
3. Back-loaded (late burn): R&D or innovation-heavy projects where initial effort is planning-heavy, not cost-heavy.

    | Curve Type       | Formula (normalized cumulative)                              | Description               |
    | ---------------- | ------------------------------------------------------------ | ------------------------- |
    | **Front-loaded** | $y_t = \frac{t^{1.5}}{12^{1.5}}$                             | Rapid early spending      |
    | **Balanced**     | $y_t = \frac{1}{1 + e^{-k(t - 6)}}$, normalized to end at 1  | Classic S-curve (sigmoid) |
    | **Back-loaded**  | $y_t = \frac{t^3}{12^3}$                                     | Costs pile near the end   |

You'll scale each curve so that:

$BCWS_{t} = BAC \cdot y_{t}$

Then apply a flat 4% annual inflation adjustment:

$BCWS_{t}^{adj} = BCWS_{t} \cdot 1.04$

But since inflation compounds over time, a more realistic model is:

$BCWS_{t}^{adj} = BCWS_{t} \cdot 1.04^{t/12}$

This adds ~2% extra by mid-year and ~4% by year-end.


*Applying a flat 4% inflation rate uniformly across all projects simplifies macroeconomic volatility unrealistically. Sectoral inflation varies (e.g., construction inflation may exceed 6–8%, while IT or service projects may be below 3%).*
*Sensitivity analysis for inflation rates (2–6%) will be included to evaluate the agent's robustness to macroeconomic variation.*
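
The three curve shapes and the inflation adjustment can be sketched as follows. This is an illustration under stated assumptions: the steepness `k = 0.8`, the example `bac`, and the helper names are hypothetical, and inflation is compounded pro rata as $1.04^{t/12}$, which reproduces the roughly 2% mid-year and 4% year-end uplift quoted above.

```python
import numpy as np

T = 12                      # number of monthly periods
t = np.arange(1, T + 1)     # period index 1..12

def front_loaded(t, T=T):
    # y_t = t^1.5 / T^1.5, as written in the table
    return (t / T) ** 1.5

def balanced(t, T=T, k=0.8):
    # Sigmoid centered at mid-horizon; k (steepness) is an assumed value.
    s = 1.0 / (1.0 + np.exp(-k * (t - T / 2)))
    s_end = 1.0 / (1.0 + np.exp(-k * (T - T / 2)))
    return s / s_end        # normalize so that y_T = 1

def back_loaded(t, T=T):
    # y_t = t^3 / T^3
    return (t / T) ** 3

def inflate(bcws, t, annual_rate=0.04, periods_per_year=12):
    # Compound the annual rate pro rata over the elapsed fraction of the year.
    return bcws * (1.0 + annual_rate) ** (t / periods_per_year)

bac = 1_000_000.0           # hypothetical budget at completion
curves = {"front": front_loaded(t), "balanced": balanced(t), "back": back_loaded(t)}
bcws = {name: bac * y for name, y in curves.items()}
bcws_adj = {name: inflate(v, t) for name, v in bcws.items()}

for name in curves:
    # Show the adjusted cumulative planned value at months 6 and 12.
    print(name, np.round(bcws_adj[name][[5, 11]]))
```

With the exponents as given, the power-law curves reach $(6/12)^{1.5} \approx 35\%$ and $(6/12)^{3} = 12.5\%$ of BAC at month 6, so both sit below the linear path; an exponent below 1 would be needed for cumulative spend to run ahead of it early on.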

In [None]:
class ProjectClass(ProjectClass):
    def set_s_curve(self):
        pass

## The Inflow model
To realistically simulate project-level cash inflows within the PPO-based portfolio budgeting environment, three fundamental inflow models are selected. 

1. Milestone-Based (including advance and delivery)
2. EV-Based (Progressive)
3. Lump-sum (advance or delivery payment)

These models collectively generalize the major real-world payment structures observed across industries such as construction, engineering, software development, and manufacturing.

Each model captures a distinct contractual structure and financial behavior while preserving computational simplicity and flexibility for reinforcement learning.

### The distribution for Milestones

**Dirichlet Distribution: “The fraction generator”**

You need to split 1.0 (the BAC) into several positive parts that sum exactly to 1, e.g., milestone payments like [0.15, 0.25, 0.35, 0.25].

If you just sample uniform random numbers and normalize them, the result is not uniform over the valid splits, and there is no parameter to control how uneven the shares are.
The Dirichlet distribution is the standard way to sample such fractions.

It's like saying:
“Give me k milestone shares that always sum to 1, but let me control how uneven they are.”

Formula
A Dirichlet distribution with parameters $\alpha = [\alpha_{1}, ..., \alpha_{k}]$ gives you a vector $x = [x_{1}, ..., x_{k}]$ where:

$x_{i} > 0, \quad \sum_{i=1}^{k} x_{i} = 1$

and the probability density is proportional to:

$P\left(x_{1}, ..., x_{k}\right) \propto \prod_{i=1}^{k} x_{i}^{\alpha_{i}-1}$

Control knob ($\alpha$):
* If all $\alpha_{i} = 1$ → uniform fractions (any split equally likely).
* If $\alpha_{i} < 1$ → spiky splits (one or two milestones dominate).
* If $\alpha_{i} > 1$ → smoother, more even splits.
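
To make the effect of the control knob concrete, here is a small sketch using NumPy's `Generator.dirichlet`. The number of milestones, the three concentration values, and the sample size are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
k = 4  # number of milestones (assumed)

# One draw per concentration value: each vector is positive and sums to 1.
for alpha_value in (0.3, 1.0, 10.0):
    alpha = np.full(k, alpha_value)
    shares = rng.dirichlet(alpha)
    print(alpha_value, np.round(shares, 3), round(shares.sum(), 6))

# Average size of the largest share over many draws: a rough unevenness measure.
max_share = {
    a: rng.dirichlet(np.full(k, a), size=2000).max(axis=1).mean()
    for a in (0.3, 1.0, 10.0)
}
print(max_share)
```

The average largest share shrinks as the concentration grows, which matches the "spiky" versus "even" behavior described in the bullets. Multiplying a sampled vector by the BAC gives the milestone payment amounts.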


In [None]:
class ProjectClass(ProjectClass):
    def set_inflow_model(self):
        pass

# Portfolio Generator
Each portfolio is instantiated from a set of projects.

In [None]:
class PortfolioClass:
    def __init__(self):
        pass

## Portfolio inflow composition
* Proposed inflow composition for portfolio
    | **Model**                           | **Recommended Share in Portfolio Simulation** | **Justification**                                                                                                                                                                                                          |
    | ----------------------------------- | --------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
    | **Milestone-Based**                 | **50%**                                       | Dominant in contract work; the model explicitly allows an **advance** as the first milestone (advance fraction α_adv) and retention/delays for later milestones, which captures both advance-funded and milestone-funded contracts. |
    | **EV-Based**                        | **30%**                                       | Represents performance-linked payments. Dense signal helpful for learning ROI-driven allocation policies.                                                                                                                  |
    | **Lump-Sum**                        | **20%**                                       | Simple control cases and edge scenarios (advance-only or completion-only payments).                                                                                                                                        |


*“The inflow composition (50% milestone-based, 30% EV-based, 20% lump-sum) represents a balanced abstraction derived from empirical project management literature and contracting trends across industries. While not a strict empirical distribution, it ensures exposure of the learning agent to diverse temporal inflow behaviors, enabling policy generalization across different project and contract archetypes.*

*Sensitivity analyses confirm the stability of the agent's performance under alternative inflow compositions and inflation rates.”*
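
One way to apply the proposed 50/30/20 composition is to draw each project's inflow model independently with those probabilities. The labels, the portfolio size, and the independent-draw scheme below are assumptions made for this sketch.

```python
import numpy as np

# Shares from the composition table above; the string labels are hypothetical.
INFLOW_MODELS = ["milestone", "ev_based", "lump_sum"]
INFLOW_SHARES = [0.5, 0.3, 0.2]

rng = np.random.default_rng(7)
n_projects = 10  # assumed portfolio size

# Assign one inflow model to each project in the portfolio.
assigned = rng.choice(INFLOW_MODELS, size=n_projects, p=INFLOW_SHARES)
print(assigned)

# Over many draws the empirical frequencies approach the target shares.
large = rng.choice(INFLOW_MODELS, size=10_000, p=INFLOW_SHARES)
freqs = {m: float(np.mean(large == m)) for m in INFLOW_MODELS}
print(freqs)
```

Independent draws mean a small portfolio can deviate noticeably from the target mix; a fixed allocation (for example 5/3/2 out of 10 projects, shuffled) would be an alternative if exact shares per portfolio are wanted.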


In [None]:
class PortfolioClass(PortfolioClass):
    def set_inflow_composition(self):
        pass

## The payment delay uncertainty
Payment delay uncertainty represents one of the most influential stochastic variables in project portfolio cash flow dynamics. Delays alter expected liquidity flows, distort working capital cycles, and influence the financial resilience of the entire portfolio.

Empirical research across construction, infrastructure, and multi-stakeholder IT projects confirms that payment behavior rarely follows deterministic schedules; instead, it follows distinct statistical and behavioral patterns: sometimes discrete, sometimes continuous, and often correlated with prior events.

To realistically capture such variability, we introduce five complementary stochastic modeling strategies for delay simulation:
* Geometric model: discrete, memoryless delay process (probabilistic per-period payment);
* Gaussian model: symmetric, continuous deviations (administrative uncertainty);
* Log-normal / Exponential model: long-tailed positive skew (severe payment lags);
* Markovian model: state-dependent persistence of delays (systemic behavior);
* Mixture model: hybrid ensemble of multiple distributions for diversified portfolios.

    | **Model**                          | **Mathematical Formulation**                                               | **Empirical Basis / Justification**                                                      | **Key Characteristics**                                                                        | **Use Case Examples**                                                            |
    | ---------------------------------- | -------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
    | **Geometric Delay**                | $P(D = k) = (1 - p)^{k-1} p$                                               | Models recurring invoicing cycles with independent payment probability $p$.              | Discrete-time, memoryless, easily calibrated using frequency of “on-time vs delayed” payments. | Periodic contracts with monthly billing or milestone triggers.                   |
    | **Gaussian (Normal) Delay**        | $D \sim \mathcal{N}(\mu, \sigma^2)$, truncated to $D \ge 0$                | Captures small symmetric administrative deviations around the expected date.             | Continuous, symmetric, fast convergence.                                                       | Government or institutional projects with stable payment systems.                |
    | **Log-Normal / Exponential Delay** | $D \sim \text{LogNormal}(\mu, \sigma^2)$ or $D \sim \text{Exp}(\lambda)$   | Models skewed, long-tail risk of significant payment delays.                             | Non-negative, skewed, accounts for extreme late payments.                                      | Construction, oil & gas, or infrastructure projects with complex payment chains. |
    | **Markovian Delay**                | $P_{ij} = P(D_{t} = s_{j} \mid D_{t-1} = s_{i})$                           | Captures correlated or state-dependent delay sequences (e.g., repeated client delays).   | Dynamic, stateful, captures systemic or contextual persistence.                                | Portfolios with recurring clients or interdependent contracts.                   |
    | **Mixture Model**                  | $f(D) = \sum_{i=1}^{K} w_{i} f_{i}(D \mid \theta_{i})$                     | Aggregates heterogeneous project or client populations; fits multimodal delay patterns.  | Flexible, captures diversity across project types.                                             | Cross-industry or multi-client project portfolios.                               |


Each approach offers a tradeoff between analytical simplicity, empirical fidelity, and simulation tractability, making the combination a theoretically justified and empirically sufficient composition for generalizing delay uncertainty in project inflow modeling.


* Training and experiment scenarios for payment delay modeling:

    | **Scenario**                    | **Purpose**                                                    | **Delay Model**         | **Expected Outcome / Observation**                                            |       
    | ------------------------------- | -------------------------------------------------------------- | ----------------------- | ----------------------------------------------------------------------------- |
    | **No Delay (Control)**          | Establish baseline performance with ideal cash flow.           | None                    | Benchmark: agent learns expected return patterns without uncertainty.         |         
    | **Small Random Delays**         | Simulate normal operational hiccups.                           | **Normal Distribution** | Tests robustness to mild timing noise; measures variance in returns.          |  
    | **Big Skewed Delays**           | Model long-tail risks and delayed client payments.             | **Log-Normal**          | Evaluates resilience under rare but severe delays; expected liquidity stress. | 
    | **Persistent Late Client**      | Represent behaviorally “sticky” clients with memory of delays. | **Markov Process**      | Tests policy adaptability under state-dependent delay persistence.            |
    | **Hybrid Randomness (Mixture)** | Capture mixed populations of clients in real portfolios.       | **Mixture Model**       | Evaluates generalization: can agent handle varied and correlated patterns?    |     
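
The five delay models can be sampled with NumPy roughly as follows. All parameter values, the two-state transition matrix, and the function names are illustrative assumptions; the Gaussian case is truncated by clipping at zero, which is a simple stand-in for a properly truncated normal.

```python
import numpy as np

rng = np.random.default_rng(0)

def geometric_delay(p, size):
    # Support is 1, 2, ...: P(D = k) = (1 - p)^(k-1) * p
    return rng.geometric(p, size=size)

def gaussian_delay(mu, sigma, size):
    # Clip negative draws to zero (approximation of truncation at D >= 0).
    return np.clip(rng.normal(mu, sigma, size=size), 0.0, None)

def lognormal_delay(mu, sigma, size):
    # mu and sigma are the mean and std of the underlying normal.
    return rng.lognormal(mean=mu, sigma=sigma, size=size)

def markov_delay(P, states, n_steps, start=0):
    # P[i, j] = probability of moving from delay state i to delay state j.
    idx, out = start, []
    for _ in range(n_steps):
        idx = rng.choice(len(states), p=P[idx])
        out.append(states[idx])
    return np.array(out)

def mixture_delay(weights, samplers, size):
    # Pick a component per draw, then take that component's sample.
    comp = rng.choice(len(samplers), size=size, p=weights)
    draws = np.stack([s(size) for s in samplers]).astype(float)
    return draws[comp, np.arange(size)]

n = 1000
d_geo = geometric_delay(0.6, n)
d_norm = gaussian_delay(1.0, 0.5, n)
d_logn = lognormal_delay(0.0, 0.75, n)
P = np.array([[0.8, 0.2],     # on-time client mostly stays on time
              [0.4, 0.6]])    # late client tends to stay late
d_markov = markov_delay(P, states=[0, 2], n_steps=n)
d_mix = mixture_delay(
    [0.7, 0.3],
    [lambda s: gaussian_delay(1.0, 0.5, s), lambda s: lognormal_delay(0.5, 1.0, s)],
    n,
)
print(d_geo.mean(), d_norm.mean(), d_logn.mean(), d_markov.mean(), d_mix.mean())
```

Continuous delays would still need rounding to whole timesteps before being applied in a discrete-time environment; that step is left out here.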

In [None]:
class PortfolioClass(PortfolioClass):
    def set_delayed_payment(self):
        pass

## The payment amount uncertainty
“Payment amount uncertainty is modeled using a three-component generative process: small-magnitude multiplicative measurement noise (Gaussian/lognormal) to capture routine invoice/rounding variation; stochastic holdbacks modeled via Beta-distributed withholding fractions to capture client-side disputes and ad-hoc retention; and rare heavy-tail reductions (with low probability) to represent disputes, defaults or clawbacks. The composite model mirrors empirically observed payment behavior in contracting literature and provides both dense and rare-event variability required to test policy robustness.”

The three payment-amount uncertainty strategies:
1. Multiplicative measurement noise (continuous, small perturbations)
2. Withholding / partial payment (retention & dispute) (discrete fraction withheld)
3. Stochastic partial/default events (rare, heavy-tailed reductions)
    | Strategy                                              |                                                                                                                                 Intuition / Real-world Mechanism | When it applies                                      | Effect on agent learning / decisions                                                                                      |
    | ----------------------------------------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------- |
    | **Multiplicative measurement noise**                  |                                   Small, frequent deviations in paid amounts due to rounding, accounting, petty deductions, or exchange-rate micro-fluctuations. | Every payment event (milestone, EV-pay, lump).       | Adds dense, low-amplitude stochasticity → encourages robustness to forecast noise; little structural change in liquidity. |
    | **Withholding / partial holdback**                    | Client withholds a (small) fraction pending QA, dispute resolution, or ad-hoc rejection of a claim. Differs from planned retention because it is semi-unplanned. | Milestone and EV-based payments; sometimes advances. | Systematic downscaling of expected inflows → forces the agent to maintain buffers and plan conservatively.                |
    | **Rare heavy reductions (shock / dispute / default)** |                                                      Infrequent but large payment reductions or reversals due to disputes, defaults, or contract renegotiations. | Any payment event; usually rare and high-impact.     | Tail-risk events that require robustness; policies must avoid catastrophic liquidity exposures or rely on borrowing.      |


Combining the strategies:
* You can mix them multiplicatively and sequentially:
1. Generate the scheduled raw payment $P$ (Dirichlet fractions, EV mapping, or lump value).
2. Apply multiplicative measurement noise: $P \leftarrow P \cdot (1 + \varepsilon)$.
3. Apply scheduled retention $r$ (if any) and a randomly sampled unplanned holdback $h$: $P \leftarrow (1 - r - h)\,P$.
4. With rare probability $p_{s}$, apply a shock reduction $d$: $P \leftarrow (1 - d)\,P$.
5. Clip to $[0, P_{max}]$ and shift by the delay.
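
Steps 2 to 5 above can be sketched for a single payment as shown below. The Beta parameters for the holdback and shock fractions, the default values, and the function name `realized_payment` are assumptions chosen for illustration; the delay shift in step 5 is left out because it acts on timing rather than on the amount.

```python
import numpy as np

def realized_payment(p_sched, p_max, rng, sigma=0.02, retention=0.05,
                     holdback_ab=(1.0, 30.0), p_shock=0.01, shock_ab=(2.0, 2.0)):
    """Apply noise, retention/holdback, a rare shock, and clipping to one payment."""
    # Step 2: multiplicative measurement noise, eps ~ N(0, sigma^2)
    eps = rng.normal(0.0, sigma)
    p = p_sched * (1.0 + eps)

    # Step 3: scheduled retention r plus an unplanned Beta-distributed holdback h
    h = rng.beta(*holdback_ab)
    p = (1.0 - retention - h) * p

    # Step 4: with probability p_shock, remove a Beta-distributed fraction d
    if rng.random() < p_shock:
        d = rng.beta(*shock_ab)
        p = (1.0 - d) * p

    # Step 5 (amount part): clip to [0, p_max]
    return float(np.clip(p, 0.0, p_max))

rng = np.random.default_rng(3)
payments = np.array([realized_payment(100.0, 100.0, rng) for _ in range(5000)])
print(payments.mean(), payments.min(), payments.max())
```

With these assumed defaults, the average realized payment sits below the scheduled amount mainly because of the retention and the holdback, whose mean is $1/(1+30) \approx 3\%$ for a Beta(1, 30) fraction.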



How to integrate into training & experiments:
1. Train with moderate noise + occasional holdbacks (σ=0.02, p_s=0.01).
2. Robustness tests: stress with higher σ, higher p_s, or correlated shocks across projects.
3. Ablation: train without shocks vs train with shocks; show agent robustness gap.
4. Observability variants: expose expected holdback probability or keep it latent (partial observability) and measure agent adaptability.

    | Experiment ID | σ (noise) | p_s (shock) | r (retention) | Description                         |
    | ------------: | --------: | ----------: | ------------: | ----------------------------------- |
    |          Base |      0.02 |        0.01 |          0.05 | Default training regime             |
    |      No-noise |      0.00 |        0.01 |          0.05 | Ablation: remove measurement noise  |
    |    High-shock |      0.02 |        0.05 |          0.05 | Stress test with frequent shocks    |
    |   No-holdback |      0.02 |        0.01 |          0.00 | Test effect of unplanned holdbacks  |
    |  Stressed-mix |      0.04 |        0.03 |          0.10 | Harsh regime for resilience testing |
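
For running these regimes programmatically, the table can be mirrored in a small configuration mapping. The `AmountNoiseConfig` dataclass and the `EXPERIMENTS` dictionary are hypothetical names introduced for this sketch; the values are copied from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AmountNoiseConfig:
    sigma: float      # std of multiplicative measurement noise
    p_shock: float    # per-payment probability of a rare shock
    retention: float  # retention fraction r

# Values taken directly from the experiment table above.
EXPERIMENTS = {
    "Base": AmountNoiseConfig(sigma=0.02, p_shock=0.01, retention=0.05),
    "No-noise": AmountNoiseConfig(sigma=0.00, p_shock=0.01, retention=0.05),
    "High-shock": AmountNoiseConfig(sigma=0.02, p_shock=0.05, retention=0.05),
    "No-holdback": AmountNoiseConfig(sigma=0.02, p_shock=0.01, retention=0.00),
    "Stressed-mix": AmountNoiseConfig(sigma=0.04, p_shock=0.03, retention=0.10),
}

for name, cfg in EXPERIMENTS.items():
    print(f"{name:>12}: sigma={cfg.sigma:.2f}, p_s={cfg.p_shock:.2f}, r={cfg.retention:.2f}")
```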



#### Limitations (future work)
We do not exhaustively model all types of inter-project dependencies (e.g., copulas or client-cluster Markov chains). Those are left for future work. Our choice balances experimental clarity and realism: the latent-factor and shock tests capture first-order correlated behaviors relevant to portfolio budgeting while keeping the study focused.

In [None]:
class PortfolioClass(PortfolioClass):
    def set_payment_amount(self):
        pass

## Shared uncertainties and project correlations

Milestone-based payment models in this study explicitly include the common hybrid ‘advance + milestone + retention’ structure. We model the advance as the first milestone (t=0) with advance fraction α_adv (default 15%). Subsequent milestone fractions are sampled from a Dirichlet distribution and scaled to satisfy Σ payments + retention = BAC. Payments are subject to geometric delays and multiplicative noise. This parametric approach allows the environment to represent advance-funded, milestone-funded, and hybrid contracts with a single, transparent generator; sensitivity analyses over α_adv and retention r are presented to demonstrate robustness.


Primary experiments assume independent payment timing and amount uncertainty across projects to focus on budget allocation behavior. To test robustness, we include a single correlated stress scenario: with probability $p_{global}$ (0.01 per episode) a global shock simultaneously increases expected payment delays and reduces received payment amounts for all projects for $s$ timesteps. 

We intentionally prioritize clarity and the core budgeting problem. Full correlation modeling is orthogonal to the main contribution and would significantly enlarge the scope. We validate the agent's robustness using a minimal, interpretable correlated scenario (global shock or single latent factor) and leave a systematic study of copula-based and multi-factor dependencies for future work.

The global shock models systemic liquidity disruptions (e.g., macro slowdown) and demonstrates policy resilience under correlated risk. Sensitivity to shock probability and severity is reported in Section X111.

To probe the agent's robustness to correlated cashflow risk, we implement a simple latent-factor model and a synchronous shock layer. The latent factor $Z_t$ (written $F$ in the formulas below) perturbs per-project payment parameters (amount/delay) with small loadings; this introduces controllable pairwise correlation without over-parameterizing the environment. Additionally, infrequent global shocks (probability $p_{global}$) synchronously increase delays and reduce payment amounts to simulate systemic stress. We vary factor strength and shock probability in sensitivity tests.

Latent-Factor Correlation:

* Introduce a shared underlying factor $F$ that affects all projects proportionally.
* Simulates common macroeconomic or market trends.
* Formula for project variable $X_{i}$:
    * $ X_{i}\ =\ \mu_{i}\ +\ \beta_{i}\ F\ +\ \epsilon_{i} $
    * $\mu_{i}$: mean value
    * $\beta_{i}$: sensitivity to latent factor
    * $\epsilon_{i}$: independent project noise
* Purpose: agent learns to adapt to portfolio-level systemic trends, not just isolated randomness.

Global Shock:
* Rare, extreme event affecting all projects simultaneously.
* Formula addition:
    * $ X_{i}\ =\ \mu_{i}\ +\ \beta_{i}\ F\ +\ \gamma_{i}S\ +\ \epsilon_{i} $
    * $S$: global shock variable (probabilistic occurrence)
    * $\gamma_{i}$: sensitivity of each project to the shock
* Examples: financial crisis, widespread client defaults, regulatory change.
* Purpose: stress-test agent and evaluate robustness under extreme scenarios.
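
A minimal simulation of the combined formula $X_{i} = \mu_{i} + \beta_{i} F + \gamma_{i} S + \epsilon_{i}$ is sketched below. The loadings, the noise scale, the standard-normal factor, and the Bernoulli shock indicator are assumptions made for this illustration, with $X_i$ read as a generic per-project payment parameter.

```python
import numpy as np

rng = np.random.default_rng(1)
n_projects, n_episodes = 5, 20_000

mu = np.full(n_projects, 1.0)       # per-project mean (assumed)
beta = np.full(n_projects, 0.5)     # sensitivity to the latent factor (assumed)
gamma = np.full(n_projects, -0.3)   # sensitivity to the global shock (assumed)
p_global = 0.01                     # shock probability per episode
noise_sd = 1.0                      # std of the independent project noise (assumed)

# One shared factor and one shock indicator per episode, broadcast over projects.
F = rng.normal(0.0, 1.0, size=(n_episodes, 1))
S = (rng.random((n_episodes, 1)) < p_global).astype(float)
eps = rng.normal(0.0, noise_sd, size=(n_episodes, n_projects))

X = mu + beta * F + gamma * S + eps

# Pairwise correlation between projects induced by the shared terms.
corr = np.corrcoef(X, rowvar=False)
off_diag = corr[~np.eye(n_projects, dtype=bool)]
print(np.round(off_diag.mean(), 3), S.mean())
```

Ignoring the rare shock, the pairwise correlation under these assumptions is about $\beta^2 / (\beta^2 + \sigma_\epsilon^2) = 0.25 / 1.25 = 0.2$, so the loading $\beta_i$ acts as a direct dial for how strongly projects move together.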

In [None]:
class PortfolioClass(PortfolioClass):
    def set_shared_uncertainty(self):
        pass