# Temporary Impact Analysis and Optimal Execution Algorithm


This notebook aims to:

- Analyze the temporary impact model $g_t(x)$ using real data for three tickers (CRWV, FROG, SOUN), comparing linear and nonlinear models.
- Build a mathematical framework and an approximate algorithm for optimal execution such that $\sum_i x_i = S$.


You will find in this notebook:

1. Data import and preparation.
2. Data exploration and visual analysis.
3. Fitting mathematical models and comparing results.
4. Mathematical explanation of the execution algorithm.
5. Practical code for the optimal execution algorithm.


> **Note:** The original data files must be available at the paths specified in the loading cell (`data/<TICKER>/<TICKER>.zip`).

In [None]:
# Install matplotlib if it is not already installed
%pip install matplotlib

# Import the core libraries for analysis and plotting
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from zipfile import ZipFile
import os

# Configure plots to be clearer and more attractive
sns.set(style='whitegrid')
plt.rcParams['figure.figsize'] = (10, 5)

In [None]:
# Load real data from zip files for each ticker with error handling
ticker_paths = {
    'CRWV': 'data/CRWV/CRWV.zip',
    'FROG': 'data/FROG/FROG.zip',
    'SOUN': 'data/SOUN/SOUN.zip',
}

ticker_dfs = {}

for ticker, zip_path in ticker_paths.items():
    print(f'\n--- Processing data for {ticker} ---')
    if not os.path.exists(zip_path):
        print(f'❌ File not found: {zip_path}')
        continue
    try:
        extract_dir = f'data/{ticker}/extracted'
        with ZipFile(zip_path, 'r') as zip_ref:
            zip_ref.extractall(extract_dir)
        # Sort so the same CSV is picked on every run
        csv_files = sorted(f for f in os.listdir(extract_dir) if f.endswith('.csv'))
        if not csv_files:
            print(f'❌ No CSV file inside {zip_path}')
            continue
        try:
            df = pd.read_csv(os.path.join(extract_dir, csv_files[0]))
        except Exception as e:
            print(f'❌ Error reading CSV: {e}')
            continue
        # The later cells rely on these two columns
        required_cols = {'price', 'volume'}
        if not required_cols.issubset(df.columns):
            print(f'❌ File {csv_files[0]} does not contain required columns: {required_cols - set(df.columns)}')
            continue
        ticker_dfs[ticker] = df
        print(f'✅ Data loaded ({len(df)} rows)')
        display(df.head(5))
    except Exception as e:
        print(f'❌ Error processing {ticker}: {e}')

if not ticker_dfs:
    print('⚠️ No data loaded. Make sure files exist at the correct paths and in the required format.')

In [None]:
# Data exploration and visual analysis
for ticker, df in ticker_dfs.items():
    print(f'\n--- Analysis for {ticker} ---')
    display(df.describe().T)
    if 'price' in df.columns and 'volume' in df.columns:
        fig, axes = plt.subplots(1, 2, figsize=(14, 4))
        sns.lineplot(ax=axes[0], data=df['price'], color='royalblue')
        axes[0].set_title(f'{ticker} Price')
        axes[0].set_xlabel('Time')
        axes[0].set_ylabel('Price')
        sns.lineplot(ax=axes[1], data=df['volume'], color='darkorange')
        axes[1].set_title(f'{ticker} Trading Volume')
        axes[1].set_xlabel('Time')
        axes[1].set_ylabel('Volume')
        plt.suptitle(f'Visual Analysis for {ticker}', fontsize=14)
        plt.tight_layout()
        plt.show()


## Modeling the Temporary Impact $g_t(x)$

The temporary market impact function $g_t(x)$ models the immediate price change caused by trading $x$ shares at time $t$. A common approach is to assume a linear relationship: $g_t(x) \approx \beta_t x$. However, this can be a gross oversimplification, as real market impact often exhibits nonlinearities due to liquidity, order book depth, and market microstructure effects.

### Linear Model
- **Form:** $g_t(x) = \beta_t x$
- **Pros:** Simple, interpretable, easy to estimate.
- **Cons:** May not capture diminishing/increasing returns to scale, ignores nonlinear liquidity effects.

### Nonlinear Model
- **Form:** $g_t(x) = \alpha_t x^p$ (with $p \neq 1$)
- **Pros:** Can capture concave/convex impact, more flexible.
- **Cons:** More complex to estimate, risk of overfitting with limited data.

We will fit both models to the data and compare their fit.
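Before running a full nonlinear fit, the exponent $p$ can be estimated with an ordinary least-squares line on log-log data, since $\log g = \log \alpha + p \log x$. The sketch below is only illustrative: the helper `fit_power_law_loglog` and the synthetic inputs are assumptions introduced here, not part of the notebook's data pipeline, and the approach only applies to rows where both the trade size and the impact are strictly positive.

```python
import numpy as np

def fit_power_law_loglog(x, y):
    """Estimate (alpha, p) in y ~ alpha * x**p via a straight-line fit in log-log space.

    Hypothetical helper: rows with x <= 0 or y <= 0 are dropped because
    the logarithm is undefined there.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = (x > 0) & (y > 0)
    # Slope of the log-log line is p; intercept is log(alpha)
    slope, intercept = np.polyfit(np.log(x[mask]), np.log(y[mask]), 1)
    return np.exp(intercept), slope

# Synthetic, noise-free check (illustrative values, not market data)
x_demo = np.linspace(100, 10_000, 50)
y_demo = 0.02 * x_demo ** 0.6
alpha_hat, p_hat = fit_power_law_loglog(x_demo, y_demo)
print(f"alpha estimate: {alpha_hat:.4f}, p estimate: {p_hat:.4f}")
```

An estimate of $p$ close to 1 would suggest the linear model is adequate; a value well below 1 points to concave impact.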

In [None]:
%pip install scipy

# Fit linear and nonlinear models for temporary impact g_t(x)
from scipy.optimize import curve_fit

def linear_model(x, beta):
    return beta * x

def nonlinear_model(x, alpha, p):
    return alpha * np.power(x, p)

fit_results = {}

for ticker, df in ticker_dfs.items():
    if 'price' in df.columns and 'volume' in df.columns:
        print(f'\n--- Model fitting for {ticker} ---')
        # Impact proxy: price change between consecutive rows.
        # The first row has no previous price, so it is dropped rather than set to 0.
        df = df.copy()
        df['impact'] = df['price'].diff()
        df = df.dropna(subset=['impact', 'volume'])
        x = df['volume'].to_numpy(dtype=float)
        y = df['impact'].to_numpy(dtype=float)
        # Fit linear model
        try:
            popt_lin, _ = curve_fit(linear_model, x, y)
        except Exception as e:
            print(f'Error in linear model: {e}')
            continue
        # Fit nonlinear model (alpha >= 0, 0 <= p <= 2)
        try:
            popt_nonlin, _ = curve_fit(nonlinear_model, x, y, bounds=([0, 0], [np.inf, 2]))
        except Exception as e:
            print(f'Error in nonlinear model: {e}')
            continue
        fit_results[ticker] = {'linear': popt_lin, 'nonlinear': popt_nonlin}
        # Plot the data and both fitted curves
        plt.figure(figsize=(7, 4))
        plt.scatter(x, y, alpha=0.3, label='Data')
        x_fit = np.linspace(x.min(), x.max(), 100)
        plt.plot(x_fit, linear_model(x_fit, *popt_lin), label='Linear Model', color='royalblue')
        plt.plot(x_fit, nonlinear_model(x_fit, *popt_nonlin), label='Nonlinear Model', color='darkorange')
        plt.title(f'Temporary Impact Fitting for {ticker}')
        plt.xlabel('Volume')
        plt.ylabel('Price Change')
        plt.legend()
        plt.show()
        print(f'Linear model coefficient (beta): {popt_lin[0]:.4g}')
        print(f'Nonlinear model coefficients (alpha, p): {popt_nonlin[0]:.4g}, {popt_nonlin[1]:.4g}')

### Model Discussion and Results



The linear model $g_t(x) = \beta_t x$ is simple and easy to interpret, but it often does not accurately reflect reality in illiquid markets or when executing large orders. The nonlinear model $g_t(x) = \alpha_t x^p$ provides greater flexibility and better captures market behavior when there is nonlinearity in the impact.


**Conclusion:**

- Use the nonlinear model if the data shows that $p$ is clearly different from 1.
- If the results are similar, the linear model can be used for its simplicity.


> **Recommendation:** Always monitor the quality of fit and interpret results based on market characteristics and available data.
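One way to make "clearly different from 1" and "quality of fit" concrete is to compare $R^2$ for both models and to look at the standard error of $p$ from the covariance matrix that `curve_fit` returns. The following is a rough sketch on synthetic data: the `r_squared` helper, the noise level, the starting guess, and the two-standard-error rule of thumb are all assumptions made for illustration, not results from the three tickers.

```python
import numpy as np
from scipy.optimize import curve_fit

def linear_model(x, beta):
    return beta * x

def nonlinear_model(x, alpha, p):
    return alpha * np.power(x, p)

def r_squared(y, y_hat):
    """Coefficient of determination (hypothetical helper for this sketch)."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic data with a known exponent of 0.6 (illustrative only)
rng = np.random.default_rng(0)
x_demo = np.linspace(1.0, 100.0, 200)
y_demo = 0.5 * x_demo ** 0.6 + rng.normal(0.0, 0.05, x_demo.size)

popt_lin, _ = curve_fit(linear_model, x_demo, y_demo)
popt_non, pcov_non = curve_fit(
    nonlinear_model, x_demo, y_demo,
    p0=[1.0, 0.5],                      # assumed starting guess
    bounds=([0, 0], [np.inf, 2]),
)

p_hat = popt_non[1]
p_se = np.sqrt(np.diag(pcov_non))[1]    # standard error of p
r2_lin = r_squared(y_demo, linear_model(x_demo, *popt_lin))
r2_non = r_squared(y_demo, nonlinear_model(x_demo, *popt_non))

# Rule of thumb (an assumption): p differs from 1 if it is more than 2 SE away
p_differs_from_one = abs(p_hat - 1.0) > 2.0 * p_se

print(f"p = {p_hat:.3f} +/- {p_se:.3f}")
print(f"R^2 linear = {r2_lin:.3f}, R^2 nonlinear = {r2_non:.3f}")
print("Prefer nonlinear model:", bool(p_differs_from_one))
```

Because the nonlinear model contains the linear one as the special case $p = 1$, its in-sample $R^2$ can never be lower at the optimum; the standard error on $p$ is what indicates whether the extra parameter is justified.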

### Mathematical Framework and Optimal Execution Algorithm



**Objective:** Execute a total quantity $S$ of shares over $N$ periods, trading $x_i$ shares in period $i$ so that $\sum_{i=1}^N x_i = S$, while minimizing the total cost including temporary impact.


**Objective function:**


$$
\min_{x_1,\dots,x_N} \sum_{i=1}^N \left[ P_{t_i} x_i + g_i(x_i) \right]
$$


**Constraints:**

- $\sum_{i=1}^N x_i = S$

- $x_i \geq 0$


**Solution steps:**

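1. If $g_i(x)$ is linear, the problem can be solved in closed form (in the Almgren-Chriss framework, a linear per-share impact yields a quadratic cost with an analytical optimal schedule).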

2. If nonlinear, use numerical algorithms such as `scipy.optimize.minimize`.

3. Check that constraints are satisfied after solving.


> **Note:** The objective function or constraints can be modified according to market requirements or risk preferences.
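
The first-order conditions behind the solution steps can be sketched as follows. This derivation assumes each $g_i$ is differentiable and convex and that the non-negativity bounds are not binding (an interior solution); the multiplier $\lambda$ is notation introduced here.

$$
\mathcal{L}(x, \lambda) = \sum_{i=1}^N \left[ P_{t_i} x_i + g_i(x_i) \right] - \lambda \left( \sum_{i=1}^N x_i - S \right)
$$

$$
\frac{\partial \mathcal{L}}{\partial x_i} = P_{t_i} + g_i'(x_i) - \lambda = 0
\quad \Longrightarrow \quad
g_i'(x_i) = \lambda - P_{t_i}
$$

For the power-law form $g_i(x) = \alpha_i x^p$ with $p > 1$, this gives

$$
x_i = \left( \frac{\lambda - P_{t_i}}{\alpha_i \, p} \right)^{\frac{1}{p-1}},
\qquad \text{with } \lambda \text{ chosen so that } \sum_{i=1}^N x_i = S .
$$

When prices and coefficients are the same in every period, all $x_i$ are equal and the schedule is the even split $x_i = S/N$. For $p < 1$ the impact term is concave, so these conditions identify a stationary point rather than a minimum, and the minimum lies on the boundary of the feasible set.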

In [None]:
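# Example: Optimal Execution with Nonlinear Impact
import numpy as np
from scipy.optimize import minimize

S = 10000     # total shares
N = 10        # intervals
alpha = 0.01  # impact coefficient
p = 0.8       # impact exponent
P0 = 100      # price, held constant across intervals

def cost(x):
    # Clip at zero so the fractional power never sees a negative quantity
    x = np.clip(np.asarray(x, dtype=float), 0.0, None)
    return np.sum(P0 * x + alpha * np.power(x, p))

# Constraint: sum(x) = S
cons = [{'type': 'eq', 'fun': lambda x: np.sum(x) - S}]
# Bounds: x_i >= 0
bounds = [(0, None)] * N
# Start from an equal split. Note: with p < 1 the impact term is concave and the
# equal split is a stationary point, so a local solver may simply return x0.
x0 = np.ones(N) * (S / N)

res = minimize(cost, x0, method='SLSQP', bounds=bounds, constraints=cons)
print('Converged:', res.success)
print('Optimal x:', res.x)
print('Sum of x:', res.x.sum())  # should equal S
print('Total cost:', res.fun)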
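
As a complementary sketch, the same setup can be run with a convex impact term ($p > 1$) and period-specific coefficients, where the numerical solution can be checked against the stationarity condition $x_i \propto \alpha_i^{-1/(p-1)}$ and against the constraints (step 3 of the solution steps). The values of `alphas` and `p`, and the analytic gradient `cost_grad`, are assumptions chosen for illustration only.

```python
import numpy as np
from scipy.optimize import minimize

S = 10_000                            # total shares
N = 10                                # intervals
p = 1.5                               # convex impact exponent (assumed)
P0 = 100.0                            # constant price (assumed)
alphas = np.linspace(0.01, 0.02, N)   # hypothetical per-period impact coefficients

def cost(x):
    x = np.clip(x, 0.0, None)
    return np.sum(P0 * x + alphas * x ** p)

def cost_grad(x):
    # Analytic gradient of the cost: P0 + alpha_i * p * x_i**(p - 1)
    x = np.clip(x, 0.0, None)
    return P0 + alphas * p * x ** (p - 1.0)

constraints = [{'type': 'eq', 'fun': lambda x: np.sum(x) - S}]
bounds = [(0.0, None)] * N
x0 = np.full(N, S / N)

res = minimize(cost, x0, jac=cost_grad, method='SLSQP',
               bounds=bounds, constraints=constraints)

# Reference schedule from the first-order conditions: x_i proportional to alpha_i^(-1/(p-1))
weights = alphas ** (-1.0 / (p - 1.0))
x_closed = S * weights / weights.sum()

# Constraint checks after solving
print('Sum of x:', res.x.sum())
print('Min of x:', res.x.min())
print('Numerical cost:', res.fun)
print('Reference cost:', cost(x_closed))
```

Periods with a smaller impact coefficient receive a larger share of the order, which is the behaviour the first-order conditions predict for a convex cost.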