# Introduction to Plotly

In [1]:

import plotly.express as px
import numpy as np

# Scatter Plot Example
x = np.random.randn(100)
y = np.random.randn(100)
fig = px.scatter(x=x, y=y, title="Scatter Plot: Relationship between X and Y", labels={'x': 'X values', 'y': 'Y values'})
fig.show()

# Bar Chart Example
categories = ['Category A', 'Category B', 'Category C', 'Category D']
values = [23, 17, 35, 29]
fig = px.bar(x=categories, y=values, title="Bar Chart: Values by Category", labels={'x': 'Categories', 'y': 'Values'})
fig.show()

# Histogram Example
data = np.random.randn(500)
fig = px.histogram(data, title="Histogram: Data Distribution", labels={'value': 'Values'})
fig.show()

# 3D Scatter Plot Example
x = np.random.randn(100)
y = np.random.randn(100)
z = np.random.randn(100)
fig = px.scatter_3d(x=x, y=y, z=z, title="3D Scatter Plot: X, Y, Z values")
fig.show()

# Surface Plot Example
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
x, y = np.meshgrid(x, y)
z = np.sin(np.sqrt(x**2 + y**2))
fig = px.imshow(z, title="Surface Plot: Sin Function")
fig.show()



# Statistical Arbitrage and Cointegration Testing

Statistical arbitrage is a market-neutral strategy that seeks to take advantage of price divergences between related assets. The key idea is that certain pairs of assets may have a stable, long-term relationship, even though they may deviate from this relationship over short periods.

The concept of **cointegration** helps identify such pairs of assets. Cointegration refers to a statistical property where two or more time series move together in the long run despite short-term fluctuations. When two assets are cointegrated, their prices have a long-term equilibrium, meaning the spread between their prices reverts to a mean over time.

In this section, we'll fetch price data for two stocks, test for cointegration using the **Engle-Granger cointegration test**, and generate trading signals based on deviations from the equilibrium (the spread).

### Cointegration Test Explanation:
The **Engle-Granger test** is used to test if two or more time series are cointegrated. It outputs:
- **Test Statistic**: The result of the test.
- **Critical Values**: Threshold values at different confidence levels (1%, 5%, 10%).
- **P-Value**: The probability that the time series are not cointegrated.

The test is interpreted as:
- If the **P-value** is less than the significance level (alpha = 0.05), we reject the null hypothesis and conclude that the assets are cointegrated.
- If the **P-value** is greater than 0.05, the assets are not cointegrated, meaning they don't share a long-term equilibrium.

This strategy typically generates long/short trading signals based on deviations from the mean spread between two cointegrated assets.


In [2]:

# Statistical Arbitrage with Performance Metrics and Market Neutral Portfolio Positions

import yfinance as yf
import pandas as pd
import numpy as np
import plotly.graph_objs as go
import statsmodels.api as sm

# Fetch stock data
tickers = ['AAPL', 'MSFT']
data = yf.download(tickers, start='2020-01-01', end='2023-01-01')['Adj Close']

# Calculate the log prices
log_prices = np.log(data)

# Perform Engle-Granger cointegration test
coint_test = sm.tsa.coint(log_prices[tickers[0]], log_prices[tickers[1]])

# Cointegration test result interpretation
test_statistic, p_value, critical_values = coint_test
print(f"Test Statistic: {test_statistic}")
print(f"P-value: {p_value}")
print(f"Critical Values: {critical_values}")

# Check if the assets are cointegrated at the 0.05 level (alpha)
if p_value < 0.05:
    print(f"The assets are cointegrated with a p-value of {p_value:.5f}.")
else:
    print(f"The assets are not cointegrated with a p-value of {p_value:.5f}.")

# Calculate the spread between the two assets
spread = log_prices[tickers[0]] - log_prices[tickers[1]]

# Generate trading signals based on spread deviations from the mean
mean_spread = spread.mean()
std_spread = spread.std()
long_signal = spread < (mean_spread - 1.5 * std_spread)
short_signal = spread > (mean_spread + 1.5 * std_spread)

# Create positions based on the signals
positions = pd.DataFrame(index=spread.index)
positions[tickers[0]] = 0
positions[tickers[1]] = 0
positions[tickers[0]][long_signal] = 1  # Long on stock 1
positions[tickers[1]][long_signal] = -1  # Short on stock 2
positions[tickers[0]][short_signal] = -1  # Short on stock 1
positions[tickers[1]][short_signal] = 1  # Long on stock 2

# Plot market-neutral portfolio positions using Plotly
fig = go.Figure()

# Add positions for stock 1 and stock 2
fig.add_trace(go.Scatter(x=positions.index, y=positions[tickers[0]], mode='lines', name=f'{tickers[0]} Position'))
fig.add_trace(go.Scatter(x=positions.index, y=positions[tickers[1]], mode='lines', name=f'{tickers[1]} Position'))

# Customize layout
fig.update_layout(title="Market Neutral Portfolio Positions", xaxis_title="Date", yaxis_title="Position")
fig.show()

# Calculate portfolio returns based on the spread strategy
log_returns = log_prices.diff()
portfolio_returns = (positions.shift(1) * log_returns).sum(axis=1)

# Plot the spread and trading signals using Plotly
fig = go.Figure()

# Plot the spread
fig.add_trace(go.Scatter(x=spread.index, y=spread, mode='lines', name='Spread'))

# Plot long and short signals
fig.add_trace(go.Scatter(x=spread[long_signal].index, y=spread[long_signal], mode='markers', marker=dict(color='green', size=8), name='Long Signal'))
fig.add_trace(go.Scatter(x=spread[short_signal].index, y=spread[short_signal], mode='markers', marker=dict(color='red', size=8), name='Short Signal'))

# Add mean line
fig.add_trace(go.Scatter(x=spread.index, y=[mean_spread] * len(spread), mode='lines', line=dict(dash='dash', color='red'), name='Mean Spread'))

# Customize layout
fig.update_layout(title=f'Spread Between {tickers[0]} and {tickers[1]} with Trading Signals', xaxis_title='Date', yaxis_title='Spread')
fig.show()

# 1. Calculate and plot cumulative returns
cumulative_returns = portfolio_returns.cumsum()
fig = go.Figure()
fig.add_trace(go.Scatter(x=cumulative_returns.index, y=cumulative_returns, mode='lines', name='Cumulative Returns'))
fig.update_layout(title="Cumulative Returns of Statistical Arbitrage Strategy", xaxis_title="Date", yaxis_title="Cumulative Return")
fig.show()

# 2. Calculate key performance metrics: Total Return, Annualized Return, Volatility, Sharpe Ratio, Omega Ratio

# Total Return
total_return = cumulative_returns[-1]
print(f"Total Return: {total_return:.2%}")

# Annualized Return
annualized_return = portfolio_returns.mean() * 252
print(f"Annualized Return: {annualized_return:.2%}")

# Annualized Volatility
annualized_volatility = portfolio_returns.std() * np.sqrt(252)
print(f"Annualized Volatility: {annualized_volatility:.2%}")

# Sharpe Ratio
sharpe_ratio = annualized_return / annualized_volatility
print(f"Sharpe Ratio: {sharpe_ratio:.2f}")

# Omega Ratio (with threshold set to 0 for break-even)
threshold = 0
excess_returns = portfolio_returns - threshold
positive_returns = excess_returns[excess_returns > 0].sum()
negative_returns = -excess_returns[excess_returns < 0].sum()
omega_ratio_value = positive_returns / negative_returns
print(f"Omega Ratio: {omega_ratio_value:.2f}")


[*********************100%%**********************]  2 of 2 completed

Test Statistic: -1.9306557888392772
P-value: 0.5643817341825061
Critical Values: [-3.91100464 -3.34423482 -3.05007226]
The assets are not cointegrated with a p-value of 0.56438.




ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the defau

Total Return: 25.18%
Annualized Return: 8.39%
Annualized Volatility: 8.33%
Sharpe Ratio: 1.01
Omega Ratio: 1.59



Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`




# Portfolio Optimization Using the Sharpe and Omega Ratios

In portfolio optimization, we aim to balance the trade-off between return and risk by selecting optimal asset weights. Two common performance metrics used are:

1. **Sharpe Ratio**: This measures the performance of a portfolio relative to its risk. It is the ratio of excess return (over the risk-free rate) to the portfolio's standard deviation (volatility). The higher the Sharpe ratio, the better the risk-adjusted return.

2. **Omega Ratio**: This is a more flexible performance measure that compares the probability of gains versus the probability of losses. The Omega ratio is particularly useful because it considers all moments of the return distribution (not just mean and variance) and provides insights into both tail risks and higher returns.

### Optimization Process:
- **Mean-Variance Optimization**: We'll optimize the portfolio using the Sharpe ratio first, which aims to maximize the risk-adjusted returns.
- **Omega Ratio Optimization**: Next, we'll optimize using the Omega ratio, which emphasizes the overall shape of the return distribution and provides a more holistic measure of performance.


In [3]:

# Portfolio Optimization with Sharpe Ratio and Manual Omega Ratio Calculation

import yfinance as yf
import pandas as pd
import numpy as np
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt import risk_models, expected_returns
import plotly.graph_objs as go

# Fetch stock data for a couple of selected stocks
tickers = ['AAPL', 'MSFT']
data = yf.download(tickers, start='2020-01-01', end='2023-01-01')['Adj Close']

# Calculate daily returns
returns = data.pct_change().dropna()

# Apply PyPortfolioOpt to calculate expected returns and sample covariance matrix
mu = expected_returns.mean_historical_return(data)
S = risk_models.sample_cov(data)

# Perform mean-variance optimization to maximize the Sharpe ratio
ef_sharpe = EfficientFrontier(mu, S)
weights_sharpe = ef_sharpe.max_sharpe()
cleaned_weights_sharpe = ef_sharpe.clean_weights()

# Display the optimized portfolio weights for Sharpe ratio
fig = go.Figure(data=[go.Bar(x=list(cleaned_weights_sharpe.keys()), y=list(cleaned_weights_sharpe.values()))])
fig.update_layout(title="Optimized Portfolio Weights (Max Sharpe Ratio)", xaxis_title="Assets", yaxis_title="Weight")
fig.show()

# Calculate portfolio returns using the optimized weights
portfolio_returns = returns.dot(pd.Series(cleaned_weights_sharpe))

# Define a threshold for the Omega ratio (for example, zero for break-even)
threshold = 0

# Calculate the Omega ratio
excess_returns = portfolio_returns - threshold
positive_returns = excess_returns[excess_returns > 0].sum()
negative_returns = -excess_returns[excess_returns < 0].sum()

omega_ratio_value = positive_returns / negative_returns
print(f"Omega Ratio: {omega_ratio_value:.2f}")

# Plot cumulative returns of the optimized portfolio
cumulative_returns = np.cumsum(portfolio_returns)
fig = go.Figure()
fig.add_trace(go.Scatter(x=portfolio_returns.index, y=cumulative_returns, mode='lines', name='Cumulative Returns'))
fig.update_layout(title="Cumulative Returns of Optimized Portfolio", xaxis_title="Date", yaxis_title="Cumulative Return")
fig.show()


[*********************100%%**********************]  2 of 2 completed


Omega Ratio: 1.13



# Plotly Exercises

Now that you have been introduced to basic visualizations using Plotly, try the following exercises to practice:

1. **Line Plot**: Create a line plot using Plotly to show the trend of a stock's adjusted close prices over time. Use any stock from the list in the previous sections.
2. **Pie Chart**: Create a pie chart that represents the percentage of stock portfolio allocations for 5 stocks.
3. **Bubble Chart**: Create a bubble chart using randomly generated data where the size of each bubble represents the value of a third variable.
4. **Heatmap**: Create a heatmap showing the correlation between the adjusted close prices of all stocks in the dataset.

### Example - Line Plot:
Here is an example of how to create a simple line plot to visualize stock price trends.

```python
import plotly.express as px
import yfinance as yf

# Fetch data for Apple stock
data = yf.download('AAPL', start='2020-01-01', end='2023-01-01')['Adj Close']

# Create line plot
fig = px.line(data, title='AAPL Stock Price Over Time')
fig.show()
```

Use the example above to help you with Exercise 1.



# Portfolio Optimization Exercises

1. **Different Objective**: In the current portfolio optimization, we used the Sharpe ratio as the objective. Try modifying the objective to minimize the volatility of the portfolio instead.
2. **Vary the Assets**: Modify the list of stocks used for portfolio optimization. Try adding more stocks to the portfolio, and compare the optimized portfolio's weights for different sets of stocks.
3. **Simulate Different Time Periods**: Change the start and end dates for the stock data to optimize portfolios for different market conditions (e.g., during a bull market or bear market). Analyze how the portfolio weights differ across time periods.

### Example - Minimizing Volatility:
Here is an example of how you can modify the portfolio optimization to minimize the volatility of the portfolio:

```python
from pypfopt.efficient_frontier import EfficientFrontier

# Calculate expected returns and sample covariance matrix
mu = expected_returns.mean_historical_return(data)
S = risk_models.sample_cov(data)

# Optimize for minimum volatility
ef_volatility = EfficientFrontier(mu, S)
weights_min_vol = ef_volatility.min_volatility()
cleaned_weights_min_vol = ef_volatility.clean_weights()

# Display the optimized portfolio weights for minimum volatility
fig = go.Figure(data=[go.Bar(x=list(cleaned_weights_min_vol.keys()), y=list(cleaned_weights_min_vol.values()))])
fig.update_layout(title="Optimized Portfolio Weights (Minimum Volatility)", xaxis_title="Assets", yaxis_title="Weight")
fig.show()
```

Try this example to solve Exercise 1.



# Statistical Arbitrage Exercises

1. **Test Different Stock Pairs**: In the example, we tested the cointegration of two specific stocks. Try selecting a different pair of stocks and re-run the cointegration test. Are the new stocks cointegrated? How does the spread behave?
2. **Adjust the Mean and Standard Deviation Thresholds**: In the example, we used 1.5 times the standard deviation to generate trading signals. Try modifying this threshold (e.g., 1 standard deviation or 2 standard deviations) and observe how the signals and portfolio performance change.
3. **Different Time Periods**: Try running the statistical arbitrage strategy on a different time period (e.g., during the COVID-19 market crash). How does the strategy perform during different periods?

### Example - Testing a New Stock Pair:
Here is an example of how you can test a different pair of stocks for cointegration:

```python
# Fetch data for two different stocks (e.g., GOOGL and AMZN)
tickers = ['GOOGL', 'AMZN']
data = yf.download(tickers, start='2020-01-01', end='2023-01-01')['Adj Close']

# Calculate the log prices
log_prices = np.log(data)

# Perform Engle-Granger cointegration test
coint_test = sm.tsa.coint(log_prices[tickers[0]], log_prices[tickers[1]])

# Cointegration test result interpretation
test_statistic, p_value, critical_values = coint_test
print(f"Test Statistic: {test_statistic}")
print(f"P-value: {p_value}")
print(f"Critical Values: {critical_values}")
```

Try using this code to complete Exercise 1.



# Plotly Exercises

Now that you have been introduced to basic visualizations using Plotly, try the following exercises to practice:

1. **Line Plot**: Create a line plot using Plotly to show the trend of a stock's adjusted close prices over time. Use any stock from the list in the previous sections.
2. **Pie Chart**: Create a pie chart that represents the percentage of stock portfolio allocations for 5 stocks.
3. **Bubble Chart**: Create a bubble chart using randomly generated data where the size of each bubble represents the value of a third variable.
4. **Heatmap**: Create a heatmap showing the correlation between the adjusted close prices of all stocks in the dataset.


In [4]:

# Example - Line Plot:
import plotly.express as px
import yfinance as yf

# Fetch data for Apple stock
data = yf.download('AAPL', start='2020-01-01', end='2023-01-01')['Adj Close']

# Create line plot
fig = px.line(data, title='AAPL Stock Price Over Time')
fig.show()


[*********************100%%**********************]  1 of 1 completed



# Portfolio Optimization Exercises

1. **Different Objective**: In the current portfolio optimization, we used the Sharpe ratio as the objective. Try modifying the objective to minimize the volatility of the portfolio instead.
2. **Vary the Assets**: Modify the list of stocks used for portfolio optimization. Try adding more stocks to the portfolio, and compare the optimized portfolio's weights for different sets of stocks.
3. **Simulate Different Time Periods**: Change the start and end dates for the stock data to optimize portfolios for different market conditions (e.g., during a bull market or bear market). Analyze how the portfolio weights differ across time periods.


In [5]:

# Example - Minimizing Volatility:
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt import expected_returns, risk_models
import yfinance as yf
import plotly.graph_objs as go

# Fetch stock data
tickers = ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'TSLA']
data = yf.download(tickers, start='2020-01-01', end='2023-01-01')['Adj Close']

# Calculate expected returns and sample covariance matrix
mu = expected_returns.mean_historical_return(data)
S = risk_models.sample_cov(data)

# Optimize for minimum volatility
ef_volatility = EfficientFrontier(mu, S)
weights_min_vol = ef_volatility.min_volatility()
cleaned_weights_min_vol = ef_volatility.clean_weights()

# Display the optimized portfolio weights for minimum volatility
fig = go.Figure(data=[go.Bar(x=list(cleaned_weights_min_vol.keys()), y=list(cleaned_weights_min_vol.values()))])
fig.update_layout(title="Optimized Portfolio Weights (Minimum Volatility)", xaxis_title="Assets", yaxis_title="Weight")
fig.show()


[*********************100%%**********************]  5 of 5 completed



# Statistical Arbitrage Exercises

1. **Test Different Stock Pairs**: In the example, we tested the cointegration of two specific stocks. Try selecting a different pair of stocks and re-run the cointegration test. Are the new stocks cointegrated? How does the spread behave?
2. **Adjust the Mean and Standard Deviation Thresholds**: In the example, we used 1.5 times the standard deviation to generate trading signals. Try modifying this threshold (e.g., 1 standard deviation or 2 standard deviations) and observe how the signals and portfolio performance change.
3. **Different Time Periods**: Try running the statistical arbitrage strategy on a different time period (e.g., during the COVID-19 market crash). How does the strategy perform during different periods?


In [6]:

# Example - Testing a New Stock Pair:
import yfinance as yf
import statsmodels.api as sm
import numpy as np
import pandas as pd

# Fetch data for two different stocks (e.g., GOOGL and AMZN)
tickers = ['GOOGL', 'AMZN']
data = yf.download(tickers, start='2020-01-01', end='2023-01-01')['Adj Close']

# Calculate the log prices
log_prices = np.log(data)

# Perform Engle-Granger cointegration test
coint_test = sm.tsa.coint(log_prices[tickers[0]], log_prices[tickers[1]])

# Cointegration test result interpretation
test_statistic, p_value, critical_values = coint_test
print(f"Test Statistic: {test_statistic}")
print(f"P-value: {p_value}")
print(f"Critical Values: {critical_values}")


[*********************100%%**********************]  2 of 2 completed

Test Statistic: -0.7449006152702802
P-value: 0.94156984045575
Critical Values: [-3.91100464 -3.34423482 -3.05007226]



