# Extended Analysis: Stock Selection for Short Position Strategy Based on Correlation with NASDAQ-100 Futures (NQ=F)

## Project Overview

In this extended study, the goal is to refine the previously developed short position strategy by targeting stocks from the **NASDAQ-100** index that exhibit a significant correlation with **NASDAQ-100 futures (NQ=F)**. The primary objective is to enhance the shorting strategy by narrowing the selection to stocks whose price movements are highly correlated with the broader market trend represented by the NQ=F futures. This approach aims to improve both the precision and profitability of the shorting strategy, increasing its reliability as a trading tool.

## Data Acquisition

In this step, we download the historical adjusted closing price data for **NASDAQ-100** stocks and **NQ=F futures** from Yahoo Finance. We collect data spanning from **January 1, 2015**, to the present day, for a set of NASDAQ-100 tickers. 

The **adjusted closing price** is used to account for stock splits and dividends, ensuring that the data reflects the true price movement. We use **yfinance** to pull the data, which is critical for the analysis of price movements and subsequent correlation with the NQ=F futures.


In [442]:
# Import Libraries
import yfinance as yf
import pandas as pd
import numpy as np
from datetime import datetime

# NQ=F futures plus the 20 stock tickers to screen against it
tickers = [
    "NQ=F", "AAPL", "MSFT", "GOOGL", "AMZN", "META", "TSLA", "NVDA", "PYPL", "INTC", "CSCO",
    "AMD", "NFLX", "INTU", "PEP", "ADBE", "QCOM", "V", "MRNA", "BIDU", "ISRG"
]


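# Download historical data for the stocks and NQ=F futures.
# auto_adjust=False keeps the separate 'Adj Close' column; recent yfinance
# releases adjust prices by default and no longer return that column.
data = yf.download(tickers, start='2015-01-01', end=datetime.now(), auto_adjust=False)['Adj Close']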
# Remove timezone information from data
data.index = data.index.tz_localize(None)

# Preview the data
data.head()

[*********************100%***********************]  21 of 21 completed


Date,AAPL,ADBE,AMD,AMZN,BIDU,CSCO,GOOGL,INTC,INTU,ISRG,...,MRNA,MSFT,NFLX,NQ=F,NVDA,PEP,PYPL,QCOM,TSLA,V
2015-01-02,24.347172,72.339996,2.67,15.426,223.080002,20.326616,26.4121,27.896456,83.943237,58.396667,...,,40.232861,49.848572,4214.25,0.483177,71.198448,,55.841816,14.620667,61.776661
2015-01-05,23.661274,71.980003,2.66,15.1095,219.789993,19.921703,25.908844,27.5819,83.611931,57.152222,...,,39.862877,47.311428,4161.75,0.475016,70.663147,,55.578709,14.006,60.413025
2015-01-06,23.6635,70.529999,2.63,14.7645,220.179993,19.914345,25.26943,27.067844,81.044189,57.754444,...,,39.277786,46.501431,4102.25,0.460614,70.127922,,54.789345,14.085333,60.02372
2015-01-07,23.995319,71.110001,2.58,14.921,224.350006,20.098387,25.195114,27.635603,81.837898,58.07111,...,,39.776833,46.742859,4151.5,0.459414,72.178535,,55.428349,14.063333,60.827942
2015-01-08,24.917271,72.919998,2.61,15.023,229.210007,20.252996,25.282896,28.149639,82.936234,59.222221,...,,40.947002,47.779999,4232.25,0.476696,73.490334,,56.014732,14.041333,61.643784


## Calculating Daily Returns

In this step, we calculate the **daily returns** for each stock in the NASDAQ-100 index and the NQ=F futures. The daily return for each asset is computed as the percentage change between the adjusted closing price of consecutive trading days. 

In [444]:
# Calculate daily returns for each stock and NQ=F futures
returns = data.pct_change()

# Preview the data's returns
returns.head()



Date,AAPL,ADBE,AMD,AMZN,BIDU,CSCO,GOOGL,INTC,INTU,ISRG,...,MRNA,MSFT,NFLX,NQ=F,NVDA,PEP,PYPL,QCOM,TSLA,V
2015-01-02,,,,,,,,,,,...,,,,,,,,,,
2015-01-05,-0.028172,-0.004976,-0.003745,-0.020517,-0.014748,-0.01992,-0.019054,-0.011276,-0.003947,-0.02131,...,,-0.009196,-0.050897,-0.012458,-0.01689,-0.007518,,-0.004712,-0.042041,-0.022074
2015-01-06,9.4e-05,-0.020145,-0.011278,-0.022833,0.001774,-0.000369,-0.024679,-0.018637,-0.03071,0.010537,...,,-0.014678,-0.017121,-0.014297,-0.030318,-0.007574,,-0.014203,0.005664,-0.006444
2015-01-07,0.014022,0.008223,-0.019011,0.0106,0.018939,0.009242,-0.002941,0.020975,0.009794,0.005483,...,,0.012706,0.005192,0.012006,-0.002606,0.029241,,0.011663,-0.001562,0.013398
2015-01-08,0.038422,0.025453,0.011628,0.006836,0.021663,0.007693,0.003484,0.018601,0.013421,0.019822,...,,0.029418,0.022188,0.019451,0.037618,0.018174,,0.010579,-0.001564,0.013412
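
As a check on the definition, the first NQ=F returns can be reproduced by hand from the price preview. The sketch below copies the first three NQ=F adjusted closes from the table above (the variable names are illustrative, not part of the notebook) and compares the manual formula with `pct_change`.

```python
import pandas as pd

# First three NQ=F adjusted closes, copied from the price preview above
nq_prices = pd.Series(
    [4214.25, 4161.75, 4102.25],
    index=pd.to_datetime(["2015-01-02", "2015-01-05", "2015-01-06"]),
    name="NQ=F",
)

# Daily return by hand: (P_t - P_{t-1}) / P_{t-1}
previous = nq_prices.shift(1)
manual_returns = (nq_prices - previous) / previous

# pct_change computes the same quantity
pandas_returns = nq_prices.pct_change()

print(manual_returns.round(6))
print(pandas_returns.round(6))
```

The first row is `NaN` because there is no previous price, and the next two values round to -0.012458 and -0.014297, matching the NQ=F column in the returns preview.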


## Correlation Analysis

Here, we calculate the **correlation** between the returns of each NASDAQ-100 stock and the returns of the NQ=F futures. The correlation coefficient measures the strength and direction of the linear relationship between the two variables, i.e., how similarly the returns of the individual stocks move relative to the NQ=F futures.

A high correlation indicates that the stock tends to move in a similar direction to the NQ=F futures, while a low or negative correlation suggests less alignment. This step is crucial for identifying stocks that behave in a manner closely aligned with the broader NASDAQ-100 index.

In [446]:
# Extract NQ=F returns (for NASDAQ-100 futures)
nq_returns = returns["NQ=F"]

# Calculate correlation between each stock and NQ=F futures
correlations = returns.corrwith(nq_returns)

# Drop NQ=F from the correlations
correlations = correlations.drop("NQ=F")

# Preview the correlation results
print(correlations)

Ticker
AAPL     0.794358
ADBE     0.754279
AMD      0.554763
AMZN     0.723927
BIDU     0.472535
CSCO     0.637932
GOOGL    0.774630
INTC     0.631495
INTU     0.763073
ISRG     0.669006
META     0.682573
MRNA     0.236595
MSFT     0.838018
NFLX     0.557435
NVDA     0.717759
PEP      0.471933
PYPL     0.660208
QCOM     0.650229
TSLA     0.540430
V        0.693191
dtype: float64
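
To make the coefficient concrete, here is a minimal sketch of the Pearson correlation computed by hand and checked against pandas. The two return series are synthetic (randomly generated for illustration, not market data), and the names `market` and `stock` are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns: the stock is a scaled copy of the market plus noise
rng = np.random.default_rng(42)
market = pd.Series(rng.normal(0.0, 0.012, 500))
stock = pd.Series(0.9 * market.to_numpy() + rng.normal(0.0, 0.008, 500))

# Pearson correlation = sample covariance / (std of stock * std of market)
deviation_product = (stock - stock.mean()) * (market - market.mean())
sample_cov = deviation_product.sum() / (len(stock) - 1)
pearson_manual = sample_cov / (stock.std() * market.std())

# pandas gives the same number
pearson_pandas = stock.corr(market)

print(round(pearson_manual, 6), round(pearson_pandas, 6))
```

Adding more noise to `stock` lowers the coefficient even though the stock still follows the market on average, which is why a low correlation in the table above does not by itself mean a stock ignores the index.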


## Identifying Strongly Correlated Stocks

In this step, we **filter** the stocks by their correlation with the NQ=F futures, keeping only those whose absolute correlation is at least 0.7. This isolates the stocks with the strongest alignment to the broader NASDAQ-100 movement.

The filtered list serves as the foundation for the rest of the analysis: stocks that closely track the NQ=F futures are more likely to mirror market trends, which is important for a short-selling strategy.

In [448]:
# Define a threshold for strong correlation (absolute correlation >= 0.7)
strong_correlation_threshold = 0.7

# Filter stocks that meet the strong correlation criteria and convert to DataFrame
strongly_correlated_stocks_df = pd.DataFrame(list((correlations[abs(correlations) >= strong_correlation_threshold]).items()), 
                                             columns=['Ticker', 'Correlation'])

# Set the index to be the Ticker column
strongly_correlated_stocks_df.set_index('Ticker', inplace=True)

# Display the DataFrame
print(strongly_correlated_stocks_df)

        Correlation
Ticker             
AAPL       0.794358
ADBE       0.754279
AMZN       0.723927
GOOGL      0.774630
INTU       0.763073
MSFT       0.838018
NVDA       0.717759


While **correlation** shows how closely a stock tracks the NQ=F futures, it says nothing about the size of the stock's moves relative to the market. To capture that, we calculate **Beta**, which measures how sensitive the stock's returns are to the market's returns.

## Beta Value Calculation: Measuring Volatility

The **Beta** value quantifies how much a stock's price moves relative to the market. The formula for Beta is:

$$\beta = \frac{\text{Cov}(R_{\text{stock}}, R_{\text{market}})}{\text{Var}(R_{\text{market}})}$$

Where:
- $\text{Cov}(R_{\text{stock}}, R_{\text{market}})$  is the covariance between the stock's returns and the market returns.
- $\text{Var}(R_{\text{market}})$  is the variance of the market's returns.

- **Beta > 1**: The stock tends to move more than the market (it amplifies market moves).
- **Beta < 1**: The stock tends to move less than the market.
- **Beta = 1**: The stock tends to move in line with the market.

#### Why Beta Matters for a Short-Selling Strategy

For a short-selling strategy, stocks with **higher Beta values** are attractive candidates because they tend to fall further than the broader market during downtrends. The same sensitivity works against a short position when the market rises.


In [455]:
# Extract tickers from the index and convert them to a list
strongly_correlated_stocks_list = strongly_correlated_stocks_df.index.tolist()

# Display the tickers list
print(strongly_correlated_stocks_list)

['AAPL', 'ADBE', 'AMZN', 'GOOGL', 'INTU', 'MSFT', 'NVDA']


In [457]:
# Calculate the variance of NQ=F returns
nq_f_variance = returns['NQ=F'].var()

# Initialize an empty list to store beta values for each stock
beta_values = []

# Loop through each stock in the strongly correlated stocks list
for stock in strongly_correlated_stocks_list:
    # Extract returns for the stock and the market (NQ=F)
    stock_returns = returns[stock]
    nasdaq_returns = returns['NQ=F']
    
    # Calculate the covariance between stock returns and NASDAQ returns
    cov_stock_nqf = stock_returns.cov(nasdaq_returns)
    
    # Calculate Beta (Covariance / Variance of the market returns)
    beta = cov_stock_nqf / nq_f_variance
    
    # Append the beta value to the list
    beta_values.append(beta)

# Assign the beta values to the 'Beta' column of the DataFrame
strongly_correlated_stocks_df['Beta'] = beta_values

# Display the updated DataFrame with Beta values
print(strongly_correlated_stocks_df)

        Correlation      Beta
Ticker                       
AAPL       0.794358  1.037636
ADBE       0.754279  1.137224
AMZN       0.723927  1.086576
GOOGL      0.774630  1.005027
INTU       0.763073  1.094349
MSFT       0.838018  1.042646
NVDA       0.717759  1.599278
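
Beta and correlation are linked: dividing the covariance by the market variance is the same as multiplying the correlation by the ratio of the two standard deviations, $\beta = \rho \cdot \sigma_{\text{stock}} / \sigma_{\text{market}}$. The sketch below checks this on synthetic returns (randomly generated, with a true sensitivity of 1.5 chosen for illustration); none of the names come from the notebook.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns: the stock moves 1.5x the market plus independent noise
rng = np.random.default_rng(0)
market = pd.Series(rng.normal(0.0, 0.012, 2000), name="market")
stock = pd.Series(1.5 * market.to_numpy() + rng.normal(0.0, 0.010, 2000), name="stock")

# Beta from the definition: Cov(stock, market) / Var(market)
beta_from_cov = stock.cov(market) / market.var()

# Beta from correlation and the ratio of standard deviations
beta_from_corr = stock.corr(market) * stock.std() / market.std()

print(round(beta_from_cov, 4), round(beta_from_corr, 4))
```

This explains how NVDA can have the lowest correlation in the table and still the highest beta: its returns are much more volatile than the market's, so the standard deviation ratio outweighs the lower correlation.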


## VaR (Value at Risk) Calculation: Measuring Risk for Each Stock

In this step, we calculate **Value at Risk (VaR)** for each stock in the strongly correlated group. VaR estimates the potential loss in the value of an asset over a specified time period at a given confidence level. Here we calculate a one-day parametric VaR from daily returns at a **95% confidence level**, assuming returns are approximately normally distributed: on about 95% of trading days the return is expected to be no worse than the calculated VaR value.

#### VaR Formula
The formula for VaR is:

$$\text{VaR} = \mu + (Z \times \sigma)$$

Where:
- $\mu$ is the **mean return** of the stock,
- $Z$ is the **Z-score** for the desired confidence level (for 95%, the one-tailed Z-score is -1.645),
- $\sigma$ is the **standard deviation** of the stock's returns.

#### Why VaR Is Important

VaR helps quantify a stock's short-term downside. Because the Z-score is negative, the values computed here are negative daily returns, and a **more negative VaR** means larger typical daily drops, i.e. a more volatile stock. This left-tail figure is the loss faced by a long holder; for a short position the loss comes from the right tail, which under the same normal assumption is $\mu + 1.645\sigma$. Either way, stocks with a wider VaR require more careful consideration in terms of capital allocation and stop-loss thresholds.

In [464]:
# Calculate the standard deviation of NQ=F returns
nq_f_std_dev = returns['NQ=F'].std()

# Z-score for 95% confidence level (one-tailed distribution)
z_score_95 = -1.645

# Initialize an empty list to store VaR values for each stock
var_values = []

# Loop through each stock in the strongly correlated stocks list
for stock in strongly_correlated_stocks_list:
    # Extract returns for the stock
    stock_returns = returns[stock]
    
    # Calculate VaR at 95% confidence level (mean + Z * standard deviation)
    var_95 = stock_returns.mean() + (z_score_95 * stock_returns.std())
    
    # Append the VaR value to the list
    var_values.append(var_95)

# Assign the VaR values to the 'VaR' column of the DataFrame
strongly_correlated_stocks_df['VaR'] = var_values

# Display the updated DataFrame with VaR values
print(strongly_correlated_stocks_df)

        Correlation      Beta       VaR
Ticker                                 
AAPL       0.794358  1.037636 -0.028533
ADBE       0.754279  1.137224 -0.033167
AMZN       0.723927  1.086576 -0.032759
GOOGL      0.774630  1.005027 -0.028475
INTU       0.763073  1.094349 -0.031475
MSFT       0.838018  1.042646 -0.027105
NVDA       0.717759  1.599278 -0.047722
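
The parametric formula relies on the normality assumption. A common cross-check is the historical VaR, which reads the 5th percentile directly from the observed returns. The sketch below compares the two on synthetic normal returns and also computes the right-tail figure relevant to a short position; the function names and the sample data are assumptions for illustration only.

```python
from statistics import NormalDist

import numpy as np
import pandas as pd


def parametric_var(returns: pd.Series, confidence: float = 0.95) -> float:
    """Left-tail VaR under a normal assumption: mean + Z * std, with Z < 0."""
    z = NormalDist().inv_cdf(1.0 - confidence)  # about -1.645 for 95%
    return returns.mean() + z * returns.std()


def historical_var(returns: pd.Series, confidence: float = 0.95) -> float:
    """Left-tail VaR read directly from the empirical distribution."""
    return returns.quantile(1.0 - confidence)


def short_side_var(returns: pd.Series, confidence: float = 0.95) -> float:
    """Right-tail return under a normal assumption: the adverse move for a short."""
    z = NormalDist().inv_cdf(confidence)  # about +1.645 for 95%
    return returns.mean() + z * returns.std()


# Synthetic daily returns for illustration only (not market data)
rng = np.random.default_rng(1)
sample = pd.Series(rng.normal(0.001, 0.02, 5000))

print(parametric_var(sample), historical_var(sample), short_side_var(sample))
```

On real stock returns, which tend to have fatter tails than a normal distribution, the historical figure can differ noticeably from the parametric one, so comparing both is a reasonable sanity check.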


## Testing the Strategy on Selected Stocks

In this step, we run a simple test on the selected stocks that have a strong correlation with the **NQ=F futures**. The goal is to evaluate how these stocks performed over two short windows.

We download adjusted close prices for the selected stocks over two periods:
- **Period 1:** From **2024-08-20 to 2024-09-05** (used for historical performance analysis).
- **Period 2:** From **2024-11-11 to 2024-11-16** (used to test the strategy on these stocks).

The `end` date in `yf.download` is exclusive, so the last close used is September 4 for Period 1 and Friday, November 15 for Period 2.

For each window we calculate the percentage change in each stock's adjusted close from the first to the last trading day. NQ=F itself is not downloaded for these windows, so the comparison with the broader market is qualitative.

#### Steps:
1. **Download Historical Data**: We use the `yfinance` library to fetch adjusted close prices for the strongly correlated stocks.
2. **Calculate Percentage Change**: For each stock, the percentage change in the adjusted close price is calculated from the beginning to the end of each period. This is done using the formula:
   $$
   \text{Return\%} = \frac{{\text{Last Close} - \text{First Close}}}{{\text{First Close}}} \times 100
   $$
3. **Create Dataframe**: The results are stored in a DataFrame for better readability.

In [466]:
# Download adjusted close prices for the two test windows (the end date is exclusive)
test1 = yf.download(strongly_correlated_stocks_list, start='2024-08-20', end='2024-09-05', auto_adjust=False)['Adj Close']
test2 = yf.download(strongly_correlated_stocks_list, start='2024-11-11', end='2024-11-16', auto_adjust=False)['Adj Close']

[*********************100%***********************]  7 of 7 completed
[*********************100%***********************]  7 of 7 completed


In [473]:
# Preview the test DataFrame
test1.head()

Date,AAPL,ADBE,AMZN,GOOGL,INTU,MSFT,NVDA
2024-08-20 00:00:00+00:00,226.261063,562.25,178.880005,166.95845,666.163574,424.799988,127.239113
2024-08-21 00:00:00+00:00,226.151184,565.789978,180.110001,165.630219,669.368164,424.140015,128.489014
2024-08-22 00:00:00+00:00,224.283249,557.440002,176.130005,163.582932,664.17688,415.549988,123.729416
2024-08-23 00:00:00+00:00,226.590698,558.299988,177.039993,165.400513,618.812927,416.790009,129.358932
2024-08-26 00:00:00+00:00,226.930328,559.440002,175.5,165.939804,616.087463,413.48999,126.449181


In [475]:
# Function to calculate the percentage return from the first to the last Adj Close
def calculate_return(stock_data):
    period_returns = {}  # separate name, so the global `returns` DataFrame is not shadowed

    for ticker in stock_data:  # Iterating a DataFrame yields its column labels (tickers)
        # Get the first and last Adj Close price
        first_close = stock_data[ticker].iloc[0]
        last_close = stock_data[ticker].iloc[-1]

        # Calculate the return in percent
        return_pct = (last_close - first_close) / first_close * 100
        period_returns[ticker] = return_pct

    return period_returns
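
The same first-to-last calculation can be written without an explicit loop. This is a minimal vectorized sketch, not the notebook's function; `period_return_pct` and the toy prices are hypothetical.

```python
import pandas as pd


def period_return_pct(prices: pd.DataFrame) -> pd.Series:
    """Percentage change from the first to the last row, for every column."""
    return (prices.iloc[-1] / prices.iloc[0] - 1.0) * 100.0


# Toy prices for two made-up tickers
toy = pd.DataFrame({"AAA": [100.0, 110.0, 90.0], "BBB": [50.0, 55.0, 60.0]})
result = period_return_pct(toy)
print(result.round(2))
```

The result is a Series indexed by ticker (about -10% for `AAA` and +20% for `BBB` here), which can be placed in a DataFrame column directly instead of going through a dictionary.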

In [508]:
# Calculate returns for both test1 and test2
returns_test1 = calculate_return(test1)
returns_test2 = calculate_return(test2)

# Convert the results into a pandas DataFrame for better readability
returns_df = pd.DataFrame({
    'Ticker': list(returns_test1.keys()),
    'Return 08/20-09/05': list(returns_test1.values()),
    'Return 11/11-11/16': list(returns_test2.values())
})

# Display the DataFrame
print(returns_df)

  Ticker  Return 08/20-09/05  Return 11/11-11/16
0   AAPL           -2.498779            0.343399
1   ADBE            2.312139           -0.220032
2   AMZN           -3.102640           -2.045057
3  GOOGL           -6.418228           -4.358193
4   INTU           -6.620908           -1.359429
5   MSFT           -3.742937           -0.720081
6   NVDA          -16.534382           -2.258019


In [510]:
# Merge strongly_correlated_stocks_df with returns_df on 'Ticker'
for_inference = pd.merge(strongly_correlated_stocks_df, returns_df, on='Ticker')

# Display the merged DataFrame
print(for_inference)

  Ticker  Correlation      Beta       VaR  Return 08/20-09/05  \
0   AAPL     0.794358  1.037636 -0.028533           -2.498779   
1   ADBE     0.754279  1.137224 -0.033167            2.312139   
2   AMZN     0.723927  1.086576 -0.032759           -3.102640   
3  GOOGL     0.774630  1.005027 -0.028475           -6.418228   
4   INTU     0.763073  1.094349 -0.031475           -6.620908   
5   MSFT     0.838018  1.042646 -0.027105           -3.742937   
6   NVDA     0.717759  1.599278 -0.047722          -16.534382   

   Return 11/11-11/16  
0            0.343399  
1           -0.220032  
2           -2.045057  
3           -4.358193  
4           -1.359429  
5           -0.720081  
6           -2.258019  
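
The returns above are for holding the stock. As a rough sketch, the return on a short position held over the same window is the negative of the long return, assuming no borrow fees, margin interest, or transaction costs (a simplification, not part of the original analysis). The values are copied from the table above; the column names `aug_window` and `nov_window` are illustrative.

```python
import pandas as pd

# Long returns in percent, copied from the results table above
long_returns_pct = pd.DataFrame(
    {
        "aug_window": [-2.498779, 2.312139, -3.102640, -6.418228,
                       -6.620908, -3.742937, -16.534382],
        "nov_window": [0.343399, -0.220032, -2.045057, -4.358193,
                       -1.359429, -0.720081, -2.258019],
    },
    index=["AAPL", "ADBE", "AMZN", "GOOGL", "INTU", "MSFT", "NVDA"],
)

# Simplified short return: the sign-flipped long return (costs ignored)
short_returns_pct = -long_returns_pct

# Average short return if each stock is shorted with equal weight
equal_weight_short = short_returns_pct.mean()

print(short_returns_pct.round(2))
print(equal_weight_short.round(2))
```

Under these assumptions, an equal-weight short basket would have gained in both windows, with NVDA contributing most of the gain in the first one.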


## Inference and Conclusion from Extended Research

### Key Insights:
- **Correlation and Price Movement:**
  - The correlation between each selected stock and the NQ=F futures indicates how closely the stock's daily returns track the index futures. A strong positive correlation suggests that the stock will usually move in the same direction as NQ=F. The test windows are consistent with this co-movement: in the window starting August 20, 2024, every selected stock except ADBE showed a negative return. NQ=F's own return over that window is not computed here, so the link to an index decline is inferred rather than measured.

- **Beta as an Indicator for Short Strategy:**
  - The **beta value** identifies stocks that amplify market moves, which can be advantageous for a **short strategy**. Higher-beta stocks, such as NVDA (Beta = 1.60), are more sensitive to overall market movements and could offer higher returns in a short position when the market declines. NVDA's beta is well above that of the other six stocks (1.01 to 1.14), which makes it the most market-sensitive short candidate in the group.


- **VaR and Stock Performance:**
  - The **Value at Risk (VaR)** analysis estimates the daily return that is breached on only about 5% of trading days. NVDA, with the highest beta, also has the widest VaR (-0.048, or about -4.8% per day) compared with -0.027 to -0.033 for the other stocks, indicating larger moves in adverse market conditions. In the window starting August 20, 2024, NVDA fell 16.53%, far more than any other high-correlation stock (the next largest decline was INTU at 6.62%). For a short position that decline is a gain, which is consistent with NVDA's higher volatility producing larger profits when its price falls, although a single window is limited evidence.


- **Market Conditions and Anomalies:**
  - While historical data and metrics like beta and VaR provide valuable insights, individual stock performance can be affected by external factors such as news, earnings calls, or other company-specific events. For example, in the November 2024 window NVDA fell only 2.26%, a smaller decline than GOOGL's 4.36% despite NVDA's much higher beta, potentially due to market anticipation of its earnings call. Historical metrics can guide decision-making, but traders should also consider external events that could affect stock performance.

### Conclusion:
This research illustrates how **correlation**, **beta**, and **VaR** can be combined to identify potential short candidates and manage risk. Strongly correlated stocks are likely to follow the broader market trend, and those with higher beta values, like NVDA, are more volatile and could perform better in a short position when the market falls. A wide VaR signals the potential for large moves in either direction, so the same stocks that offer greater returns when shorted in a declining market also carry greater risk. Traders should also stay informed about company-specific news that may impact stock performance, as seen with NVDA's muted decline in the November 2024 window. Overall, combining these factors can provide valuable insights for shorting strategies but should be supplemented with real-time information for decision-making.