# Module 2 Homework

In this homework, we're going to combine data from various sources, process it in Pandas, and generate additional fields.

Unless stated otherwise, please use the [Colab](https://github.com/DataTalksClub/stock-markets-analytics-zoomcamp/blob/main/02-dataframe-analysis/Module2_Colab_Working_with_the_data.ipynb) covered in the livestream to re-use the code snippets.

In [1]:
import numpy as np
import pandas as pd
import requests
import time, io
import talib
import yfinance as yf
from datetime import date, datetime, timedelta

## Question 1: IPO Filings Web Scraping and Data Processing

**What's the total sum ($m) of 2023 filings that happened on Fridays?**

Re-use the [Code Snippet 1] example to get the data from the web for this endpoint: https://stockanalysis.com/ipos/filings/
Convert 'Filing Date' to datetime and 'Shares Offered' to float64 (if '-' is encountered, populate with NaNs).
Define a new field 'Avg_price' based on 'Price Range': it equals NaN if no price is specified, the price itself if only one number is provided, or the average of the two prices if a range is given.
You may be inspired by the function `extract_numbers()` in [Code Snippet 4], or you can write your own function to "parse" a string.
Define a column 'Shares_offered_value', which equals 'Shares Offered' * 'Avg_price' (when both columns are defined; otherwise, it's NaN).

Find the total sum in $m (millions of USD, rounded to the closest INTEGER) for all 2023 filings that happened on Fridays (`Date.dt.dayofweek == 4`; note that `dayofweek` is a property, not a method). You should see 32 records in total, 25 of which are not null.
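As a minimal sketch of the parsing and filtering steps described above, the block below uses a regex-based helper instead of `extract_numbers()` from the Colab. The helper name `avg_price` is an assumption made for illustration, and the four sample rows are copied from the tables shown in this notebook rather than scraped live.

```python
import re

import numpy as np
import pandas as pd


def avg_price(price_range: str) -> float:
    """Mean of all numbers found in a 'Price Range' string, NaN if there are none."""
    numbers = [float(x) for x in re.findall(r"\d+(?:\.\d+)?", str(price_range))]
    return float(np.mean(numbers)) if numbers else np.nan


# A few rows copied from the notebook output (not a live scrape)
sample = pd.DataFrame({
    "Filing Date": pd.to_datetime(["2023-12-29", "2023-12-28", "2023-10-13", "2023-12-22"]),
    "Price Range": ["$3.50 - $4.50", "$7.00 - $8.00", "$4.00", "-"],
    "Shares Offered": [1_200_000.0, 1_066_667.0, 2_000_000.0, np.nan],
})
sample["Avg_price"] = sample["Price Range"].apply(avg_price)
# NaN in either column propagates to the product
sample["Shares_offered_value"] = sample["Shares Offered"] * sample["Avg_price"]

# dayofweek is a property: Monday=0 ... Friday=4
fridays = sample[sample["Filing Date"].dt.dayofweek == 4]
# sum() skips NaN values by default
total_musd = fridays["Shares_offered_value"].sum() / 1e6
print(total_musd)
```

The regex approach treats "no digits found" as the missing-price case, so it does not need a separate check for the "-" placeholder.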

(additional: you can read about [S-1 IPO filing](https://www.dfinsolutions.com/knowledge-hub/thought-leadership/knowledge-resources/what-s-1-ipo-filing) to understand the context)

In [2]:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3',
}

url = "https://stockanalysis.com/ipos/filings/"
response = requests.get(url, headers=headers)
response = io.StringIO(response.text)

ipo_dfs = pd.read_html(response)

In [3]:
ipos_df = ipo_dfs[0]
ipos_df["Filing Date"] = pd.to_datetime(ipos_df["Filing Date"], format="mixed")
ipos_df.loc[ipos_df["Shares Offered"] == "-", "Shares Offered"] = np.nan
ipos_df["Shares Offered"] = ipos_df["Shares Offered"].astype("float")
ipos_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 325 entries, 0 to 324
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dtype         
---  ------          --------------  -----         
 0   Filing Date     325 non-null    datetime64[ns]
 1   Symbol          325 non-null    object        
 2   Company Name    325 non-null    object        
 3   Price Range     325 non-null    object        
 4   Shares Offered  252 non-null    float64       
dtypes: datetime64[ns](1), float64(1), object(3)
memory usage: 12.8+ KB


In [4]:
ipos_2023 = ipos_df[ipos_df["Filing Date"].dt.year == 2023].copy()
ipos_2023["weekday"] = ipos_2023["Filing Date"].dt.weekday
ipos_2023

Unnamed: 0,Filing Date,Symbol,Company Name,Price Range,Shares Offered,weekday
50,2023-12-29,LEC,Lafayette Energy Corp,$3.50 - $4.50,1200000.0,4
51,2023-12-29,EPSM,Epsium Enterprise Limited,-,,4
52,2023-12-28,ONDR,"Sushi Ginza Onodera, Inc.",$7.00 - $8.00,1066667.0,3
53,2023-12-27,JDZG,Jiade Limited,$4.00 - $5.00,2200000.0,2
54,2023-12-22,LZMH,LZ Technology Holdings Limited,-,,4
...,...,...,...,...,...,...
162,2023-01-31,FBGL,FBS Global Limited,$4.00 - $5.00,1875000.0,1
163,2023-01-24,THNK,"T1V, Inc.",$4.00 - $6.00,3300000.0,1
164,2023-01-23,RPET,New Ruipeng Pet Group Inc.,-,,0
165,2023-01-13,RVGO,"RVeloCITY, Inc.",$4.00 - $5.00,3750000.0,4


In [5]:
ipos_2023_friday = ipos_2023[ipos_2023["weekday"] == 4].copy()
ipos_2023_friday

Unnamed: 0,Filing Date,Symbol,Company Name,Price Range,Shares Offered,weekday
50,2023-12-29,LEC,Lafayette Energy Corp,$3.50 - $4.50,1200000.0,4
51,2023-12-29,EPSM,Epsium Enterprise Limited,-,,4
54,2023-12-22,LZMH,LZ Technology Holdings Limited,-,,4
55,2023-12-22,CHLW,Chun Hui Le Wan International Holding Group Ltd,-,,4
60,2023-12-15,GIT,Going International Holding Company Limited,-,,4
62,2023-12-08,ENGS,Energys Group Limited,$4.00 - $6.00,2000000.0,4
63,2023-12-08,LNKS,Linkers Industries Limited,$4.00 - $6.00,2200000.0,4
82,2023-10-27,RAY,Raytech Holding Limited,$4.00 - $5.00,1500000.0,4
89,2023-10-13,ORIS,Oriental Rise Holdings Limited,$4.00,2000000.0,4
92,2023-10-06,QMMM,QMMM Holdings Limited,$4.00,2125000.0,4


In [6]:
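def calculate_average(price_range):
    """Return the average price for a 'Price Range' string, or NaN if no price is given."""
    if price_range == "-":
        return np.nan

    if "-" in price_range:
        # A range such as "$4.00 - $6.00": average the two bounds
        prices = price_range.replace("$", "").replace(" ", "").split("-")
        return np.mean([float(p) for p in prices])

    # A single price such as "$4.00"
    return float(price_range.replace("$", ""))

ipos_2023_friday["Avg_price"] = ipos_2023_friday["Price Range"].apply(calculate_average)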
ipos_2023_friday["Shares_offered_value"] = ipos_2023_friday["Avg_price"] * ipos_2023_friday["Shares Offered"]
ipos_2023_friday

Unnamed: 0,Filing Date,Symbol,Company Name,Price Range,Shares Offered,weekday,Avg_price,Shares_offered_value
50,2023-12-29,LEC,Lafayette Energy Corp,$3.50 - $4.50,1200000.0,4,4.0,4800000.0
51,2023-12-29,EPSM,Epsium Enterprise Limited,-,,4,,
54,2023-12-22,LZMH,LZ Technology Holdings Limited,-,,4,,
55,2023-12-22,CHLW,Chun Hui Le Wan International Holding Group Ltd,-,,4,,
60,2023-12-15,GIT,Going International Holding Company Limited,-,,4,,
62,2023-12-08,ENGS,Energys Group Limited,$4.00 - $6.00,2000000.0,4,5.0,10000000.0
63,2023-12-08,LNKS,Linkers Industries Limited,$4.00 - $6.00,2200000.0,4,5.0,11000000.0
82,2023-10-27,RAY,Raytech Holding Limited,$4.00 - $5.00,1500000.0,4,4.5,6750000.0
89,2023-10-13,ORIS,Oriental Rise Holdings Limited,$4.00,2000000.0,4,4.0,8000000.0
92,2023-10-06,QMMM,QMMM Holdings Limited,$4.00,2125000.0,4,4.0,8500000.0


In [7]:
print(f"The total sum ($m) of 2023 filings that happened on Fridays is\
 {round(ipos_2023_friday['Shares_offered_value'].sum() / 1e6)}")

The total sum ($m) of 2023 filings that happened on Fridays is 286


## Question 2:  IPOs "Fixed days hold" strategy


**What is the optimal number of days X (between 1 and 30) for which the 75% quantile growth is the highest?**


Reuse [Code Snippet 1] to retrieve the list of IPOs from 2023 and 2024 (from URLs: https://stockanalysis.com/ipos/2023/ and https://stockanalysis.com/ipos/2024/). 
Get all OHLCV daily prices for all stocks with an "IPO date" before March 1, 2024 ("< 2024-03-01") - 184 tickers (without 'RYZB'). Please remove 'RYZB', as it is no longer available on Yahoo Finance. 

Sometimes you may need to adjust the symbol name (e.g., 'IBAC' on stockanalysis.com -> 'IBACU' on Yahoo Finance) to locate OHLCV prices for all stocks. Also, you can see the ticker changes using this [link](https://stockanalysis.com/actions/changes/).
Some of the tickers (like 'DYCQ' and 'LEGT') were on the market less than 30 days (11 and 21 days, respectively). Let's leave them in the dataset; it just means that you couldn't hold them for more days than they were listed.
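The symbol clean-up described above can be kept in one place with a small helper. This is only a sketch: the function name `adjust_symbols` and its parameters are hypothetical, and the rename and drop entries are just the examples mentioned in the text ('IBAC' to 'IBACU', removing 'RYZB').

```python
def adjust_symbols(symbols, renames=None, drop=None):
    """Drop unavailable symbols, apply renames, and keep the original order."""
    renames = renames or {}
    drop = set(drop or [])
    adjusted = []
    for symbol in symbols:
        if symbol in drop:
            continue
        symbol = renames.get(symbol, symbol)
        # Avoid downloading the same ticker twice
        if symbol not in adjusted:
            adjusted.append(symbol)
    return adjusted


# Example with the symbols named in the text
tickers = adjust_symbols(
    ["IBAC", "RYZB", "DYCQ"],
    renames={"IBAC": "IBACU"},
    drop={"RYZB"},
)
print(tickers)
```

Keeping renames and removals in explicit dictionaries and sets makes it easier to review them against the ticker-changes page when Yahoo Finance symbols change again.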

Let's assume you managed to buy a new stock (listed on IPO) on the first day at the "Adj Close" price. Your strategy is to hold for exactly X full days (where X is between 1 and 30) and sell at the "Adj Close" price in X days (e.g., if X=1, you sell on the next day).
Find the X for which the 75% quantile growth (among 185 investments) is the highest.

HINTs:
* You can generate 30 additional columns: growth_future_1d ... growth_future_30d, join that with the table of min_dates (first day when each stock has data on Yahoo Finance), and perform vector operations on the resulting dataset.
* You can use the `DataFrame.describe()` function to get mean, min, max, 25-50-75% quantiles.
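The first hint can be sketched on synthetic data as follows. The tickers 'AAA' and 'BBB' and their prices are made up, only 4 holding periods are generated instead of 30, and growth is defined as `future / current - 1` (an assumption that matches the solution cells in this notebook).

```python
import pandas as pd

# Synthetic prices for two made-up tickers (not real market data)
prices = pd.DataFrame({
    "Ticker": ["AAA"] * 5 + ["BBB"] * 3,
    "Date": pd.to_datetime(
        ["2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05", "2023-01-06",
         "2023-02-01", "2023-02-02", "2023-02-03"]),
    "Adj Close": [10.0, 11.0, 12.0, 9.0, 10.0, 20.0, 18.0, 22.0],
})
prices = prices.sort_values(["Ticker", "Date"]).reset_index(drop=True)

MAX_DAYS = 4  # the homework uses 30
for d in range(1, MAX_DAYS + 1):
    # Price d trading days ahead, computed within each ticker
    future = prices.groupby("Ticker")["Adj Close"].shift(-d)
    prices[f"growth_future_{d}d"] = future / prices["Adj Close"] - 1

# First trading day per ticker, joined back to keep only the "buy" rows
min_dates = prices.groupby("Ticker", as_index=False)["Date"].min()
first_day = prices.merge(min_dates, on=["Ticker", "Date"], how="inner")

growth_cols = [f"growth_future_{d}d" for d in range(1, MAX_DAYS + 1)]
q75 = first_day[growth_cols].quantile(0.75)
best_col = q75.idxmax()
print(q75)
print(best_col)
```

Tickers with a short history (like 'BBB' here) simply get NaN for the longer holding periods, and `quantile()` ignores those NaNs, which mirrors the treatment of 'DYCQ' and 'LEGT' described above.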


Additional: 
* You can also verify that the mean and 50th percentile (median) investment returns are negative for most X values, which makes this a wager for a "lucky" investor who might land in the top 25%.
* What's your recommendation: Do you suggest pursuing this strategy for an optimal X?

In [8]:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3',
}

url = "https://stockanalysis.com/ipos/2023/"
response = requests.get(url, headers=headers)
response = io.StringIO(response.text)

ipo_dfs = pd.read_html(response)
ipos_2023 = ipo_dfs[0]

url = "https://stockanalysis.com/ipos/2024/"
response = requests.get(url, headers=headers)
response = io.StringIO(response.text)

ipo_dfs = pd.read_html(response)
ipos_2024 = ipo_dfs[0]

In [9]:
stacked_ipos_df = pd.concat([ipos_2024, ipos_2023], ignore_index=True)
stacked_ipos_df['IPO Date'] = pd.to_datetime(stacked_ipos_df['IPO Date'], format="mixed")
# Some values are missing (shown as "-"), so pd.to_numeric() is called with errors='coerce',
#     which converts unparseable values to NaN.
#     Otherwise you'll get a ValueError: Unable to parse string "-" at position 9
stacked_ipos_df['IPO Price'] = pd.to_numeric(stacked_ipos_df['IPO Price'].str.replace('$', ''), errors='coerce')
# Convert "Current" column
stacked_ipos_df['Current'] = pd.to_numeric(stacked_ipos_df['Current'].str.replace('$', ''), errors='coerce')
# Convert 'Return' to numeric format (percentage)
stacked_ipos_df['Return'] = pd.to_numeric(stacked_ipos_df['Return'].str.replace('%', ''), errors='coerce') / 100
stacked_ipos_df = stacked_ipos_df.loc[stacked_ipos_df["IPO Date"] < "2024-03-01"]
stacked_ipos_df

Unnamed: 0,IPO Date,Symbol,Company Name,IPO Price,Current,Return
33,2024-02-27,SMXT,"SolarMax Technology, Inc.",4.00,10.34,1.5850
34,2024-02-22,VHAI,Vocodia Holdings Corp,4.25,0.13,-0.9686
35,2024-02-21,DYCQ,DT Cloud Acquisition Corporation,10.00,10.16,0.0160
36,2024-02-16,CHRO,Chromocell Therapeutics Corp,6.00,1.84,-0.6933
37,2024-02-14,UMAC,"Unusual Machines, Inc.",4.00,1.07,-0.7325
...,...,...,...,...,...,...
213,2023-01-25,QSG,QuantaSing Group Ltd,12.50,3.19,-0.7448
214,2023-01-20,CVKD,"Cadrenal Therapeutics, Inc.",5.00,0.48,-0.9040
215,2023-01-13,SKWD,"Skyward Specialty Insurance Group, Inc.",15.00,37.61,1.5073
216,2023-01-13,ISRL,Israel Acquisitions Corp,10.00,10.93,0.0930


In [10]:
tickers = list(stacked_ipos_df["Symbol"].unique())
tickers.remove("RYZB")
tickers.remove("PTHR")
tickers.append("HOVR")
start_date = date(2023, 1, 1)
end_date = date(2024, 2, 29)
len(tickers)

184

In [11]:
def download_dataset(tickers, start_date, end_date, max_days=30):
    stocks_df = pd.DataFrame()

    for i, ticker in enumerate(tickers):
        print(i,ticker)

        # Work with stock prices
        temp_df = pd.DataFrame(index=range(1))
        historyPrices = yf.download(tickers = ticker,
                                    period="max",
                                    interval = "1d")

        # generate features for historical prices, and what we want to predict
        historyPrices = historyPrices[(historyPrices.index.date >= start_date) & (historyPrices.index.date <= end_date)]
        for day in range(1, max_days+1):
            if day < len(historyPrices):
                temp_df[f"growth_{day}d"] = historyPrices.iloc[day]["Adj Close"] / historyPrices.iloc[0]["Adj Close"] - 1
            else:
                temp_df[f"growth_{day}d"] = np.nan


        temp_df["ticker"] = ticker
        
        time.sleep(1)

        if stocks_df.empty:
            stocks_df = temp_df
        else:
            stocks_df = pd.concat([stocks_df, temp_df], ignore_index=True)

    return stocks_df

growth_df = download_dataset(tickers, start_date, end_date, max_days=30)

0 SMXT
[*********************100%%**********************]  1 of 1 completed
1 VHAI
[*********************100%%**********************]  1 of 1 completed
...
183 HOVR

In [12]:
growth_df.describe()

Unnamed: 0,growth_1d,growth_2d,growth_3d,growth_4d,growth_5d,growth_6d,growth_7d,growth_8d,growth_9d,growth_10d,...,growth_21d,growth_22d,growth_23d,growth_24d,growth_25d,growth_26d,growth_27d,growth_28d,growth_29d,growth_30d
count,181.0,181.0,180.0,179.0,179.0,179.0,179.0,179.0,177.0,177.0,...,165.0,165.0,165.0,162.0,160.0,158.0,157.0,157.0,156.0,154.0
mean,-0.053573,-0.061657,-0.067399,-0.077484,-0.085402,-0.09266,-0.103517,-0.10426,-0.102986,-0.10596,...,-0.081706,-0.090219,-0.090567,-0.093193,-0.105073,-0.098904,-0.064416,-0.058906,-0.059838,-0.058124
std,0.172888,0.20694,0.248641,0.26538,0.296595,0.311695,0.301164,0.323844,0.357175,0.3702,...,0.532256,0.489866,0.496217,0.50458,0.500826,0.547035,0.835076,0.836287,0.835842,0.829488
min,-0.846431,-0.891267,-0.913359,-0.905743,-0.918882,-0.912323,-0.914739,-0.909877,-0.904384,-0.903003,...,-0.951674,-0.951674,-0.950639,-0.951674,-0.951674,-0.954781,-0.955471,-0.953055,-0.957197,-0.959613
25%,-0.09589,-0.122449,-0.144643,-0.161301,-0.214694,-0.236429,-0.214208,-0.251714,-0.271429,-0.290155,...,-0.383817,-0.387967,-0.383817,-0.366905,-0.39767,-0.396992,-0.39802,-0.376238,-0.389279,-0.376471
50%,0.0,0.0,-0.002177,-0.004801,-0.003974,-0.02926,-0.023861,-0.019061,-0.017448,-0.022609,...,-0.022222,-0.01108,-0.015491,-0.020072,-0.030481,-0.017977,-0.025862,-0.025559,-0.025562,-0.008263
75%,0.016667,0.021407,0.011263,0.011435,0.0105,0.008474,0.006903,0.008991,0.011896,0.016327,...,0.013712,0.024089,0.026189,0.026564,0.02141,0.027515,0.034483,0.038591,0.025717,0.023718
max,0.362069,0.464015,1.38,1.08371,1.262443,1.52987,1.173913,1.35974,1.751948,2.176087,...,3.5,2.871041,2.846154,2.803394,2.427273,3.817886,8.056122,8.081632,8.265306,8.372449


In [13]:
optimal = growth_df.describe().T["75%"].argmax() + 1
print(f"The optimal number of days for the highest 75% quantile growth is {optimal}")

The optimal number of days for the highest 75% quantile growth is 28


## Question 3: Is Growth Concentrated in the Largest Stocks?

**What is the share of days (percentage as int) on which Large stocks outperform the Largest stocks, measured by growth_7d (growth over the previous 7 periods)?**


Reuse [Code Snippet 5] to obtain OHLCV stats for 33 stocks 
for 10 full years of data (2014-01-01 to 2023-12-31). You'll need to download slightly more data (7 periods before 2014-01-01 to calculate the growth_7d for the first 6 days correctly):

`US_STOCKS = ['MSFT', 'AAPL', 'GOOG', 'NVDA', 'AMZN', 'META', 'BRK-B', 'LLY', 'AVGO','V', 'JPM']`

`EU_STOCKS = ['NVO','MC.PA', 'ASML', 'RMS.PA', 'OR.PA', 'SAP', 'ACN', 'TTE', 'SIE.DE','IDEXY','CDI.PA']`

`INDIA_STOCKS = ['RELIANCE.NS','TCS.NS','HDB','BHARTIARTL.NS','IBN','SBIN.NS','LICI.NS','INFY','ITC.NS','HINDUNILVR.NS','LT.NS']`

`LARGEST_STOCKS = US_STOCKS + EU_STOCKS + INDIA_STOCKS`
<br/>

Now let's add the top 12-22 stocks (as of end-April 2024):
<br/>

`NEW_US = ['TSLA','WMT','XOM','UNH','MA','PG','JNJ','MRK','HD','COST','ORCL']`

`NEW_EU = ['PRX.AS','CDI.PA','AIR.PA','SU.PA','ETN','SNY','BUD','DTE.DE','ALV.DE','MDT','AI.PA','EL.PA']`

`NEW_INDIA = ['BAJFINANCE.NS','MARUTI.NS','HCLTECH.NS','TATAMOTORS.NS','SUNPHARMA.NS','ONGC.NS','ADANIENT.NS','ADANIENT.NS','NTPC.NS','KOTAKBANK.NS','TITAN.NS']`

`LARGE_STOCKS = NEW_EU + NEW_US + NEW_INDIA`

You should be able to obtain stats for 33 LARGEST STOCKS and 32 LARGE STOCKS (from the actual stats on Yahoo Finance)
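A quick way to sanity-check these counts is to look at the lists themselves. The sketch below only counts list entries; whether the repeated 'ADANIENT.NS' and the 'CDI.PA' overlap are the intended reason for the figure of 32 is an assumption, since the text attributes it to the data available on Yahoo Finance.

```python
US_STOCKS = ['MSFT', 'AAPL', 'GOOG', 'NVDA', 'AMZN', 'META', 'BRK-B', 'LLY', 'AVGO', 'V', 'JPM']
EU_STOCKS = ['NVO', 'MC.PA', 'ASML', 'RMS.PA', 'OR.PA', 'SAP', 'ACN', 'TTE', 'SIE.DE', 'IDEXY', 'CDI.PA']
INDIA_STOCKS = ['RELIANCE.NS', 'TCS.NS', 'HDB', 'BHARTIARTL.NS', 'IBN', 'SBIN.NS', 'LICI.NS',
                'INFY', 'ITC.NS', 'HINDUNILVR.NS', 'LT.NS']
LARGEST_STOCKS = US_STOCKS + EU_STOCKS + INDIA_STOCKS

NEW_US = ['TSLA', 'WMT', 'XOM', 'UNH', 'MA', 'PG', 'JNJ', 'MRK', 'HD', 'COST', 'ORCL']
NEW_EU = ['PRX.AS', 'CDI.PA', 'AIR.PA', 'SU.PA', 'ETN', 'SNY', 'BUD', 'DTE.DE', 'ALV.DE',
          'MDT', 'AI.PA', 'EL.PA']
NEW_INDIA = ['BAJFINANCE.NS', 'MARUTI.NS', 'HCLTECH.NS', 'TATAMOTORS.NS', 'SUNPHARMA.NS',
             'ONGC.NS', 'ADANIENT.NS', 'ADANIENT.NS', 'NTPC.NS', 'KOTAKBANK.NS', 'TITAN.NS']
LARGE_STOCKS = NEW_EU + NEW_US + NEW_INDIA

# dict.fromkeys() removes repeats while preserving the original order
unique_large = list(dict.fromkeys(LARGE_STOCKS))
# Tickers that appear in both groups
overlap = sorted(set(unique_large) & set(LARGEST_STOCKS))
large_only = [t for t in unique_large if t not in set(LARGEST_STOCKS)]

print(len(LARGEST_STOCKS), len(LARGE_STOCKS), len(unique_large), overlap, len(large_only))
```

Downloading from a de-duplicated list also prevents one ticker from being counted twice when the group average is computed.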

Calculate  `growth_7d` for every stock and every day.
Get the average daily `growth_7d` for the LARGEST_STOCKS group vs. the LARGE_STOCKS group.

For example, for the first day of data you should have:

| Date       | ticker_category | growth_7d |
|------------|:---------------:|----------:|
| 2014-01-01 | LARGE           |  1.011684 |
| 2014-01-01 | LARGEST         |  1.011797 |

On that day, the LARGEST group was growing faster than the LARGE one (new stocks).
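To see why extra history before 2014-01-01 is needed, here is a small sketch with made-up prices for a single hypothetical ticker. It assumes `growth_7d` is computed as `pct_change(7) + 1`, which is consistent with the ratio-style values in the table above.

```python
import pandas as pd

# Made-up prices for one hypothetical ticker over 10 consecutive business days
prices = pd.Series(
    [100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0, 107.0, 108.0, 109.0],
    index=pd.bdate_range("2013-12-20", periods=10),
    name="Adj Close",
)

# Ratio of today's price to the price 7 rows (trading days) earlier
growth_7d = prices.pct_change(7) + 1

# Rows without 7 earlier observations cannot be computed
n_missing = int(growth_7d.isna().sum())
first_valid = growth_7d.dropna().iloc[0]
print(n_missing, first_valid)
```

Because the leading rows of every series come out as NaN, the download has to start early enough that those rows fall before the first date you want to keep, and the buffer rows are dropped afterwards.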

Calculate the number of days when the LARGE GROUP (new smaller stocks) outperforms the LARGEST GROUP, divide it by the total number of trading days (which should be 2595 days), and convert it to a percentage (closest INTEGER value). For example, if you find that 1700 out of 2595 days meet this condition, it means that 1700/2595 = 0.655, or approximately 66% of days, the LARGE stocks were growing faster than the LARGEST ones. This suggests that you should consider extending your dataset with more stocks to seek higher growth.

HINT: you can use pandas.pivot_table() to "flatten" the table (LARGE and LARGEST growth_7d as columns)
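The pivot step from the hint might look like the sketch below. All tickers ('L1', 'L2', 'B1', 'B2') and growth values are invented for illustration; only the column names `Date`, `ticker_category`, and `growth_7d` follow the text.

```python
import pandas as pd

dates = pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-03", "2014-01-06"])

# Made-up growth_7d values: two tickers per group, four trading days
growth = {
    ("L1", "LARGE"): [1.02, 1.03, 0.98, 1.01],
    ("L2", "LARGE"): [1.00, 1.01, 1.00, 1.01],
    ("B1", "LARGEST"): [1.03, 1.00, 0.97, 1.02],
    ("B2", "LARGEST"): [1.01, 1.02, 0.99, 1.02],
}
rows = [
    {"Date": dt, "Ticker": ticker, "ticker_category": category, "growth_7d": g}
    for (ticker, category), values in growth.items()
    for dt, g in zip(dates, values)
]
stocks = pd.DataFrame(rows)

# One row per day, one column per group, averaging across tickers in each group
wide = stocks.pivot_table(
    index="Date", columns="ticker_category", values="growth_7d", aggfunc="mean"
)

# Days on which the LARGE group grew faster than the LARGEST group
large_wins = int((wide["LARGE"] > wide["LARGEST"]).sum())
share_pct = int(round(100 * large_wins / len(wide)))
print(wide)
print(share_pct)
```

The same rounding applied to the illustrative figures in the text gives `round(100 * 1700 / 2595)`, which is 66.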

In [14]:
US_STOCKS = ['MSFT', 'AAPL', 'GOOG', 'NVDA', 'AMZN', 'META', 'BRK-B', 'LLY', 'AVGO','V', 'JPM']

EU_STOCKS = ['NVO','MC.PA', 'ASML', 'RMS.PA', 'OR.PA', 'SAP', 'ACN', 'TTE', 'SIE.DE','IDEXY','CDI.PA']

INDIA_STOCKS = ['RELIANCE.NS','TCS.NS','HDB','BHARTIARTL.NS','IBN','SBIN.NS','LICI.NS','INFY','ITC.NS','HINDUNILVR.NS','LT.NS']

LARGEST_STOCKS = US_STOCKS + EU_STOCKS + INDIA_STOCKS

NEW_US = ['TSLA','WMT','XOM','UNH','MA','PG','JNJ','MRK','HD','COST','ORCL']

NEW_EU = ['PRX.AS','CDI.PA','AIR.PA','SU.PA','ETN','SNY','BUD','DTE.DE','ALV.DE','MDT','AI.PA','EL.PA']

NEW_INDIA = ['BAJFINANCE.NS','MARUTI.NS','HCLTECH.NS','TATAMOTORS.NS','SUNPHARMA.NS','ONGC.NS','ADANIENT.NS','ADANIENT.NS','NTPC.NS','KOTAKBANK.NS','TITAN.NS']

LARGE_STOCKS = NEW_EU + NEW_US + NEW_INDIA

start_date = date(2014, 1, 1) 
end_date = date(2023, 12, 31)

In [15]:
def download_dataset(tickers, start_date, end_date, group_name):
    stocks_df = pd.DataFrame()

    for i, ticker in enumerate(tickers):
        print(i,ticker)

        # Work with stock prices
        historyPrices = yf.download(tickers = ticker,
                                    start = start_date - timedelta(days=14),
                                    end = end_date)
    
        # generate features for historical prices, and what we want to predict
        historyPrices['Ticker'] = ticker
        historyPrices['Year']= historyPrices.index.year
        historyPrices['Month'] = historyPrices.index.month
        historyPrices['Weekday'] = historyPrices.index.weekday
        historyPrices['Date'] = pd.to_datetime(historyPrices.index.date)
        historyPrices["ticker_category"] = group_name
    
        historyPrices['growth_7d'] = historyPrices['Adj Close'].pct_change(7) + 1
        historyPrices = historyPrices[historyPrices["Date"] >= pd.to_datetime(start_date)]
    
        # sleep 1 sec between downloads - not to overload the API server
        time.sleep(1)
    
        if stocks_df.empty:
            stocks_df = historyPrices
        else:
            stocks_df = pd.concat([stocks_df, historyPrices], ignore_index=True)
            
    return stocks_df

largest_df = download_dataset(LARGEST_STOCKS, start_date, end_date, "LARGEST")
large_df = download_dataset(LARGE_STOCKS, start_date, end_date, "LARGE")

In [16]:
merge_df = pd.concat([largest_df, large_df], ignore_index=True)
merge_df

| | Open | High | Low | Close | Adj Close | Volume | Ticker | Year | Month | Weekday | Date | ticker_category | growth_7d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 37.349998 | 37.400002 | 37.099998 | 37.160000 | 31.233053 | 30632200 | MSFT | 2014 | 1 | 3 | 2014-01-02 | LARGEST | 1.009782 |
| 1 | 37.200001 | 37.220001 | 36.599998 | 36.910000 | 31.022926 | 31134800 | MSFT | 2014 | 1 | 4 | 2014-01-03 | LARGEST | 1.007919 |
| 2 | 36.849998 | 36.889999 | 36.110001 | 36.130001 | 30.367344 | 43603700 | MSFT | 2014 | 1 | 0 | 2014-01-06 | LARGEST | 0.974380 |
| 3 | 36.330002 | 36.490002 | 36.209999 | 36.410000 | 30.602688 | 35802800 | MSFT | 2014 | 1 | 1 | 2014-01-07 | LARGEST | 0.972490 |
| 4 | 36.000000 | 36.139999 | 35.580002 | 35.759998 | 30.056356 | 59971700 | MSFT | 2014 | 1 | 2 | 2014-01-08 | LARGEST | 0.958970 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 164602 | 3580.000000 | 3638.449951 | 3560.550049 | 3627.350098 | 3627.350098 | 777099 | TITAN.NS | 2023 | 12 | 4 | 2023-12-22 | LARGE | 1.006940 |
| 164603 | 3635.000000 | 3665.000000 | 3623.449951 | 3656.699951 | 3656.699951 | 526101 | TITAN.NS | 2023 | 12 | 1 | 2023-12-26 | LARGE | 1.018182 |
| 164604 | 3668.000000 | 3695.000000 | 3645.000000 | 3689.250000 | 3689.250000 | 666625 | TITAN.NS | 2023 | 12 | 2 | 2023-12-27 | LARGE | 1.024635 |
| 164605 | 3699.899902 | 3737.000000 | 3680.699951 | 3715.100098 | 3715.100098 | 1033648 | TITAN.NS | 2023 | 12 | 3 | 2023-12-28 | LARGE | 1.026384 |


In [17]:
largest_mean = merge_df[merge_df["ticker_category"] == "LARGEST"].groupby(by="Date")["growth_7d"].mean()
large_mean = merge_df[merge_df["ticker_category"] == "LARGE"].groupby(by="Date")["growth_7d"].mean()
outperform_ratio = (large_mean > largest_mean).mean()
print(f"The share of days when Large Stocks outperform (growth_7d - growth over 7 periods back) the Largest stocks is {round(outperform_ratio * 100)}%")

The share of days when Large Stocks outperform (growth_7d - growth over 7 periods back) the Largest stocks is 47%


## Question 4: Trying Another Technical Indicators strategy

**What's the total gross profit (in THOUSANDS of $) you'll get from trading on CCI (no fees assumption)?**


First, run the entire Colab to obtain the full DataFrame of data (after [Code Snippet 9]), and truncate it to the last full 10 years of data (2014-01-01 to 2023-12-31).
If you encounter any difficulties running the Colab - you can download it using this [link](https://drive.google.com/file/d/1m3Qisfs2XfWk6Sw_Uk5kHLWqwQ0q8SKb/view?usp=sharing).

Let's assume you've learned about the awesome **CCI indicator** ([Commodity Channel Index](https://www.investopedia.com/terms/c/commoditychannelindex.asp)), and decided to use only it for your operations.
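For intuition about what the indicator measures, here is a minimal sketch of the textbook CCI formula in pandas: the typical price minus its moving average, divided by 0.015 times the mean absolute deviation. The helper name `cci` and the toy prices are assumptions made for this example; it is not TA-Lib's implementation, and its agreement with `talib.CCI` has not been checked here.

```python
import numpy as np
import pandas as pd

def cci(high, low, close, timeperiod=14):
    """Commodity Channel Index from the textbook formula (illustrative sketch)."""
    tp = (high + low + close) / 3.0          # typical price
    sma = tp.rolling(timeperiod).mean()      # moving average of the typical price
    # Mean absolute deviation of the typical price from its window mean.
    mad = tp.rolling(timeperiod).apply(lambda w: np.abs(w - w.mean()).mean(), raw=True)
    return (tp - sma) / (0.015 * mad)

# Toy prices (made up): a steady uptrend, with high/low one unit around the close.
close = pd.Series(np.arange(1.0, 21.0))
values = cci(close + 1, close - 1, close, timeperiod=5)
print(values.tail(3))
```

A large positive value means the price sits far above its recent average relative to its typical deviation, which is the situation the 200 threshold below is meant to capture.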

You defined the "defensive" value of a high threshold of 200, and you trade only on Fridays (`Date.dt.dayofweek == 4`).

That is, every time CCI is above 200 for any stock (out of those 33) on a Friday, you invest $1000 (one investment per record with CCI > 200) at the Adj. Close price, hold it for 1 week (5 trading days), and then sell at the Adj. Close price of that day.

What's the expected gross profit (no fees) that you get in THOUSANDS $ (closest integer value) over many operations in 10 years?
Calculation for one operation: if you invested $1000 and received $1010 in 5 days, you add $10 to the gross profit; if you received $980, you add -$20 to the gross profit.
You need to sum these results over all trades (460 times in 10 years).
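The per-operation arithmetic above can be sketched in a few lines of plain Python. The three growth rates are hypothetical values chosen for the example.

```python
INVESTMENT = 1000  # dollars per trade, as stated in the task

# Hypothetical 5-day growth rates for three trades (fractions, not percent).
growth_future_5d = [0.01, -0.02, 0.005]

# Profit in dollars for each trade: +$10, -$20, +$5.
profits = [INVESTMENT * g for g in growth_future_5d]

gross_profit = sum(profits)                    # total in dollars
gross_profit_thousands = gross_profit / 1000   # total in thousands of dollars
print(profits, gross_profit, gross_profit_thousands)
```

Because every trade invests exactly $1000, the total in thousands of dollars equals the plain sum of the growth rates, which is why the solution cell further down sums `growth_future_5d` directly.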

Additional:
  * Add an approximate fee calculation over the 460 trades using this calculator: https://www.degiro.ie/fees/calculator (Product: "Shares, USA and Canada"; Amount per transaction: "1000 EUR"; Transactions per year: "460")
  * Are you still profitable on those trades?

In [18]:
# https://companiesmarketcap.com/usa/largest-companies-in-the-usa-by-market-cap/
US_STOCKS = ['MSFT', 'AAPL', 'GOOG', 'NVDA', 'AMZN', 'META', 'BRK-B', 'LLY', 'AVGO','V', 'JPM']

# You're required to add EU_STOCKS and INDIA_STOCKS
# https://companiesmarketcap.com/european-union/largest-companies-in-the-eu-by-market-cap/
EU_STOCKS = ['NVO','MC.PA', 'ASML', 'RMS.PA', 'OR.PA', 'SAP', 'ACN', 'TTE', 'SIE.DE','IDEXY','CDI.PA']

# https://companiesmarketcap.com/india/largest-companies-in-india-by-market-cap/
INDIA_STOCKS = ['RELIANCE.NS','TCS.NS','HDB','BHARTIARTL.NS','IBN','SBIN.NS','LICI.NS','INFY','ITC.NS','HINDUNILVR.NS','LT.NS']

ALL_TICKERS = US_STOCKS  + EU_STOCKS + INDIA_STOCKS

start_date = date(2014, 1, 1)
end_date = date(2023, 12, 31)

In [19]:
def download_dataset(tickers, start_date, end_date):
    stocks_df = pd.DataFrame()

    for i, ticker in enumerate(tickers):
        print(i,ticker)

        # Work with stock prices
        historyPrices = yf.download(tickers = ticker,
                                    start = start_date - timedelta(days=28),
                                    end = end_date + timedelta(days=28))

        # generate features for historical prices, and what we want to predict
        historyPrices['Ticker'] = ticker
        historyPrices['Year']= historyPrices.index.year
        historyPrices['Month'] = historyPrices.index.month
        historyPrices['Weekday'] = historyPrices.index.weekday
        historyPrices['Date'] = pd.to_datetime(historyPrices.index.date)

        historyPrices["CCI"] = talib.CCI(historyPrices["High"].values, historyPrices["Low"].values, historyPrices["Close"].values, timeperiod=14)

        # sleep 1 sec between downloads - not to overload the API server
        time.sleep(1)

        if stocks_df.empty:
            stocks_df = historyPrices
        else:
            stocks_df = pd.concat([stocks_df, historyPrices], ignore_index=True)

    return stocks_df

df = download_dataset(ALL_TICKERS, start_date, end_date)

In [20]:
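# Forward 5-day return: pct_change(5) looks back 5 rows within each ticker, shift(-5) moves it to the row where the trade opens.
# The shift runs over the whole column; this is safe here because rows are grouped by ticker and the first 5 rows of every ticker are NaN.
df["growth_future_5d"] = df.groupby("Ticker")["Adj Close"].pct_change(5).shift(-5)
# Equivalent alternative that shifts within each ticker:
# df["future_price_5d"] = df.groupby("Ticker")["Adj Close"].shift(-5)
# df["growth_future_5d"] = df["future_price_5d"] / df["Adj Close"] - 1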
df = df.loc[(df["Date"] >= "2014-01-01") & (df["Date"] <= "2023-12-31")]
df

| | Open | High | Low | Close | Adj Close | Volume | Ticker | Year | Month | Weekday | Date | CCI | growth_future_5d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19 | 37.349998 | 37.400002 | 37.099998 | 37.160000 | 31.233053 | 30632200 | MSFT | 2014 | 1 | 3 | 2014-01-02 | 57.700615 | -0.043864 |
| 20 | 37.200001 | 37.220001 | 36.599998 | 36.910000 | 31.022938 | 31134800 | MSFT | 2014 | 1 | 4 | 2014-01-03 | 1.373763 | -0.023571 |
| 21 | 36.849998 | 36.889999 | 36.110001 | 36.130001 | 30.367348 | 43603700 | MSFT | 2014 | 1 | 0 | 2014-01-06 | -96.631259 | -0.031830 |
| 22 | 36.330002 | 36.490002 | 36.209999 | 36.410000 | 30.602688 | 35802800 | MSFT | 2014 | 1 | 1 | 2014-01-07 | -83.904297 | -0.017303 |
| 23 | 36.000000 | 36.139999 | 35.580002 | 35.759998 | 30.056358 | 59971700 | MSFT | 2014 | 1 | 2 | 2014-01-08 | -147.855135 | 0.027964 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 81939 | 3424.000000 | 3496.000000 | 3408.600098 | 3477.949951 | 3477.949951 | 1681707 | LT.NS | 2023 | 12 | 4 | 2023-12-22 | 70.767162 | 0.013657 |
| 81940 | 3477.949951 | 3508.350098 | 3477.949951 | 3490.050049 | 3490.050049 | 1072263 | LT.NS | 2023 | 12 | 1 | 2023-12-26 | 99.598220 | -0.014885 |
| 81941 | 3510.000000 | 3549.000000 | 3504.149902 | 3544.000000 | 3544.000000 | 1389266 | LT.NS | 2023 | 12 | 2 | 2023-12-27 | 130.401152 | -0.029247 |
| 81942 | 3545.000000 | 3559.949951 | 3500.500000 | 3518.050049 | 3518.050049 | 3371121 | LT.NS | 2023 | 12 | 3 | 2023-12-28 | 106.774509 | -0.016870 |


In [21]:
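# Each trade invests $1000, so its profit in dollars is 1000 * growth_future_5d;
# in thousands of dollars, the total is therefore the plain sum of growth_future_5d.
trades = df[(df["CCI"] > 200) & (df["Weekday"] == 4)]
print(f"The total gross profit (in THOUSANDS of $) I'll get from trading on CCI (no fees assumption) is {round(trades['growth_future_5d'].sum())}")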

The total gross profit (in THOUSANDS of $) I'll get from trading on CCI (no fees assumption) is 1


## [EXPLORATORY] Question 5: Finding Your Strategy for IPOs

You've seen in the first questions that the median and average returns from investing in IPOs are negative, so you can't blindly invest in all deals.

How would you correct or refine the approach? Briefly describe the steps and the data you'll try to get. It should be generally feasible to obtain it from public sources (no access to companies' internal data).

E.g. (some ideas): Do you want to focus on a specific vertical? Do you want to build a smart comparison vs. existing stocks on the market? Or do you just want to get some features (which ones?), like the total number of people in a company, to find a segment of "successful" IPOs?


## Submitting the solutions

Form for submitting: https://courses.datatalks.club/sma-zoomcamp-2024/homework/hw02