## Enriching stock market data using the OpenAI API

<p align="center">
    <img src="images/nasdaq100.png" width="450">
</p>

The Nasdaq-100 is a stock market index made up of 101 equity securities issued by 100 of the largest non-financial companies listed on the Nasdaq stock exchange. Investors use it as a benchmark, comparing current prices with earlier ones to gauge market performance.

In this project, we are provided with two CSV files containing Nasdaq-100 stock information:
- _**nasdaq100.csv**_: contains information about companies in the index such as symbol, name, etc.
- _**nasdaq100_price_change.csv**_: contains price changes per stock across periods including (but not limited to) one day, five days, one month, six months, one year, etc.

We will leverage the OpenAI API to classify companies into sectors and produce a summary of sector and company performance for this year (2023).

# CSV with Nasdaq-100 stock data

The first few rows of each file, `nasdaq100.csv` and `nasdaq100_price_change.csv`, are shown below.

## nasdaq100.csv

```csv
symbol,name,headQuarter,dateFirstAdded,cik,founded
AAPL,Apple Inc.,"Cupertino, CA",,0000320193,1976-04-01
ABNB,Airbnb,"San Francisco, CA",,0001559720,2008-08-01
ADBE,Adobe Inc.,"San Jose, CA",,0000796343,1982-12-01
ADI,Analog Devices,"Wilmington, MA",,0000006281,1965-01-01
...
```

## nasdaq100_price_change.csv

```csv
symbol,1D,5D,1M,3M,6M,ytd,1Y,3Y,5Y,10Y,max
AAPL,-1.7254,-8.30086,-6.20411,3.042,15.64824,42.99992,8.47941,60.96299,245.42031,976.99441,139245.53954
ABNB,2.1617,-2.21919,9.88336,19.43286,19.64241,68.66902,23.64013,-1.04347,-1.04347,-1.04347,-1.04347
ADBE,0.5409,-1.77817,9.16191,52.0465,38.01522,57.22723,21.96206,17.83037,109.05718,1024.69214,251030.66399
ADI,0.9291,-4.03352,2.58486,3.65887,5.01602,17.02062,8.09735,63.42847,92.81874,286.77518,26012.63736
...
```
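
To illustrate how the two files fit together, the sketch below loads the sample rows shown above from in-memory strings (not the real files) and joins them on `symbol`. The outer join with `indicator=True` and the `dtype={"cik": str}` argument are assumptions added for this example and are not part of the project code: the first makes it easy to spot symbols that appear in only one file, and the second keeps the leading zeros of the `cik` column.

```python
import io

import pandas as pd

# Sample rows copied from the previews above (the real files contain more rows)
companies_csv = """symbol,name,headQuarter,dateFirstAdded,cik,founded
AAPL,Apple Inc.,"Cupertino, CA",,0000320193,1976-04-01
ABNB,Airbnb,"San Francisco, CA",,0001559720,2008-08-01
ADBE,Adobe Inc.,"San Jose, CA",,0000796343,1982-12-01
ADI,Analog Devices,"Wilmington, MA",,0000006281,1965-01-01
"""

prices_csv = """symbol,1D,5D,1M,3M,6M,ytd,1Y,3Y,5Y,10Y,max
AAPL,-1.7254,-8.30086,-6.20411,3.042,15.64824,42.99992,8.47941,60.96299,245.42031,976.99441,139245.53954
ABNB,2.1617,-2.21919,9.88336,19.43286,19.64241,68.66902,23.64013,-1.04347,-1.04347,-1.04347,-1.04347
ADBE,0.5409,-1.77817,9.16191,52.0465,38.01522,57.22723,21.96206,17.83037,109.05718,1024.69214,251030.66399
ADI,0.9291,-4.03352,2.58486,3.65887,5.01602,17.02062,8.09735,63.42847,92.81874,286.77518,26012.63736
"""

# Reading 'cik' as a string preserves its leading zeros
companies = pd.read_csv(io.StringIO(companies_csv), dtype={"cik": str})
prices = pd.read_csv(io.StringIO(prices_csv))

# An outer join with indicator=True adds a '_merge' column that records
# whether each symbol was found in both files or in only one of them
merged = companies.merge(
    prices[["symbol", "ytd"]], on="symbol", how="outer", indicator=True
)
unmatched = merged[merged["_merge"] != "both"]

print(merged[["symbol", "name", "ytd", "_merge"]])
print(f"Symbols missing from one of the files: {len(unmatched)}")
```

If `unmatched` is non-empty on the real data, an inner join would silently drop those symbols, so it may be worth checking before classification.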

```python
# Importing the necessary modules
from tenacity import retry, stop_after_attempt, wait_random_exponential
import pandas as pd
import openai  # legacy interface (openai<1.0), which provides openai.ChatCompletion


# Initializing the API key (a string, not a list; "KEY" is a placeholder)
openai.api_key = "KEY"

# Defining a wrapper that retries with random exponential backoff (up to 4 attempts),
# so a rate limit error on one of the many requests does not abort the run
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(4))
def backoff_comp(**kwargs):
    return openai.ChatCompletion.create(**kwargs)

# Reading in the two datasets
nasdaq100 = pd.read_csv("nasdaq100.csv")
price_change = pd.read_csv("nasdaq100_price_change.csv")

# Adding the 'ytd' column to nasdaq100 by joining on 'symbol'
nasdaq100 = nasdaq100.merge(price_change[["symbol", "ytd"]], on="symbol", how="inner")

# Previewing the combined dataset
print(nasdaq100.head())

# Creating a prompt to enrich nasdaq100 using OpenAI
prompt = '''Classify company {company} into one of the following sectors. Answer only with the sector name:
    Technology, Consumer Cyclical, Industrials, Utilities, Healthcare, Communication, Energy, Consumer Defensive, Real Estate, Financial.'''

# Looping through the NASDAQ companies (identified by ticker symbol)
for company in nasdaq100["symbol"]:
    # Create a response from ChatGPT by using the backoff function
    response = backoff_comp(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt.format(company=company)}],
        temperature=0.0,
    )
    # Storing the output as a variable called 'sector', without surrounding whitespace
    sector = response['choices'][0]['message']['content'].strip()

    # Adding the 'sector' for the corresponding company
    nasdaq100.loc[nasdaq100["symbol"] == company, "Sector"] = sector

# Counting the number of companies per sector
print(nasdaq100["Sector"].value_counts())

# Prompting to get stock recommendations
prompt = '''Provide summary information about Nasdaq-100 stock performance year to date (YTD),
recommending the three best sectors and three or more companies per sector.
Company data: {company_data}'''

# Serializing the relevant columns explicitly: formatting the DataFrame directly
# uses its display representation, which truncates long frames to a few rows
company_data = nasdaq100[["symbol", "name", "Sector", "ytd"]].to_csv(index=False)

# Getting the model response
response = backoff_comp(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt.format(company_data=company_data)}],
    temperature=0.0,
)

# Storing the output as a variable and printing the recommendations
stock_recommendations = response['choices'][0]['message']['content']
print(stock_recommendations)
```
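
Two follow-up checks may be useful here. Model output is free text, so a returned label might not match the allowed sector list exactly, and sector averages can be computed locally with pandas instead of relying on the model's arithmetic. The sketch below is only an illustration under stated assumptions: `ALLOWED_SECTORS`, `normalize_sector`, and the "Unknown" fallback are hypothetical names, the raw `Sector` strings are invented to show the cleaning step, and the `ytd` values are the four sample rows shown earlier.

```python
import pandas as pd

# The sector list from the classification prompt
ALLOWED_SECTORS = [
    "Technology", "Consumer Cyclical", "Industrials", "Utilities", "Healthcare",
    "Communication", "Energy", "Consumer Defensive", "Real Estate", "Financial",
]


def normalize_sector(raw: str) -> str:
    """Map a raw model answer onto the allowed list, or 'Unknown' if no match."""
    cleaned = raw.strip().rstrip(".").casefold()
    for sector in ALLOWED_SECTORS:
        if cleaned == sector.casefold():
            return sector
    return "Unknown"


# Hypothetical raw answers (invented for illustration) paired with sample YTD values
df = pd.DataFrame({
    "symbol": ["AAPL", "ABNB", "ADBE", "ADI"],
    "ytd": [42.99992, 68.66902, 57.22723, 17.02062],
    "Sector": ["Technology", "consumer cyclical.", " Technology\n", "Semiconductors"],
})
df["Sector"] = df["Sector"].map(normalize_sector)

# Company count and mean YTD change per sector, best-performing sector first
sector_summary = (
    df.groupby("Sector")["ytd"]
    .agg(companies="count", mean_ytd="mean")
    .sort_values("mean_ytd", ascending=False)
)
print(sector_summary)
```

A table like `sector_summary` could be passed to the summary prompt alongside the company data, so the model describes figures that were computed deterministically.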