## Enriching stock market data using Open AI API 

<p align="center">
    <img src="images/nasdaq100.png" width="450">
</p>

The Nasdaq-100 is a stock market index made up of 101 equity securities issued by 100 of the largest non-financial companies listed on the Nasdaq stock exchange. It helps investors compare stock prices with previous prices to determine market performance.

In this project you are provided with two CSV files containing Nasdaq-100 stock information:
- _**nasdaq100.csv**_: contains information about companies in the index such as symbol, name, etc.
- _n**asdaq100_price_change.csv**_: contains price changes per stock across periods including (but not limited to) one day, five days, one month, six months, one year, etc.

As an AI developer, you will leverage the OpenAI API to classify companies into sectors and produce a summary of sector and company performance for this year.

# CSV with Nasdaq-100 stock data

In this project, you have available two CSV files `nasdaq100.csv` and `nasdaq100_price_change.csv`.

## nasdaq100.csv

```py
symbol,name,headQuarter,dateFirstAdded,cik,founded
AAPL,Apple Inc.,"Cupertino, CA",,0000320193,1976-04-01
ABNB,Airbnb,"San Francisco, CA",,0001559720,2008-08-01
ADBE,Adobe Inc.,"San Jose, CA",,0000796343,1982-12-01
ADI,Analog Devices,"Wilmington, MA",,0000006281,1965-01-01
...
```

## nasdaq100_price_change.csv

```py
symbol,1D,5D,1M,3M,6M,ytd,1Y,3Y,5Y,10Y,max
AAPL,-1.7254,-8.30086,-6.20411,3.042,15.64824,42.99992,8.47941,60.96299,245.42031,976.99441,139245.53954
ABNB,2.1617,-2.21919,9.88336,19.43286,19.64241,68.66902,23.64013,-1.04347,-1.04347,-1.04347,-1.04347
ADBE,0.5409,-1.77817,9.16191,52.0465,38.01522,57.22723,21.96206,17.83037,109.05718,1024.69214,251030.66399
ADI,0.9291,-4.03352,2.58486,3.65887,5.01602,17.02062,8.09735,63.42847,92.81874,286.77518,26012.63736
...
```

## Before you start

In order to complete the project you will need to create a developer account with OpenAI and store your API key as an environment variable. Instructions for these steps are outlined below.

### Create a developer account with OpenAI

1. Go to the [API signup page](https://platform.openai.com/signup). 

2. Create your account (you'll need to provide your email address and your phone number).

<img src="images/openai-create-account.jpeg" width="200">

3. Go to the [API keys page](https://platform.openai.com/account/api-keys). 

4. Create a new secret key.

<img src="images/openai-new-secret-key.png" width="200">

5. **Take a copy of it**. (If you lose it, delete the key and create a new one.)

### Add a payment method

OpenAI sometimes provides free credits for the API, but it's not clear if that is worldwide or what the conditions are. You may need to add debit/credit card details. 

**The API costs [$0.002 / 1000 tokens](https://openai.com/pricing) for GPT-3.5-turbo. [1000 tokens is about 750 words](https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them). This project should cost less than 1 US cents (but if you rerun tasks, you will be charged every time).**

1. Go to the [Payment Methods page](https://platform.openai.com/account/billing/payment-methods).

2. Click Add payment method.

<img src="images/openai-add-payment-method.png" width="200">

3. Fill in your card details.

### Add an environmental variable with your OpenAI key

1. In Workspace, click on "Environment," in the left sidebar.

2. Click on the plus button next to "Environment variables" to add environment variables.

3. In the "Name" field, type "OPENAI_API_KEY". In the "Value" field, paste in your secret key.

<img src="images/workspace-env-var-details.png" width="500">

4. Click "Create", then you'll see the following pop-up window. Click "Connect," then wait 5-10 seconds for the kernel to restart, or restart it manually in the Run menu.

<img src="images/workspace-connect-integ.png" width="500">

In [56]:
# Start your code here!
import os
import pandas as pd
from openai import OpenAI

# Instantiate an API client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Continue coding here
# Use as many cells as you like

In [57]:
nasdaq100 = pd.read_csv('nasdaq100.csv')
nasdaq_100_price_change = pd.read_csv('nasdaq100_price_change.csv') 
nasdaq_100_price_change.head()

Unnamed: 0,symbol,1D,5D,1M,3M,6M,ytd,1Y,3Y,5Y,10Y,max
0,AAPL,-1.7254,-8.30086,-6.20411,3.042,15.64824,42.99992,8.47941,60.96299,245.42031,976.99441,139245.53954
1,ABNB,2.1617,-2.21919,9.88336,19.43286,19.64241,68.66902,23.64013,-1.04347,-1.04347,-1.04347,-1.04347
2,ADBE,0.5409,-1.77817,9.16191,52.0465,38.01522,57.22723,21.96206,17.83037,109.05718,1024.69214,251030.66399
3,ADI,0.9291,-4.03352,2.58486,3.65887,5.01602,17.02062,8.09735,63.42847,92.81874,286.77518,26012.63736
4,ADP,2.0589,2.35462,14.66581,16.40059,10.60546,5.53732,0.888943,81.76679,81.87224,248.4095,27613.11042


In [58]:
nasdaq_100_ytd = nasdaq100.merge(nasdaq_100_price_change, how = 'inner', on = 'symbol')

nasdaq_100_ytd.head()

Unnamed: 0,symbol,name,headQuarter,dateFirstAdded,cik,founded,1D,5D,1M,3M,6M,ytd,1Y,3Y,5Y,10Y,max
0,AAPL,Apple Inc.,"Cupertino, CA",,320193,1976-04-01,-1.7254,-8.30086,-6.20411,3.042,15.64824,42.99992,8.47941,60.96299,245.42031,976.99441,139245.53954
1,ABNB,Airbnb,"San Francisco, CA",,1559720,2008-08-01,2.1617,-2.21919,9.88336,19.43286,19.64241,68.66902,23.64013,-1.04347,-1.04347,-1.04347,-1.04347
2,ADBE,Adobe Inc.,"San Jose, CA",,796343,1982-12-01,0.5409,-1.77817,9.16191,52.0465,38.01522,57.22723,21.96206,17.83037,109.05718,1024.69214,251030.66399
3,ADI,Analog Devices,"Wilmington, MA",,6281,1965-01-01,0.9291,-4.03352,2.58486,3.65887,5.01602,17.02062,8.09735,63.42847,92.81874,286.77518,26012.63736
4,ADP,ADP,"Roseland, NJ",,8670,1949-01-01,2.0589,2.35462,14.66581,16.40059,10.60546,5.53732,0.888943,81.76679,81.87224,248.4095,27613.11042


In [59]:
nasdaq_100_with_ytd = nasdaq_100_ytd[['symbol','name','ytd']]

In [60]:
nasdaq_100_with_ytd['symbol'].count()

101

In [61]:
def assign_sector(symbol):
    response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user",
    "content": f"Assign {symbol} into the following categories, only provide the category in the list and do not provide any other response. <list>:           Technology, Consumer Cyclical, Industrials, Utilities, Healthcare, Communication, Energy, Consumer Defensive, Real Estate, Financial </list>"}]
    )
    return response.choices[0].message.content

nasdaq_100_with_ytd = nasdaq_100_with_ytd[['symbol','ytd']]

nasdaq_100_with_ytd['category'] = nasdaq_100_with_ytd['symbol'].apply(assign_sector)

In [62]:
nasdaq_100_with_ytd['category'].value_counts()

Technology                                                     53
Consumer Cyclical                                              15
Healthcare                                                     13
Consumer Defensive                                              5
Communication                                                   4
Industrials                                                     3
Energy                                                          2
Utilities                                                       1
Please provide the CDNS you would like to categorize.           1
Real Estate                                                     1
Please provide the Job Description (JD) for categorization.     1
Financial                                                       1
Please provide the TXN that you would like to categorize.       1
Name: category, dtype: int64

In [63]:
nasdaq_100_with_ytd.rename(columns = {'category':'sector'}, inplace= True)
nasdaq_100_with_ytd.head()

In [64]:
prompt = f'Here is the information about stocks in the nasdaq-100 along with their year-to-date performance and secotr {nasdaq_100_with_ytd}. Provide summary information about Nasdaq-100 stock performance YTD, recommending the three best sectors and three or more companies per sector'

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user",
    "content": prompt}]
)

               
stock_recommendations = response.choices[0].message.content

In [65]:
print(stock_recommendations)

To summarize the year-to-date (YTD) performance of stocks in the Nasdaq-100 and identify the best-performing sectors, we can analyze the data provided. Here’s the breakdown:

### Summary of Performance
- **Best Performing Sector:** This can be determined by the overall YTD performance of the stocks in each sector. Based on the YTD values provided:
  - **Technology:** This is likely to include the highest number of top-performing stocks due to the presence of high-growth companies.
  - **Consumer Cyclical:** Another strong sector showing good YTD growth.
  - **Communication:** May also have standout performers even if it doesn't have the same number of top performers as Technology.

### Three Best Sectors
1. **Technology**
   - **Key Companies:**
     - **AAPL:** YTD +42.99%
     - **ADBE:** YTD +57.22%
     - **ADI:** YTD +17.02%
     - **WDAY:** YTD +37.91%
     - **ZS:** YTD +31.90%
  
2. **Consumer Cyclical**
   - **Key Companies:**
     - **ABNB:** YTD +68.67%
     - *(Assuming the