# Introduction
In this project, I build an AI assistant for stock exchange trading. It uses two datasets: news articles about Apple and Apple's historical stock prices. I prepare the news data first, then the stock data, and use GPT-4o-mini as the assistant model.

However, this is a preliminary study. It must not be used for real stock trading, because the AI assistant's forecasts are not guaranteed to be accurate.

**I warn you: please do not use this system for stock market purchases.**

In [1]:
# Install the NewsAPI Python client
!pip install newsapi-python

Collecting newsapi-python
  Downloading newsapi_python-0.2.7-py2.py3-none-any.whl.metadata (1.2 kB)
Downloading newsapi_python-0.2.7-py2.py3-none-any.whl (7.9 kB)
Installing collected packages: newsapi-python
Successfully installed newsapi-python-0.2.7


# News Data
In this chapter, I prepare the news dataset. I fetch the news with NewsAPI, using an API key obtained from newsapi.org.

In [43]:
import requests
import pandas as pd
from bs4 import BeautifulSoup
from newsapi import NewsApiClient
from google.colab import userdata
from datetime import datetime, timedelta

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
from typing import Tuple

import yfinance as yf

from openai import OpenAI

In [19]:
end_day = datetime.today() - timedelta(days=1)   # yesterday (newest date)
start_day = end_day - timedelta(days=29)         # 29 days earlier (oldest date)

# Convert both dates to Year-Month-Day strings
start_day_str = start_day.strftime('%Y-%m-%d')
end_day_str = end_day.strftime('%Y-%m-%d')

print("Yesterday:", end_day_str)
print("29 Days Ago:", start_day_str)

Yesterday: 2025-06-17
29 Days Ago: 2025-05-19
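
The window arithmetic above can be wrapped in a small helper that always returns the older date first, which is the order NewsAPI uses for its `from` and `to` bounds. This is only a minimal sketch: the function name `news_window` and its parameters are illustrative and are not part of the notebook.

```python
from datetime import date, timedelta
from typing import Tuple


def news_window(today: date, span_days: int = 29) -> Tuple[str, str]:
    """Return (oldest, newest) dates as YYYY-MM-DD strings.

    `newest` is the day before `today`; `oldest` lies `span_days` before it.
    """
    newest = today - timedelta(days=1)
    oldest = newest - timedelta(days=span_days)
    return oldest.strftime("%Y-%m-%d"), newest.strftime("%Y-%m-%d")


oldest, newest = news_window(date(2025, 6, 18))
print(oldest, newest)  # 2025-05-19 2025-06-17
```

Passing the date in explicitly, instead of calling `datetime.today()` inside the helper, keeps the result reproducible when the notebook is re-run later.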


In [5]:
# Download the news articles
newsapi = NewsApiClient(api_key=userdata.get("NEWS_API"))

all_articles = newsapi.get_everything(q='apple',
                                      from_param=start_day_str,
                                      to=end_day_str,
                                      language='en')

In [6]:
all_articles['articles']

[{'source': {'id': 'the-verge', 'name': 'The Verge'},
  'author': 'Jay Peters',
  'title': 'Apple’s new Games app lets you challenge your friends',
  'description': 'Apple is launching a new app that acts as a central hub for the games and gaming features across its platforms. The new Apple Games app combines Apple Arcade, App Store game recommendations, your App Store game library, and your friends list into a single loc…',
  'url': 'https://www.theverge.com/news/678319/apple-games-app-wwdc-2025',
  'urlToImage': 'https://platform.theverge.com/wp-content/uploads/sites/2/2025/06/Apple-WWDC25-iPadOS-26-Games-app-250609_big.jpg.large_2x.jpg?quality=90&strip=all&crop=0%2C13.350785340314%2C100%2C73.298429319372&w=1200',
  'publishedAt': '2025-06-09T18:09:03Z',
  'content': 'Another gaming initiative from Apple thats meant to be a central hub for players.\r\nAnother gaming initiative from Apple thats meant to be a central hub for players.\r\nApple is launching a new app that… [+1675 chars]'

In [9]:
dataset = []
counter = 0
for idx, article in enumerate(all_articles['articles']):
  url = article["url"]
  try:
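    # A timeout keeps one unresponsive site from blocking the whole loop
    response = requests.get(url, headers = {'User-Agent': 'Mozilla/5.0'}, timeout=10)
    soup = BeautifulSoup(response.text, "html.parser")

    # Join the text of all <p> tags into a single article body
    paragraphs = soup.find_all("p")
    content = "\n".join([p.text for p in paragraphs])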

  except Exception as e:
    content = f"ERROR: {e}"

  dataset.append({
      "index": idx,
      "population_date" : article["publishedAt"].split("T")[0],
      "source" : article["source"]["name"],
      "author" : article["author"],
      "title" : article["title"],
      "content" : content
  })
df = pd.DataFrame(dataset)

In [10]:
df.head()

Unnamed: 0,index,population_date,source,author,title,content
0,0,2025-06-09,The Verge,Jay Peters,Apple’s new Games app lets you challenge your ...,Another gaming initiative from Apple that’s me...
1,1,2025-05-28,The Verge,Dominic Preston,Apple’s DIY repair program now covers iPads,"Recent iPad, mini, Air, and Pro models are now..."
2,2,2025-05-27,The Verge,"Wes Davis, Jay Peters",Apple is ready to replace Game Center with a m...,"The company has also acquired RAC7, the develo..."
3,3,2025-06-09,The Verge,Emma Roth,Apple WWDC 2025: the 13 biggest announcements,Big changes are in store across Apple’s platfo...
4,4,2025-06-12,The Verge,Jay Peters,Apple’s upgraded Siri might not arrive until n...,﻿It could launch with iOS 26.4.\n﻿It could lau...


In [12]:
hatali_satirlar = df[df['content'].str.startswith('ERROR:')]
print(f"Number of incorrect content: {len(hatali_satirlar)}")

Number of incorrect content: 0
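
The check above only catches requests that raised an exception. `requests.get` does not raise on HTTP error statuses such as 403 or 404, and a page without `<p>` tags yields an empty string, so unusable rows can still pass it. The sketch below flags those cases as well. The helper name `is_unusable` and the 200-character threshold are assumptions chosen for illustration, not values from the notebook.

```python
from typing import Optional


def is_unusable(content: Optional[str], min_chars: int = 200) -> bool:
    """Flag scraped text that is missing, an error marker, or suspiciously short."""
    if content is None:
        return True
    text = content.strip()
    return text.startswith("ERROR:") or len(text) < min_chars


samples = [
    "ERROR: HTTPSConnectionPool(host='example.com', port=443)",  # request raised
    "",                                                          # page had no <p> tags
    "Access denied",                                             # block page body
    "Apple is launching a new app. " * 20,                       # plausible article text
]
print([is_unusable(s) for s in samples])  # [True, True, True, False]
```

On the DataFrame this could be applied as `df["content"].apply(is_unusable)` to count or drop the flagged rows.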


## News Sentiment
In this chapter, I run sentiment analysis on the news dataset. I use the ProsusAI/finbert model from Hugging Face to classify the sentiment of each article.

In [14]:
device = "cuda:0" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
model = AutoModelForSequenceClassification.from_pretrained("ProsusAI/finbert").to(device)
labels = ["positive", "negative", "neutral"]

tokenizer_config.json:   0%|          | 0.00/252 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/758 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

pytorch_model.bin:   0%|          | 0.00/438M [00:00<?, ?B/s]

In [15]:
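def estimate_sentiment(news):
    """Return (probability, label) of the most likely FinBERT sentiment class."""
    if news:
        # Texts longer than BERT's 512-token limit are truncated
        tokens = tokenizer(news, return_tensors="pt", padding=True, truncation=True, max_length=512).to(device)

        # Inference only, so gradient tracking is disabled
        with torch.no_grad():
            result = model(tokens["input_ids"], attention_mask=tokens["attention_mask"])[
                "logits"
            ]
        result = torch.nn.functional.softmax(torch.sum(result, 0), dim=-1)
        probability = result[torch.argmax(result)]
        sentiment = labels[torch.argmax(result)]
        return probability, sentiment
    else:
        # Return a tensor so that callers can use .item() in both branches
        return torch.tensor(0.0), labels[-1]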
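
The last step of `estimate_sentiment` sums the logits over the batch dimension, applies a softmax, and reports the largest probability together with its label. The plain-Python sketch below reproduces only that step, without PyTorch or the model. The helper `softmax_label` and the example logits are hypothetical; the label order is the one used in `labels` above.

```python
import math
from typing import List, Sequence, Tuple

LABELS = ["positive", "negative", "neutral"]  # same order as `labels` above


def softmax_label(logit_rows: Sequence[Sequence[float]]) -> Tuple[float, str]:
    """Sum logits over rows, apply softmax, return (top probability, top label)."""
    summed = [sum(column) for column in zip(*logit_rows)]
    peak = max(summed)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in summed]
    total = sum(exps)
    probs: List[float] = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return probs[best], LABELS[best]


# Hypothetical logits for one article (a batch of size 1).
probability, sentiment = softmax_label([[1.0, 2.0, 3.0]])
print(round(probability, 4), sentiment)  # 0.6652 neutral
```

Because each call passes a single article, the batch has one row and the sum leaves the logits unchanged. It would only matter if several texts were passed at once.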

In [16]:
for idx, row in df.iterrows():
  probability, sentiment = estimate_sentiment(row["content"])
  df.loc[idx, "probability"] = probability.item()
  df.loc[idx, "sentiment"] = sentiment

In [17]:
df.head()

Unnamed: 0,index,population_date,source,author,title,content,probability,sentiment
0,0,2025-06-09,The Verge,Jay Peters,Apple’s new Games app lets you challenge your ...,Another gaming initiative from Apple that’s me...,0.901312,neutral
1,1,2025-05-28,The Verge,Dominic Preston,Apple’s DIY repair program now covers iPads,"Recent iPad, mini, Air, and Pro models are now...",0.934646,neutral
2,2,2025-05-27,The Verge,"Wes Davis, Jay Peters",Apple is ready to replace Game Center with a m...,"The company has also acquired RAC7, the develo...",0.935417,neutral
3,3,2025-06-09,The Verge,Emma Roth,Apple WWDC 2025: the 13 biggest announcements,Big changes are in store across Apple’s platfo...,0.893284,neutral
4,4,2025-06-12,The Verge,Jay Peters,Apple’s upgraded Siri might not arrive until n...,﻿It could launch with iOS 26.4.\n﻿It could lau...,0.931718,neutral


In [18]:
df["sentiment"].value_counts()

sentiment,count
neutral,96
negative,4


None of the articles in this dataset was classified as positive.
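
To make the class imbalance explicit, including the class that never occurs, the counts can be turned into shares. This is a small sketch using only the standard library. The helper name `sentiment_shares` is illustrative, and the input simply replays the counts reported in the output above.

```python
from collections import Counter
from typing import Dict, Iterable

CLASSES = ("positive", "negative", "neutral")


def sentiment_shares(sentiments: Iterable[str]) -> Dict[str, float]:
    """Return the share of each class, keeping classes that never occur as 0.0."""
    counts = Counter(sentiments)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in CLASSES}
    return {name: counts.get(name, 0) / total for name in CLASSES}


# Counts taken from the value_counts() output above.
observed = ["neutral"] * 96 + ["negative"] * 4
print(sentiment_shares(observed))
# {'positive': 0.0, 'negative': 0.04, 'neutral': 0.96}
```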

# Apple Stock Data
In this chapter, I prepare the Apple stock dataset.

In [38]:
end_day = datetime.today() - timedelta(days=1)          # Yesterday
start_day = end_day - timedelta(days=365)               # One year ago

# Convert to Year-Month-Day strings
start_day_str = start_day.strftime('%Y-%m-%d')
end_day_str = end_day.strftime('%Y-%m-%d')

print("One year ago:", start_day_str)
print("Yesterday:", end_day_str)

One year ago: 2024-06-17
Yesterday: 2025-06-17


In [39]:
apple = yf.Ticker("AAPL")

hist = apple.history(start=start_day_str, end=end_day_str, interval="1d")
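
One detail worth keeping in mind: to my understanding, yfinance treats the `end` argument as exclusive, so the day passed as `end` is not part of the result. If a specific last day should be covered, the `end` string has to be one day later. The helper below sketches that adjustment; the name `exclusive_end` is an assumption made for illustration.

```python
from datetime import date, timedelta


def exclusive_end(last_day_wanted: date) -> str:
    """Return the `end` string that makes `last_day_wanted` the final day covered."""
    return (last_day_wanted + timedelta(days=1)).strftime("%Y-%m-%d")


print(exclusive_end(date(2025, 6, 17)))  # 2025-06-18
```

It could be used as `apple.history(start=start_day_str, end=exclusive_end(end_day.date()), interval="1d")`.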

In [40]:
hist

Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
2024-06-17 00:00:00-04:00,212.376781,217.930808,211.729813,215.661423,93728300,0.0,0.0
2024-06-18 00:00:00-04:00,216.577122,217.612289,212.008492,213.292480,79943300,0.0,0.0
2024-06-20 00:00:00-04:00,212.934166,213.242735,207.877826,208.703949,86172500,0.0,0.0
2024-06-21 00:00:00-04:00,209.410665,210.903682,206.145934,206.524170,246421400,0.0,0.0
2024-06-24 00:00:00-04:00,206.753084,211.709899,205.628339,207.171127,80727000,0.0,0.0
...,...,...,...,...,...,...,...
2025-06-10 00:00:00-04:00,200.600006,204.350006,200.570007,202.669998,54672600,0.0,0.0
2025-06-11 00:00:00-04:00,203.500000,204.500000,198.410004,198.779999,60989900,0.0,0.0
2025-06-12 00:00:00-04:00,199.080002,199.679993,197.360001,199.199997,43904600,0.0,0.0
2025-06-13 00:00:00-04:00,199.729996,200.369995,195.699997,196.449997,51447300,0.0,0.0


In [41]:
hist.reset_index(inplace = True)
hist.drop(columns=["Dividends", "Stock Splits"], inplace=True)

In [42]:
hist

Unnamed: 0,Date,Open,High,Low,Close,Volume
0,2024-06-17 00:00:00-04:00,212.376781,217.930808,211.729813,215.661423,93728300
1,2024-06-18 00:00:00-04:00,216.577122,217.612289,212.008492,213.292480,79943300
2,2024-06-20 00:00:00-04:00,212.934166,213.242735,207.877826,208.703949,86172500
3,2024-06-21 00:00:00-04:00,209.410665,210.903682,206.145934,206.524170,246421400
4,2024-06-24 00:00:00-04:00,206.753084,211.709899,205.628339,207.171127,80727000
...,...,...,...,...,...,...
245,2025-06-10 00:00:00-04:00,200.600006,204.350006,200.570007,202.669998,54672600
246,2025-06-11 00:00:00-04:00,203.500000,204.500000,198.410004,198.779999,60989900
247,2025-06-12 00:00:00-04:00,199.080002,199.679993,197.360001,199.199997,43904600
248,2025-06-13 00:00:00-04:00,199.729996,200.369995,195.699997,196.449997,51447300


# AI Assistant
In this chapter, I prepare the AI assistant, which uses GPT-4o-mini.

## Preparing News Data for the Model
In this chapter, I drop some columns, because the full dataset would be too much data to send to the LLM.
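
To get a rough sense of how much each column would add to the prompt, character counts can be converted into an approximate token count. The sketch below uses the common rule of thumb of about four characters per token, which is only a heuristic and not the tokenizer's real count. The function name and the miniature rows are hypothetical.

```python
from typing import Dict, List

CHARS_PER_TOKEN = 4  # rough rule of thumb, NOT an exact tokenizer count


def estimate_tokens_per_column(rows: List[Dict[str, object]]) -> Dict[str, int]:
    """Estimate how many tokens each column would add to a prompt."""
    totals: Dict[str, int] = {}
    for row in rows:
        for column, value in row.items():
            totals[column] = totals.get(column, 0) + len(str(value))
    return {column: chars // CHARS_PER_TOKEN for column, chars in totals.items()}


# Hypothetical miniature of the news table; real article bodies vary in length.
rows = [
    {"title": "Apple WWDC 2025", "content": "x" * 4000, "sentiment": "neutral"},
    {"title": "Apple TV+ news", "content": "y" * 6000, "sentiment": "neutral"},
]
print(estimate_tokens_per_column(rows))
# {'title': 7, 'content': 2500, 'sentiment': 3}
```

With the real data, `estimate_tokens_per_column(df.to_dict("records"))` would show which columns dominate the prompt size.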

In [48]:
df

Unnamed: 0,index,population_date,source,author,title,content,probability,sentiment
0,0,2025-06-09,The Verge,Jay Peters,Apple’s new Games app lets you challenge your ...,Another gaming initiative from Apple that’s me...,0.901312,neutral
1,1,2025-05-28,The Verge,Dominic Preston,Apple’s DIY repair program now covers iPads,"Recent iPad, mini, Air, and Pro models are now...",0.934646,neutral
2,2,2025-05-27,The Verge,"Wes Davis, Jay Peters",Apple is ready to replace Game Center with a m...,"The company has also acquired RAC7, the develo...",0.935417,neutral
3,3,2025-06-09,The Verge,Emma Roth,Apple WWDC 2025: the 13 biggest announcements,Big changes are in store across Apple’s platfo...,0.893284,neutral
4,4,2025-06-12,The Verge,Jay Peters,Apple’s upgraded Siri might not arrive until n...,﻿It could launch with iOS 26.4.\n﻿It could lau...,0.931718,neutral
...,...,...,...,...,...,...,...,...
95,95,2025-06-09,CNET,Macy Meyer,"iPhone's Phone App Is Getting a Major Upgrade,...",\n Apple is finally upgrading the Phone app...,0.826602,neutral
96,96,2025-05-30,The Verge,Alex Heath,OpenAI wants ChatGPT to be a ‘super assistant’...,An internal strategy document lays out OpenAI’...,0.906583,neutral
97,97,2025-06-06,MacRumors,Joe Rossignol,Apple TV+ Announces MLB Friday Night Baseball ...,Apple and Major League Baseball this week anno...,0.914831,neutral
98,98,2025-06-12,MacRumors,Joe Rossignol,Take a Break From WWDC 2025 With Apple's Chill...,"It is day four of WWDC 2025 week, and the dust...",0.921819,neutral


In [50]:
df.drop(columns=["index", "content", "probability"], inplace = True)

In [51]:
df.head()

Unnamed: 0,population_date,source,author,title,sentiment
0,2025-06-09,The Verge,Jay Peters,Apple’s new Games app lets you challenge your ...,neutral
1,2025-05-28,The Verge,Dominic Preston,Apple’s DIY repair program now covers iPads,neutral
2,2025-05-27,The Verge,"Wes Davis, Jay Peters",Apple is ready to replace Game Center with a m...,neutral
3,2025-06-09,The Verge,Emma Roth,Apple WWDC 2025: the 13 biggest announcements,neutral
4,2025-06-12,The Verge,Jay Peters,Apple’s upgraded Siri might not arrive until n...,neutral


## AI Assistant

In [44]:
# Create the OpenAI client for GPT-4o-mini; the API key is read from Colab secrets
openai_api_key = userdata.get("OPENAI_API_KEY")

openai = OpenAI(api_key = openai_api_key)

OPENAI_MODEL = "gpt-4o-mini"

In [45]:
news_data_information = """
news_data columns information:
population_date : Date of the news
source : Source of the news shared
author : Author of the news
title : headline of the news
content : Content of the news
sentiment : sentiment analysis result of the news. News sentiment classes are "positive", "negative", "neutral".
"""

In [46]:
hist_data_information = """
Apple Stock price data columns information:
Date: The trading date of the stock (typically in YYYY-MM-DD format).
Open: The price at which the stock first traded when the market opened on that day.
High: The highest price at which the stock was traded during the day.
Low: The lowest price at which the stock was traded during the day.
Close: The final price at which the stock was traded when the market closed.
Volume: The total number of shares traded on that particular day.
"""

In [47]:
# System prompt for my llm model
system_prompt = """You are a financial time series prediction assistant. Your task is to analyze and use both quantitative stock data and qualitative news content to forecast the next 14 days of Apple Inc.'s closing stock prices.

You are provided with:

Historical stock data for Apple Inc., including daily values such as open, high, low, close, volume, and adjusted versions of these metrics. This data spans the past year.

News data, containing headlines, summaries, or full-text articles, each tagged with a date. The news content may include economic indicators, company-specific announcements, geopolitical events, or other relevant sentiment-rich information.

Instructions:

Correlate the stock market movements with related news events. Consider market sentiment, major events, and timing.

Use recent patterns from the stock data and the tone of the latest news to estimate future behavior.

Focus your prediction on the close price.

Your prediction horizon is 14 consecutive trading days after the latest date in the provided datasets.

If relevant, you may apply time series reasoning, pattern recognition, or natural language sentiment inference.

Your output must be a list or table of 14 dates with their corresponding predicted close prices.\n
"""
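
The system prompt asks for 14 consecutive trading days after the latest date in the data. A simple way to list candidate dates is to skip weekends, as sketched below. This is an approximation under stated assumptions: exchange holidays are not removed, the helper name `next_weekdays` is illustrative, and the start date in the example is only a sample value.

```python
from datetime import date, timedelta
from typing import List


def next_weekdays(last_date: date, count: int = 14) -> List[date]:
    """Return the next `count` Monday-to-Friday dates after `last_date`.

    Exchange holidays are NOT removed here; that would need a market calendar.
    """
    days: List[date] = []
    current = last_date
    while len(days) < count:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            days.append(current)
    return days


horizon = next_weekdays(date(2025, 6, 16))
print(horizon[0], horizon[-1])  # 2025-06-17 2025-07-04
```

A list like this can also serve as a check that the dates returned by the model fall on plausible trading days.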

In [53]:
# Function that builds the user prompt
def user_prompt(news_data, apple_hist_data):
    # Serialize the full DataFrames as text (no row sampling is applied)
    news_sample = news_data.to_string(index=False)
    stock_sample = apple_hist_data.to_string(index=False)

    prompt = f"""
You are provided with two pandas DataFrames for predicting Apple's future stock prices.

1. `apple_hist_data`: Historical daily stock data for Apple Inc. over the past year.
It includes columns like:
{', '.join(apple_hist_data.columns.tolist())}

Here is a sample:
{stock_sample}

2. `news_data`: Daily financial and Apple-related news.
It includes:
{', '.join(news_data.columns.tolist())}

Here is a sample:
{news_sample}

Your task is:
- Analyze the historical stock data for trends and patterns.
- Use the news content to understand recent events, sentiment, or other external signals.
- Based on these two data sources, **predict the next 14 days of Apple’s `close` stock price**.
- Your prediction should be in the form of a list or table, with each entry showing the predicted date and corresponding `close` price.

Assume that both datasets are cleaned and aligned by date. Use your reasoning to integrate numerical and textual information.
"""

    return prompt

In [55]:
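# Build the user prompt. Note: this rebinds the name `user_prompt` from the
# function to its string result, so re-running this cell raises a TypeError.
user_prompt = user_prompt(df, hist)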

In [59]:
# The model uses user_prompt and system_prompt to predict the close prices
completion = openai.chat.completions.create(
    model='gpt-4o-mini',
    messages= [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
    ],
)
print(completion.choices[0].message.content)

To forecast the next 14 days of Apple Inc.'s closing stock prices, we will utilize both historical stock data and relevant news content to derive a prediction. Let’s break down this process systematically:

### Step 1: Analyze Historical Stock Data

1. **General Trend**: We observe from the historical stock data that Apple Inc.'s closing prices have been extremely volatile over the last month. The company experienced a drop below $200 but has rebounded sharply thereafter.

2. **Recent Performance**: In the last week of data, Apple has had a fluctuating yet overall upward trend post a significant drop. The volatility suggests responses to both market conditions and internal events (like product announcements).

3. **Volatility**: The stock fluctuated significantly, suggesting a high degree of uncertainty. The last recorded closing price was $198.419998.

### Step 2: Analyze News Sentiment

1. **Latest Events**: The news records are rich with sentiment but predominantly neutral. Events s
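
The reply is free-form text, so the predicted prices have to be extracted before they can be compared with real closing prices. The sketch below pulls (date, price) pairs out of such text with a regular expression. It assumes each prediction appears on one line as an ISO date followed by a price; the real reply may be formatted differently. The helper name, the pattern, and the sample fragment are all hypothetical.

```python
import re
from typing import List, Tuple

# An ISO date followed, later on the same line, by a price such as 198.42 or $198.42.
ROW_PATTERN = re.compile(r"(\d{4}-\d{2}-\d{2})[^\d\n]+?\$?(\d+(?:\.\d+)?)")


def parse_predictions(reply: str) -> List[Tuple[str, float]]:
    """Extract (date, close) pairs from free-form model text."""
    return [(day, float(price)) for day, price in ROW_PATTERN.findall(reply)]


# Hypothetical reply fragment; the real model output may be formatted differently.
sample_reply = """
| Date       | Predicted Close |
|------------|-----------------|
| 2025-06-17 | $198.50         |
| 2025-06-18 | 199.10          |
"""
predictions = parse_predictions(sample_reply)
print(predictions)  # [('2025-06-17', 198.5), ('2025-06-18', 199.1)]
```

Checking `len(predictions) == 14` before using the result is a cheap way to detect a reply that did not follow the requested format.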

# Conclusion
We have come to the end of this project. I created an AI assistant for exchange trading that forecasts Apple's closing stock price. However, this is a preliminary study. It must not be used for real stock trading, because the AI assistant's forecasts are not guaranteed to be accurate.

**I warn you: please do not use this system for stock market purchases.**

Thank you for taking a look at my project. I will continue to share projects. If you would like to hear about them early, you can follow me through the links below.

[LinkedIn](https://www.linkedin.com/in/ihsancenkiz/)<br>
[GitHub](https://github.com/ihsncnkz)<br>
[Kaggle](https://www.kaggle.com/ihsncnkz)