# Google TimesFM model for predicting today's Tesla stock price (yejin)

- TimesFM (Time Series Foundation Model)
- Reference: https://huggingface.co/google/timesfm-1.0-200m

* Used Python 3.11.11 in a Jupyter kernel on my Mac

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
from time import time
from datetime import date
import warnings
warnings.filterwarnings("ignore")

In [3]:
import timesfm
# For Torch
tfm = timesfm.TimesFm(
      hparams=timesfm.TimesFmHparams(
          backend="gpu",
          per_core_batch_size=32,
          horizon_len=128,
          num_layers=50,
          use_positional_embedding=False,
          context_len=2048,
      ),
      checkpoint=timesfm.TimesFmCheckpoint(
          huggingface_repo_id="google/timesfm-2.0-500m-pytorch"),
  )

 See https://github.com/google-research/timesfm/blob/master/README.md for updated APIs.
Loaded PyTorch TimesFM, likely because python version is 3.11.11 (main, Jan 27 2025, 15:58:08) [Clang 16.0.0 (clang-1600.0.26.6)].


Fetching 5 files:   0%|          | 0/5 [00:00<?, ?it/s]

In [4]:
# For Torch
tfm = timesfm.TimesFm(
      hparams=timesfm.TimesFmHparams(
          backend="gpu",
          per_core_batch_size=32,
          horizon_len=128,
      ),
      checkpoint=timesfm.TimesFmCheckpoint(
          huggingface_repo_id="google/timesfm-1.0-200m-pytorch"),
  )

Fetching 3 files:   0%|          | 0/3 [00:00<?, ?it/s]

In [5]:
# 1. Fetch stock data for TSLA
GetTSLA = yf.Ticker("TSLA")
df=GetTSLA.history(period="max")
df.reset_index(inplace=True)  # Reset index to make "Date" a column
df.head()
df.tail()

Unnamed: 0,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
3703,2025-03-19 00:00:00-04:00,231.610001,241.410004,229.199997,235.860001,111993800,0.0,0.0
3704,2025-03-20 00:00:00-04:00,233.350006,238.0,230.050003,236.259995,99028300,0.0,0.0
3705,2025-03-21 00:00:00-04:00,234.990005,249.520004,234.550003,248.710007,132728700,0.0,0.0
3706,2025-03-24 00:00:00-04:00,258.079987,278.640015,256.329987,278.390015,169079900,0.0,0.0
3707,2025-03-25 00:00:00-04:00,283.600006,288.200012,271.279999,288.140015,149151000,0.0,0.0


In [6]:
df_filtered = df[(df['Date'] >= '2023-01-01') & (df['Date'] <= '2024-12-31')]

# Print the length
print("Total time series length (business days):", len(df_filtered))

### Autocorrelation Analysis for Period Selection

To determine the appropriate sampling periods for the time series model, I analyzed the **Autocorrelation Function (ACF)** of the data. I generated ACF plots for both the **original stock price data** and the **differenced version** to observe any significant lags or periodic patterns.

Based on the results, I tested multiple sampling intervals (e.g., **5-day, 7-day, and 10-day** intervals) to find an optimal resampling frequency for the model input.

Since the **TimesFM model does not require stationarity**, I decided to proceed with the **original (non-differenced) data** while keeping the selected sampling periods.
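
To make the ACF comparison concrete, here is a minimal sketch that computes the sample autocorrelation by hand on a synthetic random walk. The helper `sample_acf` and the synthetic series are assumptions for illustration only; they are not part of the notebook's data or of statsmodels. The sketch only shows why a price-like series tends to have slowly decaying autocorrelation while its first difference does not.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()              # center the series
    denom = np.dot(x, x)          # lag-0 sum of squares
    acf = [1.0]
    for k in range(1, max_lag + 1):
        acf.append(np.dot(x[:-k], x[k:]) / denom)
    return np.array(acf)

# Synthetic random walk standing in for a price series (not real TSLA data)
rng = np.random.default_rng(0)
walk = 100 + np.cumsum(rng.normal(size=2000))

acf_raw = sample_acf(walk, max_lag=10)            # original series
acf_diff = sample_acf(np.diff(walk), max_lag=10)  # differenced once

print("lag-1 ACF, original:   ", round(acf_raw[1], 3))
print("lag-1 ACF, differenced:", round(acf_diff[1], 3))
```

On a series like this, the original values stay highly correlated across many lags, while the differenced series shows almost no autocorrelation, which is the pattern the two ACF plots are meant to reveal.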


In [None]:
from statsmodels.graphics.tsaplots import plot_acf

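price = df["Close"]
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 4))
price_diff = price.diff().dropna()  # First difference of the closing price

plot_acf(price, lags=50, ax=ax1)  # Autocorrelation of the original series
ax1.set_title("Original series")
plot_acf(price_diff, lags=50, ax=ax2)  # Autocorrelation after differencing
ax2.set_title("Differenced once")
plt.show()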

In [None]:
# To check if the number of samples is enough
for days in [5, 7, 10, 14, 20]:
    print(f"Sampling every {days} days: {len(price.values[::days])} samples remaining")

In [8]:
# 2. Prepare input data (resample and ensure 1D)
forecast_input = [
    np.array(price.values[::5]).flatten(),   # Every 5th day
    np.array(price.values[::7]).flatten(),  # Every 7th day
    np.array(price.values[::10]).flatten(),  # Every 10th day
]

# 3. Ensure all arrays have the same length
min_length = min(len(arr) for arr in forecast_input)
forecast_input = [arr[:min_length] for arr in forecast_input]

### Determining the frequency input
In particular regarding the frequency, TimesFM expects a categorical indicator valued in {0, 1, 2}:

- 0 (default): high frequency, long horizon time series. We recommend using this for time series up to daily granularity.
- 1: medium frequency time series. We recommend using this for weekly and monthly data.
- 2: low frequency, short horizon time series. We recommend using this for anything beyond monthly, e.g. quarterly or yearly.

This categorical value should be provided directly with the array inputs. For dataframe inputs, the conventional letter coding of frequencies is converted to the expected categories as follows:

- 0: T, MIN, H, D, B, U
- 1: W, M
- 2: Q, Y
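
The mapping above can be sketched as a small lookup. The name `freq_to_category` and the dictionary are hypothetical helpers written for this note, not the library's own function; they only restate the table so that a frequency list can be built for array inputs.

```python
# Hypothetical lookup restating the letter-code table above
FREQ_TO_CATEGORY = {
    "T": 0, "MIN": 0, "H": 0, "D": 0, "B": 0, "U": 0,  # up to daily granularity
    "W": 1, "M": 1,                                      # weekly and monthly
    "Q": 2, "Y": 2,                                      # quarterly and yearly
}

def freq_to_category(freq: str) -> int:
    """Return the TimesFM frequency category (0, 1, or 2) for a letter code."""
    code = freq.upper()
    if code not in FREQ_TO_CATEGORY:
        raise ValueError(f"Unknown frequency code: {freq!r}")
    return FREQ_TO_CATEGORY[code]

# One category per input series, e.g. three daily-granularity series
categories = [freq_to_category(f) for f in ["D", "B", "D"]]
print(categories)
```

Array inputs need one category per series, so a list like `categories` should have the same length as the list of input arrays.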

In [9]:
# 4. Frequency input: one category per series in forecast_input (0 = high frequency)
frequency_input = [0] * len(forecast_input)

In [None]:
# 5. Build input dataframe for TimesFM
# Create the dataframe in the format required by TimesFM
input_df = pd.DataFrame({
    "unique_id": ["TSLA"] * len(df),  # Use "TSLA" as the unique identifier
    "ds": df["Date"],                # Date column
    "y": df["Close"]                 # Closing price
})

# Ensure y values are 1-dimensional
# Flatten y to ensure it is a 1D array
input_df["y"] = input_df["y"].values.flatten()


# 6. Forecast using TimesFM
# Pass the input dataframe to the TimesFM model for forecasting
forecast_df = tfm.forecast_on_df(
    inputs=input_df,  # Input dataframe
    freq="D",         # Daily frequency
    value_name="y",   # The column to predict
    num_jobs=-1       # Use all available cores for parallel processing
)

# 7. Display forecast results
print(forecast_df)


# TimesFM Forecast Interpretation
The forecast results include multiple columns that represent the predicted values and their uncertainty ranges. Here's how to interpret the key columns:


- timesfm: Median prediction (the central estimate).

- timesfm-q-0.1: 10% quantile prediction (lower end of the uncertainty range).

- timesfm-q-0.5: 50% quantile prediction (equivalent to the timesfm median).

- timesfm-q-0.9: 90% quantile prediction (upper end of the uncertainty range, if available).


Example interpretation for a given date (e.g., 2025-01-28):

- Median prediction (timesfm): 393.19 → The central estimate of the stock price.

- 10% quantile (timesfm-q-0.1): 371.13 → Lower bound of the range.

- 50% quantile (timesfm-q-0.5): 393.19 → The central prediction (same as timesfm).

- 90% quantile (timesfm-q-0.9): Upper bound of the range (if present).
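
To show how these columns can be read together, here is a minimal sketch on a tiny made-up table. All numbers and the `actual` series are invented for illustration and are not model output; only the column names follow the forecast dataframe described above.

```python
import pandas as pd

# Made-up forecast rows using the same column names as forecast_df
forecast = pd.DataFrame({
    "ds": pd.to_datetime(["2025-01-02", "2025-01-03", "2025-01-06"]),
    "timesfm": [100.0, 101.0, 102.0],
    "timesfm-q-0.1": [95.0, 94.0, 93.0],
    "timesfm-q-0.9": [105.0, 108.0, 111.0],
})

# Made-up actual prices to compare against the interval
actual = pd.Series([97.0, 110.0, 92.0])

# Width of the 10%-90% interval: a wider band means more uncertainty
forecast["band_width"] = forecast["timesfm-q-0.9"] - forecast["timesfm-q-0.1"]

# Whether each actual price falls inside the 10%-90% interval
forecast["inside_band"] = actual.between(
    forecast["timesfm-q-0.1"], forecast["timesfm-q-0.9"]
)

# Share of days on which the actual price was inside the interval
coverage = forecast["inside_band"].mean()
print(forecast[["ds", "band_width", "inside_band"]])
print("coverage:", coverage)
```

If the quantiles were well calibrated, roughly 80% of actual prices would be expected to land inside a 10%-90% interval over a long enough test period.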

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(12,4))
plt.plot(forecast_df["ds"], forecast_df["timesfm"], label="Median Prediction", color="blue")
plt.fill_between(
    forecast_df["ds"],
    forecast_df["timesfm-q-0.1"],
    forecast_df["timesfm-q-0.9"],
    color="blue",
    alpha=0.2,
    label="10%-90% Interval"
)
plt.xlabel("Date")
plt.ylabel("Stock Price")
plt.title("Tesla Stock Price Forecast")
plt.legend()
plt.show()


In [None]:
df

In [None]:
import matplotlib.pyplot as plt
import pandas as pd

plt.figure(figsize=(12, 4))

# plot actual price(past data)
plt.plot(df["Date"], df["Close"], label="Actual Price", color="black")

# plot forecasted price
plt.plot(forecast_df["ds"], forecast_df["timesfm"], label="Median Prediction", color="blue")

# plot 30%-70% interval
plt.fill_between(
    forecast_df["ds"],
    forecast_df["timesfm-q-0.3"],
    forecast_df["timesfm-q-0.7"],
    color="blue",
    alpha=0.2,
    label="30-70% Interval"
)

# Set the x-axis 
start_date = pd.to_datetime("2024-01-01")
end_date = forecast_df["ds"].max()
plt.xlim([start_date, end_date])

plt.xlabel("Date")
plt.ylabel("Stock Price")
plt.title("Tesla Stock Price Forecast with Historical Data")
plt.legend()
plt.grid(True)
plt.show()


# Predicting Tomorrow's TSLA stock price

In [None]:
from datetime import datetime, timedelta

# Generate the next day's datetime
today = datetime.today()
tmr = (today + timedelta(days=1)).date()  

# Ensure both tmr and forecast_df["ds"] are datetime
forecast_df["ds"] = pd.to_datetime(forecast_df["ds"])  # Convert to datetime
filtered_df = forecast_df[forecast_df["ds"].dt.date == tmr]  # Compare dates directly

print(filtered_df)

In [None]:
print(tmr)

### 1. Split the data: 2023-2024 to train, 2025 to predict

### 2. Compare actual vs. predicted values
### 3. Performance evaluation metrics (MSE, RMSE, MAE, R-squared)

- Set the seed.
- Try different seed values and check whether the result stays the same.
- Try to improve the hyperparameters.
- When I write the markdown notes, I should know what each parameter does (take notes on everything).
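
The evaluation metrics listed in step 3 can be sketched with NumPy as below. The function name `regression_metrics` and the sample numbers are assumptions for illustration; in the notebook, `actual` would be the 2025 closing prices and `predicted` the matching forecast values.

```python
import numpy as np

def regression_metrics(actual, predicted):
    """Compute MSE, RMSE, MAE, and R-squared (hypothetical helper)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted

    mse = np.mean(err ** 2)                  # mean squared error
    rmse = np.sqrt(mse)                      # root mean squared error
    mae = np.mean(np.abs(err))               # mean absolute error
    ss_res = np.sum(err ** 2)                # residual sum of squares
    ss_tot = np.sum((actual - actual.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot               # coefficient of determination

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}

# Made-up numbers, not real prices or forecasts
metrics = regression_metrics(
    actual=[100, 102, 104, 106],
    predicted=[101, 101, 105, 107],
)
print(metrics)
```

Computing the same dictionary for several seeds or hyperparameter settings gives a direct way to check whether the results change between runs.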

In [None]:
input_df

In [None]:
ticker = 'TSLA'
data = yf.Ticker(ticker)
data

In [32]:
# yfinance treats `end` as exclusive, so use the following day to include the last date
train = data.history(start='2023-01-01', end='2025-01-01')
test = data.history(start='2025-01-01', end='2026-01-01')