<a href="https://www.kaggle.com/code/keenanbasyir/forecasting-data-using-polynomial-interpolation?scriptVersionId=272138456" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# **Forecasting Future Data Using Polynomial Interpolation & Newton Method**

# **Group Members:** 
1. Muhammad Keenan Basyir
2. Keita Kawabata

# Introduction
First of all, what is forecasting? Forecasting is the process of predicting future values or outcomes of a variable from past and current data. In simple terms, we estimate the next, still unknown, values in a data set.

**Some Key Points of the Importance of Forecasting:**
* Planning and Decision Making
* Risk Reduction 
* Resource Optimization
* Adaptation to change
* Competitive benefits & advantages

**Real-life problem application**

The temperature forecast for a number of days is a highly useful example of everyday prediction. It allows individuals and organizations to make informed decisions. Temperature forecasting also matters for electricity suppliers. Under extremely hot or cold weather, individuals consume more electricity to operate air conditioners or heaters.

For health and safety, being aware of the future temperature is useful for hospitals and emergency services to be prepared for conditions like heatwaves or record cold that could impact people's health. 

In this project, we use polynomial interpolation (Newton's method) to forecast temperature in a simple way.

# Polynomial Interpolation & Extrapolation Overview
Polynomial interpolation is a technique for finding a polynomial function that passes through a given set of data points. For n + 1 points with distinct x-values, there is exactly one such polynomial of degree at most n. This function can then be used to approximate values between the known points or just beyond them.

Polynomial Extrapolation takes the same polynomial created during interpolation and uses it to predict values outside the range of the original data points. While it is mathematically possible, extrapolation is highly sensitive and can lead to wildly inaccurate results. The polynomial curve that fits the known points perfectly might not represent the real-world trend at all outside of that small window. A small change in the data can cause the extrapolated curve to swing dramatically. 

**Polynomial interpolation (which Newton's method is a way of doing) is the method we can use to create a model. It's the step where we find the one specific polynomial function that perfectly fits all our known data points (like the temperatures for days 0-4). Then, extrapolation (the forecast) is the action of using that function to predict a value outside of our known data range.**
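To make the sensitivity of extrapolation concrete, here is a minimal sketch that is not part of the original notebook. It nudges a single measurement by 0.1°C and compares how much the polynomial moves inside the data range (x = 3.5) and one step outside it (x = 8). The helper name `lagrange_eval` is hypothetical, and the temperatures are the first 8 daily values listed in the Data section below. The Lagrange form is used only for brevity; it evaluates the same unique polynomial that Newton's method builds.

```python
def lagrange_eval(xs, ys, x_target):
    """Evaluate the unique interpolating polynomial at x_target (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x_target - xj) / (xi - xj)
        total += term
    return total

xs = list(range(8))
ys = [25.7, 25.8, 25.5, 25.6, 25.2, 25.5, 25.8, 25.2]

# Nudge one measurement (index 4) by 0.1 °C
ys_perturbed = ys.copy()
ys_perturbed[4] += 0.1

inside, outside = 3.5, 8  # inside the data range vs. one step beyond it
shift_inside = lagrange_eval(xs, ys_perturbed, inside) - lagrange_eval(xs, ys, inside)
shift_outside = lagrange_eval(xs, ys_perturbed, outside) - lagrange_eval(xs, ys, outside)

print(f"shift at x={inside}: {shift_inside:+.3f} °C")
print(f"shift at x={outside}: {shift_outside:+.3f} °C")
```

Under these assumptions the change inside the range stays well below the 0.1°C nudge, while the extrapolated value moves by several degrees.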

# Newton Method (Newton's Divided Difference Interpolation)
Newton's interpolation is one of the methods of polynomial interpolation. It builds a polynomial that passes exactly through a set of known data points, which can then be used to estimate unknown values. These are the important formulas used in this method:

**Newton Polynomial:**
P(x) = a0 + a1(x - x0) + a2(x - x0)(x - x1) + ... + an(x - x0)(x - x1)...(x - x_{n-1})

**Divided Differences:**

**Zeroth-order divided differences (the data values)**

f[x0] = y0

**First-order divided differences (Linear)**

f[x0, x1] = (f[x1] - f[x0]) / (x1 - x0)

**Second-order divided differences (Quadratic)**

f[x0, x1, x2] = (f[x1, x2] - f[x0, x1]) / (x2 - x0)

**Third-order divided differences (Cubic)**

f[x0, x1, x2, x3] = (f[x1, x2, x3] - f[x0, x1, x2]) / (x3 - x0)

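The coefficients of the Newton polynomial are the leading divided differences: a0 = f[x0], a1 = f[x0, x1], a2 = f[x0, x1, x2], and so on. With these formulas, we are able to estimate unknown values using a polynomial through known data points.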
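The recursion in these formulas can be sketched as a small table builder. This is an illustrative sketch rather than the notebook's own implementation: the function name `divided_difference_table` is hypothetical, and the sample values are the three temperatures used in the worked example later in this document.

```python
def divided_difference_table(xs, ys):
    """Return a triangular table where table[k][i] = f[x_i, ..., x_{i+k}]."""
    n = len(xs)
    table = [[float(y) for y in ys]]  # order 0: f[x_i] = y_i
    for k in range(1, n):
        prev = table[k - 1]
        row = [(prev[i + 1] - prev[i]) / (xs[i + k] - xs[i]) for i in range(n - k)]
        table.append(row)
    return table

xs = [1, 2, 3]
ys = [25.7, 25.8, 25.5]

table = divided_difference_table(xs, ys)

# The Newton coefficients are the first entry of each order: a_k = f[x_0, ..., x_k]
coefficients = [row[0] for row in table]

for order, row in enumerate(table):
    print(f"order {order}:", [round(value, 4) for value in row])
```

Keeping the whole table, instead of overwriting one list in place, costs a little memory but makes every intermediate divided difference visible for checking by hand.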

# Data

Using the free weather API [Visual Crossing](https://www.visualcrossing.com/), we retrieve the average daily temperature in Yogyakarta for the last 8 days and format the data as follows:

```
data = [
    (index, avg_temp),
    ...
]
```

In [1]:
import requests

API_KEY = 'MA9SZ8KMB6UQ7WTT54AVDFUG7'
LOCATION = 'Yogyakarta'
URL = f'https://weather.visualcrossing.com/VisualCrossingWebServices/rest/services/timeline/{LOCATION}/last7days?unitGroup=metric&key={API_KEY}&include=days'

# a timeout keeps the request from hanging indefinitely if the server does not respond
response = requests.get(URL, timeout=10)

if response.status_code == 200:
    raw_data = response.json()
    days = raw_data['days']
    
    data = []
    for i, day in enumerate(days):
        temp = day['temp']  # average temperature
        data.append((i, temp))
    
    print("Formatted temperature data:")
    print(data)
else:
    print("Failed to fetch data:", response.status_code)

Formatted temperature data:
[(0, 25.1), (1, 25.0), (2, 24.4), (3, 24.0), (4, 24.2), (5, 24.9), (6, 24.5), (7, 24.2)]


Our Data (2025/05/10 – 2025/05/17):

```
[(0, 25.7), (1, 25.8), (2, 25.5), (3, 25.6), (4, 25.2), (5, 25.5), (6, 25.8), (7, 25.2), (8, 25.1)]
```

This code:

* Uses the Visual Crossing Weather API
* Fetches the past 7 days' temperatures for Yogyakarta
* Extracts the average temperature per day
* Displays the results as a simple list of tuples
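The formatting step can be tried without a network call. The sketch below is an assumption-laden illustration: `to_indexed_temps` is a hypothetical helper, and `sample_payload` is a hand-written stand-in that mirrors only the two fields the code above reads (`days` and `temp`), not the full API response.

```python
def to_indexed_temps(payload):
    """Convert a payload with a 'days' list into [(index, avg_temp), ...]."""
    return [(i, day["temp"]) for i, day in enumerate(payload.get("days", []))]

# Hand-written stand-in for a response body (assumption: only 'temp' is needed)
sample_payload = {
    "days": [
        {"temp": 25.1},
        {"temp": 25.0},
        {"temp": 24.4},
    ]
}

data = to_indexed_temps(sample_payload)
print(data)
```

Separating the parsing from the HTTP request makes it easier to test the formatting on saved data when the API is unavailable.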

# Using Newton Method to solve real-life case: Forecasting temperature

**Data:**


| Day (x) | Average Temperature (y) |
| ------- | --------------- |
| 1       | 25.7            |
| 2       | 25.8            |
| 3       | 25.5            |

In this case, we want to predict the average temperature for day 4:

x0 = 1,  y0 = 25.7  
x1 = 2,  y1 = 25.8  
x2 = 3,  y2 = 25.5

Step 1: First order divided differences

f[x0, x1] = (y1 - y0) / (x1 - x0) = (25.8 - 25.7) / (2 - 1) = 0.1

f[x1, x2] = (y2 - y1) / (x2 - x1) = (25.5 - 25.8) / (3 - 2) = -0.3

Step 2: Second order divided differences

f[x0, x1, x2] = (f[x1, x2] - f[x0, x1]) / (x2 - x0)
              = (-0.3 - 0.1) / (3 - 1)
              = (-0.4) / 2
              = -0.2

Step 3: Using the Newton Polynomial

P(x) = a0 + a1(x - x0) + a2(x - x0)(x - x1)

* a0 = 25.7
* a1 = 0.1
* a2 = -0.2

(For the values of a0, a1, and a2, take the first entry of each order of divided differences: a0 = f[x0], a1 = f[x0, x1], a2 = f[x0, x1, x2].)

Step 4: Substitute x=4

P(4) = 25.7 + 0.1*(4 - 1) + (-0.2)*(4 - 1)*(4 - 2)
     = 25.7 + 0.1*3 + (-0.2)*3*2
     = 25.7 + 0.3 - 1.2
     = 24.8°C

**The predicted average temperature on day 4 is 24.8°C**
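As a quick sanity check of the hand calculation, the sketch below confirms that the polynomial built from a0, a1, and a2 passes through the three known points and gives the same value at x = 4. The function names `p` and `p_expanded` are hypothetical; the expanded form -0.2x² + 0.7x + 25.2 is obtained by multiplying out the Newton form above.

```python
x_known = [1, 2, 3]
y_known = [25.7, 25.8, 25.5]
a0, a1, a2 = 25.7, 0.1, -0.2  # coefficients from Steps 1 to 3

def p(x):
    """Newton form of the quadratic through the three known points."""
    return a0 + a1 * (x - x_known[0]) + a2 * (x - x_known[0]) * (x - x_known[1])

def p_expanded(x):
    """The same polynomial multiplied out: -0.2x^2 + 0.7x + 25.2."""
    return -0.2 * x**2 + 0.7 * x + 25.2

# Interpolation property: the polynomial reproduces every known point
for xi, yi in zip(x_known, y_known):
    assert abs(p(xi) - yi) < 1e-9
    assert abs(p_expanded(xi) - yi) < 1e-9

print(f"P(4) = {p(4):.1f}°C")
```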


# Code 1
Using Newton's method and the average temperature data for the first 3 days from the weather API, we forecast the temperature for the next day (day 4).

In [2]:
# temperature data for first 3 days (0–2)
data = [(0, 25.7), (1, 25.8), (2, 25.5)]

# split into x (days) and y (temperatures)
x = [point[0] for point in data]
y = [point[1] for point in data]

# function to construct divided difference table
def divided_differences(x, y):
    n = len(x)
    coef = [0] * n
    for i in range(n):
        coef[i] = y[i]

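    # Update in place from the last entry backwards, so coef[i - 1] still holds
    # the lower-order difference when coef[i] is computed. Afterwards,
    # coef[k] = f[x0, ..., xk], the k-th Newton coefficient.
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (x[i] - x[i - j])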

    return coef

# function to evaluate Newton polynomial at a point
def newton_interpolation(x, coef, x_target):
    n = len(coef)
    result = coef[0]
    product = 1.0
    for i in range(1, n):
        product *= (x_target - x[i - 1])
        result += coef[i] * product
    return result

# build the coefficients using divided differences
coefficients = divided_differences(x, y)

# forecast temperature for the next day (index 3 corresponds to day 4)
x_next = 3
predicted_temp = newton_interpolation(x, coefficients, x_next)

print(f"Forecasted temperature on day {x_next+1}: {predicted_temp:.2f}°C")

Forecasted temperature on day 4: 24.80°C


In [3]:
data = [(0, 25.7), (1, 25.8), (2, 25.5)]

* **Defines the data as (index, temperature) pairs for indices 0 to 2**

In [4]:
x = [point[0] for point in data]
y = [point[1] for point in data]

* **Splits the data into x (day indices) and y (temperatures)**

# Calculating Error Percentage between predicted value and real data from API Visual Crossing

Day 4 average temperature values:

* Real value from the Visual Crossing API = 25.6°C
* Predicted value using polynomial interpolation (Newton's method) = 24.8°C

Error% = (|Predicted - Real| / Real) x 100%

Error% = (|24.8 - 25.6| / 25.6) x 100% = **3.125% error**
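The error formula can be wrapped in a small helper so that the same calculation is reused for every forecast. This is a minimal sketch; the function name `percent_error` is hypothetical and not part of the original notebook.

```python
def percent_error(predicted, actual):
    """Return |predicted - actual| / |actual| as a percentage."""
    if actual == 0:
        raise ValueError("percent error is undefined when the actual value is 0")
    return abs(predicted - actual) / abs(actual) * 100

# Day 4: predicted 24.8 °C vs. 25.6 °C from the API
day4_error = percent_error(24.8, 25.6)
print(f"{day4_error:.3f}% error")
```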

# Code 2
Using Newton's method and the average temperature data from the last 8 days to forecast the temperature for the next day.


| Day (x) | Temperature (y) |
| ------- | --------------- |
| 1       | 25.7            |
| 2       | 25.8            |
| 3       | 25.5            |
| 4       | 25.6            |
| 5       | 25.2            |
| 6       | 25.5            |
| 7       | 25.8            |
| 8       | 25.2            |


In [5]:
# Temperature data for 8 days (0–7)
data = [(0, 25.7), (1, 25.8), (2, 25.5), (3, 25.6),
        (4, 25.2), (5, 25.5), (6, 25.8), (7, 25.2)]

# Split into x (days) and y (temperatures)
x = [point[0] for point in data]
y = [point[1] for point in data]

# Function to construct divided difference table
def divided_differences(x, y):
    n = len(x)
    coef = [0] * n
    for i in range(n):
        coef[i] = y[i]

    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (x[i] - x[i - j])

    return coef

# Function to evaluate Newton polynomial at a point
def newton_interpolation(x, coef, x_target):
    n = len(coef)
    result = coef[0]
    product = 1.0
    for i in range(1, n):
        product *= (x_target - x[i - 1])
        result += coef[i] * product
    return result

# Build the coefficients using divided differences
coefficients = divided_differences(x, y)

# Forecast temperature for the next day (day 9)
x_next = 8
predicted_temp = newton_interpolation(x, coefficients, x_next)

print(f"Forecasted temperature on day {x_next+1}: {predicted_temp:.2f}°C")

Forecasted temperature on day 9: 43.50°C




**Result**:
```
Forecasted temperature on day 9: 43.50°C
```
Day 9 average temperature values:

* Reference value (mean of the 8 known days) = 25.5375°C
* Predicted value using polynomial interpolation (Newton's method) = 43.50°C

Error% = (|Predicted - Real| / Real) x 100%

Error% = (|43.50 - 25.5375| / 25.5375) x 100% = **70.34% error**

# Calculating Error Percentage between predicted value and the real data from API Visual Crossing

**Result**:
```
Forecasted temperature on day 9: 43.50°C
```
Day 9 average temperature values:

* Real value from the Visual Crossing API = 25.1°C
* Predicted value using polynomial interpolation (Newton's method) = 43.50°C

Error% = (|Predicted - Real| / Real) x 100%

Error% = (|43.50 - 25.1| / 25.1) x 100% = **73.3% error**

# Result and Errors Analysis
As the results show, the day 9 forecast from Newton's method is clearly wrong: the predicted 43.50°C is far higher than any of the 8 daily averages in the data, which all lie between 25.2°C and 25.8°C.



# Code 3
We will try again, but this time using only the first 5 average temperature values (days 1-5) from the weather API:


| Day (x) | Temperature (y) |
| ------- | --------------- |
| 1       | 25.7            |
| 2       | 25.8            |
| 3       | 25.5            |
| 4       | 25.6            |
| 5       | 25.2            |


In [6]:
# Temperature data for 5 days (0–4)
data = [(0, 25.7), (1, 25.8), (2, 25.5), (3, 25.6),
        (4, 25.2)]

# Split into x (days) and y (temperatures)
x = [point[0] for point in data]
y = [point[1] for point in data]

# Function to construct divided difference table
def divided_differences(x, y):
    n = len(x)
    coef = [0] * n
    for i in range(n):
        coef[i] = y[i]

    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (x[i] - x[i - j])

    return coef

# Function to evaluate Newton polynomial at a point
def newton_interpolation(x, coef, x_target):
    n = len(coef)
    result = coef[0]
    product = 1.0
    for i in range(1, n):
        product *= (x_target - x[i - 1])
        result += coef[i] * product
    return result

# Build the coefficients using divided differences
coefficients = divided_differences(x, y)

# Forecast temperature for the next day (day 6)
x_next = 5
predicted_temp = newton_interpolation(x, coefficients, x_next)

print(f"Forecasted temperature on day {x_next+1}: {predicted_temp:.2f}°C")

Forecasted temperature on day 6: 21.70°C


# Calculating Error Percentage between predicted value and the real data from Weather API Visual Crossing
**The forecasted temperature for day 6 is 21.70°C, using the 5 data points from the weather API.**

The Error% between the predicted value and the real value from the weather API is:

* Real value from the Visual Crossing API = 25.5°C
* Predicted value using polynomial interpolation (Newton's method) = 21.7°C

Error% = (|Predicted - Real| / Real) x 100%

Error% = (|21.7 - 25.5| / 25.5) x 100% = **14.9% error**

# Error & Analysis

After calculating and analyzing the errors in our temperature forecasts, we conclude that prediction with this method can lead to large errors because high-degree polynomials can behave erratically. The curve may oscillate wildly and produce absurd results, especially when it is evaluated outside the range of the known data. This behaviour is closely related to Runge's phenomenon, in which high-degree polynomials through equally spaced points oscillate strongly near the ends of the data range.

From Code 2, the result of 43.50°C is one of the errors caused by a high-degree polynomial. The degree-7 polynomial was forced to pass through all 8 data points, and to satisfy those constraints it takes a sharp, unrealistic turn upwards just beyond the last point. The sharp rise in Code 2 is therefore an effect of the kind described by Runge's phenomenon, a known issue when high-degree polynomials are used beyond the known data.
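One way to see why the 8-point forecast blows up: for equally spaced days 0, 1, ..., n-1, evaluating the interpolating polynomial one step past the last point is the same as taking an alternating sum of the data with binomial weights, P(n) = Σ (-1)^(n-1-k) · C(n, k) · y_k. The sketch below computes these weights and reproduces the three forecasts from Code 1, Code 3, and Code 2. The helper name `next_point_weights` is hypothetical and the sketch is an illustration, not part of the original notebook.

```python
from math import comb

def next_point_weights(n):
    """Weights w_k with P(n) = sum(w_k * y_k) for points at x = 0, 1, ..., n-1."""
    return [(-1) ** (n - 1 - k) * comb(n, k) for k in range(n)]

temps = [25.7, 25.8, 25.5, 25.6, 25.2, 25.5, 25.8, 25.2]

for n in (3, 5, 8):
    weights = next_point_weights(n)
    forecast = sum(w * y for w, y in zip(weights, temps[:n]))
    print(f"{n} points: weights = {weights}, forecast = {forecast:.2f}°C")
```

With 3 points the largest weight has magnitude 3, but with 8 points it reaches 70, so day-to-day fluctuations of a few tenths of a degree are amplified into a swing of many degrees.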

# Conclusion
Newton's method gives the same unique interpolating polynomial as any other interpolation method for a given set of points: a polynomial that passes exactly through the known data points and can be used to predict unknown values. As our results show, however, the forecast becomes unreliable as the degree of the polynomial grows.

To prevent or reduce this error:
* Use fewer data points (low-degree polynomials)
* Use nearby points to build a quadratic or cubic model, which gives a more stable, local trend
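The suggestion above can be tried on the same data. The sketch below fits the polynomial only to the most recent k points and forecasts day 9, then compares against the 25.1°C real value quoted in the error calculation earlier. The function name `newton_forecast` is hypothetical; it simply combines the two functions from the code cells into one. This is a rough illustration under these assumptions, not a recommended forecasting model.

```python
def newton_forecast(xs, ys, x_target):
    """Fit the interpolating polynomial (Newton form) and evaluate it at x_target."""
    n = len(xs)
    coef = list(ys)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    result, product = coef[0], 1.0
    for i in range(1, n):
        product *= x_target - xs[i - 1]
        result += coef[i] * product
    return result

days = list(range(8))
temps = [25.7, 25.8, 25.5, 25.6, 25.2, 25.5, 25.8, 25.2]
actual_day9 = 25.1  # real value quoted in the error calculation for Code 2

forecasts = {}
for k in (2, 3, 8):
    # Use only the k most recent points to forecast index 8 (day 9)
    forecasts[k] = newton_forecast(days[-k:], temps[-k:], 8)
    error = abs(forecasts[k] - actual_day9)
    print(f"last {k} points: forecast = {forecasts[k]:.2f}°C, absolute error = {error:.2f}°C")
```

On this data set, the 2-point and 3-point fits stay much closer to the real value than the 8-point fit, although a low-degree extrapolation is still only a crude estimate of the next day's temperature.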