# **Unreal Engine - Analysis and Predictions**

In this notebook we will analyze and make predictions on Unreal Engine users. Without further ado, let's get started!

## Setup

Let's set up the notebook and load the data we will use.


**Imports**

Let's import the necessary modules. We will need plotly and statsmodels to make graphs for analysis and we will use LinearRegression and PolynomialFeatures to make predictions.

In [1]:
import pandas as pd 
import numpy as np 
import seaborn as sns 
import plotly.graph_objects as go
import plotly.express as px
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
import statsmodels.graphics.tsaplots as sgt
import statsmodels.tsa.stattools as sts
from matplotlib.ticker import StrMethodFormatter
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from IPython.display import display, HTML
print("Setup Complete")

Setup Complete


**Load the data**

Let's load the Excel file and store it in `data`.

In [2]:
data = pd.read_excel('/kaggle/input/unreal-engine-from-past-to-future/Unreal Engine Data.xlsx')

data.head(5)

Unnamed: 0,Year,Version Released,Add Features,Games,Num Users
0,1998,1,Published,Unreal,7500
1,1999,1.5,Improved graphics quality,The Wheel of Time,13600
2,2000,2,"Better physics support, ragdoll physics added,...","Unreal Tournament, Deus Ex",27500
3,2001,"2.1, 2.2","Particle system added, improved graphics","Devastation, Mobile Forces, Devil Inside, Ta...",45300
4,2002,2.5,"Improved rendering support, vehicle physics, p...","Drome Racers, America's Army",79000


**Scrolling Table**

For better viewing of the data, let's make a scrolling table.

In [3]:
def scroll_table(df, table_id, title):
    html = f'<h3>{title}</h3>'
    html += f'<div id="{table_id}" style="height:200px; overflow:auto;">'
    html += df.to_html()
    html += '</div>'
    return html

In [4]:
def make_data_table(dataset, title):
    # Render the whole DataFrame inside a scrollable HTML block.
    info_graph = scroll_table(dataset, 'graph_data_2', title)
    display(HTML(info_graph))

make_data_table(data, 'Data')

Unnamed: 0,Year,Version Released,Add Features,Games,Num Users
0,1998,1,Published,Unreal,7500
1,1999,1.5,Improved graphics quality,The Wheel of Time,13600
2,2000,2,"Better physics support, ragdoll physics added, better external platform support","Unreal Tournament, Deus Ex",27500
3,2001,"2.1, 2.2","Particle system added, improved graphics","Devastation, Mobile Forces, Devil Inside, Tactical Ops: Assault on Terror",45300
4,2002,2.5,"Improved rendering support, vehicle physics, particle system editor, 64-bit support","Drome Racers, America's Army",79000
5,2003,2X,"Optimizations, bug fixes, and enhancements to existing features",Unreal Tournament 2,107650
6,2004,"2.6, 2.7","Improved rendering, improved C++ building","Tom Clancy's Splinter Cell 3D, Killing Floor",123500
7,2005,2.8,Improved shading system,"Tom Clancy's Rainbow Six: Lockdown, Brothers in Arms: Road to Hill 30, Red Orchestra: Ostfront 41-45, Red Orchestra: Ostfront 41-45, Pariah",158430
8,2006,3,"Advanced lighting and shadowing techniques, improved physics simulation, support for next-gen consoles, powerful content creation pipeline",Unreal Tournament 3,178500
9,2007,,,"Bioshock , Mass Effect, Medal of Honor: Airborne",209430


## EDA

Let's start our exploratory data analysis with a simple analysis of the data.

In [5]:
print("Simple Analysis")
print("---------------")
print("Shape")
print(data.shape)
print("---------------")
print('Summary Statistics')
print(data.describe())
print('------------------')
print("The types of all columns")
print("------------------------")
print(data.dtypes)

Simple Analysis
---------------
Shape
(27, 5)
---------------
Summary Statistics
              Year     Num Users
count    27.000000  2.700000e+01
mean   2011.000000  1.918321e+06
std       7.937254  2.279315e+06
min    1998.000000  7.500000e+03
25%    2004.500000  1.409650e+05
50%    2011.000000  7.540000e+05
75%    2017.500000  3.310000e+06
max    2024.000000  6.911660e+06
------------------
The types of all columns
------------------------
Year                 int64
Version Released    object
Add Features        object
Games               object
Num Users            int64
dtype: object
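
`describe()` summarizes only the numeric columns, and the table earlier shows empty `Version Released` and `Add Features` cells for 2007. A minimal sketch of counting missing values per column follows. It uses a small stand-in frame (`sample`, a hypothetical name) built from three rows of that table rather than the full `data`.

```python
import numpy as np
import pandas as pd

# Stand-in for `data`: three rows copied from the table shown earlier.
sample = pd.DataFrame({
    "Year": [2005, 2006, 2007],
    "Version Released": ["2.8", "3", np.nan],
    "Num Users": [158430, 178500, 209430],
})

# Count the missing entries in each column.
missing_per_column = sample.isna().sum()
print(missing_per_column)
```

Running the same `isna().sum()` call on `data` would show which columns need attention before plotting.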


In [6]:
fig = go.Figure()
fig.add_trace(go.Scatter(x=data['Year'], y=data['Num Users'], mode='lines+markers', name='Num Users', marker=dict(color='royalblue')))
fig.update_layout(title='UE Users Over Years', xaxis_title='Year', yaxis_title='Amount')
fig.show()

This line plot shows a sudden rise in the number of users in 2010. The rise coincides with the release of Bioshock 2, which likely increased interest in and adoption of Unreal Engine among game developers. The success of a high-profile game often has a ripple effect throughout the gaming community, leading other developers to use the same engine for their projects. The surge suggests that major game releases can have a significant impact on the popularity of game development tools like Unreal Engine.
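
One way to put a number on a "sudden rise" is the year-over-year percentage change. The sketch below is only an illustration: `sample` is a hypothetical stand-in holding the first five rows of the table shown earlier, so it does not cover 2010 itself.

```python
import pandas as pd

# Stand-in for `data`: the first five rows of the table shown earlier.
sample = pd.DataFrame({
    "Year": [1998, 1999, 2000, 2001, 2002],
    "Num Users": [7500, 13600, 27500, 45300, 79000],
})

# Relative change from one year to the next, as a percentage.
sample["Growth %"] = sample["Num Users"].pct_change() * 100

# Year with the largest relative increase (the first row is NaN and is skipped).
peak_year = sample.loc[sample["Growth %"].idxmax(), "Year"]
print(sample)
print("Largest relative increase:", peak_year)
```

Applied to the full `data`, the same two lines would show whether 2010 stands out in relative terms as well as in absolute terms.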

Let's continue our analysis.

In [7]:
# Scatter of the version(s) released in each year
trace = go.Scatter(
    x=data['Year'],
    y=data['Version Released'],
    mode='markers',
    marker=dict(
        size=10,
        color='rgba(50, 171, 96, 0.6)',
        line=dict(
            width=2,
            color='rgba(50, 171, 96, 1.0)'
        )
    ),
    name='Unreal Engine Versions'
)

# Layout: the labels describe what is plotted (versions, not user counts)
layout = go.Layout(
    title='Unreal Engine Versions Released Over Time',
    xaxis=dict(title='Year'),
    yaxis=dict(title='Version Released')
)

# Create figure
fig = go.Figure(data=[trace], layout=layout)

# Show plot
fig.show()

In [8]:
fig = px.violin(data, y="Version Released", box=True, points="all", title="Violin Plot of Version Usage")
fig.show()

In [9]:
fig = go.Figure()
# Use the version column on the z axis so the data matches the 'Version' axis label
fig.add_trace(go.Scatter3d(x=data['Year'], y=data['Num Users'], z=data['Version Released'], mode='lines', name='Users and Versions'))
fig.update_layout(title='3D Line Plot of Users and Versions by Year', scene=dict(xaxis_title='Year', yaxis_title='Users', zaxis_title='Version'))
fig.show()

## Predictions

Now, let's move on to the predictions.

Let's fit a polynomial regression model of degree 10 to the data and store the predictions in a DataFrame.

In [10]:
# Expand Year into polynomial terms up to the chosen degree
degree = 10
poly_features = PolynomialFeatures(degree=degree)
X_poly = poly_features.fit_transform(data[['Year']])

# Ordinary least squares on the polynomial terms
poly_model = LinearRegression()
poly_model.fit(X_poly, data['Num Users'])

# The 99 years following the last year in the data
future_years = np.array([[year] for year in range(data['Year'].max() + 1, data['Year'].max() + 100)])

future_years_poly = poly_features.transform(future_years)

future_target = poly_model.predict(future_years_poly)

# Collect the predictions in a DataFrame indexed by year
future_df = pd.DataFrame(future_target, index=future_years.flatten(), columns=['Num Users'])
future_df.index.name = 'Year'

print(future_df)

         Num Users
Year              
2025  8.131822e+06
2026  9.003325e+06
2027  9.930254e+06
2028  1.091419e+07
2029  1.195671e+07
...            ...
2119  5.912947e+08
2120  6.064194e+08
2121  6.218211e+08
2122  6.375031e+08
2123  6.534687e+08

[99 rows x 1 columns]
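
With degree 10, raw year values near 2000 produce polynomial terms on the order of 2000^10, roughly 10^33, which can make the least-squares fit numerically fragile. A common mitigation, which this notebook does not use, is to shift the years so they start at 0 before expanding them. The sketch below illustrates the idea on a stand-in frame. `sample` and `base_year` are hypothetical names, and degree 3 is chosen only to keep the example small.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Stand-in for `data`: the first five rows of the table shown earlier.
sample = pd.DataFrame({
    "Year": [1998, 1999, 2000, 2001, 2002],
    "Num Users": [7500, 13600, 27500, 45300, 79000],
})

# Shift the years so the first one becomes 0 (assumption: not done in the notebook).
base_year = sample["Year"].min()
X = sample[["Year"]] - base_year

# Degree 3 keeps the example small; the notebook itself uses degree 10.
poly = PolynomialFeatures(degree=3)
features = poly.fit_transform(X)
model = LinearRegression().fit(features, sample["Num Users"])

# Future years must be shifted by the same base year before predicting.
future = pd.DataFrame({"Year": [2003]}) - base_year
pred = model.predict(poly.transform(future))

print("Largest shifted feature:", features.max())
print("Largest raw cubic term:", float(sample["Year"].max()) ** 3)
```

The shifted features stay small while the raw cubic term is already in the billions, and the gap grows quickly with the degree.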



X does not have valid feature names, but PolynomialFeatures was fitted with feature names
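
The warning above appears because `PolynomialFeatures` was fitted on a DataFrame with a `Year` column but later received a plain NumPy array. A minimal sketch of one way to avoid it is to pass the future years as a DataFrame with the same column name. The names `sample`, `future`, and `future_sample_df` are hypothetical, and degree 2 is used only to keep the example small.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Stand-in for `data`: the first five rows of the table shown earlier.
sample = pd.DataFrame({
    "Year": [1998, 1999, 2000, 2001, 2002],
    "Num Users": [7500, 13600, 27500, 45300, 79000],
})

# Degree 2 is an assumption for this small example, not the notebook's degree 10.
poly = PolynomialFeatures(degree=2)
model = LinearRegression().fit(poly.fit_transform(sample[["Year"]]), sample["Num Users"])

# A DataFrame with the same column name as at fit time keeps the feature names consistent.
future = pd.DataFrame({"Year": range(2003, 2006)})
pred = model.predict(poly.transform(future))

future_sample_df = pd.DataFrame({"Num Users": pred}, index=future["Year"])
print(future_sample_df)
```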



Let's visualize the predictions now.

In [11]:
fig = go.Figure()
fig.add_trace(go.Scatter(x=future_df.index, y=future_df['Num Users'], mode='lines+markers', name='Num Users', marker=dict(color='royalblue')))
fig.update_layout(title='Predictions', xaxis_title='Year', yaxis_title='Amount')
fig.show()

As you can see, the predicted number of Unreal Engine users keeps increasing. Now, let's generate a CSV file from the predictions DataFrame.

In [12]:
# Keep the index so the Year column is written alongside the predictions
future_df.to_csv('UE_Future.csv')
print("Output Generated Successfully")

Output Generated Successfully
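
Because `Year` is the index of the predictions DataFrame, whether it ends up in the file depends on the `index` argument of `to_csv`. The sketch below compares both settings on a small stand-in frame (`future_sample`, a hypothetical name, holding two rounded values from the output above) and writes to a string instead of a file.

```python
import io

import pandas as pd

# Stand-in for `future_df`: two rounded predictions, indexed by Year.
future_sample = pd.DataFrame(
    {"Num Users": [8131822.0, 9003325.0]},
    index=pd.Index([2025, 2026], name="Year"),
)

# With no path, to_csv returns the CSV text as a string.
with_index = future_sample.to_csv()
without_index = future_sample.to_csv(index=False)

print(with_index.splitlines()[0])     # header row when the index is kept
print(without_index.splitlines()[0])  # header row when the index is dropped

# Reading the text back restores Year as the index.
restored = pd.read_csv(io.StringIO(with_index), index_col="Year")
print(restored)
```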


Thank you for viewing my notebook! Feel free to fork and upvote. I am open to suggestions, so please write them in the comments. Thank you. God bless you.