<a href="https://colab.research.google.com/github/Highfire1/IDS11/blob/main/NeuralNetworkv1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Welcome to a walkthrough of Neural Network v1.0

Thanks to Google, you can run this code online, on this website, for free!

Two options: 

- click the play button on every cell and watch the code execute in real time
- click Runtime -> Run All

Thanks for reading!

In [9]:
# pandas reads the csv file (csv is a plain-text table that Excel can also open); numpy handles the arrays
import pandas as pd
import numpy as np

# Make numpy values easier to read: 3 decimal places and no scientific notation
np.set_printoptions(precision=3, suppress=True)

# machine learning library that abstracts everything away
import tensorflow as tf
# Keras is TensorFlow's high-level API; its layers module holds the building blocks for a network
from tensorflow.keras import layers

In [10]:
# get fake data from github and read it as a .csv file
file_dir = r'https://raw.githubusercontent.com/Highfire1/IDS11/main/fake_data.csv'
fake_data_train = pd.read_csv(file_dir, header = 0)

In [11]:
# print it out, just to make sure we have the right file
fake_data_train.head()

   playtime  joincount  game_interactions  tnt_ratio  grieferbool
0        45         40                447          5            0
1       147          3                953         12            0
2       204         43                 85          8            0
3       551          7                255         21            0
4        26         35                563         11            0


In [12]:
# separate the csv file into two datasets
# player_info stores playtime, joincount, game_interactions and tnt_ratio (the inputs)
# grieferbool stores the grieferbool column (the label we want to predict)
# keep in mind that there are 10 000 rows
player_info = fake_data_train.copy()
grieferbool = player_info.pop('grieferbool')

# convert both into numpy arrays, a plain numeric format that Keras accepts directly
# (numpy keeps the numbers in one compact block of memory, which is what makes the math fast)
player_info = np.array(player_info)
grieferbool = np.array(grieferbool)
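
To see what `pop` does in the cell above, here is a minimal sketch on a three-row stand-in table. The rows are copied from the `head()` output; the names `toy`, `features` and `labels` are made up for illustration and are not part of the notebook.

```python
import numpy as np
import pandas as pd

# Three rows copied from the head() output; `toy` is a hypothetical stand-in
toy = pd.DataFrame({
    "playtime": [45, 147, 204],
    "joincount": [40, 3, 43],
    "game_interactions": [447, 953, 85],
    "tnt_ratio": [5, 12, 8],
    "grieferbool": [0, 0, 0],
})

features = toy.copy()                  # copy so the original table is untouched
labels = features.pop("grieferbool")   # removes the column AND returns it

features = np.array(features)          # shape (3, 4): one row per player
labels = np.array(labels)              # shape (3,): one label per player

print(features.shape, labels.shape)
print(list(toy.columns))               # the original still has all five columns
```

Because `pop` modifies the table it is called on, working on a copy keeps the full dataset available for later inspection.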

In [13]:
player_info # should match up to what you saw in block # 3

# a notebook displays the value of the last line in a cell, so it doesn't need print()

array([[ 45,  40, 447,   5],
       [147,   3, 953,  12],
       [204,  43,  85,   8],
       ...,
       [747,  13, 461,   6],
       [ 48,   8, 357,  17],
       [112,  19, 647,   1]])

In [14]:
grieferbool # should match the grieferbool column from block # 3

array([0, 0, 0, ..., 0, 1, 0])

In [15]:
# THE KEYSTONE PART

# This is where the neural network is defined. 

# this sets up the most basic kind of neural network - a SEQUENTIAL model is a plain stack of layers, where each layer feeds its output into the next. Very simple and the easiest to set up.
player_model = tf.keras.Sequential([

  layers.Dense(64), # the hidden layer: 64 nodes, each computing a weighted sum of the four input columns plus a bias.
                    # No activation function is given, so this layer is purely linear.

  layers.Dense(1)   # the output layer: everything is combined into one node (the prediction)
])

# this lets the algorithm know what to improve on
# The loss compares the output from the neural network with the true value and boils the difference down to one number (here, the mean squared error).
# The optimizer then nudges the weights in the neural network so that this number shrinks and the output approaches the true value.
# Because of this, if you let the neural network train for too long, it can memorize the values in a single dataset and become less useful in real world usage (known as overfitting)

# Adam is the optimizer: a variant of stochastic gradient descent that adapts the step size for each weight individually.
# In practice it often reaches a good result faster than plain stochastic gradient descent.
player_model.compile(loss = tf.losses.MeanSquaredError(),
                      optimizer = tf.optimizers.Adam())
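
The division of labour between loss and optimizer can be sketched without TensorFlow. This is a toy with a single weight `w` and plain gradient descent, not Adam and not what Keras runs internally; the data, the learning rate, and the names `mse` and `grad` are made up for illustration.

```python
# Toy data where the true relationship is y = 2 * x (hypothetical values)
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def mse(w):
    """Mean squared error of the one-weight model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w):
    """Derivative of mse(w) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0                 # start from a bad guess
learning_rate = 0.05    # step size, chosen by hand for this toy

for step in range(50):
    w -= learning_rate * grad(w)   # the "optimizer": step downhill on the loss

print(f"w = {w:.4f}, loss = {mse(w):.6f}")
```

The loss only measures how wrong the model is; the update line is what changes the weight. Adam follows the same loop but adjusts the step size for each weight separately.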


In [16]:
# TRAIN THE NEURAL NETWORK

# a neural network needs to train on a dataset with many records in order to work effectively
# fit() takes the player data and the matching labels, and there's also a setting for epochs
# An epoch is one full pass over the training set, so the number of epochs defines how long the neural network will train for
# Every epoch, each record in the training set affects the neural network once. If you set this too high, you risk overfitting your model, which makes it less effective on new data.
# For now, ten is a reasonable number and works well
player_model.fit(player_info, grieferbool, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f14285821d0>
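
One common way to notice overfitting is to hold back part of the data, measure the loss on it after every epoch, and stop once that loss stops improving. Keras supports this through the `validation_split` argument of `fit` and the `tf.keras.callbacks.EarlyStopping` callback. The sketch below shows only the stopping rule in plain Python; the loss values and the function name `best_epoch` are made up for illustration.

```python
def best_epoch(val_losses, patience=2):
    """Return the 1-based epoch with the lowest validation loss,
    giving up once `patience` epochs in a row fail to improve on it."""
    best_loss = float("inf")
    best = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break   # validation loss has stopped improving
    return best

# Hypothetical validation losses: improving at first, then creeping back up
losses = [0.30, 0.21, 0.17, 0.16, 0.18, 0.19, 0.15]
print(best_epoch(losses))
```

A larger `patience` tolerates more bad epochs before giving up, at the cost of training for longer.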

In [17]:
# Now what?
# Using the neural network:

# download a second (smaller) dataset from Github
# exact same stuff as in block # 2
file_dir = r'https://raw.githubusercontent.com/Highfire1/IDS11/main/fake_data_small.csv'
fake_data_small = pd.read_csv(file_dir, header = 0)

grieferboollist = fake_data_small.pop('grieferbool')
fake_data_small  = np.array(fake_data_small)

In [18]:
# KEYSTONE BLOCK # 2

# run the trained neural network on the new data
predictions = player_model.predict(fake_data_small)

In [32]:
# and print output
# of course, instead of printing in a for loop, you can call a function or write it back to another file, but for convenience's sake:

good = 0
bad = 0

for num in range(len(predictions)):
  print(str(num) + ': predict:' + str(predictions[num]) + ' real:[' + str(grieferboollist[num]), end = '] ')

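  # convert the raw output to a 0/1 prediction: anything above 0.5 counts as a griefer
  # (each prediction is a one-element array, so take element 0 rather than calling float() on the array)
  if predictions[num][0] > 0.5: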
    predictbool = 1
  else:
    predictbool = 0

  if predictbool == grieferboollist[num]:
    print("GOOD")
    good += 1
  else:
    print("BAD")
    bad += 1

0: predict:[0.468] real:[1] BAD
1: predict:[-0.067] real:[0] GOOD
2: predict:[0.517] real:[1] GOOD
3: predict:[0.692] real:[1] GOOD
4: predict:[0.128] real:[0] GOOD
5: predict:[0.751] real:[1] GOOD
6: predict:[0.614] real:[1] GOOD
7: predict:[0.533] real:[0] BAD
8: predict:[0.261] real:[0] GOOD
9: predict:[0.452] real:[1] BAD
10: predict:[0.625] real:[1] GOOD
11: predict:[0.059] real:[0] GOOD
12: predict:[0.856] real:[1] GOOD
13: predict:[0.144] real:[0] GOOD
14: predict:[-0.213] real:[0] GOOD
15: predict:[-0.239] real:[0] GOOD
16: predict:[0.445] real:[0] GOOD
17: predict:[0.016] real:[0] GOOD
18: predict:[-0.034] real:[0] GOOD
19: predict:[0.784] real:[1] GOOD
20: predict:[0.666] real:[1] GOOD
21: predict:[-0.302] real:[0] GOOD
22: predict:[0.736] real:[1] GOOD
23: predict:[0.712] real:[1] GOOD
24: predict:[0.836] real:[1] GOOD
25: predict:[0.654] real:[0] BAD
26: predict:[0.717] real:[1] GOOD
27: predict:[-0.31] real:[0] GOOD
28: predict:[0.028] real:[0] GOOD
29: predict:[0.257] rea

In [21]:
# JUST LIKE THAT
total = good + bad  # number of rows scored (500 in this dataset)
print("Summary:")
print(f"{good}/{total} were correct ({good/total*100}%)")
print(f"{bad}/{total} were incorrect ({bad/total*100}%)")

Summary:
470/500 were correct (94.0%)
30/500 were incorrect (6.0%)
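
The same scoring can be done without a loop, and splitting the mistakes by type shows whether the model tends to miss griefers or to accuse innocent players. A minimal sketch, using the first ten predictions printed above as sample data; the variable names are made up for illustration.

```python
import numpy as np

# First ten rows of the printed output above
predictions = np.array([[0.468], [-0.067], [0.517], [0.692], [0.128],
                        [0.751], [0.614], [0.533], [0.261], [0.452]])
real = np.array([1, 0, 1, 1, 0, 1, 1, 0, 0, 1])

# predictions holds one column per output node, so flatten it to 1-D first
predicted = (predictions.ravel() > 0.5).astype(int)

correct = int((predicted == real).sum())
missed_griefers = int(((predicted == 0) & (real == 1)).sum())   # false negatives
false_alarms = int(((predicted == 1) & (real == 0)).sum())      # false positives

print(f"{correct}/{len(real)} correct, "
      f"{missed_griefers} missed griefers, {false_alarms} false alarms")
```

Which kind of mistake matters more depends on how the predictions are used, so it can be worth tracking both counts alongside the overall accuracy.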
