<a href="https://colab.research.google.com/github/Drewe4401/ZeldaGPT/blob/main/ZeldaGPT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ZeldaGPT

  * ZeldaGPT lets you train a GPT model on the Zelda Text Dump dataset. The notebook handles data preprocessing, model configuration, and the training loop.
  * Once the GPT model is trained, ZeldaGPT enables you to generate new text based on the trained model. This feature allows you to interact with the model and obtain Zelda-like dialogues, descriptions, or other text elements.

## Imports

* import torch: Imports PyTorch, an open-source machine learning library for Python used in areas such as computer vision and natural language processing. It provides tensor computation with GPU acceleration and a tape-based autograd system on which deep neural networks are built.

* import torch.nn as nn: Imports PyTorch's neural network module under the alias nn. The torch.nn module provides the building blocks for models: layer classes (for example, linear and embedding layers), containers, and loss functions. Optimization algorithms are not part of torch.nn; they live in torch.optim.

* from torch.nn import functional as F: Imports the functional interface under the alias F. It exposes stateless function versions of many torch.nn operations, such as softmax and cross-entropy.
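
To make the difference between nn and F concrete, here is a minimal sketch that is not part of the original notebook. The layer sizes, the random input, and the target indices are arbitrary values chosen for illustration only.

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

torch.manual_seed(0)  # make the random initial weights reproducible

# nn provides stateful layers: this Linear layer owns a weight matrix and a bias.
layer = nn.Linear(in_features=4, out_features=3)

x = torch.randn(2, 4)      # a batch of 2 inputs with 4 features each
logits = layer(x)          # shape (2, 3): one score per class for each input

# F provides stateless functions that operate on tensors directly.
probs = F.softmax(logits, dim=-1)        # each row now sums to 1
targets = torch.tensor([0, 2])           # illustrative class index for each row
loss = F.cross_entropy(logits, targets)  # scalar (0-dimensional) tensor

print(logits.shape)
print(probs.sum(dim=-1))
print(loss.item())
```

The layer object carries its parameters from call to call, while the F functions take everything they need as arguments.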


In [None]:
import torch
import torch.nn as nn
from torch.nn import functional as F

In [None]:
!wget https://raw.githubusercontent.com/Drewe4401/ZeldaGPT/main/zelda_text_dump.txt # download the dataset from GitHub

--2023-05-04 01:55:37--  https://raw.githubusercontent.com/Drewe4401/ZeldaGPT/main/zelda_text_dump.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 666033 (650K) [text/plain]
Saving to: ‘zelda_text_dump.txt’


2023-05-04 01:55:37 (14.7 MB/s) - ‘zelda_text_dump.txt’ saved [666033/666033]



In [None]:
#reading the file
with open('zelda_text_dump.txt', 'r', encoding='utf-8') as f:
  text = f.read()
print("length of dataset in characters: ", len(text))

length of dataset in characters:  631587


In [None]:
print(text[:500]) # show the first 500 characters

You borrowed a Pocket Egg!
A Pocket Cucco will hatch from
it overnight. Be sure to give it
back when you are done with it.

You returned the Pocket Cucco
and got Cojiro in return!
Unlike other Cuccos, Cojiro
rarely crows.

You got an Odd Mushroom!
A fresh mushroom like this is
sure to spoil quickly! Take it to
the Kakariko Potion Shop, quickly!

You received an Odd Potion!
It may be useful for something...
Hurry to the Lost Woods!

You returned the Odd Potion 
and got the Poacher's Saw!
The youn


In [None]:
chars_in_text = sorted(set(text)) # unique characters, sorted by code point
vocab_size = len(chars_in_text)
print(''.join(chars_in_text))
print(vocab_size)


 !"&'()*+,-./0123456789:;<>?ABCDEFGHIJKLMNOPQRSTUVWXYZ^abcdefghijklmnopqrstuvwxyz|~§©´»ÁÄÈËÌÍÎÏÐÑÒÔÕ×ØÙÚÛÜÝÞßáâãäåæçèéôöùúûü†
138
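
The vocabulary step above can be illustrated on a tiny string. This is a sketch with a made-up sample (the name `sample` and its contents are assumptions, not data from the notebook). It also shows why the printed vocabulary begins with a blank line: the newline character has a lower code point than the space, so it sorts first.

```python
# Toy stand-in for the dataset; the real notebook uses the full text dump.
sample = "Hey! Listen!\nHey!"

# set() keeps one copy of each character; sorted() orders them by code point.
sample_chars = sorted(set(sample))
sample_vocab_size = len(sample_chars)

print(sample_chars)
# ['\n', ' ', '!', 'H', 'L', 'e', 'i', 'n', 's', 't', 'y']
print(sample_vocab_size)
# 11
```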


In [None]:
# create a mapping from characters to integers
stoi = { ch:i for i,ch in enumerate(chars_in_text) }
itos = { i:ch for i,ch in enumerate(chars_in_text) }
encode = lambda s: [stoi[c] for c in s] # encoder: take a string, output a list of integers
decode = lambda l: ''.join([itos[i] for i in l]) # decoder: take a list of integers, output a string
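
As a quick check of the mapping above, the sketch below rebuilds the same kind of encoder and decoder on a short sample string (the sample and the helper names are illustrative, not taken from the notebook) and verifies the round trip. It also shows a limitation of this character-level scheme: a character that never appears in the source text has no entry in the table, so encoding it raises a `KeyError`.

```python
# Short sample standing in for the dataset text.
sample = "You got an Odd Mushroom!"
sample_chars = sorted(set(sample))

# Character-to-integer table and its inverse, built the same way as above.
sample_stoi = {ch: i for i, ch in enumerate(sample_chars)}
sample_itos = {i: ch for i, ch in enumerate(sample_chars)}

def sample_encode(s):
    """Map each character of s to its integer id."""
    return [sample_stoi[c] for c in s]

def sample_decode(ids):
    """Map a list of integer ids back to a string."""
    return "".join(sample_itos[i] for i in ids)

ids = sample_encode("Odd Mushroom")
round_trip = sample_decode(ids)
print(ids)
print(round_trip)  # Odd Mushroom

# 'Z' does not occur in the sample, so it cannot be encoded.
try:
    sample_encode("Zelda")
    unknown_failed = False
except KeyError:
    unknown_failed = True
print(unknown_failed)  # True
```

Because the vocabulary is built from the training text itself, any prompt passed to the trained model later has to use only characters that occur in that text.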

In [None]:
data = torch.tensor(encode(text), dtype=torch.long)
print(data.shape, data.dtype)
print(data[:500])

torch.Size([631587]) torch.int64
tensor([65, 82, 88, 13, 69, 82, 85, 85, 82, 90, 72, 71, 13, 68, 13, 56, 82, 70,
        78, 72, 87, 13, 45, 74, 74, 14,  0, 41, 13, 56, 82, 70, 78, 72, 87, 13,
        43, 88, 70, 70, 82, 13, 90, 76, 79, 79, 13, 75, 68, 87, 70, 75, 13, 73,
        85, 82, 80,  0, 76, 87, 13, 82, 89, 72, 85, 81, 76, 74, 75, 87, 24, 13,
        42, 72, 13, 86, 88, 85, 72, 13, 87, 82, 13, 74, 76, 89, 72, 13, 76, 87,
         0, 69, 68, 70, 78, 13, 90, 75, 72, 81, 13, 92, 82, 88, 13, 68, 85, 72,
        13, 71, 82, 81, 72, 13, 90, 76, 87, 75, 13, 76, 87, 24,  0,  0, 65, 82,
        88, 13, 85, 72, 87, 88, 85, 81, 72, 71, 13, 87, 75, 72, 13, 56, 82, 70,
        78, 72, 87, 13, 43, 88, 70, 70, 82,  0, 68, 81, 71, 13, 74, 82, 87, 13,
        43, 82, 77, 76, 85, 82, 13, 76, 81, 13, 85, 72, 87, 88, 85, 81, 14,  0,
        61, 81, 79, 76, 78, 72, 13, 82, 87, 75, 72, 85, 13, 43, 88, 70, 70, 82,
        86, 22, 13, 43, 82, 77, 76, 85, 82,  0, 85, 68, 85, 72, 79, 92, 13, 70,
       