
A dataset of Quake 1 gameplay videos preprocessed for deep learning


thavlik/quake-gameplay-dataset


Example Thumbnails

Quake Gameplay Dataset

UPDATE MAY 13, 2024: Hosting costs for multiresolution are unsustainably expensive. Moving forward, only the resized videos will be available for download.

This is a collection of Quake 1 gameplay footage that has been preprocessed such that it is appropriate for use as a deep learning dataset.

There are no class labels or ground truth; this dataset is primarily intended for unsupervised learning.

A few videos containing weapon/enemy mods made their way into the dataset. Future efforts may be directed at "purifying" the data, for example by omitting footage that shows these custom weapons.

Download Links

Resolution | FPS | Size (GiB) | % Reduction | Download (.zip)
320x240    | 15  | 29         | 88          | Link
640x480    | 15  | 87         | 63          | Link
Source*    | 30  | 233        | 0 (raw)     | (Unavailable)

* Most raw videos are at 1080p/720p but some are at lower resolutions
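The "% Reduction" column appears to be the size saved relative to the 233 GiB raw set, rounded to a whole percent. The following is a minimal sketch of that arithmetic; the function name `percent_reduction` is hypothetical and the formula is inferred from the table rather than taken from the project's code.

```python
RAW_SIZE_GIB = 233  # size of the source videos, taken from the table above


def percent_reduction(size_gib, raw_gib=RAW_SIZE_GIB):
    """Percentage by which a resized set is smaller than the raw set."""
    return round(100 * (1 - size_gib / raw_gib))


# Sizes in GiB for the two resized sets listed in the table
for resolution, size_gib in [("320x240", 29), ("640x480", 87)]:
    print(resolution, percent_reduction(size_gib))
```

Under that assumption the sketch reproduces the table's values of 88 and 63.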

S3 Hosting

The data can be downloaded with the AWS Command Line Interface or any client that speaks a compatible S3 API. Folders in the S3 bucket are named after the resolution of the videos they contain. Because the bucket contains all resolutions in both .mp4 and .zip format, syncing the entire bucket is highly redundant and discouraged. s3 sync is the recommended download method for slow or interruptible connections, as it can be stopped and resumed without issue.

$ mkdir quake-gameplay-dataset
$ cd quake-gameplay-dataset

# The resolutions are available as both folders and zip files
# --no-sign-request allows use of awscli without credentials
$ aws s3 ls \
    --endpoint-url https://nyc3.digitaloceanspaces.com \
    --no-sign-request \
    s3://quake-gameplay-dataset/

# Sync only the folder with the resolution you want
$ aws s3 sync \
    --endpoint-url https://nyc3.digitaloceanspaces.com \
    --no-sign-request \
    s3://quake-gameplay-dataset/320x240 \
    320x240
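If you would rather drive the download from Python, the same sync can be wrapped in a small helper. This is only a sketch: the names `sync_command` and `download` are hypothetical, and it assumes the `aws` executable is installed and on your PATH. It only builds and runs the command shown above.

```python
import subprocess

ENDPOINT_URL = "https://nyc3.digitaloceanspaces.com"
BUCKET = "quake-gameplay-dataset"


def sync_command(resolution, dest=None):
    """Build the `aws s3 sync` argument list for one resolution folder."""
    return [
        "aws", "s3", "sync",
        "--endpoint-url", ENDPOINT_URL,
        "--no-sign-request",  # public bucket, no credentials required
        f"s3://{BUCKET}/{resolution}",
        dest or resolution,   # default to a local folder of the same name
    ]


def download(resolution, dest=None):
    """Run the sync; safe to call again to resume an interrupted download."""
    subprocess.run(sync_command(resolution, dest), check=True)
```

Calling `download("320x240")` from inside the `quake-gameplay-dataset` directory would mirror the 320x240 folder locally.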

How To Use

There are several existing Python solutions for loading frames from a directory of videos. decord is currently the most promising, given its narrow focus on machine learning. Generally, the API entails pointing the loader at a list of video files:

import os
import torch
import decord
from decord import VideoLoader, cpu

# Configure decord to output torch.Tensor
# You can also do this for Tensorflow, etc...
decord.bridge.set_bridge('torch')

width = 320
height = 240
video_dir = f'/data/quake-gameplay-dataset/{width}x{height}'
video_files = [os.path.join(video_dir, f)
               for f in os.listdir(video_dir)
               if f.endswith('.mp4')]
num_frames = 1  # Likely (but not always) synonymous with batch_size
# decord expects the batch shape as (batch, height, width, channels)
batch_shape = (num_frames, height, width, 3)
vl = VideoLoader(video_files,
                 ctx=[cpu(0)],
                 shape=batch_shape,
                 interval=0,
                 skip=0,
                 shuffle=1)
frame_data, indices = vl.next()
# `frame_data` contains the decoded frames
assert isinstance(frame_data, torch.Tensor)
assert tuple(frame_data.shape) == batch_shape
# `indices` is the (video_num, frame_num) for each frame
assert indices.shape == (num_frames, 2)
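Because `indices` pairs each frame with a video number and a frame number, it can be used to trace a frame back to its source file. The sketch below assumes that the video number indexes into the list passed to the loader, in the same order. The helper name `frame_sources` is hypothetical and not part of decord.

```python
def frame_sources(indices, video_files):
    """Map each (video_num, frame_num) row to (file path, frame number)."""
    sources = []
    for video_num, frame_num in indices:
        # int() accepts plain integers as well as 0-d tensor or array elements
        sources.append((video_files[int(video_num)], int(frame_num)))
    return sources


# Stand-in values; in practice pass the loader's `indices` and `video_files`
example_files = ['a.mp4', 'b.mp4']
example_indices = [(1, 42), (0, 7)]
print(frame_sources(example_indices, example_files))
```

This can be handy for spotting which source videos contain the modded weapons mentioned earlier.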

Compiling From Raw

The code for this project is maintained over in the Doom Gameplay Dataset repository. It's much simpler to maintain only one repository for the compiler code.

Contributors

Gameplay videos are sourced from YouTube with permission. Special thanks to the following creators for their contributions to the community. They are good folk.

If you would like to contribute, please open an issue or submit a pull request with links to YouTube videos or playlists. The complete list of videos and playlists is in raw/quake.txt.

License

All videos are property of their respective creators. Permission to transform and redistribute was granted in each case. This project makes no claims of ownership to the data.

This project's code is released under an MIT / Apache 2.0 dual license, which is extremely permissive.

Related Projects
