Google Colab (allows you to generate levels in the browser!): https://colab.research.google.com/drive/1a-wN-f7xXjpaqq3tKpR29rtkc-kabo3b#scrollTo=YvhabvameUo4
Join our discord here! https://discord.gg/T6djf8N
Welcome to the readme for DeepSaber, an automatic generator of Beat Saber levels. There is a lot of stuff here, the fruit of a lot of work by the team at OxAI Labs. Contact me at guillermo . valle at oxai.org, or on Twitter (@guillefix), with any questions or suggestions!
Google Doc: https://docs.google.com/document/d/1UDSphLiWsrbdr4jliFq8kzrJlUVKpF2asaL65GnnfoM/edit
From PyPI, using pip:
- pytorch (installed as `torch` via https://pytorch.org/get-started/locally/)

From your favorite package manager:
- sox (e.g. `sudo apt-get install sox`)

You will also need:
- an Nvidia GPU with CUDA (:/ unfortunately, stage 2 is too slow on CPU, although it should work in theory after removing the "cuda" options in `./script_generate.sh` below)
(Do this the first time you generate.) Download the pre-trained weights from https://mega.nz/#!tJBxTC5C!nXspSCKfJ6PYJjdKkFVzIviYEhr0BSg8zXINBqC5rpA, and extract the contents (two folders with four files in total) inside the folder
Then, to generate a level, simply run (on Linux):

```
./script_generate.sh [path to song]
```

or on Windows:

```
.\script_generate.ps1 [path to song]
```

where you should substitute `[path to song]` with the path to the song you want to use to generate the level, which should be in .wav format (sorry). It also doesn't like spaces in the filename :P. Generation should take about 3 minutes for a 3-minute song, but the time grows (roughly quadratically, I think) with the length, and depends on how good your GPU is (mine is a GTX 1070).
This will generate a zip with the Beat Saber level, which can be found in `scripts/generation/generated`. You should be able to put it in the custom levels folder of the current version of Beat Saber (as of end of 2019).
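For reference, here is a hedged sketch of what such a zip tends to contain — the exact file layout is my assumption about the 2019-era custom level format, not necessarily this script's precise output:

```python
# Build a toy custom-level zip: an info.dat, one beatmap .dat per difficulty,
# and the audio renamed to song.egg. File names/contents here are illustrative.
import os
import tempfile
import zipfile

zip_path = os.path.join(tempfile.mkdtemp(), "ExampleLevel.zip")
with zipfile.ZipFile(zip_path, "w") as z:
    z.writestr("info.dat", '{"_songName": "Example"}')   # level metadata
    z.writestr("ExpertPlus.dat", '{"_notes": []}')       # one file per difficulty
    z.writestr("song.egg", b"")                          # Ogg Vorbis audio goes here

with zipfile.ZipFile(zip_path) as z:
    names = set(z.namelist())
```

If a generated level fails to load, comparing its contents against a working custom level is a quick sanity check.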
On Windows, you'll have to convert the song's .wav to an .ogg file (and subsequently to `song.egg`) manually. You can use the ffmpeg invocation from the script's error messages with a Windows build of ffmpeg, or use Audacity, HandBrake, or whatever GUI you like.
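If you prefer scripting the conversion, here is a hedged sketch of assembling such a command — the codec/quality flags are my choice for illustration, not the script's exact error-message invocation:

```python
# Build an ffmpeg command that converts a .wav into Ogg Vorbis.
# Beat Saber expects the result renamed to song.egg.
import os

def wav_to_egg_cmd(wav_path):
    stem, _ = os.path.splitext(wav_path)
    ogg_path = stem + ".ogg"
    # -c:a libvorbis selects the Vorbis codec; -q:a 5 is a mid quality level
    cmd = ["ffmpeg", "-i", wav_path, "-c:a", "libvorbis", "-q:a", "5", ogg_path]
    return cmd, ogg_path

cmd, ogg_path = wav_to_egg_cmd("mysong.wav")
print(" ".join(cmd), "-> then rename", ogg_path, "to song.egg")
```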
I also recommend reading about the "open_in_browser" option, described in the next section: it is a nice feature for visualizing the generated levels quickly, and easy to set up if you have Dropbox.
Pro tip: if the generated level doesn't look good (this is deep learning, it's hard to give guarantees :P), try changing, in `scripts/generation/script_generate.sh`, the line

```
cpt2=2150000 #cpt2=1200000 #cpt2=1450000
```

to

```
#cpt2=2150000 #cpt2=1200000
cpt2=1450000
```

(this selects a different checkpoint; see below for an explanation).
## Further generation options

[TODO] Make this more user-friendly.
If you open the script `scripts/generation/script_generate.sh` in your editor, you can see other options. You can change `exp2`, as well as the corresponding `cpt2`. These correspond to "experiments" and "checkpoints", and determine where to get the pre-trained network weights. The checkpoints are found in folders named after the experiments; in `cpt2`, just specify which of the saved iterations to use. If you train your own models, you can change these to generate using your trained models. You can also change them to explore the different pre-trained versions available at https://mega.nz/#!VEo3XAxb!7juvH_R_6IjG1Iv_sVn1yGFqFY3sQVuFyvlbbdDPyk4 (for example, DeepSaber 1 used the latest checkpoint in "block_placement_new_nohumreg" for stage 1 and the latest in "block_selection_new"). However, the one you downloaded above is the latest (DeepSaber 2, trained on a more curated dataset), so it should typically work best (though there is always some stochasticity and subjectivity).
You can also change the variable `ddc` to use DDC for stage 1 (deciding at which times to put notes), while still using DeepSaber for stage 2 (deciding which notes to put at each instant that stage 1 chooses). This requires setting up DDC first. If you do, just pass the generated StepMania file as a third command argument, and it should work the same.
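As a rough mental model of how the two stages fit together — a toy stand-in with hypothetical state names, not the actual networks:

```python
# Stage 1 decides *when* notes go; stage 2 decides *which* note state goes
# at each of those times. Both functions below are illustrative stand-ins.
def stage1_placement(activations, threshold=0.33):
    # real model: an RNN over audio features producing an activation per time step
    return [t for t, a in enumerate(activations) if a > threshold]

def stage2_selection(times):
    # real model: picks among the reduced state list; here we just cycle two dummies
    states = ["state_A", "state_B"]  # hypothetical entries of the state list
    return [(t, states[i % len(states)]) for i, t in enumerate(times)]

activations = [0.1, 0.6, 0.2, 0.8, 0.9, 0.1]
level = stage2_selection(stage1_placement(activations))
```

Swapping `stage1_placement` for DDC while keeping `stage2_selection` is exactly the mix-and-match the `ddc` variable enables.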
There is also an "open in browser" option (activated by uncommenting the line `#--open_in_browser` inside the `deepsaber` if block), which is very useful for testing, as it gives you a link to a level visualizer in the browser. To set it up, you just need to set up the script `scripts/generation/dropbox_uploader.sh`. This is very easy: just run the script, and it will guide you through linking it to your Dropbox account (you need one).
A useful parameter to change is the `--peak_threshold`. It is currently set at about 0.33, but you can experiment with it. Setting it higher makes the model output fewer notes; setting it lower, more.
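To see why the threshold controls note density, here is an illustrative peak-picking sketch (not the repo's actual implementation): stage 1 produces an activation per time step, and a note is placed at local maxima above the threshold.

```python
# Keep local maxima of the activation curve that exceed the threshold.
def pick_peaks(activations, threshold):
    peaks = []
    for i in range(1, len(activations) - 1):
        higher_than_neighbors = (activations[i] >= activations[i - 1]
                                 and activations[i] >= activations[i + 1])
        if activations[i] > threshold and higher_than_neighbors:
            peaks.append(i)
    return peaks

activations = [0.1, 0.5, 0.2, 0.4, 0.9, 0.3, 0.35, 0.6, 0.1]
low = pick_peaks(activations, 0.33)   # lower threshold: more notes
high = pick_peaks(activations, 0.7)   # higher threshold: fewer notes
```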
If you dig deeper, you can also disable the option `--use_beam_search`, but the outputs then usually turn out quite random. You can try setting the `--temperature` parameter lower to make them less so, but beam search is typically better.
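For intuition on what temperature does during sampling (a generic sketch of temperature-scaled softmax, not the repo's code): logits are divided by the temperature before the softmax, so lower temperatures concentrate probability on the top choice.

```python
import math

def softmax_with_temperature(logits, temperature):
    # dividing by a small temperature exaggerates differences between logits
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
p_default = softmax_with_temperature(logits, 1.0)
p_cold = softmax_with_temperature(logits, 0.1)   # nearly deterministic argmax
```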
Digging even deeper, there is a very hidden option :P inside `scripts/generation/generate_stage2.py`, on line 59: `opt["beam_size"] = 17`. You can change this number if you want. Making it larger means generation takes longer but is typically of higher quality (it's as if the model thinks harder about it); making it smaller has the opposite effect, which can be worth trying if you want fast generation.
You could also change `opt["n_best"] = 1` to something greater than 1 (and change some other code) to get outputs that the model thought less likely, and explore the range of what it can generate [contact me for more details].
## Example of whole pipeline

Extra dependency: mpi4py (only needed for training/data processing).

This is a quick run through the whole pipeline, from getting data, to training, to generating. Run all of it in the root folder of the repo.

#### Get example data
```
wget -O DataSample.tar.gz https://www.dropbox.com/s/2i75ebqmm5yd15c/DataSample.tar.gz?dl=1
tar xzvf DataSample.tar.gz
mv scripts/misc/bash_scripts/extract_zips.sh DataSample/
cd DataSample; ./extract_zips.sh; cd ..
mv DataSample/* data/extracted_data
```

(You can also download the whole dataset here: https://mega.nz/#!sABVnYYJ!ZWImW0OSCD_w8Huazxs3Vr0p_2jCqmR44IB9DCKWxac)
#### Get reduced state list

```
wget -O data/statespace/sorted_states.pkl https://www.dropbox.com/s/ygffzawbipvady8/sorted_states.pkl?dl=1
```
#### Data augmentation (optional)

Dependencies: librosa, mpi4py (and MPI itself). TODO: make MPI an optional dependency.

You can replace `Expert,ExpertPlus` with any comma-separated list (no spaces) of difficulties, to train on levels of those difficulties.

```
mpiexec -n $(nproc) python3 scripts/feature_extraction/process_songs.py data/extracted_data Expert,ExpertPlus --feature_name multi_mel --feature_size 80
mpiexec -n $(nproc) python3 scripts/feature_extraction/process_songs.py data/extracted_data Expert,ExpertPlus --feature_name mel --feature_size 100
```
#### Pregenerate level tensors (a new fix that makes stage 1 training much faster)

We need to run this command once for each difficulty level we want to train on; here, Expert and ExpertPlus:

```
mpiexec -n 12 python3 scripts/feature_extraction/process_songs_tensors.py ../../data/DataSample/ Expert --replace_existing --feature_name multi_mel --feature_size 80
mpiexec -n 12 python3 scripts/feature_extraction/process_songs_tensors.py ../../data/DataSample/ ExpertPlus --replace_existing --feature_name multi_mel --feature_size 80
```
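The idea behind pregenerating tensors can be sketched generically (a toy cache, not the repo's code): compute each song's feature tensor once and store it on disk, so every training epoch takes the fast path instead of recomputing.

```python
# Compute-once, load-thereafter caching of per-song feature tensors.
import json
import os
import tempfile

CALLS = {"count": 0}

def expensive_feature_extraction(song_id):
    # stand-in for the real per-song feature computation
    CALLS["count"] += 1
    return [len(song_id)] * 4

def get_features(song_id, cache_dir):
    path = os.path.join(cache_dir, song_id + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)  # fast path: reuse the cached tensor
    feats = expensive_feature_extraction(song_id)
    with open(path, "w") as f:
        json.dump(feats, f)
    return feats

cache = tempfile.mkdtemp()
a = get_features("song1", cache)
b = get_features("song1", cache)  # second call hits the cache
```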
Train Stage 1. Either of two options:
- (ddc option):
Train Stage 2:
#### Generation (using the models trained as above)

To generate with the models trained as above, you need to edit `scripts/generation/script_generate.sh` and: change the variable `exp1` to the experiment name from which you want to get the trained weights (following the example above, it would be `test_ddc_block_placement` if you used ddc); change the variable `cpt1` to the latest block placement iteration; and change `cpt2` to the latest block selection iteration. The latest iterations can be found by looking at the checkpoint files in the folders in `scripts/training/` named after the different experiments.
Using the ddc option or the "open in browser" option requires more setting up (especially the former), but the above should generate a zip file with the level.
The "open in browser" option is very useful for visualizing the level. You just need to set up the script
scripts/generation/dropbox_uploader.sh. This is very easy, just run the script, and it will guide you with how to link it to your dropbox account (you need one.)
The ddc option requires setting up DDC (https://github.com/chrisdonahue/ddc), which now includes a Docker component and requires its own series of steps. Hopefully the newly trained model will supersede this.
## Getting the data

[TODO] Describe here the scripts to:
- scrape BeatSaver and BeastSaber to get the training data
- obtain the most common states to use for the reduced state representation
- prepare and preprocess the data
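The "most common states" idea behind the reduced state representation can be sketched as follows (a toy illustration with made-up state names; the real script works over the parsed level files):

```python
# Count how often each note configuration ("state") occurs across levels,
# and keep only the most frequent ones as the reduced vocabulary.
from collections import Counter

def most_common_states(levels, top_n):
    counts = Counter(state for level in levels for state in level)
    return [state for state, _ in counts.most_common(top_n)]

levels = [["A", "B", "A"], ["B", "C", "B", "A"]]
top = most_common_states(levels, 2)  # rare state "C" is dropped
```

Restricting stage 2 to such a vocabulary is what makes the state-selection problem a tractable classification over a fixed list (as in `sorted_states.pkl` above).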
See more at readme in