
Forget Suno: Run the Ultimate AI Music Studio LOCALLY (100% Free)

Full tutorial link > https://www.youtube.com/watch?v=9C_6qNKjgpA

Full ACESTEP XL 1.5 Premium guide for local AI music generation, remix, repaint, stem extraction, audio processing, SAM Audio segmentation, Windows installation, RunPod, Massed Compute, SimplePod and Linux cloud workflows. This tutorial walks through the entire practical pipeline from first launch to final output management: generating fast songs, comparing Turbo/SFT/Base models, reusing prompts and seeds, remixing with reference audio, repainting selected sections, improving generated tracks, splitting vocals/drums/bass/other stems and adding instruments back with Lego mode. You will also see how to trim silence, export timelines for editing software, use SAM Audio with text prompts, and process batches.

Essential links:

📥 App/latest zip: https://www.patreon.com/posts/ACESTEP-XL-Premium-SAM-Audio-157675060

▶️ Windows requirements guide: https://youtu.be/DrhUHnYfwC0

💬 Discord/help/community: https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

Video Chapters:

00:00:00 Intro: ACESTEP XL 1.5 Premium local music, segmentation and processing tutorial

00:00:52 Fast song generation examples across styles in under one minute

00:01:55 Output manifest proof, 40-second generation time and supported models

00:02:29 Turbo/SFT/Base models, LoRA support, GPU presets and Torch Compile boost

00:03:10 Remix feature preview, same-lyrics requirement and responsible usage note

00:04:16 Repaint mode: regenerate and merge only a selected song section

00:05:38 Extract mode: stems, silence trimming, all-stems and batch folders

00:06:30 Lego mode: add an instrument stem such as guitar into existing audio

00:07:25 Audio Processing presets and manual enhancement controls for AI songs

00:08:35 Auto-Editor silent trim for tutorials, videos, audio and workflow export

00:09:48 DaVinci/Premiere/Final Cut/ShotCut/Kdenlive timeline export demo

00:11:01 SAM Audio Segment: BF16 models, VRAM presets and advanced segmentation

00:11:47 SAM outputs demo: vocals, drums, bass, remaining audio and saved files

00:12:47 Custom SAM prompts, semicolon batch segmenting and speech cleanup example

00:14:19 Batch processing, load metadata, manifests, saved settings and presets

00:15:09 Why local open-source models matter and where to run ACESTEP

00:15:55 Windows install begins: Patreon zip, changelog, attachments and download

00:16:53 Windows requirements tutorial before Python/CUDA/C++/FFmpeg setup

00:17:29 Extract zip safely, avoid bad paths and run Windows_Install_or_Update.bat

00:18:24 Automatic VENV, FFmpeg, UV install, model downloads and hash verification

00:19:24 Turbo default vs all-model download for SFT/Base and BF16 safetensors

00:20:32 First Windows launch, default Generate Song test and CMD progress

00:21:44 Model recommendations, VRAM tiers, languages, vocals and MP4 image output

00:23:29 Torch Compile setup for faster repeated generations

00:24:05 Outputs folder, model switching and full remix setup workflow

00:25:24 Practical remix loop: adapted lyrics, strength, reference audio and seed lock

00:28:03 Repaint workflow with source range preview, generated result and comparison

00:29:13 Recap: extraction, Lego, audio processing and SAM text-prompt usage

00:30:20 Windows wrap-up, LoRA training teaser and move to cloud installs

00:31:16 RunPod setup: credits, template, CUDA filters, GPU choice and storage

00:34:53 Upload zip in Jupyter Lab, extract, run instructions and handle installs

00:35:43 RunPod errors, resume behavior, model downloads and hash verification

00:38:04 Start ACESTEP on RunPod with Gradio Live, proxy ports and persistence

00:40:18 Add 7860/7861 ports, verify storage reuse and rerun installer after resume

00:42:10 RunPod connection troubleshooting and Gradio Live recommendation

00:44:12 Fix corrupted VENV/stale handle errors, reinstall safely and retest

00:47:24 Successful RunPod relaunch, default generation, nvitop and loading tips

00:49:26 RunPod first load vs fast inference, 15-second second generation example

00:51:02 Download outputs and delete RunPod pods/storage to stop spending

00:53:30 Massed Compute setup: coupon, Creator image, GPU prices and ThinLinc

00:57:13 Massed install from extracted folder, Linux notes and ultra-fast downloads

00:59:18 Start app on Massed Compute via localhost or Gradio Live

01:00:23 Default Massed generation, nvitop, faster loading and speed test

01:02:03 Sync/download outputs and delete Massed Compute instance safely

01:03:25 SimplePod setup: template, persistent volume, pricing and GPU choice

01:06:39 Jupyter upload, direct file browser, install command and model downloads

01:08:21 Start SimplePod, Gradio Live, default generation and one-time load errors

01:09:31 nvitop monitoring, newer driver/CUDA details and generation completion

01:10:42 Direct output/model downloads through SimplePod file browser

01:11:42 Delete instance, keep storage, relaunch GPU and verify install

01:13:15 Discord, subreddit, changelog, update guidance and support links

01:14:30 Final cleanup: terminate servers, delete storage and LoRA training outro

#ACESTEP #AIMusic #LocalAI #RunPod #MassedCompute #SimplePod #SAMAudio

Updated Full Written Tutorial Package (v5.3 Wildcards, DOCX, HTML, PDF and 20 Page PNG Guide)

I also prepared a polished written tutorial from the full ACESTEP XL 1.5 video and updated it with the new ACE-Step XL 1.5 Premium v5.3 Wildcards feature. It uses real screenshots from the actual application, includes the YouTube tutorial thumbnail/link context, and keeps the complete video chapters and transcript below this section.

Download / Open

Full YouTube video tutorial: https://youtu.be/9C_6qNKjgpA
Patreon app/latest zip and premium post: https://www.patreon.com/posts/ACESTEP-XL-Premium-SAM-Audio-157675060
Written DOCX tutorial: https://github.com/FurkanGozukara/Stable-Diffusion/raw/main/Tutorials/assets/forget-suno-ace-step-xl-15-written-tutorial/ACESTEP_XL_15_Written_Tutorial.docx
Written HTML tutorial: https://github.com/FurkanGozukara/Stable-Diffusion/raw/main/Tutorials/assets/forget-suno-ace-step-xl-15-written-tutorial/ACESTEP_XL_15_Written_Tutorial.html
Compact 20-page PDF tutorial: https://github.com/FurkanGozukara/Stable-Diffusion/raw/main/Tutorials/assets/forget-suno-ace-step-xl-15-written-tutorial/ACESTEP_XL_15_Compact_20_Page_Tutorial.pdf
Compact 20-page HTML tutorial: https://github.com/FurkanGozukara/Stable-Diffusion/raw/main/Tutorials/assets/forget-suno-ace-step-xl-15-written-tutorial/ACESTEP_XL_15_Compact_20_Page_Tutorial.html
All high-resolution tutorial PNG pages: https://github.com/FurkanGozukara/Stable-Diffusion/tree/main/Tutorials/assets/forget-suno-ace-step-xl-15-written-tutorial

What The Written Tutorial Covers

Windows setup: downloading the Patreon zip, extracting safely, running Windows_Install_or_Update.bat, downloading models, and launching with Windows_Start_App.bat.
First generation workflow: prompt, lyrics, model choice, duration, guidance/steps, seeds, output folders, metadata, and reusable settings.
Wildcards: use [option A|option B|option C] in Style, Advanced Music Caption, or Lyrics so one option is picked at generation time; nested wildcards are supported.
Batch folder processing uses the same Wildcards behavior: batch jobs can vary instruments, moods, hooks, or lyric phrases across outputs without manually editing every run.
Wildcards caveat: keep Auto/Enhance Style and Auto/Enhance Lyrics disabled when you need exact wildcard expressions preserved, because those improvement tools may rewrite the prompt text.
Model and speed choices: Turbo, SFT, Base, BF16, VRAM presets, Torch Compile, and when each choice matters.
Creative tools: Remix, Repaint, Extract, Lego mode, Audio Processing, Auto-Editor timeline export, and SAM Audio text-prompt segmentation.
Output management: MP3/WAV/FLAC/MP4 results, manifests, stems, trimmed files, SAM segment folders, and batch processing.
Cloud workflows: RunPod, Massed Compute, SimplePod, Jupyter uploads, Gradio Live links, persistent storage, restart behavior, nvitop monitoring, downloading outputs, and deleting rented resources.
Practical troubleshooting: bad extraction paths, missing requirements, first-load delays, stale/corrupted VENV handles, connection issues, model verification, and safe reinstall/update behavior.
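
The Wildcards behavior described above (one option picked per [a|b|c] group at generation time, with nesting supported) can be illustrated with a minimal Python sketch. This is not the application's actual code: the function name expand_wildcards is hypothetical, and leaving bracketed text without a pipe (for example a section tag such as [Verse]) untouched is an assumption about how tags and wildcards coexist.

```python
import random
import re

# Matches an innermost group that contains at least one "|" and no nested brackets.
_INNERMOST_GROUP = re.compile(r"\[([^\[\]]*\|[^\[\]]*)\]")

def expand_wildcards(text: str, rng: random.Random) -> str:
    """Resolve [a|b|c] groups from the inside out, picking one option per group."""
    while True:
        match = _INNERMOST_GROUP.search(text)
        if match is None:
            return text  # nothing left to expand
        choice = rng.choice(match.group(1).split("|"))
        text = text[:match.start()] + choice + text[match.end():]

if __name__ == "__main__":
    rng = random.Random()  # pass random.Random(123) for repeatable picks
    print(expand_wildcards("[warm [piano|guitar]|cold synth] ballad", rng))
```

Resolving the innermost group first is what makes nesting work: once the inner group is replaced by plain text, the outer group becomes an ordinary flat group.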

v5.3 Wildcards Screenshots

Click either screenshot to open the full-resolution PNG. Embedded previews use height="700" with no fixed width.

20 Page Visual Tutorial

Click any page image to open the full-resolution PNG. Embedded previews are intentionally shown at height="700" with no fixed width, so GitHub keeps the page readable while still allowing full-size inspection.

Video Transcription

00:00:00 Greetings everyone. Today I am going to show you how to install and use the ultimate local music
00:00:09 generation application. This application, ACESTEP XL 1.5, is a combination of applications that not
00:00:16 only generates music but also supports SAM Audio Segmenting and Audio Processing. I will not only
00:00:24 show how to install and use it on your Windows computer but also on RunPod, on Massed Compute,
00:00:32 and on SimplePod. I will show it on all 4 different platforms; Linux users can use
00:00:39 the Massed Compute installation procedures and scripts. So what is this application about? This
00:00:46 is a music generation application mainly, but with so many other features. So let's see them
00:00:52 quickly. You can generate songs in widely different styles in under 1 minute. Let me play some of them:
00:01:04 "Hands in the dirt. Put the work on track. Built from the ground, never life on lack.
00:01:26 Woke up with the fire with the weight on my back. No gold on my wrist till
00:01:31 I'm dressed.". "Woke up with the fire. Put the weight on my back. No gold on my
00:01:39 wrist till I'm dressed in fact. Hands in the dirt. Put the work on track.
00:01:53 Woke up with the fire.". So, as you have seen, I have generated songs in several different styles.
00:01:59 Every one of them is amazing and they only took like 50 seconds. Let me show you. When we open
00:02:05 the outputs folder, we can go to generation and we can open the generation manifest file. It
00:02:12 will show us how much time it took; it took only 40 seconds to generate this full song
00:02:20 which is 3 minutes 11 seconds. How? Because this application is very professionally developed. It
00:02:28 supports all 3 models: ACESTEP XL 1.5 SFT, Turbo, and Base models. It supports full LoRA training
00:02:37 and LoRAs, though that will be another tutorial, not this one. We have GPU optimization presets
00:02:42 selected automatically; therefore, it is fully optimized for my GPU. And most importantly, when
00:02:49 you go to ACESTEP Advanced, you will see that we have a compile model option. It speeds up
00:02:57 generation by up to 50%. This is my addition to the public repositories that you can find about
00:03:03 this application. But believe me, this ACESTEP XL 1.5 premium is nothing like you will find anywhere
00:03:10 else. So what other features does it have? It can remix full songs amazingly. For remix to work, you need
00:03:17 to have the same lyrics. By the way, before I start showing you all these features, I need to
00:03:24 tell you that this application needs to be used responsibly, and this tutorial is for research
00:03:31 and education purposes. Okay, first I will play from the original then I will play from the remix.
00:03:41 beauty queen from a movie scene. So one of the best parts of this application is that it is really
00:04:01 fast. Therefore, you can generate many iterations and get what you want exactly. So, it is so easy.
00:04:09 I will show how to use, install, and everything. This was the remix feature. Let's see the repaint
00:04:16 feature. Now, the repaint is like a remix, but you generate only a certain part of the song. And to
00:04:24 be useful, we need actually LoRAs. When we train LoRAs with our vocals and with our audio—you know,
00:04:32 with our own speaking—it will be perfect. For now, it can be still used to generate fully new parts;
00:04:40 however, it won't be very consistent for lyrics generation. For instrumental regenerations it can
00:04:47 work, but let me show you for now. And as I said, it will be much more useful hopefully with LoRA
00:04:53 training and with vocals training. I'm working on that, so first I will show the original area then
00:04:58 I will show the repainted area. I don't need the noise. I don't. So this way I can totally generate
00:05:12 new sections of the song. However, currently, it is not consistent, so it is looking like
00:05:18 this. Let me show you: we have selected the part from 19 seconds to 23 seconds. Just
00:05:30 as you have seen, it has successfully repainted the selected area and merged it like this. So the
00:05:38 next feature is extraction. This is extremely useful to extract stems from any given song
00:05:45 or audio file. The normal extraction extracts everything like this—the full song—or you can
00:05:51 use our Auto-Editor trim output and just get the extracted section and trim out the silence. Let me
00:05:58 play some of them to show you. So you can extract vocals, woodwinds, brass, fx, and everything that
00:06:09 this ACESTEP XL model supports. You can extract all stems at once. This is very useful,
00:06:15 or you can do batch folder processing to extract from multiple given files. Everything is working
00:06:21 perfectly. To change the configuration and features of the auto editor trim output, just
00:06:26 change the settings set here and it will follow. So the next feature is Lego. Lego is adding a new
00:06:35 stem to the existing audio. So let me demonstrate to you how it works: first I will show the
00:06:41 original then I will show the Lego painted area. So in this example, I have added a guitar stem to
00:06:57 the track. From here you can select the stem that you want to add and it will add that stem. It will
00:07:03 paint that stem into your audio. It can be a full song, it can be just only vocal, whatever. And it
00:07:09 will generate and add it properly. It will follow the music caption; however, not the lyrics. For
00:07:16 lyrics to be valid, you need to use vocals, but you may not get the performance you want. You need
00:07:21 to test it; however, it is working amazing at the moment. The next feature is the audio processing
00:07:28 tab that we have implemented. So this audio processing tab has several different features.
00:07:34 The default feature is processing presets which changes the audio enhancements that you will
00:07:41 see here. This is very useful to improve your AI generated songs. You can read their descriptions,
00:07:47 you can play with their values, or you can use the presets from here and it will update the values
00:07:53 you will see here. And when you click the process file, it will generate the processed audio.
00:08:06 with the weight on my back. Since the original song is already pretty good,
00:08:21 you may not notice the difference, but if you max all of these, you will notice the difference. So,
00:08:26 play with it and see if it is really improving the quality for you. This is a very useful tool that I
00:08:32 have implemented upon the request of someone. So the next feature, which I like very much,
00:08:38 is auto editor trim silent. This is extremely useful. I use this daily in my tutorial videos
00:08:45 to trim out unspoken parts. You know when I am thinking something, I am not speaking;
00:08:51 therefore, that section is silent and I want to trim that part. So I have a video tutorial here,
00:09:00 one of the last tutorials that I have recorded; it is 20 minutes 31 seconds because it has a
00:09:05 lot of pauses. Then I process this file and it becomes 16 minutes. So it trimmed out 4 minutes of
00:09:14 silent parts or maybe when I am taking breaths or some other stuff where I wasn't talking properly.
00:09:23 This is extremely useful. You can use this feature: upload a video and get trimmed output.
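
The idea behind trimming silent parts can be sketched in a few lines of Python. This is only an illustration of amplitude-threshold silence detection, not how Auto-Editor is implemented: the function name keep_ranges, the threshold of 0.02 and the 0.5 second minimum silence are all assumptions chosen for the example.

```python
def keep_ranges(samples, sample_rate, threshold=0.02, min_silence_s=0.5):
    """Return (start_s, end_s) spans to keep, dropping long quiet stretches.

    samples: mono audio as floats in [-1.0, 1.0].
    A quiet stretch shorter than min_silence_s stays inside its segment.
    """
    min_len = int(min_silence_s * sample_rate)
    kept, seg_start, quiet_run = [], None, 0
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            if seg_start is None:
                seg_start = i  # a new loud segment begins here
            quiet_run = 0
        elif seg_start is not None:
            quiet_run += 1
            if quiet_run >= min_len:
                # Close the segment at the point where the quiet stretch began.
                kept.append((seg_start, i - quiet_run + 1))
                seg_start, quiet_run = None, 0
    if seg_start is not None:
        # Close the last segment, dropping any trailing quiet samples.
        kept.append((seg_start, len(samples) - quiet_run))
    return [(a / sample_rate, b / sample_rate) for a, b in kept]
```

The returned spans are the kind of cut list that a timeline export would carry into an editor: each pair marks a section to keep, and everything between pairs is removed.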
00:09:29 You can set the re-encode values from here, but it will automatically try to match the original
00:09:35 video and you don't even need this; you can export only audio or you can export as a workflow which
00:09:44 I am using normally. Davinci Resolve Export. For DaVinci Resolve export, select processing preset
00:09:50 'none' if you don't want to process the audio, obviously enable 'auto editor trim silent',
00:09:55 and select the editor workflow, then use 'browse local file' so that it will generate accurate file
00:10:03 paths. Then click 'process file'. At the end, you will get a file that you can import into DaVinci
00:10:10 Resolve like this from here. Let me show you. Moreover, everything you have generated in our
00:10:16 application will be saved. When you click to open outputs folder, you can see the SAM audio process,
00:10:22 audio processing results all here. They are all saved. And in DaVinci Resolve, I will go
00:10:28 here and I will click timelines, import this file, select the file and okay. And it will fully import
00:10:35 accurately with accurate file path references and resolution, whatever I want. And you can see
00:10:41 that it is properly split like this. So I can use the DaVinci Resolve as I want with this timeline.
00:10:48 This auto editor workflow supports DaVinci Resolve, Adobe Premiere Pro, Final Cut Pro,
00:10:54 ShotCut, and Kdenlive. It is extremely useful; I use this in every tutorial. It is super
00:11:00 useful. The next feature is SAM audio segment. SAM audio segment is a model published by Meta (Facebook)
00:11:08 and it is a state-of-the-art audio segmentation model right now. I have implemented SAM audio
00:11:14 with all of the extra features and optimizations. The downloaded models are all BF16 by me. We have
00:11:22 VRAM presets automatically set according to your GPU. This is a very heavy model. It is processing
00:11:29 with segmentation. You see, we support as low as 10 GB GPUs and even 8 GB GPUs as well. Moreover,
00:11:36 it can extract any audio arbitrarily and it supports quick prompts. You can select multiple
00:11:42 quick prompts from here as well to extract multiple values at once. So, let me show you
00:11:47 some of the results. It also supports the auto editor trim output and compile model to improve
00:11:54 the speed. So, this is segmentation output, this is vocals output, and also drums and bass. It will
00:12:02 only show the first one; the other ones will be saved inside the outputs folder in the SAM audio
00:12:09 segmentation. You see with suffixes of drums, guitar, or vocals, whatever you want to extract.
00:12:19 So it is like this; let me also play some of the others from here. As you can see, it is amazing.
00:12:35 You can also play the remaining audio; it also saves that. This model is very, very powerful.
00:12:40 It's a heavy model compared to ACESTEP extraction; however, this model is more powerful and it also
00:12:47 supports, as I said, custom prompts. For example, for custom prompts, you can enable batch segment
00:12:53 as well with semicolon separation. You can write the stems or segments that you want to extract
00:12:59 and it will iteratively process every one of them and save them in the outputs folder. It
00:13:04 is working amazing. This was a feature that you have requested, so I added it. Moreover,
00:13:09 you can use this SAM audio segmentation for other tasks as well. Let me demonstrate to you.
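
The semicolon-separated batch prompts mentioned a moment ago can be pictured with a small Python sketch. It only shows the splitting step; the function name split_prompts and the choice to skip empty and duplicate entries are assumptions, not the application's documented behavior.

```python
def split_prompts(raw: str) -> list[str]:
    """Split a semicolon-separated prompt string into clean, unique prompts."""
    seen: set[str] = set()
    prompts: list[str] = []
    for part in raw.split(";"):
        prompt = part.strip()
        key = prompt.lower()
        if prompt and key not in seen:  # skip blanks and repeated prompts
            seen.add(key)
            prompts.append(prompt)
    return prompts

if __name__ == "__main__":
    # Each prompt would then be segmented in its own pass, one after another.
    for prompt in split_prompts("vocals; drums ;bass;;Vocals"):
        print(prompt)
```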
00:13:14 This is an everything video: "Hey, hang on. Can you hear me? Super loud. You know what?
00:13:21 You know what actually looks better? You will not believe the day that I have had. Seriously,
00:13:26 no way. I can't wait to tell you all about it when I get home. I'll be over in like...". Now imagine
00:13:31 that I can clean this very well and extract only the speech. To extract only the speech, all I need
00:13:39 to do is select the quick prompt 'speech' from here. I also enabled auto editor trim to trim out
00:13:46 the silent parts, and let me play you the cleaned audio. This is amazing. This is magnificent:
00:13:52 "Hey, hang on. Can you hear me? Super loud. You know what? Do you know what? Actually, what's
00:13:57 better? You will not get the day that I arrived. Seriously, no way. I can't wait to tell you all
00:14:03 about it when I get home. I'll be over in like 10 minutes.". As you have seen, it is just amazing,
00:14:08 amazing—the extracted cleaned audio. So, we have all the optimizations, quantization, attention,
00:14:15 whatever you want. We support so many features and you can use the SAM audio. It also fully
00:14:21 supports batch folder processing so that you can process so many files at once in a given folder
00:14:28 and extract multiple stems, extract multiple, you know, segments, whatever you want. Another
00:14:34 feature is load metadata. I save all the metadata of all generations in the outputs folder. You see
00:14:41 everything is properly saved like this and you can select the manifest and it will load all the
00:14:47 settings that were used to generate this song or audio processing or segmentation, whatever. You
00:14:53 can also have full custom presets; it is fully working. You can save every setting, load them,
00:15:00 you can delete your preset, so whatever you want to do, and they will also be remembered the
00:15:06 next time you restart your application, which is very useful. This local model is now more important than ever
00:15:13 if you know the recent incident that Anthropic stopped serving its very best model. So anytime if
00:15:23 you are dependent on the cloud providers' cloud services, they may stop serving you. However,
00:15:29 this model is local. You can use it as much as you want. It is fully commercially usable. Therefore,
00:15:35 this is the way of the future: having local models, open-source models, using them on your
00:15:40 local computer. If your local computer is not powerful enough, then you can use it on private
00:15:46 cloud services, private GPU providing services like RunPod, like Massed Compute or SimplePod.
00:15:52 I will show all of them. Since we have seen all the features, now we can begin the installation
00:15:59 and the setup, using it on Windows first then the cloud services. So you need to go to this link;
00:16:06 this link will be in the description of the video below and also in the comment section
00:16:11 of the video. Download the latest installer zip file. It will be here or at the bottom when you
00:16:17 scroll down in the attachments you will see it. You see this post is very long. Why? Because I
00:16:23 write every update as a change log here. If you read everything here, then you will learn how this
00:16:30 app is developed—the timeline, what new features were added, how it is working. This is very, very
00:16:37 useful to read, so I recommend you to read it. And at the very bottom, let me show you—yes, it
00:16:42 is very long, I agree with that—you will see the attachment. So you can also download from here.
00:16:48 Let's go to top again, click download. It will download the zip file, but do not start installing
00:16:55 yet. First of all, you need to follow the Windows requirements tutorial. If you haven't followed it
00:17:01 yet, do so; if you have followed it previously, then you are ready. So this is the tutorial that
00:17:07 you need to follow first. Its source is fully up to date, the tutorial is fully up to date. You
00:17:13 can still skip watching this tutorial and try to make it work. It may work depending on your setup;
00:17:18 however, if you want to use torch compile especially, you need to follow that tutorial. Then
00:17:24 move the downloaded zip file into any disk where you want to install. So let's install it into
00:17:31 our Q drive and extract the file like this and enter inside the extracted folder. Make sure that
00:17:37 your folder path doesn't have space characters or non-English characters; if you don't want to have
00:17:44 any issues, I recommend that. Then double-click the Windows_Install_or_Update.bat file to run it. This
00:17:50 will start the installation. It will generate a Python 3.11 virtual environment, so you need
00:17:56 to have Python 3.11. It is also installing local shared FFmpeg runtime. So the installation will
00:18:05 be fully automatic as long as you have followed the Windows requirements tutorial. It will also
00:18:10 automatically download all the models that you need and everything will be inside this folder and
00:18:17 inside the virtual environment that it is going to generate in a moment. First, it is downloading
00:18:23 the necessary shared FFmpeg file so that you won't have any issues. Okay, virtual environment
00:18:29 generated and the FFmpeg are also downloaded like this: ffmpeg shared and ffmpeg shared download.
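
The earlier advice about keeping spaces and non-English characters out of the install path can be checked before running the installer. The following Python sketch is an assumption-based helper, not part of the package: path_problems is a hypothetical name, the example folders are made up, and "non-English" is approximated here as "non-ASCII".

```python
def path_problems(path: str) -> list[str]:
    """List reasons why a folder path may cause trouble for the installer."""
    problems = []
    if " " in path:
        problems.append("contains a space")
    if not path.isascii():
        problems.append("contains non-ASCII characters")
    return problems

if __name__ == "__main__":
    for candidate in (r"Q:\ACESTEP", r"C:\My Files\müzik"):
        issues = path_problems(candidate)
        print(candidate, "->", "OK" if not issues else ", ".join(issues))
```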
00:18:36 The installation will be super fast because we are using UV installation, not depending on anything
00:18:41 else. So the installation has already completed. You can verify the installation by scrolling up,
00:18:46 and it will start downloading the models automatically into accurate folders with accurate
00:18:51 names. My downloader downloads over 16 connections and it also does hash verification; therefore,
00:18:59 all the downloaded models are 100% accurate. They can never be corrupted. It fully supports resume
00:19:06 as well. When you run the Windows installer again, including for an update, it will verify the status and
00:19:12 continue wherever it was left. This will download the SAM audio segment model, audio processing
00:19:18 related models, and it will also download the ACESTEP XL 1.5 turbo model. If you also want
00:19:25 to use ACESTEP XL 1.5 SFT and base model, then once this installation has been completed, you
00:19:32 need to run the Windows download all models.bat file. I will show that after this download is
00:19:39 completed. You can also see the speed; it is 100 megabytes per second, which is my maximum
00:19:45 internet connection speed. Normally shared models were FP32; however, we are not doing inference at
00:19:52 FP32. Therefore, I have generated BF16 versions which is our inference precision. And so you see
00:20:00 the models are now half the size: faster to download, not faster to load. They use less RAM;
00:20:06 everything is much better right now. You see the ACESTEP model is just a single safetensors file. I
00:20:13 also converted the Pytorch PT files into secure safetensors files. Okay, so all the 57 models
00:20:21 downloaded, everything is accurately downloaded, hash verified. If you wonder how much space the
00:20:27 models are taking, it is currently 34 GB which is very reasonable. Let's start the application with
00:20:34 the Windows_Start_App.bat file and while generating some songs I will also download the remaining models.
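
The hash verification and resume behavior described during the model download can be sketched as follows. The tutorial does not say which hash algorithm the downloader uses, so SHA-256 is an assumption here, and sha256_of and needs_download are hypothetical helper names.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large model files never sit fully in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def needs_download(path: Path, expected_sha256: str) -> bool:
    """True when the file is missing or its hash does not match the expected one."""
    return not path.is_file() or sha256_of(path) != expected_sha256.lower()
```

A rerun of an installer built this way can skip every file whose hash already matches and fetch only the missing or damaged ones, which is the resume behavior described in the video.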
00:20:42 So I will run the Windows download all models; for cloud installation it will automatically download
00:20:47 these remaining models, so you won't be needing that since they are fast. So the newly installed
00:20:54 application started. This is an easy-to-use interface which I have developed. First,
00:20:58 you can hit generate song to verify everything is working without changing anything else. This has
00:21:05 a default style prompt to give you an idea and lyrics. In the zip file, you will also
00:21:11 find a file named as ACESTEP lyric generation instructions for LLMs. You can use this file:
00:21:19 give it to your favorite LLM like ChatGPT and make it format your lyrics or your style accordingly.
00:21:27 These ACESTEP XL models really work well to format the lyrics or the style. So you can always watch
00:21:35 the status in the CMD window rather than depending on the Gradio interface itself which I recommend,
00:21:41 and the remaining models are also being downloaded right now which are ACESTEP XL SFT and base
00:21:48 models. When you go to the Advanced tab you will see which models are recommended and supported
00:21:54 for which task. So the turbo model is mostly for generation only because it is fast, high quality,
00:22:02 and the other models are for other tasks. It will automatically select the GPU optimization
00:22:07 preset. If it is too slow, you can lower your tier to generate faster. You can also provide an
00:22:13 image from here and it will generate an MP4 file with the song it generated and with the image you
00:22:21 have provided. It is automatically generating with the selected resolution and it is keeping
00:22:27 the original aspect ratio. You can also set your song instrumental from here, change the
00:22:32 vocal from here, or change the language from here. You see, it supports so many different languages.
00:22:37 This model is currently state-of-the-art. It is certainly better than the Suno free version. It
00:22:43 is rivaling the Suno 5, Suno 5.5, especially with LoRA training. Hopefully, it will be the next
00:22:48 tutorial. It will be better than even the paid Suno 5.5. And you know, Suno is very likely to be
00:22:55 putting some watermark into your generations; therefore, platforms like YouTube know that
00:23:02 song is AI generated. Okay, we can listen to the default song. Woke up with the track. Woke up with
00:23:17 the five. Put the weight on my back. I'm blessed in fact. Okay, it is working amazing as expected.
00:23:27 If you are going to generate multiple songs and if you want to speed up, go to advanced setup and
00:23:31 enable compile model. For this to work, you need to have CUDA installed properly and also MSVC C++
00:23:40 compiler to be installed properly. Everything is explained in the Windows requirements tutorial.
00:23:44 This application will find accurate CUDA and the installed C++ compiler automatically and use it
00:23:51 to torch compile. The first run with torch compile will be very slow since it will compile, but the
00:23:58 subsequent ones will be very fast, and it may need to recompile depending on the settings
00:24:03 that you can change. So this is how you generate songs. Everything will be saved inside the outputs
00:24:10 folder. When you open the outputs folder, you will see that it is saved like this. You can change the
00:24:16 models from here. You can go to advanced, select remix. For selecting remix, I recommend you to
00:24:23 change the recommended model to SFT and upload your generated song or whatever song that you want
00:24:29 to remix, and use the same lyrics here. Change the music caption like rap or whatever you want, and
00:24:36 then you can also select the remix source start and end. It shows the preview here and generate
00:24:43 music. By the way, if you get an out of VRAM error, you can restart the application. Oh, you
00:24:48 see we have just downloaded the model, it didn't see it. So, I need to restart the application. So,
00:24:54 as I said, if you get an out of VRAM error, I recommend you to restart the application. However,
00:24:59 I made a lot of improvements so it shouldn't be necessary; it should automatically switch between
00:25:05 models without restarting your application. So select SFT, go to advanced, select the remix,
00:25:13 and have the same lyrics as the remix song to not have any lyrics issue. Change the caption,
00:25:20 then hit generate music and it will generate the remix. Okay, as you have seen the remix
00:25:27 results were not great. However, one of our users just messaged me and explained to me how he makes
00:25:35 amazing remixes. So I'm now going to show you how he makes them. He made a production loop. It's a
00:25:43 loop; therefore, you need to repeat it. First of all, we are uploading our original audio. Then
00:25:49 write an adapted lyric, not a literal translation. Keep line length, syllable count, vowel shape, and
00:25:56 stressed syllables close to the original phrase. Replace meaning with sound compatible wording
00:26:02 when needed. Set remix strength between 0.92 and 0.99; start in the middle of the range. Listen,
00:26:10 then move tighter or looser depending on whether the vocal follows the original too much or not
00:26:16 enough. How do you set it? The remix strength control is here. So, by default,
00:26:21 it is 1. However, as recommended, try 0.92 to 0.99 to keep the wording more accurate. Add
00:26:30 a clean 30-second voice reference. Now, this is important. This is something that I didn't test,
00:26:36 but he tested and figured that out: use it for timbre and vocal character. Keep it clean, dry,
00:26:42 and centered on the voice you want copied. So, where are you going to set it? It is
00:26:47 here. So, provide the reference audio here to bring the remixed audio even closer to the
00:26:55 original song. And the next step: generate until the base take feels right. So as I have shown,
00:27:01 the generation is fast. So generate again and again and again until you get a good base result.
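The adapted-lyric rule from this loop (keep line length and syllable count close to the original) can be sanity-checked before generating. The sketch below is only an illustration: it estimates syllables by counting vowel groups, which is a rough heuristic for English text, and the function names and the tolerance of 1 are assumptions, not part of the application.

```python
import re

def rough_syllables(line: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels (heuristic only)."""
    return len(re.findall(r"[aeiouy]+", line.lower()))

def compare_lines(original, adapted, tolerance=1):
    """Return (line_number, original_count, adapted_count) for lines whose
    estimated syllable counts differ by more than `tolerance`."""
    mismatches = []
    for number, (old, new) in enumerate(zip(original, adapted), start=1):
        old_count, new_count = rough_syllables(old), rough_syllables(new)
        if abs(old_count - new_count) > tolerance:
            mismatches.append((number, old_count, new_count))
    return mismatches
```

Any line reported by `compare_lines` is a candidate for rewording before spending generation attempts on it. The check says nothing about vowel shape or stress, which still need listening.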
00:27:08 Then lock the seed and disable random. How are you going to do that? I have updated the application
00:27:15 and moved the seed to this place. So it is much easier now and it will automatically
00:27:22 set the last generated seed value. So once you have an accurate and reasonably good remix,
00:27:30 just uncheck this and keep working on the same seed value. Moreover, I have set the default remix
00:27:38 strength to 0.95, so with the latest update, you are going to have both of the features.
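The strength and seed handling described in this loop can be written down as a small sketch. The range 0.92 to 0.99 and the "start in the middle" advice come from the walkthrough; the function names and the step size are hypothetical, and the sketch deliberately does not claim which direction makes the vocal follow the original more closely, since that has to be judged by listening.

```python
import random
from typing import Optional

STRENGTH_MIN, STRENGTH_MAX = 0.92, 0.99  # range quoted in the walkthrough

def starting_strength() -> float:
    """Middle of the recommended range."""
    return round((STRENGTH_MIN + STRENGTH_MAX) / 2, 3)

def nudge_strength(current: float, step: float) -> float:
    """Move the strength by `step` (positive or negative), clamped to the range."""
    return round(min(STRENGTH_MAX, max(STRENGTH_MIN, current + step)), 3)

def next_seed(last_seed: Optional[int], randomize: bool) -> int:
    """Pick a new seed while exploring; reuse the last one once it is locked."""
    if randomize or last_seed is None:
        return random.randrange(2**31)
    return last_seed
```

With `randomize=False`, every call returns the same seed, which mirrors unchecking the random option so that only the lyric change differs between takes.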
00:27:45 Then change one word or phrase at a time and keep working until you get the desired result. He is
00:27:52 also doing some additional stuff like separating stems, tuning vocals, then mixing everything; how
00:27:58 far you take it is up to you. Repaint works the same way: upload the source audio,
00:28:04 set your caption and lyrics, and select the repaint start and end, like 19 to 24. It will
00:28:11 show the preview here. However, we are currently generating a remix; therefore, it is
00:28:15 waiting for that, so we first need to wait for the remix to be completed. Okay, change it to preview.
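To make the repaint window concrete, here is a minimal sketch of how a start and end time such as 19 to 24 seconds maps to sample positions, and how only that section changes while the rest of the track is kept. This is not how the model regenerates audio internally; it only illustrates the window arithmetic, and the function names and the plain-list audio representation are assumptions.

```python
def repaint_window(start_s: float, end_s: float, sample_rate: int):
    """Convert a repaint window in seconds to (first_sample, end_sample)."""
    if not 0 <= start_s < end_s:
        raise ValueError("start must be non-negative and smaller than end")
    return int(round(start_s * sample_rate)), int(round(end_s * sample_rate))

def splice_section(original, regenerated, start_s, end_s, sample_rate):
    """Return a copy of `original` with the window replaced by `regenerated`."""
    first, end = repaint_window(start_s, end_s, sample_rate)
    end = min(end, len(original))
    if len(regenerated) != end - first:
        raise ValueError("regenerated section must match the window length")
    return original[:first] + regenerated + original[end:]
```

Everything before 19 seconds and after 24 seconds is returned untouched, which is the property that makes repaint useful for fixing a single weak section.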
00:28:21 You can watch the status in the CMD window. As you are seeing, I am not changing any optional
00:28:27 parameters or advanced settings because they are all set to maximum quality with the VRAM presets
00:28:34 we have. As you play with this application, you will understand how to use it, how useful it is,
00:28:39 how advanced it is. It has so many amazing features. So, our generation is almost
00:28:43 completed. Yes, it is completed. The result should appear in a moment. And it is here. So, the latest
00:28:50 generated result is here. The latest remixed area is here. You can also see the original inputs.
00:29:03 Yeah, this didn't work very well, so I probably need to change my caption or generate more. This
00:29:08 is just a very basic caption, but this is how it works. When I go to repaint, it's the same way.
00:29:13 For extracting, I need to change the model to this one, the base. Then Lego and extraction
00:29:20 will become available. I have already shown how to use them. And as you have noticed, everything has
00:29:26 very good explanations in the Gradio interface, so read every one of them to understand. I can
00:29:31 also use audio processing: just upload the file and click process file, or I can use the auto-trim
00:29:39 editor. We already saw them. SAM Audio works the same way: just select your file and select the option
00:29:46 you want, like text prompt. I have never actually tried the span option, so I can't say
00:29:52 whether it works as expected, and it is very hard to use. The same goes for the visual mask; I have never used it either,
00:29:58 but the text prompt works perfectly. The other two are very hard to use; they require a lot of manual
00:30:03 work. However, text works perfectly. Type a custom prompt for whatever you want to extract, as I
00:30:09 showed at the beginning of the tutorial, and it will work. My installer has a properly compiled
00:30:16 Flash Attention; therefore, it should work on your Windows machine or on cloud services. So, this was how to
00:30:24 do inference with this application on Windows. LoRA training is also fully working and it will
00:30:31 come in the next tutorial, hopefully. I am still researching; I will try to train a vocal voice
00:30:37 and generate with that voice consistently and, of course, with the singing style. It
00:30:42 will be an amazing, epic tutorial, hopefully. So now I will show how to install and use it on RunPod,
00:30:48 on Massed Compute, and on SimplePod. If you don't have a powerful GPU, if you want to
00:30:53 generate faster, if you want to scale up your generation, then you need some cloud services
00:30:58 and these are the best services. So in the zip file, we already have all the instructions.
00:31:03 Let's begin with the RunPod. You see we have a RunPod SimplePod ACESTEP instructions.txt
00:31:09 file. They are all in the zip file; you need to first download it and extract it. When I open it,
00:31:14 it will show me all the instructions. So let's begin with the RunPod. First of all,
00:31:19 please use this link to register; I appreciate that very much. Once you register and sign in,
00:31:24 click this plus icon and add some credits. You see I am currently spending some money because I have
00:31:31 some storage. Okay, let's delete them because I will show you how to use the storage as well. Man,
00:31:38 if you forget this, you will spend money like me. This is the homepage, the newest homepage of the
00:31:44 RunPod. You registered, you added some credits, then as a next step, you will see that we have the
00:31:49 template. All of my application installers work the same way; so once you understand the logic,
00:31:56 you will be able to use every one of them. So click this; it will open the correct template
00:32:03 and select it. This may change depending on the application. Since we are using CUDA 13,
00:32:08 there is one important thing: go to filter and select the CUDA versions 12.8, 12.9,
00:32:17 and 13. Why? Because RunPod is not updating its NVIDIA drivers. The issue is the NVIDIA driver,
00:32:24 not the CUDA version, because we can install the CUDA version, but we cannot install the NVIDIA
00:32:29 driver. This way we are going to get a machine that has a suitable NVIDIA driver that will
00:32:35 run CUDA 13. After that, you can also play with the other filters like secure / community (I
00:32:41 recommend secure), the RAM amount, disk type, and so on, and then apply filters. It will
00:32:47 select it. Now if you don't select any permanent storage, it will create a temporary storage that
00:32:54 you will use on this instance. If you want to use permanent storage which you can resume later,
00:32:59 you need to add a volume disk. So you see currently we don't have a network volume here;
00:33:04 therefore, I will go to storage first and create a network volume in here. You need to select a
00:33:09 region because it's region-specific, and depending on the GPU that you are going to use, you need to
00:33:14 choose your region accordingly. So I am going to select a region which has a lot of RTX Pro 6000,
00:33:21 which is one of my favorite GPUs, or RTX 5090. For this one, it would work too. Okay, let's see
00:33:28 what we have here: high performance. The US regions are usually better. So this is a Europe region but
00:33:35 it has a lot of RTX Pro 6000. It will be slow but I need to select this. I hope it won't
00:33:41 be too slow. As an example, I will make this 200 GB and create the network volume. Now my volume
00:33:47 is ready. Therefore, I will click this again to get the correct template. The template is selected,
00:33:54 my filters are kept, very nice. And now I am going to select my GPU. Wow, the prices are all
00:34:00 increased because of the demand, and persistent storage is network volume. Okay, now I have it,
00:34:07 but it doesn't show whether it has selected it or not. It says automatically create; oh, here. Yes,
00:34:12 from here I can select my persistent storage like this. So don't make it automatically create. But
00:34:19 I see that we could create from this as well, so the interface keeps changing. Pay attention
00:34:24 to the interface. The most crucial part is using the correct template, then deploying the pod,
00:34:30 and also selecting the correct NVIDIA driver filter: CUDA 12.8, 12.9, or 13. Okay,
00:34:38 this template was already used, very nice, so we didn't wait. When you go to the details or
00:34:45 telemetry it will show the driver; yes, this is the driver version. This driver supports 13,
00:34:51 therefore it should work very well. So go to connect and click Jupyter Lab. If you don't
00:34:55 see this is enabled, you can refresh this page. The Jupyter Lab is starting. Then I will drag
00:35:01 and drop the downloaded zip file into here like this. You can also use this upload icon to upload
00:35:10 your zip file. Once the upload is completed, it will show the upload status here. Right-click and
00:35:15 extract archive, then it will extract like this. Then there is RunPod SimplePod instructions and
00:35:22 there is this installation command. Always read these instruction files. Open a new terminal
00:35:27 from this plus, copy-paste it, and it will do all the installation and model downloads. This will
00:35:33 download all of the models, not only the turbo model, because this is a cloud service; therefore,
00:35:38 it is fast. All you need to do now is wait; depending on your pod, it may take a lot of time
00:35:44 or it may be fast. However, RunPod can break quite often; so if you get any error, just get
00:35:50 a new GPU. Unfortunately, there is no other way. Currently, this one is looking like it's working,
00:35:56 but you can never be sure, and it is already slow. So the installation started and it is installing
00:36:02 the libraries. We are using UV installation and even with UV, RunPod is unfortunately slow. So
00:36:10 depending on your luck, the server, and the GPU you get, you may wait a lot of time or it may be
00:36:17 fast. During the installation, unfortunately, an error occurred. This is an operating system error;
00:36:24 that means this is a RunPod error. To fix this issue, run the installer again and it will try to
00:36:32 resume from wherever it was left, so it should be faster when you run it the second time. As I said,
00:36:39 RunPod is very erratic. It may throw errors, not all GPUs work, so it is totally unpredictable in
00:36:50 performance. However, it is the most widely used one and it has so many different GPUs;
00:36:56 therefore, this is like a trade-off between some features and some errors or unpredictability. You
00:37:05 see, running the installation a second time is much faster. I hope this time it won't have any issues;
00:37:11 we will see. But this operating system error is 100% related to RunPod's shared storage and
00:37:19 network system itself. Now it is starting to download all of the models. You don't
00:37:24 need to run an additional command. If you get any errors during the model download as well,
00:37:29 just run the installation again; it just resumes, it doesn't start from the beginning. The model
00:37:34 download speed is looking decent. Since I am downloading with 16 connections and then merging the parts,
00:37:41 the downloads are really fast and optimized. It's also verifying the hash values
00:37:46 so that you will never have corrupt model issues, because corrupt models are very annoying:
00:37:52 everything looks normal but it doesn't work or it produces inaccurate results. Yes, the speed
00:37:58 is also decent, around 100 megabytes per second. All right, so the installation has been completed.
00:38:03 All the models have been downloaded. Now we are ready to start launching the application.
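The hash check mentioned during the download step, and the "skip if already verified" behavior on a re-run, can be sketched as follows. This is an illustration only: the walkthrough does not say which hash algorithm the installer uses, so SHA-256 is an assumption, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large model files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def needs_download(path, expected_sha256: str) -> bool:
    """True when the file is missing or its hash does not match the expected value."""
    target = Path(path)
    return not target.is_file() or sha256_of(target) != expected_sha256.lower()
```

A downloader built this way only fetches files for which `needs_download` returns `True`, which is why a second installer run can finish quickly, and a truncated or corrupted file is caught instead of silently producing bad results.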
00:38:10 For launching, return to the RunPod SimplePod instructions.txt file. All of my applications have
00:38:17 that. Copy the start command, open a new terminal, and paste the start command. This will start as a
00:38:25 Gradio live shared link. Starting may take time depending on your pod speed. Okay, so the
00:38:32 application started. We can see that Gradio live is here. If you don't want to use Gradio live,
00:38:39 you need to add a port here; this will restart your pod, so be careful with that. So you need
00:38:46 to edit this pod and add a port here to connect through the RunPod proxy. I will show that
00:38:54 after this. Okay, so the application started. Now it is the same as using it on Windows; I'm not going
00:39:01 to repeat that. However, I'm going to show you how you can resume without redoing everything
00:39:07 by using your permanent storage. To resume it, I'm going to delete this pod. You
00:39:13 see when I stop this pod, it will be like this, and since I was using permanent storage, I will
00:39:19 be able to resume it. Let's verify that. Okay, it doesn't show, so I'm not sure if it started
00:39:26 with my permanent storage now. Okay, it says that yes, it was using this, so it should start. Okay,
00:39:32 I'm going to terminate now. You know, when you terminate it, everything will be deleted unless it
00:39:37 is in your permanent storage, so let's terminate the pod. And our template is still selected,
00:39:43 but let's begin from the beginning to verify: so open the instructions.txt and double-click
00:39:49 the template. You see it is selected, then as a filter, I'm going to select the disk and I'm
00:39:56 going to set the CUDA filter. I also recommend selecting maximum RAM and this disk. Okay, apply
00:40:04 filters. So I'm also going to select my network volume. You see when I select the network volume,
00:40:10 there will be fewer options because it will filter based on my network storage region. Okay,
00:40:18 selected; everything is ready. But before I start, I'm also going to add the port; the port was 7860.
00:40:27 So let's also add it. How am I going to add the port? To add the port, I'm going
00:40:33 to click set overrides and add it here: 7860 like this and set overrides. Even though you are using
00:40:45 my template, you can add some stuff like this and deploy the pod. Now it will be almost instant
00:40:52 to start the application. You will not spend all that time reinstalling, and every file you
00:40:58 have generated, everything you did, will be kept. This is how the permanent storage system works.
00:41:04 SimplePod is exactly the same as RunPod; I will also show that. Okay, now when I click the pod
00:41:10 instance, you see there is also the 7860 port. First of all, connect with Jupyter Lab so that we
00:41:16 can start the application. I recommend running the installation again to avoid any issues. So
00:41:22 open a new terminal, run the installation. This time the installation should be almost instant.
00:41:28 Let's see. Okay, it's saying that requirements are verified; everything will be just verified.
00:41:33 It will only install FFmpeg again. Okay, it is verifying the packages, everything is getting
00:41:40 verified right now. It is skipping all the models since their hash values were already verified. So,
00:41:47 it will take like 1 minute to verify everything and get ready. This is the recommended way with
00:41:52 all my applications; I recommend you run the installer again to be sure you don't have
00:41:57 any issues after you start from your permanent storage again. Okay, everything is verified. It is now just
00:42:04 reinstalling FFmpeg, then we will be ready. Yes, ready. Now return and start. So this
00:42:10 time we will be able to connect both from proxy and Gradio live. However, I recommend connecting
00:42:18 from Gradio live because RunPod proxy always causes issues for me—it doesn't work very well.
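For the proxy option, it helps to know what address to expect and whether the application is actually listening before blaming the proxy. The sketch below assumes RunPod's usual `<pod-id>-<port>.proxy.runpod.net` address pattern; the pod ID shown in the usage note is a made-up placeholder, and both function names are hypothetical.

```python
import socket

def runpod_proxy_url(pod_id: str, port: int = 7860) -> str:
    """Build the HTTP proxy address for an exposed port (assumed URL pattern)."""
    return f"https://{pod_id}-{port}.proxy.runpod.net"

def is_port_open(host: str = "127.0.0.1", port: int = 7860, timeout: float = 1.0) -> bool:
    """Return True when something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0
```

Running `is_port_open(port=7860)` and `is_port_open(port=7861)` in a pod terminal shows which port the application really bound to, and `runpod_proxy_url("abc123xyz", 7861)` gives the matching address to try in the browser.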
00:42:25 Therefore, I recommend the Gradio live option always. Okay, it has started on the local port
00:42:30 and also on Gradio live. So let's open the Gradio live, and to open the local port, you see there is
00:42:36 this 7860, which is the port it starts on. Okay, looks like we have to use 7861; sometimes it
00:42:44 requires you to use the port plus 1. This is how the RunPod proxy works. So let's also show that;
00:42:51 I will just repeat the steps. You see from here I can also edit the pod and I will just add 7861.
00:42:58 When I do this, it restarts the pod; therefore, I will need to start it again, but once you
00:43:06 restart, you don't need to run the installation again. Restarts keep the workspace installations;
00:43:13 so after the restart—okay, it is done—I will just connect back to the Jupyter Lab interface. Yeah,
00:43:20 it will take some time. Okay, it started. This time I will just run the start command, not the
00:43:26 installation, because I only did a restart. When you stop the pod or terminate it and start again,
00:43:32 you need to run the installation; if you just restart, you don't need to run the installation
00:43:37 again. It is restarted; I will just run the start command. This start command also updates
00:43:46 the application if there are newer versions. Okay, the application started. So, let's open the Gradio
00:43:51 live and we should be able to connect from 7861. Yes, yes, I know this is awkward, this is weird,
00:43:59 but this is how the RunPod proxy works. Okay. I don't know, maybe I should block or unlock;
00:44:05 let's try block. Okay, Gradio live starts. Yeah, proxy didn't start, but this is the way you can
00:44:11 try it. Let's generate a song with the default values on RunPod. It will be slow to load models;
00:44:17 their network storage system is slow. Okay, we got an error. But there is no need to panic;
00:44:24 I know the reason, I know the solution. Remember when we were installing, we had this RunPod
00:44:32 network-related error? The reason for this error was that UV installs with multiple threads based on
00:44:39 the number of CPU cores, and this machine has 256 CPUs. Therefore, it was spawning so many threads
00:44:48 to install that it caused this issue and corrupted our virtual environment. Therefore,
00:44:54 we got this error. So, how are we going to solve this error? First of all, restart your pod. Then
00:45:01 we are going to delete our virtual environment and run the installer again. Also don't worry,
00:45:06 I have updated the zip file; in the future, the installation will be limited to 4 threads.
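One way to cap uv's parallelism is through its concurrency environment variables. The sketch below sets `UV_CONCURRENT_DOWNLOADS`, `UV_CONCURRENT_BUILDS` and `UV_CONCURRENT_INSTALLS` to the same limit; whether the updated installer uses these variables or some other mechanism is not stated in the walkthrough, so treat this as an assumption. The function name is hypothetical.

```python
import os

UV_LIMIT_VARS = (
    "UV_CONCURRENT_DOWNLOADS",
    "UV_CONCURRENT_BUILDS",
    "UV_CONCURRENT_INSTALLS",
)

def uv_env(max_threads: int = 4, base_env=None) -> dict:
    """Return a copy of the environment with uv's concurrency capped."""
    env = dict(os.environ if base_env is None else base_env)
    cap = str(max(1, max_threads))  # never go below one worker
    for name in UV_LIMIT_VARS:
        env[name] = cap
    return env

# Example (not executed here): pass the capped environment to the install step.
# import subprocess
# subprocess.run(["uv", "pip", "install", "-r", "requirements.txt"],
#                env=uv_env(4), check=True)
```

On a 256-core machine with shared network storage, a cap of 4 trades install speed for fewer simultaneous file operations.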
00:45:11 If your machine is better, like if you're on SimplePod or if you trust your machine,
00:45:16 you can increase this to improve the speed of installation; however, on RunPod, I recommend
00:45:22 making it 4. So first we will delete the virtual environment. To delete the virtual environment,
00:45:26 we are going to use this command. Let's connect back to the Jupyter Lab interface. Open a new
00:45:32 terminal, copy-paste. It will delete the virtual environment; this is the only way to fix a
00:45:37 corrupted virtual environment. Then copy the installation command again. It is still deleting
00:45:43 the virtual environment; when it deletes the virtual environment, it will disappear from here.
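The actual delete command is in the instructions file and is not reproduced here. As a rough equivalent, a guarded removal could look like the sketch below: it only deletes a folder that contains `pyvenv.cfg`, the marker file that Python virtual environments normally have, so a mistyped path does not wipe something else. The folder name in the usage note is hypothetical.

```python
import shutil
from pathlib import Path

def remove_venv(venv_dir) -> bool:
    """Delete a virtual environment folder; refuse if it does not look like one."""
    path = Path(venv_dir)
    if not (path / "pyvenv.cfg").is_file():
        return False  # missing marker file: leave the folder alone
    shutil.rmtree(path)
    return True
```

For example, `remove_venv("venv")` returns `True` after deleting the environment, after which the installer can rebuild it from scratch. On slow network storage the deletion itself can take a while, as seen in the video.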
00:45:47 You can refresh to see. You see the RunPod network storage system is slow; therefore, Massed Compute
00:45:54 is a much better alternative. However, they don't have a permanent network storage system; that is
00:45:59 their disadvantage. Or you can use SimplePod; I didn't encounter such issues, and their network
00:46:04 storage system is also fast. Okay, it is deleted. Now I will open a new terminal and run the install
00:46:10 command again. This time I will limit it to 4. So if you get such errors, you can reduce it
00:46:15 even to 2 or 1. This is the way to prevent such operating system errors. You see stale file handle
00:46:24 errors on RunPod or on any system with shared network storage; on my Windows computer or on Massed
00:46:32 Compute, I never encountered this issue, nor on SimplePod. However, this is extra information
00:46:37 for you in the future to fix such issues yourself. Obviously, since we are running 4 threads instead
00:46:44 of the maximum, it will be slower to install, but it should work perfectly fine. Okay,
00:46:49 the installation went smoothly this time; we do not see any errors. Everything is looking perfect. The
00:46:57 speed was decent, not very bad, but it was slower obviously, and since we had already downloaded everything,
00:47:04 the models are just skipped as already verified, and now I can start the application again. So this was
00:47:12 important information for any other application in the future where you may encounter this problem,
00:47:20 and now you will know the solution. Okay, the application started; let's open the Gradio live
00:47:26 link. You see it is like this. If you encounter any issues with Gradio live, you can open it
00:47:32 in a private window; it may help sometimes, or restart your browser entirely. As a rule of thumb,
00:47:38 always run the application with default values first, then verify it is working. You see it said it
00:47:44 could not parse the server response, but then it started; that was because we clicked too early. Probably
00:47:50 Gradio takes some time to load, especially when you change the model, while it waits to load the values
00:47:56 correctly. So the processing started; we have no error this time and it should work. Let's
00:48:01 see. Once the model is loaded, the subsequent generations will be very fast, but the initial
00:48:07 model loading will be very slow. And if you want to monitor the VRAM usage, you can open a new
00:48:13 terminal. Type pip install nvitop like this. Then type nvitop and it will open the nvitop window. It
00:48:23 shows the driver version and the CUDA version of the driver. This CUDA version is the highest version
00:48:28 the driver supports, not what the template has installed. So this is important. So it is starting to load the model; we
00:48:35 have to wait because model loading is slow, since the disk system on RunPod is slow. But it is
00:48:42 loading. We have full optimizations to speed up this process on Windows, on Linux, and on
00:48:48 cloud machines. We can also follow the status here. You see it is using Flash Attention;
00:48:53 I have compiled this myself so that it supports every cloud GPU. This compile took over 12 hours;
00:49:01 it was really brutal to make it right. We are using torch 2.11 with CUDA 13. Hopefully, I will
00:49:07 upgrade applications and compiles to torch 2.12; I'm waiting for it to mature. We are also using
00:49:14 torchao; this is just torchao loading, but torchao is used in the inference. So, this is a fully
00:49:21 optimized and highest quality, highest performance programming application. It took me weeks to make
00:49:28 this application and get it to this point. I will show the second run as well. You see the
00:49:34 first run is taking about 200 seconds because nearly all of that time goes to loading the model. Okay,
00:49:40 the inference started. So the models were loaded; inference is really fast as you can see. Wow,
00:49:47 really, really fast. It is using like 24 GB of VRAM; you can perfectly run this on an RTX 5090
00:49:53 as well. So the first song has been generated. It may take a little bit of time to appear here
00:50:00 because of the Gradio live. You can click this download; if it doesn't appear here, you can also
00:50:05 download from outputs which I will show. Let's listen to it. Excellent, the default is working.
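The driver version and VRAM usage mentioned above can also be read programmatically with `nvidia-smi`, which is present wherever the NVIDIA driver is installed. The sketch below parses its CSV output; the minimum driver branch of 580 for CUDA 13 is an assumption taken from NVIDIA's release notes and should be checked against the current documentation. Function names are hypothetical.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=driver_version,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]
MIN_DRIVER_MAJOR_FOR_CUDA_13 = 580  # assumption: verify in NVIDIA's release notes

def parse_gpu_line(line: str) -> dict:
    """Parse one CSV line such as '580.65.06, 24150, 32607' (memory in MiB)."""
    driver, used, total = [part.strip() for part in line.split(",")]
    major = int(driver.split(".")[0])
    return {
        "driver": driver,
        "used_mib": int(used),
        "total_mib": int(total),
        "cuda13_ready": major >= MIN_DRIVER_MAJOR_FOR_CUDA_13,
    }

def read_gpus() -> list:
    """Run nvidia-smi and return one dict per GPU (requires an NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]
```

Calling `read_gpus()` while a song is generating gives a quick reading of how close the run is to the card's VRAM limit.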
00:50:17 Let's generate another song and see the speed. Okay, generation started; it should be ultra-fast
00:50:25 since it will use the already loaded model. We are really fast, I can see that already. So, let's see
00:50:32 how many seconds. Okay, it is already done. Oh my god, yep. You see how many seconds it took? Let's
00:50:39 go to outputs. This is the 3rd song; let's see the manifest.json. So, it took literally—let's see—15
00:50:48 seconds. So, on this GPU, you can generate 4 full songs in 1 minute. In 1 hour, you can generate 240
00:50:58 songs and it costs only 2 dollars per hour. And how do you download every generation? Go back to your
00:51:06 workspace, go to ACESTEP Premium, and right-click outputs and download as an archive, and it will
00:51:12 zip it and start downloading everything like this. You can of course download from here;
00:51:17 click this and it will download. So how do you terminate the machine properly? You can stop
00:51:22 the machine. When you stop the machine, it will still use some credits. You see it shows 0. Why?
00:51:28 Because it is using credits for my storage. So currently it is costing 14 dollars per month.
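The cost figures quoted in this part can be checked with a few lines of arithmetic. The inputs (15 seconds per song, 2 dollars per hour, 14 dollars per month for the 200 GB volume created earlier) come from the walkthrough; the function names are hypothetical, and the throughput only holds once the model is already loaded, since the first run was far slower.

```python
def songs_per_hour(seconds_per_song: float) -> float:
    """How many songs fit into one hour of back-to-back generation."""
    return 3600 / seconds_per_song

def cost_per_song(hourly_rate: float, seconds_per_song: float) -> float:
    """GPU rental cost attributed to a single song."""
    return hourly_rate / songs_per_hour(seconds_per_song)

def storage_rate_per_gb(monthly_cost: float, size_gb: float) -> float:
    """Monthly storage price per gigabyte."""
    return monthly_cost / size_gb

print(songs_per_hour(15))                   # songs per hour at 15 s each
print(round(cost_per_song(2.0, 15), 4))     # dollars per song at 2 dollars/hour
print(round(storage_rate_per_gb(14, 200), 3))  # dollars per GB per month
```

The storage line is the part that keeps billing after the pod is stopped, which is why deleting an unused network volume matters.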
00:51:35 You need to also delete this if you want to not use any storage. I'm going to just delete this
00:51:41 because I am done with it. But this is how you use the permanent storage system. How to use RunPod,
00:51:48 how to solve issues. Basically, we have explained everything. You see I have to remove this first,
00:51:54 so I will just terminate the pod. So what is the difference between permanent network storage and
00:51:58 the regular volume disk? So let's go to pods; everything is selected. Let's select this GPU and
00:52:04 let's select the volume disk. So you see network volume uses my permanent network storage; volume disk uses
00:52:11 the pod's own volume, and volume disk also uses some credits by itself. And this time when I stop this you will
00:52:19 see that it is going to use this much money. However, I can resume this if there is a GPU
00:52:24 later. And to make it 0, I need to terminate the pod. Okay, let's also delete my permanent
00:52:24 storage too, so I will not waste money, and now I am spending 0. Let's verify that. Okay,
00:52:30 it still shows my spend rate; let's refresh. Wow, it is taking some time to update, I guess. But
00:52:38 I am spending 0 right now because I don't have any pods or any storage here. Wow,
00:52:46 it still didn't delete it; man, the system became very slow. Oh, I accidentally created this with
00:52:53 the network volume just now. Why did I create it accidentally? Let me show
00:52:59 you so you won't make the same mistake: if you select the network volume here,
00:53:05 it will automatically create it. This is actually misleading; this is how I wasted money. Yes,
00:53:11 I just understood it. So you need to select volume disk if you don't want such accidental spending;
00:53:16 if you choose network volume, it will automatically create one unless you select your existing network
00:53:23 volume, or you need to select volume disk. Okay, this was everything about the RunPod part. Now
00:53:33 I am going to move into Massed Compute. So for Massed Compute we have Massed Compute instructions
00:53:41 read.txt file. Open it. Please use this link to register; I appreciate that, this is important.
00:53:49 After registration, add some credits to your account from billing. Then go to deploy. In the
00:53:56 deploy menu, it will show you available GPUs like this. Now the main difference of Massed Compute is
00:54:02 that it is always super fast; its speed is like 20 times faster than RunPod. You won't get such
00:54:09 errors and it is cheaper than RunPod. However, it doesn't have a permanent network storage system;
00:54:16 so every time you have to install and download the models again. So let's use the same GPU.
00:54:22 From category, select 'creator'. From image select 'SECourses'. And you see currently it is
00:54:27 2.19 dollars per hour; it is the same as RunPod. However, we have a coupon 'SECourses'. This is
00:54:34 working on every GPU that you can see here, every one of them. So, this is a great GPU;
00:54:40 you can also take even better ones like the H200 NVL or H100. But this is probably
00:54:49 the best for inference as a price-performance balance. Let's verify. And it is now 1.64 dollars. You
00:54:56 see it is 25% cheaper than RunPod. Then click deploy. The initial startup may take a little
00:55:04 bit of time; you have to wait for initialization. Meanwhile, let's install the necessary application
00:55:10 to connect, which is ThinLinc Client. You can also connect from the browser, but I don't recommend
00:55:15 that. So, open this ThinLinc Client link from here. Let's allow all downloads according to your
00:55:22 operating system; since I am on Windows, let's download it. Installation is so easy: open it,
00:55:28 click Yes, then click next, accept, next, install. That's it; everything is default. Run ThinLinc
00:55:34 Client. Then you need to set the shared folder; so you can share the zip file. Options—you can also
00:55:41 read the other options here—go to local devices; I only enable clipboard synchronization and drives.
00:55:46 Then click details, and in here add a shared folder like this with read and write permission.
00:55:52 So this is my folder on my PC. And okay. So copy the file into your shared folder; so it is in
00:55:59 my shares folder right now, the installation zip file. Then I need to just wait for initialization
00:56:04 to be completed. Sometimes you may refresh this to verify whether it is done or click
00:56:09 this running instance and it will refresh, but it should automatically show you as well. Okay,
00:56:14 so our machine is ready to connect. You see, click details, copy the login URL, paste it into here,
00:56:21 copy the username, paste it into here, and copy the password and paste it into here. Then connect,
00:56:28 click continue. Just wait a little bit. The ThinLinc Client may be slow for big file transfers
00:56:35 but for small files it should be fast. For big file transfers, you can use something like Google
00:56:40 Drive or OneDrive. So this is the interface: go to home, go to thin drives. You will see there
00:56:46 is your shared folder, whatever the name you have given. And inside this, I will see my zip file.
00:56:51 You can also open Google Chrome here, login to your Patreon, and download from here. So it is
00:56:57 basically like your Windows but it is running on a cloud machine and this is a Linux Ubuntu system.
00:57:04 So our installer zip file is here; drag and drop it into downloads. Do not run anything here;
00:57:10 you have to drag and drop them into your downloads folder. Right-click and extract here. Enter inside
00:57:15 the folder, open Massed Compute instructions read.txt file. Copy this command, return back
00:57:22 to the downloads, enter inside this folder, and open in terminal. You see this location
00:57:29 is so important: I am in this folder where these files are located. Right-click and paste and hit
00:57:35 enter. Then it will start the installation. The installation on Massed Compute will be ultra-fast;
00:57:40 let's watch it in real time because their disk system is really fast. When you get this software update
00:57:46 notification, just click cancel; you don't need it. So it is installing Python 3.10, then installing
00:57:53 packages with a virtual environment. If you are a Linux user, the Massed Compute installation is what
00:57:58 you need. It should work perfectly fine on your local Linux system as well. So
00:58:04 for Linux users, I recommend my Massed Compute scripts. So the installation almost completed;
00:58:10 I mean this is like 100 times faster than RunPod. Yes, it is done. You see it took like 1 minute to
00:58:16 install. Then the download begins and the download speed is also amazing on Massed Compute; their
00:58:21 network and their disk speeds are unchallenged. There are no other cloud services like them. So
00:58:27 you are seeing it in real time: it is just downloading, verifying, downloading, verifying.
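The "16 connections, then merge" approach generally works by splitting a file into byte ranges, fetching each range over its own connection, and writing the parts back in order. The sketch below shows only the range arithmetic; how the installer's downloader is actually implemented is not shown in the video, so the function name and details are assumptions.

```python
def split_ranges(total_bytes: int, connections: int = 16):
    """Split [0, total_bytes) into contiguous inclusive (start, end) byte ranges."""
    if total_bytes <= 0 or connections <= 0:
        raise ValueError("size and connection count must be positive")
    connections = min(connections, total_bytes)  # never create empty ranges
    base, extra = divmod(total_bytes, connections)
    ranges, start = [], 0
    for index in range(connections):
        length = base + (1 if index < extra else 0)  # spread the remainder
        ranges.append((start, start + length - 1))
        start += length
    return ranges
```

Each pair maps directly to an HTTP header of the form `Range: bytes=<start>-<end>`, and because the ranges are contiguous and ordered, merging is just writing the parts one after another before the hash check runs.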
00:58:32 Download speed is around 500 megabytes per second, even faster. The merge speed is very fast since it
00:58:38 is downloading with 16 different connections then doing hash verification. I'm trying to
00:58:43 unify scripts as much as possible; whatever is on Windows is also on Linux and on cloud services.
00:58:50 I'm trying to use the same stuff everywhere. Even the requirements and other stuff are the same. So,
00:58:55 it is just downloading and verifying really, really fast. Yes, 800 megabytes per second;
00:59:02 that is more than a 5 gigabit internet connection speed. It is just amazing. All right, so the
00:59:08 installation has been completed. All of the models are already downloaded. We can even see the full
00:59:13 size of the folder; so it is 65 GB. Then return back to Massed Compute instructions read.txt file.
00:59:22 Copy this starting command, return back to the folder, and open a new terminal. The location of
00:59:28 the terminal is super important; right-click and paste. Then it will start the application. We can
00:59:33 see that the start of the application will also be really fast. There is one difference with this:
00:59:38 you see it is starting like this, so I can open the localhost as well here since it is
00:59:46 a desktop-based operating system. So you see now this is running inside Massed Compute,
00:59:52 not on my computer. So I can use it locally like this or I can copy this Gradio live link and open
00:59:59 it on my computer. This is the way I prefer so that I have the full smoothness of the interface.
01:00:07 It is like running on my own Windows computer but it is running on the server. Okay, we got this
01:00:13 error again, so it was because of the loading; let's refresh. It is also happening because of
01:00:18 the Gradio live; sometimes it can happen, but refreshing the page should fix it. Yes. So the
01:00:23 model is selected; let's quickly generate a song. Always do a default generation first, then start
01:00:29 playing with other stuff; verify that it is working first because your GPU could be broken,
01:00:34 your machine could be broken, anything can happen. The usage is exactly the same as
01:00:39 in the Windows tutorial part. Once you have reached this point, when you return to your terminal,
01:00:44 you can see the status. You can open a new window and type nvitop and you can see the progress on
01:00:52 the GPU. It is loading; the loading is much faster on Massed Compute as expected since
01:00:57 their disk speeds are much faster. Once the model is loaded, the subsequent generations will be
01:01:02 really fast. Okay, generation is proceeding with amazing speeds. Yes, really fast. Almost done.
01:01:09 And since this is the turbo model, it uses only 8 inference steps. Yes, it is already generated and
01:01:16 the song has appeared. Since it is Gradio live, it is taking some time, but it is here. Let's listen.
01:01:27 So I can generate as many songs as I want and use the perfect one. The next generation will
01:01:33 be, let's see how many seconds; it should be around 15 seconds, maybe even faster. Okay: 7, 8, 10, 12,
01:01:42 13, 14. Yes, in 15 seconds I got a new song and it should appear. Yes, you see this way I can
01:01:50 generate as many as I want and use the best one. This way you can get your perfect song;
01:02:00 you can try different styles, use any lyrics you want, and everything else is the same. When
01:02:05 you click here, it will download the song, or you can go to your installation folder,
01:02:10 go to outputs; everything will be here. You can copy them into your thindrives folder. Enter here;
01:02:18 when you generate something here or when you copy something here, it will be synchronized
01:02:22 with your shared drive on your operating system. So my shared drive is here; you see the outputs
01:02:27 arrived. It is also copying; right now you can see that it is synchronizing. It is not very fast,
01:02:34 but for small files it should work fine, and now they are on my computer; they are all
01:02:39 synchronized. So how are you going to stop your machine on Massed Compute? There is no stop; I
01:02:45 mean when you stop your instance, it will continue spending your money. Therefore, once you are fully
01:02:51 done and you have backed up your data and everything, you need to delete it like this. If you need to
01:02:58 transfer big files, search 'wget' on our channel and watch this tutorial. This way you can upload
01:03:05 your folders and big files onto Hugging Face and quickly download them back into your Massed
01:03:11 Compute instance. You can also back up your data or big files in your Google Drive or OneDrive and
01:03:16 quickly download them back to your Massed Compute. Everything is possible once you learn it; it is so
01:03:21 easy to use, and this is the tutorial that you need. So now I will show SimplePod. Reopen
01:03:27 the RunPod SimplePod instructions and you will see that there is this link to register.
01:03:31 Please use this link to register; I appreciate that. Once you have registered and logged in,
01:03:36 add some credits to your account. Then return back to the instructions and you will see that
01:03:42 there is this template; you need to use this. Double-click it; it will select the template. Now
01:03:48 it may be a little bit different in the selection part compared to RunPod, so I will explain.
01:03:54 This SimplePod also supports a permanent network storage system. You see there is 'Add persistent
01:04:00 volume'; if you select this, it will use it. If you don't, it will use a temporary disk. So let's
01:04:05 select 'Add persistent volume', then click 'use template' or 'edit and use'. I recommend choosing
01:04:11 'edit and use' first to add permanent storage. So on this screen, you see there is 'Add new volume'.
01:04:18 I already have a volume, but you can add a new volume. So when I click 'none', it doesn't work,
01:04:23 so I have to first generate the volume. Let's generate the volume and repeat: go to storage. I
01:04:29 am going to delete this one and I will generate a new one for the tutorial. They only have a single
01:04:36 data center right now; so the biggest disadvantage of SimplePod is that it doesn't have as many GPUs
01:04:42 as RunPod, but it is cheaper and faster. So let's make this 200 GB and save. You see the
01:04:47 200 GB per month price is 6 dollars; it was 14 dollars on RunPod. Save. Then return back
01:04:54 and double-click this link; it will open the template. Let's say 'Add persistent volume',
01:04:59 'edit and use'. Select the volume from here: Tutorial. The mount point is accurate. You can
01:05:04 add the ports like 7860 and 7861. You can give the name like this, and everything else is automatic.
01:05:15 Save and use. Now it is not started yet; we have only set up our template. You see there is
01:05:21 'change template', there is 'edit template', and the template is selected. The filters are here;
01:05:26 you can apply them. So currently I am going to select RTX Pro 6000. You see the price is
01:05:33 1.6 dollars; it was 2.15 dollars on RunPod. Almost the same price as Massed Compute,
01:05:41 so you can choose either one of them. You can also use the half RTX Pro 6000—this is shared between 2
01:05:48 people, so you are getting half performance and 48 GB. This is lower performance than an RTX 5090;
01:05:55 you can also pick an RTX 5090; the price is amazing and the speed and quality are amazing.
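To make the price comparison concrete, here is a small arithmetic sketch using only the figures quoted above: 6 versus 14 dollars per month for a 200 GB volume, and 1.6 versus 2.15 dollars for the RTX Pro 6000. Treating the GPU price as a per-hour rate is an assumption, and the 10 hours of usage is a hypothetical workload chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    provider: str
    gpu_usd_per_hour: float       # assumed to be an hourly rate
    storage_usd_per_month: float  # 200 GB persistent volume


def monthly_cost(quote: Quote, gpu_hours: float) -> float:
    """GPU time actually used plus one month of persistent storage."""
    return quote.gpu_usd_per_hour * gpu_hours + quote.storage_usd_per_month


# Figures as quoted in the video; they may change over time.
simplepod = Quote("SimplePod", gpu_usd_per_hour=1.60, storage_usd_per_month=6.00)
runpod = Quote("RunPod", gpu_usd_per_hour=2.15, storage_usd_per_month=14.00)

gpu_hours = 10  # hypothetical usage in one month

for quote in (simplepod, runpod):
    print(f"{quote.provider}: ${monthly_cost(quote, gpu_hours):.2f} for {gpu_hours} GPU hours")
```

Note that the storage fee is charged whether or not an instance is running, so for light usage the volume price matters as much as the hourly GPU price.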
01:06:00 Since this model fits into 32 GB with maximum quality, you can also pick this one, and when
01:06:07 you scroll down, you will see all of the features here and run. Then it will start the machine;
01:06:14 the machine is being prepared. Once you get to this screen, the rest is exactly the same as the
01:06:21 RunPod. Let's just wait; the data and everything will appear here. You can also see the logs
01:06:25 here. You see system general crash logs are also shown; it is showing my current spending. Okay,
01:06:31 console and Jupyter links appeared. We can also see the volume: you see it is using our tutorial
01:06:38 permanent storage system, so everything will be automatically saved. Click the Jupyter direct;
01:06:43 my browser is particularly giving a warning—I will just say I want to continue and confirm.
01:06:48 This is because it is using some HTTPS link, therefore it is giving that error, but it is fine;
01:06:55 this is running on HTTPS, so it is secure. Drag and drop your zip file, same as on RunPod; you can
01:07:01 also use this upload files icon, and right-click and extract archive once it is uploaded. One big
01:07:08 advantage of SimplePod is that there is a direct file browser. This is ultra-fast and it allows you
01:07:15 to upload and download very big files very fast; RunPod doesn't have this feature, but SimplePod does.
01:07:20 The extraction should be almost instant. Once it is completed, open the RunPod SimplePod instructions.
01:07:26 You will notice that it is currently installing with concurrent installs set to 4. On SimplePod,
01:07:32 you can make this 8; on RunPod, we make it 4 to be safe because their network storage system is
01:07:38 slow. On Massed Compute, we leave it at the default; it always works. This should be really fast
01:07:43 to install on SimplePod. So the installation started; it is quickly installing right now,
01:07:49 same as on RunPod. Okay, so the installation has been completed. Let's verify if there are any
01:07:56 errors or not; all looking great so far. I am scrolling down slowly and it started
01:08:02 downloading the models and is still downloading the models, I think. Let's go to the very bottom;
01:08:09 yes, still downloading the models. Let's just wait a little bit more. All right, so the models have
01:08:14 been downloaded. Installation has been completed; we can see it says downloaded 75, all completed. Now
01:08:21 we can start the application. So return back to the instructions.txt, copy the start command here,
01:08:28 open a new terminal and paste it. If you get any installation error as happened in RunPod,
01:08:34 you can delete the virtual environment and install again. So the application is starting; we
01:08:39 just need to wait a little bit more. Okay, so the application started quickly; let's open the Gradio
01:08:44 live. You can also connect from this link: you see this is the HTTP port which we have enabled. Let's
01:08:53 see if it will work accurately. Okay, let's try plus 1—maybe as it is in the RunPod proxy. No, it
01:08:59 didn't work on SimplePod, but it doesn't matter; we are going to use the Gradio live. As I said,
01:09:05 make a default generation. This error is not important; this is happening because of the Gradio
01:09:12 live, let's click again. If you get that error, click again; okay, on the second click it started. Yes, it
01:09:19 is happening because of the Gradio live, but it is not important; it happens one time until the page
01:09:25 is fully loaded. The generation started; to follow everything, I think we can open nvitop. Okay,
01:09:31 we need to install first: pip install nvitop, then nvitop. Yes, so you can monitor your VRAM usage
01:09:38 here; you can see your CPU usage and memory usage. The loading of the model is much faster than
01:09:45 RunPod. Once the model is loaded the first time, the second and subsequent generations will be much faster;
01:09:50 but as a rule of thumb, generate one time with the default values. Verify it is working then
01:09:56 proceed to use other features. Everything is shown in the Windows tutorial part,
01:10:01 so watch it to learn if you skipped to this point. Okay, generation should be completed quickly once
01:10:08 all of the models are loaded. Okay, models loaded and it is generating with great speed right now.
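nvitop is an interactive monitor. To log the same kind of numbers (VRAM, utilization, power draw and power limit) from a script, one option is to query `nvidia-smi` directly. This is a minimal sketch, not part of the ACESTEP package: the `GpuSample` class, the helper names and the sample line are hypothetical, and the parser assumes the GPU name contains no commas.

```python
import math
import shutil
import subprocess
from dataclasses import dataclass

# Standard nvidia-smi query fields; nounits output gives MiB, percent and watts.
QUERY_FIELDS = "name,memory.used,memory.total,utilization.gpu,power.draw,power.limit"


@dataclass
class GpuSample:
    name: str
    memory_used_mib: float
    memory_total_mib: float
    utilization_pct: float
    power_draw_w: float
    power_limit_w: float


def to_float(text: str) -> float:
    """Convert one CSV field to float; unsupported fields such as '[N/A]' become NaN."""
    try:
        return float(text)
    except ValueError:
        return math.nan


def parse_gpu_line(line: str) -> GpuSample:
    """Parse one line of 'csv,noheader,nounits' output in QUERY_FIELDS order."""
    name, *numbers = [field.strip() for field in line.split(",")]
    used, total, util, draw, limit = (to_float(value) for value in numbers)
    return GpuSample(name, used, total, util, draw, limit)


def read_gpus() -> list:
    """Return one GpuSample per GPU, or an empty list when nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]


# Hypothetical sample line, so the parser can be tried on a machine without a GPU.
sample = parse_gpu_line("NVIDIA RTX PRO 6000, 30000, 97887, 98, 498.5, 500.0")
print(sample)

for gpu in read_gpus():
    print(f"{gpu.name}: {gpu.memory_used_mib:.0f}/{gpu.memory_total_mib:.0f} MiB, "
          f"{gpu.utilization_pct:.0f}% util, {gpu.power_draw_w:.0f}/{gpu.power_limit_w:.0f} W")
```

Calling `read_gpus()` in a loop with a short sleep gives a simple text log of a generation run.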
01:10:15 You can also see this GPU usage, but I see that this one is limited to 500 Watts. This should be
01:10:22 600 Watts if I remember correctly; I will tell the developers of SimplePod. Their driver version is
01:10:28 very up to date: you see, 595, whereas on RunPod it was 580. Their driver also supports CUDA
01:10:36 13.2; the older driver is a big disadvantage of RunPod. Okay, generation completed, the song appeared
01:10:42 here. Everything is the same: you can click here to download, you can go to ACESTEP Premium from your
01:10:53 Jupyter Lab interface, right-click outputs and download as an archive; but SimplePod, as I said,
01:10:59 supports direct download. Go to file browser; you can even download the models from here or
01:11:04 upload here. When I scroll down to workspace, I will see everything is here. Double-click it:
01:11:10 ACESTEP Premium. Double-click it and let's download the outputs folder. So I click it
01:11:14 and I will click this download and it will zip and download; it will be super fast. Keep it. You can
01:11:21 even download models: go to models, let's download one of the big models. For example, this one; this
01:11:26 is like 9 GB. Click download; you see the download will start immediately with full speed, amazing
01:11:33 speed as you can see. Same way you can upload very big files: go to the folder wherever you want,
01:11:38 click this upload icon. You can upload a file or folder, upload whatever you want. So to prevent
01:11:45 your credits from being used, you need to delete the instance. There is no stop button in here;
01:11:50 when you delete your instance, if you were using your permanent storage volume like this,
01:11:55 everything will be kept there. Delete instance, confirm. Now my storage is back here, it is fully
01:12:02 working. Starting again is the same as in the RunPod part: return to the RunPod
01:12:07 SimplePod instructions and double-click the template link. It will appear like this: add
01:12:12 persistent storage, edit and use. Make sure your volume is selected from here and it will continue;
01:12:18 save and use. Then select your GPU; you can use any GPU like this, for example, let's use the 5090
01:12:25 this time. Scroll down and run, it will start the machine and it will use my storage. Okay,
01:12:31 the machine has started again; click direct Jupyter Lab, show details, I want to continue. Since this was a
01:12:37 full stop, I recommend running the install command again; it will be really fast. The second time,
01:12:43 it will just verify the installation; it should be almost instant. It will fix the extra libraries
01:12:51 and dependencies that need to be installed into the instance, not into the shared storage,
01:12:58 like FFmpeg. Yes, it is verifying everything. You see it is just verifying the models,
01:13:03 not redownloading, not reinstalling. So the installation is verified; then just start the application again
01:13:10 and continue using it. So this is it; thank you so much for watching. You can always join our
01:13:17 Discord channel; joining the Discord channel is my number 1 recommendation to contact me. You can
01:13:23 make a reply; the Discord channel link is here. You can join our subreddit, follow me on LinkedIn,
01:13:30 and to see all of our scripts, you see we have our Patreon exclusive posts index. Click it and you
01:13:36 will see all the scripts we have with their links and details. You can use Ctrl+F to search anything
01:13:44 here like the ComfyUI, F3 to switch between them. Our subreddit is getting bigger and bigger;
01:13:51 you see we have a massive amount of subscribers, visits, and views. You can also leave a comment
01:13:57 here. I recommend you to read everything here to understand how we are developing,
01:14:03 what features we are adding, and how to use them. I have done so many different updates,
01:14:08 so this is so important. The zip file was also updated to version 5.1. The zip file may not get updated
01:14:15 every time I update the version of the application because it patches updates from
01:14:22 a remote repository. So you just need to run the Windows install or update file if the zip file is
01:14:30 saved. So the new instance also started; this is how we reuse SimplePod with a permanent
01:14:37 storage system. Now I am going to terminate everything to not spend any money; currently,
01:14:42 I'm spending like this much. So go to my servers—you can also have multiple servers here,
01:14:47 it will show in 'my servers'—I will just delete it. If you want to open it again, just click the
01:14:52 instance name and it will open; you can also switch between them from here. So let's just delete;
01:14:57 there is no stop button. Then I need to delete my storage too. So I will delete that as well. Okay,
01:15:02 thank you so much and hopefully see you later in the LoRA training tutorial of the ACESTEP 1.5.

Updated Full Written Tutorial Package (v5.3 Wildcards, DOCX, HTML, PDF and 20 Page PNG Guide)

Download / Open

What The Written Tutorial Covers

v5.3 Wildcards Screenshots

20 Page Visual Tutorial

Video Transcription
