This repository contains scripts used for my BossmanJack live stream capture system and the associated post-processing.
- kick_capture/main.py listens for the "Start Stream" event from Kick and runs the download script
- kick_capture/bmj_dl.bat runs yt-dlp and auto-restarts it in the event of errors
- kick_capture/Test-FfmpegDied.ps1 is used by the batch script to check whether ffmpeg encountered an error while capturing and to restart the capture if necessary. Note that these failures used to be very frequent, but Kick seems to have resolved the random 500 errors since December.
- Export-WavFiles.ps1 automates extracting wav audio from the streams for running whispercpp
- New-WhisperCppTranscription.ps1 automates the transcription process
- Export-WhisperCsvToTxt.ps1 creates nicely formatted txt files from the CSV files whispercpp generates
- Write-CleanYtDlpMetadata.ps1 cleans the .info.json files generated by yt-dlp, stripping potentially sensitive information
- grab_kick_vods/grab_kick_vods.py grabs all Kick VODs for a channel and runs a companion script
- grab_kick_vods/grab_kick_vod.bat is an example batch script for retrieving a VOD if it hasn't already been downloaded
- rapfame_dl/rapfame_dl.py mass-downloads tracks from Rap Fame by artist ID
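
The auto-restart behaviour described for the capture scripts can be sketched in Python as follows. This is only an illustration, not the logic of bmj_dl.bat: the function name `run_with_restart`, its parameters, and the choice to treat any non-zero exit code as a failure are all assumptions, and the ffmpeg check done by Test-FfmpegDied.ps1 is not modelled.

```python
import subprocess
import sys
import time


def run_with_restart(cmd, max_restarts=5, delay_seconds=0.0):
    """Run cmd, restarting it while it exits with a non-zero code.

    Returns the number of attempts made. All names and defaults here are
    hypothetical; the repository's batch script may behave differently.
    """
    attempts = 0
    while True:
        attempts += 1
        result = subprocess.run(cmd)
        if result.returncode == 0:
            # Clean exit: the capture finished, so stop restarting.
            return attempts
        if attempts > max_restarts:
            # The initial run plus max_restarts restarts have all failed.
            raise RuntimeError(f"gave up after {attempts} attempts")
        # Pause briefly before the next attempt.
        time.sleep(delay_seconds)
```

A call might look like `run_with_restart(["yt-dlp", stream_url], delay_seconds=5)`, where `stream_url` is a placeholder for the channel URL. Capping the number of restarts avoids looping forever when the stream has genuinely ended.
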
- The PowerShell CSV cmdlets are buggy when dealing with escaped double-quoted strings in CSV files. Whispercpp will occasionally emit quoted text, and this can lead to slight errors in the txt files.
- PowerShell's JSON cmdlets have a shallow depth by default, which leads to some heavily nested objects in the yt-dlp metadata becoming stringified. This is easily fixable by specifying a sufficient depth, but the property affected isn't important (HTTP headers) and I don't want to change the schema of my existing metadata files.
- There is no handling of whispercpp or ffmpeg errors
- The Rap Fame script doesn't support obtaining the artist ID from the page
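
To illustrate the quoted-text issue noted above, here is a minimal Python sketch of converting a transcription CSV to text lines with a parser that handles embedded quotes. It is not part of this repository. The column names (`start`, `end`, `text`), the millisecond timestamps, and the assumption that embedded quotes are escaped by doubling them are all hypothetical and may not match what whispercpp actually emits.

```python
import csv
import io

# Hypothetical sample row: the text field contains doubled (escaped) quotes.
sample = 'start,end,text\n0,2500,"He said ""all in"", then laughed"\n'


def csv_to_lines(csv_text):
    """Return one formatted line per CSV row, with quotes unescaped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        f'[{int(row["start"]) / 1000:.1f}s] {row["text"].strip()}'
        for row in reader
    ]


lines = csv_to_lines(sample)
```

Here `lines[0]` is `[0.0s] He said "all in", then laughed`, with the inner quotes and the comma inside the quoted field preserved.
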