# Fish Diffusion
<div style="display: flex; justify-content: center;">
<img alt="LOGO" src="https://cdn.jsdelivr.net/gh/fishaudio/fish-diffusion@main/images/logo_512x512.png" width="256" height="256" />
</div>

<div style="display: flex; justify-content: center;">
<a href="https://github.com/fishaudio/fish-diffusion/actions/workflows/ci.yml">
<img alt="Build Status" src="https://img.shields.io/github/actions/workflow/status/fishaudio/fish-diffusion/ci.yml?style=flat-square&logo=GitHub">
</a>
<a href="https://hub.docker.com/r/lengyue233/fish-diffusion">
<img alt="Docker Hub" src="https://img.shields.io/docker/cloud/build/lengyue233/fish-diffusion?style=flat-square&logo=Docker&logoColor=white">
</a>
<a href="https://discord.gg/wbYSRBrW2E">
<img alt="Discord" src="https://img.shields.io/discord/1044927142900809739?color=%23738ADB&label=Discord&logo=discord&logoColor=white&style=flat-square">
</a>
<a href="https://huggingface.co/spaces/fishaudio/fish-diffusion">
<img alt="Hugging Face" src="https://img.shields.io/badge/🤗%20Spaces-HiFiSinger-blue.svg?style=flat-square">
</a>
<a target="_blank" href="https://colab.research.google.com/github/fishaudio/fish-diffusion/blob/notebooks-support/notebooks/fish-audio_sample.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
</div>


### Environment Setup




#### Install Conda

In [None]:
%%bash
mkdir /content/env
MINICONDA_INSTALLER_SCRIPT=Miniconda3-py310_23.1.0-1-Linux-x86_64.sh
MINICONDA_PREFIX=/content/env
wget https://repo.continuum.io/miniconda/$MINICONDA_INSTALLER_SCRIPT
chmod +x $MINICONDA_INSTALLER_SCRIPT
./$MINICONDA_INSTALLER_SCRIPT -b -f -p $MINICONDA_PREFIX

#### Create conda environment

In [None]:
!source /content/env/bin/activate;\
conda create -n fish_diffusion python=3.10 -y

#### Install PyTorch

In [None]:

# Install PyTorch related core dependencies
!source /content/env/bin/activate;\
conda activate fish_diffusion;\
conda install "pytorch>=2.0.0" "torchvision>=0.15.0" "torchaudio>=2.0.0" pytorch-cuda=11.8 -c pytorch -c nvidia -y


#### Install dependencies

In [None]:
!git clone https://github.com/fishaudio/fish-diffusion
%cd fish-diffusion

In [None]:
!source /content/env/bin/activate;\
conda activate fish_diffusion;\
cat requirements.txt | xargs -n 1 pip install;\
pip install -e .

### Vocoder preparation

In [None]:
!source /content/env/bin/activate;\
conda activate fish_diffusion;\
python tools/download_nsf_hifigan.py --agree-license

### Dataset preparation
```shell
dataset
├───train
│   ├───xxx1-xxx1.wav
│   ├───...
│   ├───Lxx-0xx8.wav
│   └───speaker0 (Subdirectory is also supported)
│       └───xxx1-xxx1.wav
└───valid
    ├───xx2-0xxx2.wav
    ├───...
    └───xxx7-xxx007.wav
```
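
Before extracting features, it can help to confirm that the folder really follows the layout above. The following is a minimal sketch, not part of the project: the helper name `count_wavs` is hypothetical, and it only assumes the `train` and `valid` folder names shown in the tree.

```python
from pathlib import Path


def count_wavs(dataset_root: str = "dataset") -> dict:
    """Count .wav files under the train and valid splits (hypothetical helper)."""
    root = Path(dataset_root)
    counts = {}
    for split in ("train", "valid"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"Missing split directory: {split_dir}")
        # rglob also descends into speaker subdirectories such as speaker0
        counts[split] = sum(1 for p in split_dir.rglob("*.wav") if p.is_file())
    return counts
```

Calling `count_wavs("dataset")` once the dataset is in place returns a dictionary with the number of `.wav` files per split, and raises `FileNotFoundError` if either split folder is missing.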

##### Mount Google Drive or upload your dataset

In [None]:
from google.colab import drive
drive.mount('/content/drive/')

#### Soft link your dataset to the current directory

In [2]:
dataset_path = "/content/drive/MyDrive/test-fish-audio/dataset/"#@param{type:"string"}
!ln -s $dataset_path dataset
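
`ln -s` fails if a `dataset` link is already there, for example when this cell is run a second time. A possible alternative is to create the link from Python so that it can be refreshed safely. This is only a sketch: `link_dataset` is a hypothetical helper, not something the repository provides.

```python
import os
from pathlib import Path


def link_dataset(source: str, link_name: str = "dataset") -> Path:
    """Create or refresh a symlink called link_name that points at source."""
    src = Path(source).expanduser().resolve()
    if not src.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {src}")
    link = Path(link_name)
    if link.is_symlink():
        # Replace a stale link left over from an earlier run
        link.unlink()
    elif link.exists():
        # Refuse to touch a real file or directory with the same name
        raise FileExistsError(f"{link} exists and is not a symlink")
    os.symlink(src, link, target_is_directory=True)
    return link
```

With the variable from the cell above, the call would be `link_dataset(dataset_path)`.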

#### Extract all data features, such as pitch, text features, mel features, etc.

##### If a torchvision error occurs, uncomment and run this cell

In [None]:
# !source /content/env/bin/activate;\
# conda activate fish_diffusion;\
# pip uninstall torchvision -y;\
# pip install torchvision  --index-url https://download.pytorch.org/whl/cu118

##### Extract features

In [None]:
!source /content/env/bin/activate;\
conda activate fish_diffusion;\
python tools/preprocessing/extract_features.py --config configs/svc_hubert_soft.py --path dataset  --clean

### Baseline training
> The project is under active development. Please back up your config file.

In [11]:
training_options = 'single_machine'#@param ['single_machine', 'multi_node']
network ='diffusion'#@param ['diffusion', 'hifisinger']

logger ='wandb' #@param ['wandb', 'tensorboard']
pretrained = 'no'#@param['yes', 'no']
resume = 'no'#@param['yes', 'no']
resume_checkpoint = ''#@param{type:"string"}
pretrain_checkpoint = ''#@param{type:"string"}

In [12]:
if resume== 'yes':
    resume_str = f"--resume {resume_checkpoint}"
else:
    resume_str = ""
    
if pretrain_checkpoint != '':
    pretrain_str = f"--pretrain {pretrain_checkpoint}"
else:
    pretrain_str = ''

if logger== 'wandb':
    logger_str = ""
else:
    logger_str = "--tensorboard"
    %load_ext tensorboard
    %tensorboard --logdir .
    

if training_options == "single_machine":
    cmd = f"tools/{network}/train.py --config configs/svc_hubert_soft.py {resume_str} {pretrain_str} {logger_str}"
elif training_options == "multi_node":
    cmd = f"tools/{network}/train.py --config configs/svc_content_vec_multi_node.py {resume_str} {pretrain_str} {logger_str}"

!source /content/env/bin/activate;\
conda activate fish_diffusion;\
python {cmd}
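
The string assembly in the cell above can also be written as one function, which makes the option handling easier to check on its own. The sketch below reuses only the script paths, config files and flags that appear in this notebook. The name `build_train_command` is hypothetical, and as a simplifying assumption a non-empty `resume_checkpoint` switches on `--resume` instead of a separate yes/no setting.

```python
import shlex


def build_train_command(network="diffusion", training_options="single_machine",
                        logger="wandb", resume_checkpoint="",
                        pretrain_checkpoint=""):
    """Assemble the arguments passed to `python` for training (hypothetical helper)."""
    configs = {
        "single_machine": "configs/svc_hubert_soft.py",
        "multi_node": "configs/svc_content_vec_multi_node.py",
    }
    if training_options not in configs:
        raise ValueError(f"Unknown training option: {training_options!r}")

    args = [f"tools/{network}/train.py", "--config", configs[training_options]]
    if resume_checkpoint:
        args += ["--resume", resume_checkpoint]
    if pretrain_checkpoint:
        args += ["--pretrain", pretrain_checkpoint]
    if logger == "tensorboard":
        args.append("--tensorboard")
    # Quote each argument so that paths containing spaces survive the shell
    return " ".join(shlex.quote(a) for a in args)
```

The returned string can stand in for `cmd` in the cell above, for example `cmd = build_train_command(network, training_options, logger, resume_checkpoint, pretrain_checkpoint)`. An unknown training option raises an error immediately, so `cmd` is never left undefined.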



#### For Tensorboard

In [None]:
from tensorboard import notebook
notebook.list() # View open TensorBoard instances

In [None]:
# Control TensorBoard display. If no port is provided, 
# the most recently launched TensorBoard is used
notebook.display(port=6006, height=1000) 