<a href="https://colab.research.google.com/github/neuroidss/EEG-GAN-audio-video/blob/main/neuralfunkv2_eeg.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# WaveGAN interactive notebook

This notebook allows you to play with WaveGAN on (free) cloud GPUs.

To get started, go to: `Runtime > Change runtime type > Hardware accelerator > GPU ... Save`

Then simply run the cells below (`Ctrl + Enter` while a cell is selected).

# **NeuralFunk vol.2**

Project by Ivan.Garkavy@skoltech.ru (with help from Marsel.Ishimbaev@skoltech.ru)

Demo video: https://youtu.be/oHKLAkCNrj8

I trained the WaveGAN neural network (https://github.com/chrisdonahue/wavegan) on 7500 vintage drum loops (breaks) taken from around 2000 old recordings.

All the loops were fitted to a 120 bpm tempo and sampled at 32768 Hz for the dataset.

The trained model can generate entirely new breaks or new drum loops that contain elements of existing breaks, and it can morph smoothly between generated loops.

I plan to improve this result in the future.

**Link to the trained model can be found in this colab notebook.**



In [1]:
#!nvidia-smi

In [2]:
#!pip install astor
#!pip install matplotlib
!pip install mne
#!pip install pandas



In [3]:
import mne
from mne import io
from mne.datasets import sample
from mne.minimum_norm import read_inverse_operator, compute_source_psd

import pandas as pd
import numpy as np



In [4]:
%tensorflow_version 1.15
import tensorflow as tf
#Then check the version:
print(tf.__version__)

`%tensorflow_version` only switches the major version: 1.x or 2.x.
You set: `1.15`. This will be interpreted as: `1.x`.


TensorFlow 1.x selected.
1.15.2


In [5]:
# Confirm GPU is running
from tensorflow.python.client import device_lib

def get_available_gpus():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

if len(get_available_gpus()) == 0:
    for i in range(4):
        print('WARNING: Not running on a GPU! See above for faster generation')

In [6]:
# mount google drive
# you'll need to give permission for Google Colab (not me) to access your Google Drive
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


Link to the model: https://drive.google.com/file/d/1ZJir-_ls92s56LFmw_HVuQ7ANqFFN5WG/view?usp=sharing

Add a shortcut to this folder to your Google Drive.


In [7]:
# downloading the model from your google drive (2GB)
# it takes about 1 min
!cp -r -v "/content/gdrive/MyDrive/neuralfunkv2/model" "/content/"
# (if it fails to find the folder, and you've made a shortcut,
# try changing "MyDrive" to "My Drive" everywhere in the notebook)

'/content/gdrive/MyDrive/neuralfunkv2/model/model.ckpt-18637.index' -> '/content/model/model.ckpt-18637.index'
'/content/gdrive/MyDrive/neuralfunkv2/model/model.ckpt-18637.meta' -> '/content/model/model.ckpt-18637.meta'
'/content/gdrive/MyDrive/neuralfunkv2/model/args.txt' -> '/content/model/args.txt'
'/content/gdrive/MyDrive/neuralfunkv2/model/graph.pbtxt' -> '/content/model/graph.pbtxt'
'/content/gdrive/MyDrive/neuralfunkv2/model/checkpoint' -> '/content/model/checkpoint'
'/content/gdrive/MyDrive/neuralfunkv2/model/infer/infer.pbtxt' -> '/content/model/infer/infer.pbtxt'
'/content/gdrive/MyDrive/neuralfunkv2/model/infer/infer.meta' -> '/content/model/infer/infer.meta'
'/content/gdrive/MyDrive/neuralfunkv2/model/model.ckpt-18637.data-00000-of-00001' -> '/content/model/model.ckpt-18637.data-00000-of-00001'
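
The comment in the copy cell suggests switching `MyDrive` to `My Drive` when the folder is not found. A minimal sketch of checking both spellings before copying is shown below. It assumes the same folder layout as above; `find_model_dir` and the candidate list are hypothetical helpers, not part of the notebook.

```python
import os

def find_model_dir(candidates):
    """Return the first candidate directory that exists, or None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None

# Both spellings of the Drive root mentioned in the copy cell.
candidates = [
    "/content/gdrive/MyDrive/neuralfunkv2/model",
    "/content/gdrive/My Drive/neuralfunkv2/model",
]

model_dir = find_model_dir(candidates)
if model_dir is None:
    print("Model folder not found; check that the Drive shortcut exists.")
else:
    print(f"Using model folder: {model_dir}")
```

The returned path could then be passed to the `cp` command instead of a hard-coded one.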


In [8]:
!cp -r -v "/content/gdrive/MyDrive/EEG-GAN-audio-video/eeg" "/content/"

'/content/gdrive/MyDrive/EEG-GAN-audio-video/eeg/record-[2020.06.28-14.26.09].csv' -> '/content/eeg/record-[2020.06.28-14.26.09].csv'


In [9]:
    key = 0
    idx = 0

    data_path = './eeg'
    #raw_fname = data_path + '/record-[2019.11.13-22.23.59].gdf'
    #raw = mne.io.read_raw_gdf(raw_fname, preload=True)

    path = data_path
    #ch_names = ['FP1','AF3','F7','F3','FC1','FC5','T7','C3','CP1','CP5','P7','P3','Pz','PO3','O1','Oz','O2','PO4','P4','P8','CP6','CP2','C4','T8','FC6','FC2','F4','F8','AF4','FP2','Fz','Cz']
    #data = pd.read_csv(path + '/record-[2019.11.13-22.23.59].csv', skiprows=0, usecols=ch_names, header=0, delimiter=';') 
    
    ch_names = ['FP1','F3','P3','O1','O2','P4','F4','FP2']
    #data = pd.read_csv(path + '/record-[2020.06.28-00.36.11].csv', skiprows=0, usecols=ch_names, header=0, delimiter=';') 
    data = pd.read_csv(path + '/record-[2020.06.28-14.26.09].csv', skiprows=0, usecols=ch_names, header=0, delimiter=';') 
    #data = pd.read_csv(path + '/record-[2020.06.29-19.49.23].csv', skiprows=0, usecols=ch_names, header=0, delimiter=';') 
    
    #print(data)
    data_transpose=np.transpose(data)

    sfreq = 512 
    info = mne.create_info(ch_names = ch_names, sfreq = sfreq)
    #info = mne.create_info(sfreq = sfreq)
    raw = mne.io.RawArray(data_transpose, info)
    #raw.plot()

    # Setup for reading the raw data
    #raw = io.read_raw_fif(raw_fname, verbose=False)
    #events = mne.find_events(raw, stim_channel='STI 014')
    #inverse_operator = read_inverse_operator(fname_inv)
    #raw.info['bads'] = ['MEG 2443', 'EEG 053']

    # use all eight EEG channels listed in ch_names
    #picks = mne.pick_types(raw.info, meg=False, eeg=True, eog=False, stim=False)
    picks = ch_names


Creating RawArray with float64 data, n_channels=8, n_times=844096
    Range : 0 ... 844095 =      0.000 ...  1648.623 secs
Ready.
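
The numbers in this log follow directly from the array shape and the sampling rate. The sketch below reproduces them with a zero-filled stand-in for the recording (an assumption made only to keep the example self-contained). pandas yields one row per sample, while `mne.io.RawArray` takes channels-first data, hence the transpose in the cell.

```python
import numpy as np

sfreq = 512            # sampling rate used in the notebook (Hz)
n_times = 844096       # number of samples reported in the log
n_channels = 8         # FP1, F3, P3, O1, O2, P4, F4, FP2

# Stand-in for the CSV contents: one row per sample, one column per channel.
samples_by_channels = np.zeros((n_times, n_channels), dtype=np.float32)

# RawArray takes channels-first data, so the table is transposed.
channels_by_samples = samples_by_channels.T
print(channels_by_samples.shape)

# Time stamp of the last sample and total duration of the recording.
last_sample_time = (n_times - 1) / sfreq
duration = n_times / sfreq
print(f"last sample at {last_sample_time:.3f} s, duration {duration:.3f} s")
```

The last sample sits at 844095 / 512 s, which is the upper end of the `Range` line in the log.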


        eeg_step=idx
        print (f'EEG step: {(eeg_step/3):.1f} s')
        tmin, tmax = 0+(eeg_step/3), 2+(eeg_step/3)  # 2 s analysis window starting at eeg_step/3 s
        #tmin, tmax = 0+(10*eeg_step/512), 2+(10*eeg_step/512)  # alternative: advance 10 samples per step
        fmin, fmax = 0.5, 512  # keep everything from 0.5 Hz up; at sfreq=512 the spectrum ends at 256 Hz (Nyquist)
        #fmin, fmax = 8, 12  # alternative: 8-12 Hz only
        n_fft = 1024  # the FFT size (n_fft). Ideally a power of 2
        #n_fft = 2048  # the FFT size (n_fft). Ideally a power of 2
        #label = mne.read_label(fname_label)

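        # Welch PSD per channel. Note: psd_welch is deprecated since MNE 1.2
        # in favour of raw.compute_psd(method='welch', ...).
        psds, freqs = mne.time_frequency.psd_welch(raw, picks=picks,
                         tmin=tmin, tmax=tmax, fmin=fmin, fmax=fmax,
                         n_fft=n_fft)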

        #print(freqs)
        #print(psds)
        psds_transpose=np.transpose(psds)
        #plt.plot(freqs,psds_transpose)
        #plt.xlabel('Frequency (Hz)')
        #plt.ylabel('PSD (dB)')
        #plt.title('Power Spectrum (PSD)')
        #plt.show()
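
The choice of `n_fft` and the frequency range determines how many PSD values each channel contributes. The sketch below counts the bins with plain NumPy. It assumes the bins are selected with an inclusive `fmin <= f <= fmax` mask, and `count_bins` is a hypothetical helper, not an MNE function.

```python
import numpy as np

sfreq = 512    # sampling rate (Hz)
n_fft = 1024   # FFT length, i.e. a 2 s segment at 512 Hz

# Frequencies of the one-sided FFT bins: 0, 0.5, 1.0, ..., 256 Hz.
freqs = np.fft.rfftfreq(n_fft, d=1.0 / sfreq)

def count_bins(fmin, fmax):
    """Number of FFT bins with fmin <= f <= fmax."""
    return int(np.sum((freqs >= fmin) & (freqs <= fmax)))

print(freqs[1] - freqs[0])     # bin spacing in Hz
print(count_bins(0.5, 512))    # range used in the cell above
print(count_bins(0.5, 50))     # range used in the generation cell
```

Under these assumptions the 0.5 to 50 Hz range yields exactly 100 bins, the same size as the `dim = 100` latent vector of the model, which is why the averaged spectrum can be fed to the generator without resizing.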


In [10]:
# Load the model
tf.reset_default_graph()
saver = tf.train.import_meta_graph('./model/infer/infer.meta')
graph = tf.get_default_graph()
sess = tf.InteractiveSession()
saver.restore(sess, f'./model/model.ckpt-18637')
dim = 100
break_len = 65536

z = graph.get_tensor_by_name('z:0')
G_z = graph.get_tensor_by_name('G_z:0')

INFO:tensorflow:Restoring parameters from ./model/model.ckpt-18637


In [11]:
import numpy as np
from IPython.display import display, Audio
#from google.colab import files
import scipy.io.wavfile
import matplotlib.pyplot as plt
%matplotlib inline
!mkdir "./neuralfunk examples"

def generate_trajectory(n_iter, _z0=None, mov_last=None, jump=0.3, smooth=0.3, include_z0=True):
    _z = np.empty((n_iter + int(not include_z0), dim))
    _z[0] = _z0 if _z0 is not None else np.random.random(dim)*2-1
    mov = mov_last if mov_last is not None else (np.random.random(dim)*2-1)*jump
    for i in range(1, len(_z)):
        mov = mov*smooth + (np.random.random(dim)*2-1)*jump*(1-smooth)
        mov -= (np.abs(_z[i-1] + mov) > 1) * 2 * mov
        _z[i] = _z[i-1] + mov
    return _z[-n_iter:], mov
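
`generate_trajectory` is defined here but not called in the cells that follow. To illustrate what it produces, here is a standalone restatement of its default path (with `include_z0=True`). The name `random_walk_latents`, the `dim` and `seed` parameters, and the seeded generator are assumptions added so the sketch runs without the notebook state.

```python
import numpy as np

def random_walk_latents(n_iter, dim=100, jump=0.3, smooth=0.3, seed=0):
    """Smoothed random walk in [-1, 1]^dim, one latent vector per step."""
    rng = np.random.RandomState(seed)
    z = np.empty((n_iter, dim))
    z[0] = rng.random_sample(dim) * 2 - 1
    mov = (rng.random_sample(dim) * 2 - 1) * jump
    for i in range(1, n_iter):
        # Blend the previous step with a fresh random step.
        mov = mov * smooth + (rng.random_sample(dim) * 2 - 1) * jump * (1 - smooth)
        # Reverse any component that would leave the [-1, 1] box.
        mov -= (np.abs(z[i - 1] + mov) > 1) * 2 * mov
        z[i] = z[i - 1] + mov
    return z

z_path = random_walk_latents(n_iter=16)
print(z_path.shape)
print(np.abs(z_path).max() <= 1.0)
# With the model loaded, such a batch could be fed to the generator:
# audio = sess.run(G_z, {z: z_path})[:, :, 0]
```

Because consecutive rows differ by a small step, neighbouring loops generated from such a path should sound similar, which is one way to get the smooth morphing mentioned in the introduction.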

# Configuration is done; the next cells generate the results

*In this notebook, files are previewed at a 39936 Hz rate (146.25 bpm tempo), but saved at 44100 Hz (= 161.4990234375 bpm tempo).*

*You may change these numbers in the code (for example, 32768 Hz for 120 bpm), but you will probably need a standard sample rate (44100 Hz etc.) for your DAW to be able to open the files.*

*In that case, you can specify either this clumsy bpm value or the number of bars in the DAW to fit the audio to any tempo you like.*
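
The tempo values quoted here scale linearly with the playback rate, since the dataset was prepared at 120 bpm and 32768 Hz. A small sketch of the conversion in both directions follows; the function names `bpm_at_rate` and `rate_for_bpm` are hypothetical.

```python
BASE_RATE = 32768   # dataset sample rate (Hz)
BASE_BPM = 120      # tempo the loops were fitted to

def bpm_at_rate(rate_hz):
    """Tempo heard when the 120 bpm / 32768 Hz loops are played at rate_hz."""
    return BASE_BPM * rate_hz / BASE_RATE

def rate_for_bpm(bpm):
    """Playback rate (Hz) needed to hear the loops at the given tempo."""
    return BASE_RATE * bpm / BASE_BPM

print(bpm_at_rate(39936))   # 146.25
print(bpm_at_rate(44100))   # 161.4990234375
print(rate_for_bpm(120))    # 32768.0
```

A target tempo usually maps to a non-standard rate, which is why saving at 44100 Hz and stretching in the DAW is the more practical route.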

In [13]:
# Generate breaks from EEG power spectra (one latent vector per 2 s EEG window)

# CHANGE THIS to change number of examples generated
#n_generate = 30
#n_generate = 150
#n_generate = 300
#n_generate = 305
#n_generate = 390

# Sample latent vectors
#seed = 666 # change this seed to generate different set of breaks
#np.random.seed(seed)
#_z = (np.random.rand(n_generate, dim) * 2.) - 1.


hz=44100
#hz=39936
#hz=int(32768*2*(600/240))
#hz=int(32768*2*(480/240))
#hz=int(32768*2*(360/240))
#hz=int(32768*2*(300/240))
#hz=int(32768*2*(265/240))
#hz=int(32768*2*(250/240))
#hz=int(32768*2*(240/240)*1.6666666)
#hz=int(32768*2*(240/240))
#hz=int(32768*2*(120/240))
fps=hz/(32768*2)
#fps=0.5
#fps=44100/(32768*2)
#fps=1
#fps=1/3
#fps=1.5

#n_generate=int((307-2)*fps)
#n_generate=int((160-2)*fps)
#n_generate=int((1598-2)*fps)
n_generate=int((1648-2)*fps)
#n_generate=int((1607-2)*fps)
part_len = 100
#part_len = 275
n_parts = n_generate//part_len
if n_generate%part_len>0:
    n_parts=n_parts+1

vol=0.1

psd_array=np.random.rand(part_len, dim) 


#z_avg_samples=n_generate
#for i in range(n_generate): # display separate audio for each break
for j in range(n_parts): # one generator batch of up to part_len latent vectors
    for i in range(part_len): # one 2 s EEG window per latent vector
        ji = j * part_len + i  # global window index
        
        if (i==0) and (n_generate-ji<part_len):
            # the last batch is shorter: shrink the buffer to the remaining windows
            psd_array=np.random.rand((n_generate-ji), dim) 


        eeg_step=ji
        #print (f'EEG step: {(eeg_step/3):.1f} s')
        tmin, tmax = 0+(eeg_step/fps), 2+(eeg_step/fps)  # 2 s window starting at eeg_step/fps s
        #tmin, tmax = 0+(10*eeg_step/512), 2+(10*eeg_step/512)  # alternative: advance 10 samples per step
        fmin, fmax = 0.5, 50  # 0.5-50 Hz in 0.5 Hz steps gives 100 bins, matching dim
        #fmin, fmax = 8, 12  # alternative: 8-12 Hz only
        #n_fft = 512  # the FFT size (n_fft). Ideally a power of 2
        n_fft = 1024  # the FFT size (n_fft). Ideally a power of 2
        #n_fft = 2048  # the FFT size (n_fft). Ideally a power of 2
        #label = mne.read_label(fname_label)
        
        print(ji)

        psds, freqs = mne.time_frequency.psd_welch(raw, picks=picks,
                         tmin=tmin, tmax=tmax, fmin=fmin, fmax=fmax,
                         n_fft=n_fft)

        #print(freqs)
        #print(psds)
        
        z_samples = psds

        #w_samples = G.mapping(torch.from_numpy(z_samples).to(device), None)  # [N, L, C]
        #w_samples = w_samples[:, :1, :].cpu().numpy().astype(np.float32)       # [N, 1, C]
        z_avg = np.mean(z_samples, axis=0)      # average over channels -> shape (n_freqs,)
        #z_avg = np.mean(z_samples, axis=0, keepdims=True)      # same, but shape (1, n_freqs)
        psd_array[i]=z_avg
        #print(z_avg)
        #z_std = (np.sum((z_samples - z_avg) ** 2) / z_avg_samples) ** 0.5

        #psd_array[i]=psds
        #psds_transpose=np.transpose(psds)
        #plt.plot(freqs,psds_transpose)
        #plt.xlabel('Frequency (Hz)')
        #plt.ylabel('PSD (dB)')
        #plt.title('Power Spectrum (PSD)')
        #plt.show()
        if (i==part_len-1) or (ji==n_generate-1) :
            
            _z = psd_array * vol
            _G_z = sess.run(G_z, {z: _z})[:,:,0]
            if j==0:
                _G_z_full=_G_z
            else:
                _G_z_full=np.append(_G_z_full,_G_z)
            if (ji==n_generate-1) :
                break

#print(psd_array)
#print(psd_array.shape)
#print(psd_array.ndim)
#_z = psd_array / 5.
#_z = psd_array / 10.
#_z = (psd_array * 2.) - 1.
#_G_z = sess.run(G_z, {z: _z})[:,:,0]

# display(Audio(_G_z.flatten(), rate=39936)) # display all in one audio

#for i in range(n_generate): # display separate audio for each break
  #print(i)
  #display(Audio(np.tile(_G_z[i][1:_G_z[i].ndim/2], 2), rate=39936)) # change rate for different tempo
  #display(Audio(np.tile(_G_z[i][1:32768], 2), rate=32768)) # change rate for different tempo
  #display(Audio(np.tile(_G_z[i], 1), rate=32768)) # change rate for different tempo


0
Effective window size : 2.000 (s)
1
Effective window size : 2.000 (s)
...
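
The generation cell maps each 2 s EEG window to one latent vector and runs the generator in batches. The sketch below mirrors that data flow with NumPy only, under several assumptions: synthetic noise stands in for the recording, a single Hamming-windowed periodogram replaces `psd_welch` (so the scaling differs from MNE's), and `part_len` is lowered to 5 to make the batching visible. `window_to_latent` is a hypothetical helper.

```python
import numpy as np

sfreq, n_fft = 512, 1024
hz = 44100
fps = hz / 65536          # generated loops per second of output audio
vol = 0.1                 # latent scale factor used in the notebook
part_len = 5              # latent vectors per batch (the notebook uses 100)

# Synthetic stand-in for the recording: 8 channels, 20 s of noise.
rng = np.random.RandomState(0)
eeg = rng.randn(8, 20 * sfreq)

freqs = np.fft.rfftfreq(n_fft, d=1.0 / sfreq)
band = (freqs >= 0.5) & (freqs <= 50)      # the 100 bins kept by the notebook

def window_to_latent(start_s):
    """Channel-averaged power spectrum of the 2 s window starting at start_s."""
    start = int(round(start_s * sfreq))
    seg = eeg[:, start:start + n_fft] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(seg, axis=1)) ** 2   # unnormalised periodogram
    return power[:, band].mean(axis=0) * vol

n_generate = int((20 - 2) * fps)           # windows that fit in the recording
latents = np.stack([window_to_latent(i / fps) for i in range(n_generate)])

# Split into batches of at most part_len rows; the last one may be shorter.
batches = [latents[k:k + part_len] for k in range(0, n_generate, part_len)]
print(latents.shape, [len(b) for b in batches])
```

Stepping the window by `1 / fps` seconds keeps the EEG timeline aligned with the audio: each generated loop of 65536 samples lasts exactly that long when played at `hz`.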

In [17]:
# save to VM local memory (you'll need to copy your files to google drive in the last cell)
filename = f"./neuralfunk examples/neuralfunkv2_gen_fps{fps:.2f}_{120*2*fps:.2f}bpm_n{n_generate}_vol{vol:.2f}.wav"
#scipy.io.wavfile.write(filename, hz, _G_z.flatten()) # change rate for different tempo
scipy.io.wavfile.write(filename, hz, _G_z_full.flatten()) # change rate for different tempo

#filename = f"./neuralfunk examples/neuralfunkv2_gen_fps{fps}_120bpm_n{n_generate}.wav"
#scipy.io.wavfile.write(filename, 32768, _G_z.flatten()) # change rate for different tempo
#filename = f"./neuralfunk examples/neuralfunkv2_gen_fps{fps}_240bpm_n{n_generate}.wav"
#scipy.io.wavfile.write(filename, 32768*2, _G_z.flatten()) # change rate for different tempo
#filename = f"./neuralfunk examples/neuralfunkv2_gen_fps0.3_294bpm_n{n_generate}.wav"
#scipy.io.wavfile.write(filename, 39936*2, _G_z.flatten()) # change rate for different tempo
#filename = f"./neuralfunk examples/neuralfunkv2_gen_fps{fps}_160bpm_n{n_generate}.wav"
#scipy.io.wavfile.write(filename, 44100, _G_z.flatten()) # change rate for different tempo
#print(_G_z[0].shape[0])
#filename = f"./neuralfunk examples/neuralfunkv2_gen_cut_fps{fps}_320bpm_n{n_generate}.wav"
#scipy.io.wavfile.write(filename, 44100*2, _G_z[:][1:(_G_z[0].shape[0]ndim/2)].flatten()) # change rate for different tempo
#print(_G_z[0].shape[0])
#filename = f"./neuralfunk examples/neuralfunkv2_gen_half_fps{fps}_240bpm_n{n_generate}.wav"
#scipy.io.wavfile.write(filename, 32768*2, _G_z[:][1:32768].flatten()) # change rate for different tempo
#filename = f"./neuralfunk examples/neuralfunkv2_gen_fps{fps}_240bpm_n{n_generate}.wav"
#scipy.io.wavfile.write(filename, 32768*2, _G_z.flatten()) # change rate for different tempo
#filename = f"./neuralfunk examples/neuralfunkv2_gen_fps{fps}_360bpm_n{n_generate}.wav"
#scipy.io.wavfile.write(filename, 32768*3, _G_z.flatten()) # change rate for different tempo
#filename = f"./neuralfunk examples/neuralfunkv2_gen_half_fps{fps}_320bpm_n{n_generate}.wav"
#scipy.io.wavfile.write(filename, 44100*2, _G_z[:][1:16384].flatten()) # change rate for different tempo


In [20]:
!cp -r -v "./neuralfunk examples" "/content/gdrive/MyDrive/EEG-GAN-audio-video"


'./neuralfunk examples' -> '/content/gdrive/MyDrive/EEG-GAN-audio-video/neuralfunk examples'
'./neuralfunk examples/neuralfunkv2_gen_fps0.67_161.50bpm_n1107_vol0.10.wav' -> '/content/gdrive/MyDrive/EEG-GAN-audio-video/neuralfunk examples/neuralfunkv2_gen_fps0.67_161.50bpm_n1107_vol0.10.wav'
