# 1 - Introduction - Mod pedalboards

[pedalboards.moddevices.com](https://pedalboards.moddevices.com/) is a site developed by ModDevices for sharing pedalboards, that is, configurations describing how a set of audio plugins is used.

Some questions come to mind:

* Which audio plugins are used the most? And the least?
* Is there a preferred company?
* Based on the pedalboard descriptions, would it be possible to categorize the patches? E.g.: clean, crunch, solo, shimmer.
  * If so, would it be possible to find out which plugins are most wanted for a given musical style?

Thanks to some friendly developer, the site is served by an API that we can query directly.

## Data source

Fetch the 2 most recently created pedalboards:

In [2]:
import requests
import json

r = requests.get('https://pedalboards.moddevices.com/api/pedalboards?skip=0&take=2')
data = r.json()

print(data)

{'items': [{'created': '2016-12-02T12:56:24.984000', 'author': {'avatarUrl': 'https://www.gravatar.com/avatar/56fd147c79402e963f3c6cc04eea44f9', 'name': 'Breno Ghiorzi'}, 'imageUrl': 'https://api.moddevices.com/v2/pedalboards/58416f792564d40589088018/screenshot', 'fileHref': 'https://api.moddevices.com/v2/pedalboards/58416f792564d40589088018/file', 'id': '58416f792564d40589088018', 'title': 'Basic Power Trio Guit', 'audioUrl': 'https://api.moddevices.com/v2/pedalboards/58416f792564d40589088018/audio', 'hasAudio': True, 'description': 'Practical set for basic power trio gigs.\nClean, Drive, and Solo Delay...', 'plugins': []}, {'created': '2016-12-02T04:05:35.403000', 'author': {'avatarUrl': 'https://www.gravatar.com/avatar/none', 'name': 'Mariva'}, 'imageUrl': 'https://api.moddevices.com/v2/pedalboards/5840f30f2564d4058ad4c0f1/screenshot', 'fileHref': 'https://api.moddevices.com/v2/pedalboards/5840f30f2564d4058ad4c0f1/file', 'id': '5840f30f2564d4058ad4c0f1', 'title': 'Crunch with Clean 

Print the result in a more readable way. The `info` function defined here is used throughout the rest of this notebook.

In [3]:
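import io
import pandas as pd

def info(pedalboards):
    # Serialize to JSON so that pandas parses and normalizes the records
    data_dump = json.dumps(pedalboards)

    return pd.read_json(
          io.StringIO(data_dump),  # passing a literal JSON string is deprecated in recent pandas
          orient='records',        # the payload is a list of records; 'created' is not a valid orient
          convert_dates=['created'],
          date_unit='s'
        ).drop(labels=['audioUrl', 'author', 'hasAudio', 'fileHref'], axis=1)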

df = info(data['items'])
df.head()

| | created | description | id | imageUrl | plugins | title |
|---|---|---|---|---|---|---|
| 0 | 2016-12-02T12:56:24.984000 | Practical set for basic power trio gigs.\nClea... | 58416f792564d40589088018 | https://api.moddevices.com/v2/pedalboards/5841... | `[]` | Basic Power Trio Guit |
| 1 | 2016-12-02T04:05:35.403000 | A strong Crunch (DS), with deep and clean ambi... | 5840f30f2564d4058ad4c0f1 | https://api.moddevices.com/v2/pedalboards/5840... | `[]` | Crunch with Clean Ambience |


## Analysis

We now process the first 1000 pedalboards (the platform may have fewer than that registered).

In [4]:
limite = 1000
r = requests.get('https://pedalboards.moddevices.com/api/pedalboards?skip=0&take={}'.format(limite))
data = r.json()

print("Total pedalboards loaded:", len(data['items']))
print("Are there more pedalboards?", data['hasMore'])
print("Limit:", limite, "pedalboards to be loaded")

Total pedalboards loaded: 152
Are there more pedalboards? False
Limit: 1000 pedalboards to be loaded
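
If the platform grows beyond what a single request returns, the listing could be read page by page. The sketch below is only an illustration: it assumes that `skip` is an item offset and that `hasMore` stays true while further pages exist, which the response above suggests but does not prove. The names `page_url`, `fetch_all` and `fetch_json` are hypothetical helpers, not part of the API.

```python
from urllib.parse import urlencode

BASE_URL = 'https://pedalboards.moddevices.com/api/pedalboards'


def page_url(skip, take):
    """Build the listing URL for one page of results."""
    return '{}?{}'.format(BASE_URL, urlencode({'skip': skip, 'take': take}))


def fetch_all(fetch_json, take=100):
    """Collect items page by page until the API reports no more results.

    fetch_json is any callable that receives a URL and returns the decoded
    JSON body. Assumption: `skip` counts items and `hasMore` signals that
    another page exists.
    """
    items = []
    skip = 0
    while True:
        page = fetch_json(page_url(skip, take))
        items.extend(page['items'])
        # Stop when the API says so, or when a page comes back empty
        if not page['hasMore'] or not page['items']:
            return items
        skip += len(page['items'])
```

With `requests` already imported, a call could look like `fetch_all(lambda url: requests.get(url).json())`. Passing the fetch function as a parameter keeps the paging logic testable without network access.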


### Tagging pedalboards

Since the site does not tag its pedalboards, we take on this task ourselves. To do so, we need to:

1. Find the most frequently used words, so that we can pick the ones that best fit the musical domain;
1. Classify the pedalboards based on those words.

Human review will most likely be needed.
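
The second step can be sketched as follows. This is a minimal illustration, not the final classifier: the label set simply reuses the examples from the introduction (clean, crunch, solo, shimmer), the tokenization is a plain regular expression rather than NLTK, and `simple_tokens` and `label_pedalboard` are hypothetical names.

```python
import re

# Candidate labels taken from the examples in the introduction. The final
# list is an assumption and would come out of the word-frequency analysis.
LABELS = {'clean', 'crunch', 'solo', 'shimmer'}


def simple_tokens(text):
    """Lowercase a text and split it into alphabetic words."""
    return re.findall(r'[a-z]+', text.lower())


def label_pedalboard(pedalboard, labels=LABELS):
    """Return the sorted labels found in the title or the description."""
    words = set(simple_tokens(pedalboard['title']))
    words.update(simple_tokens(pedalboard['description']))
    return sorted(labels & words)


# Title and (truncated) description copied from the listing shown earlier
example = {
    'title': 'Crunch with Clean Ambience',
    'description': 'A strong Crunch (DS), with deep and clean ambi...',
}
print(label_pedalboard(example))  # ['clean', 'crunch']
```

Exact word matching is deliberately naive: it misses synonyms and misspellings, which is one reason human review remains necessary.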

Tokenize (split) the words and tag each one with its part of speech.

In [5]:
import nltk

tokenize = lambda phrase: [word.lower() for word in nltk.word_tokenize(phrase)]

for pedalboard in data['items']:
    pedalboard['description_tokenized'] = tokenize(pedalboard['description']) + tokenize(pedalboard['title'])
    pedalboard['description_tagged'] = nltk.pos_tag(pedalboard['description_tokenized'])

df = info(data['items'])
df.head()

| | created | description | description_tagged | description_tokenized | id | imageUrl | plugins | title |
|---|---|---|---|---|---|---|---|---|
| 0 | 2016-12-02T12:56:24.984000 | Practical set for basic power trio gigs.\nClea... | `[[practical, JJ], [set, NN], [for, IN], [basic...` | `[practical, set, for, basic, power, trio, gigs...` | 58416f792564d40589088018 | https://api.moddevices.com/v2/pedalboards/5841... | `[]` | Basic Power Trio Guit |
| 1 | 2016-12-02T04:05:35.403000 | A strong Crunch (DS), with deep and clean ambi... | `[[a, DT], [strong, JJ], [crunch, NN], [(, (], ...` | `[a, strong, crunch, (, ds, ), ,, with, deep, a...` | 5840f30f2564d4058ad4c0f1 | https://api.moddevices.com/v2/pedalboards/5840... | `[]` | Crunch with Clean Ambience |
| 2 | 2016-11-22T21:52:58.597000 | This is an updated version of the board I used... | `[[this, DT], [is, VBZ], [an, DT], [updated, JJ...` | `[this, is, an, updated, version, of, the, boar...` | 5834be3a2564d426347c1831 | https://api.moddevices.com/v2/pedalboards/5834... | `[]` | dual channel bleepy ambience and overdriven lead |
| 3 | 2016-11-19T21:16:09.820000 | lots of sounds, lots of fun.. | `[[lots, NNS], [of, IN], [sounds, NNS], [,, ,],...` | `[lots, of, sounds, ,, lots, of, fun.., br, -, ...` | 5830c1192564d426347c17a1 | https://api.moddevices.com/v2/pedalboards/5830... | `[]` | BR - RingM-Scream-Loop |
| 4 | 2016-11-17T07:02:26.926000 | Version 2 of my shimmer/swell pedalboard. Easi... | `[[version, NN], [2, CD], [of, IN], [my, PRP$],...` | `[version, 2, of, my, shimmer/swell, pedalboard...` | 582d56032564d42635fa9a52 | https://api.moddevices.com/v2/pedalboards/582d... | `[]` | Shimmer2 |


Most common words. Note that much of what shows up here is of no interest (articles, prepositions, punctuation).

In [6]:
from collections import Counter
from pandas import Series

count_all = Counter()

for pedalboard in data['items']:
    count_all.update(pedalboard['description_tokenized'])

print(Series(count_all.most_common(25)))

0       (the, 231)
1         (., 180)
2       (and, 106)
3         (,, 106)
4          (a, 95)
5       (with, 90)
6         (to, 73)
7        (for, 72)
8         (of, 48)
9     (guitar, 44)
10        (in, 36)
11         (i, 34)
12         (!, 34)
13       (mod, 33)
14     (clean, 32)
15        (on, 32)
16        (it, 31)
17         (), 31)
18         (:, 30)
19         (-, 29)
20      (this, 27)
21        (is, 26)
22     (sound, 24)
23        (my, 22)
24     (using, 22)
dtype: object
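
Most of the top entries above are punctuation or function words. One possible way to reduce that noise, sketched here under assumptions, is to skip non-alphabetic tokens and a stop-word list while counting. The `STOP_WORDS` set below is a small hand-picked list built from the output above purely for illustration, and `count_relevant` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical, hand-picked stop-word list for illustration only. A fuller
# list (for example one shipped with an NLP library) would be used in practice.
STOP_WORDS = {'the', 'and', 'a', 'with', 'to', 'for', 'of', 'in', 'i',
              'on', 'it', 'this', 'is', 'my'}


def count_relevant(token_lists, stop_words=STOP_WORDS):
    """Count tokens, skipping punctuation and stop words."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(
            token for token in tokens
            if token.isalpha() and token not in stop_words
        )
    return counts


# Token lists in the same shape as pedalboard['description_tokenized']
sample = [
    ['a', 'clean', 'patch', 'with', 'some', 'nice', 'reverb', '.'],
    ['lots', 'of', 'sounds', ',', 'lots', 'of', 'fun..'],
]
print(count_relevant(sample).most_common(1))  # [('lots', 2)]
```

In the notebook, the input would be `(p['description_tokenized'] for p in data['items'])`. Note that `str.isalpha` also discards tokens such as `shimmer/swell`, so a gentler filter may be preferable.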


Most common tags

In [7]:
words_tagged = []

for pedalboard in data['items']:
    words_tagged += pedalboard['description_tagged']

tag_fd = nltk.FreqDist(tag for (word, tag) in words_tagged)
print(tag_fd.most_common())

[('NN', 1125), ('JJ', 430), ('DT', 405), ('IN', 377), ('.', 217), ('NNS', 177), ('RB', 123), ('CC', 123), ('VB', 121), (',', 106), ('VBG', 97), ('CD', 84), ('VBZ', 77), (':', 73), ('TO', 73), ('VBN', 65), ('PRP', 61), ('VBP', 52), ('VBD', 51), (')', 31), ('PRP$', 31), ('RP', 27), ('MD', 22), ('(', 16), ('NNP', 14), ('POS', 12), ('WDT', 11), ("''", 10), ('``', 9), ('WRB', 8), ('FW', 7), ('SYM', 6), ('PDT', 5), ('JJR', 4), ('#', 3), ('JJS', 2), ('$', 2), ('EX', 2), ('WP', 2), ('RBR', 2)]


In [8]:
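# I did not understand why this prints nothing. A likely cause: similar() compares
# the words surrounding each token, and a set keeps each word once, in arbitrary order.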
text = nltk.Text(set(count_all))
text.similar('effect')




Empirically, the NN tags cover most of the cases we are interested in:

In [9]:
nltk.help.upenn_tagset('NN.*')

NN: noun, common, singular or mass
    common-carrier cabbage knuckle-duster Casino afghan shed thermostat
    investment slide humour falloff slick wind hyena override subhumanity
    machinist ...
NNP: noun, proper, singular
    Motown Venneboerger Czestochwa Ranzer Conchita Trumplane Christos
    Oceanside Escobar Kreisler Sawyer Cougar Yvette Ervin ODI Darryl CTCA
    Shannon A.K.C. Meltex Liverpool ...
NNPS: noun, proper, plural
    Americans Americas Amharas Amityvilles Amusements Anarcho-Syndicalists
    Andalusians Andes Andruses Angels Animals Anthony Antilles Antiques
    Apache Apaches Apocrypha ...
NNS: noun, common, plural
    undergraduates scotches bric-a-brac products bodyguards facets coasts
    divestitures storehouses designs clubs fragrances averages
    subjectivists apprehensions muses factory-jobs ...


In [10]:
# Keep only the tokens tagged as nouns (NN, NNS, NNP, NNPS)
nouns = [(word, tag) for (word, tag) in words_tagged if tag.startswith('NN')]

noun_frequency = nltk.FreqDist(word for (word, tag) in nouns)
print(noun_frequency.most_common(25))

[('guitar', 43), ('mod', 31), ('sound', 23), ('i', 21), ('loop', 18), ('pedalboard', 17), ('bass', 17), ('tone', 14), ('reverb', 14), ('delay', 14), ('caps', 13), ('midi', 12), ('side', 11), ('cabinet', 11), ('sounds', 11), ('instruments', 10), ('channel', 10), ('space', 10), ('amp', 9), ('effects', 9), ('olaf', 8), ('pedal', 8), ('stefan', 8), ('patch', 8), ('cowboys', 8)]


Pedalboards with reverb

In [11]:
pedalboards_with_reverb = [pedalboard for pedalboard in data['items'] if 'reverb' in pedalboard['description_tokenized']]

df = info(pedalboards_with_reverb)
df.head()

| | created | description | description_tagged | description_tokenized | id | imageUrl | plugins | title |
|---|---|---|---|---|---|---|---|---|
| 0 | 2016-10-08T10:50:08.322000 | A kind of maximalistic reverb. Set up as a sen... | `[[a, DT], [kind, NN], [of, IN], [maximalistic,...` | `[a, kind, of, maximalistic, reverb, ., set, up...` | 57f8cf602564d40b88f95aab | https://api.moddevices.com/v2/pedalboards/57f8... | `[]` | Deep Well |
| 1 | 2016-10-01T11:44:48.417000 | A clean patch with some nice reverb. | `[[a, DT], [clean, JJ], [patch, NN], [with, IN]...` | `[a, clean, patch, with, some, nice, reverb, .,...` | 57efa1b02564d40b88f9598f | https://api.moddevices.com/v2/pedalboards/57ef... | `[]` | Klean Aula #1 |
| 2 | 2016-09-30T19:39:33.725000 | Just a some tools to use with my students | `[[just, RB], [a, DT], [some, DT], [tools, NNS]...` | `[just, a, some, tools, to, use, with, my, stud...` | 57eebf752564d40b88f95960 | https://api.moddevices.com/v2/pedalboards/57ee... | `[]` | Mixer, Reverb and Click [rogeriocouto.com.br] |
| 3 | 2016-09-20T10:05:34.909000 | Inspired by Tame Impala, having the phaser and... | `[[inspired, VBN], [by, IN], [tame, NN], [impal...` | `[inspired, by, tame, impala, ,, having, the, p...` | 57e109ef2564d426b976d740 | https://api.moddevices.com/v2/pedalboards/57e1... | `[]` | Not-so-tame Impala |
| 4 | 2016-09-12T17:03:21.188000 | Combining a rhythmic delay in parallel with my... | `[[combining, VBG], [a, DT], [rhythmic, JJ], [d...` | `[combining, a, rhythmic, delay, in, parallel, ...` | 57d6dfd92564d404d77cd0a4 | https://api.moddevices.com/v2/pedalboards/57d6... | `[]` | Ambient Guitar |


Fetch the details of each pedalboard to find out which effects it uses

In [13]:
for pedalboard in pedalboards_with_reverb:
    r = requests.get('https://pedalboards.moddevices.com/api/pedalboards/{}'.format(pedalboard['id']))
    # Store the details directly, so the `data` listing loaded earlier is not overwritten
    pedalboard['data'] = r.json()

In [14]:
pedalboards_with_reverb_effects = []

for pedalboard in pedalboards_with_reverb:
    for plugin in pedalboard['data']['plugins']:
        pedalboards_with_reverb_effects.append(plugin['uri'])


pedalboards_with_reverb_effects = list(Counter(pedalboards_with_reverb_effects).most_common())
effects = [uri for uri, _ in pedalboards_with_reverb_effects]
counts = [count for _, count in pedalboards_with_reverb_effects]

Series(counts, index=effects)

http://moddevices.com/plugins/mod-devel/Gain2x2                        9
http://moddevices.com/plugins/mod-devel/Gain                           6
http://drobilla.net/plugins/fomp/reverb                                5
http://guitarix.sourceforge.net/plugins/gxts9#ts9sim                   5
http://guitarix.sourceforge.net/plugins/gx_cabinet#CABINET             5
http://moddevices.com/plugins/caps/AmpVTS                              3
http://moddevices.com/plugins/sooperlooper                             3
http://moddevices.com/plugins/mod-devel/ToggleSwitch4                  3
http://invadarecords.com/plugins/lv2/delay/mono                        2
http://moddevices.com/plugins/mod-devel/HighPassFilter                 2
http://moddevices.com/plugins/tap/tubewarmth                           2
http://guitarix.sourceforge.net/plugins/gx_studiopre#studiopre         2
http://faust-lv2.googlecode.com/Prefreak                               2
http://guitarix.sourceforge.net/plugins/gx_susta_#_
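
One of the opening questions asks whether there is a preferred company. A rough first approximation is to aggregate the counts by the host part of each plugin URI. This assumes the host identifies the publisher, which is not always the case: the `caps` and `tap` plugins above, for example, live under moddevices.com. The name `count_by_host` is illustrative, and the sample reuses the first five rows of the output above.

```python
from collections import Counter
from urllib.parse import urlparse


def count_by_host(uri_counts):
    """Sum plugin usage counts per URI host.

    Assumption: the host is only a proxy for the publisher of the plugin.
    """
    hosts = Counter()
    for uri, count in uri_counts:
        hosts[urlparse(uri).netloc] += count
    return hosts


# First five (uri, count) pairs from the series printed above
sample = [
    ('http://moddevices.com/plugins/mod-devel/Gain2x2', 9),
    ('http://moddevices.com/plugins/mod-devel/Gain', 6),
    ('http://drobilla.net/plugins/fomp/reverb', 5),
    ('http://guitarix.sourceforge.net/plugins/gxts9#ts9sim', 5),
    ('http://guitarix.sourceforge.net/plugins/gx_cabinet#CABINET', 5),
]
print(count_by_host(sample).most_common())
# [('moddevices.com', 15), ('guitarix.sourceforge.net', 10), ('drobilla.net', 5)]
```

In the notebook, the `pedalboards_with_reverb_effects` list of `(uri, count)` pairs could be passed in directly.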

Study reference material:
* https://rodjun.github.io/analise-kabum.html
* http://www.datascienceacademy.com.br/path-player?courseid=python-fundamentos
* https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks#natural-language-processing
* http://nbviewer.jupyter.org/github/fbkarsdorp/python-course/blob/master/Chapter%204%20-%20Programming%20principles.ipynb#Use-library-references,-3rd-party-modules,-and-google-errors-if-you're-stuck

Links to keep at hand:
* Pedalboards: https://pedalboards.moddevices.com/api/pedalboards?skip=0&take=1000&all=true
* Pedalboard:
  * Details: https://api.moddevices.com/v2/pedalboards/58416f792564d40589088018
  * Image: https://api.moddevices.com/v2/pedalboards/58416f792564d40589088018/thumbnail