Audio captioning model in Keras
This is a general sequence-to-sequence model.
Details are in my ISMIR 2016 Late-Breaking/Demo (LBD) extended abstract.
TODO
```python
if recommendation.type == 'playlist':
    if generation_method == 'automatic':
        raise NoDescriptionError(
            "No description! Users don't understand what the playlists are "
            "about and are confused. Consider generating descriptions so "
            "that music discovery can be easier!")
```
E.g.,
- "Playfully, silly or sublime--this is the sound of Paul in love." (Paul McCartney Ballads playlist, Apple Music)
- "Just the right blend of chilled-out acoustic songs to work, relax, and dream to." (Your Coffee Break playlist, Spotify)
- Tag prediction for tracks (Eck et al., 2008)
- Tags for playlists (Fields et al., 2010)
- Visual avatars for tracks (Bogdanov et al., 2013)
- RNNs for sequence modelling
- Seq2seq, which uses RNNs to model the relationship between two sequences, e.g., two sentences in different languages
- Word2vec, or any other word-embedding method
- ConvNets (convolutional neural networks) for various tasks, including music
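To make the seq2seq idea above concrete, here is a toy, dependency-free sketch (not the actual Keras model): an RNN encoder folds the input sequence into a hidden state, and an RNN decoder unrolls that state into an output sequence. The scalar weights `w_x` and `w_h` and the elementwise cell are illustrative simplifications, not the real parameterisation.

```python
import math

def rnn_step(x, h, w_x, w_h):
    # Elementwise toy "RNN cell": h' = tanh(w_x * x + w_h * h), per dimension.
    return [math.tanh(w_x * xi + w_h * hi) for xi, hi in zip(x, h)]

def encode(seq, dim, w_x=0.5, w_h=0.5):
    # Fold the whole input sequence into a single hidden state.
    h = [0.0] * dim
    for x in seq:
        h = rnn_step(x, h, w_x, w_h)
    return h

def decode(h, steps, w_x=0.5, w_h=0.5):
    # Unroll the encoder state, feeding each output back in as input.
    outputs = []
    x = h
    for _ in range(steps):
        h = rnn_step(x, h, w_x, w_h)
        outputs.append(h)
        x = h
    return outputs

track_feats = [[0.2, 0.8], [0.9, 0.1]]  # input: a sequence of track features
state = encode(track_feats, dim=2)
preds = decode(state, steps=3)          # output: 3 predicted embedding steps
```

In the real model, the cells would be trained Keras RNN layers and the decoder outputs would be word embeddings of the description.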
- Input
  - A sequence of track features
  - A track feature: Concat(audio_feature, text_embedding)
    - audio_feature: audio content feature
    - text_embedding: text summarisation of the track's text data (metadata, lyrics, descriptions, ...)
      - text summarisation method: averaging the word embeddings of every word
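The input construction above can be sketched as follows. The helper names and the tiny 2-d embedding table are hypothetical, made up for illustration; the real audio features and word embeddings come from elsewhere.

```python
def average_word_embeddings(words, embedding_table):
    """Summarise text by averaging the word embeddings of every word."""
    vectors = [embedding_table[w] for w in words if w in embedding_table]
    dim = len(next(iter(embedding_table.values())))
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def track_feature(audio_feature, text_embedding):
    """Concat(audio_feature, text_embedding) -> one track feature vector."""
    return list(audio_feature) + list(text_embedding)

# Toy 2-d word embeddings (made up for illustration).
emb = {"chilled": [1.0, 0.0], "acoustic": [0.0, 1.0]}
text = average_word_embeddings(["chilled", "acoustic"], emb)  # -> [0.5, 0.5]
feat = track_feature([0.2, 0.8, 0.1], text)                   # 5-d feature
```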
- Output
  - A sequence of word embeddings
  - Each word embedding represents one word of the description. E.g., if the ground-truth description is "Playfully, silly or sublime--this is the sound of Paul in love", the target sequence is
    word_embedding(playfully),
    word_embedding(silly),
    word_embedding(or),
    ...
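Building that target sequence can be sketched like this. The function name, the tokenisation (lowercase, strip punctuation), and the toy embedding table are all assumptions for illustration; unknown words fall back to a placeholder `unk` vector.

```python
import string

def description_to_targets(description, embedding_table, unk=(0.0, 0.0)):
    """Tokenise a description and map each word to its embedding."""
    words = description.lower().translate(
        str.maketrans("", "", string.punctuation)).split()
    return [embedding_table.get(w, unk) for w in words]

# Toy 2-d embeddings, made up here; "sublime" is deliberately missing.
emb = {"playfully": (0.1, 0.9), "silly": (0.2, 0.8), "or": (0.0, 0.0)}
targets = description_to_targets("Playfully, silly or sublime", emb)
# one embedding per word of the description
```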
I don't have proper results yet and am looking for a dataset. Can anybody help?