More than Just Words: Modeling Non-textual Characteristics of Podcasts

Podcast data modeling

This repository contains a podcast dataset and an implementation of the Adversarial Learning-based Podcast Representation (ALPR) introduced in the following paper:

Longqi Yang, Yu Wang, Drew Dunne, Michael Sobolev, Mor Naaman and Deborah Estrin. 2018. More than Just Words: Modeling Non-textual Characteristics of Podcasts. In Proceedings of WSDM’19.

A pretrained model is also included. Please direct any questions to Longqi Yang.

If you use this data or algorithm, please cite:

@inproceedings{yang2019podcast,
  title={More than Just Words: Modeling Non-textual Characteristics of Podcasts},
  author={Yang, Longqi and Wang, Yu and Dunne, Drew and Sobolev, Michael and Naaman, Mor and Estrin, Deborah},
  booktitle={Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining},
  year={2019},
  organization={ACM}
}

Code descriptions

  • Converting a WAV audio into a Mel-Spectrogram: wav_to_spectrogram.py.
  • Training ALPR: alpr.py (the files variable must be set to a list of spectrogram files).
  • Extracting ALPR using a pretrained model: alpr_extractor.py.
  • Reproducing experimental results: energy_prediction.ipynb.
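The list of spectrogram files that alpr.py expects could be built as in the minimal sketch below. The helper name `list_spectrogram_files`, the `e_*.npy` pattern, and the directory are assumptions taken from the data layout described in this README, not code from the repository.

```python
import glob
import os


def list_spectrogram_files(directory, pattern="e_*.npy"):
    """Return a sorted list of spectrogram paths found in `directory`."""
    return sorted(glob.glob(os.path.join(directory, pattern)))


# Assumed location; point this at wherever the spectrograms were downloaded.
files = list_spectrogram_files("data/popularity_prediction_spectrograms")
```

The resulting list can then be assigned to the files variable in alpr.py. If the directory does not exist, the list is simply empty.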

Data descriptions

Raw podcast audio URLs

Each line of these files contains a podcast episode represented as a JSON object with the following fields:

{
    "url": the URL to download the raw audio,
    "itunes_channel_id": the iTunes channel that the episode belongs to,
    "id": a unique episode ID,
    "title": the title of the episode
}
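
Since each line is a standalone JSON object, the files can be read line by line. The sketch below is illustrative only: the function name `read_episodes` is hypothetical, and the sample record is made up (the real value types of `id` and `itunes_channel_id` are not stated here).

```python
import json


def read_episodes(lines):
    """Parse an iterable of JSON lines into a list of episode dicts."""
    episodes = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        episodes.append(json.loads(line))
    return episodes


# Made-up sample record using the four documented fields.
sample = [
    '{"url": "http://example.com/a.mp3", "itunes_channel_id": "123", '
    '"id": "42", "title": "Pilot"}',
]
episodes = read_episodes(sample)
```

An open file object can be passed directly in place of `sample`, because iterating over a file yields its lines.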

Prediction labels

Prediction features and raw audio (caveat: the files are large)

  • Energy and seriousness predictions:
    • Spectrograms:
      data/attributes_prediction_spectrograms/e_[episode id]_[offset].npy
      
    • Raw audio:
      data/attributes_prediction_raw_audio/e_[episode id]_[offset].wav
      
  • Popularity prediction:
    • Spectrograms:

      data/popularity_prediction_spectrograms/e_[episode id]_[0 -- length-1].npy
      
    • Transcriptions:

      data/popularity_prediction_transcriptions/e_[episode id].txt
      
      • A transcription file lists the transcribed words, one per line, in the following format:

      spoken word \t start time (ms) \t end time (ms)

    • Raw audio:

      data/popularity_prediction_raw_audio/e_[episode id]_[0 -- length-1].wav
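
The tab-separated transcription format described above could be parsed as in this minimal sketch. The `Word` type and `parse_transcription` function are hypothetical names, and the timestamps are read as floats because this README does not say whether they are integers.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    text: str
    start_ms: float
    end_ms: float


def parse_transcription(lines) -> List[Word]:
    """Parse lines of the form 'word<TAB>start (ms)<TAB>end (ms)'."""
    words = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        text, start, end = line.split("\t")
        words.append(Word(text, float(start), float(end)))
    return words


# Made-up sample lines in the documented format.
words = parse_transcription(["hello\t0\t350\n", "world\t400\t900\n"])
```

As with any line-based parser, an open transcription file can be passed in place of the sample list.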
      

Reproducing experimental results using the pretrained model

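from alpr_extractor import ALPRExtractor

# Load the pretrained ALPR weights.
extractor = ALPRExtractor()
extractor.load_model(path='pretrained_model/alpr')

# `spectrograms` must already hold the Mel-spectrograms to encode;
# they are rescaled with (x + 2) / 2 before being passed to the extractor.
features = extractor.forward((spectrograms + 2) / 2)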
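
The popularity spectrograms are stored as per-episode segments named e_[episode id]_[index].npy, so it can help to group the segment files by episode and order them by index before loading them. The sketch below is one possible way to do that. The regular expression and the `group_segments` helper are assumptions based on the naming scheme in this README, not part of the repository.

```python
import os
import re
from collections import defaultdict

# Assumed naming scheme: e_[episode id]_[segment index].npy
SEGMENT_RE = re.compile(r"^e_(.+)_(\d+)\.npy$")


def group_segments(paths):
    """Map episode id -> list of segment paths ordered by segment index."""
    grouped = defaultdict(list)
    for path in paths:
        match = SEGMENT_RE.match(os.path.basename(path))
        if match is None:
            continue  # ignore files that do not follow the naming scheme
        episode_id, index = match.group(1), int(match.group(2))
        grouped[episode_id].append((index, path))
    # Sort numerically by index so that segment 10 comes after segment 2.
    return {eid: [p for _, p in sorted(items)] for eid, items in grouped.items()}


# Made-up paths for illustration.
paths = ["d/e_7_1.npy", "d/e_7_0.npy", "d/e_9_0.npy", "d/e_7_10.npy"]
groups = group_segments(paths)
```

Each list in `groups` can then be loaded in order to form the `spectrograms` input for one episode.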