Hack for the SRG SSR Hackdays 2014

SRG Transcriptor

A transcript crawler, search engine and explorer for SRF news and talk shows. http://srf-transcriptor.herokuapp.com/

This was implemented at the SRG SSR Hackdays 2014 and is mostly a proof of concept.

Documentation of the data formats used can be found in the wiki.

Run Main Server

npm install
npm start

This will start the server on localhost:3000, serving API endpoints for searching and retrieving transcripts. Additionally, angular/dist is served statically.

API example: http://srf-transcriptor.herokuapp.com/search?q=Geri%20M%C3%BCller
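A minimal sketch of calling the search endpoint from Node.js is shown below. Only the `/search?q=...` path comes from the example URL; the helper names `buildSearchUrl` and `searchTranscripts` are hypothetical, and the global `fetch` assumes Node.js 18 or newer. The response is returned as raw text, since the data formats are documented in the wiki rather than here.

```javascript
// Hypothetical helpers; only the /search?q=... endpoint is taken from the README.

// Builds the search URL. encodeURIComponent percent-encodes spaces and
// non-ASCII characters (as UTF-8), so "Geri Müller" becomes "Geri%20M%C3%BCller".
function buildSearchUrl(baseUrl, query) {
  // Strip trailing slashes so the path is not doubled up.
  return baseUrl.replace(/\/+$/, '') + '/search?q=' + encodeURIComponent(query);
}

// Sends the search request and resolves with the raw response body.
// Assumes Node.js 18+ (global fetch).
async function searchTranscripts(baseUrl, query) {
  const response = await fetch(buildSearchUrl(baseUrl, query));
  if (!response.ok) {
    throw new Error('Search failed with HTTP status ' + response.status);
  }
  return response.text();
}

// Usage, with the main server running locally:
// searchTranscripts('http://localhost:3000', 'Geri Müller').then(console.log);
```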

Notes

The front end build is checked in so that the whole application can be deployed to Heroku easily. This could be optimized in the future.

Develop Front End

npm install
npm start
cd angular/
npm install
bower install
grunt serve

The front end dev server relays API requests to localhost:3000 via grunt-connect-proxy, so make sure the main server is also running from the root directory.
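For orientation, a proxy entry of the kind grunt-connect-proxy reads from the `connect` task configuration might look like the sketch below. The `/search` context is taken from the API example; the dev server port 9000 and the variable name `connectConfig` are assumptions, and the project's actual Gruntfile may list other contexts.

```javascript
// Illustrative fragment only, not copied from the project's Gruntfile.
const connectConfig = {
  options: {
    port: 9000,            // assumed dev server port
    hostname: 'localhost'
  },
  // grunt-connect-proxy forwards requests whose path starts with `context`
  // to the given host and port.
  proxies: [
    {
      context: '/search',  // API path from the example URL
      host: 'localhost',
      port: 3000           // main server started with `npm start`
    }
  ]
};
```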

Crawler

The crawler is part of the backend.

cd backend
grunt --help

Add a New Show

grunt add:show --id=3b016ffc-afa2-466d-a694-c48b7ffe1783

Fetch Data and Transcripts

This will fetch episode information and transcripts of all added shows.

grunt fetch:shows
grunt fetch:transcripts

Process Data for Delivery

grunt parse:transcripts
grunt parse:shows

Currently the processed data needs to be checked in for deployment.
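The fetch and process steps above could be chained in a small Node.js script, sketched here. The task names and their order come from this README; `runPipeline` and `runGruntTask` are hypothetical names, and the script assumes the `grunt` command is available on the PATH.

```javascript
const { execFileSync } = require('child_process');

// Task order taken from the sections above: fetch first, then parse.
const PIPELINE = ['fetch:shows', 'fetch:transcripts', 'parse:transcripts', 'parse:shows'];

// Default runner: invokes the grunt CLI in the given directory and throws
// if the task exits with a non-zero status. On Windows the grunt command
// may need to be run through a shell instead.
function runGruntTask(task, cwd) {
  execFileSync('grunt', [task], { cwd, stdio: 'inherit' });
}

// Runs every task in order and stops at the first failure, because the
// runner throws. Returns the number of tasks that were run.
function runPipeline(backendDir, run = runGruntTask) {
  for (const task of PIPELINE) {
    run(task, backendDir);
  }
  return PIPELINE.length;
}

// Usage from the repository root:
// runPipeline('backend');
```

Passing the runner as a parameter keeps the ordering logic testable without invoking Grunt.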

Dependencies

  • Node.js
  • Grunt
  • Bower
  • Compass

Clips

There is an experimental algorithm included to compose short clips out of text.

Usage:

node backend/clip -m 'Krieg in eine Weile her und es wird Sie eine Weile nicht sehen können, den Fall von ihm zu bekommen, ist nicht ein Problem mit ihm für eine Weile her,'

An MP4 clip will be composed and saved to backend/clips, along with a JSON file containing the source metadata.

The message above is taken from a tweet by @lauraperrenoud and was translated to German with Google Translate.