
LinkedTV project
Supported by LinkedTV

IRAPI - media search engine

IRAPI is a repository that holds all parts of a media search engine, which was initially developed for the LinkedTV project.

Folder description

  • "dashboard" : contains a web application that displays detailed statistics for the Apache Solr index and allows editing of the seed list.

  • "nutch-plugin" : contains a plugin for Apache Nutch 2. Its purpose is to extract media from web pages.

  • "solr-example-conf/cores" : contains an example configuration for an Apache Solr index compatible with the data structure required by the media extractor (nutch-plugin).

  • "search" : contains a web application providing an endpoint for searching over the indexed media data.

  • "focusedcrawler", "focusedcrawlerclient" : application for focused on-demand video crawling (wraps the online search of several news websites). Within IRAPI, the focused crawl is triggered by queries issued against the search web application.
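
Since the media data is indexed in Apache Solr, it can help to picture what a query against such an index looks like. The following is only a minimal sketch using Solr's standard `select` request handler. The host, port, and the core name `media` are assumptions made for illustration, and the sketch does not describe the endpoint of the "search" web application, which is documented in that folder's README.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_solr_select_url(base_url, core, query, rows=10):
    """Build a URL for Solr's standard select handler.

    base_url and core are placeholders; use the values of your own Solr setup.
    """
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Hypothetical local Solr instance with an assumed core named "media".
url = build_solr_select_url("http://localhost:8983/solr", "media", "*:*", rows=5)

# Decode the query string again to check what would be sent to Solr.
print(urlparse(url).path)
print(parse_qs(urlparse(url).query))
```

The match-all query `*:*` is used here so that the sketch does not assume any field names from the example configuration.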

Note: While the project is customized for LinkedTV purposes, it can serve as inspiration or a template for other related uses.

More information about the usage and installation of the individual applications can be found on the related wiki pages or in each folder's README.