What is VisualLabel?
Welcome to the VisualLabel front-end wiki page! The wiki contains useful information about the VisualLabel service.
VisualLabel is a content analysis framework designed in the Data to Intelligence program. The participants of the multimedia ecosystem - that is, those responsible for the design, development and implementation of the framework - are shown in the Figure below. The Figure also illustrates the overall structure of the framework.
The main goal of the framework is to provide the user with meaningful information (metadata) about his or her multimedia content - in this case, images, videos and social media profiles. This metadata can be used to classify the user's content, and it can be used in advanced search queries. The actual multimedia content can be located on any of the supported content provider services (Facebook, Picasa, Twitter). The user can synchronize the content with the VisualLabel service by connecting his or her account with the service. Upon synchronization, the metadata for the account contents is retrieved, and based on this data, analysis tasks are created and delivered to the analysis back-ends. For images and videos the tasks generally contain basic details of the content (URLs, EXIF data, etc.), and for social media profiles the tasks may contain, for example, the status messages or tweets the user has posted.
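As an illustration, an analysis task for photo content might carry a payload along the following lines. This is a hypothetical sketch: the field names and structure are assumptions for illustration only, and the actual task format is defined by the front-end's interface specification.

```python
# Hypothetical sketch of an analysis task payload for photo content.
# All field names are illustrative assumptions; the real schema is
# defined by the VisualLabel front-end's interface specification.
analysis_task = {
    "taskId": "task-123",
    "taskType": "ANALYSIS",
    "userId": "user-42",
    "photos": [
        {
            "guid": "photo-1",
            "url": "https://example.org/photos/1.jpg",
            "exif": {"Make": "ExampleCam", "FocalLength": "35mm"},
        },
    ],
}

def photo_urls(task):
    """Collect the photo URLs a back-end would fetch for analysis."""
    return [photo["url"] for photo in task.get("photos", [])]
```

A back-end receiving such a task would retrieve each listed URL, run its analysis, and return the generated metadata to the front-end.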
In the context of the VisualLabel framework, the pieces of metadata generated by the back-ends are called objects. They may be simple key-value pairs, such as EXIF details, or more complex information about the content, such as persons recognized in the images, or keywords extracted using feature detection or by summarizing the user's social media profile. The objects are indexed on the front-end and can be used to query the user's content. They can also provide suggestions for the user when manually tagging new content.
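For example, back-end results might be represented as objects like the ones below. This is a minimal sketch under assumed names: the actual object schema is defined by the front-end's interface specification.

```python
# Illustrative metadata objects as returned by back-ends: simple
# key-value pairs (EXIF details) alongside richer analysis results
# such as detected keywords. The structure is an assumption for
# illustration, not the framework's actual schema.
objects = [
    {"name": "Make", "value": "ExampleCam", "type": "exif"},
    {"name": "keyword", "value": "beach", "type": "analysis", "confidence": 0.92},
    {"name": "keyword", "value": "sunset", "type": "analysis", "confidence": 0.85},
]

def keywords(objs, threshold=0.5):
    """Return keyword values above a confidence threshold, e.g. for indexing
    or for suggesting tags to the user."""
    return [o["value"] for o in objs
            if o.get("type") == "analysis" and o.get("confidence", 0) >= threshold]
```

Objects of this kind would be indexed on the front-end and matched against the user's search queries.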
The flow of events is illustrated in the Figure above. The components of the framework - the front-end and the back-ends - are described in the sub-chapters of this web page.
Demonstration videos about the features and capabilities of the analysis back-ends and the service as a whole can be found on the demonstration videos page.
The front-end is the connecting point for the framework. It offers the necessary APIs for the clients (queries, content modifications, authorization), manages the external account synchronization, and handles the creation and delivery of tasks for the back-ends.
The front-end is written in the Java programming language. The source code for the front-end can be found in the CAFrontEnd repository, and the HTML pages and demos presented during the Data to Intelligence program can be found in the frontend_html repository. Additionally, Javadoc documentation has been generated for online viewing. The interface specification can be found under the main Javadoc page, and for convenience a separate summary page has been generated and can be found here.
The framework supports any number of back-ends. The back-ends are classified by capabilities configured on the front-end. The capabilities define the task delivery rules - that is, which back-end gets what type of task. The current specification defines five separate task types:
- Analysis, for content analysis of photo and video content.
- Feedback, for providing feedback on back-end-generated tags and search results. The feedback can be utilized by the back-end for improving future analysis results.
- Photo similarity search, VisualLabel's version of query-by-example. Search queries can be performed by providing a URL to a photo file or by directly uploading photographic content.
- Facebook summarization, for extracting keywords from the user's Facebook profile.
- Twitter summarization, for extracting keywords from the user's Twitter profile.
The types can also be thought of as representing the features of the VisualLabel framework. Each back-end can support one or more task types. The back-ends and their features are explained in the sub-chapters below.
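The capability-based delivery rules can be sketched roughly as follows. This is a minimal illustration: the capability names and the mapping below are assumptions drawn from the back-end descriptions on this page, and the actual delivery logic lives in the CAFrontEnd repository.

```python
# Minimal sketch of capability-based task routing: each back-end
# declares the task types it supports, and the front-end delivers a
# task to every back-end with a matching capability. The back-end
# names and capability sets are illustrative assumptions.
BACKEND_CAPABILITIES = {
    "MUVIS": {"ANALYSIS", "PHOTO_SIMILARITY_SEARCH"},
    "PicSOM": {"ANALYSIS", "FEEDBACK"},
    "Summarizer": {"FACEBOOK_SUMMARIZATION", "TWITTER_SUMMARIZATION"},
}

def backends_for(task_type):
    """Return the back-ends whose configured capabilities include the task type."""
    return sorted(name for name, caps in BACKEND_CAPABILITIES.items()
                  if task_type in caps)
```

With a configuration like this, an analysis task would be delivered to both photo/video back-ends, while a Twitter summarization task would go to the summarizer alone.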
The MUVIS analysis back-end can be utilized for keyword extraction, face detection and content-based similarity search. The source code repository for the back-end can be found here.
More information on MUVIS can be found in the system's own web page.
The PicSOM analysis back-end can be utilized for keyword extraction and face detection. The source code repository for the back-end can be found here.
More information on PicSOM can be found on The Content-Based Image and Information Retrieval Group web page or by trying out the demo page.
The text summarizer can be used to analyze social media profiles, and to extract keywords from the profile content. The source code repository for the summarizer can be found here and a minimal web service implementation for testing the summarizer can be found here.