Schema is a module for the Play! framework (1.2.X) which helps you create full semantic stack web apps. The view layers of your Play! app are picked from a series of template classes following the schema.org structure. Schema helps you provide next-generation content for search engines and program your application in the scope of the Web.
Release date version: September 2012
Author: Samuel Croset
Discussion: Play! group with [schema-1.0] in subject line.
"full semantic stack web app"?
The way we search for things on the Web is about to change dramatically in the coming years. We are entering an era where published information will be much more structured than before. To understand this better, let's look at the current semantic stacks.
The application semantics
Nowadays, when you build a web application (for instance with Play! or any MVC framework), you first define a solid structure for your model. The model helps you to interact with, query and extend your application. The content of the model is manipulated by a controller and then rendered in a view. The view is an HTML page template that will be used to publish information on the Web. In this context, you have strict semantics (the model) within the scope of your application.
The web page semantics
The view is really important, because it directly exposes your data to the WWW jungle. The view will be parsed and indexed by search engine companies such as Google or Yahoo. Then normal users (your parents, your friends) will use these very search engines to look for relevant information about whatever they are interested in. Search engines will try to give the most relevant answers, but they still fail hard on trivial tasks.
This problem comes mostly from the fact that most of the information on the Web is free text, meaning it's just plain text. And raw text is super ambiguous. For instance, a string of characters such as Taj Mahal can refer to a place, a restaurant or a casino. How can Google tell which one you want based on the query Taj Mahal? Pretty much just by looking at the popularity and trust of web pages containing the queried string (PageRank). Plain text also gets in the way of smarter uses of your data. For instance, it is almost impossible to convert the email you received containing your dentist appointment into a calendar entry in just one click, as natural language is often too complex to be correctly parsed.
When you publish a web page (with Play!), you throw away the time and effort you put into coming up with a nice structure in the first place (the model). Your web pages are pretty much a soup of HTML elements for display, with little meaning for search engines. Even if you had a class Restaurant in your model layer, the robot crawling the web page cannot see that you are actually referring to the concept of Restaurant. When you display a Restaurant object on a page, it will for instance have an <h1> tag with the name of the place and a <p> containing the data about it (for instance the menu), but there is no <restaurant> tag to explicitly tell search engines that this page is about a Restaurant, and that they could use this object as such to help users in their search.
Everyone would benefit from a clearer and more apparent structure: search engines, as they could make better use of your data; users, as they could find your information more easily; and you, because you could drive more traffic and get your business noticed and integrated more easily.
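To make the Restaurant example concrete, here is a minimal Java sketch that contrasts a plain rendering with one annotated using HTML microdata and the schema.org Restaurant type. The class and method names (RestaurantMicrodata, renderPlain, renderAnnotated) are invented for illustration and are not the schema module's API. Only the itemscope, itemtype and itemprop attributes and the schema.org type URL are real. In an actual Play! app this markup would more likely live in a view template than in string concatenation.

```java
/**
 * Hypothetical sketch, NOT the schema module's API: shows what a crawler
 * sees with and without schema.org microdata on a Restaurant page.
 */
final class RestaurantMicrodata {

    /** Escapes the characters that are special in HTML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Plain rendering: nothing in the markup says this is a restaurant. */
    static String renderPlain(String name, String description) {
        return "<div>\n"
             + "  <h1>" + escape(name) + "</h1>\n"
             + "  <p>" + escape(description) + "</p>\n"
             + "</div>";
    }

    /** Annotated rendering: the same elements, typed with microdata attributes. */
    static String renderAnnotated(String name, String description) {
        return "<div itemscope itemtype=\"http://schema.org/Restaurant\">\n"
             + "  <h1 itemprop=\"name\">" + escape(name) + "</h1>\n"
             + "  <p itemprop=\"description\">" + escape(description) + "</p>\n"
             + "</div>";
    }

    public static void main(String[] args) {
        System.out.println(renderPlain("Taj Mahal", "Curries & tandoori dishes"));
        System.out.println(renderAnnotated("Taj Mahal", "Curries & tandoori dishes"));
    }
}
```

The two outputs look identical in a browser. The difference is only visible to a crawler, which can now tell that "Taj Mahal" on this page names a restaurant rather than a monument or a casino.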
Full stack semantics
Here by full stack semantics I mean the following idea: a common data structure, shared at the scope of the Web, that you re-use to implement your model and view layers (and CSS and URL patterns). The shared structure is fairly simple to implement and improves interoperability between web developers, search engines and Web users.
The schema module assists you in building such a framework and embedding your application in the scope of the Web.
Our good old Web of ambiguity is turning into a Web of Objects, thanks to initiatives such as the Google Knowledge Graph or the Facebook Graph. Go to Google and type "things to do in paris" or "taj mahal". You will see real-life objects appearing, such as pictures of monuments.
In order to improve the publication of objects on the Web, the major search engine companies sat down together and came up with a series of universal classes, schema.org. These classes describe the common things one encounters on the Internet, as well as the relationships between them. The schema.org classes are not Java classes; they are just a taxonomy of common concepts you can use to annotate web pages. These classes are powerful because they provide a common interface between all the players on the Web: users, developers and search engine providers. The schema module will help you to implement these classes using Play! and to provide search engines with objects that they are likely to re-use.
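As a rough illustration of what "a taxonomy of common concepts" means, the sketch below hand-copies a small slice of the schema.org hierarchy into a Java map and walks it. The class SchemaTaxonomySketch and its methods are hypothetical and do not come from the schema module. The type names are real schema.org types, but only one parent is recorded per type (LocalBusiness also descends from Organization on schema.org), so treat this as a simplification.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a schema.org slice; not the schema module's API. */
final class SchemaTaxonomySketch {

    // child type -> parent type. Simplified: schema.org allows several
    // parents per type, only the Place branch is kept here.
    private static final Map<String, String> PARENT = new HashMap<String, String>();
    static {
        PARENT.put("Restaurant", "FoodEstablishment");
        PARENT.put("FoodEstablishment", "LocalBusiness");
        PARENT.put("LocalBusiness", "Place");
        PARENT.put("Place", "Thing");
    }

    /** Returns the type followed by its ancestors, most specific first. */
    static List<String> ancestry(String type) {
        List<String> chain = new ArrayList<String>();
        for (String t = type; t != null; t = PARENT.get(t)) {
            chain.add(t);
        }
        return chain;
    }

    /** True if the type is the given ancestor or descends from it. */
    static boolean isA(String type, String ancestor) {
        return ancestry(type).contains(ancestor);
    }

    public static void main(String[] args) {
        // prints [Restaurant, FoodEstablishment, LocalBusiness, Place, Thing]
        System.out.println(ancestry("Restaurant"));
        // prints true
        System.out.println(isA("Restaurant", "Place"));
    }
}
```

Because a Restaurant is also a Place and a Thing, a page annotated as a Restaurant can be used by any consumer that only understands the more general types.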
OK, so much for the introduction. I invite you to read the short but comprehensive documentation provided by schema.org before moving
to the example application, where I describe how to work with the