
beta release

1 parent 3a26fc4 commit 1dcb23905eebfedd12a0d4bb6d8bb6a99a04cce9 @thkoch2001 committed Mar 30, 2012
Showing with 271 additions and 207 deletions.
  1. +16 −0 latex/erklaerung.tex
  2. BIN latex/restful_groupware.pdf
  3. +219 −207 latex/restful_groupware.tex
  4. +36 −0 latex/titel.tex
16 latex/erklaerung.tex
@@ -0,0 +1,16 @@
+% Please adhere to this declaration as well; should we find that unlisted
+% sources were used or that verbatim quotations are not marked as such,
+% this automatically leads to failing
+% the thesis.
+
+\thispagestyle{empty}
+
+I hereby declare that I have written the following bachelor thesis
+independently. I have used no sources or aids other than those stated.
+Verbatim and paraphrased quotations are marked as such.
+
+\vspace{0.5cm}
+Kreuzlingen, 5 April 2012
+\vspace{1.5cm}
+
+Thomas Koch
BIN latex/restful_groupware.pdf
Binary file not shown.
426 latex/restful_groupware.tex
@@ -89,26 +89,28 @@
% Development of a REST-compliant interface for the open source groupware
% Kolab with support for different media types
-\title{A REST API for the Groupware Kolab with support for different Media Types}
-
-\subtitle{bachelor thesis}
-
-\author{Thomas Koch}
-\publishers{Fernuniversität Hagen\\Faculty of mathematics and computer science}
-\date{\today}
-
-\maketitle{}
+\include{titel}
\thispagestyle{empty}
-\pagenumbering{Roman}
-\vspace{3cm}
-\begin{center}
- matriculation number 7250371
-\end{center}
-\vspace{3cm}
-\begin{center}
-Professor Dr.-Ing. Bernd J. Krämer\\
-Dipl.-Inf. Silvia Schreier
-\end{center}
+\cleardoublepage
+\include{erklaerung}
+\pagenumbering{roman}
+
+
+% \title{A REST API for the Groupware Kolab with support for different Media Types}
+% \subtitle{bachelor thesis}
+% \author{Thomas Koch}
+% \publishers{Fernuniversität Hagen\\Faculty of mathematics and computer science}
+% \date{\today}
+% \maketitle{}
+% \vspace{3cm}
+% \begin{center}
+% matriculation number 7250371
+% \end{center}
+% \vspace{3cm}
+% \begin{center}
+% Professor Dr.-Ing. Bernd J. Krämer\\
+% Dipl.-Inf. Silvia Schreier
+% \end{center}
\section*{Aufgabenstellung}
\pagestyle{headings}
@@ -155,6 +157,25 @@ \section*{Aufgabenstellung}
% REST-Schnittstelle
% \end{itemize}
+% \section*{Declaration of Authorship}
+% % Do not print headers and footers
+% %\thispagestyle{empty}
+% % But still include it in the table of contents
+% % \addcontentsline{toc}{chapter}{Declaration of Authorship}
+% % Here the official text of the statutory declaration
+% I hereby declare that I have written this thesis independently and without
+% using any aids other than those stated; ideas taken directly or indirectly
+% from external sources are marked as such. This thesis has not been submitted
+% in the same or a similar form to any other examination board.
+% % Some space for the signature
+% \vspace{3cm}
+% % The signature goes above this
+% \begin{tabbing}
+% \hspace{6cm} \= \kill
+% \textit{Kreuzlingen, 4 April 2012} \> \textit{Thomas Koch}
+% \end{tabbing}
+
\tableofcontents{}
\begin{abstract}
@@ -997,31 +1018,40 @@ \subsubsection{Locking}
\section{REST Interactions Design}
\label{sec:interactions}
+
+The Atom Syndication Format\cite{RFC4287}, Atom Publishing
+Protocol\cite{RFC5023}, OpenSearch\cite{Clinton} and optional extensions are
+presented in this section as existing, fundamental building blocks to allow the
+design of a restful Groupware API.
+
\subsection{Discovery of Collections}
\label{sec:disc-coll}
An ideal REST API is accessed by one main URI and all other resources can be
-discovered by following links. A useful Media type to discover available
+discovered by following links. A useful Mediatype to discover available
collections is the Atom Service Document\cite[sec. 8]{RFC5023}. It contains
links to collections organized in workspaces and annotated with meta data.
A Groupware client most likely needs to discern the available collections by the
contained resources as to consume and present them with the appropriate user
-interfaces for contacts, calendar data, etc. A first idea could be to use the
-Media types declared in the ``accept'' tag of a collection to identify types of
+interfaces, e.g., for contacts or events. A first idea could be to use the
+Mediatypes declared in the ``accept'' tag of a collection to identify types of
collections. However the specification explicitly states that this tag
-``specifies a type of representation that can be POSTed to a Collection''. If a
-collection can only be read then no accept tag should be present and thus also
-not available for interpretation.
+``specifies a type of representation that can be POSTed to a
+Collection''\cite[sec. 8.3.4]{RFC5023}. If a collection can only be read, then
+its accept tag is empty and thus carries no information that would be
+available for interpretation.
-A standard conform approach is demonstrated by Google's Data
+A standards-conformant approach is demonstrated by Google's Data
Protocol\footnote{\citeurl{http://code.google.com/apis/gdata/docs/2.0/elements.html}{2012-2-28}}
and by an internal project at
IBM\footnote{\label{snellatomcategory}\citeurl{http://www.imc.org/atom-syntax/mail-archive/msg18208.html}{2012-2-28}}. Both
-use atom categories\cite[sec. 8.3.6]{RFC5023} to mark the type of atom
-entries. James Snell proposed a standard URI to identify the semantic of
-categories\footref{snellatomcategory} but no follow up to this could be
-found. The use of categories to attach arbitrary meaning, e.g ``event type
+use Atom categories\cite[sec. 8.3.6]{RFC5023} to mark the type of Atom entries,
+as shown in the first two elements of Listing \ref{fig:atom-category}. James
+Snell proposed a standard URI to identify the semantics of
+categories\footref{snellatomcategory}, demonstrated by the last two tags in
+Listing \ref{fig:atom-category}. However, no follow-up to this could be
+found. The use of categories to attach arbitrary meaning, e.g., ``event type
(product or promotion), and its status (new, updated, or cancelled)'' to feeds
and entries is also recommended in \cite[p. 200]{Webber2010}.
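
The category-based discovery described above can be sketched as follows. This is only an illustration under stated assumptions: the service document, its URIs, the category scheme http://example.org/kind and the terms "contact" and "event" are invented for the example and are not taken from Google's or IBM's vocabularies. Only the element names and namespaces follow RFC 5023 and RFC 4287.

```python
import xml.etree.ElementTree as ET

APP = "http://www.w3.org/2007/app"    # Atom Publishing Protocol namespace
ATOM = "http://www.w3.org/2005/Atom"  # Atom Syndication Format namespace

# Hypothetical service document: hrefs, scheme and terms are made up.
SERVICE_DOC = """<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Personal</atom:title>
    <collection href="http://example.org/contacts">
      <atom:title>Contacts</atom:title>
      <accept>text/vcard</accept>
      <categories fixed="yes">
        <atom:category scheme="http://example.org/kind" term="contact"/>
      </categories>
    </collection>
    <collection href="http://example.org/holidays">
      <atom:title>Public holidays</atom:title>
      <!-- An empty accept element: nothing can be POSTed here, so the
           accept tag says nothing about the kind of resources contained. -->
      <accept/>
      <categories fixed="yes">
        <atom:category scheme="http://example.org/kind" term="event"/>
      </categories>
    </collection>
  </workspace>
</service>"""


def collections_by_category(service_xml):
    """Map each category term to the hrefs of the collections declaring it."""
    root = ET.fromstring(service_xml)
    result = {}
    for coll in root.iter(f"{{{APP}}}collection"):
        for cat in coll.iter(f"{{{ATOM}}}category"):
            result.setdefault(cat.get("term"), []).append(coll.get("href"))
    return result


kinds = collections_by_category(SERVICE_DOC)
print(kinds)
```

A client could use such a mapping to decide which user interface to present for a collection, independent of whether the collection is writable.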
@@ -1031,9 +1061,10 @@ \subsection{Discovery of Collections}
a client could of course also fetch the feeds and identify the media types of
the included media entries.}
-An alternative Media Type to Service Documents in JSON could not be found. The
-most promising approach seems to list available collections in a
-\lstinline:application/vnd.collection+json: representation. (\autoref{sec:media-types-coll})
+An alternative Mediatype to Service Documents in JSON format could not be
+found. The most promising approach seems to list available collections in a
+\lstinline:application/vnd.collection+json: representation
+(\autoref{sec:media-types-coll}).
\begin{anylisting}[label=fig:atom-category,
caption={ATOM categories as used by Google and IBM to mark entry
@@ -1070,14 +1101,14 @@ \subsection{Personalized Service Documents} For a Groupware that manages
be to require the user to authenticate when requesting the unique entrance URI
and to answer with an HTTP code ``307 Temporary Redirect'' to the user's
personalized Service Document after successful
-authentication.\footnote{Alternative all Service Documents could be served under
+authentication.\footnote{Alternatively, all Service Documents could be served under
the entrance URI with different HTTP Content-Location
headers\cite[sec. 14.14]{RFC2616}. In that case the personalized Service
Document must however also be available at the indicated location.}
% http://www.berenddeboer.net/rest/authentication.html
-\subsection{Atom Publishing Protocol}
+\subsection{CalAtom and CardAtom}
\label{sec:atom-publ-prot}
% Rob Yates \url{mailto:robert_yates@us.ibm.com}
@@ -1117,36 +1148,42 @@ \subsection{Synchronizing Collections}
to synchronize a full collection is under one minute in most cases. This should
be acceptable for an initial synchronization that is only done once on rare
occasions when a desktop machine or mobile device is first used. If subsequent
-synchronizations only transfer a few resources, that have changed since the last
-synchronization then such updates can be made in the order of a second.
+synchronizations only transfer a few resources that have changed since the last
+synchronization, then such updates are expected to complete fast enough to not
+be a usability concern.
 All client scenarios, except for a Web Browser client that is used only once, can
-profit from the above scenario. In such a case other interaction patterns need
-to be used (\autoref{sec:spec-reports-search}).
+profit from the above scenario. For such a browser, sections
+\ref{sec:spec-reports-search} and \ref{sec:vcards-soci-netw} propose alternative
+interactions.
The Atom Publishing Protocol identifies collections of resources as Atom
Feeds. Feeds can also be used to synchronize collections. The necessary
ingredients are the link relation ``next''\cite{RFC5005}, the concept of a
``deleted entry''\cite{draft-snell-atompub-tombstones-14} and the prerequisite
-that the feed entries must ``be ordered by their "app:edited" property, with the
+that the feed entries must ``be ordered by their 'app:edited' property, with the
most recently edited Entries coming first in the document
order''\cite[sec. 10]{RFC5023}.
The API server design has the notion of a logical feed that can be split up in
multiple real Atom feeds linked with the relation ``next''. Updated or new
entries are always inserted as first element of the first feed since their
``app:edited'' property is the most recent. Inserting a new entry at the top of
-a feed can lead to entries at the end of the feed being pushed to the subsequent
-feed. This push needs to be atomic such that a client loading subsequent feeds
-may see an entry twice, at the end of a previous feed and the top of the next
-feed, but will never miss an entry in this scenario.
+a feed can lead to entries at the end of that feed being pushed to the
+subsequent feed. This push needs to be atomic such that a client loading
+subsequent feeds may see an entry twice, at the end of a previous feed and the
+top of the next feed, but will never miss an entry in this scenario.
In the case of an initial synchronization, the client loads the initial feed and
-all subsequent feeds linked with the ``next'' relation and adds all Resources
-associated with the feeds entries to its local storage. Resources can either be
-included completely in the content tag of an entry or be linked to by the
-entry. The client memorizes the ``app:edited'' value of the first entry of the
-first feed for subsequent synchronizations.
+all subsequent feeds linked with the ``next'' relation. It also loads all
+resource representations referenced by the feed entries with ``edit-media''
+typed links and saves those to its local storage. An entry can include multiple
+``edit-media'' links pointing to representations with different Mediatypes. In
+that case it is up to the client to select a preferred variant.
+
+At the end of this process, the client memorizes the ``app:edited'' value of the
+first entry of the first feed. This timestamp can be used by the client to stop
+subsequent synchronizations at the first entry with an older timestamp.
 It is possible that the collection has been modified during the
synchronization. Therefore the client should directly conclude with an update
@@ -1156,37 +1193,37 @@ \subsection{Synchronizing Collections}
the client must follow several ``next'' links or even load all feeds in the
extreme case.
-If the client followed a ``next'' link during a synchronization then it will
+If the client followed a ``next'' link during a synchronization then it must
make sure at the end of the synchronization that the first feed has not changed
-meanwhile most probably with a conditional GET request. After this last request
-indicates no further changes the client knows that its local collection is in
-the state of the servers location at the time of the last GET request.
+meanwhile, most probably with a conditional GET request. After this last request
+indicates no further changes, the client knows that its local collection is in
+the state of the server's collection at the time of the last GET request.
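
The client side of this synchronization process might look roughly like the sketch below. It is a simplified model, not the implementation of this work: feeds are plain dictionaries instead of Atom documents, `fetch` is a hypothetical function standing in for an HTTP GET, tombstones are modelled as entries with a `deleted` flag, and the final conditional GET is approximated by comparing the timestamp of the first entry of the first feed. Timestamps are assumed to be UTC strings in one fixed format so that string comparison matches chronological order.

```python
def sync_pass(fetch, first_uri, local, last_seen):
    """Walk the logical feed once; return (newest timestamp, followed_next)."""
    uri, newest, followed_next = first_uri, last_seen, False
    while uri is not None:
        feed = fetch(uri)
        for entry in feed["entries"]:
            # Entries are ordered by app:edited, newest first, so the pass
            # can stop at the first entry older than the memorized timestamp.
            if last_seen is not None and entry["edited"] < last_seen:
                return newest, followed_next
            if newest is None or entry["edited"] > newest:
                newest = entry["edited"]
            if entry.get("deleted"):
                local.pop(entry["id"], None)          # tombstone
            else:
                local[entry["id"]] = entry["edited"]  # new or updated
        uri = feed.get("next")
        if uri is not None:
            followed_next = True
    return newest, followed_next


def synchronize(fetch, first_uri, local, last_seen=None):
    """Synchronize `local` and return the timestamp to memorize."""
    while True:
        last_seen, followed_next = sync_pass(fetch, first_uri, local, last_seen)
        if not followed_next:
            return last_seen
        # Stand-in for the concluding conditional GET on the first feed.
        head = fetch(first_uri)["entries"]
        if not head or head[0]["edited"] == last_seen:
            return last_seen


# Hypothetical server state: one logical feed split into two pages.
FEEDS = {
    "/contacts": {"entries": [
        {"id": "c3", "edited": "2012-03-03T09:00:00Z"},
        {"id": "c2", "edited": "2012-03-02T09:00:00Z"}],
        "next": "/contacts?page=2"},
    "/contacts?page=2": {"entries": [
        {"id": "c1", "edited": "2012-03-01T09:00:00Z"}],
        "next": None},
}

local = {}
stamp = synchronize(FEEDS.__getitem__, "/contacts", local)
initial_ids = sorted(local)

# Later: c1 is deleted and c4 is created; older entries move down one slot.
FEEDS["/contacts"]["entries"] = [
    {"id": "c1", "edited": "2012-03-05T09:00:00Z", "deleted": True},
    {"id": "c4", "edited": "2012-03-04T09:00:00Z"},
    {"id": "c3", "edited": "2012-03-03T09:00:00Z"}]
FEEDS["/contacts?page=2"]["entries"] = [
    {"id": "c2", "edited": "2012-03-02T09:00:00Z"}]

stamp = synchronize(FEEDS.__getitem__, "/contacts", local, stamp)
print(sorted(local), stamp)
```

Entries carrying exactly the memorized timestamp are processed again in this sketch. That is harmless because applying an entry twice leaves the local store unchanged.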
-\begin{figure}[tb]
+% \begin{figure}[tb]
- \begin{tikzpicture}
- [doc/.style={rectangle,draw=blue!30,fill=blue!20, node distance=6em}]
- \node[doc] (svc) [] {ServiceDoc};
- \node[doc] (collect) [below=of svc,text width=5em] {Contacts Collection}
- edge [<-] node {} (svc)
- edge [->, loop left] node {next} (collect);
+% \begin{tikzpicture}
+% [doc/.style={rectangle,draw=blue!30,fill=blue!20, node distance=6em}]
+% \node[doc] (svc) [] {ServiceDoc};
+% \node[doc] (collect) [below=of svc,text width=5em] {Contacts Collection}
+% edge [<-] node {} (svc)
+% edge [->, loop left] node {next} (collect);
- \node[doc] (calcollect) [below right=of svc,text width=5em] {Calendar Collection}
- edge [<-] node {} (svc);
+% \node[doc] (calcollect) [below right=of svc,text width=5em] {Calendar Collection}
+% edge [<-] node {} (svc);
- \node[doc] (entry) [below=of collect] {Entry (xCard)}
- edge [<-] node {1..n} (collect)
- edge [->] node [text width=5em] {personal calendar} (calcollect)
- edge [->, loop left] node {related xCards} (entry);
+% \node[doc] (entry) [below=of collect] {Entry (xCard)}
+% edge [<-] node {1..n} (collect)
+% edge [->] node [text width=5em] {personal calendar} (calcollect)
+% edge [->, loop left] node {related xCards} (entry);
- \node[doc] (event) [below=of calcollect] {Entry (xCal event)}
- edge [<-] node {1..n} (calcollect)
- edge [->] node [below right] {participants} (entry);
+% \node[doc] (event) [below=of calcollect] {Entry (xCal event)}
+% edge [<-] node {1..n} (calcollect)
+% edge [->] node [below right] {participants} (entry);
- \end{tikzpicture}
- \caption{Discovery paths from the Service Document to individual Groupware Resources}
-\end{figure}
+% \end{tikzpicture}
+% \caption{Discovery paths from the Service Document to individual Groupware Resources}
+% \end{figure}
\subsection{Efficient Synchronization with HTTP Delta encoding}
\label{sec:effic-synchr-with}
@@ -1203,15 +1240,15 @@ \subsection{Efficient Synchronization with HTTP Delta encoding}
implemented\footnote{\citeurl{http://www.wyman.us/main/2004/09/implementations.html}{2012-1-6}},
even in the popular Microsoft Internet
Explorer\footnote{\citeurl{http://blogs.msdn.com/b/rssteam/archive/2006/04/08/571509.aspx}{2012-3-9}}
-and the author claims substantial bandwidths saving opportunities
+and the author claims substantial bandwidth saving opportunities
\footnote{\citeurl{http://wyman.us/main/2004/10/massive_bandwid.html}{2012-3-9}}.
The idea of delta encoding is that a server can respond to conditional GET
requests with only a small, special patch. The client applies the patch to its
cached representation of the requested resource which results in the new version
-of the resource. All currently IANA registered IMs are byte
-oriented\footnote{\citeurl{http://www.iana.org/assignments/inst-man-values/inst-man-values.xml}{2012-3-9}}. These
-methods however don't add substantial benefit for the case of synchronization
+of the resource. However, all currently IANA-registered IMs are byte
+oriented\footnote{\citeurl{http://www.iana.org/assignments/inst-man-values/inst-man-values.xml}{2012-3-9}}
+and thus do not add benefit for the case of synchronization with
feeds.\footnote{Byte oriented IMs might however be very beneficial to serve
updates of xCard/xCal resources if only one or a few fields changed.}
@@ -1237,7 +1274,7 @@ \subsection{Efficient Synchronization with HTTP Delta encoding}
response uses HTTP code ``226 IM Used''\cite{RFC3229} to mark the response as a
special one that is not the regular, cacheable representation.
-It is advisable to also include a ``next'' link to the subsequent feed to keep
+It may be advisable to also include a ``next'' link to the subsequent feed to keep
compatibility with the synchronization process from \autoref{sec:synchr-coll}
and prevent the client from accidentally considering the returned feed to
contain the full collection. The ``next'' link however would probably cause the
@@ -1246,50 +1283,34 @@ \subsection{Efficient Synchronization with HTTP Delta encoding}
from its database to satisfy the client's terminating condition. Or the server
could include an artificial, minimal
deleted-entry\cite{draft-snell-atompub-tombstones-14} tag with a non-existent
-ref value and a when value just older then the etag sent by the client:
+ref value and a ``when'' value just older than the etag sent by the client:
+
\begin{lstlisting}
<at:deleted-entry
xmlns:at="http://purl.org/atompub/tombstones/1.0"
ref="tag:example.org,2005:NONEXISTENT"
when="2005-11-29T12:11:12Z"/>
\end{lstlisting}
-The above precautions not to break the client's synchronization logic is
-necessary to permit the server to also respond to RFC3229+Feed requests with
-paginated feeds in cases where more entries have changed then the server is
-comfortable to include in a single response.
+
+If more entries have changed than the server is comfortable to include in one
+response, then the server is free to respond with a regular feed and the status
+code 200.
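
The server's choice between the three possible answers can be sketched as below. The sketch makes several assumptions that are not prescribed by this text: the etag is taken to be the ``app:edited'' timestamp of the newest entry (unquoted, for brevity), the function name `respond` and the limits `max_delta` and `page_size` are invented, and the response is reduced to a status code, a header dictionary and a list of entries. The artificial tombstone and the ``next'' link discussed above are left out.

```python
def respond(entries, if_none_match, a_im, max_delta=50, page_size=20):
    """Choose the response for a GET on the first feed of a collection.

    entries       -- all entries of the collection, newest first
    if_none_match -- etag sent by the client, or None
    a_im          -- value of the A-IM request header, or None
    """
    current_etag = entries[0]["edited"] if entries else ""
    if if_none_match == current_etag:
        return 304, {}, []                        # nothing changed
    if a_im == "feed" and if_none_match:
        changed = [e for e in entries if e["edited"] > if_none_match]
        if len(changed) <= max_delta:
            # Delta response: only the entries the client has not seen yet.
            return 226, {"IM": "feed", "ETag": current_etag}, changed
    # Regular, cacheable first page of the feed.
    return 200, {"ETag": current_etag}, entries[:page_size]


ENTRIES = [
    {"id": "c3", "edited": "2012-03-03T09:00:00Z"},
    {"id": "c2", "edited": "2012-03-02T09:00:00Z"},
    {"id": "c1", "edited": "2012-03-01T09:00:00Z"},
]

delta = respond(ENTRIES, "2012-03-02T09:00:00Z", "feed")
unchanged = respond(ENTRIES, "2012-03-03T09:00:00Z", "feed")
too_many = respond(ENTRIES, "2012-03-01T09:00:00Z", "feed", max_delta=1)
plain = respond(ENTRIES, None, None)
print(delta[0], unchanged[0], too_many[0], plain[0])
```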
\subsection{Media Entries and the content tag}
\label{sec:inline-feeds-or}
-The Atom format provides the opportunity to include a full representation of a
-resource in the content tag of an entry\cite[sec. 4.1.3]{RFC4287}. It is thus
-possible to embed complete xCard or xCal resources in the Atom feed
-and so to relieve the client from issuing many GET requests for each individual
-resource.
-
-The benefit of saved GET requests must be balanced with the possible
-disadvantage of serving the client resource representations already seen. A
-client that does regular updates may probably be interested only in the first
-one or two entries of a feed while the server might have made the effort to
-produce tens of entries.
-
-On the other hand the Atom Format mandates that an entry without embedded
-content must provide a summary element. It may not make much of a difference in
-bandwidth and processing whether a summary is produced or the full content is
-provided.
+The Atom Feed format provides the opportunity to include a full representation
+of a resource in the content tag of an entry\cite[sec. 4.1.3]{RFC4287}. The Atom
+Publishing Protocol, however, mandates that a ``Media Link Entry MUST have an
+atom:content element with a 'src' attribute''\cite[sec. 9.6]{RFC5023}. The
+latter requirement in turn triggers two requirements of the Atom Feed format,
+namely that the entry should have a summary tag and that the content tag must be
+empty\cite[sec. 4.1.1.1,4.1.3.2]{RFC4287}.
-Different optimization strategies are possible here, e.g.
-
-\begin{itemize}
-\item The first feed in a sequence of paged feeds could contain only very few
- entries to optimize for regular updates and have more entries in all following
- feeds.
-\item The server could remember the entries already consumed by an authenticated
- client and serve only new entries in the first feed.
-\end{itemize}
-
-In any case it is mandatory that a client can handle embedded content as well as
-linked content.
+It is unfortunate that the Atom Publishing Protocol does not allow the direct
+inclusion of the managed resource in the content tag. This requires the client
+to issue additional GET requests for each resource instead of extracting it
+directly from the feed.
\subsection{Modifying Resources and Offline editing}
@@ -1328,13 +1349,13 @@ \subsection{Special Reports, Queries, Search}
\label{sec:spec-reports-search}
 In a few cases it may not be feasible for a client to synchronize a full
-collection, e.g. due to low bandwidth. This section explores restful ways to let
-the client request only a subset (selection) of a collection. More specifically
-the client should be informed about possible query facilities without relying on
-out-of-band information.
+collection, e.g. due to low bandwidth or limited memory. This section explores
+restful ways to let the client request only a subset (selection) of a
+collection. More specifically the client should be informed about possible query
+facilities without relying on out-of-band information.
A promising approach is to use the de-facto standard
-OpenSearch\cite{Clinton}. According to its homepage it is implemented by most
+OpenSearch\cite{Clinton}. According to its homepage, it is implemented by most
major browsers, search engines and many other sites. OpenSearch is also
recommended for the link type ``search'' in the HTML5
standard\cite[sec. 4.12.4.12]{Hickson2011a}. The default format of an OpenSearch
@@ -1367,11 +1388,10 @@ \subsection{Special Reports, Queries, Search}
provides the possibility to sort result sets which might be interesting to
present an address book sorted by names.
-Search result Atom feeds can make use of annotated HTML
-(\autoref{sec:microdata}) in the summaries of entries and should not embed full
-resources in the content tag. Thus the client can still provide a structured
-view of the data, like calendar views or a tabular contacts list without the
-need to transfer full representations.
+Search result Atom feeds can make use of semantically annotated HTML (Microdata,
+\autoref{sec:microdata}) in the summaries of entries. Thus the client can still
+provide a structured view of the data, like calendar views or a tabular contacts
+list without the need to transfer full representations.
The OpenSearch specification suggests that links to the OpenSearch Description
Document for an Atom feed might be added inside a feed tag. There is however no
@@ -1392,6 +1412,10 @@ \subsection{Special Reports, Queries, Search}
\section{Other Design Considerations}
\label{sec:design}
+This section discusses remaining design considerations that are not directly
+connected to the interactions of the Atom Publishing Protocol and OpenSearch but
+rather to the Mediatypes and formats used to represent Groupware specific data.
+
% > - It is shown that AtomPub, with a few mostly already standardized
% > extensions, is a sensible, resource-oriented alternative to C.*DAV and
% > OpenSocial.
@@ -1417,59 +1441,57 @@ \section{Other Design Considerations}
% > - The problem of updates with non-isomorphic media types, and ways
% > to deal with it
-
\subsection{Media Type conversion and non-isomorphism}
-Two media types are non isomorphic, if at least one of them can express
-information which the other could not express. For example the vcard media type
-defines many property parameters that have no equivalent in portable contacts,
-like language, altid or sort-as. So a conversion of a vcard into portable
-contacts will most likely lose this data.
+Two Mediatypes are non-isomorphic if at least one of them can express
+information which the other could not express. For example the vCard Mediatype
+defines many property parameters that have no equivalent in PortableContacts,
+like language, altid or sort-as. So a conversion of a vCard into
+PortableContacts will most likely lose this data.
This data loss could first be a problem when a client receives a
-representation. However since the client negotiated the media type with the
+representation. However since the client negotiated the Mediatype with the
server it is most likely that it is satisfied with only the data representable
-in that data type.
+in that type.
-Now if the client uses such a media type in a put request to update a resource,
-it may not be clear how to deal with the information the client could not
-express in the submitted resource. Should it be deleted or merged with the new
-representation?
+Now if the client uses such a Mediatype in a PUT request to update a resource,
+it may not be clear how to deal with the information that the client could not
+express in the submitted resource. Should it be deleted or should data from the
+server be merged with the new representation?
Different strategies are possible in such scenarios and must be selected for the
individual use case:
\begin{enumerate}
-\item The server accepts updates only for one media type while serving other
- media types in a ``read-only'' mode.
+\item The server accepts updates only for one Mediatype while serving other
+ Mediatypes in a ``read-only'' mode.
\item The server accepts PATCH requests\cite{RFC5789} as a compromise while
- still not accepting certain media types for updates
+ still not accepting certain Mediatypes for updates
(\autoref{sec:patching-resources}).
-\item The implementer decides to either merge or deletes information not
- representable in a received media type and lives with the consequences. In the
+\item The implementer decides to either merge or delete information not
+ representable in a received Mediatype and lives with the consequences. In the
case of contact information this can be a valid strategy since the most
- essential information is representable in all media types. The server
- practically only works with data in the intersection of all supported media
- types.
-\item Available facilities to extend media types are used to establish
- isomorphism. Vcard for example allows the addition of arbitrary properties
+ essential information is representable in all Mediatypes. The server
+ practically only works with data in the intersection of all supported
+ Mediatypes.
+\item Available facilities to extend Mediatypes are used to establish
+ isomorphism. VCard for example allows the addition of arbitrary properties
prefixed with ``x-''.
\item The server implements version control so that the situation can be
resolved manually later.
\end{enumerate}
-The creation of resources can be handled more liberate then updating, since no
+The creation of resources can be handled more liberally than updating, since no
state on the server exists that could be lost.
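
The difference between deleting and merging in the third strategy can be illustrated with a small sketch. It is deliberately simplified and rests on assumptions: resources are flat dictionaries, the set `PORTABLE_FIELDS` is an invented stand-in for what the less expressive Mediatype can represent, and vCard parameters such as sort-as are treated like ordinary fields.

```python
# Hypothetical set of fields that the less expressive Mediatype can represent.
PORTABLE_FIELDS = {"fn", "email", "tel"}


def to_reduced(full):
    """Lossy conversion: keep only what the reduced Mediatype can express."""
    return {k: v for k, v in full.items() if k in PORTABLE_FIELDS}


def put_replace(stored, submitted):
    """'Delete' strategy: the submitted representation replaces everything."""
    return dict(submitted)


def put_merge(stored, submitted):
    """'Merge' strategy: keep server data the client could not express."""
    merged = {k: v for k, v in stored.items() if k not in PORTABLE_FIELDS}
    merged.update(submitted)
    return merged


stored = {
    "fn": "Erika Mustermann",
    "email": "erika@example.org",
    "sort-as": "Mustermann,Erika",
    "language": "de",
}

# The client receives the reduced representation, edits it and PUTs it back.
submitted = to_reduced(stored)
submitted["email"] = "erika.mustermann@example.org"

replaced = put_replace(stored, submitted)
merged = put_merge(stored, submitted)
print(replaced)
print(merged)
```

In the merge variant, a field that the reduced Mediatype can express and that the client leaves out is still removed, so the client keeps full control over the part of the data it is able to see.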
-
\subsection{Microformats, Microdata, RDFa}
\label{sec:microdata}
%TODO Schreier: after reading the introductory paragraph the reader does not know what to expect in the rest of the section; accordingly the motivation for the subsections is unclear and it is hard to find the common thread
HTML documents are primarily meant to be rendered by browsers and interpreted by
humans. It is hard for a machine to interpret the meaning of text and data
included in an HTML document. To help with this, different techniques have evolved to
-add additional meta data to HTML that allows machines to identify structured
+add meta data to HTML, thus allowing machines to identify structured
data in HTML without having an impact on the rendering. The most popular ones,
Microformats, Microdata and RDFa, are presented and discussed in
\cite{Tennison2012}.
@@ -1494,12 +1516,12 @@ \subsubsection{Use Cases}
``Operator''.\footnote{\citeurl{https://addons.mozilla.org/en-US/firefox/addon/operator/}{2012-2-20}}
It allows annotated entities to be extracted from web pages. A user could thus import
contact or event data from arbitrary web pages in his personal information
-manager with one click\footnote{Apparently, Android phones can import annotated
- addresses from web pages directly to their address books.}. Semantic
-annotations can also be used to make web content accessible to disabled
+manager with one click\footnote{Apparently, Android phones can directly import
+ annotated addresses from web pages too.}. Semantic annotations can also be
+used to make web content accessible to disabled
people\cite{Yesilada:2007:EDS:1279700.1279704}.
-A third use case is currently under development as part of the European Union
+Another use case is currently under development as part of the European Union
Research Project ``Interactive Knowledge Stack'' (IKS) that builds a semantic
content management stack. The sub-project ``Vienna IKS Editables''
(VIE)\footnote{\citeurl{http://www.iks-project.eu/projects/vienna-iks-editables}{2012-2-20}}
@@ -1508,8 +1530,8 @@ \subsubsection{Use Cases}
building editing interfaces for those. A modified entity can then be sent to the
server via AJAX in a format called ``json-ld'' that serializes semantic data to
JSON.\footnote{\citeurl{http://json-ld.org}{2012-2-20} The IANA registration of
- the mime type \lstinline:application/ld+json: is currently discussed}
-For a Groupware, this editor could be used to automatically create HTML forms instead
+ the MIME type \lstinline:application/ld+json: is currently under discussion.} For a
+Groupware, this editor could be used to automatically create HTML forms instead
of creating them on the server side.
\begin{anylisting}[label=fig:microdata-atom-summary,
@@ -1552,8 +1574,8 @@ \subsubsection{Format selection}
decide which to implement. It is possible to implement multiple formats in
parallel inside the same HTML document, but this means more markup and a more
complex publishing task\cite{Tennison2012}. This choice is not a choice of
-different Media Types, but a choice in the scope of the Media Type text/html (or
-application/xhtml+xml).
+different Mediatypes, but a choice inside the scope of the containing Mediatype
+text/html (or application/xhtml+xml).
A first consideration has to be the ability of expected consumers to handle the
format, a second consideration the available tooling to produce a particular
@@ -1562,8 +1584,9 @@ \subsubsection{Format selection}
RDFa with XHTML or HTML5 and Microdata introduces special attributes that work
only with HTML5\cite{Tennison2012}.
-Microdata is part of HTML5 and a standard effort of the W3C\cite{Hickson2011}.
-It is also backed up by the schema.org effort of Google and
+Microdata is part of HTML5 and a standardization effort of the
+W3C\cite{Hickson2011}. It is also backed by the schema.org effort of Google
+and
Microsoft.\footnote{\citeurl{http://schema.org/docs/gs.html\#microdata_why}{2012-2-17}}
The schema.org vocabulary in turn has been mapped to the semantic world by
researchers working on linked
@@ -1717,13 +1740,20 @@ \subsection{VCard's (social) network properties}
\section{Implementation}
\label{sec:implementation}
+Based on the requirements and design considerations of the previous sections,
+this section presents a Java based implementation of a restful Groupware API
+supporting different Mediatypes. The solution is based on the
+JAX-RS\cite{JAX-RS1.1} implementation Jersey, relies on dependency injection
+provided by Guice and introduces a new concept tentatively called ``Resource
+Facades''. The latter abstracts from different possible representations of a
+resource and thus applies a restful principle from the network layer in the
+implementation layer.
\subsection{Control Flow Overview}
\label{sec:overview}
\autoref{fig:executionflowoverview} outlines the most important classes for the
-control flow. The implementation relies on the JAX-RS\cite{JAX-RS1.1}
-implementation Jersey to route calls to the four different ``Jersey Resources''
+control flow. Jersey routes calls to the four different ``Jersey Resources''
classes, representing Atom Service Documents, Atom Collections, Atom Entries and
Media Resources of different Mediatypes.
@@ -1746,7 +1776,7 @@ \subsection{Control Flow Overview}
\lstinline:Resource: class. The Precond(itions) parameter is a wrapper class
around the corresponding HTTP headers\footnote{If-Match, If-None-Match,
If-Modified-Since, If-Unmodified-Since}. It provides
-\lstinline:shouldPerform(etag, updated):bool: methods that the storage must call
+\lstinline;shouldPerform(etag, updated):bool; methods that the storage must call
with the resource's etag, last update timestamp or both. The CollectionStorage
indicates with each method's return value whether it actually performed any
action. The GetResult and ResultList classes are simple tuple classes wrapping
@@ -1789,7 +1819,7 @@ \subsubsection{Resource properties}
\item generic meta properties: title, summary, author
\item a Mediatype independent interface corresponding to the concept represented
by the resource e.g., a person, location, event, product, \ldots
-\item a mediatype specific serialization (representation) of the resource
+\item a Mediatype specific serialization (representation) of the resource
\end{itemize}
The id and update time are required for the synchronization protocol outlined in
@@ -1801,9 +1831,9 @@ \subsubsection{Resource properties}
efficient processing of conditional HTTP requests.
The generic meta properties of the resource can be used to fill the
-corresponding tags of an atom entry. They can either be extracted from a
+corresponding tags of an Atom entry. They can either be extracted from a
meaningful property of the resource or be provided to the resource. E.g., the
-author property could be extracted from the meta data of an image file (EXIF),
+author property could be extracted from the metadata of an image file (EXIF) or
set to the organizer of an ical event. The title of a contact resource in the
implementation is set to its full name and email. The summary also contains the
address and phone number.
@@ -1838,34 +1868,36 @@ \subsubsection{Resource properties}
produce the requested Mediatype (\autoref{sec:resourcefacades}). Accordingly the
\lstinline:ResourceWriterJerseyProvider: of \autoref{fig:executionflowoverview}
is trivially simple: It just calls the asMediaType method of the provided
-resource. The Resource is responsible for providing a mediatype specific
+resource. The Resource is responsible for providing a Mediatype specific
representation of itself.
The Resource class outlined in this section does not correspond to the equally
-named resource class concept in JAX-RS\cite{JAX-RS1.1}. The JAX-RS resource
-classes in this work are found in the ``Jersey Resources'' package of
-\autoref{fig:executionflowoverview}. But they do not really represent resources
-but rather the binding of resources to URIs and their processing logic.
+named resource class concept in JAX-RS\cite{JAX-RS1.1}. The latter kind of
+resource classes are found in the ``Jersey Resources'' package of
+\autoref{fig:executionflowoverview}. However, such JAX-RS resource classes do not
+really represent REST resources but rather the binding of resources to URIs and
+their processing logic.
\subsubsection{Resource life cycle}
\label{sec:resource-life-cycle}
Resource classes in this work have a four staged life cycle. The first stage is
represented by the \lstinline:UnparsedResource: class, instantiated by the
-ResourceReaderJerseyProvider class for post or put requests. In this stage the
-Resource has already been assigned an appropriate \lstinline:Reader:
+ResourceReaderJerseyProvider class for POST or PUT requests. In this stage, the
+resource has already been assigned an appropriate \lstinline:Reader:
implementation according to the Content-Type request header but it has not yet
received an Id and update timestamp.
The \lstinline:Resource: class represents the second stage, a parsed request
-body with an Id and update timestamp. Only in this stage the getFacade() method
-can be used.
+body with an Id and update timestamp. Only in this stage can the
+\lstinline:getFacade(): method be used.
Once a resource has been deleted, it does not vanish entirely, but enters stage
three. Only the associated data originally submitted in the request body is
discarded but the Id and update timestamp (now referring to the time of
-deletion) is preserved. Such a ``deleted resource'' is used to generate the
-``deleted-entry'' entries in the AtomPub feed (\autoref{sec:synchr-coll}).
+deletion) are preserved. Such a ``deleted resource'' is used to generate a
+corresponding ``deleted-entry'' tombstone in the AtomPub feed
+(\autoref{sec:synchr-coll}).
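
The first three stages described above can be sketched in plain Java as follows. This is only an illustration under assumptions, not the classes of the implementation: the names \lstinline:UnparsedResourceSketch:, \lstinline:ResourceSketch:, \lstinline:parsed: and \lstinline:deleted: are hypothetical, and the body is reduced to a string.

```java
/** Stage one (hypothetical shape): raw request body and content type, no Id or timestamp yet. */
final class UnparsedResourceSketch {
    final String contentType;
    final String body;

    UnparsedResourceSketch(String contentType, String body) {
        this.contentType = contentType;
        this.body = body;
    }

    /** Moves to stage two once an Id and an update timestamp have been assigned. */
    ResourceSketch parsed(String id, long updated) {
        return new ResourceSketch(id, updated, body);
    }
}

/** Stages two and three: a parsed resource, or a tombstone after deletion. */
final class ResourceSketch {
    final String id;
    final long updated;
    final String body; // null once the resource has been deleted

    ResourceSketch(String id, long updated, String body) {
        this.id = id;
        this.updated = updated;
        this.body = body;
    }

    boolean isDeleted() {
        return body == null;
    }

    /** Stage three: the body is discarded, the Id is kept, the timestamp becomes the deletion time. */
    ResourceSketch deleted(long deletionTime) {
        return new ResourceSketch(id, deletionTime, null);
    }
}
```

Modelling the tombstone as a resource without a body keeps the Id and timestamp available for the synchronization feed while the submitted data is gone.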
Deleted resources do not need to be preserved forever. A deleted resource with
the oldest timestamp of all resources managed by a particular CollectionStorage
@@ -1878,13 +1910,14 @@ \subsubsection{Resource life cycle}
\subsubsection{Resource Facades}
\label{sec:resourcefacades}
-\autoref{sec:resource-properties} introduced and motivated the concept of
+Subsection \ref{sec:resource-properties} introduced and motivated the concept of
Resource Facades. This section explains the inner workings of the classes
providing this mechanism as drafted in \autoref{fig:resourcefacades}.
The Resource class does not hold any attribute that directly corresponds to its
-``main data''. Instead it holds a FacadeProvider instance to request a specific
-data facade to access data. Facades are primarily referenced by Java interfaces.
+``main'' or ``body'' data. Instead it holds a FacadeProvider instance to request
+a specific data facade to access data. Facades are primarily referenced by Java
+interfaces.
A FacadeProvider in turn is instantiated with a FacadeRegistry of available
FacadeFactories and one or more ``seed'' facades, making up the initial content
@@ -1909,9 +1942,9 @@ \subsubsection{Resource Facades}
Representations.
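
To make the interplay of registry, seed facades and provider more concrete, here is a minimal sketch in plain Java. It assumes that a factory may derive its facade from other facades of the same provider; all names ending in \lstinline:Sketch:, the two facade interfaces and \lstinline:ContactFacades: are hypothetical and do not mirror the actual classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical facade interfaces: Mediatype independent views of a contact. */
interface HasFullName { String fullName(); }
interface VCardText { String asVCard(); }

/** Hypothetical registry: maps a facade interface to a factory producing it. */
final class FacadeRegistrySketch {
    private final Map<Class<?>, Function<FacadeProviderSketch, ?>> factories = new HashMap<>();

    <T> void register(Class<T> type, Function<FacadeProviderSketch, T> factory) {
        factories.put(type, factory);
    }

    Function<FacadeProviderSketch, ?> factoryFor(Class<?> type) {
        return factories.get(type);
    }
}

/** Hypothetical provider: starts with seed facades and derives others on demand. */
final class FacadeProviderSketch {
    private final FacadeRegistrySketch registry;
    private final Map<Class<?>, Object> facades = new HashMap<>();

    FacadeProviderSketch(FacadeRegistrySketch registry) {
        this.registry = registry;
    }

    <T> void addSeed(Class<T> type, T seed) {
        facades.put(type, seed);
    }

    <T> T getFacade(Class<T> type) {
        Object facade = facades.get(type);
        if (facade == null) {
            Function<FacadeProviderSketch, ?> factory = registry.factoryFor(type);
            if (factory == null) {
                throw new IllegalArgumentException("no facade available for " + type.getName());
            }
            facade = factory.apply(this); // the factory may request further facades
            facades.put(type, facade);    // cache the derived facade
        }
        return type.cast(facade);
    }
}

/** Example wiring (assumption): a vCard text facade derived from the name facade. */
final class ContactFacades {
    static FacadeRegistrySketch defaultRegistry() {
        FacadeRegistrySketch registry = new FacadeRegistrySketch();
        registry.register(VCardText.class, provider -> {
            String name = provider.getFacade(HasFullName.class).fullName();
            return () -> "BEGIN:VCARD\nFN:" + name + "\nEND:VCARD";
        });
        return registry;
    }
}
```

A caller only names the interface it needs; whether the facade was a seed or had to be derived stays hidden behind \lstinline:getFacade:.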
Requests for facades can be further parameterized with a Predicate. The
-Predicate has one apply(FacadeFactory):bool method which is called only for
-FacadeFactories producing the desired interface. This mechanism is used in the
-implementation to check an \lstinline:isWriteable(MediaType): method on
+Predicate has one \lstinline;apply(FacadeFactory):bool; method which is called
+only for FacadeFactories producing the desired interface. This mechanism is used
+in the implementation to check an \lstinline:isWriteable(MediaType): method on
factories producing Writer instances and thus to select the correct Writer
according to the Mediatype accepted by the client. Future work could
considerably enhance this rather brittle mechanism e.g., to check for
@@ -2035,9 +2068,6 @@ \subsection{CollectionStorage}
the message body and headers but not attachments.}. Those must therefore be
implemented by a separate component.
-To make implementation of the interface easy and to correspond to the REST
-characteristic that every request is atomic,
-
The CollectionStorage does not expose any support for transactions. This should
make the interface easier to implement and also corresponds to the REST
characteristics of statelessness and transfer of full representations. As a
@@ -2049,10 +2079,10 @@ \subsection{CollectionStorage}
\lstinline:doUpdate: call would be overwritten.
\begin{javalisting}[label=fig:evaluatepreconditions-concurrency,
+ float=htb,
caption={Potential lost-update problem with JAX-RS}]
ResponseBuilder rb = request.evaluatePreconditions(etag);
-if (rb == null)
- return doUpdate(foo);
+if (rb == null) return doUpdate(foo);
\end{javalisting}
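
By contrast, the following sketch shows how the lost-update window can be closed when the preconditions are handed to the storage and evaluated under the same lock as the write. It is a simplified illustration under assumptions: only If-Match is modelled, the etag is a version counter, and the class names \lstinline:PreconditionsSketch: and \lstinline:InMemoryStorageSketch: are hypothetical. Only the \lstinline:shouldPerform: idea and the boolean return value are taken from the text.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the wrapper around the conditional headers (If-Match only). */
final class PreconditionsSketch {
    private final String ifMatch; // null means: no If-Match header was sent

    PreconditionsSketch(String ifMatch) {
        this.ifMatch = ifMatch;
    }

    /** Returns true if the request may be performed on a resource with the given etag. */
    boolean shouldPerform(String currentEtag) {
        return ifMatch == null || ifMatch.equals(currentEtag);
    }
}

/** Hypothetical in-memory storage: check and write happen inside one synchronized method. */
final class InMemoryStorageSketch {
    private final Map<String, String> bodies = new HashMap<>();
    private final Map<String, Integer> versions = new HashMap<>();

    synchronized String create(String id, String body) {
        bodies.put(id, body);
        versions.put(id, 1);
        return etagOf(id);
    }

    synchronized String etagOf(String id) {
        return "v" + versions.get(id);
    }

    synchronized String get(String id) {
        return bodies.get(id);
    }

    /** Returns true only if the update was actually performed. */
    synchronized boolean update(String id, String body, PreconditionsSketch preconditions) {
        if (!bodies.containsKey(id) || !preconditions.shouldPerform(etagOf(id))) {
            return false; // nothing written; the caller can answer 412 or 404
        }
        bodies.put(id, body);
        versions.merge(id, 1, Integer::sum); // a new version yields a new etag
        return true;
    }
}
```

Because no other thread can modify the entry between the call to \lstinline:shouldPerform: and the write, a second update carrying a stale etag is rejected instead of overwriting the first.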
% @TODO a resource should not be built if the etag has not changed. How to make
@@ -2095,7 +2125,7 @@ \subsubsection{Preparsed Request Components with Dependency Injection}
injection to isolate request parsing. This can be seen for example in the
PaginationRange class which should just hold the values of the URI query
parameters limit and offset. The provider function in Listing
-\ref{fig:paginationrangeprovider} is invoked by guice when this class is
+\ref{fig:paginationrangeprovider} is invoked by Guice when this class is
required. It depends in turn on UriInfo, extracts the necessary information and
returns the simple value class PaginationRange.
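
Independent of Guice and JAX-RS, the parsing step itself can be sketched as a small function from query parameters to a value object. The field names, the default limit of 20 and the helper \lstinline:PaginationRangeParser: are assumptions made for this illustration; a plain map stands in for UriInfo.

```java
import java.util.Map;

/** Simple value class holding the two pagination parameters (hypothetical shape). */
final class PaginationRange {
    final int offset;
    final int limit;

    PaginationRange(int offset, int limit) {
        this.offset = offset;
        this.limit = limit;
    }
}

/** Hypothetical parser: isolates query parameter handling from the request method. */
final class PaginationRangeParser {
    static final int DEFAULT_LIMIT = 20; // assumption, not taken from the implementation

    static PaginationRange fromQuery(Map<String, String> query) {
        int offset = parse(query.get("offset"), 0);
        int limit = parse(query.get("limit"), DEFAULT_LIMIT);
        return new PaginationRange(offset, limit);
    }

    /** Parses a non-negative integer, falling back to a default if the parameter is absent. */
    private static int parse(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        int value;
        try {
            value = Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a number: " + raw, e);
        }
        if (value < 0) {
            throw new IllegalArgumentException("must not be negative: " + raw);
        }
        return value;
    }
}
```

A provider function would then only need to collect the query parameters and delegate to such a parser, so validation stays out of the request method.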
@@ -2115,7 +2145,7 @@ \subsubsection{Preparsed Request Components with Dependency Injection}
(Preconditions). The main advantages of this approach are supposed to be:
\begin{itemize}
-\item Classes parsing commonly used query parameters can be reused, even across
+\item Classes that parse commonly used query parameters can be reused, even across
unrelated applications.
\item The request method declaration gets much easier to read.
\item Sophisticated validation can be applied without obfuscating the request method.
@@ -2139,11 +2169,11 @@ \subsubsection{Driving Dependency Injection further}
The use of dependency injection can be extended to comprise several levels of
dependencies and thus to build processing pipelines. The information from the
above PaginationRange class is in the implementation just forwarded to the
-CollectionStorage's listUpdates method.
+CollectionStorage's listUpdates method to receive a ResultList instance.
Consequently the resource method could as well use dependency injection to
-directly request the corresponding ResultList
-instance. \autoref{fig:dependency-injection-pipeline} visualizes the resulting,
+directly request the required ResultList
+instance. Figure \ref{fig:dependency-injection-pipeline} visualizes the resulting,
hypothetical dependency graph of this approach.
The figure shows how the CollectionStorage relevant for the request is
@@ -2691,24 +2721,6 @@ \section{Conclusions}
\newpage
\bibliography{references}{}
\bibliographystyle{alphadin}
-\section*{Declaration of Authorship}
-% Do not print headers and footers
-%\thispagestyle{empty}
-% But still include it in the table of contents
-% \addcontentsline{toc}{chapter}{Declaration of Authorship}
-% Here the official text of the statutory declaration
-I hereby declare that I have written this thesis independently and without
-the use of any aids other than those stated; ideas taken directly or
-indirectly from external sources are marked as such. The thesis has not
-previously been submitted in the same or a similar form to any other
-examination board.
-% Some space for the signature
-\vspace{3cm}
-% The signature goes above this line
-\begin{tabbing}
-\hspace{6cm} \= \kill
-\textit{Kreuzlingen, 4 April 2012} \> \textit{Thomas Koch}
-\end{tabbing}
\end{document}
@@ -2719,4 +2731,4 @@ \section*{Selbstständigkeitserklärung}
% LocalWords: RESTful programmatically instantiation hypothetic cacheable
% LocalWords: Algermissen interoperability representable isomorphism doubtable
% LocalWords: Cacheability extensibility referenceable shareable injectable
-% LocalWords: dereferenceable mergeable unmergeable
+% LocalWords: dereferenceable mergeable unmergeable conformant
View
36 latex/titel.tex
@@ -0,0 +1,36 @@
+\begin{titlepage}
+
+\vspace*{0.2cm}
+\begin{center}
+\Huge{\textbf{A REST API for the Groupware Kolab with support for different Media Types}}
+\vspace{1cm}
+
+\large{Bachelor Thesis in Computer Science}
+
+\vspace{1cm}
+
+\normalsize{by}
+
+\Large{\textbf{Thomas Koch}}
+
+\normalsize{(matriculation number: 7250371)}
+
+\vspace{1.5cm}
+\large{submitted to the \\
+Faculty of Mathematics and Computer Science\\
+of the FernUniversität in Hagen}
+
+\vspace{3.5cm}
+
+\end{center}
+
+\begin{tabular}{ll}
+\textbf{First examiner:} & Prof. Dr. Bernd Krämer\\
+& Chair of Data Processing Technology\\
+& Faculty of Mathematics and Computer Science \\
+& \\
+\textbf{Start of thesis:} & 15.12.2011 \\
+\textbf{Submission of thesis:} & 5.4.2012 \\
+\end{tabular}
+
+\end{titlepage}
