Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Fetching contributors…

Cannot retrieve contributors at this time

696 lines (563 sloc) 27.229 kB

Hello, wiki

Wikis, and other form of user-editable pages, are components commonly encountered on the web. In itself, developing a simple wiki is error-prone but otherwise not very difficult. However, developing a rich wiki that both scales up and does not suffer from vulnerabilities caused by the content provided by users is quite harder. Once again, Opa makes it simple.

In this chapter, we will see how to program a complete, albeit simple, wiki application in Opa. Along the way, we will introduce the Opa database, the client-server security policy, the mechanism used to incorporate user-defined pages without breaking security, as well as more user interface manipulation.

Overview

Let us start with a picture of the wiki application we will develop in this chapter:

result

This web application stores pages and lets users edit them, using a HTML-like syntax that supports paragraphs, tables, etc.

If you are curious, this is the full source code of the application.

link:hello_wiki.opa[]

Run

In this listing, we define a database for storing the content of the pages in a safe format, we define the user interface and finally, the main application. In the rest of the chapter, we will walk you through all the concepts and constructions introduced.

As for the chat, we use Bootstrap CSS from Twitter with a single import line.

Setting up storage

A wiki is all about modifying, storing and displaying pages. This means that we need to set up some storage to receive these pages, as follows:

db /wiki: stringmap(Template.default_content)

This defines a database path, i.e. a place where we can store information, and specifies the type of information that will be stored there. Here, we use type stringmap(Template.default_content). This is a stringmap (i.e. an association from string keys to values) containing values of type Template.default_content.

Tip
About database paths

In Opa, the database is composed of paths. A path is a named slot where you can store exactly one value at a time. Of course, this value can be a list, or a map, for instance, both of which act as containers for several values.

As everything else in Opa, paths are typed, with the type of the value that can be stored. Paths can be defined for any Opa type, except types containing functions or real-time web constructions such as networks. To guard against subtle incompatibilities between successive versions of an application, the type must be given during the definition of the path.

In turn, Template.default_content is the default content format for templates, i.e. the default method of representing rich text that can be modified by the user.

Tip
About template

On the web, letting users write HTML code that can be seen at a later stage by other users is a very bad idea, as it opens the way for numerous forms of attacks, sometimes quite subtle and hard to detect. Opa’s templating mechanism is a manner of representing rich text that can be provided by a user, stored in the database and displayed to other users, without compromising security.

Although this topic is not covered by this chapter, the templating mechanism is designed for full extensibility - indeed, our website is assembled using this mechanism, including our own on-line editor.

It is generally a good idea to associate a default value to each path, as this makes manipulation of data easier:

db /wiki[_] = Template.text("This page is empty. Double-click to edit.")

The square brackets [_] are a convention to specify that we are talking about the contents of a map and, more precisely, providing a default value. Here, the default value is Template.text("This page is empty. Double-click to edit."), i.e. a simple text in our templating system. As expected, it has type Template.default_content.

With these two lines, the database is set. Any data written to the database will be kept persistent, versioned and snapshoted regularly. Should you stop and restart your application, the data will be checked and made available. If the data has been corrupted or is somehow incompatible with your application, you will be informed.

Loading, parsing and writing

Reading data from or writing data to the database is essentially transparent. For clarity and performance, we may start by defining two loading functions. One will load some content from the database and present it as source code that the user can edit, while a second will load the same content and present it as xhtml that can be displayed immediately:

load_source(topic)   = Template.to_source(Template.default, /wiki[topic])
load_rendered(topic) = Template.to_xhtml( Template.default, /wiki[topic])

In this extract, topic is the topic we wish to display or edit — i.e. the name of the page. Both functions use Template.default, the default template engine. This engine uses as source language a safe subset of HTML and produces content with type Template.default_content. The content associated with topic can be found in the database at path /wiki[topic]. Once we have this content, depending on our needs, we convert it to source code, as a string, or to a xhtml data structure used for display.

For parsing and saving data, the logic is slightly more complicated:

save_source(topic, source) =
(
   match Template.try_parse(Template.default, source) with
    | {success = x} ->
    (
           do /wiki[topic] <- x
           Template.to_xhtml(Template.default, x)
    )
    | {failure = _} ->
    (
           do /wiki[topic] <- Template.text(source)
           <>Syntax error<br/>: {source}</>
    )
)

This function takes two arguments: a topic, with the same meaning as above, and a source, i.e. a string meant to be understood by our templating engine. The call to function Template.try_parse submits this source to the default templating engine, which may either accept it (if the syntax was correct) or reject it (otherwise). For this wiki, we do not try to fix the syntax automatically or to fallback to an approximate syntax.

Function Template.try_parse can produce two kinds of results: success results and failure results. More precisely, by definition, the value returned by this function is always either a record \{ success = something \} or \{ failure = something\}. We call this a sum of records, and it has type \{ success = … \} /\{ failure = …\} — we will fill in the dots later.

Tip
About sum types

A value has a sum type t/u if, depending on the execution path, it can have either values of type t or values of type u . A good example of sum type is booleans, which are defined as {false}/{true} . Another good example of sum type is the type list of linked lists, whose definition can be summed up as {nil} / \{\hd: …; tl: list\}.

Note that sum types are not limited to two cases. Advanced applications commonly manipulate sum types with ten cases or more.

As you can see, we apply a match … with … operation to this result. This operation is known as pattern-matching, and it lets us determine safely whether the result was a success or a failure.

The operation used to branch according to the case of a sum type is called pattern-matching. A good example of pattern-matching is if … then … else … . The more general syntax for pattern matching is match … with | case_1 -> … | case_2 -> … | case_3 -> …

The operation is actually more powerful than just determining which case of a sum type is used. Indeed, if we use the vocabulary of languages such as Java or C#, pattern-matching combines features of if, switch, instanceof/is, multiple assignment and dereferenciation, but without the possible safety issues of instanceof/is and with fewer chances of misuse than switch.

This pattern-matching has two cases. Let us concentrate on the first one:

    | {success = x} ->
    (
           do /wiki[topic] <- x
           Template.to_xhtml(Template.default, x)
    )

The items between the pipe (|) and the arrow (->) are known as the pattern. The corresponding case is executed if the pattern matches the value. Here, we have a pattern that accepts all records containing exactly one field called success. When our pattern-matching encounters such a record, it calls x the contents of field and executes everything that appears after the arrow (the body).

The first instruction writes x to the database at path /wiki[topic]. The second instruction produces a xhtml-formatted version of x, which is the result of our function.

Let us now examine the second case of our pattern-matching:

    | {failure = _} ->
    (
           do /wiki[topic] <- Template.text(source)
           <>Syntax error<br/>: {source}</>
    )

By definition, cases are executed in order. Here, the pattern matches all records containing exactly one field called failure. When our pattern-matching encounters such a record, it ignores the contents of the field — recall that _ is pronounced "I don’t care" — and executes the body.

Here, we are in the error case, i.e. the user has entered syntactically incorrect text. We could decide to perform sophisticated error reporting, but that goes beyond the scope of this chapter. We will rather employ an alternative strategy: we store the text entered by the user, but as source code, so that it can be modified at a later stage. For this purpose, we again call function Template.text to adopt the source code as immediate text, we write this to the database at path /wiki[topic] and then we produce a message stating that there was a syntax error.

With this, we have covered everything related to database, parsing or rendering. We can now move to the user interface.

User interface ~~~~~~

As previously, we define a function to produce the user interface:

display(topic) =
(
   Resource.styled_page("About {topic}", ["/resources/css.css"],
     <div class="topbar"><div class="fill"><div class="container"><div id=#logo></div></div></div></div>
     <div class="content container">
       <div class="page-header"><h1>About {topic}</></>
       <div class="well" id=#show_content ondblclick={_ -> edit(topic)}>{load_rendered(topic)}</>
       <textarea clas="xxlarge" rows="30" id=#edit_content onblur={_ -> save(topic)}></>
     </div>
   )
)

This time, instead of producing a xhtml result, we have embed this result in a resource, i.e. a representation for anything that the server can send to the client, whether it is a page, an image, or anything else. In practice, most applications produce a set of resources, as this is more powerful and more flexible than plain xhtml . Of course, you will need to use the appropriate server constructor, which we will introduce later.

A number of functions can be used to construct a resource. Here, we use Resource.styled_page, a function which constructs a web page, from its title (first argument), a list of stylesheets (second argument) and xhtml content. At this stage, the xhtml content should not surprise you. We use <div> to display the contents of the page, or <textarea> (initially hidden) to modify them. When a user double-clicks on the content of the page (event dblclick), it triggers function edit, and when the user stops editing the source (event blur), it triggers function save.

Function edit is defined as follows:

edit(topic) =
(
   do Dom.set_value(#edit_content, load_source(topic))
   do Dom.hide(#show_content)
   do Dom.show(#edit_content)
   do Dom.give_focus(#edit_content)
   void
)

This function loads the source code associated with topic, sets the content of #edit_content, replaces the <div> with the <textarea>, gives focus to the <textarea> (purely for comfort) and returns void.

Similarly, function save is defined as follows, and shouldn’t surprise you:

save(topic) =
(
   content = save_source(topic, Dom.get_value(#edit_content))
   do Dom.transform([#show_content <- content]);
   do Dom.hide(#edit_content);
   do Dom.show(#show_content);
   void
)

With these three functions, the user interface is ready. It is now time to work on the server.

Serving the pages ~~~~~~~

For this application, as we are serving several pages, we employ a new server constructor, which is more suited to our task: Server.simple_dispatch. This constructor also produces a service, but is suited to applications that need to return a number of distinct pages:

server = Server.simple_dispatch(start)

To get this to work, we need to define a function start as follows:

start(uri) = (
  match uri with
   | {path = {nil} ...} -> display("Hello")
   | {path = path  ...} -> display(String.concat("::", path))
)

This is another pattern-matching, built from some constructions that you have not seen yet. Pattern {nil} accepts the empty list. Pattern accepts any number of fields in a record. In other words, the first case of our pattern-matching accepts any record containing at least one field named path, provided that this field contains exactly the empty list. In other words, this corresponds to a request for a URI with an empty path, such as "http://localhost:8080/".

The second pattern accepts any record containing at least one field named path. From the definition of pattern-matching, it is executed only if the first pattern did not match, for instance on a request to "http://localhost:8080/hello".

In both cases, we execute function display. The first case is trivial, while in the second case, we first convert our list to a string, with separator "::".

Actually, we will make it a tad nicer by also ensuring that the first letter is uppercase, while the other letters are lowercase. Along the way, we can make this function shorter:

start =
   | {path = [] ...} -> display("Hello")
   | {~path     ...} -> display(String.capitalize(String.to_lower(String.concat("::", path))))

In this new version, we use a shorter syntax for pattern-matching. We use [] for the empty list — this is equivalent to {nil} — and we write ~path — which is equivalent to path = path.

Adding some style ~~~~~~~

As in the previous chapter, without style, this example looks somewhat bland. As previously, we will fix this with an external stylesheet resources/css.css, with the following contents:

Contents of file resources/css.css
link:resources/css.css[]

One last step: including the stylesheet. For this purpose, instead of using a predefined function such as Server.one_page_bundle, we define a second server, just to include the contents of the directory, as follows:

server = Server.of_bundle([@static_include_directory("resources")])

Note that the order of server declarations is meaningful. This server must appear before our previous server.

With this final line, we have a complete, working wiki. With a few additional images, we obtain:

result

As a summary, let us recapitulate the source file:

The complete application
link:hello_wiki_simple.opa[]

Run

This is a total of 30 effective lines of code + CSS.

Questions ~~~ What about the wiki syntax? ^^^^^^^^^ For this tutorial chapter, we have used the default syntax provided by Template, which we have not detailed. That is because you already know this syntax: Template uses a subset of the syntax of HTML, without JavaScript. As mentioned earlier, this syntax can be extended. The general idea is to plug new engines into Template which can interpret new namespaces. We will discuss the specifics in another chapter, or you can already look at the documentation of module Template.

Opa also features a powerful mechanism for parsing simple or complex syntaxes, which you can easily use to replace the default syntax of Template with something closer to the usual Wiki syntax. We will introduce this mechanism in another chapter.

What about user security? ^^^^^^^^^

As mentioned, one of the difficulties when developing a rich wiki is ensuring that it has no security vulnerabilities. Indeed, as soon as a user may edit content that will be displayed on another user’s browser, the risk exists of letting one user hide some JavaScript code (or possibly Flash or Java code) that will be executed by another user. This is a well-known technique for stealing identities.

You may attempt to reproduce this with the wiki, the chat, or any other Opa application. This will fail. Indeed, while lower-level web technologies make no difference between JavaScript code, text, or structured data, Opa does, and ensures that data that has been provided as the one can never be interpreted as one of the others.

Caution
Careful with the <script>

There is, actually, one exception to this otherwise bullet-proof guarantee: if a developer manually introduces <script> tag containing a insert, as follows, the possibility exists that a malicious user could take advantage of this to inject arbitrary code.

<script type="text/javascript">{security_hole}</script>

The bottom line is therefore: do not introduce <script> tag containing an insert. This is the only case that, at the time of this writing, Opa cannot check.

What about database security? ^^^^^^^^^^^

Now that we have a database, it is high time to think about what can be modified and in which circumstances. By default, Opa takes a conservative approach and ensures that malicious clients have access to as few entry points as possible — we call this publishing a function. By default, the only published functions are functions that users could trigger themselves by manipulation of the user interface, i.e. event handlers.

Tip
Published entry points

An entry point is a function that exists on the server but that could possibly be triggered by a user, possibly malicious.

Typically, every application contains at least one entry point, introduced by server — in the wiki, this is function start. Most applications also feature a manner for the client to send information to the server and trigger some treatment upon this information.

More generally, any function can be used as an entry point if it is published. By default, Opa will automatically publish event handlers.

Here, this means functions edit and save. In our listing, no other functions are published. Consequently, the Template mechanism is invoked only on the server. A direct consequence is that we can be certain that the stringmap is always well-formed and that /wiki[x] can only ever contain well-formed template content.

What about client-server performance? ^^^^^^^^^^^^^

At this stage, keeping the code in mind, you may start to wonder about performance and in particular about the number of requests involved in editing or saving a page. This is a very good point. Indeed, if your browser offers performance/request tracing tools, you will realize that calls to edit or save are quite costly.

This issue can be solved quite easily, but let us first detail the cost of saving:

  1. The client sends a save request to the server (1 request).

  2. The server prompts the client for the value of edit_content (2 requests).

  3. The server instructs the client to hide show_content (2 requests).

  4. The server instructs the client to show edit_content (2 requests).

Surely, this is not the best that Opa can do? Indeed, Opa can do better. To solve this, we only need to provide a little additional information to the compiler, namely to let it know that load_source, load_rendered and save_source have been designed to handle anything that can be thrown at them, and thus do not need to be kept hidden.

For this purpose, Opa offers a publication primitive: @publish. To apply it to our three functions, we simply add this directive where needed:

@publish load_source(topic)   = ...
@publish load_rendered(topic) = ...
@publish save_source(topic, source) = ...

And we are done!

With this simple modification, saving is now just one request. We will discuss @publish and the details of client-server slicing in a later chapter.

Once again, we may take a look at the complete application:

The complete application, made faster
link:hello_wiki.opa[]

What about database performance? ^^^^^^^^^^^^

Generally speaking, the Opa database is very fast. In particular, in our experience, a stringmap or an intmap is as fast as an indexed table in a relational database, sometimes much faster.

Of course, you do not have to restrict yourself to a map indexed by a string or an int. Indeed, any type that can be stored in the database can be used as an index. The chapter dedicated to the database will detail how you can do this and how to ensure that the performance of your database remains optimal.

Exercises ~~~ Time to put your new knowledge to the test.

Better handling of errors ^^^^^^^^^ When the user attempts to save content containing an error, instead of saving the result:

  • keep editing;

  • change the background color of edit_content to red;

  • show a message informing that the user can simply reload the page to cancel changes.

You will need to use a CSS class for setting the background color. You may also wish to use the following functions:

Dom.add_class:    dom, /*class*/string -> void
Dom.remove_class: dom, /*class*/string -> void

Changing the default content ^^^^^^^^^^ Customize the wiki so that the database contains not a Template.default_content but a option(Template.default_content), i.e. a value that can be either {none} or \{some = x\}, where x is a Template.default_content.

Use this change to ensure that the default content of a page with topic topic is "We have no idea about topic. Could you please enter some information?".

Inform users of changes ^^^^^^^^^ Drawing inspiration from the chat, add an on-screen zone to inform users of changes that take place while they are connected.

Template chat ^^^^^ Modify your "Hello, Chat" application so that users can enter rich text, not just raw text.

Chat log ^^^^ Modify your "Hello, Chat" application to add the following features:

  • store the conversation as it takes place;

  • when a new user connects, display the log of the conversation.

For this purpose, you will need to maintain a list of messages in the database.

Tip
About lists

In Opa, lists are one of the most common data structures. They are immutable linked lists.

Lists have type list. More precisely, a list of elements of type t has type list(t), pronounced "list of t". The empty list is written

[]

or, equivalently, {nil}. It has type list('a), which means that it could be a list of anything. A list containing elements x, y, z is written

[x,y,z]

or, equivalently,

{hd = x; tl =
{hd = y; tl =
{hd = z; tl =
{nil}}}}

More generally, the definition of list in Opa is:

type list('a) = {nil}
              / {hd: 'a; tl: list('a)}

If you have a list l and you wish to construct a list starting with element x and continuing with l, you can write either

[x|l]

or, equivalently,

{hd = x; tl = l}

or, equivalently,

List.cons(x, l)
Tip
About loops

If you have a list l and wish to apply a function f to all elements if l, use function List.iter. This is one of the many loop functions of Opa.

Yes, in Opa, loops are just regular functions.

For bonus points, make sure that the log is displayed on a slightly different background color.

Multi-room chat ^^^^^ Now that you know how to manipulate Server.simple_dispatch, you can implement a multi-room chat, with the following definition:

  • visiting a page with path p connects you to a chatroom p;

  • each message also contains the name of the chatroom;

  • for a client visiting path p, only display messages for chatroom p.

Tip
Minimizing communications

The best place for deciding whether to display a message is inside the callback added with Network.add_callback. Fine-tuning this callback can help you minimize the amount of server-to-client communications. For this purpose, you can use directive @server, a variant of @publish which lets you force a function to be executed only on the server.

Tip
Scaling up

For optimal scalability, a better design is required.

You will need to maintain a family of networks, one for each room, typically as a stringmap of networks. Networks are real-time data structures and can therefore not be stored in the database, as it is meaningless to store information that becomes inconsistent and unsafe whenever a client disconnects. Rather, the stringmap should be maintained as part of the state of a distributed session. We will introduce this mechanism of distributed sessions, which is the powerful primitive used to implement networks, in a few chapters.

And more ^^^^ Improve the wiki and the chat. Add features, make them nicer, make them better! And, once again, do not forget to show your changes to the community!

Jump to Line
Something went wrong with that request. Please try again.