Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Fetching contributors…

Cannot retrieve contributors at this time

303 lines (216 sloc) 12.956 kB
Hello, database
===============
In this chapter we will explore data storage possibilities. Opa has its own internal database, which is perfect for prototyping and quickly getting off the ground. However, for complex, data-intense applications, requiring scalable database with data replication and complex querying capabilities, Opa provides extensive support for the popular [MongoDB database](http://www.mongodb.org). This chapter focuses on MongoDB.
MongoDB: Quickstart
-------------------
In this section we will take the simple counter example from the [Opa tour chapter](/manual/A-tour-of-Opa) and run it using MongoDB.
First, unless done already, you need to [download](http://www.mongodb.org/downloads), [install & run](http://www.mongodb.org/display/DOCS/Quickstart) the MongoDB server. Installation essentially means unpacking the downloaded archive. And you can run the server with a command such as:
```{.sh}
$ mkdir -p ~/mongodata/opa
$ {INSTALL_DIR}/mongod --noprealloc --dbpath ~/mongodata/opa > ~/mongodata/opa/log.txt 2>&1
```
which creates directory `~/mongodata/opa` for the data (1) and then runs MongoDB server using that location and writing logs to `~/mongodata/opa/log.txt` (2). The server will take a moment to start and once done you can verify that it works by running `mongo` (client) in another terminal. The response should look something like this (of course version number may vary):
```
$ {INSTALL_DIR}/mongo
MongoDB shell version: 2.0.2
connecting to: test
>
```
Let us recapitulate the example from the previous chapter (we will assume it is saved in a file `counter.opa`).
[opa|fork=hello-opa|run=http://hello-opa.tutorials.opalang.org]file://hello-opa/hello-opa.opa
So how do we upgrade this example to MongoDB? All you have to do is to use the compilation switch for setting the DB backend: `--database mongo` (MongoDB will be the default database in future releases of Opa and you will be able to omit the switch).
`opa --database mongo counter.opa`
run it as before:
`./counter.exe`
and... voilà, you have your first Opa application running on MongoDB!
{block}[WARNING]
Beware, temporarily Opa accepts database declarations without explicit names, so:
```
database int /counter = 0;
```
instead of the above:
```
database mydb {
int /counter = 0;
}
```
However, programs with such name-less databases will *not* work with MongoDB. This glitch will be removed in the next version of Opa.
{block}
The above setup assumes that Mongo is running on local machine on its default port 27017; if that's not the case you can run your application (not the compiler!) with the option `--db-remote:db_name host[:port]`. For instance if Mongo was running locally but on port 4242, we'd need to run the application with:
```{.sh}
./counter.exe --db-remote:mydb localhost:4242
```
Incidentally, if you use authentication with your MongoDB databases, you can
provide the authentication parameters on the command line as follows:
```{.sh}
./counter.exe --db-remote:mydb myusername:mypassword@localhost:4242
```
Note, however, that we have a slightly simplified view of MongoDB authentication
in that we only support one database per connection.
MongoDB implements authentication on a per-database basis and can have
authentication to multiple databases over the same connection.
If you have more than one database at the same server you wil have to issue
multiple `--db-remote` options, one for each target database.
A potentially useful feature is that if you authenticate with the `admin`
database then it acts like ''root'' access for databases and you can then access
all databases over that connection.
So switching to Mongo is very easy. It is worth noting that although Mongo itself is "Schema free", the Opa compiler ensures adherence to program database definitions, therefore providing a safety layer and ensuring type safety for all database operations.
What are the gains of switching database backend? Apart from running on an industry standard DBMS it opens doors for more complex querying, which we will explore in the following section.
Database declarations
---------------------
One example is worth a thousand words, so in the remainder of this section we will use a running example of a database for a movie rating service. First we need data for service users, which will just be a set of all users, each of them having her `id` of type `user_id` (here explicit `int`, though in a real-life example it would probably be abstracted away), an `int age` field and a `status` which is a simple variant type (enumeration).
```
type user_status = {regular} or {premium} or {admin}
type user_id = int
type user = { user_id id, string name, int age, user_status status }
database users {
user /all[{id}]
/all[_]/status = { regular }
}
```
First thing to notice is the notation for declaring *database sets*. Writing `user /user` declares a single value of type `user` at path `/user`, however `user /all[{id}]` declares a *set of* values of type `user`, where the record field `id` is the *primary key*.
Second important thing is that all database paths in Opa have associated *defaults values*. For `int` values that is `0`, for `string` values that is an empty string. For custom values we either need to provide it explicitly, which is what `/all[_]/status = { regular }` does, or we need to explicitly state that a given records will never be modified partially, which can be accomplished with the following declaration `/all[_] full` -- more on partial record updates below.
Now we are ready to declare movies. First we introduce a type for a (movie) `person`, which in this simple example just consists of a `name` and the year of birth, `birthyear`.
```
type person = { string name, int birthyear }
```
Then we introduce a type for movie cast, including `director` and a list of `stars` playing in the movie (in credits order).
```
type cast = { person director, list(person) stars }
```
Now a (single) movie has its `title` (`string`), `view_counter` (`int`; how many times a given movie was looked at in the system), its `cast` and a set of ratings which are mappings from `int` (representing the rating) to `list(user_id)` (representing the list of users who gave that rating). It's not very realistic to just represent one single movie, but for our illustration purposes this will do.
```
database movie {
string /title
int /view_counter
cast /cast
intmap(list(user_id)) /ratings
}
```
It is worth noticing that we can store complex (non-primitive) values in the database, without any extra effort -- `cast` is a record, which itself contains a list as one if its fields.
Here we notice that complex types -- records (`cast`), `list`s (`stars` field of `cast`) and `intmap` -- can be declared as entries in the database, in the same way as primitive tyes.
In the rest of this section we will explore how to read, query and update values of different types.
Basic types
-----------
We can read the title and the view counter of a single movie simply with:
```
string title = /movie/title;
int view_no = /movie/view_counter;
```
The `/movie/title` and `/movie/view_counter` are called *paths* and are composed of the database name (`movie`) and field names (`title`/`view_counter`).
Updates can be performed using the database write operator `path <- value`:
```
/movie/title <- "The Godfather";
/movie/view_counter <- 0;
```
What will happen if one tries to read unitialized value from the database? Say we do:
```
int view_no = /movie/view_counter;
```
when `/movie/view_counter` was never initialized; what's the value of `view_no`? We already mentioned default values above and the default value for a given path is exactly what will be given, if it was not written before.
This default value can be overwritten in the path declaration as follows:
```
database movies {
...
int /view_counter = 10 // default value will be 10
}
```
Another option is to use question mark (`?`) before the path read, which will return an optional value, `{none}` indicating that the path has not been written to yet, as in:
```
match (?/movies/view_counter) {
case {none}: "No views for this movie...";
case {some: count}: "The movie has been viewed {count} times";
}
```
For integer fields we can also use some extra operators:
```
/movie/view_counter++;
/movie/view_counter += 5;
/movie/view_counter -= 10;
```
Records
-------
With records we can do complete reads/updates in the same manner as for basic types:
```
cast complete_cast = /movie/cast;
/movie/cast <- { director: { name: "Francis Ford Coppola", birthyear: 1939 }
, stars: [ { name: "Marlon Brando", birthyear: 1924 }, { name: "Al Pacino", birthyear: 1940 } ]
};
```
However, unless a given path has been declared with for full modifications only (`full` modificator explained above), we can also cross record boundaries and access/update only chosen fields, by including them in the path:
```
person director = /movie/cast/director;
/movie/cast/director/name <- "Francis Ford Coppola"
```
Also with updates we can only update some of the fields:
```
// Notice the stars field below, which is not given and hence will not be updated
/movie/cast <- { director: { name: "Francis Ford Coppola", birthyear: 1939 } }
```
// TODO Explain defaults and need to define them for custom variant types
Lists
-----
List in Opa are just (recursive) [records](/type/stdlib.core/list) and can be manipulated as such:
```
/movie/cast/stars <- []
person first_billed = /movie/cast/stars/hd
list(person) non_first_billed = /movie/cast/stars/tl
```
However, as it's a frequently used data-type, Opa provides a numer of extra operations on them:
```
// removes first element of the list
/movie/cast/stars pop
// removes last element of the list
/movie/cast/stars shift
// appends one element to the list
/movie/cast/stars <+ { name: "James Caan", birthyear: 1940 }
person actor1 = ...
person actor2 = ...
// appends several elements to the list
/movie/cast/stars <++ [ actor1, actor2 ]
```
Sets and Maps
-------------
Sets and maps are two types of collections that allow to organize multiple instances of data in the database. Sets represent a number of items of a given type, with no order between the items (as opposed to lists, which are ordered). Maps represent associations from keys to values.
We can fetch a single value from a given set by referencing it by its primary key:
```
user some_user = /users/all[{id: 123}]
```
Similarly for maps:
```
list(user_id) gave_one = /movie/ratings[1]
```
We can also make various queries using the following operators (*comma* separated).
- `field == expr`: value of `field` is *equal* to the expression `expr`.
- `field != expr`: value `field` is *not equal* to the expression `expr`.
- `field < expr`: value of `field` is *smaller than* that of `expr` (you can also use: `<=`, `>` and `>=` with ).
- `field in list_expr`: value of `field` is *one of* those of the list `list_expr`.
- `q1 or q2`: satisfies query `q1` *or* `q2`.
- `q1 and q2`: satisfies queries `q1` *and* `q2`.
- `not q`: does not satisfy query `q`.
- `{f1 q1, f2 q2, ...}`: field `f1` satisfies `q1`, field `f2` satisfies `q2` etc.
and possibly some options (*semicolon* separated)
- `skip n`: skip the first `n` results (`n` an expression of type `int`)
- `limit n`: limit the result to the maximum of `n` results (`n` an expression of type `int`)
- `order fld (, fld)+`: order the results by given fields. Possible variants include: `fld`, `+fld` (sort ascending), `-fld` (sort descending), `fld=expr` (where `expr` needs to be of type `{up} or {down}`).
Note that if the result of a query is not fully constained by the primary key and hence can contain multiple values then the result of the query is of type [dbset](/module/stdlib.database.mongo/DbSet) (for sets) or of the initial map type (for maps, giving a sub-map). You can manipulate a dbset with `DbSet.iterator` and [Iter](/module/stdlib.core.iter/Iter).
Examples for sets:
```
user /users/all[id == 123] // accessing a single entry by primary key
dbset(user, _) john_does = /users/all[name == "John Doe"] // return a set of values
it = DbSet.iterator(john_does) // then use Iter module
dbset(user, _) underages = /users/all[age < 18]
dbset(user, _) non_admins = /users/all[status in [{regular}, {premium}]]
dbset(user, _) /users/all[age >= 18 and status == {admin}]
dbset(user, _) /users/all[not status == {admin}]
// showing second 50 results for users that are below 18 or above 62,
// sorted by age (ascending) and then id (descending)
dbset(user, _) users1 = /users/all[age <= 18 or age >= 62; skip 50; limit 50; order +age, -id]
```
Examples for maps:
```
// users who rated the movie 10
list(user_id) loved_it = /movie/ratings[10]
// users who rated the movie between 7 and 9 (inclusive)
list(user_id) liked_it = /movie/ratings[>= 7 and <= 9]
```
Jump to Line
Something went wrong with that request. Please try again.