Low-level MongoDB support
=========================
{block}[WARNING]
This chapter covers advanced uses of MongoDB in Opa and details the low-level access to MongoDB. For most applications, you should read [this chapter](/manual/Hello--database) instead.
{block}
Introduction
------------
In this chapter, we describe the current state of support for MongoDB in the Opa
standard library.
We assume some familiarity with MongoDB concepts and particularly with the
MongoDB
[shell](http://www.mongodb.org/display/DOCS/mongo+-+The+Interactive+Shell).
You can gain this familiarity by reading the MongoDB
[tutorial](http://www.mongodb.org/display/DOCS/Tutorial).
MongoDB is a server-based document-oriented non-relational database intended to
be scalable and fast.
Documents are stored in a binary JSON-like format called
[BSON](http://bsonspec.org).
Although BSON has a richer set of types than JSON, any JSON document can be
represented in BSON.
For speed, MongoDB does not implement joins. Instead, it provides a
powerful query language of its own, and almost anything that can be done with a
relational database can be implemented in MongoDB with a little effort
(see MongoDB's page on
[SQL compatibility](http://www.mongodb.org/display/DOCS/SQL+to+Mongo+Mapping+Chart)).
In addition, MongoDB allows multiple indices into its data, although these are
not automatic and have to be created from client code.
MongoDB is intended to be deployed in reliable, large-scale, web-based
applications. It therefore has features which facilitate scalability, such as sharding
and master-slave arrangements of servers, along with features for reliability,
such as replicated servers with fail-over.
Backups of MongoDB data are usually done either offline on a slave server in the
network using [external tools](http://www.mongodb.org/display/DOCS/Backups) or to
redundant nodes in the MongoDB server network.
### Setting up MongoDB
If you are not familiar with the MongoDB database, here are some quick
instructions to get you going.
Firstly, make sure that you have MongoDB installed on your system:
```
% which mongod
```
Note that MongoDB is not yet bundled with major distributions such as Ubuntu, but
installation is simple: download the latest version from the MongoDB
[downloads](http://www.mongodb.org/downloads) site and unpack the files locally.
You should then only have to add the `bin` directory to your path to be up and
running.
To run a MongoDB server, you first have to create a directory to store the
database files.
In fact, you need a directory for each node you wish to run; see the MongoDB
documentation for how to create replica sets, sharding etc.
At its simplest, start a `mongod` server with:
```
% mkdir -p ~/mongodata/master
% mongod --rest --oplogSize 500 --noprealloc --master --dbpath ~/mongodata/master > ~/mongodata/master/log.txt 2>&1 &
```
Use the `--oplogSize` and `--noprealloc` options to limit the initial allocated
disk space (the default is about 1 GB).
The `--rest` option allows you to monitor your database via the HTTP interface
(found at the port number plus 1000).
If you wish to run the server on a different port, use the `--port` option
(the default MongoDB server port is 27017).
Note, however, that to run the MongoDB shell on a non-default port you also need
the `--port` option:
```
% mongo --port 27017
MongoDB shell version: 2.0.1
connecting to: test
>
```
For the Opa MongoDB driver we recommend MongoDB version 1.6.0 or greater, since much of
the current functionality was mature by that version.
We always recommend the current stable MongoDB version (2.0.2 at the time of writing),
but for the most part the driver is quite stable with respect to backwards compatibility.
### Overview
The Opa support for MongoDB consists of a hierarchy of modules leading to
successively higher-level programming.
#### Bson
Support for the BSON binary format is provided by the `Bson` module; all
other modules are built on top of it.
In general, BSON values are handled by the `Bson.document` Opa data-type, but we
also provide the `Bson.opa2doc` and `Bson.doc2opa` functions to allow conversion
between Opa types and BSON documents.
#### MongoCommon
This contains general support routines for dealing with replies from the MongoDB
server.
These include:
- printing results to meaningful strings
- testing results for error status
- handling tag lists instead of bit-mapped integers
- extracting fields and Opa types from MongoDB replies
#### MongoConnection
The code which talks to the MongoDB server is in the private `MongoDriver`
module. This includes support for
[replica sets](http://www.mongodb.org/display/DOCS/Replica+Sets) with automatic
reconnection on fail-over and for
[cursors](http://www.mongodb.org/display/DOCS/Queries+and+Cursors). For
programming at this level, however, we provide a single all-purpose module called
`MongoConnection`.
Advanced programmers wishing to use some of the more obscure features of MongoDB
can use the driver code directly, but this is not recommended.
MongoDB has a complex API involving over 70 functions and many of the simple
access commands have numerous options.
Our intention with this driver is to make accessing MongoDB databases as simple
and logical as possible while still exposing the power and flexibility of the
MongoDB engine.
#### MongoCommands
As an adjunct to the low-level programming interface, we provide a module
called `MongoCommands` which covers a large (but still incomplete) part of the
MongoDB command set.
It encompasses most functions that will be required for meta-programming the
MongoDB database, such as `dropDatabase`, `repairDatabase`, `createCollection`
and so on, plus functions associated with normal database access operations such
as `getLastError`.
The more advanced MongoDB functionality is also supported here, including
`findAndModify` and the very powerful `mapReduce` function.
These commands occur in two flavors: those which return `Bson.document` values
and those which convert their results into Opa types.
If you are only looking for a single value out of a large and complex reply
document then using the `Bson` module access functions on the raw BSON may be
more efficient.
If you intend complex analysis of the reply then the Opa types may be more
convenient.
At the present time only partial support is provided for Opa types.
Some command results may never be treated this way because they include
arbitrary field names which we can't safely convert into Opa types.
#### MongoCollection
This module represents a type-safe view of the low-level routines in
`MongoConnection`.
Here, we insist upon Opa types as arguments and results from MongoDB operations.
This necessarily limits what we can put into the database since the BSON
documents stored in the database have to be consistent with the Opa types they
represent.
To achieve this, we have implemented the `MongoSelect` and `MongoUpdate` modules
which enforce a type discipline upon the arguments to, for example,
`MongoCollection.insert`.
The type safety is implemented as run-time type checks so there is a significant
performance penalty for using these routines.
In the future, however, we will provide fully type-safe compile-time type checks
along the lines of the Opa internal database.
Programming
-----------
Here, we provide some notes on programming with the Opa MongoDB driver.
The full interface is too large for complete coverage here; refer to the online
Opa [API documentation](http://doc.opalang.org/api) for detailed notes on each
function.
### Using BSON types in Opa
The full Opa BSON data-type is as follows:
```
/**
* A BSON value encapsulates the types used by MongoDB.
**/
type Bson.value =
{ float Double }
or { string String }
or { Bson.document Document }
or { Bson.document Array }
or { string Binary }
or { string ObjectID }
or { bool Boolean }
or { Date.date Date }
or { Null }
or { (string, string) Regexp }
or { string Code }
or { string Symbol }
or { (string, Bson.document) CodeScope }
or { int Int32 }
or { int32 RealInt32 }
or { (int, int) Timestamp }
or { int Int64 }
or { int64 RealInt64 }
or { Min }
or { Max }
/**
* A BSON element is a named value.
**/
type Bson.element = { string name, Bson.value value }
/**
* The main exported type, a BSON document is just a list of elements.
*/
type Bson.document = list(Bson.element)
```
While values of this type can be constructed manually:
```
doc = Bson.document
[{name: "$eval", value: {Code:"function(x,y) \{return x*y;}"}},
{name: "args", value:{Array:[{name:"0", value:{Int32:6}},
{name:"1", value:{Int32:7}}]}}]
```
there are two more convenient ways of constructing BSON values.
Firstly, we provide a set of abbreviations in the `Bson.Abbrevs` module:
```
H = Bson.Abbrevs
doc = Bson.document [H.code("$eval","function(x,y) \{return x*y;}"),
H.valarr("args",[{Int32:6},{Int32:7}])]
```
Secondly, we can construct the values in Opa and use `Bson.opa2doc`:
```
doc = Bson.opa2doc({`$eval`:(Bson.code "function(x,y) \{return x*y;}"),
args:(list(Bson.int32) [6,7])})
```
Notice that to get a field with non-alphanumeric characters we have to back-quote
the field name in the Opa value, and that to control the representation in the
BSON type we can apply helper types. For example, `Bson.code` is just a string,
but it instructs `Bson.opa2doc` to treat it as code.
Remember also to escape curly brackets in strings.
Note that to get `Int32` values you need the `Bson.int32` type; the default for
`int` is actually `Bson.int64`.
There are several such types provided by the `Bson` module but some merit
special mention:
* Optional types have a special significance with respect to `Bson.doc2opa`: if a field value is missing in the document, it will appear in the Opa type as `{none}`. The reverse does not hold: `{none}` values are represented in the BSON document as `{ none : null }`.
* We take this one step further, however, with the `Bson.register` type, which behaves much like `option('a)` except that when we call `Bson.opa2doc` any `{absent}` values are omitted from the resulting document altogether. Note that there is a module `Bson.Register` which provides the same functionality for `Bson.register` as the `Option` module does for type `option`.
```
type Bson.register('a) = {'a present} or {absent}
```
* Care should be taken in dealing with integer values which may have been placed into the database outside of Opa. Internally, Opa uses the OCaml integer representation `int`, which is 31 bits wide on 32-bit systems and 63 bits wide on 64-bit systems (the spare bit is reserved by the garbage collector). MongoDB, however, uses full 32-bit and 64-bit integers, so a MongoDB database may contain an integer value which is too large for the Opa representation (all values generated by Opa and stored in the database are guaranteed to be within range). Currently, Opa only supports full 32-bit and 64-bit integers as abstract values: they can be stored in Opa as the external types `int32` and `int64` (which are sometimes needed by external libraries), but no operations are possible on them. The MongoDB driver handles this situation by automatically detecting overflow values and storing them as `RealInt32` and `RealInt64` when returning `Bson.document` types. While these values may appear to be invisible to `Bson` module functions such as `find_int`, you can detect overflows by inspecting the document values:
```
match (value) {
case {RealInt32:_}: error("overflow");
case {Int32:i}: i;
default: error("not an int");
}
```
* The `Bson.meta` type is intended to support situations where MongoDB can return a field of different types depending upon the nature of the command executed. A good example of this is the `out` option to the `mapReduce` function which can be either a `string` or a document type. We cast the parameter as `Bson.meta` which allows us to control the type at the function's application. We can also apply this trick to the `result` type from `mapReduce` calls:
```
mr = MongoCommands.mapReduceSimple(mongodb,map,reduce,{String:"example1"})
/* or */
mr = MongoCommands.mapReduceSimple(mongodb,map,reduce,{Document:[H.str("reduce","session_stat")]})
```
* Two other cases should be mentioned. Both `list` and `intmap` are mapped onto `Array` values in BSON. The difference is that `list` is mapped to consecutive-numbered elements in the `Array` document whereas `intmap` allows sparse arrays.
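The overflow rule described in the integer bullet above can be modelled outside Opa. The following is a minimal sketch in Python (chosen only so that it can be run standalone), not driver code: the function names `opa_int_range` and `classify` are hypothetical, and the returned tags simply mirror the `Bson.value` constructors listed earlier.

```python
def opa_int_range(word_bits):
    """Inclusive range of a native Opa (OCaml) int on a word_bits-wide system."""
    usable = word_bits - 1              # one bit is reserved by the garbage collector
    return -(1 << (usable - 1)), (1 << (usable - 1)) - 1


def classify(value, bson_bits, word_bits=64):
    """Tag a stored 32-bit or 64-bit BSON integer as native or overflow."""
    limit = 1 << (bson_bits - 1)
    if not -limit <= value < limit:
        raise ValueError("value does not fit in %d bits" % bson_bits)
    low, high = opa_int_range(word_bits)
    tag = "Int%d" % bson_bits           # "Int32" or "Int64"
    return tag if low <= value <= high else "Real" + tag


print(classify(2**62 - 1, 64))                 # largest value of a 63-bit int
print(classify(2**62, 64))                     # one past the 63-bit range
print(classify(2**31 - 1, 32, word_bits=32))   # too wide for a 31-bit int
```

Under this model, a `RealInt32` can only arise on a 32-bit system, since every 32-bit value fits into a 63-bit `int`.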
As a rough guide to `Bson.opa2doc` and `Bson.doc2opa`, the following simple
schema shows the mapping:
```
/* We use a "natural" mapping of constant types */
float <-> Double
string <-> String
Bson.binary <-> Binary
Bson.oid <-> ObjectID
bool <-> Boolean
Date.date <-> Date
void <-> Null
Bson.regexp <-> Regexp
Bson.code <-> Code
Bson.symbol <-> Symbol
Bson.codescope <-> CodeScope
Bson.int32 <-> Int32
Bson.realint32 <-> Int32
Bson.timestamp <-> Timestamp
Bson.realint64 <-> Int64
Bson.min <-> Min
Bson.max <-> Max
/* Basic record scheme */
{a:'a; b:'b} <-> { a: 'a, b: 'b }
/* Sum types */
{a:'a} / {b:'b} <-> { a: 'a } <or> { b: 'b }
/* Non-record types are called "value" */
'a <-> { value: 'a }
/* Special cases */
/* Default for int is Int64 */
int <-> Int64
/* Overflow */
Bson.realint32 <- Int32 /* when integer exceeds range */
Bson.realint64 <- Int64 /* when integer exceeds range */
/* Options */
option('a):
{some=a} <-> { some : 'a }
{none} <-> { none : null }
{none} <- { }
/* Registers */
Bson.register('a):
{present=a} <-> { present : 'a }
{absent} <- { absent : null }
{absent} <-> { }
/* Lists are consecutive arrays */
list('a) <-> { Array=(<label>,{ 0:'a; 1:'a; ... }) }
/* Intmaps are non-consecutive arrays */
ordered_map(int,'a) <or>
intmap('a) <-> { Array=(<label>,{ 1:'a; 3:'a; ... }) }
/* Bson.document is treated verbatim (including labels) */
Bson.document <-> Bson.document
/* Bson.meta is treated as a variable type */
int:Bson.meta <-> { Int64:int }
string:Bson.meta <-> { String:string }
bool:Bson.meta <-> { Boolean:bool }
etc.
```
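The asymmetry between `option` and `Bson.register` in the schema above is easy to miss, so here is a small model of it. This is a sketch in Python rather than Opa, under stated assumptions: documents are plain dictionaries, `None` stands for `{none}` or `{absent}`, and all four function names are hypothetical.

```python
def option_field_to_doc(name, value):
    """Opa -> BSON for a record field of type option('a)."""
    if value is None:                        # {none} is written explicitly
        return {name: {"none": None}}
    return {name: {"some": value}}


def register_field_to_doc(name, value):
    """Opa -> BSON for a record field of type Bson.register('a)."""
    if value is None:                        # {absent} is omitted altogether
        return {}
    return {name: {"present": value}}


def option_field_from_doc(name, doc):
    """BSON -> Opa: a missing field and { none : null } both read as {none}."""
    if name not in doc or "none" in doc[name]:
        return None
    return doc[name]["some"]


def register_field_from_doc(name, doc):
    """BSON -> Opa: a missing field and { absent : null } both read as {absent}."""
    if name not in doc or "absent" in doc[name]:
        return None
    return doc[name]["present"]


print(option_field_to_doc("age", None))      # the field is kept, holding none
print(register_field_to_doc("age", None))    # the field disappears
```

Both types accept a missing field when reading; they differ only in what is written back for the empty case.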
Notes:
* For `ObjectID` values, there are a couple of routines which convert between (hex value) strings and the BSON representation, `Bson.oid_of_string` and `Bson.oid_to_string`. You can also create a BSON-style OID value with `Bson.new_oid`.
* `Bson.document` types are completely write-through, i.e. they are not processed at all.
* In case you're wondering, `Min` and `Max` are used in sharded databases to indicate infimum and supremum bounds on sharding regions, respectively.
//TODO: other functions find_xyz, to_pretty, error stuff
### Using the low-level interface
Connecting to and using the low-level drivers should be done using the
`MongoConnection` module.
This gathers together various low-level features in a single module.
#### Opening a connection to the MongoDB server
The preferred method is to use the system of named connections, which can be
defined from the command line or set up internally using the `Mongo.param` type
and the `MongoConnection.add_named_connection` function.
Initially, there is one default connection (called ''default'') which is set to
`localhost:27017`, the default port for MongoDB servers on the local machine.
To open this connection use:
```
mongodb =
match (MongoConnection.open("default")) {
case {success:mongodb}: mongodb
case {~failure}: ... /* take action on error */
}
/* or */
mongodb = MongoConnection.openfatal("default")
```
The `MongoConnection.open` function returns an outcome of either the connection
or the standard `Mongo.failure` type whereas the `MongoConnection.openfatal`
function returns just the connection but treats a failed connection as a fatal
error.
To set up the connection from the command line, the following options are defined:
{table}
{* Option | Abbreviation and type | Description *}
{| `--mongo-name` | `(--mn) <string>` | Name for the MongoDB server connection |}
{| `--mongo-repl-name` | `(--mr) <string>` | Replica set name for the MongoDB server |}
{| `--mongo-buf-size` | `(--mb) <int>` | Hint for initial MongoDB connection buffer size |}
{| `--mongo-socket-pool`| `(--mp) <int>` | Number of sockets in socket pool (>=2 enables socket pool) |}
{| `--mongo-seed` | `(--ms) <host>{:<port>}` | Add a seed to a replica set, allows multiple seeds |}
{| `--mongo-host` | `(--mh) <host>{:<port>}` | Host name of a MongoDB server, overwrites any previous hosts |}
{| `--mongo-log` | `(--ml) <bool>` | Enable MongoLog logging |}
{| `--mongo-log-type` | `(--mt) <string>` | Type of logging: stdout, stderr, logger, none |}
{| `--mongo-auth` | `(--ma) <user:pwd@dbname>` | Define user name and password for database dbname |}
{table}
So, for example, to connect to the default connection at `machinexyz:12345` you
would use:
```{.sh}
% prog.exe --mh machinexyz:12345
```
This remains a single connection; to connect to a replica set you also need to
define a name for the replica set plus some seeds:
```{.sh}
% prog.exe --mn blort --mr blort --ms machinexyz:27017 --ms machineuvw:27017
```
Here we have defined a connection called ''blort'' to a replica set also called
''blort'' with two seed machines.
Remember that you only really need one seed which is active in the set: the
connection logic queries the seeds for the actual host list and then polls the
hosts until it finds the current primary server.
From then on reconnection will be attempted if the current primary goes down.
Note that you can define as many named connections as you like; this example
still retains the default connection.
Note also that you can clone a connection, in which case the connection itself will
not be closed until all of its clones have been closed.
Handling concurrency within an Opa program is done by a socket pool.
This means that a pool of open connections is maintained to the same server such
that blocking only occurs if there are no more available connections in the pool
(set with `--mp 2`, for example).
If you ensure that the pool size is at least as big as the number of threads in
your code then no blocking will occur.
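The blocking behaviour of a socket pool can be illustrated with a few lines of Python. This is only a model of the idea, not the Opa driver's implementation: `SocketPool` is a hypothetical class and the strings stand in for open sockets.

```python
import queue


class SocketPool:
    def __init__(self, size):
        self._free = queue.Queue()
        for n in range(size):
            self._free.put("socket-%d" % n)   # stand-ins for open sockets

    def acquire(self, timeout=None):
        """Take a socket; this blocks only when every socket is in use."""
        return self._free.get(timeout=timeout)

    def release(self, sock):
        """Hand the socket back so that a waiting caller can proceed."""
        self._free.put(sock)


pool = SocketPool(2)                 # comparable to `--mp 2`
first = pool.acquire()
second = pool.acquire()
try:
    pool.acquire(timeout=0.01)       # a third caller has to wait
except queue.Empty:
    print("pool exhausted, caller would block")
pool.release(first)
print(pool.acquire(timeout=0.01))    # succeeds again after the release
```

With a pool of size two, the first two callers proceed immediately and only a third concurrent caller has to wait.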
Named connections can also be defined within the program:
```
MongoConnection.add_named_connection({
name: "blort",
replname: {some: "blort"},
bufsize: 50*1024,
pool_max: 2,
log: false,
seeds:[("localhost",10001),("localhost",10002)],
auth:[{dbname:"mydb",user:"me",password:"secret"}]
})
mongodb2 = MongoConnection.openfatal("blort")
```
Once a connection has been opened, it can be pointed to different databases and
collections using a functional interface.
The default database is ''db'' and the default collection is ''collection'' but
we can make a connection to a different collection without re-opening the
connection as follows:
```
mongodb_wiki = MongoConnection.namespace(mongodb,"db","wiki")
```
This mechanism also applies to the flags that some of the MongoDB operations can
take, for example to set the `Upsert` flag for all insert operations:
```
mongodb3 = MongoConnection.upsert(mongodb)
```
This method is quite flexible: you can define these flags once when the
connection is made, making them globally persistent, or you can add these
function calls at the point of calling the operation, making them local to that call
(there are examples below).
All of the MongoDB flags are supported in this way.
One flag is worth a particular mention: the `log` flag, which can be set on the
command line, can be overridden in this way, allowing you to generate
logs for targeted sections of code.
In fact, you can change any of the command-line options this way, but bear in mind
that some of them, for example seed lists, will not take effect until the
connection is re-established.
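The difference between globally persistent and locally defined flags follows from the interface being functional: each call returns a new connection value and leaves the original untouched. The sketch below models this in Python with an immutable record. The `Conn` class and its fields are hypothetical stand-ins, not the driver's actual connection type.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Conn:
    dbname: str = "db"                 # default database
    collection: str = "collection"     # default collection
    upsert: bool = False
    log: bool = False


def namespace(conn, dbname, collection):
    """Return a copy of the connection pointing at another collection."""
    return replace(conn, dbname=dbname, collection=collection)


def upsert(conn):
    """Return a copy of the connection with the upsert flag set."""
    return replace(conn, upsert=True)


base = Conn()
wiki = namespace(base, "db", "wiki")   # kept around: a persistent setting
print(upsert(wiki))                    # flag applied for one call only
print(wiki)                            # still without the flag
```

Keeping the returned value makes a setting persistent, while wrapping the connection inline at the call site confines it to that one operation.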
#### Authentication
As shown above, you can supply the MongoDB authentication parameters for a given database
either on the command line, using the `--mongo-auth` argument, which is of the
form `user:password@database_name`, or by placing the authentication
parameters in the `auth` field of the `add_named_connection` function argument.
Alternatively, you can call the `MongoCommands.authenticate` function to perform
an additional, external authentication.
Note that if you are connecting to a replica set then the driver needs to
re-authenticate after connecting to a new host, so the authentication
parameters are built into the low-level Mongo datatype.
This means that if you call `MongoCommands.authenticate` you should perform all subsequent
operations on the returned Mongo datatype, not on the original, which won't have
the parameters built in.
Remember that authentication in MongoDB is to a database, not to a connection so
you can have multiple user names and passwords associated with a single
connection.
If you want to authenticate with all of the databases over a connection you need
to authenticate with the `admin` database which acts a bit like ''root'' access
for databases.
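To make the two command-line formats concrete, the following Python sketch splits a `--mongo-auth` argument and a `<host>{:<port>}` argument into their parts. It illustrates the formats only. How the Opa driver itself parses these strings (for example, a password containing `@`) is not covered by this chapter, and the function names `parse_auth` and `parse_host` are hypothetical.

```python
def parse_auth(arg):
    """Split 'user:pwd@dbname' into the three fields of an auth entry."""
    credentials, at, dbname = arg.rpartition("@")
    user, colon, password = credentials.partition(":")
    if not (at and colon and user and dbname):
        raise ValueError("expected user:pwd@dbname, got %r" % arg)
    return {"dbname": dbname, "user": user, "password": password}


def parse_host(arg, default_port=27017):
    """Split '<host>{:<port>}', falling back to the default MongoDB port."""
    host, colon, port = arg.partition(":")
    return (host, int(port)) if colon else (host, default_port)


print(parse_auth("me:secret@mydb"))
print(parse_host("machinexyz:12345"))
print(parse_host("localhost"))
```

The dictionary returned by `parse_auth` has the same three fields as an entry of the `auth` list passed to `add_named_connection`.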
#### Basic operations
The basic database access operations are the same as the MongoDB protocol
operations, i.e. insert, update, query, get_more, delete, kill_cursors and msg.
So, for example, to insert a document:
```
/* A couple of documents */
p1 = [H.str("name","Joe1"), H.i32("age",44)]
p2 = [H.str("name","Joe2"), H.i32("age",55)]
/* Insert the documents */
MongoConnection.insert(mongodb,p1)
MongoConnection.insert_batch(mongodb,[p1,p2])
```
The basic write operations come in three types:
* `insert` is the write-and-forget operation: the insert message is sent and a boolean value is returned which simply states whether the correct number of bytes was written to the socket.
* `inserte` is a ''safe'' operation where the insert message has a `getlasterror` query piggy-backed onto it and then the raw optional reply is returned.
* `insert_result` does an `inserte` and then analyzes the reply, turning it into a standard `Mongo.result` type.
All of the basic write operations have these three forms.
The `Mongo.result` type is an `outcome` of either success as a `Bson.document`
type or failure as a `Mongo.failure` type.
The `Mongo.failure` type looks like:
```
type Mongo.failure =
{OK}
or {string Error}
or {Bson.document DocError}
or {Incomplete}
or {NotFound}
```
This defines either a raw document error `{DocError:doc}`, which is an error as
reported by the MongoDB server; a driver error `{Error:str}`, which is a
message generated by the Opa driver; or one of a few special-purpose errors returned
under specific circumstances (`{OK}`, for example, simply indicates a connection that has never
been used).
Post-processing of results may include checking for errors:
```
error = MongoConnection.insert_result(MongoConnection.upsert(mongodb),[H.i32("i",n)])
println("insert error={MongoCommon.is_error(error)}")
```
or extracting specific fields from the reply:
```
println("errmsg={MongoCommon.result_string(error,"errmsg")}")
```
noting that we also support the MongoDB
[dot notation](http://www.mongodb.org/display/DOCS/Dot+Notation+%28Reaching+into+Objects%29)
syntax:
```
println("indexSizes._id_={MongoCommon.dotresult_int(collStats,"indexSizes._id_")}")
```
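Dot notation simply walks down nested documents one field name at a time. As a rough illustration of the lookup (not of the `MongoCommon` implementation), here is a Python sketch in which documents are modelled as nested dictionaries. The name `dot_lookup` and the sample values are hypothetical.

```python
def dot_lookup(doc, path):
    """Follow a dotted path such as 'indexSizes._id_' into nested documents."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None                  # the path does not exist in this reply
        current = current[key]
    return current


coll_stats = {"ns": "db.wiki", "count": 3, "indexSizes": {"_id_": 8176}}
print(dot_lookup(coll_stats, "indexSizes._id_"))
print(dot_lookup(coll_stats, "indexSizes.missing"))
```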
Closing a connection is as simple as:
```
MongoConnection.close(mongodb)
```
Remember that the connection will only close once all of the clones have also
been closed.
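One way to picture the clone rule is as a shared counter: every clone increments it, every close decrements it, and the socket is only released when it reaches zero. The Python sketch below models that behaviour. Whether the driver uses exactly this bookkeeping is an assumption, and the `Connection` class is hypothetical.

```python
class Connection:
    def __init__(self, shared=None):
        # All clones share one record holding the socket state and a clone count.
        self.shared = shared if shared is not None else {"open": True, "clones": 1}

    def clone(self):
        self.shared["clones"] += 1
        return Connection(self.shared)

    def close(self):
        self.shared["clones"] -= 1
        if self.shared["clones"] == 0:
            self.shared["open"] = False   # last clone gone: really close

    def is_open(self):
        return self.shared["open"]


original = Connection()
copy = original.clone()
original.close()
print(copy.is_open())     # the clone keeps the connection alive
copy.close()
print(copy.is_open())     # closed once every clone has been closed
```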
#### Cursors
Handling queries in MongoDB has the complication that, for efficiency, cursors
are stored on the server, which entails tracking them on the client side.
While the bare `MongoConnection.query` and `MongoConnection.get_more` operations
can be used to handle queries in conjunction with the reply support code in
`MongoCommon`, they are a bit inconvenient.
We have therefore defined cursor operations in the `MongoCursor` module
and re-exported the most important ones into the `MongoConnection.Cursor`
module.
A cursor object itself contains all the parameters needed to manage the cursor
at the server side and, in fact, duplicates some of the information in the
connection object.
Using the re-exported functions reduces the number of parameters to the basic
functions since this information can be lifted from the connection into the
cursor object.
Here is an example of a low-level cursor dialog:
```
cursor = MongoConnection.Cursor.init(mongodb)
cursor = MongoConnection.Cursor.set_query(cursor,{some:[H.str("name","Joe")]})
cursor = MongoConnection.Cursor.set_limit(cursor,3)
cursor = MongoConnection.Cursor.set_fields(cursor,{some:[H.i32("_id",0)]})
cursor = MongoConnection.Cursor.next(cursor)
result = MongoConnection.Cursor.check_cursor_error(cursor)
println("result 1 = {MongoCommon.pretty_of_result(result)}")
println("valid 1 ={MongoConnection.Cursor.valid(cursor)}")
cursor = MongoConnection.Cursor.next(cursor)
result = MongoConnection.Cursor.check_cursor_error(cursor)
println("result 2 = {MongoCommon.pretty_of_result(result)}")
println("valid 2 = {MongoConnection.Cursor.valid(cursor)}")
MongoConnection.Cursor.reset(cursor)
```
The cursor is initialized with `init` and then the parameters for the query
are set up.
The `next` function generates the `query` (or `get_more`) call to the server and
places the next document internally in the cursor object along with any error
status.
The `check_cursor_error` function is a convenient way of extracting either the
current document or the error as a `Mongo.result`.
Subsequent calls to `next` will either return the next document from the
previous reply or issue a `get_more` call to re-populate the cursor.
The end of the matching documents (or if no document matches) is signaled with
`NotFound` and if you try to read past the end of matching documents you will
get an ''end of data'' error from the driver.
The `valid` function is used to poll whether there is any remaining data.
Finally, the call to `reset` is important here because it doesn't just end the
query, it will issue a `kill_cursors` operation to the server to tell it to
delete the cursor (cursors time out after 10 minutes by default on the MongoDB
server).
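The interplay between `next`, `query`, `get_more` and `kill_cursors` described above can be modelled in a few lines. This Python sketch is a simplification under stated assumptions: `FakeServer` and `Cursor` are hypothetical classes, the cursor is mutable (whereas the Opa functions return a new cursor value), and the model always asks the server once more at the end of the data instead of tracking the server's cursor id.

```python
class FakeServer:
    """Stand-in for a MongoDB server that hands out one result set in batches."""

    def __init__(self, docs, batch_size):
        self.docs = list(docs)
        self.batch_size = batch_size
        self.position = 0
        self.requests = []              # log of operations received

    def _batch(self, operation):
        self.requests.append(operation)
        batch = self.docs[self.position:self.position + self.batch_size]
        self.position += len(batch)
        return batch

    def query(self):
        return self._batch("query")

    def get_more(self):
        return self._batch("get_more")

    def kill_cursors(self):
        self.requests.append("kill_cursors")


class Cursor:
    def __init__(self, server):
        self.server = server
        self.buffer = []                # documents left over from the last reply
        self.started = False
        self.current = None             # current document, or "NotFound"

    def next(self):
        if not self.buffer:             # only contact the server when empty
            fetch = self.server.get_more if self.started else self.server.query
            self.buffer = fetch()
            self.started = True
        self.current = self.buffer.pop(0) if self.buffer else "NotFound"
        return self

    def reset(self):
        self.server.kill_cursors()      # ask the server to delete its cursor
        self.buffer, self.started, self.current = [], False, None


server = FakeServer(["doc1", "doc2", "doc3"], batch_size=2)
cursor = Cursor(server)
for _ in range(4):
    print(cursor.next().current)
cursor.reset()
print(server.requests)
```

The second document is served from the buffered reply without any server traffic, and the third triggers a `get_more`.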
This method works fine but this logic has been wrapped up into some convenience
functions:
* `find_one` returns the first matching document as a `Mongo.result`
* `find_all` gives all the matches as a list of documents (use the `limit` function to limit the number of replies).
For example:
```
/* Find all objects in db.session, excluding the _id field */
mongo_session_no_id =
MongoConnection.fields(MongoConnection.namespace(mongodb,"db","session"),{some:[H.i32("_id",0)]})
println("findAll: {MongoCommon.pretty_of_results(MongoConnection.Cursor.find_all(mongo_session_no_id,[]))}")
```
You can also define custom loops over the matches using `start` (or `find`) in
conjunction with `next` and `valid`.
Note that you must use the `MongoConnection.Cursor.for` loop instead of the
more usual `for` function in the Opa stdlib: you need to check `valid` and
only call `next` if the cursor is still valid at that point, otherwise you will
miss the last document in the list of matches.
//Commands
//~~~~~~~~
Collections
-----------
While you can achieve anything that MongoDB is capable of using the low-level
drivers, there are no guarantees of type safety while converting between BSON
documents and Opa values.
You can of course base your entire project around BSON values and eliminate
the need for converting between MongoDB's documents and Opa types altogether but
this may not be very convenient depending upon what is happening elsewhere in
your application.
Secondly, to use the low-level drivers requires an investment in learning
MongoDB's powerful but rather complex interface (which may be new to users of
relational databases) in order to exploit what MongoDB has to offer.
Finally, basing your application on MongoDB's API will tie your application to
MongoDB and you may at some point in the future wish to migrate to other
database solutions.
Ultimately, the intention is to provide an abstract view of the database which
is general enough to encompass several of the existing database solutions, of
which MongoDB is an important player, and support this with compiler-generated
syntax in the manner of the Opa inbuilt database.
This support is still not available but we can offer an intermediate layer of
programming MongoDB whereby we assume collections of Opa types and support
type-safety by performing run-time type-checks on operations over these
collections.
This support is in the form of the `MongoCollection` module plus some support
modules for generating values suitable to be applied to these functions.
### The `collection` type
The central idea in the `MongoCollection` module is a collection (in the MongoDB
terminology sense) of Opa values.
This is embodied in the `Mongo.collection` type, which is extremely simple: it
is just a `MongoConnection` value cast to the specific type of the values to be
stored in the collection:
```
type Mongo.collection('a) = {
Mongo.mongodb db /* the mongodb connection */
}
```
When a value is stored in the collection it is automatically converted from its
Opa type into a matching BSON document and _vice versa_ for queries.
While this sounds simple there are a number of pitfalls to watch out for.
We assume that any offline modifications of the collection will not
create any incompatible values.
If, for example, we add or delete a field from a record then the entry can no
longer be represented as an Opa type.
To overcome this problem we place checks in the code to verify the suitability
of documents read from the collection and an error will be generated if any such
values are found.
We also provide features to allow handling of this situation in some specific
circumstances, for example, if you type a field in the collection as
`Bson.register` it will allow you to successfully read in values with missing
fields but this is not recommended for collections.
Ultimately, it is up to the maintainer of the database to ensure that the values
stored there are consistent with the application's usage of the collection.
Despite these provisos, using a collection is very simple and gives the
programmer the ability to integrate Opa types with the MongoDB system without
having to understand the underlying complexity of the database and with a modest
level of type-safety.
The cost, for the moment, is the overhead of the run-time type-checks which will
slow down database operations.
### Programming with collections
A simple dialog for creating and manipulating a collection might be as follows:
```
/* The type of our first collection */
type t = {int i}
/* Create a collection of type t */
Mongo.collection(t) c1 = MongoCollection.openfatal("default","db","collection")
/* Put a single value into the collection */
result = MongoCollection.insert_result(c1,{i:0})
/* Finally, destroy the collection */
MongoCollection.destroy(c1)
```
We define a type for the collection (`type t`) so that when we open a connection
to the database we can cast the resulting collection object and thus install the
correct run-time representation of the type.
The `openfatal` function returns a collection and treats a connection failure as
fatal.
There are several variants of the `open` function.
A collection is a pointer to a specific collection in the database (here,
`db.collection`) and we create a connection to the MongoDB server using the
connection name (in this instance, `default`).
Inserting a value into the collection is trivial: the value is simply passed as
it is to the `insert` function (here we use the safe `insert_result` function
which also returns the result of a `getlasterror` call).
The insert has exactly the same effect as a call to `MongoConnection.insert` but
with the value automatically converted into a BSON document using the scheme
outlined above.
The call to `MongoCollection.destroy` should not be forgotten because this
closes the underlying connection.
While the `insert` function is trivial, we need more care with `update` and
`delete`.
The problem is that to maintain our level of type-safety we need to match select
(and update) documents with the type of the collection they are applied to.
We do this with a system of run-time type-checks applied to the select
documents.
For example:
```
/* Create pre-typed select and update generation functions */
(Bson.document -> Mongo.select(t)) createst = MongoSelect.create
(Bson.document -> Mongo.update(t)) createut = MongoUpdate.create
/* Generate the select documents */
select = createst(MongoSelectUpdate.int64(MongoSelectUpdate.empty(),"i",0))
update = createut(MongoSelectUpdate.inc(MongoSelectUpdate.int64(MongoSelectUpdate.empty(),"i",1)))
/* We can now apply update to these documents */
result = MongoCollection.update_result(c1,select,update)
```
Firstly, we use the `MongoSelectUpdate` module to generate the basic documents.
Note that we could also have used the `Bson.opa2doc` function to achieve the
same result:
```
select = createst(Bson.opa2doc({i:0}))
update = createut(Bson.opa2doc({`$inc`:{i:1}}))
```
The choice between these two styles may depend upon the type of document being
generated.
The Opa type-based versions are more readable but the `MongoSelectUpdate` ones
are much faster since no conversion is required.
The select documents have to be correctly typed for the collection they apply to
so we generate a couple of convenience functions `createst` and `createut` to do
the casting for us.
Secondly, once we have these documents we can apply the `update` function to
them but note that although a select document is just a typed `Bson.document` it
triggers a set of suitability tests.
These tests are complex and probably do not cover all possible MongoDB
operations but, briefly, the select document is checked against a knowledge base
of MongoDB operators and where each one may appear: for example, `$inc` only
applies to updates, `$and` only applies to selects, whereas `$comment` can apply
to both.
Once the status (select/update/both) is determined, the type of the resulting
values is determined from the select document and is verified to be a subtype of
the type of the collection.
So, for example, `{int a}` is a subtype of `{int a, string b}` but `{int a, bool c}`
is not.
Presently, we only print a suitable warning but in future, once these routines
have fully matured we may return an error value.
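As a minimal sketch of the subtype rule above, consider a hypothetical collection type `u` (it is not one of the types used elsewhere in this chapter). The sketch assumes that the result of `MongoSelect.create` can be annotated directly with the select type, in the same way that `c1` was annotated earlier:
```
/* Hypothetical collection type, mirroring the subtype example in the text */
type u = {int a, string b}
/* {int a} is a subtype of u, so this select should pass the check */
Mongo.select(u) s_ok = MongoSelect.create(Bson.opa2doc({a:1}))
/* {int a, bool c} is not a subtype of u, so we would expect a warning for it */
Mongo.select(u) s_warn = MongoSelect.create(Bson.opa2doc({a:1, c:true}))
```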
All of the basic database write operations occur in both send-and-forget and in
send-with-getlasterror forms: `insert`, `insert_result`, `insert_batch`,
`insert_batch_result`, `update`, `update_result`, `delete` and `delete_result`.
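The two forms might be used side by side as follows. This is only a sketch: it reuses `c1` from the earlier example, and the argument shape of `insert_batch` (a list of values of the collection type) is an assumption rather than something stated above.
```
/* Send-and-forget: no acknowledgement is requested from the server */
_ = MongoCollection.insert(c1,{i:1})
/* Send-with-getlasterror: the result of the getlasterror call is returned */
result = MongoCollection.insert_result(c1,{i:2})
/* Batch form (assumed here to take a list of values) */
_ = MongoCollection.insert_batch(c1,[{i:3},{i:4}])
```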
As an aside, notice that we use a similar functional interface for flags as for
the low-level code:
```
MongoCollection.delete(MongoCollection.singleRemove(c1),createst(Bson.opa2doc({i:104})))
```
The select mechanism applies to queries as well but in this case we have to be
careful what types we return from the database:
```
result = MongoCollection.find_one(c1,createst(Bson.opa2doc({`$where`:(Bson.code "this.i > 106")})))
match (result) {
case {success:{~i}}: println("i={i}")
case {~failure}: println("error={MongoCommon.string_of_failure(failure)}")
}
```
This example returns the first value in the collection for which `i` is greater
than 106; it expresses the select as a Javascript expression.
Many of the MongoDB query methods, such as the `$where` example here, are
perfectly safe with collections.
Others are not safe, in that they return documents which contain fields other
than those in the Opa type.
A good example is the [`$explain`](http://www.mongodb.org/display/DOCS/Explain)
documents, which are a set of statistical data concerning the given query (see
the `Mongo.explainType` type in `MongoCommands`).
In general, we attempt to support such features with special purpose functions
rather than via the normal database operations.
The usual simplified query functions are present in `MongoCollection`,
`find_one` and `find_all`.
There are also two functions which return the bare `Bson.document`
representation of the result, `find_one_doc` and `find_all_doc` which may be
useful in the above situation where the result of the query is not compatible
with Opa types.
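A sketch of `find_one_doc` is given here, on the assumption that it takes the same arguments as `find_one` and returns the raw document in the `success` case. `List.length` is applied only on the further assumption that a `Bson.document` is a list of elements, as the list literals in the low-level examples suggest:
```
/* Fetch the raw BSON document rather than a value of type t */
match (MongoCollection.find_one_doc(c1,createst(Bson.opa2doc({i:0})))) {
case {success:doc}: println("raw document with {List.length(doc)} elements")
case {~failure}: println("error={MongoCommon.string_of_failure(failure)}")
}
```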
For more general query scanning, the cursor-based routines are available.
For example, the following code scans the results of a `MongoCollection` query:
```
query = createst(Bson.opa2doc({i:{`$gt`:102, `$lt`:106}}))
match (MongoCollection.query(MongoCollection.limit(c1,0),query)) {
case {success:cc1}:
cc1 =
while(cc1,(function(cc1) {
match (MongoCollection.next(cc1)) {
case (cc1,{success:{~i}}):
println("i={i}")
(cc1,MongoCollection.has_more(cc1))
case (cc1,{~failure}):
println("error={MongoCommon.string_of_failure(failure)}")
(cc1,false)
}
}))
MongoCollection.kill(cc1)
case {~failure}:
println("error={MongoCommon.string_of_failure(failure)}")
}
```
In this code, we create a `Mongo.collection_cursor` object using
`MongoCollection.query` to which we can then apply the collection-specific cursor
functions `MongoCollection.next` and `MongoCollection.has_more`.
This allows arbitrary processing of collection queries.
Remember, as with the low-level cursors above, that the `MongoCollection.kill`
function does not just end the scan, it also sends a `kill_cursors` message to
the MongoDB server to tell it to destroy the cursor.
Another aside in this code is that we set the `limit` value to `0` which means
"use the default number of documents per reply".
If we had set this to `1` we would only ever get one document in the reply
because MongoDB treats this as a special case, i.e. "just return one document".
Again, to help with the situation where return values may be incompatible with
Opa types, we provide the `_unsafe` variants of the query functions.
These, for example `query_unsafe`, take an additional boolean flag,
`ignore_incomplete` which instructs the driver to simply ignore any return
documents which have missing fields and are thus not compatible with Opa types.
MongoDB will actually return partial documents if the document meets the query
document but does not contain all of the fields (an exception is the `_id` field
which is always returned unless specifically excluded with the return field
selector document).
These functions should be used with care.
Apart from the support described here the `MongoCollection` module also provides
a few convenience functions such as creating indexes using collection objects
and some direct support for some of the aggregation functions (`count`,
`distinct` and `group`).
Finally, two of the variants of the `open` function, `openpkg` and
`openpkgfatal`, supply a set of pre-cast versions of `MongoSelect.create` and
`MongoUpdate.create`.
Example: Hello, MongoDB wiki
----------------------------
In this section, we describe how to convert the `hello_wiki` example described
in the [previous chapter](/manual/Hello--wiki) to using the MongoDB database.
This is actually a simple process and uses MongoDB as a simple key-value storage
database.
The first task is to open a connection to the database.
We are going to use collections and in fact, we will use the version of `open`
which also gives us the casting functions for selects:
```
/**
* The basic info. about the database and table location.
*/
type page = {
string _id,
Bson.int32 _rev,
string content
}
/**
* We work at level 1, run-time type-checked storage of a collection of Opa values.
* The Mongo.pkg type provides convenience functions for building select and update documents.
**/
Mongo.pkg(page) (wiki_collection,wiki_pkg) = MongoCollection.openpkgfatal("default","db","wiki");
function pageselect(v) { wiki_pkg.select(Bson.opa2doc(v)); }
function pageupdate(v) { wiki_pkg.update(Bson.opa2doc(v)); }
```
The `_rev` field has been cast to `Bson.int32` so we can use 32-bit integers for
this field (it is unlikely we will ever have more than about two billion
revisions, the limit of a signed 32-bit integer, of any value in the database!).
We then open our connection using the default named connection and connect to
the collection `db.wiki`.
This returns a collection object plus a package of values which we use to build
our select documents.
Next we are actually going to search for documents including the `_rev` field so
we can't just use the default index for our collection (the `_id` field):
```
/**
* Indexes aren't automatic in MongoDB apart from the non-removable _id index.
* Since we're searching on _rev as well, we need a separate index.
**/
MongoCollection.create_index(wiki_collection, "db.wiki", Bson.opa2doc({_id:1, _rev:1}), 0)
```
The `get_content` function can then be modified using a simple call to
`MongoCollection.find_one`:
```
function get_content(docid) {
default_page = "This page is empty. Double-click to edit."
function extract_content(page record) { record.content }
/* Order by reverse _rev to get highest numbered _rev. */
orderby = {some:Bson.opa2doc({_rev:-1})}
match (MongoCollection.find_one(MongoCollection.orderby(wiki_collection,orderby),pageselect({_id:docid}))) {
case {success:page}: extract_content(page)
case {failure:{NotFound}}: default_page
case {~failure}:
jlog("hello_wiki_mongo: failure={MongoCommon.string_of_failure(failure)}")
default_page
}
}
```
We search the database for the given `_id` value but we want the
highest-numbered `_rev` field so we sort by inverse order on that field (the
default ordering for numerical fields is in increasing order).
A missing document is signaled by the `NotFound` failure condition; any other
`failure` value is an error.
Finally, the `save_source` function becomes a call to
`MongoCollection.update_result`: