Skip to content

Latest commit

 

History

History
263 lines (186 loc) · 9.18 KB

tutorial.md

File metadata and controls

263 lines (186 loc) · 9.18 KB

MongoDB Haskell Mini Tutorial

Author: Brian Gianforcaro (b.gianfo@gmail.com)

Updated: 2/28/2010

This is a mini tutorial to get you up and going with the basics of the Haskell mongoDB drivers. It is modeled after the pymongo tutorial.

You will need the mongoDB bindings installed as well as mongo itself installed.

$ = command line prompt
> = ghci repl prompt

Installing Haskell Bindings

From Source:

$ git clone git://github.com/srp/mongoDB.git
$ cd mongoDB
$ runhaskell Setup.hs configure
$ runhaskell Setup.hs build
$ runhaskell Setup.hs install

From Hackage using cabal:

$ cabal install mongoDB

Getting Ready

Start a MongoDB instance for us to play with:

$ mongod

Start up a haskell repl:

$ ghci

Now We'll need to bring in the MongoDB/BSON bindings:

> import Database.MongoDB
> import Database.MongoDB.BSON

Making A Connection

Open up a connection to your DB instance, using the standard port:

> con <- connect "127.0.0.1" []

or for a non-standard port

> import Network
> con <- connectOnPort "127.0.0.1" (Network.PortNumber 666) []

By default mongoDB will try to find the master and connect to it and will throw an exception if a master can not be found to connect to. You can force mongoDB to connect to the slave by adding SlaveOK as a connection option, eg:

> con <- connect "127.0.0.1" [SlaveOK]

Databases, Collections and FullCollections

As many database servers, MongoDB has databases--separate namespaces under which collections reside. Most of the APIs for this driver request the FullCollection which is simply the Database and the Collection concatenated with a period.

For instance 'myweb_prod.users' is the the FullCollection name for the *Collection 'users' in the database 'myweb_prod'.

Databases and collections do not need to be created, just start using them and MongoDB will automatically create them for you.

In the below examples we'll be using the following FullCollection:

> import Data.ByteString.Lazy.UTF8
> let postsCol = (fromString "test.posts")

You can obtain a list of databases available on a connection:

> dbs <- databaseNames con

You can obtain a list of collections available on a database:

> cols <- collectionNames con (fromString "test")
> map toString cols
["test.system.indexes"]

Documents

Data in MongoDB is represented (and stored) using JSON-style documents. In mongoDB we use the BsonDoc type to represent these documents. At the moment a BsonDoc is simply a tuple list of the type '[(ByteString, BsonValue)]'. Here's a BsonDoc which could represent a blog post:

> import Data.Time.Clock.POSIX
> now <- getPOSIXTime
> :{
  let post = [(fromString "author", BsonString $ fromString "Mike"),
              (fromString "text",
               BsonString $ fromString "My first blog post!"),
              (fromString "tags",
               BsonArray [BsonString $ fromString "mongodb",
                          BsonString $ fromString "python",
                          BsonString $ fromString "pymongo"]),
              (fromString "date", BsonDate now)]
  :}

With all the type wrappers and string conversion, it's hard to see what's actually going on. Fortunately the BSON library provides conversion functions toBson and fromBson for converting native between the wrapped BSON types and many native Haskell types. The functions toBsonDoc and fromBsonDoc help convert from tuple lists with plain String keys, or Data.Map.

Here's the same BSON data structure using these conversion functions:

> :{
  let post = toBsonDoc [("author", toBson "Mike"),
                        ("text", toBson "My first blog post!"),
                        ("tags", toBson ["mongoDB", "Haskell"]),
                        ("date", BsonDate now)]
  :}

Inserting a Document

To insert a document into a collection we can use the insert function:

> insert con postsCol post
BsonObjectId 23400392795601893065744187392

When a document is inserted a special key, _id, is automatically added if the document doesn't already contain an _id key. The value of _id must be unique across the collection. insert returns the value of _id for the inserted document. For more information, see the documentation on _id.

After inserting the first document, the posts collection has actually been created on the server. We can verify this by listing all of the collections in our database:

> cols <- collectionNames con (fromString "test")
> map toString cols
[u'postsCol', u'system.indexes']
  • Note The system.indexes collection is a special internal collection that was created automatically.

Getting a single document with findOne

The most basic type of query that can be performed in MongoDB is findOne. This method returns a single document matching a query (or Nothing if there are no matches). It is useful when you know there is only one matching document, or are only interested in the first match. Here we use findOne to get the first document from the posts collection:

> findOne con postsCol []
Just [(Chunk "_id" Empty,BsonObjectId (Chunk "K\151\153S9\CAN\138e\203X\182'" Empty)),(Chunk "author" Empty,BsonString (Chunk "Mike" Empty)),(Chunk "text" Empty,BsonString (Chunk "My first blog post!" Empty)),(Chunk "tags" Empty,BsonArray [BsonString (Chunk "mongoDB" Empty),BsonString (Chunk "Haskell" Empty)]),(Chunk "date" Empty,BsonDate 1268226361.753s)]

The result is a dictionary matching the one that we inserted previously.

  • Note: The returned document contains an _id, which was automatically added on insert.

findOne also supports querying on specific elements that the resulting document must match. To limit our results to a document with author "Mike" we do:

> findOne con postsCol $ toBsonDoc [("author", toBson "Mike")]
Just [(Chunk "_id" Empty,BsonObjectId (Chunk "K\151\153S9\CAN\138e\203X\182'" Empty)),(Chunk "author" Empty,BsonString (Chunk "Mike" Empty)),(Chunk "text" Empty,BsonString (Chunk "My first blog post!" Empty)),(Chunk "tags" Empty,BsonArray [BsonString (Chunk "mongoDB" Empty),BsonString (Chunk "Haskell" Empty)]),(Chunk "date" Empty,BsonDate 1268226361.753s)]

If we try with a different author, like "Eliot", we'll get no result:

> findOne con postsCol $ toBsonDoc [("author", toBson "Eliot")]
Nothing

Bulk Inserts

In order to make querying a little more interesting, let's insert a few more documents. In addition to inserting a single document, we can also perform bulk insert operations, by using the insertMany api which accepts a list of documents to be inserted. This will insert each document in the iterable, sending only a single command to the server:

> now <- getPOSIXTime
> :{
  let new_postsCol = [toBsonDoc [("author", toBson "Mike"),
                                 ("text", toBson "Another post!"),
                                 ("tags", toBson ["bulk", "insert"]),
                                 ("date",  toBson now)],
                      toBsonDoc [("author", toBson "Eliot"),
                                 ("title", toBson "MongoDB is fun"),
                                 ("text", toBson "and pretty easy too!"),
                                 ("date", toBson now)]]
  :}
> insertMany con postsCol new_posts
[BsonObjectId 23400393883959793414607732737,BsonObjectId 23400398126710930368559579137]
  • Note that new_posts !! 1 has a different shape than the other posts - there is no "tags" field and we've added a new field, "title". This is what we mean when we say that MongoDB is schema-free.

Querying for More Than One Document

To get more than a single document as the result of a query we use the find method. find returns a cursor instance, which allows us to iterate over all matching documents. There are several ways in which we can iterate: we can call nextDoc to get documents one at a time or we can get a lazy list of all the results by applying the cursor to allDocs:

> cursor <- find con postsCol $ toBsonDoc [("author", toBson "Mike")]
> allDocs cursor

Of course you can use bind (>>=) to combine these into one line:

> docs <- find con postsCol (toBsonDoc [("author", toBson "Mike")]) >>= allDocs
  • Note: nextDoc automatically closes the cursor when the last document has been read out of it. Similarly, allDocs automatically closes the cursor when you've consumed to the end of the resulting list.

Counting

We can count how many documents are in an entire collection:

> num <- count con postsCol

Or we can query for how many documents match a query:

> num <- countMatching con postsCol (toBsonDoc [("author", toBson "Mike")])

Range Queries

No non native sorting yet.

Indexing

WIP - coming soon.

Something like...

> index <- createIndex con testcol [("author", Ascending)] True