/
ImageSearch.lhs
1573 lines (1227 loc) · 65.1 KB
/
ImageSearch.lhs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
% Using the Dropbox API from Haskell
% Rian Hunter
% 1/2/2012
Introduction
------------
I love [Haskell](http://www.haskell.org/). My first encounter
with Haskell started out about eight
years ago. Like many people in those days, when I was in high school
I spent a lot of time playing around with code on my computer. Reading
and understanding open source projects was a main source of knowledge
and inspiration for me when I was learning how to program. When I came
upon the bzip2 homepage and consequently
[Julian Seward](http://en.wikipedia.org/wiki/Julian_Seward)'s homepage
I found a short note about Haskell and how
it was a super fun and interesting language to write a compiler for.
Haskell? What's that?
After reading more about Haskell, functional programming, lazy evaluation,
and type inference and seeing the elegance of the various code samples,
I was hooked. I spent the next couple of weeks going through "[Yet
Another Haskell Tutorial](http://www.cs.utah.edu/~hal/htut/)"
and I remember it being incredibly
difficult yet incredibly rewarding. After I wrote my first fold over a
recursive algebraic datatype, I felt like I was finally starting
to speak Haskell. I felt like I had massively improved as a programmer.
While I've been at Dropbox, [Python](http://www.python.org/)
has been my main language of computational expression.
Even though it was a bit rocky at first, I've grown
to really love Python and the culture of fast iteration and
[duck typing](http://en.wikipedia.org/wiki/Duck_typing).
Like a good Pythonista, I'm of the opinion that types are training wheels,
but that's really only until you use a language with a real type system.
In C# or Java, types can get in your way and even force you to
write overly verbose code or follow silly "design patterns." In Haskell, types help
you soar to higher computational ground. They encourage you to
model your data in coherent, concise, and elegant ways that feel
right. They aren't annoying.
More people should use Haskell. The steep learning curve forces
you to understand what you are doing at a deeper level and you
will be a better programmer because of it. To help that happen, this
post will be in a
[semi-literate programming](http://en.wikipedia.org/wiki/Literate_programming)
style and I'll be describing a
[Dropbox API](https://www.dropbox.com/developers)
app written in Haskell. This won't be like a normal tutorial
so you'll probably have to do a bit of supplemental
reading and practice afterward. The goal is to give you a
flavor of what a real program in Haskell looks like.
This post assumes no previous knowledge with Haskell but
it does assume moderate programming ability in another
language, e.g. C, C++, Ruby, Python, Java, or Lisp.
Since this post does not assume previous
Haskell experience the beginning will be more of a
Haskell tutorial and core concepts will be sectioned
off to facilitate the learning process.
This post is a published version of a
[Literate Haskell](http://www.haskell.org/haskellwiki/Literate_programming#Bird_Style) file.
Code lines prefixed with the "`>`" character are actually part of the final program.
This makes it so that it's possible to simply copy & paste the text
here into your favorite editor and run it, just make sure you save the
file with a ".lhs" extension. If you ever get tired
of reading you can get
[the real source](http://github.com/dropbox/image-search)
at the GitHub repo.
The Haskell implementation we'll be using is
[The Haskell Platform](http://hackage.haskell.org/platform/),
it's a full stack of Haskell tools prepackaged to work
out of the box on all three major desktop operating systems.
It's based on [GHC](http://haskell.org/ghc/), The Glasgow Haskell Compiler.
GHC is an advanced optimizing compiler
focused on producing efficient code.
Image Search for Dropbox
------------------------
Years ago [Robert Love](http://en.wikipedia.org/wiki/Robert_Love) of Linux kernel hacker fame wrote a
[FUSE](http://fuse.sourceforge.net/)
file system that made it so that user-created folders were populated with
[Beagle](http://en.wikipedia.org/wiki/Beagle_(software))
search results using the folder name as the search query. It was called
[beaglefs](http://svn.gnome.org/svn/beagle/trunk/beaglefs/README).
The point was
to demonstrate the power of user-space file systems, notably the
power of having so much more library code available to you than in
kernel-space.
We can do a similar thing with the Dropbox API. We're going
to write a hypothetical Dropbox API app that populates user-created
folders with
[Creative Commons](http://creativecommons.org/)
licensed images found by performing a web image search
using the folder name as the search term. Using Dropbox, all the
user has to do to perform an image search is simply create a folder.
Let's get started!
LANGUAGE Pragma
---------------
> {-# LANGUAGE TypeFamilies #-}
> {-# LANGUAGE QuasiQuotes #-}
> {-# LANGUAGE MultiParamTypeClasses #-}
> {-# LANGUAGE FlexibleContexts #-}
> {-# LANGUAGE TemplateHaskell #-}
> {-# LANGUAGE DeriveDataTypeable #-}
Despite being more than two decades old, Haskell is still evolving.
You can tell your Haskell compiler to allow newer language
features using this syntax, this is called the LANGUAGE Pragma.
Please don't worry about what these exact language extensions do just yet,
you can read more in the [GHC docs](http://haskell.org/ghc/docs/latest/html/users_guide/pragmas.html#language-pragma "Language Pragma").
Import Declarations
-------------------
> import Yesod
This is an import declaration. Declarations like these inform the Haskell module
system that I am going to use definitions from this other module. A module in Haskell
is a collection of values (functions are values in Haskell too!), datatypes,
type synonyms, type classes, etc.
Here I am telling the module system to bring in all definitions from the module
`Yesod` in the namespace of this module. If I wanted to I could also access
those definitions prefixed with "Yesod." similar to Java.
[Yesod](http://www.yesodweb.com/) is a fully-featured modern web-framework for Haskell. We'll be using
it to create the web interface of our API app.
> import Text.XML.HXT.Core hiding (when)
This is a another import declaration. This is just like the Yesod import
except we use the "hiding" syntax to tell the Haskell module system to not
import the "when" definition into this module's namespace.
The module we are importing is the main module from
[HXT](http://www.fh-wedel.de/~si/HXmlToolbox/index.html), the
XML processing library that we use to parse out the search results from an
image search result page. More details on this much later.
> import Control.Applicative ((<$>), (<*>))
> import Control.Concurrent (forkIO, threadDelay, newChan,
> writeChan, readChan, Chan, isEmptyChan)
> import Control.Monad (forM_, when)
> import Control.Monad.Base (MonadBase)
> import Control.Monad.Trans.Control (MonadBaseControl)
> import Data.Maybe (fromJust, mapMaybe, isNothing)
> import Data.Text.Encoding (decodeUtf8With)
> import Data.Text.Encoding.Error (ignore)
> import Data.Typeable (Typeable)
> import Network.HTTP.Enumerator (simpleHttp)
> import System.FilePath.Posix (takeFileName, combine)
These import declarations are slightly different. Instead of bringing in
all names from the modules I am only bringing in specific names. This is very
similar to the "from module import name" statement in Python.
> import qualified Control.Exception.Lifted as C
> import qualified Data.ByteString as B
> import qualified Data.ByteString.Lazy as L
> import qualified Data.List as DL
> import qualified Data.Map as Map
> import qualified Data.Text as T
> import qualified Data.URLEncoded as URL
> import qualified Dropbox as DB
> import qualified Network.URI as URI
Remember how I said that I could also access names by prefixing them with
"Yesod." earlier? Adding `qualified` to the import declaration makes it so
you must refer to names in other modules using the module prefix. The
"as C" part in the first line makes it so that I can do `C.try` instead of
`Control.Exception.Lifted.try`.
Algebraic Datatypes
-------------------
> data ImageSearch = ImageSearch (Chan (DB.AccessToken, String)) DB.Config
This is an *algebraic datatype* declaration. Nevermind what algebraic means
for the moment, this is the basic way to define new types in Haskell.
Even though it's very short this little code actually does
a couple of things:
1. It defines a type called `ImageSearch`. This is the immediate text after "data".
2. It defines a function called `ImageSearch` that takes two arguments,
the first of type `Chan (DB.AccessToken, String)`, the second of type `DB.Config`,
and it returns a value of type `ImageSearch`, **ignore
these complex types for now they aren't important**. This function
is called a *constructor* in Haskell.
Constructors play a big role in Haskell. Wherever a name
can be bound to a value in Haskell, you can also use contructor pattern matching,
or *deconstruction*,
to extract out the data contained within that value. Here's
a function to get out the channel component of our `ImageSeach`
type:
< imageSearchChan (ImageSearch chan config) = chan
`imageSearchChan` takes in an `ImageSearch` argument and
return the channel wrapped inside of it. You'll see deconstruction
a lot more later.
So far we've defined a type called `ImageSearch` and we've
also defined a function called `ImageSearch`. This is okay because in
Haskell **type names and value names live in different namespaces**.
Maybe Type
----------
The `Maybe` type is one algebraic datatype that you'll
see a lot in Haskell code. It's often used to denote an error
value from a function or an optional argument to a function.
< data Maybe a = Nothing | Just a
Unlike `ImageSearch`, the `Maybe` type has not one but two
constructors: `Nothing` and
`Just`. You can use either constructor to create a value in the `Maybe` type.
A value of `Nothing` usually denotes an error in the `Maybe` type. A `Just`
value indicates success.
Another difference from our `ImageSearch` type is that `Maybe` isn't a concrete
type on its own; **it has to wrap some other type**. In Haskell, a "higher-order"
type like this is called a *type constructor*; it takes a concrete type and
returns a new concrete type. For example, `Maybe Int` or `Maybe String` are two
concrete types created by the `Maybe` type constructor. This is similar
to generics, like `List<T>`, in Java or C#. We use the *type variable*
"`a`" in the declaration to show that the `Maybe` type constructor can be
applied to any concrete type.
It's important to note
that **type constructors, like `Maybe`, are different from data constructors,
like `Nothing` or `Just`**.
Type Signatures & Currying
--------------------------
> toStrict :: L.ByteString -> B.ByteString
> toStrict = B.concat . L.toChunks
This is a function definition in Haskell. I'll explain the definition
in the next section but first I wanted to talk about the use of "`::`" since it keeps
coming up. This is how we explicitly tell the Haskell compiler what type
we expect a value to be. Even though a Haskell compiler is good enough to infer
the types of all values in most cases,
it's considered good practice to at least explicitly specify
the types of top-level definitions.
The arrow notation in the type signature denotes a function.
`toStrict` is a function that takes an `L.ByteString` value
and returns a `B.ByteString` value.
One cool thing about Haskell is that functions can only take one argument.
I know it doesn't sound cool but it's actually really cool. How do you
specify a function that takes two arguments you ask? Well that's a
function that takes a single argument and returns another function that
takes another argument. This reformulation of multiple-argument functions
is called *currying*. Currying is cool because it allows us
to easily do *partial application* with functions, you'll see examples of
this later.
Here's the type signature for a function that takes two arguments of
any two types and returns a value in the second type:
< twoArgFunc :: a -> b -> b
We use the type variables "`a`" and "`b`" to indicate that `twoArgFunc`
is *polymorphic*, i.e. we can apply `twoArgFunc` to
any two values of any two respective types.
With currying in mind, it shouldn't be too hard to figure out that the
arrow is right-associative, i.e. "`a -> b -> b`" actually means
"`a -> (b -> b)`". That would make sense, `twoArgFunc` is a function that
takes an argument and returns another function that takes an argument and
returns the final value. Say that over and over again until you understand it.
What if we grouped the arrow the other way?
< weirdFunc :: (a -> b) -> b
In this case `weirdFunc` is a function that takes another function as a
its sole argument and returns a value. This is much different from
`twoArgFunc` which instead returns a second function after it accepts its
first argument. Passing functions to functions like this is a
common idiom in Haskell and it's one of the strengths of a language
where functions are ordinary values just like integers and strings.
Functions
---------------------
The definition of `toStrict` makes use of the function composition operator, "`.`", but
Haskell functions don't have to be defined this way. Here's a function defined using
a variable name:
< square x = x * x
What about two arguments?
< plus x y = x + y
We can define a function that adds five to its argument by making using of currying:
< addFive = plus 5
We curried in the `5` argument to the `plus` function and just as we said, it returns a
new function that takes an argument and adds a five to it:
< addFive 2 == 7
Another way to define functions is to use a *lambda abstraction*.
You can write a function as an expression by using the backslash and arrow symbols.
Lambda abstractions are also known as *anonymous functions*. Here's another way
to define `plus`:
< plus = \x y -> x + y
All functions in Haskell are pure. This means that functions in Haskell cannot
do anything that causes side-effects like changing a global variable or performing IO.
More on this later.
Operators
---------
Okay now that you're cool with functions let's get back to `toStrict`. Here's
another way to define it using a lambda:
< toStrict = \x -> B.concat (L.toChunks x)
Instead, we define `toStrict` using the function composition operator, "`.`".
The function composition operator takes two functions
and returns a new function that passes its argument to the right
function and the result of that is passed to the left function and the result of that is
returned. Here's one possible definition:
< f . g = \x -> f (g x)
Yes this is a real way to define an operator in Haskell!
The function composition operator comes standard in Haskell but even if it didn't
you'd still be able to define it yourself. In Haskell, operators and functions
are actually two different syntaxes for the same thing.
The form above is the infix form but you can also use an operator in the prefix form, like
a normal function. Here's another way to define "`.`":
< (.) f g = \x -> f (g x)
In this definition of the function composition
operator we use prefix notation. It is only necessary to surround an operator
with parentheses to transform it into its prefix form.
Switching between infix and prefix notation
works for any operator, e.g. you
can add two numbers with `(+) 4 5` in Haskell.
The ability to switch
between prefix and infix notation isn't limited to operators, you can do it with
functions too by surrounding the function name with backticks, "`":
< div 1 2 == 1 `div` 2
This is useful for code like: ``"Dropbox" `isInfixOf` "The Dropbox API is sick!"``
One last thing about operators; Haskell has special syntax to curry
in arguments to operators in infix form. For example, `(+5)` is a function
that takes in a number and adds five to that number. These are called
*sections*. To further illustrate,
all of the following functions do the same thing:
* `(+5)`
* `(5+)`
* `\x -> x + 5`
* `(+) 5`
* `plus 5`
* `addFive`
With operator currying it's important to recognize that **the side you curry
the argument in matters**. For example, `(.g)` and `(g.)` behave differently.
The "`$`" Operator
------------------
Next to the "`.`" operator there is another function-oriented
operator that you'll see often in Haskell code. This is the
function application operator and it's defined like this:
< f $ x = f x
Weird right? Why does Haskell have this?
In Haskell, normal function application has a higher precedence than any
other operator and it's left-associative:
< f g h j x == (((f g) h) j) x
Conversely, "`$`" has the lowest precedence of all operators and it's
right-associative:
< f $ g $ h $ j $ x == f (g (h (j x)))
Using "`$`" can make Haskell code more readable as an alternative to
using parentheses. It has other uses too, more on that later.
Lists
-----
> listToPair :: [a] -> (a, a)
> listToPair [x, y] = (x, y)
> listToPair _ = error "called listToPair on list that isn't two elements"
`listToPair` is a little function I use to convert a two-element *list* to
a two-element *tuple*, usually called a *pair*.
A list in Haskell is a higher order type that represents an ordered collection of same-typed
values. A list type is denoted using brackets, e.g. "`[a]`" is the polymorphic list of any inner type and
"`[Int]`" is a concrete type that denotes a list of `Int` values.
Unlike vectors or arrays in other languages, you can't index into a Haskell list in constant
time. It is more akin to the traditional [Linked List](http://en.wikipedia.org/wiki/Linked_list) data
structure.
You can construct a list in a number of ways:
* The empty list: `[]`
* List literal: `[1, 2, 3]`
* Adding to the front of a list: `1 : [2]`
The "`:`" operator in Haskell constructs a new list that starts with the left argument
and continues with the right argument:
< 1 : [2, 3, 4] == [1, 2, 3, 4]
One last thing about the list type in Haskell, it's not that special. We can define our own list type
very simply:
< data MyList a = Nil | Cons a (MyList a)
<
< -- using "#" as my personal ":" operator
< x # xs = Cons x xs
<
< myHead :: MyList a -> a
< myHead (Cons x xs) = x
< myHead Nil = error "empty list"
Yep, a list is just a recursive algebraic datatype.
`foldr` and `map`
-----------------
In the Lisp tradition, lists are a fundamental data structure in Haskell.
They provide an elegant model for solving problems that deal with multiple
values at once. While lists are still fresh in your mind
let's go over two functions that are essential to know when manipulating lists.
The first function is `foldr`. `foldr` means "fold from the right" and
it's used to build a aggregate value by visiting all the elements in a list.
It's arguments are an aggregating function, a starting value, and
a list of values. The aggregating function
accepts two argu--Screw it, let's just define it using recursion:
< foldr :: (a -> b -> b) -> b -> [a] -> b
< foldr f z [] = z
< foldr f z (x:xs) = f x $ foldr f z xs
Notice how we used the "`:`" operator to deconstruct
the input list into its head and tail components.
Like I said, you can use `foldr` to aggregate things in a list, e.g.
adding all the numbers in a list:
< foldr (+) 0 [1..5] == 15
"`[1..5]`" is [syntactic sugar](http://en.wikipedia.org/wiki/Syntactic_sugar) for all integers from
1 to 5, inclusive.
Another less useful thing you can do with `foldr` is copy a list:
< foldr (:) [] ["foo", "bar", "baz"] == ["foo", "bar", "baz"]
`foldr`'s brother is `foldl`; it means "fold from the left."
< foldl :: (b -> a -> b) -> b -> [a] -> b
< foldl f z [] = z
< foldl f z (x:xs) = foldl f (f z x) xs
`foldl` collects values in the list from the left while `foldr` starts
from the right. Notice how the type signature of the aggregating
function is reversed, this should help as a sort of mnemonic when
using `foldr` and `foldl`. Though it may not seem like it,
the direction of the fold matters a lot. As an exercise try
copying a list by using `foldl` instead of `foldr`.
`map` is another common list operation.
`map` is awesome! It takes a function
and a list and returns a new list with the user-supplied function
applied to each value in the original list.
[Recursion is kind of clunky](http://haskell.org/haskellwiki/Things_to_avoid#Avoid_explicit_recursion)
so let's define it using `foldr`:
< map :: (a -> b) -> [a] -> [b]
< map f l = foldr ((:) . f) [] l
Even though `foldr` is more primitive than `map` I find myself
using `map` much more often. Here's how you would map a list of strings
to a list of their lengths:
< map length ["a", "list", "of", "strings"] == [1, 4, 2, 7]
Remember "`$`", the function application operator? We can also use `map`
to apply the same argument to a list of functions:
< map ($5) [square, (+5), div 100] == [25, 10, 20]
We curried in the `5` value into right side the "`$`" operator. That creates a function
that takes a function and then returns the application of that function to
the value `5`. Using `map` we then apply that to every function in the list.
This is why having an "`$`" operator in a functional language is a good idea
but it's also why `map` is awesome!
A Quick Note About Tuples
-------------------------
[I talked about lists](#lists) but I kind of ignored tuples. Like lists,
tuples are a way to group values together in Haskell. Unlike lists,
with tuples you can store values of different types in a single tuple.
< intAndString :: (Int, String)
< intAndString = (07734, "world")
A more subtle difference from lists is that tuples of different lengths
are of different types. For instance, writing a function that returns
the first element of a three-element tuple is easy:
< fst3 :: (a, b, c) -> a
< fst3 (x, _, _) = x
Unfortunately, there is no general way to define a "first" function
for tuples of any length without writing a function for each tuple type.
Haskell does at least provide a `fst` function for two-element tuples.
Final note, see how we ignored the second and third elements of
the tuple deconstruction by using "`_`"? This is a common way to avoid
assigning names in Haskell.
Template Haskell
----------------
> $(mkYesod "ImageSearch" [parseRoutes|
> / HomeR GET
> /dbredirect DbRedirectR GET
> |])
Yesod makes heavy use of [Template Haskell](http://www.haskell.org/haskellwiki/Template_Haskell).
Template Haskell allows you to do compile-time
[metaprogramming](http://en.wikipedia.org/wiki/Metaprogramming)
in Haskell, essentially writing code that writes code at compile-time. It's similar to
[C++ templates](http://en.wikipedia.org/wiki/Template_(C%2B%2B))
but it's a lot more like
[Lisp macros](http://en.wikipedia.org/wiki/Macro_(computer_science)#Lisp_.2F_S-expression_macros).
Template Haskell is a pretty exotic feature that is rarely used in Haskell but Yesod makes
use of it to minimize the amount of code you have to write to get a website up and running.
The `mkYesod` function here generates all the boilerplate code necessary for connecting
the HTTP routes to user-defined handler functions. In our app we have two HTTP routes:
1. `/ -> HomeR`
2. `/dbredirect -> DbRedirectR`
The first route connects to a resource called `HomeR`. The second route, located at `/dbredirect`,
connects to a resource called `DbRedirectR`. We'll define these resources later.
Type Classes
------------
This part of the app brings us to one of the most
powerful parts of Haskell's type system, *type classes*.
Type classes specify a collection of functions that can be applied to multiple types.
This is similar to interfaces in C# and Java or duck typing
in dynamically-typed languages like Python and Ruby. An instance
declaration actually defines the functions of a type
class for specific type. Formally, type classes are an extension to the
[Hindley-Milner](http://en.wikipedia.org/wiki/Hindley-Milner)
type system that Haskell implements to allow for
[ad-hoc polymorphism](http://en.wikipedia.org/wiki/Ad-hoc_polymorphism),
i.e. an advanced form of function overloading.
**Note that the word instance used in this context
is very different from the meaning in object-oriented languages. In an
object-oriented language instances are more akin to Haskell values**.
Section [6.3](http://www.haskell.org/onlinereport/haskell2010/haskellch6.html#x13-1270006.3)
of the
[Haskell 2010 Report](http://www.haskell.org/haskellwiki/Language_and_library_specification#The_Haskell_2010_report)
has a graphic of the standard Haskell type classes. Many core functions in the standard
library are actually part of some type class, e.g. `(==)`, the equals operator, is
part of the `Eq` type class. For fun, let's make a new type class called `Binary`. It defines
functions that convert data between the instance type and a byte format:
< import qualified Data.ByteString as B
<
< class Binary a where
< toBinary :: a -> B.ByteString
< fromBinary :: B.ByteString -> a
You can imagine I might use this class when serializing Haskell data over a
byte-oriented transmission medium, for example a file or a BSD socket. Here's an example
instance for the `String` type:
< import Data.Text as T
< import Data.Text.Encoding as E
< import Data.ByteString as B
<
< instance Binary String where
< toBinary = E.encodeUtf8 . T.pack
< fromBinary = T.unpack . E.decodeUtf8
For this instance `toBinary` is defined as a composition of
`E.encodeUtf8` and `T.pack`. **Notice the use of
`T.pack`, you'll see it a lot. `T.pack` converts a
`String` value into a `Text` value**. `E.encodeUtf8` converts
the resulting `Text` value into a `ByteString` value.
`fromBinary` does the inverse conversion, it converts a `ByteString`
value into a `String` value.
Let's define another instance:
< import Control.Arrow ((***))
< import Data.Int (Int32)
< import Data.Bits (shiftR, shiftL, (.&.))
<
< -- stores in network order, big endian
< instance Binary Int32 where
< toBinary x = B.pack $ map (fromIntegral . (0xff.&.) . shiftR x . (*8))
< $ reverse [0..3]
< fromBinary x = sum $ map (uncurry shiftL . (fromIntegral *** (*8)))
< $ zip (B.unpack x) (reverse [0..3])
`fromBinary` in this instance may look kind of gnarly but you should know what it does;
it converts a `ByteString` value to an `Int32` value. Understanding it
is left as an exercise for the reader.
Now getting Haskell data into a byte format is as easy as calling `toBinary`. An important
distinction between type classes and the interfaces of C# and Java is that it's very easy
to add new functionality to existing types. In this example, *the creators of the both
the `String` and `Int32` types
didn't need any foreknowledge of the `Binary` type class*.
With interfaces, it would have been necessary to specify
the implementation of `toBinary` and `fromBinary`
at the time those types were defined.
It's also possible to define functions that depend on their arguments or return
values being part of a certain type class:
< packWithLengthHeader :: Binary a => a -> B.ByteString
< packWithLengthHeader x = B.append (toBinary $ B.length bin) bin
< where bin = toBinary x
Here `packWithLengthHeader` requires that its input type "`a`" be a part of the
`Binary` type class, this is specified using the "`Binary a =>`" *context*
in the type signature. A
subtle point in the definition of this function is that it requires the `Int` type
to be a part of the `Binary` type class as well (the return value of `B.length`).
> instance Yesod ImageSearch where
> approot _ = T.pack "http://localhost:3000"
Yesod requires you to declare a couple of instances for
your app type. Most of the definitions in the `Yesod` type
class are optional and have reasonable defaults but it does
require you to define `approot`, the root URL location
of your app. This is necessary for Yesod to be able generate URLs.
> instance RenderMessage ImageSearch FormMessage where
> renderMessage _ _ = defaultFormMessage
Here's another instance declaration. The `RenderMessage` type class in this case
is actually based on two types, `ImageSearch` and `FormMessage`. It defines a function
called `renderMessage` which takes two arguments and returns `defaultFormMessage`.
Exceptions & Automatic Type Class Derivation
--------------------------------------------
> data EitherException = EitherException String
> deriving (Show, Typeable)
> instance C.Exception EitherException
There are [multiple ways](http://www.randomhacks.net/articles/2007/03/10/haskell-8-ways-to-report-errors)
to specify errors in Haskell. In purely functional contexts
it's not uncommon to see the use of either the `Maybe` or `Either` types. In *monads* based on the
`IO` monad,
I usually like to use Haskell exceptions. What's a monad you say? It's
[complicated](http://www.haskell.org/haskellwiki/Monad_tutorials_timeline). Just kidding :)
I'll get to them later.
For now, we're defining a new exception type. It's the same algebraic datatype
declaration you saw earlier for our `ImageSearch` type except now there's this "deriving"
thing. A Haskell compiler can automatically derive instance declarations for some of
the standard type classes. For `EitherException` we automatically derive instances
for the type classes `Show` and `Typeable`. As a note,
the ability to automatically derive instances for the `Typeable` type class
was enabled by the LANGUAGE pragma `DeriveDataTypeable` above.
The last line declares `EitherException` to be an instance of `C.Exception`. It might be
weird that we didn't define any functions for this instance. This is because type classes
sometimes provide default implementations for the functions in the class. The `C.Exception`
type class actually provides default implementations for types that are part of the
`Typeable` type class.
Monads
------
> exceptOnFailure :: MonadBase IO m => m (Either String v) -> m v
> exceptOnFailure = (>>=either (C.throwIO . EitherException) return)
`exceptOnFailure` is a function that takes a monad that wraps an `Either` type
and returns that same monad except now wrapping the right side of the `Either` type. This makes sense
to me very clearly but I know, o patient reader, that this must look like gibberish to you.
First let's talk about monads. Monads are types within the `Monad` type class. The `Monad`
type class specifies two functions (or an operator and a function):
< class Monad m where
< (>>=) :: m a -> (a -> m b) -> m b
< return :: a -> m a
The [actual `Monad` type class definition](http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad.html#t:Monad)
is slightly more complicated but for simplicity's sake this will do.
The first operator "`>>=`" is called *bind*. What does it do? Well it depends on your
monad instance! What we can say for sure is that it takes takes a monad of
type "`a`", a function that maps from "`a`" to a monad of
type "`b`", and returns a monad of type "`b`".
The second function, `return`, takes a value and wraps it in the monad. What it means to be
wrapped in the monad, again, depends on the instance. Please note, **the `return` function
isn't like the `return` statement in other languages, i.e. it doesn't short-circuit
execution of a monad bind sequence.**
If that sounds abstract that's because it is! [Monads are a sort of computational framework](http://www.haskell.org/haskellwiki/What_a_Monad_is_not),
many different types of computation are
monadic, i.e. they fit the form imposed by bind. Why are monads important? Perhaps the most
important reason they exist in Haskell is that they provide a purely functional way
to perform (or specify how to perform) IO. Monads are much more than just a way to do IO, however,
their general applicability extends to many things.
I won't dwell on monads too much in this post but for the purposes of your immediate understanding,
it suffices to explain the `IO` monad and do notation. Just as I'm not dwelling on
monads, you shouldn't either. It takes a long time to really understand and master what's going on.
The more you program in Haskell, the more it'll make sense.
So why can't we do IO in Haskell without the `IO` monad? Haskell is a purely functional language, that
means that functions in Haskell mimic their mathematical counterparts. A pure function is a
mathematical entity that consistently maps values from one domain to another. You expect `cos(0)`
to always evaluate to `1`, if it ever evaluated to something else something would be very wrong.
In a purely functional language how would you define `getChar`?
< getChar :: Char
< getChar = ...
It takes no arguments so how can it deterministically return what the user is submitting? The answer is
it can't. You can't do this in a purely functional language.
So what are our options? The answer is to
generate a set of actions to take as IO
is occurring, in a purely functional manner. This is what the `IO` monad
is and this is why values in the `IO` monad are called
*IO Actions*. It's a form of metaprogramming. Here's an IO action that prints "yes" if a user
types "y" and "no" otherwise:
< -- type signatures
< print :: Show a => a -> IO ()
< getChar :: IO Char
<
< main :: IO ()
< main = getChar >>= (\x -> print $ if x == 'y' then "yes" else "no")
Why does this work? Notice how the "output" of `getChar` isn't tied to its own value, instead
the bind operation gets the output value for us. We're using monads here to build and
model sequential and stateful computation, in a purely functional way!
You can imagine that writing real programs in the `IO` monad could get ugly
if you used "`>>=`" and lambdas everywhere so that's why Haskell has some syntactic sugar
for writing out complex monadic values. This is called do notation. Here's the same IO action
from above written in do notation.
< main = do
< x <- getChar
< print $ if x == 'y'
< then "yes"
< else "no"
In do notation each monad is bound using bind in order and values
are pulled out using the left-pointing arrow "`<-`".
We're coming to the close of yet another Haskell monad explanation
but before we finish I really want to emphasize that monads and
the `IO` monad in particular aren't that special. Here's my very
own implementation of the `IO` monad:
< data MyIO a = PrimIO a | forall b. CompIO (MyIO b) (b -> (MyIO a))
<
< myGetChar :: MyIO Char
< myGetChar = PrimIO 'a'
<
< myPrint :: Show a => a -> MyIO ()
< myPrint s = PrimIO ()
<
< instance Monad MyIO where
< m >>= f = CompIO m f
< return x = PrimIO x
<
< runMyIO :: MyIO m -> m
< runMyIO (PrimIO x) = x
< runMyIO (CompIO m f) = runMyIO (f (runMyIO m))
Of course in the real `IO` monad, `getChar` isn't hard-coded to return the same thing each time and
`print` actually prints something on your terminal. IO actions are run by your
Haskell runtime which is usually written in a language where you can actually call a non-pure
`getChar` function, like C.
Either Type
-----------
Now, back to `exceptOnFailure`. Let's look at it again:
< exceptOnFailure :: MonadBase IO m => m (Either String v) -> m v
< exceptOnFailure = (>>=either (C.throwIO . EitherException) return)
Is it still confusing? :)
Remember how I said earlier that the `Either` datatype was used to denote errors in Haskell?
The definition of `Either` looks like this:
< data Either a b = Left a | Right b
<
< either :: (a -> c) -> (b -> c) -> Either a b -> c
< either f _ (Left x) = f x
< either _ g (Right x) = g x
You can use the `either` function to return different values depending on the
`Either` datatype passed in. By convention the `Left` constructor is used to denote
an error value.
For `exceptOnFailure`, first we create a function that takes an `Either` value and if
it's a failure we throw an exception using `C.throwIO` otherwise we call `return` to
rewrap the success value. Then we curry in that function to the right side
of the "`>>=`" operator.
Some Utility Functions
----------------------
> myAuthStart :: DB.Config -> Maybe DB.URL -> IO (DB.RequestToken, DB.URL)
> myAuthStart config murl = exceptOnFailure $ DB.withManager $ \mgr ->
> DB.authStart mgr config murl
>
> myAuthFinish :: DB.Config -> DB.RequestToken -> IO (DB.AccessToken, String)
> myAuthFinish config rt = exceptOnFailure $ DB.withManager $ \mgr ->
> DB.authFinish mgr config rt
>
> myMetadata :: DB.Session -> DB.Path -> Maybe Integer
> -> IO (DB.Meta, Maybe DB.FolderContents)
> myMetadata s p m = exceptOnFailure $ DB.withManager $ \mgr ->
> DB.getMetadataWithChildren mgr s p m
Here I've defined a couple of convenience functions for using the Dropbox SDK.
This is just so I don't have to use `DB.withManager` every time I call these
functions. Also all of the vanilla Dropbox SDK functions return an `Either`
value in the `IO` monad so we make use of `exceptOnFailure` to automatically
throw an exception for us if something goes wrong.
> tryAll :: MonadBaseControl IO m => m a -> m (Either C.SomeException a)
> tryAll = C.try
`C.try` is the normal way to catch exceptions in Haskell. Unfortunately it
uses ad-hoc polymorphism within the `Exception` type class to determine
which exception it catches. Since `tryAll` has an explicit type signature,
it's bound to the instance of `C.try` that catches `C.SomeException`.
> dbRequestTokenSessionKey :: T.Text
> dbRequestTokenSessionKey = T.pack "db_request_token"
This is a constant I use for the name of my session key that stores the OAuth
request token when authenticating the user to my API app but more on
that later.
The Dropbox API Authentication Process
--------------------------------------
The Dropbox API uses OAuth to grant apps access to Dropbox user accounts. To access
any of the HTTP endpoints of the Dropbox API an app must provide
an *access token* as part of the HTTP request. Access tokens
are revokable long-lived per-user per-app tokens that are granted to an
app at the request of a user.
Acquiring an access token is a three step process:
1. The app
[must obtain a unauthenticated *request token*](https://www.dropbox.com/developers/reference/api#request-token)
via the
Dropbox API.
2. The app redirects the user to the
[Dropbox website](https://www.dropbox.com/developers/reference/api#authorize).
Once the user is at the Dropbox website the user can either approve
or deny the request to authenticate the request token for the app.
3. The Dropbox website redirects the user back to the app's website.
The app can then [exchange the
authenticated request token for an access token](https://www.dropbox.com/developers/reference/api#access-token).
A request token actually consists of two components, a key and a secret.
Only the key component should be exposed in plaintext. To ensure proper
security, the secret should only ever be known to the Dropbox servers
and the API app attempting authentication.
To exchange an authenticated request token for an access token,
the app must also provide the original secret of the request token. This
prevents third-parties from hijacking authenticated request tokens.
An access token is long-lived but at any point in time can become
invalid. When an access token becomes invalid it is the responsibility
of the API app to go through the authentication process again. This
allows Dropbox and its users to revoke access tokens at will.
Initiating the Authentication Process
-------------------------------------
> getHomeR :: Handler RepHtml
> getHomeR = do
Users can enable our app for their Dropbox account using the web.
Yesod is the web framework we are using to implement web pages in Haskell.
In Yesod all HTTP routes are values in the `Handler` monad. The convention is
that the handler name is the combination of the
HTTP method (GET in this case) and the name of the resource.
`getHomeR` is the handler for
the GET method on the "HomeR" resource which is located at root HTTP path, "/".
Handlers are connected to HTTP routes served by the web server via the use
of [Template Haskell](#template-haskell) above.
The `Handler` monad is essentially a decorated `IO` monad so don't worry
about what it is that much. You should use it just like you
would use the `IO` monad.
> ImageSearch _ dbConfig <- getYesod
`getYesod` retrieves the app's value. In our app this value has the `ImageSearch`
type that we defined at the very beginning. Here we're deconstructing the `ImageSearch`
value and extracting only the config component (while ignoring the channel component).
The config value stores some information about our
app, like app key and locale, that is used by the Dropbox SDK.
> myRender <- getUrlRender
`getUrlRender` gets the URL render function for your app.
It turns a resource value for your
app into a `Text` URL.
<a name="myAuthStartCall"></a>
> (requestToken, authUrl) <- liftIO
> $ myAuthStart dbConfig
> $ Just $ T.unpack $ myRender DbRedirectR
Here we call the `DB.authStart` function (by way of `myAuthStart`). This function performs
the first step of the Dropbox API authentication process.
We pass in the URL, `DbRedirectR`, that we want the user
to be redirected to after they authenticate and we get
back our new unauthenticated request token
and the Dropbox URL where the user can authenticate it. Note that
`T.unpack` converts a `Text` value into a `String` value.
The `myAuthStart` function is in the `IO` monad so we make use of
`liftIO` to execute `myAuthStart` in the `Handler` monad.
Since `Handler` is a wrapper around the `IO` monad,
the `liftIO` function "lifts" the IO action into the higher monad.
> let DB.RequestToken ky scrt = requestToken
Do notation allows you to bind names using "let" in a do block. This is for when
you need to assign a name to a value that isn't coming from a monad. Here
we're deconstructing the request token from `myAuthStart` to get the
key and the secret.
<a name="setSession"></a>
> setSession dbRequestTokenSessionKey $ T.pack $ ky ++ "|" ++ scrt
A *session* in Yesod is a set of key-value pairs that is preserved per browsing
session with our web site. It's implemented using encryption on top of HTTP cookies.
We store the key and the secret of the request token in the session using `setSession`.
We'll need the secret to finish the authentication process after the user
authenticates our app.
`setSession` expects two `Text` values so we use `T.pack` to turn
the second argument from a `String` type into a `Text` type.
> redirectText RedirectTemporary $ T.pack authUrl
Finally we redirect the user to the URL given to us from `myAuthStart`, `authUrl`. Dropbox