haskell

Haskell技巧

学习教程

CheatSheet

http://cheatsheet.codeslower.com/

参考书

教程

课程

CIS 194: Introduction to Haskell (Fall 2016)
- 使用一个在线编辑器
- 图形化的例子，入门作业是直观的画图

文档

使用

安装

MacOS

推荐使用 ghcup 安装ghc和cabal-install，再安装stack

curl https://get-ghcup.haskell.org -sSf | sh

ghc、cabal等可执行程序将会安装到$HOME/.ghcup/bin，应在.bashrc中source $HOME/.ghcup/env

# brew install ghc cabal-install

Windows

通过stack安装

https://docs.haskellstack.org/en/stable/README/

cabal

基本使用

cabal update
cabal list <模块名称>
cabal install <模块名称>

配置国内源
- 执行cabal update，待生成~/.cabal/config后打断
- 在~/.cabal/config中加入：
```
repository mirrors.tuna.tsinghua.edu.cn
  url: http://mirrors.tuna.tsinghua.edu.cn/hackage
```

stack

https://github.com/Originate/guide/blob/master/haskell/stack-tutorial.md

安装

curl -sSL https://get.haskellstack.org/ | sh

或者

cabal install stack

配置国内源

~/.stack/config.yaml（在 Windows 下是 %APPDATA%\stack\config.yaml）加上：

setup-info: "http://mirrors.tuna.tsinghua.edu.cn/stackage/stack-setup.yaml"
urls:
  latest-snapshot: http://mirrors.tuna.tsinghua.edu.cn/stackage/snapshots.json

package-indices:
  - download-prefix: http://mirrors.tuna.tsinghua.edu.cn/hackage/
    hackage-security:
        keyids:
        - 0a5c7ea47cd1b15f01f5f51a33adda7e655bc0f0b0615baa8e271f4c3351e21d
        - 1ea9ba32c526d1cc91ab5e5bd364ec5e9e8cb67179a471872f6e26f0ae773d42
        - 280b10153a522681163658cb49f632cde3f38d768b736ddbc901d99a1a772833
        - 2a96b1889dc221c17296fcc2bb34b908ca9734376f0f361660200935916ef201
        - 2c6c3627bd6c982990239487f1abd02e08a02e6cf16edb105a8012d444d870c3
        - 51f0161b906011b52c6613376b1ae937670da69322113a246a09f807c62f6921
        - 772e9f4c7db33d251d5c6e357199c819e569d130857dc225549b40845ff0890d
        - aa315286e6ad281ad61182235533c41e806e5a787e0b6d1e7eef3f09d137d2e9
        - fe331502606802feac15e514d9b9ea83fee8b6ffef71335479a2e68d84adc6b0
        key-threshold: 3 # number of keys required

        # ignore expiration date, see https://github.com/commercialhaskell/stack/pull/4614
        ignore-expiry: no

基本使用

stack new my-project
cd my-project
stack setup
stack build
stack exec my-project-exe

vim

makeprg

au FileType haskell setlocal makeprg=ghc\ -e\ :q\ %
au FileType haskell setlocal errorformat=
                \%-G,
                \%-Z\ %#,
                \%W%f:%l:%c:\ Warning:\ %m,
                \%E%f:%l:%c:\ %m,
                \%E%>%f:%l:%c:,
                \%+C\ \ %#%m,
                \%W%>%f:%l:%c:,
                \%+C\ \ %#%tarning:\ %m,

Vim + Haskell
支持SpaceVim :A命令

可以在目录下放一个.project_alt.json文件，内容如下，在SpaceVim中可以用:A跳转到单元测试。
```
{
  "src/*.hs": {"alternate": "test/{}Spec.hs"},
  "test/*Spec.hs": {"alternate": "src/{}.hs"}
}
```

haskell-ide-engine (hie, lsp server)

从源代码开始搭建

下载

git clone https://github.com/haskell/haskell-ide-engine --recurse-submodules
cd haskell-ide-engine

构建

列出可以构建的目标
```
stack ./install.hs help
```

构建需要的版本

stack ./install.hs hie-8.6.5
stack ./install.hs build-doc

日常命令

新建项目

stack new my-project
cd my-project
stack setup
stack build
stack exec my-project-exe

编译

ghc --make 程序名

编码规范

布局

空格缩进的数量有多种选择，有时候在一个文件中，二，三，四格缩进都很正常

in通常正对着let

tidyLet = let foo = undefined
              bar = foo * 2
          in undefined

单独列出in或者让in在一系列等式之后都可以

do在行尾跟着而非在行首单独列出

commonDo = do
  something <- undefined
  return ()

When the right-hand side of an equation starts on a new line, indent it by a few spaces relative to the name being defined on the line above.
```
normalIndent =
    undefined
```

写where语句的缩进时，最好让它分辨起来比较容易

goodWhere = take 5 lambdas
    where lambdas = []

alsoGood =
    take 5 lambdas
  where
    lambdas = []

匿名函数的悬挂缩进

匿名函数的参数和->都放在行末，函数体从下一行开始。可以为函数主体留下更多的空间。

parseByte :: Parse Word8
parseByte =
    getState ==> \initState ->
    case L.uncons (string initState) of
      Nothing ->
          bail "no more input"
      Just (byte,remainder) ->
          putState newState ==> \_ ->
          identity byte
        where newState = initState { string = remainder,
                                     offset = newOffset }
              newOffset = offset initState + 1

命名规范

驼峰式命名
- 类型名字以大写开头ClassName
- 变量、函数名字以小写开头someThing
  - 经常出现(x:xs)/(d:ds)之类，s表示元素的复数

内置函数

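A trailing "M" marks the monadic version of a function (most often met in IO code)
- e.g. mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
A trailing underscore marks a variant that discards the results
- e.g. mapM_ :: (Monad m) => (a -> m b) -> [a] -> m ()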
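
The difference between the two naming conventions can be sketched with the Maybe monad. The helper halveEven and the other names below are made up for illustration; only mapM and mapM_ come from the Prelude.

```haskell
-- Hypothetical helper: succeeds only for even numbers.
halveEven :: Int -> Maybe Int
halveEven n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- mapM collects the results and succeeds only if every element succeeds.
allHalved :: Maybe [Int]
allHalved = mapM halveEven [2, 4, 6]

oneFails :: Maybe [Int]
oneFails = mapM halveEven [2, 3, 6]

-- mapM_ runs the same steps but discards the results, returning ().
effectsOnly :: Maybe ()
effectsOnly = mapM_ halveEven [2, 4, 6]
```

Loaded into GHCi, allHalved is Just [1,2,3], oneFails is Nothing, and effectsOnly is Just ().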

其他

组合函数的长管道（三个以上元素序列）难以阅读，应进行分解
- 使用let或者where语句块分解为若干小部分
- 为每个管道元素定义一个有意义的名字。如果想不出来，可能需要继续简化代码
每行不超过80字符

语言

基本概念

Haskell语言的特点
- 强类型
- 静态类型
- 自动类型推导
Haskell非严格求值
- 大体上等同于惰性求值，在需要时才进行计算
以_作为函数参数，可以避免未使用变量的警告

注释

单行
```
-- some comment
```
多行
```
{- some comment
   continue
-}
```

注释文档

单行

-- |The 'square' function squares an integer.
-- It takes one argument, of type 'Int'.
square :: Int -> Int
square x = x * x

多行

{-|
  The 'square' function squares an integer.
  It takes one argument, of type 'Int'.
-}
square :: Int -> Int
square x = x * x

缩进与空白

注释行视作空白行
缩进
- 顶级声明（不算注释）
  - 首行可以从任意列开始
  - 后续所有顶级声明必须与首行保持相同缩进
- 续行（视为当前行的延续）
  - 紧跟着的空白行
  - 比当前行缩进更深的行
  - let与where区块
    - 以第一个标记（token）的位置为准
      - 空白行或更深缩进视作该token行的延续
      - 与token相同缩进视作区块内新的一行

显式语法（基本没人用）

大括号{/}包含，分号;分隔语句

foo = let { a = 1;  b = 2;
        c = 3 }
      in a + b + c

表达式

if
- then和else后的表达式类型必须一致
- 不能省略else

case

支持模式匹配
从上到下，返回第一个匹配的结果
可以用通配符_作为最后一个被匹配的模式

样例

fromMaybe defval wrapped =
    case wrapped of
      Nothing     -> defval
      Just value  -> value

let/in：引入局部变量

这是一个表达式，可以用在任何使用表达式的地方
局部变量绑定到表达式，用到的时候才求值
局部变量可以在let块或者在紧跟着的in表达式中使用

样例

lend amount balance = let reserve    = 100
                          newBalance = balance - amount
                      in if balance < reserve
                         then Nothing
                         else Just newBalance

where从句：定义在其所跟随的主句中生效的局部变量

样例

lend2 amount balance = if amount < reserve * 0.5
                       then Just newBalance
                       else Nothing
    where reserve    = 100
          newBalance = balance - amount

Let vs. Where - HaskellWiki：比较了let与where的异同。

列表推导式（List comprehensions）

普通用法

[ (a,b) | a <- [1,2], b <- "abc" ]
-- [(1,'a'),(1,'b'),(1,'c'),(2,'a'),(2,'b'),(2,'c')]

带guard

[ (a,b) | a <- [1..6], b <- [5..7], even (a + b ^ 2) ]
-- [(1,5),(1,7),(2,6),(3,5),(3,7),(4,6),(5,5),(5,7),(6,6)]

使用let

[ x | a <- "etaoin", b <- "shrdlu", let x = [a,b], all (`elem` "aeiou") x ]
-- ["eu","au","ou","iu"]

operator

=是定义，不是赋值
- 不能重复定义
在一个两参数函数前后加上`，可以变成中缀运算符
- 1 `mod` 2
用小括号()把二元运算符括起来，可以变成函数
- (+) 2 3
比较
- ==、/=、<、>、<=、>=
not

::

表示类型

fromIntegral :: (Integral a, Num b) => a -> b

类型签名
```
'a' :: Char
```

<-
- 从运行I/O动作中抽出结果，并且保存到一个变量中
(.) :: (b -> c) -> (a -> b) -> a -> c: function composition, (f . g) x = f (g x)
($) :: (a -> b) -> a -> b：application operator
(Data.Function.&) :: a -> (a -> b) -> b：把前一操作的结果传给下一操作。reverse application operator，优先级比($)高一点
(<$>) :: (a -> b) -> f a -> f b：fmap的操作符形式
(<$) :: a -> f b -> f a：常量
(<*>) :: f (a -> b) -> f a -> f b
(*>) :: f a -> f b -> f b
(<*) :: f a -> f b -> f a
(>>=) :: m a -> (a -> m b) -> m b
(>>) :: m a -> m b -> m b

类型

基本类型关系图

基本数值

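Num
- Real
  - Integral
    - Int: fixed-width integer, usually the machine word size; guaranteed to cover at least -2^29 to 2^29-1
    - Integer: arbitrary-precision integer
  - RealFrac (also requires Fractional)
    - Float
    - Double
    - Rational
- Fractional
  - Floating
    - Float
    - Double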
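/ works on any Fractional type (Float, Double, Rational); for integers use div and mod
The exponent of ^ must be a non-negative integer; the exponent of ** may be floating point
Values of different numeric types cannot be mixed in one expression; convert first
- fromIntegral: converts an integral value (Int/Integer) to any other numeric type
- round, floor, ceiling: convert a fractional value to an integer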
负数一般要用括号括起来
- 1 + (-2)
区分字符（'a'）与字符串（"abc"）
- 字符串是字符的列表
- ""与[]等价
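
The numeric conversion rules listed above can be sketched as follows. All function names are illustrative; the code only relies on Prelude functions.

```haskell
-- Average of a list of Ints: both operands are converted before (/).
-- (For an empty list this yields NaN; a real program would guard against it.)
average :: [Int] -> Double
average xs = fromIntegral (sum xs) / fromIntegral (length xs)

-- Integer division and remainder stay within Integral types.
halfAndRest :: Int -> (Int, Int)
halfAndRest n = (n `div` 2, n `mod` 2)

-- (^) takes an integral exponent, (**) a floating-point one.
cube :: Int -> Int
cube x = x ^ (3 :: Int)

sqrtViaPow :: Double -> Double
sqrtViaPow x = x ** 0.5

-- Going back from floating point to an integer is explicit and lossy.
roundedAverage :: [Int] -> Int
roundedAverage = round . average
```

Note that round uses round-half-to-even, so roundedAverage [1,2,3,4] gives 2 rather than 3.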

主要数值类型

类型	介绍
Double	双精度浮点数。表示浮点数的常见选择。
Float	单精度浮点数。通常在对接 C 程序时使用。
Int	固定精度带符号整数；最小范围在 -2^29 至 2^29-1 。相当常用。
Int8	8 位带符号整数
Int16	16 位带符号整数
Int32	32 位带符号整数
Int64	64 位带符号整数
Integer	任意精度带符号整数；范围由机器的内存限制。相当常用。
Rational	任意精度有理数。保存为两个整数之比（ratio）。
Word	固定精度无符号整数。占用的内存大小和 Int 相同
Data.Word.Word8	8 位无符号整数，字节
Data.Word.Word16	16 位无符号整数
Data.Word.Word32	32 位无符号整数
Data.Word.Word64	64 位无符号整数

Prelude中的其他类型

Bool
- False
- True

Either

Left：很多时候用于返回错误
Right

either :: (a -> c) -> (b -> c) -> Either a b -> c

either
  (\err -> {- what to do if decode gave an error -})
  (\msg -> {- what to do if decode succeeded -})
  (decode a)

Maybe
- Nothing
- Just
可以使用Data.Maybe.fromMaybe :: a -> Maybe a -> a简化提取操作

Data.Maybe.fromJust :: Maybe a -> a
Ordering
- LT
- EQ
- GT

列表（Lists）

列表元素必须是相同类型
方括号括起，逗号分隔
- [1,2,3]
  - 实际上只是(1:(2:(3:[])))的一种简单的表示方式，其中(:)用于构造列表
  - 也等于1:2:3:[]
- 空列表 []
- 可以用列举（..）
  - [1..10]：[1,2,3,4,5,6,7,8,9,10]
  - 前两项控制间隔：
    - [1,4..15]：[1,4,7,10,13]
    - [10,9..1]：[10,9,8,7,6,5,4,3,2,1]
  - 省略终点，可得到无穷数列
    - [1..]
:：cons，在头部加入
- 1 : [2,3]

元组

各元素类型可以不同
圆括号括起，逗号分隔
- (True, "hello")
函数
- fst :: (a, b) -> a：返回二元组的第一个元素
- snd :: (a, b) -> b：返回二元组的第二个元素

自定义类型

三种方式的异同：
- The data keyword introduces a genuinely new algebraic data type.
- The type keyword gives an existing type a synonym. The type and its synonym can be used interchangeably.
- The newtype keyword gives an existing type a distinct identity. The original type and the new type are not interchangeable.

type类型别名

type CustomerID = Int
type ReviewBody = String

data BetterReview = BetterReview BookInfo CustomerID ReviewBody

data 可容纳多种类型的数据

代数数据类型（algebraic data type）

data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool

LispVal是类型构造器，首字母大写
Atom/List等是值构造器，首字母大写
不同类的值构造器不能重名
可以有多个值构造器，使用|符号分割，读作“或者”

使用记录record语法可以自动提供对每个成分的访问器

data Customer = Customer {
      customerID      :: CustomerID
    , customerName    :: String
    , customerAddress :: Address
    } deriving (Show)

相当于

data Customer = Customer Int String [String]
customerID (Customer id _ _) = id
customerName (Customer _ name _) = name
customerAddress (Customer _ _ address) = address

可以复制并改变部分值

modifyOffset initState newOffset =
    initState { offset = newOffset }

newtype

给现有类型以一个新的身份
只能有一个值构造器，并且那个构造器须恰有一个字段(field)

-- file: ch06/NewtypeDiff.hs
-- 可以：任意数量的构造器和字段（这里的两个Int为两个字段(fields)）
data TwoFields = TwoFields Int Int

-- 可以：恰一个字段
newtype Okay = ExactlyOne Int

-- 可以：类型变量是没问题的
newtype Param a b = Param (Either a b)

-- 可以：记录语法是友好的
newtype Record = Record {
        getInt :: Int
    }

-- 不可以：没有字段
newtype TooFew = TooFew

-- 不可以：多于一个字段
newtype TooManyFields = Fields Int Int

-- 不可以：多于一个构造器
newtype TooManyCtors = Bad Int
                     | Worse Int

函数

定义

add a b = a + b

参数多态

如
```
last :: [a] -> a
```
如果函数的类型签名里包含类型变量，那么就表示这个函数的某些参数可以是任意类型，我们称这些函数是多态的。
参数化类型（parameterized type），表示代码并不在乎实际的类型是什么
- 没有办法知道参数化类型的实际类型是什么，也不能操作这种类型的值；
- 不能创建这种类型的值，也不能对这种类型的值进行探查（inspect）。

模式匹配

sumList (x:xs) = x + sumList xs
sumList []  = 0

data BookInfo = Book Int String [String]

bookID      (Book id title authors) = id
bookTitle   (Book id title authors) = title
bookAuthors (Book id title authors) = authors

-- 使用通配符
nicerID      (Book id _     _      ) = id
nicerTitle   (Book _  title _      ) = title
nicerAuthors (Book _  _     authors) = authors

tidySecond :: [a] -> Maybe a

-- 匹配至少两个元素的列表。后面的_可能是[]
tidySecond (_:x:_) = Just x
tidySecond _       = Nothing

使用守卫guard实现条件求值

niceDrop n xs | n <= 0 = xs
niceDrop _ []          = []
niceDrop n (_:xs)      = niceDrop (n - 1) xs

lend3 amount balance
     | amount <= 0            = Nothing
     | amount > reserve * 0.5 = Nothing
     | otherwise              = Just newBalance
    where reserve    = 100
          newBalance = balance - amount

case getNat (L8.dropWhile isSpace s3) of
  Nothing -> Nothing
  Just (maxGrey, s4)
    | maxGrey > 255 -> Nothing
    | otherwise ->

case () of _
             | cond1     -> ex1
             | cond2     -> ex2
             | cond3     -> ex3
             | otherwise -> exDefault

一个模式后面可以跟0到多个Bool类型的守卫
- 用|符号来标识守卫的使用
- 后面是守卫表达式
- 再来一个等号=
- 后面是守卫值为真时对应的函数体
otherwise是一个值为True的普通变量，用于提高可读性

匿名函数Lambda

基本格式
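A lambda starts with a backslash \, followed by the parameters (which may be patterns)
- the body comes after the -> symbol
- \ is read as "lambda"
A lambda can have only a single clause, so patterns in its parameters must cover every input it will receive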
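
A minimal sketch of the lambda syntax described above; the function names are invented for the example.

```haskell
-- A lambda used inline as an argument to map.
squares :: [Int] -> [Int]
squares = map (\x -> x * x)

-- Lambda parameters may be patterns, here a tuple pattern.
sumPairs :: [(Int, Int)] -> [Int]
sumPairs = map (\(a, b) -> a + b)

-- A lambda has a single clause, so a partial pattern like this one
-- fails at run time on input it does not cover (here: the empty list).
unsafeHead :: [a] -> a
unsafeHead = \(x:_) -> x
```

When more than one clause is needed, a named local function in a let or where block is the usual alternative.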

部分函数应用和柯里化（Currying）

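Every Haskell function takes exactly one argument: the type to the left of -> is the argument type and the type to the right is the result type. A "multi-argument" function is a function that returns another function
Supplying fewer arguments than a function can accept is called partial application of the function
Functions written in this style are called curried, after the logician Haskell Curry (the language is also named after him); currying is what makes partial application possible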
节（section）
- 使用括号包围一个操作符，通过在括号里面提供左操作对象或者右操作对象，产生一个部分应用函数
```
Prelude> (2^) 3
8

Prelude> (^3) 2
8
```
- An ordinary function can be wrapped in backticks ` to turn it into an infix operator and then sectioned in the same way
```
Prelude> (`elem` ['a' .. 'z']) 'f'
True
```

As-模式

suffixes :: [a] -> [[a]]
suffixes xs@(_:xs') = xs : suffixes xs'
suffixes [] = []

如果输入值能匹配@符号右边的模式，那么就将这个值绑定到@符号左边的变量中（这里是xs）
可以提升代码的可读性
可以对输入数据进行共享，而不是复制它

提高代码可读性

少用尾递归，多用库函数的组合
- 尾递归太通用，可以同时执行过滤、映射和其他操作，不如一次只完成一件事的库函数好理解
用局部函数（通过let/where）代替匿名函数
- 有函数名，更具可读性
为所有顶层（top-level）函数添加类型签名，让编译器及时发现错误

使用seq实现严格求值

https://wiki.haskell.org/Seq
严格求值（strict）指非惰性求值
seq :: a -> t -> t：强迫（force）求值传入的第一个参数，然后返回第二个参数
- Data.List.foldl' 就是左折叠的严格版本，它使用特殊的 seq 函数来绕过 Haskell 默认的非严格求值：
```
foldl' _ zero [] = zero
foldl' step zero (x:xs) =
    let new = step zero x
        in new `seq` foldl' step new xs
```
正确使用seq的要点
- 表达式中被求值的第一个必须是seq
  - seq必须放在外层而不是内层
- seq的第二个参数应该用到第一个参数
  - 一般用let语句保存一个表达式，然后作为seq的第一个参数，也用到第二个参数中

IO

不纯（impure）函数的类型签名以IO开头
```
readFile :: FilePath -> IO String
```
保存一个IO操作时，不会马上执行
- writefoo = putStrLn "foo"时，什么都不会发生
- 在别的IO操作中使用writefoo，则会执行对应的操作
IO函数可以调用纯函数，但纯函数不能调用IO函数
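
The point that an IO action is only a value until it is run can be sketched like this. writeFoo mirrors the name used above; shout, greetings and runTwice are hypothetical names added for the example.

```haskell
-- Defining an action performs no I/O; it only names a value of type IO ().
writeFoo :: IO ()
writeFoo = putStrLn "foo"

-- A pure function: no IO in its type, so it cannot perform I/O.
shout :: String -> String
shout s = s ++ "!"

-- Actions are ordinary values and can sit in a list without running.
-- An IO action may call a pure function such as shout.
greetings :: [IO ()]
greetings = [writeFoo, putStrLn (shout "bar")]

-- Output appears only when an action is sequenced into one that
-- actually runs (ultimately main).
runTwice :: IO ()
runTwice = writeFoo >> writeFoo
```

Evaluating length greetings in GHCi prints nothing but the number 2, while running runTwice prints foo twice.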

data Color = Red | Green | Blue
    deriving (Read, Show, Eq, Ord)

Type classes (class)

基本定义

class BasicEq a where
  isEqual :: a -> a -> Bool

instance BasicEq Bool where
  isEqual True  True  = True
  isEqual False False = True
  isEqual _     _     = False

自动派生

对于许多简单的数据类型，Haskell编译器可以自动将类型派生（derivation）为Read、Show、Bounded、Enum、Eq和Ord的实例(instance)。

模块（Module）

必须放在其他定义之前
模块名必须大写字母开头，必须和包含这个模块的文件的基础名（不包含后缀的文件名）一致
```
module ExportSomething (Class1, func1, func2) where
```
```
module ExportEverything where
```
```
module ExportNothing () where
```
导出名称中JValue(..)的(..)代表同时导出JValue的所有值构造器

导入

import Prelude hiding (zip, (<>))

import qualified Data.ByteString.Lazy as L

语言扩展

{-# LANGUAGE 语言扩展 #-}

{-# LANGUAGE TypeSynonymInstances #-}: allows instance JSON String where

{-# LANGUAGE FlexibleInstances #-}: allows instance JSON [Char] where

重要的类型类

Eq - 相等比较

class Eq a where
  (==) :: a -> a -> Bool
  (/=) :: a -> a -> Bool
  {-# MINIMAL (==) | (/=) #-}

实例中至少定义其中一个方法

Show - 将值转换为字符串

show :: Show a => a -> String

Read - 将字符串转换为指定类型的值

read :: Read a => String -> a

必要时，显式指定要读入的类型：

(read "3")::Int

最简实现只要实现一个readPrec（readsPrec也可，但效率较低，推荐实现readPrec）：

instance Read T where
  readPrec     = ...
  readListPrec = readListPrecDefault

Ord - 比较顺序

所有Ord实例都可以使用Data.List.sort来排序。

class Eq a => Ord a where
  compare :: a -> a -> Ordering
  (<) :: a -> a -> Bool
  (<=) :: a -> a -> Bool
  (>) :: a -> a -> Bool
  (>=) :: a -> a -> Bool
  max :: a -> a -> a
  min :: a -> a -> a
  {-# MINIMAL compare | (<=) #-}

Data.Monoid - 幺半群

需要满足两个性质：

需要有一个满足结合律的二元操作符*。a * (b * c)与(a * b) * c的结果必须相同
一个单位元素e，a * e == a且e * a == a

对于整数，1作为单位元素，乘号作为操作符就构成了一个幺半群。以0作为单位元素，加号作为操作符也构成一个幺半群。

class Monoid a where
    mempty  :: a                -- the identity
    mappend :: a -> a -> a      -- associative binary operator

(<>) :: a -> a -> a infixr 6：等价于mappend
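
A small sketch of the two integer monoids mentioned above, using the Sum and Product wrappers from Data.Monoid. One caveat that the class definition above does not show: in current versions of base, Monoid has Semigroup as a superclass and (<>) is defined there. The value names are illustrative.

```haskell
import Data.Monoid (Sum(..), Product(..))

-- Lists: mempty is [], (<>) is (++).
joined :: String
joined = "foo" <> mempty <> "bar"

-- Integers form a monoid in two ways; the newtype picks which one.
-- Addition with identity 0:
total :: Int
total = getSum (mconcat (map Sum [1, 2, 3, 4]))

-- Multiplication with identity 1:
factorial5 :: Int
factorial5 = getProduct (foldMap Product [1 .. 5])
```

Here joined is "foobar", total is 10 and factorial5 is 120.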

Functor - 函子，支持fmap操作的类型（`<$>`）

定义

class Functor f where
  fmap :: (a -> b) -> f a -> f b
  (<$) :: a -> f b -> f a
  {-# MINIMAL fmap #-}

The minimal instance definition is fmap. (<$>) is not a class method; it is an ordinary operator defined as a synonym for fmap.

简介

Functor必须保持结构（shape），也可以理解为上下文（context）。集合的结构不应该受到 Functor 的影响，只有对应的值会改变。

约束

-- Functor 必须保持身份（preserve identity）
fmap id  ==  id
-- Functor 必须是可组合的
fmap (f . g)  ==  fmap f . fmap g

列表、Maybe、IO等类型都满足这个条件。

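A Functor instance needs a type constructor of kind * -> *, i.e. one with exactly one type parameter still unapplied. A type with more parameters qualifies once all but the last are fixed, e.g. Either e or ((,) a).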

其他方法

(<$)把输入中所有值都替换为相同的值。缺省实现为fmap . const。

Control.Applicative.<$>：fmap的operator形式
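
The shape-preserving behaviour of fmap and (<$) can be sketched with Maybe and lists. The value names are made up for the example.

```haskell
-- fmap over Maybe: the Just/Nothing shape is preserved.
incremented :: Maybe Int
incremented = fmap (+ 1) (Just 2)

stillNothing :: Maybe Int
stillNothing = fmap (+ 1) Nothing

-- fmap over a list (operator form): same length, new elements.
lengths :: [Int]
lengths = length <$> ["a", "bc", "def"]

-- (<$) replaces every element with a constant, keeping the shape.
blanked :: String
blanked = 'x' <$ [1, 2, 3 :: Int]
```

The results are Just 3, Nothing, [1,2,3] and "xxx" respectively.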

Applicative（`<*>`）

定义

class Functor f => Applicative (f :: * -> *) where
  pure :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b
  GHC.Base.liftA2 :: (a -> b -> c) -> f a -> f b -> f c
  (*>) :: f a -> f b -> f b
  (<*) :: f a -> f b -> f a
  {-# MINIMAL pure, ((<*>) | liftA2) #-}
        -- Defined in ‘GHC.Base’

实例化时，最小实现为pure、(<*>)或liftA2

简介

Applicative类型必然是Functor类型。
Functor只有值是属于上下文的，而Applicative的函数也属于上下文。

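The function on the left of <*> also lives in the context (Applicative f => f (a -> b)), but the function itself still returns a plain value of type b, not a value wrapped in the context.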
Applicative会对结构/context进行改变，但这个改变只和参数的结构有关，与参数数值无关，所提供的函数仅能改变值。如对于列表，输出的长度是两个输入参数长度的乘积。
```
[(2*),(3*)] <*> [2,5,6]
[4,10,12,6,15,18]
```

约束

pure id <*> v = v                            -- Identity
pure f <*> pure x = pure (f x)               -- Homomorphism
u <*> pure y = pure ($ y) <*> u              -- Interchange
-- pure (.) composes morphisms similarly to how (.) composes functions:
pure (.) <*> u <*> v <*> w = u <*> (v <*> w) -- Composition

以上规则推导出来：

-- Applying a "pure" function with (<*>) is equivalent to using fmap
fmap f x = pure f <*> x                      -- fmap

例子

Just (* 2) <*> Just 3
-- Just 6

Nothing <*> Just 3
-- Nothing

Functor只能用于单参数的情况，但与Applicative结合，可以用于多参数的场景：

(+) <$> Just 2 <*> Just 3
-- Just 5

Monad - 单子（`>>=`）

定义

class Applicative m => Monad (m :: * -> *) where
  (>>=) :: m a -> (a -> m b) -> m b
  (>>) :: m a -> m b -> m b
  return :: a -> m a
  fail :: String -> m a
  {-# MINIMAL (>>=) #-}
        -- Defined in ‘GHC.Base’

实例化时，最小实现为(>>=)

简介

Monad类型必然是Applicative类型。
在Applicative基础上，Monad允许函数操纵上下文（返回值是m b）。

约束

(return x) >>= f == f x
m >>= return == m
(m >>= f) >>= g == m >>= (\x -> f x >>= g)

基本使用

<-：从Action中提取Pure值
组合多个action
- (>>=) :: (Monad m) => m a -> (a -> m b) -> m b
  - 运行一个操作，然后把它的结果传递给第二个函数，返回第二操作的结果
```
getLine >>= putStrLn  -- 从键盘读取一行，然后显示出来
```
- (>>) :: (Monad m) => m a -> m b -> m b
  - 把两个操作串联在一起：第一个操作先运行，然后是第二个
  - 丢弃第一个action的结果，返回第二个action的结果
```
m >> k = m >>= (\_ -> k)
```
```
putStrLn "line 1" >> putStrLn "line 2"
```
- do块
  - 一种语法糖
  - 由多行组成
  - 每行有两种形式
    - name <- action
      - 把action的结果绑定到name上
      - 从IO中提取Pure的值
    - action
      - 执行action
return :: Monad m => a -> m a：把Pure值变成一个动作（Action）
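
To make the "syntactic sugar" point concrete, here is the same computation written once with a do block and once with (>>=) directly. The function names are illustrative and Maybe is used so the result can be inspected without I/O.

```haskell
-- Written with do notation.
pairUp :: Maybe Int -> Maybe Int -> Maybe (Int, Int)
pairUp mx my = do
  x <- mx
  y <- my
  return (x, y)

-- The same computation written with (>>=) and lambdas.
pairUp' :: Maybe Int -> Maybe Int -> Maybe (Int, Int)
pairUp' mx my =
  mx >>= \x ->
  my >>= \y ->
  return (x, y)

-- A do line without "<-" corresponds to (>>): its result is discarded.
echoTwice :: String -> IO ()
echoTwice s = do
  putStrLn s
  putStrLn s
```

Both pairUp and pairUp' give Just (1,2) for Just 1 and Just 2, and Nothing as soon as either input is Nothing.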

Control.Monad.MonadPlus

定义

在Monad基础上增加了mzero和mplus。

class (GHC.Base.Alternative m, Monad m) =>
      GHC.Base.MonadPlus (m :: * -> *) where
  GHC.Base.mzero :: m a
  GHC.Base.mplus :: m a -> m a -> m a
        -- Defined in ‘GHC.Base’

约束

mzero >>= f == mzero
m >>= (\x -> mzero) == mzero
mzero `mplus` m == m
m `mplus` mzero == m

常用的Monad类

Identity

概述

Computation type

Simple function application
Binding strategy:

The bound function is applied to the input value.
```
Identity x >>= f == f x
```
Zero and plus: 不支持
Example type:
```
Identity a
```

用途

The purpose of the Identity monad is its fundamental role in the theory of monad transformers.
Any monad transformer applied to the Identity monad yields a non-transformer version of that monad.

例子

-- derive the State monad using the StateT monad transformer
type State s a = StateT s Identity a

Maybe

概述

Computation type

Computations which may return Nothing.
Binding strategy:
- 输入Nothing得到Nothing
- 输入其他值，将被传给绑定的函数
Zero and plus
- mzero是Nothing
- mplus返回第一个不是Nothing的值，或Nothing

用途

可用于链式计算，当一个环节返回Nothing时就会停止求值并返回Nothing。
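
A minimal sketch of such a chain, assuming a hypothetical safeDiv helper that reports division by zero as Nothing:

```haskell
-- Hypothetical helper: division that reports failure instead of crashing.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Each step runs only if the previous one produced a Just;
-- the first Nothing short-circuits the rest of the chain.
calc :: Int -> Int -> Int -> Maybe Int
calc a b c = safeDiv a b >>= \q -> safeDiv q c
```

calc 100 5 2 gives Just 10, while calc 100 0 2 and calc 100 5 0 both give Nothing.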

常用方法

Data.Maybe.fromJust

例子

data MailPref = HTML | Plain
data MailSystem = ...

getMailPrefs :: MailSystem -> String -> Maybe MailPref
getMailPrefs sys name =
  do let nameDB = fullNameDB sys
         nickDB = nickNameDB sys
         prefDB = prefsDB sys
     addr <- (lookup name nameDB) `mplus` (lookup name nickDB)
     lookup addr prefDB

Error

Computation type

Computations which may fail or throw exceptions
Binding strategy

失败时跳过绑定的函数，其他值被用作绑定函数的输入
Useful for

Building computations from sequences of functions that may fail or using exception handling to structure error handling.

Zero and plus
- mzero是空错误
- mplus如果第一个操作失败，则执行第二个

用途

？？

提到Control.Monad.Except，说Either是其实例，但好像Either的定义中没有列出这点。

List

概述

Computation type

Computations which may return 0, 1, or more possible results.
Binding strategy

绑定的函数将应用于输入列表的所有组合，而结果的列表会被concat到一起生成包含所有可能结果的列表。
Useful for

Building computations from sequences of non-deterministic operations. Parsing ambiguous grammars is a common example.
Zero and plus
- mzero是[]
- mplus是++

用途

Used for computations that have to deal with non-determinism. Every possibility is tried until the non-determinism is resolved.
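
As a sketch of the list monad trying every combination, the following enumerates dice rolls. The function name is invented for the example; returning [] plays the role of mzero and prunes a branch.

```haskell
-- All pairs of dice rolls that add up to the target.
rollsSummingTo :: Int -> [(Int, Int)]
rollsSummingTo target = do
  a <- [1 .. 6]                 -- try every value of the first die
  b <- [1 .. 6]                 -- and, for each, every value of the second
  if a + b == target
    then return (a, b)          -- keep this combination
    else []                     -- discard it
```

rollsSummingTo 4 yields [(1,3),(2,2),(3,1)], and an impossible target such as 13 yields [].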

IO

instance Monad IO where
    return a = ...   -- function from a -> IO a
    m >>= k  = ...   -- executes the I/O action m and binds the value to k's input
    fail s   = ioError (userError s)

data IOError = ...

ioError :: IOError -> IO a
ioError = ...

userError :: String -> IOError
userError = ...

catch :: IO a -> (IOError -> IO a) -> IO a
catch = ...

try :: IO a -> IO (Either IOError a)
try f = catch (do r <- f
                  return (Right r))
              (return . Left)

Control.Monad.State.Lazy.State

get :: m s：

Return the state from the internals of the monad.
put :: s -> m ()：

Replace the state inside the monad.
state :: (s -> (a, s)) -> m a：

Embed a simple state action into the monad.
modify :: MonadState s m => (s -> s) -> m ()：

Monadic state transformer.

Maps an old state to a new state inside a state monad. The old state is thrown away.
modify' :: MonadState s m => (s -> s) -> m ()：

A variant of modify in which the computation is strict in the new state.
gets :: MonadState s m => (s -> a) -> m a：

Gets specific component of the state, using a projection function supplied.
runState :: State s a -> s -> (a, s)：

Unwrap a state monad computation as a function. (The inverse of state.)
evalState :: State s a -> s -> a：

Evaluate a state computation with the given initial state and return the final value, discarding the final state
execState :: State s a -> s -> s：

Evaluate a state computation with the given initial state and return the final state, discarding the final value
mapState :: ((a, s) -> (b, s)) -> State s a -> State s b：

Map both the return value and final state of a computation using the given function
withState :: (s -> s) -> State s a -> State s a：

withState f m executes action m on a state modified by applying f
```
withState f m = modify f >> m
```

标准库

字符串函数

内置

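read: parses a string into a value of the required type (a number, for example)
show: converts a value such as a number to a string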
lines :: String -> [String]：把字符串按\n拆开为列表
unlines :: [String] -> String：把字符串列表每一项（包括最后一项）加上换行符拼成一个字符串
words :: String -> [String]：把字符串按空白字符分隔（连续空格、换行等）
unwords :: [String] -> String：把列表项用空格连接
lookup :: Eq a => a -> [(a, b)] -> Maybe b：在一个键值对列表中查找指定的键
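
A short sketch combining the functions above; the names wordsPerLine, tidy and portOf, and the port table, are illustrative.

```haskell
-- Count the words on each line of a text.
wordsPerLine :: String -> [Int]
wordsPerLine = map (length . words) . lines

-- Normalise whitespace: collapse runs of blanks inside each line.
-- unlines adds a newline after every line, including the last.
tidy :: String -> String
tidy = unlines . map (unwords . words) . lines

-- lookup on an association list of key/value pairs.
portOf :: String -> Maybe Int
portOf name = lookup name [("http", 80), ("https", 443)]
```

For instance, wordsPerLine "a b\nc\n" is [2,1] and tidy "a   b\n c \n" is "a b\nc\n".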

Data.Char

digitToInt :: Char -> Int：把字符（0-9，a-f，A-F）转换为数字
toUpper :: Char -> Char
ord :: Char -> Int
chr :: Int -> Char
isDigit :: Char -> Bool
isHexDigit :: Char -> Bool
isPrint :: Char -> Bool
isSpace :: Char -> Bool

正则表达式

实现

`regex-posix`

其中一种正则表达式实现。其他实现也暴露相同的接口。

根据Haskell的Wiki，这个库依赖libc的实现，在一些平台上有Bug。

安装

cabal new-install regex-posix

stack install regex-posix

包
```
import Text.Regex.Posix
```
regex-tdfa

import Text.Regex.TDFA

纯Haskell实现，不支持find-and-replace

{-# LANGUAGE QuasiQuotes #-}

import Text.RawString.QQ
import Text.Regex.TDFA

λ> "2 * (3 + 1) / 4" =~ [r|\([^)]+\)|] :: String
>>> "(3 + 1)"

使用

=~：non-monadic
- =~操作符的参数和返回值都使用了类型类
- 参数
  
  对每个参数我们都可以使用 String 或者 ByteString
  - 第一个参数是要被匹配的文本
  - 第二个参数是准备匹配的正则表达式
- 返回值
  
  The return type is polymorphic through the RegexContext class and supports many types; see the documentation of Text.Regex.Base.Context. Among others:
  - Bool：是否匹配
  - Int：匹配了多少次
  - String：返回第一个匹配的子串，或者表示无匹配的空字符串
  - [[String]]: a list made up of all the matched strings
  - (String,String,String)：获取字符串中首次匹配之前的部分，首次匹配的子串，和首次匹配之后的部分
    - 若匹配失败，整个字符串会作为 “首次匹配之前” 的部分返回，元组的其他两个元素将为空字符串。
  - (String,String,String,[String])：前三项与三元组一样，第四项是所有分组的列表
  - (Int,Int)：首次匹配在字符串中的偏移，以及匹配结果的长度。首元素为-1表示匹配失败
  - [(Int,Int)]：所有匹配在字符串中的偏移，以及匹配结果的长度。空列表表示匹配失败
```
getAllMatches  ("i foobarbar a quux" =~ pat) :: [(Int,Int)]
```
  - 其他（https://github.com/erantapaa/haskell-regexp-examples/blob/master/RegexExamples.hs）
```
-- all matches, all captures, returns offset and length
str =~ regex :: [MatchArray]
getAllMatches $ match regex str :: [MatchArray]

-- all matches, all captures, returns text, offset and length
str =~ regex :: [MatchText String]

-- all matches, all captures, returns offset and length
getAllMatches $ str =~ regex :: Array Int MatchArray
```
=~~：monadic, uses fail on lack of match

自己实现replaceAll功能

{-# LANGUAGE FlexibleContexts #-}

import Text.Regex.Base
import Data.List ( foldl' )

replaceAll :: RegexLike r String => r -> (String -> String) -> String -> String
replaceAll re f s = start end
  where (_, end, start) = foldl' go (0, s, id) $ (getAllMatches $ match re s :: [(Int, Int)])
        go (ind,read,write) (off,len) =
          let (skip, start) = splitAt (off - ind) read
              (matched, remaining) = splitAt len start
           in (off + len, remaining, write . (skip++) . (f matched ++))

import Text.Regex.TDFA
import Data.List ( foldl' )
import Data.Array ( (!) )

replaceAll :: String -> (MatchArray -> String) -> String -> String
replaceAll re f s = start end
  where (_, end, start) = foldl' go (0, s, id) (s =~ re :: [MatchArray])
        go :: (Int,String,String->String) -> MatchArray -> (Int,String,String->String)
        go (ind,read,write) ma =
          let (off,len) = ma!0
              (skip, start) = splitAt (off - ind) read
              (matched, remaining) = splitAt len start
           in (off + len, remaining, write . (skip++) . (f ma ++))

raw-strings-qq：方便写regex，不需要转义\

{-# LANGUAGE QuasiQuotes #-}

module Main
       where

import Text.Regex.Posix
import Text.RawString.QQ

haystack :: String
haystack = "My e-mail address is user@example.com"

needle :: String
needle = [r|\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}|]

multiline :: String
multiline = [r|<HTML>
<HEAD>
<TITLE>Auto-generated html formated source</TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252">
</HEAD>
<BODY LINK="#0000ff" VLINK="#800080" BGCOLOR="#ffffff">
<P> </P>
<PRE>|]

main :: IO ()
main = do
  print multiline
  print ""
  print $ ((haystack =~ needle) :: String)

参考样例

http://pleac.sourceforge.net/pleac_haskell/patternmatching.html

数值函数

类型转换

整数相关转换

fromIntegral :: (Integral a, Num b) => a -> b：把Integral转换为其他数值类型
Integer的特例
- fromInteger :: Num a => Integer -> a：把Integer转换为其他数值类型
- toInteger:: Integral a => a -> Integer：把其他整数类型转换为Integer

实数相关转换

realToFrac:: (Real a, Fractional b) => a -> b：把实数转换为Fractional类型，包括Rational和Double
Rational的特例
- fromRational :: Fractional a => Rational -> a
- toRational :: Real a => a -> Rational

分数到整数的有损转换

ceiling :: (RealFrac a, Integral b) => a -> b
floor :: (RealFrac a, Integral b) => a -> b
truncate :: (RealFrac a, Integral b) => a -> b
round :: (RealFrac a, Integral b) => a -> b

不同精度转换

GHC.Float.float2Double :: Float -> Double
GHC.Float.double2Float :: Double -> Float

整数除法

quot :: Integral a => a -> a -> a, infixl 7: integer division truncated toward zero
rem :: Integral a => a -> a -> a, infixl 7: the remainder that goes with quot, satisfying
```
(x `quot` y)*y + (x `rem` y) == x
```
div :: Integral a => a -> a -> a, infixl 7: integer division truncated toward negative infinity
mod :: Integral a => a -> a -> a, infixl 7: the remainder that goes with div, satisfying
```
(x `div` y)*y + (x `mod` y) == x
```
quotRem :: a -> a -> (a, a)：同时进行quot和rem
divMod :: a -> a -> (a, a)：同时进行div和mod
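
The two families only differ for negative operands. A minimal sketch, with illustrative names:

```haskell
-- quot truncates toward zero; rem takes the sign of the dividend.
truncating :: (Int, Int)
truncating = (-7) `quotRem` 2

-- div rounds toward negative infinity; mod takes the sign of the divisor.
flooring :: (Int, Int)
flooring = (-7) `divMod` 2

-- Both pairs satisfy the reconstruction identities quoted above
-- (for a non-zero divisor y).
reconstructs :: Int -> Int -> Bool
reconstructs x y =
  (x `quot` y) * y + (x `rem` y) == x &&
  (x `div`  y) * y + (x `mod` y) == x
```

truncating is (-3,-1) and flooring is (-4,1); for non-negative operands the two pairs coincide.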

其他

odd :: Integral a => a -> Bool
even :: Integral a => a -> Bool
truncate :: (RealFrac a, Integral b) => a -> b: returns the integer part of a floating-point or rational number, truncating toward zero

Numeric

showHex :: (Integral a, Show a) => a -> ShowS
showIntAtBase :: (Integral a, Show a) => a -> (Int -> Char) -> a -> ShowS
showOct :: (Integral a, Show a) => a -> ShowS

列表函数

内置

基本操作
- length :: [a] -> Int：返回列表长度，需要遍历列表，性能不佳
- null :: [a] -> Bool：判断列表是否为空，可快速判断列表是否为空
- head :: [a] -> a：返回第一个元素
- tail :: [a] -> [a]：丢弃列表第一个元素
- init :: [a] -> [a]：丢弃列表最后一个元素
- last :: [a] -> a：返回最后一个元素
构造列表
- iterate :: (a -> a) -> a -> [a]：返回对a调用f0次、1次、2次。。。的结果
  - 如f是next，每次取一项
- repeat :: a -> [a]
- replicate :: Int -> a -> [a]
- cycle :: [a] -> [a]：把有限列表转换为无限循环列表
产生子列表
- take :: Int -> [a] -> [a]：返回列表前N项
- drop :: Int -> [a] -> [a]：丢弃列表前N项
- splitAt :: Int -> [a] -> ([a], [a])：把前N项和剩余部分分开
- takeWhile :: (a -> Bool) -> [a] -> [a]：提取判断第一次失败前的元素
- dropWhile :: (a -> Bool) -> [a] -> [a]：丢弃判断第一次失败前的元素
- break :: (a -> Bool) -> [a] -> ([a], [a])：提取列表中使谓词失败的元素组成二元组的首项
```
ghci> break odd [2,4,5,6,8]
([2,4],[5,6,8])
```
- span :: (a -> Bool) -> [a] -> ([a], [a])：提取列表中使谓词成功的元素组成二元组的首项
```
ghci> span even [2,4,5,6,8]
([2,4],[5,6,8])
```
列表操作
- (++) :: [a] -> [a] -> [a]：追加
- concat :: [[a]] -> [a]：把列表的列表合并为一个列表。每次去掉一层方括号
- concatMap :: Foldable t => (a -> [b]) -> t a -> [b]：map后再concat
- reverse :: [a] -> [a]
逻辑运算
- (&&) :: Bool -> Bool -> Bool
- (||) :: Bool -> Bool -> Bool
- not :: Bool -> Bool
- otherwise :: Bool：固定为True
- Data.Bool.bool :: a -> a -> Bool -> a: bool x y p is equivalent to if p then y else x (needs import Data.Bool)
- and :: [Bool] -> Bool
- or :: [Bool] -> Bool
- all :: (a -> Bool) -> [a] -> Bool
- any :: (a -> Bool) -> [a] -> Bool
搜索列表
- elem :: (Eq a) => a -> [a] -> Bool：判断元素在列表中
- notElem :: (Eq a) => a -> [a] -> Bool：判断元素不在列表中
- Data.List.elemIndex :: Eq a => a -> [a] -> Maybe Int：返回元素在列表中的下标
- maximum :: Ord a => [a] -> a：找出列表中的最大值
- (!!) :: [a] -> Int -> a：从列表中取出指定项。下标从0开始
高阶操作
- zip :: [a] -> [b] -> [(a, b)]：从两个列表依次分别抽取一项形成二元组列表，长度等于最短的列表。三列表版本为zip3
- zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]：从两个列表依次分别抽取一项，经函数运算后，形成新的列表。长度等于最短的列表。三列表版本为zipWith3
- map :: (a -> b) -> [a] -> [b]
- filter :: (a -> Bool) -> [a] -> [a]：留下所有满足条件的元素
- foldl :: (a -> b -> a) -> a -> [b] -> a：从左边开始进行折叠
  - 不建议实际使用foldl
    - Because of lazy evaluation, foldl builds a chain of unevaluated thunks for the intermediate results until the final value is demanded
      - 性能低下
      - 如果层次深还会耗尽栈空间
  - 可以用Data.List中的foldl'来代替
- foldr :: (a -> b -> b) -> b -> [a] -> b：从右边开始进行折叠
  - Functions that can be defined with foldr are called primitive recursive. A large share of list-processing functions are primitive recursive.
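
The advice on folds can be sketched as follows: foldl' for strict accumulation, and foldr for defining other list functions. The function names are invented for the example.

```haskell
import Data.List (foldl')

-- Strict left fold: the accumulator is forced at every step,
-- so no chain of thunks builds up.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

-- map and filter expressed through foldr.
mapViaFoldr :: (a -> b) -> [a] -> [b]
mapViaFoldr f = foldr (\x acc -> f x : acc) []

filterViaFoldr :: (a -> Bool) -> [a] -> [a]
filterViaFoldr p = foldr (\x acc -> if p x then x : acc else acc) []

-- foldr can stop early, even on an infinite list, because the
-- step function does not look at acc once it has found a match.
firstBig :: [Int] -> Maybe Int
firstBig = foldr (\x acc -> if x > 100 then Just x else acc) Nothing
```

firstBig [1 ..] terminates with Just 101, which a left fold over the same infinite list could not do.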

Data.List

isPrefixOf :: Eq a => [a] -> [a] -> Bool：检查第一个列表是不是第二个列表的前缀
isInfixOf :: Eq a => [a] -> [a] -> Bool：检查第一个列表是不是第二个列表的一部分
isSuffixOf :: Eq a => [a] -> [a] -> Bool：检查第一个列表是不是第二个列表的后缀
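groupBy :: (a -> a -> Bool) -> [a] -> [[a]]: groups only adjacent elements that the predicate treats as equal, so sort the input first if all equal elements should end up in one group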
tails :: [a] -> [[a]]：返回输入字符串的所有后缀，以及一个空列表
```
> tails "foobar"
["foobar","oobar","obar","bar","ar","r",""]
> tails ""
[""]
```
intersperse :: a -> [a] -> [a]：把第一个参数插入到第二个参数的每个项目间
```
>>> intersperse ',' "abcde"
"a,b,c,d,e"
```
intercalate :: [a] -> [[a]] -> [a]：把第二个参数各项间插入第一个参数，并concat为一个列表。intercalate xs xss等价于concat (intersperse xs xss)
```
>>> intercalate ", " ["Lorem", "ipsum", "dolor"]
"Lorem, ipsum, dolor"
```

其他数据结构

Data.Vector

Data.Map

在containers包中。

import qualified Data.Map.Strict as Map

import qualified Data.Map.Lazy as Map

构造

empty :: Map k a
singleton :: k -> a -> Map k a
fromSet :: (k -> a) -> Set k -> Map k a
fromList :: Ord k => [(k, a)] -> Map k a

有相同key的元素，保留最后一个

fromListWith :: Ord k => (a -> a -> a) -> [(k, a)] -> Map k a

fromListWith (++) [(5,"a"), (5,"b"), (3,"b"), (3,"a"), (5,"a")] == fromList [(3, "ab"), (5, "aba")]

查询

member :: Ord k => k -> Map k a -> Bool
notMember :: Ord k => k -> Map k a -> Bool
lookup :: Ord k => k -> Map k a -> Maybe a
findWithDefault :: Ord k => a -> k -> Map k a -> a
lookupLT, lookupGT, lookupLE, lookupGE :: Ord k => k -> Map k v -> Maybe (k, v)

操纵

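insert :: Ord k => k -> a -> Map k a -> Map k a

If the key already has a value, it is replaced.
insertWith :: Ord k => (a -> a -> a) -> k -> a -> Map k a -> Map k a

If the key already has a value, the combining function is called as f new old; otherwise the value is inserted directly. Older containers releases offered a strict variant named insertWith' (the trailing ' marks strictness); it has since been removed, and the version in Data.Map.Strict is the strict one.
union :: Ord k => Map k a -> Map k a -> Map k a

Merges two maps. If both contain the same key, the value from the left map is kept.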
foldrWithKey :: (k -> a -> b -> b) -> b -> Map k a -> b

foldrWithKey f z == foldr (uncurry f) z . toAscList

转换

toAscList :: Map k a -> [(k, a)]：转换为一个按键值递增的列表
toDescList :: Map k a -> [(k, a)]：转换为一个按键值递减的列表
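
A word-count sketch tying together construction, update and query. It assumes the containers package (shipped with GHC); the function names are illustrative.

```haskell
import qualified Data.Map.Strict as Map

-- Count how often each word occurs; duplicates are combined with (+).
wordCounts :: String -> Map.Map String Int
wordCounts = Map.fromListWith (+) . map (\w -> (w, 1)) . words

-- insertWith combines the new value with an existing one, if any.
bump :: String -> Map.Map String Int -> Map.Map String Int
bump w = Map.insertWith (+) w 1

-- findWithDefault supplies a fallback for missing keys.
countOf :: String -> Map.Map String Int -> Int
countOf = Map.findWithDefault 0
```

Map.toAscList (wordCounts "a b a") gives [("a",2),("b",1)].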

Data.Array

import Data.Array (Array(..), (!), bounds, elems, indices,
                    ixmap, listArray)
import Data.Ix (Ix(..))

Data.Sequence

import qualified Data.Sequence as Seq

位操作

import Data.Bits

bit :: Bits a => Int -> a：返回只设置一个bit的数值
setBit :: Bits a => a -> Int -> a：设置指定bit
clearBit :: Bits a => a -> Int -> a
testBit :: Bits a => a -> Int -> Bool
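finiteBitSize :: FiniteBits b => b -> Int: number of bits; only for fixed-width types
bitSizeMaybe :: Bits a => a -> Maybe Int: equivalent to Just . finiteBitSize for fixed-width types, Nothing for unbounded types such as Integer
popCount :: Bits a => a -> Int: number of bits set to 1 (Hamming weight)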
(.|.) :: Bits a => a -> a -> a：位或
(.&.) :: Bits a => a -> a -> a：位与
xor :: Bits a => a -> a -> a：位异或
complement :: a -> a：取反
shift :: a -> Int -> a：根据数值左移（正数）或右移（负数）
shiftL :: a -> Int -> a：左移
shiftR :: a -> Int -> a：右移
rotate :: Bits a => a -> Int -> a：循环
rotateL :: Bits a => a -> Int -> a：向左循环
rotateR :: Bits a => a -> Int -> a：向右循环
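
A few of the bit operations above, sketched on Word8 so the wrap-around is easy to see. The value names are illustrative.

```haskell
import Data.Bits
import Data.Word (Word8)

-- Build a mask from individual bit positions: bits 0 and 3 give 9.
mask :: Word8
mask = bit 0 .|. bit 3

-- Set and test single bits.
withBit7 :: Word8
withBit7 = setBit mask 7

hasBit3 :: Bool
hasBit3 = testBit mask 3

-- Shifts drop bits that leave the word; rotations wrap them around.
shifted, rotated :: Word8
shifted = 0x81 `shiftL` 1
rotated = 0x81 `rotateL` 1

-- Number of bits set to 1.
onesIn :: Word8 -> Int
onesIn = popCount
```

With 0x81 (binary 10000001), shifted is 0x02 because the top bit is lost, while rotated is 0x03 because it wraps to the bottom.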

对函数的操纵（元函数？）

flip :: (a -> b -> c) -> b -> a -> c: swaps the order of the arguments of a two-argument function

Data.Function中的：

on :: (b -> b -> c) -> (a -> b) -> a -> a -> c：on b u x y等于b (u x) (u y)
- sortBy (compare `on` fst)
fix :: (a -> a) -> a：fix f是f的最小不动点（least fixed point）
- fix f = let {x = f x} in x：https://en.wikibooks.org/wiki/Haskell/Fix_and_recursion
- 也可以通过Control.Monad.Fix模块导入
- 支持在lambda中使用递归
- 例子
  - 生成一个常量
```
  fix (const "hello")
= let {x = const "hello" x} in x
= "hello"
```
  - Producing an infinite list
```
  fix (1:)
= let {x = 1 : x} in x
= let {x = 1 : x} in 1 : x
```
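
To make the point about recursion inside a lambda concrete, here is a minimal sketch. The names factorial, self and ones are chosen for illustration only.

```haskell
import Data.Function (fix)

-- Factorial without a named recursive binding: fix passes the function
-- "itself" to the lambda as its first argument (called self here).
factorial :: Integer -> Integer
factorial = fix (\self n -> if n <= 1 then 1 else n * self (n - 1))

-- Laziness lets us take a finite prefix of the infinite list fix (1:).
ones :: [Int]
ones = take 3 (fix (1:))
```

factorial 5 evaluates to 120 and ones to [1,1,1].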
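(.) :: (b -> c) -> (a -> b) -> a -> c: function composition, (f . g) x = f (g x)
- A composition chain can be arbitrarily long
- The output type of the function to the right of . must match the input type of the function to its left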
```
-- Count the words in a string that start with an uppercase letter
-- (isUpper comes from Data.Char)
capCount = length . filter (isUpper . head) . words
```
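- (.) is right-associative
($) :: forall r a (b :: TYPE r). (a -> b) -> a -> b: application operator
- (f $ x) is equivalent to (f x), but $ is right-associative with the lowest precedence, which lets you drop some parentheses
```
f $ g $ h x  =  f (g (h x))
```
- It is also useful in other places, for example
  - map ($ 0) xs
  - zipWith ($) fs xs
(Data.Function.&) :: a -> (a -> b) -> b: passes the result of one step on to the next. Reverse application operator; its precedence is one level higher than ($) (infixl 1 versus infixr 0)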
```
>>> 5 & (+1) & show
"6"
```
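curry :: ((a, b) -> c) -> a -> b -> c: turns a function that takes an (a, b) tuple into a function that takes two arguments
```
>>> curry fst 1 2
1
```

uncurry :: (a -> b -> c) -> (a, b) -> c: turns a two-argument function into a function that takes a tuple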

>>> uncurry (+) (1,2)
3

>>> uncurry ($) (show, 1)
"1"

>>> map (uncurry max) [(1,2), (3,4), (6,8)]
[2,4,8]

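Date and time

import Data.Time.Clock (UTCTime(..))

Errors and exception handling

error :: [Char] -> a: reports an error and aborts evaluation
System.IO.Error.catchIOError :: IO a -> (IOError -> IO a) -> IO a
- If an IOError is raised, the value of the given handler is used instead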
```
tempdir <- catchIOError (getTemporaryDirectory) (\_ -> return ".")
```
Control.Exception.finally :: IO a -> IO b -> IO a
- The second action runs whether or not the first one throws an exception
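
A minimal sketch of finally in action follows. It uses an IORef as a log so the order of events can be inspected; runWithCleanup and say are hypothetical names.

```haskell
import Control.Exception (SomeException, finally, try)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Run an action that fails, with a cleanup step attached via finally.
-- The returned log records what actually ran, in order.
runWithCleanup :: IO [String]
runWithCleanup = do
  logRef <- newIORef []
  let say msg = modifyIORef logRef (++ [msg])
  result <- try ((say "work" >> ioError (userError "boom")) `finally` say "cleanup")
  case result of
    Left e   -> say ("caught: " ++ show (e :: SomeException))
    Right () -> say "no exception"
  readIORef logRef
```

The log starts with "work" followed by "cleanup", and only then the entry written by the Left branch, which shows that the cleanup ran before the exception reached the caller.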

Control.Exception.handle :: Exception e => (e -> IO a) -> IO a -> IO a

If the action throws an exception, the handler is run instead.

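{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (handle, SomeException)
import System.IO (IOMode(..), hClose, hFileSize, openFile)

-- Returns Nothing if any exception is thrown while opening or reading the file
getFileSize :: FilePath -> IO (Maybe Integer)
getFileSize path = handle (\(_ :: SomeException) -> return Nothing) $ do
  h <- openFile path ReadMode
  size <- hFileSize h
  hClose h
  return (Just size)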

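Control.Exception.bracket :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c

Similar to RAII: it guarantees that the resource is released.

The first argument is the action that acquires the resource, the second is the action that releases it, and the third is the action to run on the resource.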

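{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (bracket, handle, SomeException)
import System.IO (IOMode(..), hClose, hFileSize, openFile)

-- bracket guarantees that hClose runs even if hFileSize throws
getFileSize :: FilePath -> IO (Maybe Integer)
getFileSize path = handle (\(_ :: SomeException) -> return Nothing) $ do
  bracket (openFile path ReadMode) hClose $ \h -> do
    size <- hFileSize h
    return (Just size)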

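Binary data and efficient file handling

Data.ByteString.ByteString: represents text or binary data as an array
- Suited to data that needs random access and where memory footprint is not a concern
Data.ByteString.Lazy.ByteString: represents text or binary data as a list of chunks (64 KB chunks according to the original note; the default chunk size in current bytestring releases is roughly 32 KB)
- Suited to large file streams (hundreds of MB up to several TB)

Both modules define functions such as take and readFile.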

pack :: [Word8] -> ByteString

import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as L8

hasElfMagic :: L.ByteString -> Bool
hasElfMagic content = L.take 4 content == elfMagic
    where elfMagic = L.pack [0x7f, 0x45, 0x4c, 0x46]

unpack :: ByteString -> [Word8]
split :: Word8 -> ByteString -> [ByteString]
L8.readInt :: ByteString -> Maybe (Int, ByteString)
L8.uncons :: L.ByteString -> Maybe (Char, L.ByteString): splits off the first element

ByteString conversions

The UTF-8 modules come from the utf8-string package.

import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString as BS
import qualified Data.Text as TS
import qualified Data.Text.Lazy as TL
import qualified Data.ByteString.Lazy.UTF8 as BLU
import qualified Data.ByteString.UTF8 as BSU
import qualified Data.Text.Encoding as TSE
import qualified Data.Text.Lazy.Encoding as TLE

-- String <-> ByteString

BLU.toString   :: BL.ByteString -> String
BLU.fromString :: String -> BL.ByteString
BSU.toString   :: BS.ByteString -> String
BSU.fromString :: String -> BS.ByteString

-- String <-> Text

TL.unpack :: TL.Text -> String
TL.pack   :: String -> TL.Text
TS.unpack :: TS.Text -> String
TS.pack   :: String -> TS.Text

-- ByteString <-> Text

TLE.encodeUtf8 :: TL.Text -> BL.ByteString
TLE.decodeUtf8 :: BL.ByteString -> TL.Text
TSE.encodeUtf8 :: TS.Text -> BS.ByteString
TSE.decodeUtf8 :: BS.ByteString -> TS.Text

-- Lazy <-> Strict

BL.fromStrict :: BS.ByteString -> BL.ByteString
BL.toStrict   :: BL.ByteString -> BS.ByteString
TL.fromStrict :: TS.Text -> TL.Text
TL.toStrict   :: TL.Text -> TS.Text
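
The conversions above can be chained. The sketch below uses only the text and bytestring packages (not utf8-string); toLazyUtf8, fromLazyUtf8 and byteLength are hypothetical helper names. Note that decodeUtf8 throws an exception on invalid UTF-8 input.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as TS
import qualified Data.Text.Encoding as TSE

-- String -> strict Text -> strict UTF-8 bytes -> lazy bytes
toLazyUtf8 :: String -> BL.ByteString
toLazyUtf8 = BL.fromStrict . TSE.encodeUtf8 . TS.pack

-- The same path in reverse: lazy bytes -> strict bytes -> Text -> String
fromLazyUtf8 :: BL.ByteString -> String
fromLazyUtf8 = TS.unpack . TSE.decodeUtf8 . BL.toStrict

-- Number of bytes a String occupies once encoded as UTF-8.
-- A character such as '\233' (e with acute accent) is one Char but two bytes.
byteLength :: String -> Int
byteLength = BS.length . TSE.encodeUtf8 . TS.pack
```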

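Monad

Basic functions

(>>=) :: Monad m => m a -> (a -> m b) -> m b
(>>) :: Monad m => m a -> m b -> m b
return :: Monad m => a -> m a
fail :: MonadFail m => String -> m a (a Monad method in older versions of base; part of MonadFail since base 4.13)
sequence :: Monad m => [m a] -> m [a]: runs a list of monadic actions in order and returns the list of results
sequence_ :: Monad m => [m a] -> m (): runs the actions in order and discards the results (only the side effects matter)
(=<<) :: Monad m => (a -> m b) -> m a -> m b: >>= with its arguments flipped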
```
f =<< x = x >>= f
```

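Monadic versions of list functions

mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]: runs the given monadic action on each element of the list and keeps the results. Fits best when a named function is applied to data computed by a complex expression
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m (): runs the given monadic action on each element of the list and discards the results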
```
putString :: [Char] -> IO ()
putString s = mapM_ putChar s
```
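forM :: (Monad m) => [a] -> (a -> m b) -> m [b]: mapM with its arguments flipped; fits best when the data comes first and is followed by an anonymous function
Control.Monad.filterM :: Applicative m => (a -> m Bool) -> [a] -> m [a]: filter with a monadic predicate
Control.Monad.liftM :: Monad m => (a1 -> r) -> m a1 -> m r: lifts a pure function so that it operates on a monadic value
Control.Monad.foldM :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a: monadic version of foldl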
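
These functions are not limited to IO. As a sketch, here they are in the Maybe monad, where a single Nothing aborts the whole computation; safeDiv, divideAll and halves are made-up names.

```haskell
import Control.Monad (foldM)

-- Division that fails on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- foldM threads the accumulator through safeDiv from left to right
-- and yields Nothing as soon as one step fails.
divideAll :: Int -> [Int] -> Maybe Int
divideAll = foldM safeDiv

-- mapM returns the collected results only if every element succeeds.
halves :: [Int] -> Maybe [Int]
halves = mapM (\x -> if even x then Just (x `div` 2) else Nothing)
```

For example, divideAll 100 [2, 5] is Just 10, while divideAll 100 [2, 0, 5] is Nothing.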

Control.Monad.filterM :: Applicative m => (a -> m Bool) -> [a] -> m [a]

import Control.Monad (filterM)
import System.Directory (doesDirectoryExist)
import System.Environment (getArgs)

-- this program prints only the directories named on the command line
main :: IO ()
main = do names <- getArgs
          dirs  <- filterM doesDirectoryExist names
          mapM_ putStrLn dirs

Control.Monad.zipWithM :: (Monad m) => (a -> b -> m c) -> [a] -> [b] -> m [c]
Control.Monad.zipWithM_ :: (Monad m) => (a -> b -> m c) -> [a] -> [b] -> m ()

Conditional monadic computations

Control.Monad.when :: (Monad m) => Bool -> m () -> m ()
```
when p s = if p then s else return ()
```
Control.Monad.unless :: (Monad m) => Bool -> m () -> m ()
```
unless p s = when (not p) s
```

ap and lifting

These turn an ordinary function into one that operates on monadic values.

liftM :: (Monad m) => (a -> b) -> (m a -> m b)
liftM2 :: (Monad m) => (a -> b -> c) -> (m a -> m b -> m c): variants for more arguments exist, up to liftM5

Control.Monad.ap :: Monad m => m (a -> b) -> m a -> m b

ap = liftM2 ($)

ap u v = do
    f <- u
    x <- v
    return (f x)
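
A short sketch of liftM2 and ap, using Maybe and lists as the monads. The names sumMaybe, tripleSum and pairs are illustrative only.

```haskell
import Control.Monad (ap, liftM2)

-- liftM2 applies a two-argument function to two monadic values.
sumMaybe :: Maybe Int
sumMaybe = liftM2 (+) (Just 1) (Just 2)

-- Chaining ap generalises liftM to any number of arguments.
tripleSum :: Maybe Int
tripleSum = return (\a b c -> a + b + c) `ap` Just 1 `ap` Just 2 `ap` Just 3

-- In the list monad, liftM2 combines every element of the first list
-- with every element of the second.
pairs :: [(Int, Char)]
pairs = liftM2 (,) [1, 2] "ab"
```

sumMaybe is Just 3, tripleSum is Just 6, and pairs is [(1,'a'),(1,'b'),(2,'a'),(2,'b')].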

IO functions

import System.IO

Command-line handling

System.Environment.getArgs: returns the command-line arguments as a list of strings
```
args <- getArgs
```

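Standard handles

stdin
stdout
stderr

Input/output

Every function that takes no handle has a counterpart prefixed with h that takes one

putStrLn: outputs a string
- Takes a string
- Creates an action that writes the string to the console
hPutStrLn :: Handle -> String -> IO ()
print :: Show a => a -> IO ()
hPrint :: Show a => Handle -> a -> IO ()
getLine :: IO String: reads a line from the console and returns it as a string
hGetLine :: Handle -> IO String
hGetContents :: Handle -> IO String: reads the whole file. It is lazy, so it also works for large files
interact :: (String -> String) -> IO (): reads from standard input, transforms the content, and writes it to standard output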

File IO

import System.IO (IOMode(..), hClose, hFileSize, openFile)

openFile :: FilePath -> IOMode -> IO Handle
- data IOMode = ReadMode | WriteMode | AppendMode | ReadWriteMode

hClose :: Handle -> IO ()

Example

main = do inFile <- openFile "foo" ReadMode
          contents <- hGetContents inFile
          putStr contents
          hClose inFile

hTell :: Handle -> IO Integer
hIsSeekable :: Handle -> IO Bool
hSeek :: Handle -> SeekMode -> Integer -> IO ()
- data SeekMode = AbsoluteSeek | RelativeSeek | SeekFromEnd
hIsEOF :: Handle -> IO Bool
readFile :: FilePath -> IO String: reads the contents of a file
writeFile :: FilePath -> String -> IO (): writes a string to a file

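Buffering

Haskell has three buffering modes, defined as constructors of BufferMode:

NoBuffering
- No buffering
- Usually performs poorly; not suitable for general-purpose use
LineBuffering
- The output buffer is written out whenever a newline is output, or when the buffer grows too large
- On input, it usually tries to read whatever characters are available in the chunk, up to the first newline
- When reading from a terminal, data is returned immediately each time Enter is pressed
- This is often the default mode
BlockBuffering
- Reads or writes data in fixed-size blocks whenever possible
- Gives the best performance when batch-processing large amounts of data, even if the data is line-oriented
- Unusable for interactive programs, because input blocks until a whole block has been read
- BlockBuffering takes a Maybe argument
  - With Nothing, an implementation-defined buffer size is used
  - A setting such as Just 4096 sets the buffer size to 4096 bytes

Setting the mode

hSetBuffering :: Handle -> BufferMode -> IO ()
hGetBuffering :: Handle -> IO BufferMode

A flush can be forced

hFlush :: Handle -> IO ()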

Command-line arguments

Reading the command-line arguments

System.Environment.getProgName :: IO String
System.Environment.getArgs :: IO [String]

System.Console.GetOpt
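
As a sketch of how System.Console.GetOpt (part of base) is typically used, here is a small option parser. The Flag type, the two options and parseArgs are hypothetical and exist only for illustration.

```haskell
import System.Console.GetOpt

-- Hypothetical flags for illustration.
data Flag = Verbose | Output FilePath
  deriving (Eq, Show)

-- Each option lists its short names, long names, argument kind and help text.
options :: [OptDescr Flag]
options =
  [ Option ['v'] ["verbose"] (NoArg Verbose)        "chatty output"
  , Option ['o'] ["output"]  (ReqArg Output "FILE") "output file"
  ]

-- Returns (flags, non-option arguments), or the list of error messages.
parseArgs :: [String] -> Either [String] ([Flag], [String])
parseArgs argv =
  case getOpt Permute options argv of
    (flags, rest, []) -> Right (flags, rest)
    (_, _, errs)      -> Left errs
```

In a real program the argument list would come from getArgs, and usageInfo can render the help text from the same options list.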

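Environment variables

Reading

System.Environment.getEnv :: String -> IO String: returns the given environment variable; throws an exception if it does not exist
System.Environment.getEnvironment :: IO [(String, String)]: returns all environment variables

Setting environment variables was historically not cross-platform. On POSIX platforms such as Linux you can use putEnv or setEnv from the System.Posix.Env module; for Windows nothing equivalent was defined. Newer versions of base (4.7 and later) also provide System.Environment.setEnv and unsetEnv.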

Directory operations

System.Directory.removeFile :: FilePath -> IO ()
System.Directory.renameFile :: FilePath -> FilePath -> IO ()
System.Directory.doesDirectoryExist :: FilePath -> IO Bool
System.Directory.doesFileExist :: FilePath -> IO Bool
System.Directory.getCurrentDirectory :: IO FilePath
System.Directory.getDirectoryContents :: FilePath -> IO [FilePath]
System.Directory.getModificationTime :: FilePath -> IO Data.Time.Clock.UTCTime (older versions of directory returned System.Time.ClockTime)

System.Directory.getPermissions :: FilePath -> IO Permissions

data Permissions
  = Permissions {readable :: Bool,
                 writable :: Bool,
                 executable :: Bool,
                 searchable :: Bool}
      -- Defined in System.Directory
instance Eq Permissions -- Defined in System.Directory
instance Ord Permissions -- Defined in System.Directory
instance Read Permissions -- Defined in System.Directory
instance Show Permissions -- Defined in System.Directory

System.FilePath (path manipulation, pure functions)

(</>) :: FilePath -> FilePath -> FilePath: joins two paths
dropTrailingPathSeparator :: FilePath -> FilePath: removes a trailing path separator
splitFileName :: FilePath -> (String, String): splits a path at the last separator into (directory, file name)
takeExtension :: FilePath -> String
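
A brief sketch of these functions, assuming the filepath package. The helper moveTo and the examples list are made up for illustration; the inputs use / as separator, which System.FilePath accepts on both POSIX and Windows.

```haskell
import System.FilePath

-- Replace the directory part of a path but keep its file name.
moveTo :: FilePath -> FilePath -> FilePath
moveTo dir path = dir </> snd (splitFileName path)

-- Every entry evaluates to True.
examples :: [Bool]
examples =
  [ splitFileName "dir/file.txt" == ("dir/", "file.txt")
  , takeExtension "dir/file.txt" == ".txt"
  , dropTrailingPathSeparator "dir/" == "dir"
  , moveTo "out" "dir/file.txt" == "out" </> "file.txt"
  ]
```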

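Temporary files

System.IO.openTempFile :: FilePath -> String -> IO (FilePath, Handle)
- Arguments
  - The first is the directory in which to create the file, such as . or the result of getTemporaryDirectory
  - The second is a file name template; random characters are added to make the name unique
- Return value
  - The file path
  - A file handle opened in ReadWriteMode
- When finished, hClose the handle and delete the file with removeFile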
System.IO.openBinaryTempFile :: FilePath -> String -> IO (FilePath, Handle)
System.Directory.getTemporaryDirectory :: IO FilePath
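
One way to make sure the handle is closed and the file removed is to combine openTempFile with bracket. The sketch below assumes the directory package; withTempFile and roundTrip are hypothetical helpers, not library functions.

```haskell
import Control.Exception (bracket)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Create a temp file, pass its path and handle to an action, then always
-- close the handle and delete the file, even if the action throws.
withTempFile :: String -> (FilePath -> Handle -> IO a) -> IO a
withTempFile template action = do
  dir <- getTemporaryDirectory
  bracket (openTempFile dir template)
          (\(path, h) -> hClose h >> removeFile path)
          (\(path, h) -> action path h)

-- Write a single line, seek back to the start, and read it again
-- through the same ReadWriteMode handle.
roundTrip :: String -> IO String
roundTrip s = withTempFile "demo.txt" $ \_ h -> do
  hPutStrLn h s
  hSeek h AbsoluteSeek 0
  hGetLine h
```

hGetLine is used instead of hGetContents so that the data is fully read before bracket closes the handle.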

UTF-8 support

System.IO

import System.IO

main = do
    inputHandle <- openFile "input.txt" ReadMode
    hSetEncoding inputHandle utf8
    theInput <- hGetContents inputHandle

    outputHandle <- openFile "output.txt" WriteMode
    hSetEncoding outputHandle utf8
    hPutStr outputHandle (unlines . process . lines $ theInput)

    hClose inputHandle
    hClose outputHandle
  where
    -- process stands in for your own [String] -> [String] line transformation
    process = id

Encoding

Use the encoding package

Reading a UTF-8 file

{-# LANGUAGE ImplicitParams #-}
import Prelude hiding (getContents,putStr)
import System.IO.Encoding
import Data.Encoding.UTF8

main = do
  let ?enc = UTF8
  str <- getContents
  putStr str

Using the current system encoding

{-# LANGUAGE ImplicitParams #-}
import Prelude hiding (getContents,putStr)
import System.IO.Encoding

main = do
  e <- getSystemEncoding
  let ?enc = e
  str <- getContents
  putStr str

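Other functions

id :: a -> a: takes a value and returns it unchanged
max :: Ord a => a -> a -> a: returns the larger of two values
const :: a -> b -> a: returns its first argument and ignores the second

Other libraries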

`Parsec`

https://hackage.haskell.org/package/parsec
Installation
```
cabal update
cabal install parsec
```

import

import Text.ParserCombinators.Parsec hiding (spaces)

Basic character parsers
- oneOf
- letter
- digit

`Data.Binary`

Provides extensive support for lazy ByteStrings, mainly for serialization.

put :: t -> Put：Encode a value in the Put monad.
get :: Get t：Decode a value in the Get monad
putList :: [t] -> Put：Encode a list of values in the Put monad. The default implementation may be overridden to be more efficient but must still have the same encoding format.
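
As a sketch of how put and get fit together, here is a hand-written Binary instance with an encode/decode round trip, assuming the binary package. The Point type and roundTrip are hypothetical.

```haskell
import Data.Binary (Binary (..), decode, encode)

-- Hypothetical record for illustration.
data Point = Point Int Int
  deriving (Eq, Show)

-- put writes the fields in order; get must read them back in the same order.
instance Binary Point where
  put (Point x y) = put x >> put y
  get = do
    x <- get
    y <- get
    return (Point x y)

-- encode produces a lazy ByteString; decode parses it back.
roundTrip :: Point -> Point
roundTrip = decode . encode
```

Note that decode calls error on malformed input; decodeOrFail is the variant that reports failures as a value.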

Testing

`hspec`

Automatic test discovery

Spec.hs needs only a single line:

{-# OPTIONS_GHC -F -pgmF hspec-discover #-}

expectations

https://github.com/hspec/hspec-expectations

expectationFailure :: HasCallStack => String -> Expectation
shouldBe :: (HasCallStack, Show a, Eq a) => a -> a -> Expectation
shouldSatisfy :: (HasCallStack, Show a) => a -> (a -> Bool) -> Expectation
shouldStartWith :: (HasCallStack, Show a, Eq a) => [a] -> [a] -> Expectation
shouldEndWith :: (HasCallStack, Show a, Eq a) => [a] -> [a] -> Expectation
shouldContain :: (HasCallStack, Show a, Eq a) => [a] -> [a] -> Expectation
shouldMatchList :: (HasCallStack, Show a, Eq a) => [a] -> [a] -> Expectation
shouldReturn :: (HasCallStack, Show a, Eq a) => IO a -> a -> Expectation
shouldNotBe :: (HasCallStack, Show a, Eq a) => a -> a -> Expectation
shouldNotSatisfy :: (HasCallStack, Show a) => a -> (a -> Bool) -> Expectation
shouldNotContain :: (HasCallStack, Show a, Eq a) => [a] -> [a] -> Expectation
shouldNotReturn :: (HasCallStack, Show a, Eq a) => IO a -> a -> Expectation

Exceptions

shouldThrow :: Exception e => IO a -> Selector e -> Expectation: expects the action to throw an exception

Control.Exception.evaluate :: a -> IO a can be used to expect exceptions from pure code

import Control.Exception (evaluate)

evaluate (head []) `shouldThrow` anyException
evaluate (head []) `shouldThrow` errorCall "Prelude.head: empty list"

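evaluate stops at the first constructor (weak head normal form); use **force** to evaluate the whole list
- This does not trigger the exception, because evaluation stops at the : constructor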
```
evaluate ('a' : undefined) `shouldThrow` anyErrorCall
```
- This works
```
import Control.DeepSeq (force)
(evaluate . force) ('a' : undefined) `shouldThrow` anyErrorCall
```

A Selector is a predicate: type Selector a = a -> Bool
- Some predefined Selectors
  - anyException :: Selector SomeException
  - anyErrorCall :: Selector ErrorCall
  - anyIOException :: Selector IOException
  - anyArithException :: Selector ArithException
- An equality test also works as a Selector
```
launchMissiles `shouldThrow` (== ExitFailure 1)
```
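- Both error and undefined throw an ErrorCall exception
  - The original note says ErrorCall has no Eq instance (true for older versions of base), so it cannot be matched with an equality test
  - Use errorCall :: String -> Selector ErrorCall to match it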
```
evaluate (head []) `shouldThrow` errorCall "Prelude.head: empty list"
```
- System.IO.Error exports a set of predicates for recognising specific exceptions
  - isAlreadyExistsError :: IOError -> Bool
  - isDoesNotExistError :: IOError -> Bool
  - isAlreadyInUseError :: IOError -> Bool
  - isFullError :: IOError -> Bool
  - isEOFError :: IOError -> Bool
  - isIllegalOperation :: IOError -> Bool
  - isPermissionError :: IOError -> Bool
  - isUserError :: IOError -> Bool
```
launchMissiles `shouldThrow` isPermissionError
```

Floating-point comparison

Codewars provides shouldBeApprox/shouldBeApproxPrec for comparing floating-point numbers

https://github.com/Codewars/hspec-codewars/blob/master/src/Test/Hspec/Codewars.hs

-- | Create approximately equal expectation with margin.
--
-- > shouldBeApprox' = shouldBeApproxPrec 1e-9
shouldBeApproxPrec :: (Fractional a, Ord a, Show a) => a -> a -> a -> Expectation
shouldBeApproxPrec margin actual expected =
  if abs (actual - expected) < abs margin * max 1 (abs expected)
    then return ()
    else expectationFailure message
  where
    message = concat [
      "Test Failed\nexpected: ", show expected,
      " within margin of ", show margin,
      "\n but got: ", show actual]

infix 1 `shouldBeApprox`

-- | Predefined approximately equal expectation.
-- @actual \`shouldBeApprox\` expected@ sets the expectation that @actual@ is
-- approximately equal to @expected@ within the margin of @1e-6@.
--
-- > sqrt 2.0 `shouldBeApprox` (1.4142135 :: Double)
shouldBeApprox :: (Fractional a, Ord a, Show a) => a -> a -> Expectation
shouldBeApprox = shouldBeApproxPrec 1e-6

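QuickCheck

Generates random data automatically for testing.

Integration with `hspec`

Only Property is supported. QuickCheck's property function converts any Testable value into a Property, for example: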

-- file Spec.hs
import Test.Hspec
import Test.QuickCheck

main :: IO ()
main = hspec $ do
  describe "read" $ do
    context "when used with ints" $ do
      it "is inverse to show" $ property $
        \x -> (read . show) x == (x :: Int)

QuickCheck parameters can be adjusted:

import Test.Hspec.Core.QuickCheck (modifyMaxSize)

describe "read" $ do
  modifyMaxSize (const 1000) $ it "is inverse to show" $ property $
    \x -> (read . show) x == (x :: Int)

Debugging

import Debug.Trace (trace)

trace ("calling f with x = " ++ show x) (f x)
trace ("calling f with x = " ++ show x ++ " -> " ++ show result) result
  where result = f x

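ghci

Commands

:module/:m: brings the given modules into scope
:info/:i: shows information about a name, including the precedence and associativity of operators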
```
:info (+)
```
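:set/:unset
- :set +t: prints the type of each result
:type/:t: prints the type of an expression
:load/:l: loads the given file
:browse: lists the contents of the given module

Tips

it: holds the result of the last evaluation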

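Package management

ghc-pkg

ghc-pkg list: lists the installed packages
ghc-pkg unregister: tells GHC that the package is no longer in use. The installed files must be deleted manually.

Sample programs

Console interaction skeleton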

import System.Environment (getArgs)

interactWith function inputFile outputFile = do
  input <- readFile inputFile
  writeFile outputFile (function input)

main = mainWith myFunction
  where mainWith function = do
          args <- getArgs
          case args of
            [input,output] -> interactWith function input output
            _ -> putStrLn "error: exactly two arguments needed"

        -- replace "id" with the name of our function below
        myFunction = id
