# Introduction to regular expressions in Scala

by Victoria Liao

-----------------------------------------------------------

## Table of contents

1. [X] What is a regular expression
1. [X] Regular expression in Scala
    - Regex function #r
    - Find & Replace 
1. [ ] Regular expression patterns
    - Basic tokens
    - Practical problems
1. Reference
------------------------

Convert the Jupyter notebook to one of PDF / Markdown / HTML:

```
jupyter nbconvert --to pdf your_jupyter_notebook.ipynb
jupyter nbconvert --to markdown your_jupyter_notebook.ipynb
jupyter nbconvert --to html your_jupyter_notebook.ipynb
```

## 1. What is a regular expression?

A string is a sequence of characters, and we can write a pattern that describes which strings (or parts of a string) we want to match. Such a pattern is called a regular expression.

Regular expressions are useful for extracting information from text.

---------------------------------------

## 2. Regular expression in Scala


In [1]:
//  Scala dependency requirement 

import scala.util.matching.Regex

val text = "foo bar foo"

import scala.util.matching.Regex

text: String = "foo bar foo"

--------------------

### 2.1 Regex function #r 

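To create a regular expression, we write the pattern as a `String` and call the `.r` method on it. The string is implicitly wrapped in `StringOps`, whose `r` method returns a `Regex`.

`String -> StringOps -> Regex`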
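
As a minimal sketch of this step, calling `.r` should give the same pattern as constructing a `Regex` explicitly. The value names `viaMethod` and `viaConstructor` are illustrative and do not come from the notebook.

```scala
import scala.util.matching.Regex

// Illustrative names; both values wrap the same pattern string.
val viaMethod: Regex = "foo".r               // the String is implicitly wrapped, then .r builds the Regex
val viaConstructor: Regex = new Regex("foo") // explicit construction

println(viaMethod.regex)      // the underlying pattern string
println(viaConstructor.regex)
```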

In [2]:
val pattern = "foo".r

pattern: Regex = foo

### 2.2 Find & replace

### findFirstIn / replaceFirstIn

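To find / replace the first match of the regular expression. `findFirstIn` returns the match wrapped in an `Option` (`None` when nothing matches), and `replaceFirstIn` returns a new string with only the first match replaced.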
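
Because the result is an `Option`, the "no match" case has to be handled somewhere. The following is one possible sketch; the names `fooPattern`, `describeFirst` and `untouched`, and the input `"bar baz"`, are made up for illustration.

```scala
val fooPattern = "foo".r // illustrative name

// findFirstIn returns an Option[String]: Some(firstMatch) or None.
def describeFirst(input: String): String =
  fooPattern.findFirstIn(input) match {
    case Some(found) => s"found '$found'"
    case None        => "no match"
  }

println(describeFirst("foo bar foo")) // found 'foo'
println(describeFirst("bar baz"))     // no match

// replaceFirstIn returns the input unchanged when nothing matches.
val untouched = fooPattern.replaceFirstIn("bar baz", "qux")
println(untouched) // bar baz
```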

In [4]:
pattern findFirstIn text

res3: Option[String] = Some("foo")

In [3]:
val ReplacedText = "bar"

pattern replaceFirstIn(text, ReplacedText)

ReplacedText: String = "bar"
res2_1: String = "bar bar foo"

### findAllIn / replaceAllIn

To find / replace every match of the regular expression
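
A small sketch of both calls, assuming we want the matches as a list rather than printing them one by one. `findAllIn` returns an iterator, and `replaceAllIn` also has an overload that takes a function from each match to its replacement. The names `fooAll`, `sample`, `matches` and `shouted` are illustrative.

```scala
val fooAll = "foo".r // illustrative name
val sample = "foo bar foo"

// findAllIn returns an iterator over the matches; toList materialises it.
val matches: List[String] = fooAll.findAllIn(sample).toList
println(matches)      // List(foo, foo)
println(matches.size) // 2

// replaceAllIn can compute each replacement from the match itself.
val shouted = fooAll.replaceAllIn(sample, m => m.matched.toUpperCase)
println(shouted)      // FOO bar FOO
```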

In [5]:
pattern findAllIn text foreach println

foo
foo


In [6]:
pattern replaceAllIn(text, ReplacedText)

res5: String = "bar bar bar"

----------------------
## 3. Regular expression patterns

To keep things simple, every example in this section uses `findFirstIn`.

### 3.1 Basic tokens

1. A substring that is the same as the pattern
1. \d: any digit from 0 to 9
1. Dot the wildcard
----------------------

In [2]:
//  Scala dependency requirement 

import scala.util.matching.Regex

import scala.util.matching.Regex

### A substring that is the same as the pattern


Pattern: `"foo 1"` 

Match `"foo 1"` in 
 - `"foo 1 fooo"`
 - `"bar foo 1"`
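
Both inputs listed above can be checked in one go. This is only a sketch: the name `literal` and the extra non-matching input `"foo1"` are assumptions added for contrast, and `findFirstMatchIn` is used to show where the match starts.

```scala
val literal = "foo 1".r // illustrative name

// A pattern without metacharacters matches itself anywhere in the input.
// "foo1" is a made-up input that lacks the space and therefore does not match.
val inputs = List("foo 1 fooo", "bar foo 1", "foo1")
for (input <- inputs)
  println(s"$input -> ${literal.findFirstIn(input)}")

// findFirstMatchIn additionally reports the position of the match.
val start = literal.findFirstMatchIn("bar foo 1").map(_.start)
println(start) // Some(4)
```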

In [15]:
val pattern = "foo 1".r
val text = "foo 1 fooo"
pattern findFirstIn text

pattern: Regex = foo 1
text: String = "foo 1 fooo"
res14_2: Option[String] = Some("foo 1")

-----------------------------------

### \d: any digit from 0 to 9 

The preceding backslash `\` distinguishes it from the plain character `d` and indicates that it is a metacharacter.

> Takeaway: in an ordinary Scala string literal the backslash must itself be escaped, so `\d` is written with a double backslash - `"\\d".r`

-----------
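
As an aside, Scala's triple-quoted strings do not process backslash escapes, so they offer a second way to write the same pattern. The sketch below compares the two spellings; the names `escaped` and `raw` and the input `"a1b22c"` are illustrative.

```scala
// Both spellings produce the same pattern \d; the value names are illustrative.
val escaped = "\\d".r    // ordinary string: the backslash itself must be escaped
val raw     = """\d""".r // triple-quoted string: the backslash is taken literally

println(escaped.regex == raw.regex)     // true
println(raw.findFirstIn("2 foo"))       // Some(2)
println(raw.findAllIn("a1b22c").toList) // List(1, 2, 2)
```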

Pattern: `"\\d"`

 Match `1` in `1234`
 
 Match `2` in `2 foo`


In [18]:
val pattern = "\\d".r
val text = "1234"
pattern findFirstIn text

pattern: Regex = \d
text: String = "1234"
res17_2: Option[String] = Some("1")

In [19]:
val pattern = "\\d".r
val text = "2 foo"
pattern findFirstIn text

pattern: Regex = \d
text: String = "2 foo"
res18_2: Option[String] = Some("2")

---------------------

### Dot the wildcard

In poker, a wildcard is a card that can stand in for any other card in the deck. Similarly, `.` (dot) matches any single character: a letter, a digit, whitespace, or punctuation. By default it does not match a line terminator such as `\n`.

```
Note: 
.  is the wildcard
\\. is the dot symbol
```
-----------

Pattern: `...\\.` 

Match 
- `"cat."`
- `"896."`
- `"?=+."`

Skip
- `"abc1"`
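
To see why the escaped dot matters, the sketch below runs the four inputs above against the pattern and, for contrast, against an all-wildcard pattern. The names `threeThenDot`, `fourAny` and `candidates` are illustrative, and `"...."` is not a pattern from the notebook.

```scala
val threeThenDot = "...\\.".r // three wildcards followed by a literal dot
val fourAny      = "....".r   // unescaped: four wildcards, shown only for contrast

val candidates = List("cat.", "896.", "?=+.", "abc1")
for (c <- candidates)
  println(s"$c  escaped=${threeThenDot.findFirstIn(c)}  wildcard=${fourAny.findFirstIn(c)}")
```

Only the escaped pattern skips `abc1`; the all-wildcard pattern accepts any four characters.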

In [3]:
val pattern = "...\\.".r
val text = "cat."
pattern findFirstIn text

pattern: Regex = ...\.
text: String = "cat."
res2_2: Option[String] = Some("cat.")

In [4]:
val pattern = "...\\.".r
val text = "abc1"
pattern findFirstIn text

pattern: Regex = ...\.
text: String = "abc1"
res3_2: Option[String] = None

## 4. Reference

1. Regexone.com. 2021. RegexOne - Learn Regular Expressions - Lesson 1: An Introduction, and the ABCs. [online] Available at: [https://regexone.com/lesson/introduction_abcs](https://regexone.com/lesson/introduction_abcs) [Accessed 5 June 2021].
1. Tutorialspoint.com. 2021. Scala - Regular Expressions - Tutorialspoint. [online] Available at: [https://www.tutorialspoint.com/scala/scala_regular_expressions.htm](https://www.tutorialspoint.com/scala/scala_regular_expressions.htm) [Accessed 5 June 2021].
1. Dib, F., 2021. regex101: build, test, and debug regex. [online] regex101. Available at: [https://regex101.com/](https://regex101.com/) [Accessed 5 June 2021].