# Scalars in Clojure

###### REFS:
The Joy of Clojure - Part 2 Chapter 4

### What are Scalars?
A scalar data type represents a singular value of one of the following types:
   
    number, symbol, keyword, string, or character.

Most of Clojure’s scalar data types are familiar to you, but there are nuances that should be observed.
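
As a quick illustration of the five scalar types listed above, the following sketch asks Clojure for the runtime class of one literal of each kind. The var name `scalar-classes` is made up for this example and is not part of any library.

```clojure
;; One literal of each scalar type, mapped to its runtime class.
;; `scalar-classes` is a hypothetical name used only for this sketch.
(def scalar-classes
  {:number    (class 42)
   :symbol    (class 'foo)
   :keyword   (class :foo)
   :string    (class "a string")
   :character (class \a)})

;; Print each kind next to the class Clojure reports for it
(doseq [[kind klass] scalar-classes]
  (println kind "->" klass))
```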
- - -
## Numerical Precision
<figure>
    <img src="https://static2.srcdn.com/wordpress/wp-content/uploads/2019/06/kevin-the-office-stupidity.jpg?q=50&fit=crop&w=740&h=370" alt="drawing"  style="width:25%" />
    <figcaption>I do the numbers</figcaption>
</figure>
<br>
Numbers in Clojure are by default as precise as they need to be.

Given enough memory, you can store a number of any size or precision, though that is rarely necessary.

Clojure's core functions and forms handle precision for you, so you rarely have to think about it.

However, because Clojure encourages interoperability with its host platform, accuracy is no longer guaranteed once values cross into Java code.

### Truncation

Truncation limits the accuracy of a floating-point number. When a number is truncated,<br> 
its precision is limited by the number of bits that fit into the storage space of its representation. Clojure reads floating-point literals as doubles by default, so long literals are truncated.

To keep the full precision of a decimal literal, append the M suffix. 
Note: 
- M = BigDecimal
- N = BigInt

In [None]:
(let [imadeapi 3.14159265358979323846264338327950288419716939937M]
    (println (class imadeapi))
    (println imadeapi))

(println "-------------")

(let [iatethepi 3.14159265358979323846264338327950288419716939937]
    (println (class iatethepi))
    (println iatethepi))

Essentially, *iatethepi* is truncated because the default Java double type cannot hold that many digits,<br>
while *imadeapi* keeps every digit because the M suffix makes it a BigDecimal.

### Promotion

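When operands of different numeric types are combined, or when a literal is too large for a long, Clojure promotes the value to an appropriate numerical representation. Overflow of a long in plain arithmetic is reported as an error rather than promoted silently.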
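
A minimal sketch of the difference between checked and auto-promoting arithmetic, assuming a default Clojure setup where `*unchecked-math*` is not enabled. The var names `overflow-result` and `promoted` are chosen only for this example.

```clojure
;; Plain + on longs throws instead of silently wrapping around
(def overflow-result
  (try
    (+ Long/MAX_VALUE 1)
    (catch ArithmeticException e :overflow)))

;; The quoted operators (+', *', inc', ...) promote to BigInt
;; when the result no longer fits in a long
(def promoted (+' Long/MAX_VALUE 1))

(println "checked + gave:" overflow-result)
(println "+' gave:" promoted "of class" (class promoted))
```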

In [None]:
(def x 5)

(println "Clojure sets numbers to long by default->" (class x))

;(println "Should still be long->" (class (+ x 9000000000000000)))

;(println "Promoted to BigInt->" (class (+ x 90000000000000000000)))

;(println "Promoted to Double->" (class (+ x 1.0)))

### Overflow

When a numeric calculation produces a value that is too big for its type, the bits of storage wrap around.<br>
Promotion and checked arithmetic usually protect you, but overflow can still occur when you perform unchecked numeric operations on primitive types. 

In [None]:
;; how Clojure handles integer overflows
;(+ Long/MAX_VALUE Long/MAX_VALUE)

;; you can use unchecked-add to allow the overflow to happen
;(unchecked-add Long/MAX_VALUE Long/MAX_VALUE)

    ;; unchecked-add apparently keeps numbers at Long even if I cast them
    ;(class (unchecked-add (int 1) (int 1)))

    ;; Note: Using integer? tests if it's a math integer, not a Java Integer
    ;(integer? (+ 1 1))

    ;; integer? returns true for BigInts as well. If you don't want this behavior, you can 
    ;; use the int? predicate instead
    ;(println "should return true ->"(integer? 13N))
    ;(println "should return false ->"(int? 13N))

;; anyway, to overflow integers you use unchecked-add-int
;(unchecked-add-int Integer/MAX_VALUE Integer/MAX_VALUE)

;; --what happens if we add 2 doubles?
;(+ Double/MAX_VALUE Double/MAX_VALUE)

When is an overflow desirable? I have no idea.


### Underflow

Underflow is the opposite of overflow: the number is so close to zero that it collapses to 0.

In [None]:
(float 0.0000000000000000000000000000000000000000000001)

### Rounding errors

There’s a famous case in which a rounding error caused a Patriot missile to fail, resulting in the deaths of 28 U.S. soldiers during the first Gulf War.<br>
The error was in the representation of a count register’s update interval.<br>
The timer register was meant to update once every 0.1 seconds, but an approximation was used instead, and over the course of 100 hours the timing drifted by 0.34 seconds.

In [None]:
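(let [approx-interval  (/ 209715 2097152)   ; the approximation of 0.1 that was used
      actual-interval (/ 1 10)              ; the exact interval, kept as a ratio
      hours           (* 3600 100 10)       ; number of 0.1-second ticks in 100 hours
      actual-total    (double (* hours actual-interval))
      approx-total    (double (* hours approx-interval))]
    ;; the accumulated drift in seconds
    (- actual-total approx-total))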

Can you spot and fix what is wrong with the following code?

In [None]:
(+ 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 )

;(+ 0.1M 0.1M 0.1M 0.1 0.1M 0.1M 0.1M 0.1M 0.1M 0.1M)


;;ok now tell me why this one works?
;(+ 0.1M 0.1M 0.1M 0.1 0.1M 0.1M 0.1M 0.1M 0.1 0.1M)

- - -

## Rationals

Clojure provides a data type that theoretically lets you retain perfect accuracy: the rational.

Clojure also provides a decimal type (BigDecimal) that is bounded only by your computer's memory, but decimal operations are easily corrupted, for example when a double slips into the calculation, so they are not as reliable.

In [None]:
;; BigDecimal keeps its exponent in a 32-bit integer, so a literal with an exponent this large cannot be read
(def x 1.0E-4300000000M)

In [None]:
;; addition should be associative, so both expressions should yield 17
(def a  1.0e50)
(def b  -1.0e50)
(def c  17.0e00)

;(+ (+ a b) c);;=> 17.0

;(+ a (+ b c));;=> 0.0

How do you fix this? Use the rationalize function. Try it yourself.

Clojure has the following functions to help with rationals:
- rationalize
- rational?
- ratio?
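
The three functions above can be sketched as follows, together with one possible fix for the associativity example. The names `checks`, `tenth`, `ra`, `rb`, `rc`, `left-first`, and `right-first` are hypothetical and exist only for this illustration.

```clojure
;; rational? is true for integers, ratios, and BigDecimals, but not for doubles;
;; ratio? is true only for values of the Ratio type
(def checks
  {:ratio-is-rational  (rational? 1/3)
   :double-is-rational (rational? 0.1)
   :ratio-is-ratio     (ratio? 1/3)
   :integer-is-ratio   (ratio? 3)})

;; rationalize converts a number to an exact rational value
(def tenth (rationalize 0.1))

;; The associativity example, redone with rationalized values
(def ra (rationalize 1.0e50))
(def rb (rationalize -1.0e50))
(def rc (rationalize 17.0e00))

(def left-first  (+ (+ ra rb) rc))
(def right-first (+ ra (+ rb rc)))

(println checks)
(println tenth left-first right-first)
```

With rationals, both groupings should agree, because no digits are dropped along the way.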


Some rules to follow to maintain perfect accuracy:
- Never use Java math libraries unless they return results of BigDecimal.
- Don’t rationalize values that are Java float or double primitives.
- If you must write your own high-precision calculations, do so with rationals.
- Only convert to a floating-point representation as a last resort.

---

## Keywords

Keywords are self-evaluating types that are prefixed by one or more colons.<br>
Keywords always refer to themselves, whereas symbols don’t.

One way of using keywords is to use them as keys (as shown below).<br>

Another way of using keywords is as functions. How would you change the code below to use keywords as functions? 

In [None]:
;(keyword? 'blues)
;(keyword? :blues)
;(symbol? :blues)
;(symbol? 'blues)

;; keywords always refer to themselves, whereas symbols do not
;(identical? :a :a)
;(identical? 'a 'a)
 
;(= :a :a)
;(= 'a 'a)

In [None]:
(def population {:zombies 2700, :humans 9}) ;; a map that uses keywords as keys

(get population :zombies) ;; should be => 2700

(println (/ (get population :zombies)
            (get population :humans))
         "zombies per capita") ;; should output 300 zombies per capita

;; you can also create a keyword from a string like this
;(keyword "test")

An important reason to use keywords as map keys is that they’re also functions that take a map as an argument to perform lookups of themselves.
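
A minimal sketch of keywords used as functions, reusing the zombie population map from this section. The `:vampires` key and the var names `zombies` and `vampires` are made up for the example.

```clojure
(def population {:zombies 2700, :humans 9})

;; A keyword in function position looks itself up in the map
(def zombies (:zombies population))

;; An optional second argument is returned when the key is missing
(def vampires (:vampires population 0))

(println (/ (:zombies population)
            (:humans population))
         "zombies per capita")
```

Compared with `get`, the keyword-first form is shorter and reads naturally when chained or passed to functions such as `map`.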

Keywords can also be used as enums.

<b>Keywords and namespaces</b>

Keywords don’t belong to any specific namespace, although they may appear to at certain times.

In [None]:
;; simple keyword
:not-in-ns

In [None]:
;; this is also not in a namespace, but it appears to be: the double colon adds the current namespace's name as a prefix
::also-not-in-ns

In [None]:
(defn do-blowfish [directive]
    (case directive
        ;; these are only prefixed to look like they belong to a namespace
      
        :aquarium/blowfish (println "feed the fish")
        :crypto/blowfish   (println "encode the message")
        :blowfish          (println "not sure what to do")))
(ns crypto)
(user/do-blowfish :blowfish) ;; not sure what to do
(user/do-blowfish ::blowfish) ;; encode the message

(ns aquarium)
(user/do-blowfish :blowfish) ;; not sure what to do
(user/do-blowfish ::blowfish) ;; feed the fish

When switching to different namespaces using ns, you can use the namespace-qualified keyword syntax to ensure that the correct domain-specific code path is executed.
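
As a small sketch of how the prefix works, the core functions `namespace` and `name` split a keyword into its two parts. The var names below are hypothetical and used only for illustration.

```clojure
;; namespace and name return the prefix and the base name as strings
(def qualified-parts
  [(namespace :aquarium/blowfish) (name :aquarium/blowfish)])

;; an unqualified keyword has no namespace part
(def plain-ns (namespace :blowfish))

;; a double-colon keyword always carries a prefix: the current namespace's name
(def auto-qualified? (some? (namespace ::blowfish)))

(println qualified-parts plain-ns auto-qualified?)
```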

---

## Regex

_"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."_ - Jamie Zawinski

Regular expressions are a powerful and compact way to find specific patterns in text strings.

A literal regex pattern looks like:
    
    #"this"
   
This compiles into a regex pattern object (java.util.regex.Pattern) that can be used with:
- Java interop method calls
- Clojure regex functions

In [None]:
(class #"test")

Regex option flags:

| Flag | Flag name       | Description |
|:---: |:---             |:---         |
|  d   | UNIX_LINES      | ., ^, and \$ recognize only the Unix line terminator \n |
|  i   | CASE_INSENSITIVE| Matches without regard to upper and lower case |
|  x   | COMMENTS        | Ignores whitespace and comments in the pattern |
|  m   | MULTILINE       | ^ and \$ match next to line terminators instead of only at the beginning and end of the input string |
|  s   | DOTALL          | . matches any character, including line terminators |
|  u   | UNICODE_CASE    | Causes the *i* flag to use Unicode case folding instead of ASCII |

In [None]:
(re-matches #"(?i)you should bring in doughnuts" "YoU shOuLd bRinG iN DougHNuTs")
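
A few more flags from the table, as a rough sketch. The input strings and var names are invented for this example.

```clojure
;; (?m) MULTILINE: ^ and $ match at line boundaries inside the input
(def multiline-hit (re-find #"(?m)^b.*$" "a1\nb2\nc3"))

;; Without DOTALL, . stops at a line terminator, so nothing matches
(def without-dotall (re-find #"a.*c" "a\nb\nc"))

;; (?s) DOTALL: . also matches line terminators
(def with-dotall (re-find #"(?s)a.*c" "a\nb\nc"))

;; (?x) COMMENTS: whitespace inside the pattern is ignored
(def spaced-out (re-matches #"(?x) \d{3} - \d{4}" "555-1234"))

(println multiline-hit without-dotall spaced-out)
```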

Regex functions:

re-matcher :  Returns an instance of java.util.regex.Matcher, for use, e.g. in re-find.

re-find    :  Returns the next regex match, if any, of string to pattern, using java.util.regex.Matcher.find().  Uses re-groups to return the groups.

re-groups  :  Returns the groups from the most recent match/find. (Just use re-find)

re-pattern :  Returns an instance of java.util.regex.Pattern built from a string, so you can define a pattern without using the literal.

re-matches :  Returns the match of string to pattern, using java.util.regex.Matcher.matches(). Uses re-groups to return the groups.

In [None]:
;; re-matcher example
(re-matcher #"\d+" "abc12345def")

In [None]:
;; re-find 
(def matcher (re-matcher #"\d+" "abc12345def"))
    ;; \d+ matches a run of one or more digits in the string
(re-find matcher)
;; or 
;(re-find #"\d+" "abc12345def")

In [None]:
(def phone-number "672-345-456-3212")
(def matcher (re-matcher #"\d+" phone-number))

(println (re-find matcher))
(println (re-find matcher))
(println (re-find matcher))
(println (re-find matcher))
(println (re-find matcher))

;; can you rewrite this code to loop through instead?

In [None]:
;; re-groups
(def phone-number "672-345-456-3212")
(def matcher (re-matcher #"\d+" phone-number))
(re-find matcher)
(re-groups matcher)

In [None]:
;; re-pattern
(re-pattern "\\d+")
;; or you can just define the literal #"\d+" 

;(re-find (re-pattern "\\d+") "abc123def")
;(re-find #"\d+" "abc123def")

In [None]:
;; re-matches example
(re-matches #"abc" "abc") ;; if there is a match, it returns a string

(re-matches #"(.*)\d+(.*)" "abc12345def") ;; if there is a match but there are groups, then it returns a vector

As it turns out, Java's Matcher object is mutable and not thread-safe, so avoid re-matcher, re-groups, and the single-argument form of re-find as much as you can.
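
One way to collect every match without handling a Matcher yourself is `re-seq`, which returns a lazy sequence of successive matches. A minimal sketch using the phone number from earlier; the var name `parts` is chosen only for this example.

```clojure
(def phone-number "672-345-456-3212")

;; re-seq returns a lazy sequence of every match of the pattern in the string
(def parts (re-seq #"\d+" phone-number))

;; Walk the matches instead of calling re-find repeatedly
(doseq [part parts]
  (println part))
```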

You can also replace regex matches within a string and split a string on a pattern.

In [None]:
(require 'clojure.string) ;; make sure the namespace is loaded before using it
(clojure.string/replace "mississippi" #"i.." "obb")
;(clojure.string/replace "mississippi" #"(i)" "$1$1") ;; $1 refers to the regex group number
;(clojure.string/split "This is a string    that I am splitting." #"\s+")