# Example: Fun with Hashing Functions
This example familiarizes students with the concept of [hash functions](https://en.wikipedia.org/wiki/Hash_function), i.e., functions that map data to an index of a fixed-size table. In particular, we'll look at the `myhash(...)` function in `src/Compute.jl`, which is an example of a `linear hash function` of the form:
$$
h(x) = (ax+b)~\text{mod}~{m}
$$
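where $\text{mod}$ denotes the [modulo operation](https://en.wikipedia.org/wiki/Modulo), and $a$, $b$, and $m$ are parameters. The $m$ parameter (called the `size`) sets the number of available table slots, so it strongly influences the likelihood of `collisions`, i.e., two different inputs mapping to the same index.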
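To make the formula concrete, here is a minimal sketch for integer keys. The helper `linear_hash` and the values $a = 3$, $b = 7$ are assumptions chosen for illustration; they are not taken from `src/Compute.jl`.

```julia
# Hypothetical helper for illustration (not the `myhash` from src/Compute.jl):
# evaluates h(x) = (a*x + b) mod m for an integer key x.
linear_hash(x::Int; a::Int = 3, b::Int = 7, m::Int = 10) = mod(a * x + b, m)

# Ten keys into m = 10 slots: every key gets its own code
codes_m10 = [linear_hash(x; m = 10) for x in 0:9]
println(codes_m10)   # [7, 0, 3, 6, 9, 2, 5, 8, 1, 4]

# The same ten keys into m = 5 slots: codes must repeat (collisions)
codes_m5 = [linear_hash(x; m = 5) for x in 0:9]
println(codes_m5)    # [2, 0, 3, 1, 4, 2, 0, 3, 1, 4]
```

With $m = 5$ there are only five possible codes, so ten distinct keys cannot all receive different ones.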

## Setup
This example may use external third-party packages. The `Include.jl` file loads our code so that it is accessible in the notebook, sets the paths required for this example, and loads any required external packages.

In [25]:
include("Include.jl");

[32m[1m  Activating[22m[39m project at `~/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-5/L5c`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-5/L5c/Project.toml`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-5/L5c/Manifest.toml`
[32m[1m    Updating[22m[39m registry at `~/.julia/registries/General.toml`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-5/L5c/Project.toml`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-4800-5800-Examples-AY-2024/week-5/L5c/Manifest.toml`


## Example: Computing the `hashcode` of a String using a Linear hash function
One of the interesting (and amazing!) things about the [Dictionary type in Julia](https://docs.julialang.org/en/v1/base/collections/#Base.Dict) or [Python](https://docs.python.org/3/tutorial/datastructures.html#dictionaries) is the ability to map `any key` to `any value` (written `key => value` in Julia), where each `key` is unique. But how does this work?
* Behind the scenes, we use a `hash` function to take a `key` and convert it into an `Int` index into an `Array{typeof(value),1}` that holds the value. Thus, the magic of a dictionary is just a clever way of computing an array index.
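The idea can be sketched with a toy table. Everything here is an assumption made for illustration: `toy_index` is a hypothetical hash (a sum of character code points), the table has 16 slots, and collisions are ignored. Julia's actual `Dict` is more elaborate.

```julia
# Hypothetical toy hash: sum of character code points, reduced to a 1-based slot.
toy_index(key::String; size::Int = 16) = mod(sum(Int(c) for c in key), size) + 1

# A fixed-size table; each slot holds either nothing or a key => value pair
table = Vector{Union{Nothing, Pair{String, Int}}}(nothing, 16)

# Store two entries at the slots computed from their keys
for (k, v) in ["apple" => 3, "pear" => 7]
    table[toy_index(k)] = k => v
end

# Look a value up by recomputing the slot from the key
entry = table[toy_index("pear")]
println(entry.second)   # 7
```

Lookup never searches the table: it recomputes the slot from the key and reads that position directly.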

Let's specify a `test_string` with some data in it:

In [34]:
test_string = "This is a test string. In lecture ... wow!";

Next, let's use the `myhash(...)` function in the `src/Compute.jl` file to compute the `hashcode` of the `test_string`:

In [35]:
test_hashed_value = myhash(test_string, β = 31, size = 1000) # big size

227
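The implementation of `myhash` is not shown in this notebook. One common way to hash a string with a linear step is a polynomial rolling hash, sketched below under that assumption. The name `poly_hash` is hypothetical, the keyword arguments mirror the `β` and `size` used above, and this sketch is not claimed to reproduce the value `227`.

```julia
# Hypothetical sketch, NOT the course's `myhash`: a polynomial rolling hash.
# Each character updates the running code with the linear step h -> (β*h + c) mod size.
function poly_hash(s::String; β::Int = 31, size::Int = 1000)
    h = 0
    for c in s
        # Int(c) is the character's code point; reducing at every step keeps h below `size`
        h = mod(β * h + Int(c), size)
    end
    return h   # always in 0:(size - 1)
end

println(poly_hash("ab"; β = 31, size = 1000))   # 105, since (31*97 + 98) mod 1000 = 105
```

Reducing modulo `size` inside the loop avoids integer overflow for long strings and guarantees the result is a valid zero-based table offset.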

### OK, but does this trick always work?
The short answer is no: sometimes there are `collisions`, i.e., two different keys produce the same `hashcode`. When the `size` parameter is small, it is more likely that two strings will be mapped to the same `hashcode`.
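The effect of `size` can be checked by counting collisions over a batch of keys. The sketch below uses a hypothetical `demo_hash` (a stand-in, not the course's `myhash`) and 50 made-up keys. With only 10 slots, at least 40 of the 50 keys must collide by the pigeonhole principle, whatever hash function is used.

```julia
# Hypothetical string hash for this demo: fold the linear step over the characters.
demo_hash(s::String, size::Int) = foldl((h, c) -> mod(31 * h + Int(c), size), s; init = 0)

# Collisions = number of keys minus number of distinct hashcodes
count_collisions(ks, size) = length(ks) - length(unique(demo_hash.(ks, size)))

sample_keys = ["key-$(i)" for i in 1:50]
small = count_collisions(sample_keys, 10)       # at least 40: only 10 codes exist
large = count_collisions(sample_keys, 10_000)   # typically far fewer
println("size = 10: $(small) collisions; size = 10000: $(large) collisions")
```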

In [42]:
another_test_string = "CHEME-4800"

"CHEME-4800"

In [43]:
another_hashed_value = myhash(another_test_string, β = 31, size = 10000)

2153

In [45]:
what = hash(another_test_string)

0x468e4aa808e10017

In [46]:
typeof(what)

UInt64
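Julia's built-in `hash` returns a `UInt64`, which is far too large to use directly as an array index. A minimal sketch of reducing it to a slot in a table of `size` entries follows; the table size of `10_000` and the variable names are assumptions for illustration, and the raw hash value may differ between Julia versions.

```julia
key = "CHEME-4800"
code = hash(key)                        # UInt64; the exact value can vary by Julia version
size = 10_000                           # assumed table size for this sketch
slot = Int(code % UInt64(size)) + 1     # remainder in 0:(size - 1), shifted to a 1-based index
println(typeof(code), " -> slot ", slot)
```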