# `gitermap` examples (1)

In this notebook we cover a range of examples of using `gitermap`. Most are kept simple in order to cover basic usage. The next example notebook works through more complex examples, using `sklearn` for machine learning with this module.

### Example 1

Basic `map()`.

In [1]:
from gitermap import umap

In [2]:
umap(lambda x: x**2, range(10))

100%|██████████| 10/10 [00:00<?, ?it/s]


[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

### Example 2

A longer example that lets viewers see the progress bar update, thanks to the `sleep` calls.

In [3]:
import time

In [4]:
def long_f(x):
    time.sleep(0.5)
    return x**2

In [5]:
umap(long_f, range(5))

100%|██████████| 5/5 [00:02<00:00,  1.92it/s]


[0, 1, 4, 9, 16]

### Example 3: Parallel

Easy parallelization with one extra letter. By default, the number of cores selected, $C$, is the maximum your machine provides, $C=m-1$, unless there are only $k<m$ iterations, in which case $C=k$.
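The default described above can be written as a small calculation. This is only a sketch of the stated rule, not `gitermap`'s actual code: `default_cores` is a hypothetical helper, and it assumes that $m$ is the machine's core count as reported by `os.cpu_count()`.

```python
import os

def default_cores(k, m=None):
    """Hypothetical helper: worker count C for k iterations on m cores."""
    if m is None:
        # Assumption: m is the core count; cpu_count() may return None.
        m = os.cpu_count() or 1
    # Leave one core free, never exceed the number of iterations,
    # and always keep at least one worker.
    return max(1, min(m - 1, k))

print(default_cores(50, m=8))  # 7: plenty of iterations, so C = m - 1
print(default_cores(3, m=8))   # 3: fewer iterations than cores, so C = k
```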

In [6]:
from gitermap import umapp

In [7]:
_ = umapp(long_f, range(50))

  0%|          | 0/50 [00:00<?, ?it/s]

In [8]:
def abit_more_complex(x, y, special):
    return (x+y)*special

In [9]:
from functools import partial
import numpy as np

my_special_func = partial(abit_more_complex, special=1.5)
x = np.random.randn(10)
y = np.random.rand(10)

_ = umapp(my_special_func, x, y)

  0%|          | 0/10 [00:00<?, ?it/s]

### Example 4: Caching

In [10]:
from gitermap import umapc, umapcc
_ = umapc("examples/example4.pkl", my_special_func, x, y)

Second run...

In [11]:
_ = umapc("examples/example4.pkl", my_special_func, x, y)

#### Example 4.5: Caching with step-chunks

Here a folder called `tmp_umapcc_` is created, with each number stored in a separate file. This is overkill for such a simple example, but very useful when the return value is more complex and takes a long time to compute. The folder is deleted if and only if the final file is written to disk, so if your code breaks partway through a run, re-running the function picks up where you left off by reading in all of the chunks up to the crash point.
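To make the resume behaviour concrete, here is a minimal sketch of the idea using only the standard library. It is not how `umapcc` is implemented: `chunked_map`, the chunk file names, and the directory layout are all hypothetical, chosen for illustration.

```python
import os
import pickle
import shutil
import tempfile

def chunked_map(final_path, f, items, chunk_dir):
    """Hypothetical helper: map f over items, caching each result in its own file."""
    if os.path.exists(final_path):
        with open(final_path, "rb") as fh:
            return pickle.load(fh)
    os.makedirs(chunk_dir, exist_ok=True)
    results = []
    for i, item in enumerate(items):
        chunk_path = os.path.join(chunk_dir, f"chunk_{i}.pkl")
        if os.path.exists(chunk_path):
            # Written by an earlier, interrupted run: load instead of recomputing.
            with open(chunk_path, "rb") as fh:
                results.append(pickle.load(fh))
        else:
            value = f(item)
            with open(chunk_path, "wb") as fh:
                pickle.dump(value, fh)
            results.append(value)
    # Remove the chunk folder only after the final file has been written.
    with open(final_path, "wb") as fh:
        pickle.dump(results, fh)
    shutil.rmtree(chunk_dir)
    return results

calls = []
state = {"fail": True}

def flaky_square(x):
    """Squares x, but simulates a crash at x == 3 while state['fail'] is set."""
    if x == 3 and state["fail"]:
        raise RuntimeError("simulated crash")
    calls.append(x)
    return x ** 2

workdir = tempfile.mkdtemp()
final_path = os.path.join(workdir, "result.pkl")
chunk_dir = os.path.join(workdir, "tmp_chunks")

try:
    chunked_map(final_path, flaky_square, range(5), chunk_dir)
except RuntimeError:
    pass  # chunks 0, 1 and 2 are already on disk

state["fail"] = False
resumed = chunked_map(final_path, flaky_square, range(5), chunk_dir)
print(resumed)  # [0, 1, 4, 9, 16]
print(calls)    # [0, 1, 2, 3, 4]: items 0 to 2 were not recomputed
```

The second call only evaluates the items that were missing, which is the property that matters when each item is expensive.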

In [16]:
_ = umapcc("examples/example4-5.pkl", long_f, range(20))

### Example 5: Context

`MapContext` offers much the same functionality as the raw functions, but gives you far more control over the entire pipeline, including verbosity, parallelism, chunking, and more.

Options include:
- `verbose`: level of verbosity.
- `n_jobs`: number of cores.
- `chunks`: whether to use chunking.
- `return_type`: whether to compute and return a list or produce a generator (only works for `umap`).
- `savemode`: whether to only run once, override, or add on to the cache file; see the examples.

Processing is done through the `compute` function:

In [17]:
from gitermap import MapContext

The same as `umap`

In [18]:
with MapContext() as ctx1:
    ctx1.compute(my_special_func, x, y)

100%|██████████| 10/10 [00:00<00:00, 10012.66it/s]


The same as `umapp`: parallel with `n_jobs=-1`, plus verbosity.

In [19]:
def long_f2(x, y):
    time.sleep(0.5)
    return x**2+y**2

In [20]:
with MapContext(verbose=1, n_jobs=-1) as ctx2:
    data = ctx2.compute(long_f2, range(50), range(50))

  0%|          | 0/50 [00:00<?, ?it/s]

`umapc` example

In [22]:
with MapContext("examples/example5.pkl", verbose=1) as ctx3:
    data = ctx3.compute(my_special_func, x, y)

loading from file 'examples/example5.pkl'


In [23]:
MapContext().compute(long_f, range(10))

100%|██████████| 10/10 [00:05<00:00,  1.96it/s]


[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

### Example 6: Contexts with keyword arguments

You can use `functools.partial` to pass keyword arguments to a custom function `f`. Alternatively, pass the keyword arguments directly to `compute`, which wraps them for you.

So we could use `abit_more_complex` with compute instead, passing in `special` as a keyword argument.
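The two routes are equivalent in effect. The sketch below shows this with the built-in `map`; `compute_with_kwargs` is a hypothetical stand-in for the wrapping step, not the actual `MapContext.compute` implementation.

```python
from functools import partial

def abit_more_complex(x, y, special):
    return (x + y) * special

def compute_with_kwargs(f, *iterables, **kwargs):
    """Hypothetical stand-in: bind keyword arguments, then map positionally."""
    bound = partial(f, **kwargs) if kwargs else f
    return list(map(bound, *iterables))

xs = [1.0, 2.0, 3.0]
ys = [0.5, 0.5, 0.5]

# Route 1: bind the keyword argument yourself.
via_partial = list(map(partial(abit_more_complex, special=2.0), xs, ys))
# Route 2: hand the keyword argument to the mapping function.
via_kwargs = compute_with_kwargs(abit_more_complex, xs, ys, special=2.0)

print(via_partial)  # [3.0, 5.0, 7.0]
print(via_kwargs)   # [3.0, 5.0, 7.0]
```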

In [24]:
_ = MapContext().compute(abit_more_complex, x, y, special=1.5)

100%|██████████| 10/10 [00:00<00:00, 10012.66it/s]


#### Example 6.5: Returning a generator

Instead of getting a list back, we may wish to defer the computation and return a generator. This can be useful if you want to couple a `gitermap` execution pipeline with a custom one, or if you are working with large files and don't want to load them all at once.
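As a reminder of what deferring the work means in plain Python, the sketch below uses a hypothetical `lazy_map` generator (not part of `gitermap`) and records when each item is actually evaluated.

```python
def lazy_map(f, *iterables):
    """Hypothetical helper: yield results one at a time instead of building a list."""
    for args in zip(*iterables):
        yield f(*args)

evaluated = []

def tracked_square(x):
    evaluated.append(x)  # record when the function really runs
    return x ** 2

gen = lazy_map(tracked_square, range(5))
print(evaluated)         # []: nothing has been computed yet

first = next(gen)
print(first, evaluated)  # 0 [0]: only the first item has run

rest = list(gen)
print(rest)              # [1, 4, 9, 16]
```

Each result is produced only when it is requested, so at most one item needs to be held in memory at a time.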

In [26]:
gen1 = MapContext(return_type="generator").compute(long_f, range(10))

In [28]:
list(gen1)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

### Example 7: Changing the save mode type.

This determines how end-caching works over multiple runs. You can choose from:

- `initial`: Saves at the end, then reloads the saved data thereafter.
- `override`: Saves every time it is run, deleting the previous version.
- `add`: Saves every time it is run, concatenating results to the previous version.

The last option is particularly useful when sampling random numbers within probabilistic frameworks. `example2` has an example of this.
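The three modes can be sketched with a plain pickle file. This is an illustration of the behaviour described in the list above, under the assumption that results are stored as a list; `cached_run` is a hypothetical helper, not the `MapContext` implementation.

```python
import os
import pickle
import tempfile

def cached_run(path, compute, savemode="initial"):
    """Hypothetical helper illustrating the three save modes on a pickle file."""
    if savemode not in ("initial", "override", "add"):
        raise ValueError(f"unknown savemode: {savemode!r}")
    exists = os.path.exists(path)
    if savemode == "initial" and exists:
        # Reuse the saved data; compute() is never called.
        with open(path, "rb") as fh:
            return pickle.load(fh)
    result = list(compute())
    if savemode == "add" and exists:
        # Concatenate the new results onto the previous version.
        with open(path, "rb") as fh:
            result = pickle.load(fh) + result
    # "override" (and a first run in any mode) simply writes the file.
    with open(path, "wb") as fh:
        pickle.dump(result, fh)
    return result

path = os.path.join(tempfile.mkdtemp(), "demo.pkl")

a = cached_run(path, lambda: [x ** 2 for x in range(10)], savemode="add")
b = cached_run(path, lambda: [x ** 2 for x in range(20)], savemode="add")
c = cached_run(path, lambda: [0], savemode="initial")
d = cached_run(path, lambda: [1, 2], savemode="override")

print(len(a), len(b))  # 10 30
print(len(c))          # 30: loaded from file, the lambda was not used
print(d)               # [1, 2]: previous contents replaced
```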

In [29]:
ctx4 = MapContext("examples/example7.pkl", n_jobs=-1, savemode="add")

In [30]:
data = ctx4.compute(long_f, range(10))

  0%|          | 0/10 [00:00<?, ?it/s]

In [31]:
data = ctx4.compute(long_f, range(20))

  0%|          | 0/20 [00:00<?, ?it/s]

In [32]:
len(data)

30

#### Example 7.5: Clearing the cached file

In [33]:
ctx4.clear()

### Example 8: Sound!

If you have `simpleaudio` and `numpy` installed, you can play sounds at the end of a run: a happy or not so happy arpeggio!

In [34]:
#!pip install simpleaudio

In [35]:
with MapContext(n_jobs=-1, end_audio=True) as ctx5:
    print(ctx5.end_audio)
    _ = ctx5.compute(long_f, range(30))

True


  0%|          | 0/30 [00:00<?, ?it/s]

### Example 9: Iterables

We also handle the case where iterables without a length (such as iterators and generators) are passed to the list comprehension. The major drawback is that, with no access to the `__len__` attribute, `tqdm` cannot indicate how many iterations remain, and hence cannot show an ETA.

In [36]:
import itertools as it

In [37]:
_ = umap(long_f, it.islice(it.count(), 0, 10))

10it [00:05,  1.96it/s]


However, if **any** of the arguments has a length, an estimate of the number of runs can be made. Here the `range()` object exposes the `__len__` attribute, and we assume that every argument passed has the same length.
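One way to picture this check is shown below. It is a sketch under the assumption that the first argument exposing `__len__` is used as the estimate; `estimate_total` is a hypothetical helper, not a `gitermap` function.

```python
import itertools as it

def estimate_total(*iterables):
    """Hypothetical helper: length of the first sized argument, or None."""
    for arg in iterables:
        if hasattr(arg, "__len__"):
            return len(arg)
    return None  # no argument has a length, so the total is unknown

only_iterator = estimate_total(it.islice(it.count(), 0, 10))
with_range = estimate_total(it.islice(it.count(), 0, 10), range(10))

print(only_iterator)  # None
print(with_range)     # 10
```

A progress bar given `None` as its total can only count completed iterations, which matches the `10it` output above.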

In [38]:
_ = umap(long_f2, it.islice(it.count(), 0, 10), range(10))

100%|██████████| 10/10 [00:05<00:00,  1.96it/s]


Hopefully you enjoyed some of these small examples.

### Example 10: More cache-by-chunks

In [40]:
with MapContext("examples/example10.pkl", chunks=True) as ctx6:
    d10 = ctx6.compute(long_f, range(10))