# Primitive Generators - The Basic Building Blocks


[TOC]


## Introduction

The most fundamental building blocks of `tohu` are the so-called "primitive" generators. A primitive generator is simply a random generator which knows how to produce values of a specific "type" (in a slightly loose sense of the word). For example, there are primitive generators which produce random integers, random boolean values, random names, etc. (see [here](../../reference_guides/primitive_generators/) for a full list).

They are called "primitive" because they do not depend on any other generators in `tohu`. They can, in turn, be combined into more complex generators (see subsequent sections).

This section illustrates how to create and use primitive generators directly. In practice you will rarely need or want to create them manually as we do here; typically they are created as part of a `CustomGenerator` (TODO: see section [...]). However, it is useful to get a feel for how they work under the hood, so let's look at an example.

## First example: using `Integer` to produce random integers

For our first example, let's use the `Integer` generator, which produces random integers in a given range.

Here is the full code snippet and its output (we will look at this line by line in the next section).

In [1]:
from tohu.primitive_generators import Integer

# Create an instance of an Integer generator
g = Integer(100, 200)
g.reset(seed=12345)

# Produce a single value
print(next(g))

# Produce a few more values manually
for _ in range(3):
    print(next(g))
    
# Produce a sequence of values (left commented out here; explained further below)
# g.generate_as_list(num=10, seed=12345)

153
193
101
138


## Analysing the example

### Creating the `Integer` generator

First we create an `Integer` generator that will produce values between 100 and 200.

In [2]:
g = Integer(100, 200)

### Resetting the generator

Before we make it produce any values, we first reset the generator. This initializes the internal (pseudo-)random number generator so that the output is reproducible. The `seed` argument passed to the `reset` method can have any value; as long as you pass the same seed, the generator will produce the same sequence of output values.

In [3]:
g.reset(seed=12345)

<Integer (id=1273e4)>
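
The effect of seeding can be illustrated with the standard library alone. The sketch below is not how `tohu` is implemented; `draw` is a hypothetical helper that uses Python's `random.Random` as a stand-in, so the numbers it produces will differ from the ones shown for `Integer`. It only demonstrates the principle that the same seed replays the same sequence.

```python
import random


def draw(seed, count=5, low=100, high=200):
    """Return `count` pseudo-random integers from a freshly seeded RNG."""
    rng = random.Random(seed)  # private RNG, independent of the global one
    return [rng.randint(low, high) for _ in range(count)]


first = draw(seed=12345)
second = draw(seed=12345)
other = draw(seed=54321)

print(first == second)  # True: same seed, same sequence
print(first == other)   # a different seed will almost certainly give other values
```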

### Producing individual random values

Now that we have a primitive generator in a well-defined state, how can we produce values using this generator? One way of doing this is to call `next()` on it, which will ask `g` to produce a single new value for us.

In [4]:
print(next(g))

153


We can do this as many times as we want, and each time `g` will produce a new random integer in the range `[100, 200]`. Let's get five more.

In [5]:
for _ in range(5):
    print(next(g))

193
101
138
147
124
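
To get a feel for why `next(g)` works, here is a minimal sketch of how a primitive generator could be built on Python's iterator protocol. `ToyInteger` is a hypothetical stand-in written for this illustration using only the standard library; it is not `tohu`'s implementation, and its output will not match the numbers shown above.

```python
import random


class ToyInteger:
    """Hypothetical stand-in for a primitive generator (not tohu's code)."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self._rng = random.Random()

    def reset(self, seed=None):
        # Re-seed the private RNG so that subsequent values are reproducible.
        self._rng.seed(seed)
        # Returning self is an assumption made here to allow chaining.
        return self

    def __iter__(self):
        return self

    def __next__(self):
        # Called by the built-in next(). It never raises StopIteration,
        # so the generator can be asked for new values indefinitely.
        return self._rng.randint(self.low, self.high)


toy = ToyInteger(100, 200).reset(seed=12345)
values = [next(toy) for _ in range(6)]

toy.reset(seed=12345)
print(values == [next(toy) for _ in range(6)])  # True: reset replays the sequence
```

The key design point is that all randomness lives in a private RNG owned by the instance, so resetting one generator does not disturb any other.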


### Producing a sequence of random values

While this works, it quickly becomes cumbersome if we need a lot of values. A more convenient way is to call the `generate_as_list` method. We can pass the number of elements we want and, optionally, a seed. If a seed is given, `generate_as_list` internally calls `reset`, which ensures that the returned sequence is reproducible (see above).

In [6]:
g.generate_as_list(num=15, seed=99999)

[115, 139, 164, 183, 194, 130, 145, 152, 125, 132, 151, 192, 177, 177, 166]

This is much more convenient, and often the right choice. However, if `num` is very big then it may be expensive (both in terms of time and memory) to generate all elements at once and store them in a huge list.

An alternative is to call `generate_as_stream` instead. The result is a Python generator object, which we can iterate over to obtain the elements sequentially. This happens in a "lazy" fashion (each element is only produced when it is requested), so it is much more memory efficient and avoids the up-front cost of generating everything at once.

In [7]:
result = g.generate_as_stream(num=15, seed=99999)

In [8]:
print(result)

<generator object TohuBaseGenerator.generate_as_stream at 0x10f2eab30>


In [9]:
[x for x in result]

[115, 139, 164, 183, 194, 130, 145, 152, 125, 132, 151, 192, 177, 177, 166]

Beware that, as usual with Python generator objects, the result is exhausted once you have iterated over it. Iterating over it a second time will not yield any elements:

In [10]:
[x for x in result]

[]

You should therefore choose carefully which method of generating items is best for your use case. For interactive exploration it is often more convenient to generate lists, because they can be inspected and re-used freely. If performance or memory efficiency matters, use the stream method instead.
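
The trade-off between lists and streams can be sketched with a plain Python generator function. `integer_stream` below is a hypothetical helper built on the standard library, used here as a stand-in for a lazy stream; it is not part of `tohu`. The example processes a large number of values without ever holding them all in memory, and shows how to "replay" a stream by creating a new one with the same seed.

```python
import random
from itertools import islice


def integer_stream(num, seed, low=100, high=200):
    """Lazily yield `num` pseudo-random integers (hypothetical helper)."""
    rng = random.Random(seed)
    for _ in range(num):
        yield rng.randint(low, high)


stream = integer_stream(num=100_000, seed=99999)

head = list(islice(stream, 3))       # consumes only the first three values
remaining = sum(1 for _ in stream)   # the rest are produced one at a time
print(len(head), remaining)          # 3 99997

# A stream is single-use; build a new one with the same seed to replay it.
replay = list(islice(integer_stream(num=100_000, seed=99999), 3))
print(replay == head)                # True
```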

## Second example: producing random `HashDigest` values

Let's look at another example using a different primitive generator. We choose `HashDigest`. This produces random strings that look like hash values.

The example follows the same pattern as above:

1. Create an instance of the `HashDigest` generator.
2. Reset it to ensure the output is reproducible.
3. Produce elements, first one at a time with `next()` and then as a sequence by calling the `generate_as_list` method.

In [11]:
from tohu.primitive_generators import HashDigest

In [12]:
g = HashDigest(length=8, lowercase=True)
g.reset(seed=99999)
for _ in range(5):
    print(next(g))

4b4d0235
9097bc5e
ec6df8fc
b3e6caf3
ee19b1d3


In [13]:
g.generate_as_list(num=5, seed=99999)

['4b4d0235', '9097bc5e', 'ec6df8fc', 'b3e6caf3', 'ee19b1d3']
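
For intuition about what "strings that look like hash values" means, here is a rough standard-library sketch. `toy_hash_digest` is a hypothetical function written for this illustration; it is not how `HashDigest` is implemented, and the assumption that `lowercase=False` would switch to uppercase hex digits is ours. Its output will differ from the values shown above.

```python
import random


def toy_hash_digest(rng, length=8, lowercase=True):
    """Return a random string of `length` hexadecimal digits."""
    digits = "0123456789abcdef" if lowercase else "0123456789ABCDEF"
    return "".join(rng.choice(digits) for _ in range(length))


rng = random.Random(99999)
digests = [toy_hash_digest(rng) for _ in range(5)]
print(digests)

# Re-seeding reproduces exactly the same strings.
rng = random.Random(99999)
print(digests == [toy_hash_digest(rng) for _ in range(5)])  # True
```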

## Summary and next steps

*TODO*