In [1]:
import tohu
from tohu.generators import *
from utils import print_generated_sequence

In [2]:
# NBVAL_IGNORE_OUTPUT
tohu.__version__

'v0.5.1+57.ga2adb65.dirty'

This notebook contains high-level tests for `tohu`'s "standard" generators.

## Class `Integer`

Generates random integers in the range [`low`, `high`].

In [3]:
g = Integer(low=100, high=200)

In [4]:
g.reset(seed=12345); print_generated_sequence(g, num=15)
g.reset(seed=9999); print_generated_sequence(g, num=15)

Generated sequence: 153, 193, 101, 138, 147, 124, 134, 172, 155, 120, 147, 115, 155, 133, 171
Generated sequence: 115, 120, 196, 109, 116, 124, 136, 124, 187, 199, 176, 174, 138, 180, 170


In [5]:
some_integers = g.generate(5, seed=99999)

In [6]:
for x in some_integers:
    print(x)

115
139
164
183
194
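The seeding behaviour shown above (same seed, same sequence) can be illustrated with a minimal standard-library sketch. The class `SeededInteger` is a hypothetical stand-in written for this example; it is not `tohu`'s implementation and will not reproduce the exact numbers printed above.

```python
import random


class SeededInteger:
    """Hypothetical stand-in for a seedable integer generator (not tohu's code)."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self._rng = random.Random()

    def reset(self, seed=None):
        # Re-seeding restarts the pseudo-random stream from a known state.
        self._rng.seed(seed)
        return self

    def __next__(self):
        # randint includes both endpoints, i.e. the closed range [low, high].
        return self._rng.randint(self.low, self.high)

    def generate(self, num, seed=None):
        # Optionally re-seed, then collect `num` items into a list.
        if seed is not None:
            self.reset(seed)
        return [next(self) for _ in range(num)]


g_sketch = SeededInteger(low=100, high=200)
first = g_sketch.generate(5, seed=12345)
second = g_sketch.generate(5, seed=12345)
print(first == second)  # True: the same seed yields the same sequence
```

Keeping a private `random.Random` instance per generator (rather than using the module-level functions) means that resetting one generator does not disturb any other.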


The default distribution is "uniform". In principle, other distributions [supported](https://docs.scipy.org/doc/numpy/reference/routines.random.html) by numpy can be used as well, though it is not yet confirmed that all of them work.

In [7]:
#g = Integer(low=100, high=200, distribution=None)

## Class `Float`

Generates random floating point numbers in the range [`low`, `high`].

In [8]:
g = Float(low=2.71828, high=3.14159)

In [9]:
g.reset(seed=12345); print_generated_sequence(g, num=4)
g.reset(seed=9999); print_generated_sequence(g, num=4)

Generated sequence: 2.8946393582471686, 2.7225847111228716, 3.0675981674322017, 2.8446972371045396
Generated sequence: 3.0716413078479454, 2.785006097591815, 2.750284761944705, 3.0530348312992466


## Class `NumpyRandomGenerator`

Generates random numbers using one of the random number generators [supported](https://docs.scipy.org/doc/numpy/reference/routines.random.html) by numpy.

In [10]:
g1 = NumpyRandomGenerator(method="normal", loc=3.0, scale=5.0)
g2 = NumpyRandomGenerator(method="poisson", lam=30)
g3 = NumpyRandomGenerator(method="exponential", scale=0.3)

In [11]:
g1.reset(seed=12345); print_generated_sequence(g1, num=4)
g2.reset(seed=12345); print_generated_sequence(g2, num=15)
g3.reset(seed=12345); print_generated_sequence(g3, num=4)

Generated sequence: 1.9764617025764353, 5.394716690287741, 0.40280642471630923, 0.22134847826254989
Generated sequence: 40, 24, 31, 34, 27, 32, 29, 29, 35, 38, 30, 32, 38, 36, 36
Generated sequence: 0.7961371899305246, 0.11410397056571128, 0.060972430042086474, 0.06865806254932436
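Selecting a sampling routine by name, as the `method=...` argument does above, can be sketched with `getattr`. The example below is only an illustration of that idea: it uses Python's standard-library `random.Random` in place of numpy (so the method names and keyword arguments differ, e.g. `normalvariate(mu, sigma)` instead of `normal(loc, scale)`), and the class `NamedMethodGenerator` is hypothetical.

```python
import random


class NamedMethodGenerator:
    """Hypothetical sketch: look up a sampling method by name and call it
    with fixed keyword arguments. Uses stdlib `random`, not numpy."""

    def __init__(self, method, **kwargs):
        self.method = method
        self.kwargs = kwargs
        self._rng = random.Random()
        # Fail early if the name does not refer to a callable on the RNG.
        if not callable(getattr(self._rng, method, None)):
            raise ValueError(f"Unknown method: {method!r}")

    def reset(self, seed=None):
        self._rng.seed(seed)
        return self

    def __next__(self):
        # Dispatch to the named method with the stored keyword arguments.
        return getattr(self._rng, self.method)(**self.kwargs)


normal_gen = NamedMethodGenerator(method="normalvariate", mu=3.0, sigma=5.0)
# expovariate takes a rate; a rate of 1/0.3 corresponds to a mean of 0.3.
expo_gen = NamedMethodGenerator(method="expovariate", lambd=1 / 0.3)

normal_gen.reset(seed=12345)
a = [next(normal_gen) for _ in range(4)]
normal_gen.reset(seed=12345)
b = [next(normal_gen) for _ in range(4)]
print(a == b)  # True: resetting with the same seed repeats the sequence
```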


## Class `FakerGenerator`

It is also possible to use any generator provided by the [faker](http://faker.readthedocs.io/) library.

In [12]:
g1 = FakerGenerator(method="name")
g2 = FakerGenerator(method="name", locale='hi_IN')
g3 = FakerGenerator(method="phone_number")
g4 = FakerGenerator(method="job")

In [13]:
g1.reset(seed=12345); print_generated_sequence(g1, num=4)
g2.reset(seed=12345); print_generated_sequence(g2, num=4)
g3.reset(seed=12345); print_generated_sequence(g3, num=4)
g4.reset(seed=12345); print_generated_sequence(g4, num=4)

Generated sequence: Adam Bryan, Jacob Lee, Candice Martinez, Justin Thompson
Generated sequence: आदित्य ढींगरा, ललित दीक्षित, कुण्डा, कैलाश, ईश कुण्डा
Generated sequence: (045)349-6251x648, 298-251-8698x22313, 1-507-508-6002, 1-241-619-2638x9503
Generated sequence: Pension scheme manager, Administrator, Hydrogeologist, Merchandiser, retail


## Class `Constant`

Generates a sequence repeating the same element indefinitely.

In [14]:
g = Constant("Foobar"); print_generated_sequence(g, num=10)
g = Constant(42); print_generated_sequence(g, num=20)

Generated sequence: Foobar, Foobar, Foobar, Foobar, Foobar, Foobar, Foobar, Foobar, Foobar, Foobar
Generated sequence: 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42
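For comparison, the same "repeat one element indefinitely" behaviour is available in plain Python via `itertools.repeat`. This is only an analogy and says nothing about how `Constant` is implemented.

```python
from itertools import islice, repeat

# repeat() yields the same object forever; islice() takes a finite prefix.
constant = repeat("Foobar")
first_three = list(islice(constant, 3))
print(first_three)  # ['Foobar', 'Foobar', 'Foobar']
```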


## Class `Sequential`

Generates a sequence of sequentially numbered strings with a given prefix.

In [15]:
g = Sequential(prefix='Foo_', digits=3)

Calling `reset()` on the generator makes the numbering start from 1 again.

In [16]:
g.reset()
print_generated_sequence(g, num=5)
print_generated_sequence(g, num=5)
print("-----------------------------")
g.reset()
print_generated_sequence(g, num=5)

Generated sequence: Foo_001, Foo_002, Foo_003, Foo_004, Foo_005
Generated sequence: Foo_006, Foo_007, Foo_008, Foo_009, Foo_010
-----------------------------
Generated sequence: Foo_001, Foo_002, Foo_003, Foo_004, Foo_005


**Note**: the method `Sequential.reset()` accepts the `seed` argument for consistency with other generators, but its value is ignored; the generator is simply reset to its initial value. This is illustrated here:

In [17]:
g.reset(seed=12345); print_generated_sequence(g, num=5)
g.reset(seed=9999); print_generated_sequence(g, num=5)

Generated sequence: Foo_001, Foo_002, Foo_003, Foo_004, Foo_005
Generated sequence: Foo_001, Foo_002, Foo_003, Foo_004, Foo_005


If a new `Sequential` generator is created from an existing one via the `_spawn()` method, its count starts again from 1, independently of the original generator.

In [18]:
g1 = Sequential(prefix="Quux_", digits=2)
g1.reset(seed=12345)
print_generated_sequence(g1, num=5)

g2 = g1._spawn()
print_generated_sequence(g1, num=5)
print_generated_sequence(g2, num=5)

Generated sequence: Quux_01, Quux_02, Quux_03, Quux_04, Quux_05
Generated sequence: Quux_06, Quux_07, Quux_08, Quux_09, Quux_10
Generated sequence: Quux_01, Quux_02, Quux_03, Quux_04, Quux_05
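The behaviour described in this section (zero-padded numbering, a `reset()` that ignores its seed, and spawned copies that start from 1) can be summarised in a small sketch. `SequentialSketch` and its public `spawn()` method are hypothetical names chosen for this illustration; this is not `tohu`'s source code.

```python
class SequentialSketch:
    """Hypothetical sketch of a prefix + zero-padded counter generator."""

    def __init__(self, prefix, digits):
        self.prefix = prefix
        self.digits = digits
        self._count = 0

    def reset(self, seed=None):
        # `seed` is accepted only for interface consistency and is ignored.
        self._count = 0
        return self

    def __next__(self):
        self._count += 1
        # Pad the counter with leading zeros to `digits` characters.
        return f"{self.prefix}{self._count:0{self.digits}d}"

    def spawn(self):
        # A spawned generator shares the settings but has its own counter.
        return SequentialSketch(self.prefix, self.digits)


s1 = SequentialSketch(prefix="Quux_", digits=2)
first_batch = [next(s1) for _ in range(3)]  # Quux_01, Quux_02, Quux_03
s2 = s1.spawn()
continued = next(s1)   # Quux_04: the original keeps counting
spawned = next(s2)     # Quux_01: the spawned copy starts over
s1.reset(seed=12345)
after_reset = next(s1)  # Quux_01: the seed value makes no difference
print(first_batch, continued, spawned, after_reset)
```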


## Class `SelectOne`

In [19]:
g = SelectOne(values=['foobar', 42, 'quux', True, 1.2345])

In [20]:
g.reset(seed=12345); print_generated_sequence(g, num=15)
g.reset(seed=9999); print_generated_sequence(g, num=15)

Generated sequence: True, foobar, quux, quux, 42, quux, 1.2345, True, 42, quux, foobar, True, quux, 1.2345, 42
Generated sequence: foobar, 42, foobar, 42, 42, quux, 42, 1.2345, 1.2345, quux, 1.2345, 42, foobar, 1.2345, 1.2345


## Class `SelectMultiple`

In [21]:
g = SelectMultiple(values=['foobar', 42, 'quux', True, 1.2345], size=3)

In [22]:
g.reset(seed=12345); print_generated_sequence(g, num=4)
g.reset(seed=99999); print_generated_sequence(g, num=4)

Generated sequence: (True, 1.2345, True), (42, True, 'foobar'), (42, True, 'quux'), (1.2345, 42, 'foobar')
Generated sequence: (42, 'quux', 1.2345), (True, 42, 1.2345), ('quux', 1.2345, True), ('foobar', 1.2345, 1.2345)


It is possible to pass a random generator for the argument `size`. This produces tuples of _varying_ length, where the length of each tuple is determined by the values produced by this generator.

In [23]:
rand_nums = Integer(low=2, high=5)

In [24]:
g = SelectMultiple(values=['a', 'b', 'c', 'd', 'e'], size=rand_nums)

In [25]:
g.reset(seed=11111); print_generated_sequence(g, num=10, sep='\n')

Generated sequence:
('b', 'c', 'c', 'b', 'd')
('c', 'b', 'd', 'b')
('a', 'a', 'e', 'e')
('a', 'c', 'a')
('c', 'e', 'b', 'd')
('c', 'c')
('d', 'c', 'd', 'a')
('c', 'a', 'b', 'c', 'e')
('c', 'd', 'e', 'e')
('c', 'a', 'e', 'e', 'c')
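The idea of drawing each tuple's length from a second random source can be sketched as follows. The function `select_multiple` and its parameters are hypothetical, and the sketch assumes sampling with replacement (the output above contains repeated elements within a tuple); it does not reproduce the exact tuples shown.

```python
import random


def select_multiple(values, size_low, size_high, num, seed=None):
    """Hypothetical sketch: return `num` tuples whose lengths vary
    randomly between `size_low` and `size_high` (inclusive)."""
    rng = random.Random(seed)
    result = []
    for _ in range(num):
        # First draw the length of this tuple ...
        size = rng.randint(size_low, size_high)
        # ... then draw that many elements, with replacement.
        result.append(tuple(rng.choices(values, k=size)))
    return result


letters = ['a', 'b', 'c', 'd', 'e']
tuples = select_multiple(letters, size_low=2, size_high=5, num=10, seed=11111)
for t in tuples:
    print(t)
```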


## Class `CharString`

In [26]:
chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789,.?!"

In [27]:
g = CharString(length=15, chars=chars)
g.reset(seed=12345); print_generated_sequence(g, num=5)
g.reset(seed=9999); print_generated_sequence(g, num=5)

Generated sequence: 1bMVyI3uVp3HwxT, l0?vsAjyRPd6Rd!, 1aauwKm1nxWDQfD, p.BxG3qfvMbtDSR, uXPbgh4kXwbVB!p
Generated sequence: pujqyKyMChk84iL, AxKuZa,3ExHdWdo, 0OGq9xgbQ?6N9wA, 2zKKBvLLjZsoek5, 9oKsBeUAj3gnULf


## Class `DigitString`

In [28]:
g = DigitString(length=15)
g.reset(seed=12345); print_generated_sequence(g, num=5)
g.reset(seed=9999); print_generated_sequence(g, num=5)

Generated sequence: 604534962516482, 982518698223135, 507508600292416, 192638950317382, 496820240235852
Generated sequence: 121234399483089, 197719432949260, 798639929408608, 165427208059874, 723638443244916


## Class `HashDigest`

In [29]:
g = HashDigest(length=8)
g.reset(seed=12345); print_generated_sequence(g, num=9)
g.reset(seed=9999); print_generated_sequence(g, num=9)

Generated sequence: D09B68D5, B3D855B2, D54626AA, 0EA0D005, 593D35C7, A173F658, D4159047, BA5CA011, E2C50B63
Generated sequence: 35246969, 712FE296, 595C0FD7, 580C03DA, 84F510AE, 9F56D699, 65992C43, 12EF3946, 1B62D13B


In [30]:
g = HashDigest(length=20)
g.reset(seed=12345); print_generated_sequence(g, num=4)
g.reset(seed=9999); print_generated_sequence(g, num=4)

Generated sequence: D09B68D5B3D855B2D546, 26AA0EA0D005593D35C7, A173F658D4159047BA5C, A011E2C50B63F10FAD6B
Generated sequence: 35246969712FE296595C, 0FD7580C03DA84F510AE, 9F56D69965992C4312EF, 39461B62D13B91A9474C


## Class `GeolocationPair`

In [31]:
g = GeolocationPair()
g.reset(seed=12345); print_generated_sequence(g, num=5, sep='\n')

Generated sequence:
(-30.016845883677178, -15.008422941838589)
(-176.3390989954554, -88.1695494977277)
(117.07434333134756, 58.53717166567378)
(-72.48965212814659, -36.244826064073294)
(-47.37179178414874, -23.68589589207437)


## Class `TimestampNEW`

In [32]:
from tohu.generators import TimestampNEW

In [33]:
g = TimestampNEW(start='2016-02-14', end='2016-02-18')

In [34]:
g.reset(seed=12345); print_generated_sequence(g, num=5, sep='\n')

Generated sequence:
2016-02-16 12:40:28
2016-02-18 10:42:18
2016-02-14 01:28:51
2016-02-18 23:26:47
2016-02-18 20:55:23


In [35]:
g = TimestampNEW(start='1998-03-01 00:02:00', end='1998-03-01 00:02:15')

In [36]:
g.reset(seed=99999); print_generated_sequence(g, num=10, sep='\n')

Generated sequence:
1998-03-01 00:02:03
1998-03-01 00:02:09
1998-03-01 00:02:07
1998-03-01 00:02:11
1998-03-01 00:02:13
1998-03-01 00:02:06
1998-03-01 00:02:08
1998-03-01 00:02:12
1998-03-01 00:02:06
1998-03-01 00:02:01


Note that the generated items are `datetime` objects (even though they appear as strings when printed above).

In [37]:
type(next(g))

datetime.datetime
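One simple way to produce random `datetime` objects between two bounds is to pick a random whole-second offset from the start. The function below is a hypothetical sketch with second resolution; it takes `datetime` objects rather than strings and makes no claim about how `TimestampNEW` parses its arguments or treats the end date.

```python
import random
from datetime import datetime, timedelta


def random_timestamps(start, end, num, seed=None):
    """Hypothetical sketch: `num` random datetimes in [start, end],
    at one-second resolution."""
    rng = random.Random(seed)
    total_seconds = int((end - start).total_seconds())
    return [
        start + timedelta(seconds=rng.randint(0, total_seconds))
        for _ in range(num)
    ]


start = datetime(1998, 3, 1, 0, 2, 0)
end = datetime(1998, 3, 1, 0, 2, 15)
stamps = random_timestamps(start, end, num=10, seed=99999)
for ts in stamps:
    print(ts)  # printed like a string, but each item is a datetime object
```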

## Using tohu generators as iterators

Each `tohu` generator can also be used as a Python iterator producing an (infinite) series of elements.

In [38]:
int_generator = Integer(low=100, high=500, seed=99999)

for i, x in enumerate(int_generator):
    if i > 20:
        break
    print(x, end=" ")

161 258 356 432 478 221 281 311 203 229 307 470 410 410 367 203 130 455 270 370 296 
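Because such a generator never stops on its own, a bounded prefix has to be taken explicitly. Besides the `enumerate`/`break` pattern above, `itertools.islice` is a common way to do this for any Python iterator. The class `IntegerStream` below is a hypothetical stand-in used so that the example runs without `tohu`.

```python
import random
from itertools import islice


class IntegerStream:
    """Hypothetical infinite iterator of random integers in [low, high]."""

    def __init__(self, low, high, seed=None):
        self.low = low
        self.high = high
        self._rng = random.Random(seed)

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Never raises StopIteration, so the stream is infinite.
        return self._rng.randint(self.low, self.high)


# Take exactly 21 items from the infinite stream.
first_21 = list(islice(IntegerStream(low=100, high=500, seed=99999), 21))
print(*first_21)
```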