# Using the zip function, generating repeated sequences and randomizing lists

Python has a lot of friendly functions and modules.  When you are starting out with Python, it is not immediately obvious how and when they would be useful unless you are of a scientific bent.  This is because most examples seem to be either very trivial or heavy on maths and engineering applications, which don't resonate with most people. All you're trying to do is get a blog site up and running!

However, it is worth tucking away a mental note of some capabilities or modules, as you never know when they will come in useful.  When it comes to looping, make a mental note of the [itertools](https://docs.python.org/3/library/itertools.html) module that comes with Python. The [random](https://docs.python.org/3/library/random.html) module is another.  The `zip()` function combines two (or more) sequences and is built in (no import required). Even with non-scientific applications, code long enough and you will find yourself reaching into the toolbox.

In [9]:
# Imports - if you see a function in the cells below and wonder
# where it came from, check back here!
from itertools import repeat, zip_longest
from random import shuffle

## Generating repeat sequences

A lot of introductory articles go over this.  Examples below.

In [10]:
# Repeat the letter "M" 5 times
strM5 = 'M' * 5
print(f'{strM5 = }', f'{type(strM5) = }')

strM5 = 'MMMMM' type(strM5) = <class 'str'>


In [11]:
# A string is an iterable anyway.  But if you want it as a list

listOfM5 = ['M'] * 5
print(f'{listOfM5 = }', f'{type(listOfM5) = }')

# list of numbers repeated
num_list = [1,3,1] * 5
print(f'{num_list = }', f'{type(num_list) = }')

# Itertools also has a repeat function.  It returns a lazy iterator, so you have to wrap it
# in a list function if you want everything immediately.  Note the difference
# from the num_list output above - this is a list of lists - it returns the object n times
print('Repeat 1,2,3 4 times:\n', list(repeat([1,2,3], 4)))
print('Repeat letter M 4 times:\n', list(repeat("M", 4)))


listOfM5 = ['M', 'M', 'M', 'M', 'M'] type(listOfM5) = <class 'list'>
num_list = [1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3, 1] type(num_list) = <class 'list'>
Repeat 1,2,3 4 times:
 [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]
Repeat letter M 4 times:
 ['M', 'M', 'M', 'M']
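
One consequence of "it returns the object n times" is worth a quick sketch: `repeat` hands back references to the same object rather than copies, so changing one inner list shows up in all of them. The variable names below are made up for this illustration only.

```python
from itertools import repeat

# repeat() yields the *same* list object each time, not copies
rows = list(repeat([1, 2, 3], 4))
print(all(row is rows[0] for row in rows))

# Mutating one "row" is therefore visible through every reference
rows[0].append(99)
print(rows)

# If independent copies are needed, build a fresh list on each pass
independent_rows = [[1, 2, 3] for _ in repeat(None, 4)]
independent_rows[0].append(99)
print(independent_rows)
```

This does not matter for immutable values such as the strings "blue" and "red" used later, but it can surprise you with lists or dicts.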


## Combining it with a zip function

So how is all this useful?  I needed to pull two different sets of records out of a database and merge them together.  I needed to tag Set1 with "blue" and Set2 with "red".
I will just use lists to mimic the two sets of records.

There are several ways to go about this but as this post is focused on the zip function and repetition, we'll focus on those.

### Method1: Generate repeated items as list and zip

In [12]:
list1 = ["person" + str(n) for n in range(5) ]
print(f'{list1 = }')
list2 = ["person" + str(n) for n in range(5,12) ]
print(f'{list2 = }')


list1 = ['person0', 'person1', 'person2', 'person3', 'person4']
list2 = ['person5', 'person6', 'person7', 'person8', 'person9', 'person10', 'person11']


In [13]:
# Tag everyone in set1 with blue
# One possible way

blues = ["blue"] * len(list1)
print(f'{blues = }')
reds = ["red"] * len(list2)
print(f'{reds = }')


blues = ['blue', 'blue', 'blue', 'blue', 'blue']
reds = ['red', 'red', 'red', 'red', 'red', 'red', 'red']


We can now zip each list of names together with its list of tags.

In [14]:
bluelist = list(zip(list1, blues))
redlist = list(zip(list2, reds))
print(f'{bluelist =}',"\n", f'{redlist = }' )

bluelist =[('person0', 'blue'), ('person1', 'blue'), ('person2', 'blue'), ('person3', 'blue'), ('person4', 'blue')] 
 redlist = [('person5', 'red'), ('person6', 'red'), ('person7', 'red'), ('person8', 'red'), ('person9', 'red'), ('person10', 'red'), ('person11', 'red')]


### Method2: Using repeat function

Method1 is on the inefficient side because we are generating a list to hold the same value n times.  If your list size is a handful, who cares!  But if it is very large, it may cause memory issues.  Using the repeat function, we can achieve the same outcome without generating the "blues" and "reds" lists.

In [15]:
# Same end result using repeat
bluelist_r = list(zip(list1, repeat("blue")))
redlist_r = list(zip(list2, repeat("red")))
print(f'{bluelist_r =}',"\n", f'{redlist_r = }' )


bluelist_r =[('person0', 'blue'), ('person1', 'blue'), ('person2', 'blue'), ('person3', 'blue'), ('person4', 'blue')] 
 redlist_r = [('person5', 'red'), ('person6', 'red'), ('person7', 'red'), ('person8', 'red'), ('person9', 'red'), ('person10', 'red'), ('person11', 'red')]
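
To get a rough feel for the saving, here is a minimal sketch comparing the size of a pre-built tag list with the size of a `repeat` object. The size `n` is an arbitrary number chosen for illustration, and `sys.getsizeof` only measures the container itself, so treat the figures as indicative rather than exact.

```python
import sys
from itertools import repeat

n = 1_000_000  # hypothetical "very large" list size, for illustration only

tags_list = ["blue"] * n    # builds n references up front
tags_lazy = repeat("blue")  # a small iterator object, nothing stored per item

list_bytes = sys.getsizeof(tags_list)
lazy_bytes = sys.getsizeof(tags_lazy)
print(f"{list_bytes = }", f"{lazy_bytes = }")
```

Note that the final zipped list is still built in full either way; what `repeat` avoids is the extra intermediate list of tags.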


## Merge and randomize the order

Now we are ready to merge the two sequences and randomize the order (if you so wish).


In [16]:
# Merge the two lists first. Nothing fancy, just use the plus operator
combined_list = bluelist + redlist

# shuffle() does not return anything.  It works in place
shuffle(combined_list)
print(combined_list)

# If we don't need to keep one of the lists around, we could use extend
bluelist.extend(redlist)
print(f'\nPrint blue list:\n{bluelist}')


[('person4', 'blue'), ('person0', 'blue'), ('person3', 'blue'), ('person8', 'red'), ('person9', 'red'), ('person7', 'red'), ('person6', 'red'), ('person10', 'red'), ('person2', 'blue'), ('person5', 'red'), ('person1', 'blue'), ('person11', 'red')]
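
If you would rather keep the original order around and get a shuffled copy instead, `sample()` from the same random module is one option. The sketch below uses a small made-up list and a seeded `Random` instance (the seed value is arbitrary) so that repeated runs give the same order; neither is part of the original workflow.

```python
from random import Random

# Small stand-in for the combined list, for illustration only
people = [("person0", "blue"), ("person1", "blue"),
          ("person5", "red"), ("person6", "red")]

rng = Random(42)  # seeded generator; the seed value is arbitrary

# sample() with k=len(...) returns a new shuffled list and leaves the input alone
shuffled_copy = rng.sample(people, k=len(people))
print(f"{people = }")
print(f"{shuffled_copy = }")
```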


## Using repeat in for loop

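Using repeat in a loop can also be slightly more efficient than using range if you have no need for the generated number, although in the timing below the work done inside the loop dominates and the two come out about the same.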

In [46]:
import shortuuid
import time
from itertools import repeat

# Adjust number of characters to suit purpose
# First using range
time_start = time.monotonic_ns()
rg_set = set(shortuuid.uuid() for _ in range(1_000_000))
time_finish = time.monotonic_ns() - time_start
print(f'{time_finish = } ns or {time_finish / 1_000_000_000} seconds for range loop')


# Same thing but using repeat
time_start = time.monotonic_ns()
rp_set = set(shortuuid.uuid() for _ in repeat(None, 1_000_000))
time_finish = time.monotonic_ns() - time_start
print(f'{time_finish = } ns or {time_finish / 1_000_000_000} seconds for repeat loop')


time_finish = 8690278977 ns or 8.690278977 seconds for range loop
time_finish = 8771567248 ns or 8.771567248 seconds for repeat loop
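
Because generating the IDs takes up nearly all of the time above, a fairer comparison of the two loop styles needs an empty loop body. Here is a minimal sketch using the standard `timeit` module; the helper function names and the iteration counts are made up for this example, and the numbers will vary from machine to machine.

```python
from itertools import repeat
from timeit import timeit

def loop_with_range(n):
    # range creates a new int object for most values of the counter
    for _ in range(n):
        pass

def loop_with_repeat(n):
    # repeat hands back the same None object every time
    for _ in repeat(None, n):
        pass

n = 100_000
range_seconds = timeit(lambda: loop_with_range(n), number=20)
repeat_seconds = timeit(lambda: loop_with_repeat(n), number=20)
print(f"{range_seconds = :.4f}", f"{repeat_seconds = :.4f}")
```

Whatever difference you see is per iteration and tiny, so it only matters when the loop body itself does very little.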


## Additional notes on zip() and repeat()

The zip function stops as soon as the shortest input is exhausted, which in the above cases means list1 and list2.
The repeat function will continue indefinitely unless given the *times* argument.  You can tell zip to go with the longest input by using zip_longest, which is in the itertools module.  Example below:

In [17]:
bluelist10 = list(zip_longest(list1, repeat("blue",10)))
print(f'{bluelist10 = }')

bluelist10 = [('person0', 'blue'), ('person1', 'blue'), ('person2', 'blue'), ('person3', 'blue'), ('person4', 'blue'), (None, 'blue'), (None, 'blue'), (None, 'blue'), (None, 'blue'), (None, 'blue')]
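
The `None` placeholders above are just the default. `zip_longest` also accepts a `fillvalue` argument if you want something more meaningful for the missing entries. A short sketch, where the "unassigned" label is an arbitrary choice for illustration:

```python
from itertools import repeat, zip_longest

list1 = ["person" + str(n) for n in range(5)]

# Pad the shorter input with a label of our choosing instead of None
padded = list(zip_longest(list1, repeat("blue", 7), fillvalue="unassigned"))
print(f"{padded = }")
```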


In [None]:
# But be careful.  If you use zip_longest without a limit on the "repeat"
# function, it will continue forever or, in this case, run out of memory and crash!

infinite_blues = list(zip_longest(list1, repeat("blue")))
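
If you do end up with an endless iterator like this, one way to stay safe is to cap how much you take from it. The sketch below uses `islice` from itertools, which the examples above do not use; the cap of 8 is an arbitrary number for illustration.

```python
from itertools import islice, repeat, zip_longest

list1 = ["person" + str(n) for n in range(5)]

# The zip_longest iterator itself is lazy; it is list() on it that never finishes
endless_pairs = zip_longest(list1, repeat("blue"))

# islice stops after a fixed number of items, so this terminates
first_eight = list(islice(endless_pairs, 8))
print(f"{first_eight = }")
```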


Additional ref:

[StackOverflow answer](https://stackoverflow.com/questions/9059173/what-is-the-purpose-of-pythons-itertools-repeat) Note: If Raymond Hettinger answers a Python question on Stack Overflow, you don't need to worry about correctness!