I've recently been sharpening my fake data creation skills, mainly because I've been in a couple of situations where I needed to develop things but real data was not available. `numpy` is full of great functions that streamline this process, and I'm going to talk through two of them: `np.tile` and `np.repeat`. Let's get some context for the data I'm going to fakify.

> We've got some data that tracks the different quantity options of different types of product.

For example, we can sell different types of chairs in quantities of 10 and 20.

In [40]:
import pandas as pd
import numpy as np
pd.DataFrame({
    'name':np.repeat(['silk', 'expression'], 2),
    'size':np.tile([10, 20], 2)
})

| | name | size |
|---|---|---|
| 0 | silk | 10 |
| 1 | silk | 20 |
| 2 | expression | 10 |
| 3 | expression | 20 |


The cell above is a really quick demonstration of the difference between these two functions. `np.repeat` repeats each element in place, so `['silk', 'expression']` becomes `silk, silk, expression, expression`, while `np.tile` repeats the entire sequence, so `[10, 20]` becomes `10, 20, 10, 20`. Very helpful! Let's extend this a tad to get something we can play around with. If we think it through, each name needs to be repeated once per unique value (`len(vals)` times), and the list of values needs to be tiled once per unique name (`len(names)` times).

In [41]:
names = ['silk', 'expression']
vals = [10, 20]

pd.DataFrame({
    'name':np.repeat(names, len(vals)),
    'size':np.tile(vals, len(names))
})

| | name | size |
|---|---|---|
| 0 | silk | 10 |
| 1 | silk | 20 |
| 2 | expression | 10 |
| 3 | expression | 20 |

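Looking at the raw arrays, before they go into a DataFrame, can make it clearer why each function is handed the *other* list's length. The following is a minimal sketch; the three-element `vals` list and the `name_col` / `size_col` variable names are my own choices for illustration, picked so that the two lengths differ.

```python
import numpy as np

names = ['silk', 'expression']
vals = [10, 20, 30]  # three sizes here (an assumption) so the two lengths differ

# Each name is repeated once per size: 2 names * 3 sizes = 6 entries
name_col = np.repeat(names, len(vals))

# The whole list of sizes is laid down once per name: 3 sizes * 2 names = 6 entries
size_col = np.tile(vals, len(names))

print(name_col.tolist())
# ['silk', 'silk', 'silk', 'expression', 'expression', 'expression']
print(size_col.tolist())
# [10, 20, 30, 10, 20, 30]
```

Both arrays end up with `len(names) * len(vals)` entries, which is what lets them sit side by side as columns.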

Because the counts come from `len(vals)` and `len(names)`, I can add another name or another value to either list without changing anything else. I'll wrap the DataFrame construction in a function and prove it to you.

In [42]:
def buildFrame(names, vals):
    """ Create a dataframe with names and values.
    
    Args:
        names (list): Some names
        vals (list): Some values
    Returns:
        pd.DataFrame: Dataframe with a record for each name and value
    """
    
    return pd.DataFrame({
        # each name appears once per value
        'name': np.repeat(names, len(vals)),
        # the full list of values is laid out once per name
        'size': np.tile(vals, len(names))
    })

n1 = ['anthony', 'jim']
v1 = [10]

n2 = ['a', 'b', 'c']
v2 = [10, 20]

In [43]:
buildFrame(n1, v1)

| | name | size |
|---|---|---|
| 0 | anthony | 10 |
| 1 | jim | 10 |


In [44]:
buildFrame(n2, v2)

| | name | size |
|---|---|---|
| 0 | a | 10 |
| 1 | a | 20 |
| 2 | b | 10 |
| 3 | b | 20 |
| 4 | c | 10 |
| 5 | c | 20 |

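One way to sanity-check the result, sketched here as a suggestion rather than as part of the original workflow, is to compare the rows against `itertools.product` from the standard library, which yields every (name, value) pair in the same order. The function body is copied from above so the block runs on its own; `expected` and `actual` are names I've made up for the comparison.

```python
from itertools import product

import numpy as np
import pandas as pd


def buildFrame(names, vals):
    # Same construction as in the post, repeated so this block is self-contained
    return pd.DataFrame({
        'name': np.repeat(names, len(vals)),
        'size': np.tile(vals, len(names))
    })


names = ['a', 'b', 'c']
vals = [10, 20]

df = buildFrame(names, vals)

# product() varies the second list fastest, matching the repeat/tile layout
expected = list(product(names, vals))
actual = list(zip(df['name'].tolist(), df['size'].tolist()))

print(actual == expected)  # True
```

If the two lists ever disagree, the usual culprit is swapping the counts, for example passing `len(names)` to `np.repeat`.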