# Creating dummy values

Some time has passed since we created the database, and some customers have (surprisingly) been using the store's services, so we now have some data about them.


## Process

We're going to add dummy values to the `rental` and `customer` tables (as we did for the `store` table) so that we can run more complex queries in the final part of the project.

+ `customer`: using a couple of libraries, we're going to add some random values to the table.
+ `rental`: we're going to use the original `rental.csv`, cleaning it, changing the date values so that they make sense, and randomly replacing each `customer_id` with one of the ids created in the previous step.
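The `customer_id` reassignment described for `rental` can be sketched with the standard library alone. This is only an illustration under assumptions: the helper name `reassign_customer_ids`, the `rental_id` field, and the sample rows are hypothetical, and the notebook itself may do this with pandas instead.

```python
import random

def reassign_customer_ids(rows, customer_ids, seed=None):
    """Return copies of the rows with customer_id drawn at random from customer_ids.

    Hypothetical helper: rows are plain dicts standing in for rental.csv records.
    """
    rng = random.Random(seed)  # a local generator keeps the result reproducible
    return [{**row, "customer_id": rng.choice(customer_ids)} for row in rows]

# Made-up rental rows whose customer_id values do not exist in our customer table
rentals = [
    {"rental_id": 1, "customer_id": 431},
    {"rental_id": 2, "customer_id": 518},
    {"rental_id": 3, "customer_id": 279},
]

# The 25 dummy customers get ids 1..25
valid_ids = list(range(1, 26))
updated = reassign_customer_ids(rentals, valid_ids, seed=42)
```

Passing a seed makes the assignment repeatable, which helps when the notebook is re-run.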

In [1]:
# Necessary installs

# %pip install faker

In [2]:
# libraries
import pandas as pd
pd.set_option('display.max_columns', None)

import numpy as np

import mysql.connector
from sqlalchemy import create_engine
from passwords import CURSOR

import warnings
warnings.filterwarnings('ignore')

import random as rd
from faker import Faker

import sys
sys.path.insert(1, '../')

from src import format_phone_number

## Filling customer table

In [3]:
# fetch the table from SQL
# pd.read_sql_query already returns a DataFrame, so no extra wrapping is needed
df = pd.read_sql_query('SELECT * FROM customer', CURSOR)

df

Unnamed: 0,customer_id,first_name,last_name,email,phone_number,gender,location


In [4]:
['a', 'b', 'c'][rd.randint(0, 2)]

'c'

In [5]:
# we're going to create 25 dummy customers

# set faker to generate random Spanish (es_ES) values
fake = Faker('es_ES')

# i is the DataFrame index; i + 1 will be the customer_id
for i in range(25):
    # generate random gender
    gender = rd.choice(['male', 'female', 'non-binary'])
    
    # assign names depending on gender
    if gender == 'male':
        fname = fake.first_name_male()
        
    elif gender == 'female':
        fname = fake.first_name_female()
    
    else:
        fname = fake.first_name()
        
    lname = fake.last_name() + ' ' + fake.last_name()
        
    # creating dummy data with faker    
    location = fake.address()
    phone = fake.phone_number()
    email_domain = fake.domain_name()
    mail = f"{fname.lower()}.{lname.split()[0].lower()}@{email_domain}"
    
    # and append everything
    df.loc[i] = [i+1, fname, lname, mail, phone, gender, location]
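Note that the email is built directly from the names, so a compound first name such as "Ana Belén" produces an address containing a space and accented characters. If cleaner addresses are wanted, one possible approach is sketched below; the helpers `ascii_slug` and `build_email` are hypothetical and not part of this project.

```python
import unicodedata

def ascii_slug(text):
    """Lowercase the text, strip accents, and keep only ASCII letters and digits."""
    # NFKD splits accented letters into a base letter plus combining marks
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(ch for ch in without_marks.lower() if ch.isascii() and ch.isalnum())

def build_email(first_name, last_name, domain):
    """Build first.surname@domain using only the first surname, as the loop above does."""
    return f"{ascii_slug(first_name)}.{ascii_slug(last_name.split()[0])}@{domain}"

email = build_email("Ana Belén", "Ródenas Escolano", "gutierrez.net")
```

With this sketch, the example above would become `anabelen.rodenas@gutierrez.net`.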
    
    

In [9]:
df.head(10)

Unnamed: 0,customer_id,first_name,last_name,email,phone_number,gender,location
0,1,Vinicio,Nuñez Pelayo,vinicio.nuñez@landa.com,+34 758 511 35,male,"Ronda de Gabino Vilar 84\nAsturias, 03626"
1,2,Ana Belén,Ródenas Escolano,ana belén.ródenas@gutierrez.net,+34 784 417 64,female,"Ronda Isabela Muñoz 47\nMadrid, 37112"
2,3,Gabriel,Matas Sanabria,gabriel.matas@palacios.com,+34 432 206 19,male,Vial de Horacio Reguera 65 Puerta 3 \nValencia...
3,4,Edmundo,Arnau Nebot,edmundo.arnau@aguado.com,+34 265 775 21,male,"Glorieta de Palmira Vara 40 Piso 9 \nCáceres, ..."
4,5,Mónica,Tamarit Pomares,mónica.tamarit@miralles-carballo.es,+34 843 412 97,female,"Alameda de Adelaida Bauzà 5\nBurgos, 02699"
5,6,Cipriano,Cerro Ramón,cipriano.cerro@roda.com,+34 438 546 09,non-binary,Avenida de José Mari Palacio 78 Apt. 60 \nOure...
6,7,Delia,Quiroga Cañete,delia.quiroga@arce.es,+34 845 971 54,female,"C. de Isaura Serrano 65\nTarragona, 37284"
7,8,Duilio,Corral Amo,duilio.corral@cuenca-alegria.org,+34 389 711 92,male,"Camino Adelia Egea 61\nZamora, 40752"
8,9,Adelaida,Zurita Crespi,adelaida.zurita@heredia.es,+34 465 070 51,female,Calle de Natalia Villegas 545 Apt. 16 \nValenc...
9,10,Lucía,Rojas Araujo,lucía.rojas@gallego-vilaplana.org,+34 729 248 17,female,"Cuesta de Ale Malo 14\nOurense, 11209"


In [7]:
help(format_phone_number)

Help on function format_phone_number in module src:

format_phone_number(phone)
    Returns the phone number introduced to the standar Spanish format "+34 XXX XXX XXX".
    
    Arguments:
        phone: a string of a phone number with or without '+' and with or without 34 at the beggining (it will be added if it doesn't have it).
                
                If the string is not digit, the function will not register it as an error and will return unexpected values.
        
    Returns:
        formatted_phone: a string with the formatted phone like this "+34 XXX XXX XXX"
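The source of `format_phone_number` is not shown in this notebook. Going only by the docstring above, a function with that contract could look roughly like the sketch below. The name `format_phone_sketch` is hypothetical, and unlike the documented behaviour this version discards non-digit characters before formatting.

```python
def format_phone_sketch(phone):
    """Format a Spanish phone number as "+34 XXX XXX XXX" (illustrative only)."""
    # Keep digits only, which drops '+', spaces, and any other separators
    digits = "".join(ch for ch in phone if ch.isdigit())

    # Remove the country code only when nine national digits remain after it
    if digits.startswith("34") and len(digits) == 11:
        digits = digits[2:]

    return f"+34 {digits[:3]} {digits[3:6]} {digits[6:]}"

formatted = format_phone_sketch("34612345678")
```

It can be applied to a column in the same way as the real function, for example with `df.phone_number.apply(format_phone_sketch)`.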



In [8]:
# Apply the function to the 'phone_number' column
df.phone_number = df.phone_number.apply(format_phone_number)

df.head()

Unnamed: 0,customer_id,first_name,last_name,email,phone_number,gender,location
0,1,Vinicio,Nuñez Pelayo,vinicio.nuñez@landa.com,+34 758 511 35,male,"Ronda de Gabino Vilar 84\nAsturias, 03626"
1,2,Ana Belén,Ródenas Escolano,ana belén.ródenas@gutierrez.net,+34 784 417 64,female,"Ronda Isabela Muñoz 47\nMadrid, 37112"
2,3,Gabriel,Matas Sanabria,gabriel.matas@palacios.com,+34 432 206 19,male,Vial de Horacio Reguera 65 Puerta 3 \nValencia...
3,4,Edmundo,Arnau Nebot,edmundo.arnau@aguado.com,+34 265 775 21,male,"Glorieta de Palmira Vara 40 Piso 9 \nCáceres, ..."
4,5,Mónica,Tamarit Pomares,mónica.tamarit@miralles-carballo.es,+34 843 412 97,female,"Alameda de Adelaida Bauzà 5\nBurgos, 02699"


In [11]:
# and now we append all the dummy data
df.to_sql(name='customer',
          con=CURSOR,
          if_exists='append',
          index=False)

25
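Because `if_exists='append'` adds rows every time the cell runs, re-executing it would try to insert the same 25 customers again. One way to guard against that is to check whether the table already has rows before appending. The sketch below illustrates the idea with an in-memory SQLite database standing in for the MySQL connection; the helpers `count_rows` and `append_once` and the two-column table are assumptions made for the example.

```python
import sqlite3

def count_rows(conn, table):
    """Return the number of rows in a table (pass trusted table names only)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

def append_once(conn, rows):
    """Insert the rows only if the customer table is still empty."""
    if count_rows(conn, "customer") > 0:
        return 0  # already filled, so nothing is inserted
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)

# In-memory stand-in for the real database, with a simplified customer table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT)")

rows = [(1, "Vinicio"), (2, "Gabriel")]
first_run = append_once(conn, rows)   # inserts both rows
second_run = append_once(conn, rows)  # skipped because the table is no longer empty
total = count_rows(conn, "customer")
```

The same check could be done in the notebook with a `SELECT COUNT(*)` query through `pd.read_sql_query` before calling `to_sql`.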