## Using **Personas** to Simulate Hallux Data


### This example has two goals

* To assign personas to the different entities in the Hallux database either randomly or based on existing data.

* To use the assigned personas to predict an entity's behavior in order to create new data.



### The example starts by importing the required libraries and creating dictionaries


In [2]:

# import libraries

from os import getenv
import pandas as pd
import random
from datetime import date
from dateutil.relativedelta import relativedelta
import pymssql
from sqlalchemy import create_engine


# Personas dictionary (defined in Hallux tables Persona_Type and Persona)

Personas = {
    "Album" : ("Bomb","Non-Certified","Gold","Platinum","Multi Platinum","Diamond"), 
    "Band" : ("Bar Band","Cover","Tribute","1 hit wonder","Popular","Cult Following","Megaband"),
    "Customer" : ("Occasioanl","Discount","Loyal"),
    "Employee" : ("Bad","Good","Overachiever"),
    "Experience" : ("Bad","Good","Wonderful"),
    "Fan" : ("Casual","Fan","Fanatic"),
    "Music" : ("Popular", "Regional","Seasonal","Specialty","Nostalgic"),
    "Musician" : ("Amateur","Studio","Backup","Lead","Superstar"),
    "Performance" : ("Bad","Good","Epic"),
    "Person" : ("Student","Young Adult","Mature"),
    "Song" : ("Bad","Good","Popular","Hit"),
    "Video" : ("Bad","Good","Viral")
}

# Hallux entity types dictionary defined in the Entity_Type table
# Added an additional value list (table, id column, persona type)
# and put in the order needed for processing if not using random selection
# For example, a band's persona will be determined by the personas of its band members.

Entities = {
    
    "Genre" : ("Genre","Genre_Id","Music"), 
    "Member" : ("Band_Member","Member_Id","Musician"),
    "Band" : ("Band","Band_Id","Band"),
    "Agent"   : ("Agent","Agent_Id","Employee"),
    "Album" : ("Album","Album_Id","Album"), 
    "Song" : ("Song","Song_Id","Song"),
    "Performance" : ("Performance","Performance_Id","Performance"),
    "Producer"   : ("Producer","Producer_Id","Employee"),
    "Video" : ("Video","Video_Id","Video"),
    "Profile" : ("Customer_Profile","Profile_Id","Person"),
    "Customer" : ("Customer","Customer_Id","Customer"),
    "Follower" : ("Band_Follower","Follower_Id","Fan"),
    "Playlist" : ("Playlist","Playlist_Id","Experience"),
    "Order"   : ("Order","Order_Id","Experience"),
    "Stream"   : ("Stream","Stream_Id","Experience")
}
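The comment above says that a band's persona will be determined by the personas of its members. The following is a minimal sketch of one way that could work, not the Hallux implementation. It assumes the Musician personas are ordered from least to most prominent, and the `derive_band_persona` function and the `BAND_BY_RANK` mapping are hypothetical names and rules introduced only for illustration.

```python
# Musician personas copied from the Personas dictionary above; this sketch
# ASSUMES they are ordered from least to most prominent.
MUSICIAN_PERSONAS = ("Amateur", "Studio", "Backup", "Lead", "Superstar")

# Hypothetical mapping from a rounded average musician rank to a band persona.
BAND_BY_RANK = ("Bar Band", "Cover", "Popular", "Cult Following", "Megaband")


def derive_band_persona(member_personas):
    """Return a band persona from its members' personas, or '' if none are known."""
    ranks = [MUSICIAN_PERSONAS.index(p) for p in member_personas
             if p in MUSICIAN_PERSONAS]
    if not ranks:
        return ""
    avg_rank = round(sum(ranks) / len(ranks))
    return BAND_BY_RANK[avg_rank]


print(derive_band_persona(["Superstar", "Superstar", "Superstar"]))
print(derive_band_persona(["Amateur", "Superstar"]))
```

Because the entities are processed in dictionary order, band members already have personas by the time bands are reached, which is what makes a rule like this possible.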



### Set an individual entity's persona

The set_persona function updates the database with an entity's persona.

*Note: This function uses a stored procedure to update data.  
This improves security: users need only execute permission on the stored procedure, not update, insert, or delete permission on the underlying tables.*  


In [20]:

def set_persona ( a_entity_type, a_entity_id, a_persona_type, a_persona ):   

    sp_exec_sql = "exec prc_set_Persona  @a_Entity_Type = %s, @a_Entity_Id = %s, @a_Persona_Type = %s, @a_Persona = %s"

    ret_val = -1
    
    curr_parms = (a_entity_type, a_entity_id, a_persona_type, a_persona )
    cursor.execute(sp_exec_sql,curr_parms)    
    
    sp_ret_val = cursor.fetchone()
    if sp_ret_val is not None:   # keep -1 when the procedure returns no row
        ret_val = sp_ret_val[0]
    
    return ret_val
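To try the call pattern without a live database, the cursor can be passed in explicitly and replaced by a stand-in. The sketch below is only an illustration: `set_persona_with_cursor` and `FakeCursor` are hypothetical names, and the fake cursor simply records the call and returns a canned row instead of running `prc_set_Persona`.

```python
SP_EXEC_SQL = (
    "exec prc_set_Persona @a_Entity_Type = %s, @a_Entity_Id = %s, "
    "@a_Persona_Type = %s, @a_Persona = %s"
)


def set_persona_with_cursor(cursor, entity_type, entity_id, persona_type, persona):
    """Call prc_set_Persona and return its first result column, or -1 if no row."""
    cursor.execute(SP_EXEC_SQL, (entity_type, entity_id, persona_type, persona))
    row = cursor.fetchone()
    return row[0] if row is not None else -1


class FakeCursor:
    """Stand-in for a DB-API cursor so the sketch runs without a database."""

    def __init__(self, row):
        self.row = row
        self.calls = []

    def execute(self, sql, params):
        # Record the statement and parameters instead of sending them anywhere.
        self.calls.append((sql, params))

    def fetchone(self):
        return self.row


fake = FakeCursor(row=(1,))
result = set_persona_with_cursor(fake, "Member", 42, "Musician", "Lead")
print(result, fake.calls[0][1])
```

Passing the parameters as a tuple, as the original function does, lets the driver handle quoting rather than building the statement by string concatenation.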


### Get an individual entity's persona

The get_persona function will return an entity's persona 

*Note: This function is currently a stub and always returns an empty string.*  

In [4]:

def get_persona ( a_entity_type, a_entity_id, a_persona_type):   
    
    my_persona = ''
    
    return my_persona
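One way the stub above might eventually be filled in is a simple parameterized lookup. The sketch below uses an in-memory SQLite database so that it runs anywhere; the `Entity_Persona` table and its columns are hypothetical and are not taken from the Hallux schema, and `get_persona_sketch` is an illustrative name.

```python
import sqlite3


def get_persona_sketch(conn, entity_type, entity_id, persona_type):
    """Return the stored persona, or '' when no row matches."""
    row = conn.execute(
        "SELECT Persona FROM Entity_Persona "
        "WHERE Entity_Type = ? AND Entity_Id = ? AND Persona_Type = ?",
        (entity_type, entity_id, persona_type),
    ).fetchone()
    return row[0] if row else ""


# In-memory demo database holding the hypothetical table.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Entity_Persona ("
    "Entity_Type TEXT, Entity_Id INTEGER, Persona_Type TEXT, Persona TEXT)"
)
demo.execute("INSERT INTO Entity_Persona VALUES ('Member', 1, 'Musician', 'Lead')")

print(get_persona_sketch(demo, "Member", 1, "Musician"))
print(repr(get_persona_sketch(demo, "Member", 2, "Musician")))
```

Returning an empty string for a missing row matches the convention used by the other functions in this example, where `''` means "no persona".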


###  What is my Persona? 

The whats_my_persona function determines the persona for an entity.

*For now, the persona is selected at random.* 

In [5]:


def whats_my_persona ( a_entity_type, a_entity_id, a_persona_type ):   

    
    my_persona = ''
        
    
    if a_persona_type in Personas:
        persona_list = Personas[a_persona_type]
        my_persona = random.choice(persona_list)   # uniform random pick
                             

    return my_persona
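A uniform random pick makes every persona equally likely. If some personas should be rarer than others, a weighted pick is one possible refinement. The sketch below assumes made-up weights for the Song personas; `SONG_WEIGHTS` and `weighted_persona` are hypothetical and not part of Hallux.

```python
import random

SONG_PERSONAS = ("Bad", "Good", "Popular", "Hit")
# Hypothetical relative weights: most songs are "Good", few become a "Hit".
SONG_WEIGHTS = (30, 50, 15, 5)


def weighted_persona(personas, weights, rng=random):
    """Pick one persona, with probability proportional to its weight."""
    return rng.choices(personas, weights=weights, k=1)[0]


rng = random.Random(2023)  # seeded generator so a simulation run can be repeated
sample = [weighted_persona(SONG_PERSONAS, SONG_WEIGHTS, rng) for _ in range(1000)]
print({p: sample.count(p) for p in SONG_PERSONAS})
```

Using a seeded `random.Random` instance instead of the module-level functions keeps repeated runs identical, which can help when comparing simulated data sets.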





###  Assign Personas 

The assign_personas function loops through every row of a specific Hallux entity type and assigns each one a persona.

*This is where the hard work is done!*  

In [6]:

def assign_personas ( a_entity_type, a_persona_type ):   

    
    ret_val = -1
    
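    # only band members are processed for now; note the trailing comma:
    # without it ('Member') is a plain string and "in" does a substring test
    if a_entity_type not in ('Member',) :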
        return 0
    
    # get all the entity ids
    
    if a_entity_type in Entities:
        
        entity_list = Entities[a_entity_type]
        sql_select = 'SELECT ' + entity_list[1] + ' FROM ' + entity_list[0] + ' order by ' + entity_list[1]
        df_entity = pd.read_sql(sql_select, eng)
        ret_val = len(df_entity)
    
    # assign and save persona
    
    for i in range(len(df_entity)):
        
        curr_entity_id = df_entity[entity_list[1]][i]
        curr_persona   = whats_my_persona ( a_entity_type, curr_entity_id, a_persona_type )
    
        if curr_persona != '' :
            ret_sub = set_persona ( a_entity_type, curr_entity_id, a_persona_type, curr_persona )   
            if ret_sub < 0 :
                #ret_val = ret_sub
                break
            
    
    return ret_val
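The SELECT statement above is assembled from the table and column names in the `Entities` dictionary. Once entity types other than `Member` are enabled, the `Order` table would clash with a reserved word in SQL Server, so the names may need bracket quoting. The sketch below shows one possible way to build the statement; `quote_name`, `build_id_query`, and the two-entry `ENTITY_TABLES` dictionary are illustrative assumptions.

```python
# Two entries copied from the Entities dictionary, enough for the demo.
ENTITY_TABLES = {
    "Member": ("Band_Member", "Member_Id", "Musician"),
    "Order": ("Order", "Order_Id", "Experience"),
}


def quote_name(name):
    """Wrap a SQL Server identifier in brackets, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_id_query(entity_type, entities=ENTITY_TABLES):
    """Return a SELECT for the entity's id column, or None for unknown types."""
    if entity_type not in entities:
        return None
    table, id_column, _persona_type = entities[entity_type]
    return (f"SELECT {quote_name(id_column)} FROM {quote_name(table)} "
            f"ORDER BY {quote_name(id_column)}")


print(build_id_query("Order"))
print(build_id_query("Unknown"))
```

Because the names come from a fixed dictionary rather than user input, this is about correctness with reserved words more than about injection.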




### Simulate Follower Data

This function will determine which bands the specified customer profile will follow, if any.

*Note: This function is currently a stub and always returns 0.*  

In [7]:

def sim_follower_data (a_start_date, a_end_date):   
    
    ret_val = 0
    return ret_val
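As a rough idea of how the Fan persona could drive this decision, the sketch below gives each persona a chance of following any given band. The probabilities, the `FOLLOW_PROBABILITY` table, and the `pick_bands_to_follow` function are hypothetical and are not taken from Hallux.

```python
import random

# Hypothetical chance that a profile follows any given band, by Fan persona.
FOLLOW_PROBABILITY = {"Casual": 0.05, "Fan": 0.25, "Fanatic": 0.60}


def pick_bands_to_follow(fan_persona, band_ids, rng=random):
    """Return the subset of band_ids this profile follows (possibly empty)."""
    p = FOLLOW_PROBABILITY.get(fan_persona, 0.0)
    return [band_id for band_id in band_ids if rng.random() < p]


rng = random.Random(1)  # seeded so the demo is repeatable
followed = pick_bands_to_follow("Fanatic", list(range(1, 21)), rng)
print(followed)
```

An unknown persona gets a probability of 0, so the profile follows no bands, which corresponds to the "if any" case in the description.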


### Simulate Playlist Data

This function will determine which playlists the specified customer profile will create, if any.

*Note: This function is currently a stub and always returns 0.*  

In [8]:

def sim_playlist_data (a_start_date, a_end_date):   
    
    ret_val = 0
    return ret_val



### Simulate Streaming Data

This function will determine which songs the specified customer profile will stream, if any.

*Note: This function is currently a stub and always returns 0.*  

In [9]:

def sim_streaming_data (a_start_date, a_end_date):   
    
    ret_val = 0
    return ret_val
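The start and end dates suggest generating events day by day. The sketch below is one possible shape for that loop, assuming a made-up daily stream rate per Fan persona; `STREAMS_PER_DAY` and `simulate_streams` are hypothetical names, and the function returns rows in memory rather than writing to the database.

```python
import random
from datetime import date, timedelta

# Hypothetical average number of streams per day for each Fan persona.
STREAMS_PER_DAY = {"Casual": 1, "Fan": 3, "Fanatic": 8}


def simulate_streams(fan_persona, song_ids, start_date, end_date, rng=random):
    """Return (stream_date, song_id) tuples for every day in the inclusive range."""
    streams = []
    if not song_ids:
        return streams
    per_day = STREAMS_PER_DAY.get(fan_persona, 0)
    day = start_date
    while day <= end_date:
        # Vary the daily count between 0 and twice the persona's average.
        count = rng.randint(0, 2 * per_day)
        streams.extend((day, rng.choice(song_ids)) for _ in range(count))
        day += timedelta(days=1)
    return streams


rng = random.Random(3)  # seeded so the demo is repeatable
rows = simulate_streams("Fan", [101, 102, 103], date(2023, 1, 1), date(2023, 1, 7), rng)
print(len(rows))
```

The resulting rows could then be inserted through a stored procedure, in the same way that personas are saved.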




### Simulate Data

This function calls the appropriate simulation function based on the entity type. It returns -1 if the entity type has no simulation function.



In [10]:

def sim_data ( a_entity_type, a_start_date, a_end_date ):   

    ret_val = -1
    
    if a_entity_type == 'Follower':
        ret_val = sim_follower_data (a_start_date, a_end_date )
        
    if a_entity_type == 'Playlist':
        ret_val = sim_playlist_data (a_start_date, a_end_date )
        
    if a_entity_type == 'Stream':
        ret_val = sim_streaming_data (a_start_date, a_end_date )
        
    
    return ret_val
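As more entity types gain simulation functions, the chain of `if` statements could be replaced by a lookup table. The sketch below shows that alternative with placeholder functions that mirror the stubs in this example; `SIMULATORS` and `sim_data_dispatch` are hypothetical names.

```python
def sim_follower_data(a_start_date, a_end_date):
    return 0  # placeholder, mirrors the stub in this example


def sim_playlist_data(a_start_date, a_end_date):
    return 0  # placeholder, mirrors the stub in this example


def sim_streaming_data(a_start_date, a_end_date):
    return 0  # placeholder, mirrors the stub in this example


# Maps an entity type to the function that simulates its data.
SIMULATORS = {
    "Follower": sim_follower_data,
    "Playlist": sim_playlist_data,
    "Stream": sim_streaming_data,
}


def sim_data_dispatch(a_entity_type, a_start_date, a_end_date):
    """Run the simulator for the entity type, or return -1 if there is none."""
    simulator = SIMULATORS.get(a_entity_type)
    if simulator is None:
        return -1
    return simulator(a_start_date, a_end_date)


print(sim_data_dispatch("Follower", None, None))
print(sim_data_dispatch("Band", None, None))
```

Adding a new simulated entity type then only requires one more dictionary entry.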



### Here we go...

* Initialization, database engine 

* Assign personas to Hallux entities 

* Generate the data for the specified date range!


In [22]:


hallux_svr = getenv("halluxsvr")
hallux_usr = getenv("halluxusr")
hallux_psd = getenv("halluxpsd")
hallux_db  = getenv("halluxdb")


# for use in creating a dataframe
conn_string = f"mssql+pyodbc://{hallux_usr}:{hallux_psd}@{hallux_svr}/{hallux_db}?driver=SQL+Server+Native+Client+11.0&TrustServerCertificate=yes"
eng = create_engine(conn_string, fast_executemany=True)

# used for executing dynamic sql
conn   = pymssql.connect(hallux_svr, hallux_usr, hallux_psd, hallux_db)
cursor = conn.cursor()



# Let's start with band members!

# STEP 1 - loop thru the entities and assign the appropriate persona
print("Step 1")

for curr_entity, curr_value in Entities.items():
    ret = assign_personas (curr_entity, curr_value[2])
    print(curr_entity, ret)

# STEP 2 - Loop thru the entities and Simulate data 
print("Step 2")

start_date = date(2023, 1, 1)
end_date   = date(2023, 12, 31)  

for curr_entity, curr_value in Entities.items():
    ret = sim_data (curr_entity, start_date, end_date)
    print(curr_entity, ret )


conn.commit()   # pymssql does not autocommit by default
conn.close()
eng.dispose()



0
Step 1
Step 2
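The connection settings in the cell above are read with `getenv`, which returns `None` when a variable is not set. A small check before connecting can make that failure easier to spot. The sketch below is one possible approach; `read_settings` and `REQUIRED_VARS` are hypothetical names, and the demo passes in plain dictionaries so it does not depend on the real environment.

```python
import os

# The four environment variable names used in the cell above.
REQUIRED_VARS = ("halluxsvr", "halluxusr", "halluxpsd", "halluxdb")


def read_settings(environ=os.environ):
    """Return the four connection settings, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}


# Demo with made-up values instead of the real environment.
demo_env = {"halluxsvr": "localhost", "halluxusr": "demo",
            "halluxpsd": "secret", "halluxdb": "Hallux"}
print(sorted(read_settings(demo_env)))

try:
    read_settings({"halluxsvr": "localhost"})
except RuntimeError as err:
    print(err)
```

Failing early with the names of the missing variables avoids a less obvious error later from the database driver.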
