# Weekly Challenge 12

*Original URL* https://community.alteryx.com/t5/Weekly-Challenge/Challenge-12-Creating-an-HR-Hierarchy/td-p/36740 and [**My Alteryx Approach**](https://github.com/dsmdavid/Alteryx-Weekly-Challenge/tree/master/submitted/sub_Challenge%2312)

## Brief

### Creating an HR Hierarchy:

For this challenge let's look at creating a multi-level hierarchy from employee-manager data. As always, there are several ways to do this challenge. I have designated it as an advanced challenge because there is an elegant way to solve it using iterative macros. The advantage of the iterative macro solution is that it is dynamic: hard-coded solutions would get you to the answer with this data, but if the depth of the hierarchy were to change, you would have to modify the workflow to support the change. It is a great example of how iterative macros can make a workflow dynamic.

#### The use case:

An HR department wants to use Alteryx to quickly understand the reporting structure for employees across their organization.

The input source contains 5 employees, each with an identifier that uniquely identifies the individual and the identifier of the manager they report to.
The goal is to create a hierarchy field identifying each relationship between an employee and their manager(s). For example, a Director reports directly to the Vice President, who is 1 level up. The Director is then 2 levels away from the CEO (in this data set). As a result, the hierarchy identifier represents how many levels removed the employee is from each member of the management team they report into.
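
The levels described above can be sketched without pandas as a loop that walks up the reporting chain until no manager is left, which is roughly what an iterative macro does. This is only an illustration: the `reports_to` dictionary and the `levels_above` helper are hypothetical names, and the pairs are the five employees from this challenge's input data.

```python
# Employee -> manager pairs from the challenge data; None marks the top of the tree.
reports_to = {
    "Analyst": "Manager",
    "Manager": "Director",
    "Director": "Vice President",
    "Vice President": "CEO",
    "CEO": None,
}


def levels_above(employee, reports_to):
    """Return (level, manager) pairs, walking up until no manager is left."""
    chain = []
    level = 1
    manager = reports_to[employee]
    while manager is not None:
        chain.append((level, manager))
        manager = reports_to[manager]
        level += 1
    return chain


director_chain = levels_above("Director", reports_to)
print(director_chain)  # [(1, 'Vice President'), (2, 'CEO')]
```

Because the loop stops only when it runs out of managers, the same code handles a deeper hierarchy without modification.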

In [1]:
import pandas as pd

## Approach I want to follow:
1. Read the data.
1. Create function to obtain the hierarchy from a single element.
1. Obtain the hierarchy for all.
1. Summarize the results.

**<a href=#condensed> Go to condensed approach </a>**


In [2]:
#Load the data
df = pd.read_csv("./12_files/input.csv", encoding="latin")
df.head()

Unnamed: 0,employee,id,man_id
0,Analyst,3,2.0
1,Manager,2,1.0
2,Director,1,4.0
3,Vice President,4,5.0
4,CEO,5,


In [3]:
df.set_index('id', inplace=True)

In [4]:
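# The CEO has no manager: use 0 as a sentinel and cast the ids back to int.
# Assigning the result avoids a chained in-place fillna, which newer pandas warns about.
df['man_id'] = df['man_id'].fillna(0).astype(int)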
df

id,employee,man_id
3,Analyst,2
2,Manager,1
1,Director,4
4,Vice President,5
5,CEO,0


In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 3 to 5
Data columns (total 2 columns):
employee    5 non-null object
man_id      5 non-null int64
dtypes: int64(1), object(1)
memory usage: 120.0+ bytes


In [6]:
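def findHierarchy(someone_id):
    '''Return the managers above someone_id as a comma-separated string ending in "None".'''

    # Look up who this person reports to; 0 marks the top of the hierarchy.
    manager_id = df.loc[someone_id,'man_id']

    if manager_id == 0:
        return 'None'
    else:
        # Prepend the direct manager's title, then recurse up the chain.
        manager = df.loc[manager_id,'employee']

        return manager+","+findHierarchy(manager_id)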
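
The recursive function assumes the data forms a clean tree. If two ids ever pointed at each other, it would recurse until Python's recursion limit. A possible loop-based variant with a cycle check is sketched below; `find_hierarchy_iter` and the inline `demo` frame are hypothetical names added for illustration (the frame repeats the five input rows so the block runs on its own), and it returns a list rather than a comma-separated string.

```python
import pandas as pd

# Same five rows as the input file, rebuilt inline so the sketch is self-contained.
demo = pd.DataFrame(
    {
        "employee": ["Analyst", "Manager", "Director", "Vice President", "CEO"],
        "man_id": [2, 1, 4, 5, 0],
    },
    index=pd.Index([3, 2, 1, 4, 5], name="id"),
)


def find_hierarchy_iter(frame, someone_id):
    """Return the managers above someone_id as a list, nearest first."""
    managers = []
    seen = {someone_id}  # ids already visited, used to detect cycles
    manager_id = int(frame.loc[someone_id, "man_id"])
    while manager_id != 0:  # 0 is the "no manager" sentinel
        if manager_id in seen:
            raise ValueError(f"cycle detected at id {manager_id}")
        seen.add(manager_id)
        managers.append(frame.loc[manager_id, "employee"])
        manager_id = int(frame.loc[manager_id, "man_id"])
    return managers


analyst_chain = find_hierarchy_iter(demo, 3)
print(analyst_chain)  # ['Manager', 'Director', 'Vice President', 'CEO']
```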

In [7]:
hierarchy = {}

In [8]:
def retDf(someone_id):
    someone = df.loc[someone_id,'employee']
    hierarchy[someone] = [findHierarchy(someone_id)]

In [9]:
for i in df.index:
    retDf(i)
hierarchy

{'Analyst': ['Manager,Director,Vice President,CEO,None'],
 'Manager': ['Director,Vice President,CEO,None'],
 'Director': ['Vice President,CEO,None'],
 'Vice President': ['CEO,None'],
 'CEO': ['None']}

In [10]:
df_h = pd.DataFrame.from_dict(hierarchy, orient='index')

In [11]:
df_h

Unnamed: 0,0
Analyst,"Manager,Director,Vice President,CEO,None"
Manager,"Director,Vice President,CEO,None"
Director,"Vice President,CEO,None"
Vice President,"CEO,None"
CEO,


In [12]:
new = df_h[0].str.split(",", expand = True)
new

Unnamed: 0,0,1,2,3,4
Analyst,Manager,Director,Vice President,CEO,
Manager,Director,Vice President,CEO,,
Director,Vice President,CEO,,,
Vice President,CEO,,,,
CEO,,,,,


In [13]:
df_n = new.unstack().reset_index()
df_n.columns = ['order','employee','manager']

In [14]:
df_n

Unnamed: 0,order,employee,manager
0,0,Analyst,Manager
1,0,Manager,Director
2,0,Director,Vice President
3,0,Vice President,CEO
4,0,CEO,
5,1,Analyst,Director
6,1,Manager,Vice President
7,1,Director,CEO
8,1,Vice President,
9,1,CEO,


In [15]:
df_n.dropna(axis=0).reset_index().sort_values(by=['employee','order'])

Unnamed: 0,index,order,employee,manager
0,0,0,Analyst,Manager
5,5,1,Analyst,Director
9,10,2,Analyst,Vice President
12,15,3,Analyst,CEO
14,20,4,Analyst,
4,4,0,CEO,
2,2,0,Director,Vice President
7,7,1,Director,CEO
11,12,2,Director,
1,1,0,Manager,Director


In [16]:
df_n.dropna(axis=0).sort_values(by=['employee','order']).reset_index(drop=True)

Unnamed: 0,order,employee,manager
0,0,Analyst,Manager
1,1,Analyst,Director
2,2,Analyst,Vice President
3,3,Analyst,CEO
4,4,Analyst,
5,0,CEO,
6,0,Director,Vice President
7,1,Director,CEO
8,2,Director,
9,0,Manager,Director
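
One thing to watch in the output above: `findHierarchy` ends every chain with the string `'None'`, and `dropna` only removes real missing values, so rows such as the Analyst's order-4 row survive. A minimal sketch of the difference, using a made-up three-row `sample` frame rather than the notebook's data:

```python
import pandas as pd

# Hypothetical rows: a real manager, the 'None' sentinel string, and a true missing value.
sample = pd.DataFrame(
    {
        "order": [3, 4, 1],
        "employee": ["Analyst", "Analyst", "Vice President"],
        "manager": ["CEO", "None", None],
    }
)

after_dropna = sample.dropna()  # removes only the true missing value
cleaned = after_dropna[after_dropna["manager"] != "None"]  # also removes the sentinel

print(len(after_dropna), len(cleaned))  # 2 1
```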


<a id='condensed'></a> 
## Condensed approach:


In [17]:
import time
t1 = time.time()
import pandas as pd


#Input data
df = pd.read_csv("./12_files/input.csv", encoding="latin").fillna(value=0)
df.set_index('id', drop=False, inplace=True)

#Create functions:
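def findHierarchy(someone_id):
    '''Return the managers above someone_id as a comma-separated string ending in "None".'''

    # Look up who this person reports to; 0 marks the top of the hierarchy.
    manager_id = df.loc[someone_id,'man_id']

    if manager_id == 0:
        return 'None'
    else:
        # Prepend the direct manager's title, then recurse up the chain.
        manager = df.loc[manager_id,'employee']

        return manager+","+findHierarchy(manager_id)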

#Get the hierarchy
df['managers'] = df['id'].apply(findHierarchy)

#Format the hierarchy
df_h = df.set_index('employee')['managers'].str.split(",",expand=True).unstack().reset_index()
df_h.columns = ['order','employee','manager']

df_h = df_h[df_h['manager'] !='None'].dropna().sort_values(by=['employee','order']).reset_index(drop=True)

t2 =time.time()
df_h

Unnamed: 0,order,employee,manager
0,0,Analyst,Manager
1,1,Analyst,Director
2,2,Analyst,Vice President
3,3,Analyst,CEO
4,0,Director,Vice President
5,1,Director,CEO
6,0,Manager,Director
7,1,Manager,Vice President
8,2,Manager,CEO
9,0,Vice President,CEO


In [18]:
print(t2-t1)

0.01805901527404785


In [19]:
dict(zip(df['employee'],list(map(findHierarchy, df['id']))))

{'Analyst': 'Manager,Director,Vice President,CEO,None',
 'Manager': 'Director,Vice President,CEO,None',
 'Director': 'Vice President,CEO,None',
 'Vice President': 'CEO,None',
 'CEO': 'None'}
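
As a possible alternative to joining and re-splitting strings, the recursion could return a list and `DataFrame.explode` could produce the long format directly. This is a sketch under assumptions, not the approach used above: `manager_list`, `demo`, and `long_form` are hypothetical names, and the frame is rebuilt inline from the five input rows so the block runs without the CSV file.

```python
import pandas as pd

# The five input rows, with id kept both as a column and as the index.
demo = pd.DataFrame(
    {
        "id": [3, 2, 1, 4, 5],
        "employee": ["Analyst", "Manager", "Director", "Vice President", "CEO"],
        "man_id": [2, 1, 4, 5, 0],
    }
).set_index("id", drop=False)


def manager_list(someone_id):
    """Return manager titles above someone_id, nearest first (empty at the top)."""
    manager_id = int(demo.loc[someone_id, "man_id"])
    if manager_id == 0:
        return []
    return [demo.loc[manager_id, "employee"]] + manager_list(manager_id)


demo["managers"] = demo["id"].apply(manager_list)

# One row per (employee, manager) pair; the CEO's empty list becomes NaN and is dropped.
long_form = (
    demo[["employee", "managers"]]
    .explode("managers")
    .dropna(subset=["managers"])
    .rename(columns={"managers": "manager"})
    .reset_index(drop=True)
)
# Position of each manager within the employee's chain, starting at 0.
long_form["order"] = long_form.groupby("employee").cumcount()
long_form = long_form.sort_values(["employee", "order"]).reset_index(drop=True)
print(long_form)
```

With lists there is no `'None'` sentinel to filter out afterwards, at the cost of one extra `groupby` to recover the order column.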