# Applying Advanced Transformation

### The Data

You will be working with a heavily modified version of the Superheroes dataset from Kaggle.

The dataset includes two CSV files:

superhero_info.csv:
* Contains Name, Publisher, Demographic Info, and Body measurements.

superhero_powers.csv:
* Contains Hero name and list of powers

## The Task
Your task is two-fold:

I. Clean the files and combine them into one final DataFrame.

This dataframe should have the following columns:
* Hero (Just the name of the Hero)
* Publisher
* Gender
* Eye color
* Race
* Hair color
* Height (numeric)
* Skin color
* Alignment
* Weight (numeric)
Plus, one-hot-encoded columns for every power that appears in the dataset. E.g.:
* Agility
* Flight
* Superspeed etc.

Hint: The measurement strings contain a space before the unit, as in "100 kg" or "52.5 cm".



II. Use your combined DataFrame to answer the following questions.

* Compare the average weight of superheroes who have Super Speed to those who do not.
* What is the average height of heroes for each publisher?


### Import libraries

In [108]:
!pip install plotly



In [109]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


## Importing the OS and JSON Modules
import os,json

import plotly.express as px

In [110]:
## load data
df = pd.read_csv('Data/superhero_info - superhero_info.csv')
df.head()

Unnamed: 0,Hero|Publisher,Gender,Race,Alignment,Hair color,Eye color,Skin color,Measurements
0,A-Bomb|Marvel Comics,Male,Human,good,No Hair,yellow,Unknown,"{'Height': '203.0 cm', 'Weight': '441.0 kg'}"
1,Abe Sapien|Dark Horse Comics,Male,Icthyo Sapien,good,No Hair,blue,blue,"{'Height': '191.0 cm', 'Weight': '65.0 kg'}"
2,Abin Sur|DC Comics,Male,Ungaran,good,No Hair,blue,red,"{'Height': '185.0 cm', 'Weight': '90.0 kg'}"
3,Abomination|Marvel Comics,Male,Human / Radiation,bad,No Hair,green,Unknown,"{'Height': '203.0 cm', 'Weight': '441.0 kg'}"
4,Absorbing Man|Marvel Comics,Male,Human,bad,No Hair,blue,Unknown,"{'Height': '193.0 cm', 'Weight': '122.0 kg'}"


In [111]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 463 entries, 0 to 462
Data columns (total 8 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   Hero|Publisher  463 non-null    object
 1   Gender          463 non-null    object
 2   Race            463 non-null    object
 3   Alignment       463 non-null    object
 4   Hair color      463 non-null    object
 5   Eye color       463 non-null    object
 6   Skin color      463 non-null    object
 7   Measurements    463 non-null    object
dtypes: object(8)
memory usage: 29.1+ KB


In [112]:
# load second csv
powers_df = pd.read_csv('Data/superhero_powers - superhero_powers.csv')

# check
powers_df.info()
powers_df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 667 entries, 0 to 666
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   hero_names  667 non-null    object
 1   Powers      667 non-null    object
dtypes: object(2)
memory usage: 10.5+ KB


Unnamed: 0,hero_names,Powers
0,3-D Man,"Agility,Super Strength,Stamina,Super Speed"
1,A-Bomb,"Accelerated Healing,Durability,Longevity,Super..."
2,Abe Sapien,"Agility,Accelerated Healing,Cold Resistance,Du..."
3,Abin Sur,Lantern Power Ring
4,Abomination,"Accelerated Healing,Intelligence,Super Strengt..."


### Separate a string column into multiple columns

In [114]:
# Exploring existing format with a few examples
df['Hero|Publisher'].head()

0            A-Bomb|Marvel Comics
1    Abe Sapien|Dark Horse Comics
2              Abin Sur|DC Comics
3       Abomination|Marvel Comics
4     Absorbing Man|Marvel Comics
Name: Hero|Publisher, dtype: object

In [115]:
## adding expand=True
df['Hero|Publisher'].str.split('|',expand=True)

Unnamed: 0,0,1
0,A-Bomb,Marvel Comics
1,Abe Sapien,Dark Horse Comics
2,Abin Sur,DC Comics
3,Abomination,Marvel Comics
4,Absorbing Man,Marvel Comics
...,...,...
458,Yellowjacket,Marvel Comics
459,Yellowjacket II,Marvel Comics
460,Yoda,George Lucas
461,Zatanna,DC Comics


In [116]:
## save the 2 new columns into the dataframe
df[['Hero','Publisher']] = df['Hero|Publisher'].str.split('|',expand=True)
df.head(2)

Unnamed: 0,Hero|Publisher,Gender,Race,Alignment,Hair color,Eye color,Skin color,Measurements,Hero,Publisher
0,A-Bomb|Marvel Comics,Male,Human,good,No Hair,yellow,Unknown,"{'Height': '203.0 cm', 'Weight': '441.0 kg'}",A-Bomb,Marvel Comics
1,Abe Sapien|Dark Horse Comics,Male,Icthyo Sapien,good,No Hair,blue,blue,"{'Height': '191.0 cm', 'Weight': '65.0 kg'}",Abe Sapien,Dark Horse Comics


In [117]:
## drop the original column 
df = df.drop(columns=['Hero|Publisher'])
df.head(2)

Unnamed: 0,Gender,Race,Alignment,Hair color,Eye color,Skin color,Measurements,Hero,Publisher
0,Male,Human,good,No Hair,yellow,Unknown,"{'Height': '203.0 cm', 'Weight': '441.0 kg'}",A-Bomb,Marvel Comics
1,Male,Icthyo Sapien,good,No Hair,blue,blue,"{'Height': '191.0 cm', 'Weight': '65.0 kg'}",Abe Sapien,Dark Horse Comics
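The two-column assignment above relies on every value containing exactly one `|`. If a hero name ever contained the delimiter, the split would produce more than two columns and the assignment would fail. A minimal sketch of one way to guard against that, assuming the publisher always follows the last `|` (the sample frame, including the name `Odd|Name`, is made up for illustration):

```python
import pandas as pd

# Hypothetical sample; the second hero name contains an extra '|'
sample = pd.DataFrame({'Hero|Publisher': ['A-Bomb|Marvel Comics',
                                          'Odd|Name|DC Comics']})

# rsplit with n=1 splits only on the last '|', so exactly two columns result
sample[['Hero', 'Publisher']] = sample['Hero|Publisher'].str.rsplit('|', n=1, expand=True)
sample = sample.drop(columns=['Hero|Publisher'])
print(sample)
```

For the data shown in this notebook, `str.split('|', expand=True)` already yields two columns, so this is only a defensive variant.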


## Converting a string column of dictionaries into actual dictionaries


In [118]:
## examining a single value from the Measurements col
measurement = df.loc[0,'Measurements']
print(type(measurement))
measurement

<class 'str'>


"{'Height': '203.0 cm', 'Weight': '441.0 kg'}"

In [119]:
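import json
## json.loads fails on the raw string (JSONDecodeError), because JSON requires double-quoted strings
#json.loads(measurement)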

In [120]:
measurement

"{'Height': '203.0 cm', 'Weight': '441.0 kg'}"

In [121]:
measurement = measurement.replace("'",'"')
measurement

'{"Height": "203.0 cm", "Weight": "441.0 kg"}'

In [122]:
## now we can use json.loads
fixed_measurement = json.loads(measurement)
print(type(fixed_measurement))
fixed_measurement

<class 'dict'>


{'Height': '203.0 cm', 'Weight': '441.0 kg'}

In [123]:
## use .str.replace to replace all single quotes
df['Measurements'] = df['Measurements'].str.replace("'",'"')
## Apply the json.loads to the full column
df['Measurements'] = df['Measurements'].apply(json.loads)
df['Measurements'].head()

0    {'Height': '203.0 cm', 'Weight': '441.0 kg'}
1     {'Height': '191.0 cm', 'Weight': '65.0 kg'}
2     {'Height': '185.0 cm', 'Weight': '90.0 kg'}
3    {'Height': '203.0 cm', 'Weight': '441.0 kg'}
4    {'Height': '193.0 cm', 'Weight': '122.0 kg'}
Name: Measurements, dtype: object

In [124]:
## check a single value after transformation
test_measurement = df.loc[0, 'Measurements']
print(type(test_measurement))
test_measurement

<class 'dict'>


{'Height': '203.0 cm', 'Weight': '441.0 kg'}
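Replacing every single quote works here because none of the measurement values contain an apostrophe. As an alternative sketch, `ast.literal_eval` from the standard library parses Python-style literals directly, so the quotes do not need to be rewritten first. The sample Series below is a stand-in for the `Measurements` column, not the real data:

```python
import ast
import pandas as pd

# Hypothetical sample in the same format as the Measurements column
raw = pd.Series(["{'Height': '203.0 cm', 'Weight': '441.0 kg'}",
                 "{'Height': '191.0 cm', 'Weight': '65.0 kg'}"])

# literal_eval safely evaluates Python literals (dicts, lists, strings, numbers)
parsed = raw.apply(ast.literal_eval)
print(type(parsed[0]))
print(parsed[0]['Height'])
```

Either approach ends with a column of real dictionaries; `json.loads` is the better fit when the source is known to be valid JSON.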

In [125]:
# We now want to convert the single "Measurements" column into 2 separate columns
Height_Weight = df['Measurements'].apply(pd.Series)
Height_Weight

Unnamed: 0,Height,Weight
0,203.0 cm,441.0 kg
1,191.0 cm,65.0 kg
2,185.0 cm,90.0 kg
3,203.0 cm,441.0 kg
4,193.0 cm,122.0 kg
...,...,...
458,183.0 cm,83.0 kg
459,165.0 cm,52.0 kg
460,66.0 cm,17.0 kg
461,170.0 cm,57.0 kg
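`apply(pd.Series)` builds a new Series for every row, which can become slow on larger frames. A minimal sketch of an alternative that builds the expanded frame in one step from the list of dictionaries; the two-row `measurements` Series is made up for illustration:

```python
import pandas as pd

# Hypothetical sample: a column that already holds real dictionaries
measurements = pd.Series([{'Height': '203.0 cm', 'Weight': '441.0 kg'},
                          {'Height': '191.0 cm', 'Weight': '65.0 kg'}])

# Build the frame once from the list of dicts, reusing the original index
# so that a later pd.concat(..., axis=1) still lines the rows up
expanded = pd.DataFrame(measurements.tolist(), index=measurements.index)
print(expanded)
```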


In [126]:
# concat Height_Weight with original dataframe
df = pd.concat((df, Height_Weight), axis = 1)
df.head(2)

Unnamed: 0,Gender,Race,Alignment,Hair color,Eye color,Skin color,Measurements,Hero,Publisher,Height,Weight
0,Male,Human,good,No Hair,yellow,Unknown,"{'Height': '203.0 cm', 'Weight': '441.0 kg'}",A-Bomb,Marvel Comics,203.0 cm,441.0 kg
1,Male,Icthyo Sapien,good,No Hair,blue,blue,"{'Height': '191.0 cm', 'Weight': '65.0 kg'}",Abe Sapien,Dark Horse Comics,191.0 cm,65.0 kg


In [127]:
# drop Measurements column and double checking
df = df.drop(columns=['Measurements'])
df.head(2)

Unnamed: 0,Gender,Race,Alignment,Hair color,Eye color,Skin color,Hero,Publisher,Height,Weight
0,Male,Human,good,No Hair,yellow,Unknown,A-Bomb,Marvel Comics,203.0 cm,441.0 kg
1,Male,Icthyo Sapien,good,No Hair,blue,blue,Abe Sapien,Dark Horse Comics,191.0 cm,65.0 kg


In [128]:
# change 'Height' and 'Weight' column names
df.rename(columns = {'Height': 'Height (cm)',
                      'Weight': 'Weight (kg)'}, 
            inplace = True)

# check
df.head(2)

Unnamed: 0,Gender,Race,Alignment,Hair color,Eye color,Skin color,Hero,Publisher,Height (cm),Weight (kg)
0,Male,Human,good,No Hair,yellow,Unknown,A-Bomb,Marvel Comics,203.0 cm,441.0 kg
1,Male,Icthyo Sapien,good,No Hair,blue,blue,Abe Sapien,Dark Horse Comics,191.0 cm,65.0 kg


In [129]:
# strip the ' cm' and ' kg' unit suffixes from 'Height (cm)' and 'Weight (kg)'
df['Height (cm)'] = df['Height (cm)'].str.replace(' cm', '', regex = False)
df['Weight (kg)'] = df['Weight (kg)'].str.replace(' kg', '', regex = False)

# check
df[['Height (cm)', 'Weight (kg)']].head(2)

Unnamed: 0,Height (cm),Weight (kg)
0,203.0,441.0
1,191.0,65.0


In [130]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 463 entries, 0 to 462
Data columns (total 10 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Gender       463 non-null    object
 1   Race         463 non-null    object
 2   Alignment    463 non-null    object
 3   Hair color   463 non-null    object
 4   Eye color    463 non-null    object
 5   Skin color   463 non-null    object
 6   Hero         463 non-null    object
 7   Publisher    463 non-null    object
 8   Height (cm)  463 non-null    object
 9   Weight (kg)  463 non-null    object
dtypes: object(10)
memory usage: 36.3+ KB


In [131]:
# convert 'Height (cm)' and 'Weight (kg)' columns from object-
# types to numeric

df['Height (cm)'] = df['Height (cm)'].astype(float)
df['Weight (kg)'] = df['Weight (kg)'].astype(float)

# check
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 463 entries, 0 to 462
Data columns (total 10 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   Gender       463 non-null    object 
 1   Race         463 non-null    object 
 2   Alignment    463 non-null    object 
 3   Hair color   463 non-null    object 
 4   Eye color    463 non-null    object 
 5   Skin color   463 non-null    object 
 6   Hero         463 non-null    object 
 7   Publisher    463 non-null    object 
 8   Height (cm)  463 non-null    float64
 9   Weight (kg)  463 non-null    float64
dtypes: float64(2), object(8)
memory usage: 36.3+ KB
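`astype(float)` raises an error as soon as one value cannot be parsed. All 463 values convert cleanly here, but a more tolerant sketch is shown below for messier inputs: it extracts the leading number with a regular expression and lets `pd.to_numeric` turn anything unparseable into `NaN`. The sample values, including `'unknown'`, are invented for illustration:

```python
import pandas as pd

# Hypothetical sample; the last value is deliberately malformed
heights = pd.Series(['203.0 cm', '191.0 cm', 'unknown'])

# Capture the first number in each string; rows without a match become NaN
numbers = heights.str.extract(r'(\d+(?:\.\d+)?)', expand=False)

# Convert to float; errors='coerce' maps anything unparseable to NaN
heights_numeric = pd.to_numeric(numbers, errors='coerce')
print(heights_numeric)
```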


## Power

In [132]:
powers_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 667 entries, 0 to 666
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   hero_names  667 non-null    object
 1   Powers      667 non-null    object
dtypes: object(2)
memory usage: 10.5+ KB


In [133]:
# preview powers_df and confirm its type
print(type(powers_df))
print(powers_df)

<class 'pandas.core.frame.DataFrame'>
          hero_names                                             Powers
0            3-D Man         Agility,Super Strength,Stamina,Super Speed
1             A-Bomb  Accelerated Healing,Durability,Longevity,Super...
2         Abe Sapien  Agility,Accelerated Healing,Cold Resistance,Du...
3           Abin Sur                                 Lantern Power Ring
4        Abomination  Accelerated Healing,Intelligence,Super Strengt...
..               ...                                                ...
662  Yellowjacket II                 Flight,Energy Blasts,Size Changing
663             Ymir  Cold Resistance,Durability,Longevity,Super Str...
664             Yoda  Agility,Stealth,Danger Sense,Marksmanship,Weap...
665          Zatanna  Cryokinesis,Telepathy,Magic,Fire Control,Proba...
666             Zoom  Super Speed,Intangibility,Time Travel,Time Man...

[667 rows x 2 columns]


In [134]:
## showing the lists are really strings
powers_df.loc[0,'Powers']

'Agility,Super Strength,Stamina,Super Speed'

In [145]:
# Create a new column 

# split string on comma
powers_df['powers_split'] = powers_df['Powers'].str.split(',')

# check
powers_df[['Powers', 'powers_split']].head(2)

Unnamed: 0,Powers,powers_split
0,"Agility,Super Strength,Stamina,Super Speed","[Agility, Super Strength, Stamina, Super Speed]"
1,"Accelerated Healing,Durability,Longevity,Super...","[Accelerated Healing, Durability, Longevity, S..."


In [148]:
powers_df['Powers'].value_counts()

Intelligence                                           8
Durability,Super Strength                              5
Agility,Stealth,Marksmanship,Weapons Master,Stamina    4
Marksmanship                                           ...

In [154]:
## exploding the column of lists

exploded = powers_df.explode('powers_split')
exploded[['hero_names', 'Powers', 'powers_split']].head(5)


Unnamed: 0,hero_names,Powers,powers_split
0,3-D Man,"Agility,Super Strength,Stamina,Super Speed",Agility
0,3-D Man,"Agility,Super Strength,Stamina,Super Speed",Super Strength
0,3-D Man,"Agility,Super Strength,Stamina,Super Speed",Stamina
0,3-D Man,"Agility,Super Strength,Stamina,Super Speed",Super Speed
1,A-Bomb,"Accelerated Healing,Durability,Longevity,Super...",Accelerated Healing


In [155]:
## saving the unique values from the exploded column
cols_to_make = exploded['powers_split'].dropna().unique()
cols_to_make

array(['Agility', 'Super Strength', 'Stamina', 'Super Speed',
       'Accelerated Healing', 'Durability', 'Longevity', 'Camouflage',
       'Self-Sustenance', 'Cold Resistance', 'Underwater breathing',
       'Marksmanship', 'Weapons Master', 'Intelligence', 'Telepathy',
       'Immortality', 'Reflexes', 'Enhanced Sight', 'Sub-Mariner',
       'Lantern Power Ring', 'Invulnerability', 'Animation',
       'Super Breath', 'Dimensional Awareness', 'Flight', 'Size Changing',
       'Teleportation', 'Magic', 'Dimensional Travel',
       'Molecular Manipulation', 'Energy Manipulation', 'Power Cosmic',
       'Energy Absorption', 'Elemental Transmogrification',
       'Fire Resistance', 'Natural Armor', 'Heat Resistance',
       'Matter Absorption', 'Regeneration', 'Stealth', 'Power Suit',
       'Energy Blasts', 'Energy Beams', 'Heat Generation', 'Danger Sense',
       'Phasing', 'Force Fields', 'Hypnokinesis', 'Invisibility',
       'Enhanced Senses', 'Jump', 'Shapeshifting', 'Elasticity',
 

In [157]:
# create a boolean column for each unique power:
# True if the hero's 'Powers' string contains that power name
frames = [powers_df]

for col in cols_to_make:
    frame = pd.DataFrame()
    frame[col] = powers_df['Powers'].str.contains(col, regex = False)
    frames.append(frame)

# concatenate once, after all columns have been built
merged_powers = pd.concat(frames, axis = 1)

# check
merged_powers.head()

Unnamed: 0,hero_names,Powers,powers_split,Agility,Super Strength,Stamina,Super Speed,Accelerated Healing,Durability,Longevity,...,Weather Control,Omnipresent,Omniscient,Hair Manipulation,Nova Force,Odin Force,Phoenix Force,Intuitive aptitude,Melting,Changing Armor
0,3-D Man,"Agility,Super Strength,Stamina,Super Speed","[Agility, Super Strength, Stamina, Super Speed]",True,True,True,True,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1,A-Bomb,"Accelerated Healing,Durability,Longevity,Super...","[Accelerated Healing, Durability, Longevity, S...",False,True,True,False,True,True,True,...,False,False,False,False,False,False,False,False,False,False
2,Abe Sapien,"Agility,Accelerated Healing,Cold Resistance,Du...","[Agility, Accelerated Healing, Cold Resistance...",True,True,True,False,True,True,True,...,False,False,False,False,False,False,False,False,False,False
3,Abin Sur,Lantern Power Ring,[Lantern Power Ring],False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
4,Abomination,"Accelerated Healing,Intelligence,Super Strengt...","[Accelerated Healing, Intelligence, Super Stre...",False,True,True,True,True,False,False,...,False,False,False,False,False,False,False,False,False,False
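`str.contains` matches substrings, so a power whose name appears inside a longer power name would also be flagged. The sketch below uses a made-up pair, "Magic" and "Magic Resistance", to show the effect and an alternative that matches whole comma-separated values with `str.get_dummies`. The sample frame is hypothetical; whether the real data contains such overlapping names would need to be checked against `cols_to_make`:

```python
import pandas as pd

# Hypothetical sample; 'Magic Resistance' contains the substring 'Magic'
sample = pd.DataFrame({'hero_names': ['Hero A', 'Hero B'],
                       'Powers': ['Magic,Flight', 'Magic Resistance']})

# Substring matching flags Hero B as having 'Magic'
substring_match = sample['Powers'].str.contains('Magic', regex=False)

# get_dummies splits on the separator and creates one column per exact value
dummies = sample['Powers'].str.get_dummies(sep=',').astype(bool)
encoded = pd.concat([sample[['hero_names']], dummies], axis=1)
print(encoded)
```

With this approach the intermediate `powers_split` column and the loop are not needed, at the cost of the columns coming out in alphabetical order rather than order of first appearance.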


In [159]:
# drop 'Powers' and 'powers_split' columns
merged_powers.drop(columns = ['Powers', 'powers_split'], inplace = True)

# check
merged_powers.head()

Unnamed: 0,hero_names,Agility,Super Strength,Stamina,Super Speed,Accelerated Healing,Durability,Longevity,Camouflage,Self-Sustenance,...,Weather Control,Omnipresent,Omniscient,Hair Manipulation,Nova Force,Odin Force,Phoenix Force,Intuitive aptitude,Melting,Changing Armor
0,3-D Man,True,True,True,True,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1,A-Bomb,False,True,True,False,True,True,True,True,True,...,False,False,False,False,False,False,False,False,False,False
2,Abe Sapien,True,True,True,False,True,True,True,False,False,...,False,False,False,False,False,False,False,False,False,False
3,Abin Sur,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
4,Abomination,False,True,True,True,True,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False


In [171]:
# change 'hero_names' column name to match df['Hero']
merged_powers.rename(columns = {'hero_names': 'Hero'}, 
                     inplace = True)

# check
merged_powers.head()

Unnamed: 0,Hero,Agility,Super Strength,Stamina,Super Speed,Accelerated Healing,Durability,Longevity,Camouflage,Self-Sustenance,...,Weather Control,Omnipresent,Omniscient,Hair Manipulation,Nova Force,Odin Force,Phoenix Force,Intuitive aptitude,Melting,Changing Armor
0,3-D Man,True,True,True,True,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1,A-Bomb,False,True,True,False,True,True,True,True,True,...,False,False,False,False,False,False,False,False,False,False
2,Abe Sapien,True,True,True,False,True,True,True,False,False,...,False,False,False,False,False,False,False,False,False,False
3,Abin Sur,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
4,Abomination,False,True,True,True,True,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False


In [161]:
# check info for merged_powers before merging
merged_powers.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 667 entries, 0 to 666
Columns: 168 entries, Hero to Changing Armor
dtypes: bool(167), object(1)
memory usage: 114.1+ KB


In [163]:
# check info for df before merging
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 463 entries, 0 to 462
Data columns (total 10 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   Gender       463 non-null    object 
 1   Race         463 non-null    object 
 2   Alignment    463 non-null    object 
 3   Hair color   463 non-null    object 
 4   Eye color    463 non-null    object 
 5   Skin color   463 non-null    object 
 6   Hero         463 non-null    object 
 7   Publisher    463 non-null    object 
 8   Height (cm)  463 non-null    float64
 9   Weight (kg)  463 non-null    float64
dtypes: float64(2), object(8)
memory usage: 36.3+ KB


In [169]:

# change the order of columns for readability
df2 = df[['Hero', 'Publisher', 'Gender', 'Race',
          'Alignment', 'Hair color', 'Eye color',
          'Skin color', 'Height (cm)', 'Weight (kg)']]

# check
df2.head()

Unnamed: 0,Hero,Publisher,Gender,Race,Alignment,Hair color,Eye color,Skin color,Height (cm),Weight (kg)
0,A-Bomb,Marvel Comics,Male,Human,good,No Hair,yellow,Unknown,203.0,441.0
1,Abe Sapien,Dark Horse Comics,Male,Icthyo Sapien,good,No Hair,blue,blue,191.0,65.0
2,Abin Sur,DC Comics,Male,Ungaran,good,No Hair,blue,red,185.0,90.0
3,Abomination,Marvel Comics,Male,Human / Radiation,bad,No Hair,green,Unknown,203.0,441.0
4,Absorbing Man,Marvel Comics,Male,Human,bad,No Hair,blue,Unknown,193.0,122.0


In [170]:
# combine two dataframes on df['Hero'] and merged_powers['Hero']
hero_df = pd.merge(df2, 
                   merged_powers, 
                   on = 'Hero')

# check
hero_df.head()

Unnamed: 0,Hero,Publisher,Gender,Race,Alignment,Hair color,Eye color,Skin color,Height (cm),Weight (kg),...,Weather Control,Omnipresent,Omniscient,Hair Manipulation,Nova Force,Odin Force,Phoenix Force,Intuitive aptitude,Melting,Changing Armor
0,A-Bomb,Marvel Comics,Male,Human,good,No Hair,yellow,Unknown,203.0,441.0,...,False,False,False,False,False,False,False,False,False,False
1,Abe Sapien,Dark Horse Comics,Male,Icthyo Sapien,good,No Hair,blue,blue,191.0,65.0,...,False,False,False,False,False,False,False,False,False,False
2,Abin Sur,DC Comics,Male,Ungaran,good,No Hair,blue,red,185.0,90.0,...,False,False,False,False,False,False,False,False,False,False
3,Abomination,Marvel Comics,Male,Human / Radiation,bad,No Hair,green,Unknown,203.0,441.0,...,False,False,False,False,False,False,False,False,False,False
4,Absorbing Man,Marvel Comics,Male,Human,bad,No Hair,blue,Unknown,193.0,122.0,...,False,False,False,False,False,False,False,False,False,False
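`pd.merge` performs an inner join by default, so heroes that appear in only one of the two files are dropped without warning. Since the info file has 463 rows and the powers file has 667, it can be worth checking how many names match. A minimal sketch using the `indicator` option on two small made-up frames (the names `info` and `powers` and their contents are hypothetical):

```python
import pandas as pd

# Hypothetical stand-ins for the cleaned info and powers frames
info = pd.DataFrame({'Hero': ['A-Bomb', 'Yoda', 'Only In Info']})
powers = pd.DataFrame({'Hero': ['A-Bomb', 'Yoda', 'Only In Powers'],
                       'Flight': [False, False, True]})

# An outer join keeps every row; '_merge' records where each row came from
check = pd.merge(info, powers, on='Hero', how='outer', indicator=True)
print(check['_merge'].value_counts())
```

Rows marked `left_only` or `right_only` are the ones an inner join would discard.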


## II. Use your combined DataFrame to answer the following questions.

* Compare the average weight of superheroes who have Super Speed to those who do not.


In [172]:
hero_df.groupby('Super Speed')['Weight (kg)'].mean().round(2)

Super Speed
False    101.77
True     129.40
Name: Weight (kg), dtype: float64

Heroes with Super Speed weigh 129.40 kg on average, compared with 101.77 kg for heroes without it.
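A mean on its own can be pulled up by a few very heavy heroes, so it can help to report the group size and median alongside it. A minimal sketch with `groupby(...).agg(...)`; the five-row sample is invented for illustration and does not reproduce the figures above:

```python
import pandas as pd

# Hypothetical sample with the same column names as hero_df
sample = pd.DataFrame({'Super Speed': [True, True, False, False, False],
                       'Weight (kg)': [441.0, 90.0, 65.0, 122.0, 57.0]})

# Several statistics per group in one call
summary = (sample.groupby('Super Speed')['Weight (kg)']
                 .agg(['count', 'mean', 'median'])
                 .round(2))
print(summary)
```

The same pattern applies to the per-publisher height question, where some publishers may be represented by only a handful of heroes.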

* What is the average height of heroes for each publisher?

In [176]:
hero_df.groupby('Publisher')['Height (cm)'].mean().round(2).sort_values(ascending=False)

Publisher
Image Comics         211.00
Marvel Comics        191.55
DC Comics            181.92
Star Trek            181.50
Team Epic TV         180.75
Unknown              178.00
Dark Horse Comics    176.91
Shueisha             171.50
George Lucas         159.60
Name: Height (cm), dtype: float64

The list above shows the average hero height for each publisher. Image Comics has the tallest average height at 211.00 cm, and George Lucas has the shortest at 159.60 cm.