# Brandon Ellis Spotlight

## Pygal data visualization library

Pygal is a data visualization library that produces interactive plots that can be embedded in web pages. What differentiates Pygal from other interactive visualization libraries is that it outputs charts as SVGs. The main drawback of SVG output is that with hundreds of thousands of data points you will have trouble rendering the SVG as a clear image.

### Pygal Setup
1. Install Pygal using "pip install pygal".
2. To render the charts in the browser you also have to install lxml. I used "conda install --name cs489 -c anaconda lxml=3.6.4", where cs489 is the name of the conda environment; substitute your own.
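
Before running the cells below, it can help to confirm that both packages are importable. The following is a minimal sketch that uses only the standard library; the helper name `is_installed` is made up for this example and is not part of Pygal.

```python
from importlib.util import find_spec


def is_installed(package_name):
    """Return True if a top-level package can be imported in this environment."""
    return find_spec(package_name) is not None


# pygal draws the charts; lxml is needed by render_in_browser()
for package in ("pygal", "lxml"):
    status = "found" if is_installed(package) else "missing"
    print(package, status)
```

If either package is reported as missing, repeat the matching install step above in the environment the notebook runs in.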



### The data
I used the DC Comics character data set from FiveThirtyEight's GitHub data repository. Because this data set records distinguishing characteristics for every character, I wanted to explore it with Pygal and eventually see whether any key characteristics separate Good Characters from Bad Characters.

In [22]:
import pandas as pd
# data source - https://github.com/fivethirtyeight/data/blob/master/comic-characters/dc-wikia-data.csv
dc_df = pd.read_csv("./dc-wikia-data.csv")
dc_df

Unnamed: 0,page_id,name,urlslug,ID,ALIGN,EYE,HAIR,SEX,GSM,ALIVE,APPEARANCES,FIRST APPEARANCE,YEAR
0,1422,Batman (Bruce Wayne),\/wiki\/Batman_(Bruce_Wayne),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,,Living Characters,3093.0,"1939, May",1939.0
1,23387,Superman (Clark Kent),\/wiki\/Superman_(Clark_Kent),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,,Living Characters,2496.0,"1986, October",1986.0
2,1458,Green Lantern (Hal Jordan),\/wiki\/Green_Lantern_(Hal_Jordan),Secret Identity,Good Characters,Brown Eyes,Brown Hair,Male Characters,,Living Characters,1565.0,"1959, October",1959.0
3,1659,James Gordon (New Earth),\/wiki\/James_Gordon_(New_Earth),Public Identity,Good Characters,Brown Eyes,White Hair,Male Characters,,Living Characters,1316.0,"1987, February",1987.0
4,1576,Richard Grayson (New Earth),\/wiki\/Richard_Grayson_(New_Earth),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,,Living Characters,1237.0,"1940, April",1940.0
5,1448,Wonder Woman (Diana Prince),\/wiki\/Wonder_Woman_(Diana_Prince),Public Identity,Good Characters,Blue Eyes,Black Hair,Female Characters,,Living Characters,1231.0,"1941, December",1941.0
6,1486,Aquaman (Arthur Curry),\/wiki\/Aquaman_(Arthur_Curry),Public Identity,Good Characters,Blue Eyes,Blond Hair,Male Characters,,Living Characters,1121.0,"1941, November",1941.0
7,1451,Timothy Drake (New Earth),\/wiki\/Timothy_Drake_(New_Earth),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,,Living Characters,1095.0,"1989, August",1989.0
8,71760,Dinah Laurel Lance (New Earth),\/wiki\/Dinah_Laurel_Lance_(New_Earth),Public Identity,Good Characters,Blue Eyes,Blond Hair,Female Characters,,Living Characters,1075.0,"1969, November",1969.0
9,1380,Flash (Barry Allen),\/wiki\/Flash_(Barry_Allen),Secret Identity,Good Characters,Blue Eyes,Blond Hair,Male Characters,,Living Characters,1028.0,"1956, October",1956.0


## Example of the outputs

Here are two different ways to output a chart: as an SVG file, or directly in the browser.
Note that .render_in_browser() requires lxml to be installed. Please refer to the Setup section above for install instructions.

In [31]:
import pygal                                                       # First import pygal
bar_chart = pygal.Bar()                                            # Then create a bar graph object
bar_chart.add('Fibonacci', [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55])  # Add some values
bar_chart.render_to_file('bar_chart.svg') # render the chart to an SVG file, then open it with a browser
bar_chart.render_in_browser() # render the chart to your default browser

file:///var/folders/__/61br4kn94fg9cf63qy5qqx1r0000gn/T/tmpuOJFtu.html


### Number of Good Characters vs Bad Characters vs Neutral Characters

In [24]:
# number Good guys vs Bad Guys by decades
# grabbing the data 

def find_decade(x): # function to bucket the characters by decades in format [x,y)
    if 1930 <= x < 1940:
        return 0
    if 1940 <= x < 1950:
        return 1
    if 1950 <= x < 1960:
        return 2
    if 1960 <= x < 1970:
        return 3
    if 1970 <= x < 1980:
        return 4
    if 1980 <= x < 1990:
        return 5
    if 1990 <= x < 2000:
        return 6
    if 2000 <= x < 2010:
        return 7
    if 2010 <= x < 2020:
        return 8
    return None # missing (NaN) or out-of-range year

good_by_dec = [0,0,0,0,0,0,0,0,0]
bad_by_dec = [0,0,0,0,0,0,0,0,0]
neut_by_dec = [0,0,0,0,0,0,0,0,0]
error = 0
    
for index, row in dc_df.iterrows():
    # find the decade the character first appeared in
    decade = find_decade(row['YEAR'])
    if decade is None: # year is missing or outside 1930-2019
        error += 1
    elif row['ALIGN'] == "Good Characters":
        good_by_dec[decade] += 1
    elif row['ALIGN'] == "Bad Characters":
        bad_by_dec[decade] += 1
    elif row['ALIGN'] == "Neutral Characters": # neutral characters
        neut_by_dec[decade] += 1
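
The if-chain in find_decade can also be written as a single integer division. The following is a sketch of the same [start, finish) bucketing and is not meant to change the results; the name `decade_index` and the `first_decade` and `n_decades` parameters are assumptions made for this example. Missing years (NaN) and years outside 1930-2019 return None.

```python
import math


def decade_index(year, first_decade=1930, n_decades=9):
    """Map a year to a decade bucket: 1930-1939 -> 0, ..., 2010-2019 -> 8."""
    # Years read by pandas may be floats, and missing years are NaN
    if year is None or math.isnan(year):
        return None
    index = int((year - first_decade) // 10)
    if 0 <= index < n_decades:
        return index
    return None


print(decade_index(1939.0))        # 0
print(decade_index(1940.0))        # 1
print(decade_index(2013.0))        # 8
print(decade_index(float("nan")))  # None
```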
        

##### Graphing number of characters by alignment into decades

In [34]:
#how many good guys vs bad guys vs neutral guys

# create a bar graph object
dec_chart = pygal.Bar(title="Character numbers by decades",x_title="Decades (format [start, finish))",  y_title="# of characters")

# You can also create the title through attributes of the pygal.Bar object type like so..
#dec_chart.title = "Character numbers by decades"

dec_chart.x_labels = ['\'30-\'40','\'40-\'50','\'50-\'60','\'60-\'70','\'70-\'80','\'80-\'90','\'90-\'00','\'00-\'10','\'10-\'20']
dec_chart.add('Bad Characters', bad_by_dec)  # Add some values
dec_chart.add('Good Characters', good_by_dec)  # Add some values
dec_chart.add('Neutral Charac', neut_by_dec)  # Add some values
dec_chart.render_in_browser()

file:///var/folders/__/61br4kn94fg9cf63qy5qqx1r0000gn/T/tmpENcotc.html


## Do characteristics of comic characters help determine if they are good, bad, or neutral?

To do this, I tried to find distinguishing features from the data set given.

First I looked at Eye Color, then Sex (the SEX column of the data set).

Lastly, I combined all of the data I thought would be relevant for finding distinguishing characteristics by plotting it in a Radar chart.

##### Grabbing data for Eye Color and Sex:

In [26]:
# Grabbing data for Eye color

dc_df['EYE'].unique()
#blue, brown, green, purple, black, white, red, 
#photocellular, hazel, amber, yellow, nan, grey, pink, violet, gold, orange, auburn hair

def find_color(x):
    if x == 'Blue Eyes':
        return 0
    elif x == 'Brown Eyes' or x == 'Hazel Eyes':
        return 1
    elif x == 'Green Eyes':
        return 2
    elif x == 'Purple Eyes' or x == 'Violet Eyes':
        return 3
    elif x == 'Black Eyes':
        return 4
    elif x == 'White Eyes':
        return 5
    elif x == 'Red Eyes':
        return 6
    elif x == 'Amber Eyes' or x == 'Orange Eyes':
        return 7
    elif x == 'Yellow Eyes' or x == 'Gold Eyes':
        return 8
    elif x == 'Grey Eyes':
        return 9
    elif x == 'Pink Eyes':
        return 10
    else:
        return -1
    
def find_sex(x):
    # returns the base index into sex_of_char; add 1 for a bad character
    if x == 'Male Characters':
        return 0 # index 0 = good male, index 1 = bad male
    elif x == 'Female Characters':
        return 2 # index 2 = good female, index 3 = bad female
    elif x == 'Genderless Characters':
        return 4 # "other"
    elif x == 'Transgender Characters':
        return 4 # "other"
    else:
        return -1

good_chr_eyes = [0,0,0,0,0,0,0,0,0,0,0]
bad_chr_eyes = [0,0,0,0,0,0,0,0,0,0,0]
neut_chr_eyes = [0,0,0,0,0,0,0,0,0,0,0]
eye_error = 0


sex_of_char = [0,0,0,0,0] # good male, bad male, good female, bad female, other
sex_error = 0
character_total = 0



for index, row in dc_df.iterrows():
    eyecolor = find_color(row['EYE'])
    sex_chr = find_sex(row['SEX'])
    if eyecolor == -1:
        eye_error += 1
    elif row['ALIGN'] == 'Good Characters':
        good_chr_eyes[eyecolor] += 1
    elif row['ALIGN'] == 'Bad Characters':
        bad_chr_eyes[eyecolor] += 1
    elif row['ALIGN'] == 'Neutral Characters':
        neut_chr_eyes[eyecolor] += 1
        
    if sex_chr == -1:
        sex_error += 1
    elif sex_chr == 4: # "other" is not split by alignment
        sex_of_char[sex_chr] += 1
        character_total += 1
    elif row['ALIGN'] == 'Good Characters':
        sex_of_char[sex_chr] += 1
        character_total += 1
    elif row['ALIGN'] == 'Bad Characters':
        sex_of_char[sex_chr+1] += 1
        character_total += 1
    # male/female characters that are neutral or have no alignment are not counted
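
A dictionary lookup is a more compact way to express the eye-color grouping above. The sketch below assumes the same groupings as find_color, but returns a label instead of a list index; the names `EYE_GROUPS` and `eye_group` and the four sample rows are invented for illustration and are not taken from the data set.

```python
from collections import Counter

# Same groupings as find_color above, keyed by the value in the EYE column
EYE_GROUPS = {
    "Blue Eyes": "Blue",
    "Brown Eyes": "Brown/Hazel",
    "Hazel Eyes": "Brown/Hazel",
    "Green Eyes": "Green",
    "Purple Eyes": "Purple/Violet",
    "Violet Eyes": "Purple/Violet",
    "Black Eyes": "Black",
    "White Eyes": "White",
    "Red Eyes": "Red",
    "Amber Eyes": "Orange/Amber",
    "Orange Eyes": "Orange/Amber",
    "Yellow Eyes": "Yellow/Gold",
    "Gold Eyes": "Yellow/Gold",
    "Grey Eyes": "Grey",
    "Pink Eyes": "Pink",
}


def eye_group(value):
    """Return the group label for an EYE value, or None if it is not tracked."""
    return EYE_GROUPS.get(value)


# Invented (EYE, ALIGN) pairs standing in for rows of the data frame
sample_rows = [
    ("Blue Eyes", "Good Characters"),
    ("Hazel Eyes", "Good Characters"),
    ("Red Eyes", "Bad Characters"),
    ("Auburn Hair", "Bad Characters"),  # not a tracked eye color, so it is skipped
]

counts = Counter()
for eye, align in sample_rows:
    group = eye_group(eye)
    if group is not None:
        counts[(align, group)] += 1

print(counts[("Good Characters", "Brown/Hazel")])  # 1
print(counts[("Bad Characters", "Red")])           # 1
```

Keying the counter by (alignment, group) replaces the three parallel lists with one structure, at the cost of an extra step to turn it back into lists for plotting.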


#### Graphing the data about the eye color
I also want to note that Pygal is very customizable. Every chart has a style attribute, which can be set to either one of the library's built-in styles or a user-defined style.

In this example you can toggle between the two lines where I define the pygal.HorizontalBar to compare 2 of the built-in styles.

You can see other built-in styles at: http://www.pygal.org/en/stable/documentation/builtin_styles.html

For general styling refer to: http://www.pygal.org/en/stable/documentation/styles.html

In [35]:
#built-in styles imports
from pygal.style import DarkStyle
from pygal.style import DarkSolarizedStyle

#Toggle here to view 2 different built-in types
eye_dot = pygal.HorizontalBar(style=DarkStyle)
#eye_dot = pygal.HorizontalBar(style=DarkSolarizedStyle)

eye_dot.title = 'Eye Color Association with Character Alignment'
eye_dot.x_labels = ['Blue', 'Brown/Hazel', 'Green', 'Purple/Violet', 'Black', 'White', 'Red', 'Orange/Amber', 'Yellow/Gold', 'Grey', 'Pink']
eye_dot.add('Bad',bad_chr_eyes)
eye_dot.add('Good',good_chr_eyes)
eye_dot.add('Neutral', neut_chr_eyes)
eye_dot.render_in_browser()

#notes about chart
    # red & black stand out for bad characters
    # brown/hazel & blue stand out for good characters

file:///var/folders/__/61br4kn94fg9cf63qy5qqx1r0000gn/T/tmpvuCdcp.html


#### Notes from chart:
Red and black eyes stand out for bad characters, while brown/hazel and blue eyes stand out for good characters.

##### Plotting Sex Data
One interesting thing I found in http://www.pygal.org/en/stable/documentation/types/pie.html is that you can mix and match some of the options of the different pie chart variants. Here I used both "half pie" and "inner radius" in one chart to show that this is accepted.

I did not test this with other chart types, but you may want to try it and see how much more is available.

In [36]:
# pie chart of characters by sex and alignment
from pygal.style import LightStyle

sex_orientation = pygal.Pie(inner_radius=.4, half_pie=True, style=LightStyle)
sex_orientation.title = 'Sex of Characters (in %)'
sex_orientation.add('Good Male', sex_of_char[0]/float(character_total)*100)
sex_orientation.add('Bad Male', sex_of_char[1]/float(character_total)*100)
sex_orientation.add('Good Female', sex_of_char[2]/float(character_total)*100)
sex_orientation.add('Bad Female', sex_of_char[3]/float(character_total)*100)
sex_orientation.add('Other Bad/Good', sex_of_char[4]/float(character_total)*100)
sex_orientation.render_in_browser()

file:///var/folders/__/61br4kn94fg9cf63qy5qqx1r0000gn/T/tmpizaft5.html
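
The percentage arithmetic repeated in the cell above can be factored into a small helper. This is a sketch with made-up counts, not the real totals from the data set, and `to_percentages` is a hypothetical name.

```python
def to_percentages(counts):
    """Convert a list of counts to percentages of their total."""
    total = sum(counts)
    if total == 0:
        # Avoid dividing by zero when nothing was counted
        return [0.0 for _ in counts]
    return [count * 100 / total for count in counts]


# Made-up counts in the order: good male, bad male, good female, bad female, other
example_counts = [30, 40, 20, 8, 2]
labels = ["Good Male", "Bad Male", "Good Female", "Bad Female", "Other Bad/Good"]

for label, percent in zip(labels, to_percentages(example_counts)):
    print(label, percent)
```

Each (label, percent) pair could then be passed to the chart's add() call in a loop instead of writing the division out five times.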


#### Notes from chart:
From this data I cannot conclude anything specific, except that the majority of the characters are male.


##### Grabbing data for Radar Chart:
In this chart I used the properties:
1. brown/blue eyes
2. living
3. male
4. Public ID
5. Red & Black Eyes
6. Deceased
7. Female
8. Secret ID

for Good Characters, Bad Characters, and Neutral Characters individually. This way I can compare the results for the three alignments and try to identify key features of each, to better guess what alignment a new character might have.

In [29]:
# gather information for the Radar Chart below

# meaning of each index
    # 0 - brown/blue eyes
    # 1 - living
    # 2 - male
    # 3 - Public ID
    # 4 - Red & Black Eyes
    # 5 - Deceased
    # 6 - Female
    # 7 - Secret ID

radar_good = [0,0,0,0,0,0,0,0]
radar_bad = [0,0,0,0,0,0,0,0]
radar_nuet = [0,0,0,0,0,0,0,0]

def fill_array(row, array):
    if row['EYE'] == 'Brown Eyes' or row['EYE'] == 'Blue Eyes':
        array[0] += 1
    if (row['ALIVE'] == 'Living Characters'):
        array[1] += 1
    if (row['SEX'] == 'Male Characters'):
        array[2] += 1
    if (row['ID'] == 'Public Identity'):
        array[3] += 1
    if row['EYE'] == 'Red Eyes' or row['EYE'] == 'Black Eyes':
        array[4] += 1
    if (row['ALIVE'] == 'Deceased Characters'):
        array[5] += 1
    if (row['SEX'] == 'Female Characters'):
        array[6] += 1
    if (row['ID'] == 'Secret Identity'):
        array[7] += 1

for index, row in dc_df.iterrows():
    if row['ALIGN'] == 'Good Characters':
        fill_array(row, radar_good)
    elif row['ALIGN'] == 'Bad Characters':
        fill_array(row, radar_bad)
    elif row['ALIGN'] == 'Neutral Characters':
        fill_array(row, radar_nuet)
        

#### Graphing the data onto a Radar Plot

In [37]:

# Style allows you to customize the style of the chart
from pygal.style import Style

# EXAMPLE of creating your own custom styling.
# source - http://www.pygal.org/en/stable/documentation/custom_styles.html
custom_style = Style(
    background='#464851',
    plot_background='transparent',
    foreground='#eaebed',
    foreground_strong='#c2c2c4',
    foreground_subtle='#c2c2c4',
    opacity='.4',
    opacity_hover='.9',
    transition='400ms ease-in',
    colors=('#ff3728', '#33cc33', '#ff9933')
)

char_radar = pygal.Radar(style=custom_style, fill=True)
char_radar.title = 'Character Radar Chart'
char_radar.x_labels = ['Brown & Blue Eyes', 'Living', 'Male', 'Public ID', 'Red & Black Eyes', 'Deceased', 'Female', 'Secret ID']
char_radar.add('Bad', radar_bad)
char_radar.add('Good', radar_good)
char_radar.add('Neutral', radar_nuet)
char_radar.render_in_browser()

file:///var/folders/__/61br4kn94fg9cf63qy5qqx1r0000gn/T/tmpSW4_jD.html
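
One caveat when reading the radar chart: it plots raw counts, so if the three alignment groups differ in size, the larger groups will dominate every axis. A possible refinement is to plot each count as a share of its own group. The sketch below uses made-up numbers; `as_share_of_group` and the group totals are assumptions, since the cells above do not track how many characters fall in each alignment.

```python
def as_share_of_group(feature_counts, group_total):
    """Express each feature count as a percentage of the group's size."""
    if group_total <= 0:
        raise ValueError("group_total must be positive")
    return [round(count * 100 / group_total, 1) for count in feature_counts]


# Made-up counts for two features (e.g. Living, Male) in groups of different size
large_group = as_share_of_group([600, 800], group_total=1000)
small_group = as_share_of_group([120, 160], group_total=200)

print(large_group)  # [60.0, 80.0]
print(small_group)  # [60.0, 80.0]
```

In this invented example both groups have the same profile once normalized, even though their raw counts differ by a factor of five.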


##### Conclusion from charts: 
I was unable to determine key characteristics that distinguish Good, Bad, and Neutral Characters from the data given. BOO :/


## Embedding Pygal into HTML pages

It's super easy to embed these charts/graphs into HTML pages. The simplest way is to export the chart/graph as an SVG file and then use the figure and embed tags as shown below.

source: http://www.pygal.org/en/stable/documentation/web.html

##### NOTE: You have to be sure to put "image/svg+xml"  in the type attribute for this to work properly

In [10]:
# this uses the example dot graph from: http://www.pygal.org/en/stable/documentation/types/dot.html
# (assumes it was saved as dot.svg next to the notebook)
from IPython.display import display, HTML
display(HTML('<!DOCTYPE html><html><head><title>Html Page with Pygal</title></head><body><h1>Html Page with Pygal</h1><figure><embed type="image/svg+xml" src="dot.svg" /></figure></body></html>'))
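
An alternative to pointing src at a file is to inline the SVG as a base64 data URI, so the page does not depend on a separate file. The sketch below uses only the standard library and a tiny hand-written SVG as a stand-in for a real chart; `svg_to_data_uri` and `build_page` are hypothetical helper names. Pygal charts also provide a render_data_uri() method that produces this kind of string directly.

```python
import base64


def svg_to_data_uri(svg_text):
    """Encode SVG markup as a base64 data URI usable in an embed tag."""
    encoded = base64.b64encode(svg_text.encode("utf-8")).decode("ascii")
    return "data:image/svg+xml;charset=utf-8;base64," + encoded


def build_page(svg_text, heading):
    """Return a minimal HTML page that embeds the SVG inline."""
    return (
        "<!DOCTYPE html><html><head><title>{0}</title></head>"
        "<body><h1>{0}</h1><figure>"
        '<embed type="image/svg+xml" src="{1}" />'
        "</figure></body></html>"
    ).format(heading, svg_to_data_uri(svg_text))


# Stand-in for the SVG text of a real chart
sample_svg = '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"></svg>'
page = build_page(sample_svg, "Html Page with Pygal")
print(page[:60])
```

The type attribute is still "image/svg+xml", as in the note above; only the src value changes.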

### Sources:
1. https://github.com/fivethirtyeight/data/blob/master/comic-characters/dc-wikia-data.csv
2. http://www.pygal.org/en/stable/index.html
3. https://anaconda.org/anaconda/lxml