# Data Visualization Project

Open:FactSet Insights and Analysis team.  

The team's mission is to educate, inspire, and empower FactSetters and clients to utilize Open:FactSet Data and Solutions.  

As part of the team, you will be tasked with creating and communicating compelling stories for Marketplace content.  Secondarily, the team as a whole will need to develop best practices and methods for disseminating knowledge both internally and externally.  For the interview, please prepare a 20-30 minute presentation covering:

1. For an audience of CTS Sales, create a presentation showcasing a FactSet content set using Python, R, or SQL.  The goal is to educate them on potential applications, how a given technology is applied, and how to pitch this to their client base. 
    - Python, Jupyter Notebook, OnDemand?
    - Unique Datasets - RBICs + SCG Data
    - Exploratory Data Analysis
    - Hans Rosling Chart


        
2. Present an idea of how the Insights and Analysis team can approach educating, inspiring, or empowering FactSetters and clients in FY19.  
    - Code-along programs
    - Package together internet resources for people interested in learning
    - Newsletters + sharing of code, methodology, and data.  Make sure the documentation is error-free
    - GitHub for sharing and distributing across the firm
    - Bring all teams working on Machine Learning into an initiative to teach Data Science and Analysis techniques across the firm
    - Phase 1 team - Jupyter, Publish in Marketplace, Story-telling with EDA
    - Phase 2 team - d3.js - more interactive data visualization tools
    - Conferences: Jupyter Conference, Tableau Conference, etc.
    - FactSet Surveys to collect unique content


Create a Hans Rosling Chart.

Hypothesis:
- Analyze the impact of ESG Data, Market Cap, and Revenue on S&P 500 stocks against other companies in the same Revere classification
- Do the rich get richer while the poor get poorer?
- Hypothesize that the growth of FAANG has created areas of growth for other companies in their Revere Sectors


Extract and Perform EDA
- Scatter Plot data points = Company
- X = **Revenue**, Growth Rate
- Y = # of Employees, Revenue, **GDPR**

- Time Series - plotted individually
- Size = Market Capitalization
- Color = Revere Classification


In [3]:
import matplotlib.pyplot as plt
import numpy as np  # needed for np.array() in the scatter call below

In [2]:
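# Sample Code from DataCamp Program
# Note: gdp_cap, life_exp, pop, and col are assumed to be lists preloaded
# in the DataCamp workspace; they are not defined in this notebook.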

# Scatter plot
plt.scatter(x = gdp_cap, y = life_exp, s = np.array(pop) * 2, c = col, alpha = 0.8)

# Previous customizations
plt.xscale('log') 
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000,10000,100000], ['1k','10k','100k'])

# Additional customizations
plt.text(1550, 71, 'India')
plt.text(5700, 80, 'China')

# Add grid() call
plt.grid(True)

# Show the plot
plt.show()


**Size**
- Right now, the scatter plot is just a cloud of blue dots, indistinguishable from each other. Let's change this. Wouldn't it be nice if the size of the dots corresponded to the population?

- To accomplish this, there is a list `pop` loaded in your workspace. It contains population numbers for each country expressed in millions. You can see that this list is passed to the scatter method as the argument `s`, for size.
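
The `np.array(pop) * 2` conversion in the sample code matters because multiplying a plain Python list repeats it rather than scaling it. A minimal sketch of the difference follows; the population values are hypothetical placeholders, not the actual Gapminder figures.

```python
import numpy as np

# Hypothetical population figures in millions (illustrative values only).
pop = [31.9, 3.6, 33.3]

# Multiplying a list by 2 repeats its elements instead of scaling them.
repeated = pop * 2           # six elements: the list twice over

# Converting to a NumPy array first scales each value element-wise.
sizes = np.array(pop) * 2    # three elements, each doubled

print(len(repeated), len(sizes))
```

matplotlib interprets `s` as marker area in points squared, so doubling a value doubles the dot's area, not its diameter.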

**Color**
- The next step is making the plot more colorful! To do this, a list `col` has been created for you. It's a list with a color for each corresponding country, depending on the continent the country is part of.

- How did we make the list `col`, you ask? The Gapminder data contains a list `continent` with the continent each country belongs to. A dictionary is constructed that maps continents onto colors:

In [None]:

# Renamed from `dict` so the built-in dict type is not shadowed
continent_colors = {
    'Asia':'red',
    'Europe':'green',
    'Africa':'blue',
    'Americas':'yellow',
    'Oceania':'black'
}
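
One way the list `col` could be derived from such a mapping is a list comprehension over `continent`. This is a sketch only: the short `continent` sample and the name `continent_colors` are assumptions for illustration, and the DataCamp exercise may build the list differently.

```python
# Hypothetical sample of the continent list (the real list has one entry per country).
continent = ['Asia', 'Europe', 'Africa', 'Americas', 'Oceania', 'Asia']

# Mapping from continent to color, as described in the text.
continent_colors = {
    'Asia': 'red',
    'Europe': 'green',
    'Africa': 'blue',
    'Americas': 'yellow',
    'Oceania': 'black',
}

# Look up the color for each country's continent, preserving order.
col = [continent_colors[c] for c in continent]
print(col)
```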

In [1]:
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd

In [2]:

data = Path('.', 'assets', 'rbics_esg_data.csv')
hans_rosling = pd.read_csv(data)

In [3]:
hans_rosling

Unnamed: 0,Symbol,Name,Date,mkt_val,rbics_econn,rbics_econ,rbics_sectn,rbics_sect,rbics_subsectn,rbics_subsect,rbics_indgrpn,rbics_indgrp,rbics_indn,rbics_ind,rbics_subindn,rbics_subind,msci_esg_env,msci_esg_gov,msci_esg_social
0,MMM,3M Company,12/31/2007,59796.03653,45,Non-Energy Materials,4520,Manufactured Products,452020,Other Materials,45202015,Diversified Materials,4520201510,Diversified Materials,4.5202E+11,Diversified Materials,4.260000229,5.75,6.610000134
1,ABT,Abbott Laboratories,12/31/2007,87027.45436,35,Healthcare,3510,Biopharmaceuticals,351015,Other Biopharmaceuticals,35101540,Diversified Biopharmaceuticals,3510154010,Diversified Biopharmaceuticals,3.51015E+11,Diversified Biopharmaceuticals,5.690000057,5.329999924,5.260000229
2,ANF,Abercrombie & Fitch Co. Class A,12/31/2007,6881.25856,20,Consumer Cyclicals,2025,Consumer Retail,202510,Apparel and Accessories Retail,20251020,Apparel Retail,2025102010,Casual and Specialty Apparel Retail,2.0251E+11,Teen/Young Adults Apparel and Accessories Retail,2.970000029,4.119999886,3.180000067
3,CB,ACE Limited,12/31/2007,20369.14593,30,Finance,3015,Insurance,301510,Insurance,30151020,Property and Casualty Insurance,3015102010,Commercial Insurance,3.0151E+11,Diversified Commercial Insurance,4.989999771,5.369999886,6.130000114
4,ADBE,Adobe Systems Incorporated,12/31/2007,24416.30657,55,Technology,5520,Software and Consulting,552015,Software,55201510,Design and Engineering Software,5520151025,Specialized Design and Engineering Software,5.52015E+11,Multimedia Design and Engineering Software,5.5,6.409999847,7.96999979
5,AMD,"Advanced Micro Devices, Inc.",12/31/2007,4492.5,55,Technology,5510,Electronic Components and Manufacturing,551020,Semiconductor Manufacturing,55102030,Processor Semiconductors,5510203025,Microprocessor Semiconductors,5.5102E+11,Microprocessor (MPU) Semiconductors,8.5,8.329999924,9.279999733
6,AES,AES Corporation,12/31/2007,14338.5695,65,Utilities,6510,Utilities,651010,Energy Utilities,65101025,Wholesale Power Generation and Marketing,6510102520,Other International Wholesale Power,6.5101E+11,Multinational Wholesale Power,4.409999847,3.930000067,5.039999962
7,AET,Aetna Inc.,12/31/2007,28651.399,35,Healthcare,3515,Healthcare Services,351510,Healthcare Support Services,35151010,Health Plan Providers,3515101020,Other Managed Care,3.5151E+11,Other Managed Care,4.519999981,9,6.889999866
8,ACS,Affiliated Computer Services Inc. Cl A,12/31/2007,4328.4274,10,Business Services,1010,Business Services,101015,Other Professional Services,10101515,Consulting/Business Process Outsourcing Services,1010151510,Business Process Outsourcing Services,1.01015E+11,Diverse Business Process Outsourcing Services,3.039999962,2.519999981,5.869999886
9,AFL,Aflac Incorporated,12/31/2007,30471.3739,30,Finance,3015,Insurance,301510,Insurance,30151015,Life and Health Insurance,3015101515,Supplemental Health Insurance,3.0151E+11,Other Supplemental Health Insurance,4.489999771,4.150000095,4.829999924
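
The encoding planned earlier (size = market capitalization, color = Revere classification) could be sketched against this data roughly as follows. This is a hedged sketch, not the final chart: it embeds a few rows and columns copied from the output above so that it runs on its own. It uses the two MSCI ESG score columns as stand-in axes because revenue and employee counts are not in this extract, and the size divisor and `tab10` palette are arbitrary choices. Passing `index_col=0` is one way to avoid the `Unnamed: 0` column, assuming that column is just a saved row index.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# A few rows and columns copied from the output above; the full file has more of both.
sample_csv = io.StringIO(
    ',Symbol,Name,Date,mkt_val,rbics_econ,msci_esg_env,msci_esg_gov,msci_esg_social\n'
    '0,MMM,3M Company,12/31/2007,59796.03653,Non-Energy Materials,4.260000229,5.75,6.610000134\n'
    '1,ABT,Abbott Laboratories,12/31/2007,87027.45436,Healthcare,5.690000057,5.329999924,5.260000229\n'
    '4,ADBE,Adobe Systems Incorporated,12/31/2007,24416.30657,Technology,5.5,6.409999847,7.96999979\n'
    '5,AMD,"Advanced Micro Devices, Inc.",12/31/2007,4492.5,Technology,8.5,8.329999924,9.279999733\n'
    '9,AFL,Aflac Incorporated,12/31/2007,30471.3739,Finance,4.489999771,4.150000095,4.829999924\n'
)

# index_col=0 reuses the saved row index instead of creating an "Unnamed: 0" column;
# parse_dates converts the Date strings to timestamps.
df = pd.read_csv(sample_csv, index_col=0, parse_dates=['Date'])

# Color = Revere (RBICS) economy: assign one palette color per distinct economy.
economies = sorted(df['rbics_econ'].unique())
palette = plt.get_cmap('tab10').colors
color_map = {econ: palette[i % len(palette)] for i, econ in enumerate(economies)}
colors = df['rbics_econ'].map(color_map).tolist()

# Size = market capitalization, scaled down so bubbles stay readable
# (the divisor of 100 is an arbitrary choice).
sizes = df['mkt_val'] / 100

fig, ax = plt.subplots()
ax.scatter(df['msci_esg_env'], df['msci_esg_social'], s=sizes, c=colors, alpha=0.8)
ax.set_xlabel('msci_esg_env')
ax.set_ylabel('msci_esg_social')
ax.set_title('ESG scores by RBICS economy (sample rows)')
ax.grid(True)
```

Swapping the axis columns for revenue or employee counts, once those fields are added to the extract, would bring the sketch in line with the X and Y choices listed in the EDA plan.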
