
Wind turbines are one of several [energy sources](https://www.theatlantic.com/video/index/278324/how-much-energy-does-the-us-use/) tied to the future of supplying the [massive thirst for electricity](https://en.wikipedia.org/wiki/List_of_countries_by_electricity_consumption) in the US.

## Questions
* What states are producing the most or least wind energy? 
* Which ones have invested in the largest wind turbines? 
* Could any US state ever become self-sufficient on wind energy, like [Scotland](https://www.independent.co.uk/environment/scotland-renewable-wind-energy-power-electricity-three-million-homes-118-per-cent-of-households-a7855846.html)? Are any even near 100%?

[US Wind Turbine Database](https://eerscmap.usgs.gov/uswtdb/)

## Columns 
* name - definition - type
* case_id - uswtdb id - Numeric
* faa_ors - unique identifier for each turbine for cross-reference to the FAA digital obstacle files (FAA DOF) - String
* faa_asn - obstruction evaluation airport airspace analysis (oe-aaa) aeronautical study number (asn) - String
* usgs_pr_id - unique, stable object number for cross-reference - Numeric
* t_state - US State - String
* t_county - US County - String
* t_fips - state and county fips (a 5 digit code) where turbine is located, based on spatial join of turbine points with US state and county shapefile. - Numeric
* p_name - Wind project name - String
* p_year - Year wind project online - Numeric
* p_tnum - number of turbines in the wind power project - Numeric
* p_cap - Project total capacity - Numeric
* t_manu - Turbine manufacturer - String
* t_model - Turbine model - String
* t_cap - Turbine capacity [kW] - Numeric
* t_hh - Turbine hub height [m] - Numeric
* t_rd - Turbine rotor diameter [m] - Numeric
* t_rsa - Turbine rotor swept area [m2] - Numeric
* t_ttlh - Turbine tip height [m] - Numeric
* t_conf_atr - Level of confidence in the turbine's attributes, from low to high - Numeric
* t_conf_loc - Level of confidence in turbine location, from low to high - Numeric
* t_img_date - date of image used to visually verify turbine location (note if NAIP is the image source the month and day were set to 01/01) - DateTime
* t_img_srce - source of image used to visually verify turbine location - String
* xlong - Turbine longitude - Numeric
* ylat - Turbine latitude - Numeric
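
The geometric columns are related to each other, which is handy for sanity-checking rows. Here is a minimal sketch, assuming `t_rsa` is the area of the circle swept by the blades and `t_ttlh` is hub height plus one blade length; the database's own documentation is the authority on how those columns are actually derived. The function names and the example turbine are made up for illustration.

```python
import math

def rotor_swept_area_m2(rotor_diameter_m):
    """Area of the circle swept by the blades: pi * (d / 2) ** 2."""
    return math.pi * (rotor_diameter_m / 2) ** 2

def tip_height_m(hub_height_m, rotor_diameter_m):
    """Hub height plus one blade length (half the rotor diameter)."""
    return hub_height_m + rotor_diameter_m / 2

# Hypothetical turbine: 80 m hub height (t_hh), 100 m rotor diameter (t_rd).
area = rotor_swept_area_m2(100.0)
tip = tip_height_m(80.0, 100.0)
print(round(area, 1), tip)
```

Comparing these computed values against `t_rsa` and `t_ttlh` is one way to spot rows where the attributes disagree.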

In [13]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

import os
print(os.listdir("../input"))

# Any results you write to the current directory are saved as output.

In [14]:
# Load
wind = pd.read_csv("../input/uswtdb_v1_0_20180419.csv")   


state_code_to_name = {
    'AK': 'Alaska',
    'AL': 'Alabama',
    'AR': 'Arkansas',
    'AZ': 'Arizona',
    'CA': 'California',
    'CO': 'Colorado',
    'CT': 'Connecticut',
    'DC': 'District of Columbia',
    'DE': 'Delaware',
    'FL': 'Florida',
    'GA': 'Georgia',
    'GU': 'Guam',
    'HI': 'Hawaii',
    'IA': 'Iowa',
    'ID': 'Idaho',
    'IL': 'Illinois',
    'IN': 'Indiana',
    'KS': 'Kansas',
    'KY': 'Kentucky',
    'LA': 'Louisiana',
    'MA': 'Massachusetts',
    'MD': 'Maryland',
    'ME': 'Maine',
    'MI': 'Michigan',
    'MN': 'Minnesota',
    'MO': 'Missouri',
    'MS': 'Mississippi',
    'MT': 'Montana',
    'NC': 'North Carolina',
    'ND': 'North Dakota',
    'NE': 'Nebraska',
    'NH': 'New Hampshire',
    'NJ': 'New Jersey',
    'NM': 'New Mexico',
    'NV': 'Nevada',
    'NY': 'New York',
    'OH': 'Ohio',
    'OK': 'Oklahoma',
    'OR': 'Oregon',
    'PA': 'Pennsylvania',
    'PR': 'Puerto Rico',
    'RI': 'Rhode Island',
    'SC': 'South Carolina',
    'SD': 'South Dakota',
    'TN': 'Tennessee',
    'TX': 'Texas',
    'UT': 'Utah',
    'VA': 'Virginia',
    'VT': 'Vermont',
    'WA': 'Washington',
    'WI': 'Wisconsin',
    'WV': 'West Virginia',
    'WY': 'Wyoming'
}

wind['state'] = wind['t_state'].apply(lambda x: state_code_to_name[x])
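
The `apply` with a dict lookup above raises a `KeyError` if the data ever contains a code that is missing from `state_code_to_name`. A possible alternative, sketched on a toy Series (the `'XX'` code and the shortened lookup are made up), is `Series.map`, which leaves unknown codes as `NaN` so they can be filled with the raw code:

```python
import pandas as pd

# Toy stand-in for wind['t_state']; 'XX' is a made-up code not in the lookup.
codes = pd.Series(['IA', 'TX', 'XX'])
lookup = {'IA': 'Iowa', 'TX': 'Texas'}  # shortened stand-in for state_code_to_name

# Series.map leaves unmapped codes as NaN; fillna falls back to the raw code.
names = codes.map(lookup).fillna(codes)
print(names.tolist())
```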

In [15]:
# Clean
print("null?: ", wind.isnull().values.any())
print("null count: ", wind.isnull().sum().sum())
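
The two prints above only say whether nulls exist and how many there are in total. To see which columns they sit in, a per-column count can help. This is a sketch on a made-up frame; on the real data the same two lines would run against `wind`.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for a few columns of the turbine data.
toy = pd.DataFrame({
    't_cap': [1500.0, np.nan, 2000.0],
    't_rd': [77.0, np.nan, np.nan],
    'p_name': ['A', 'B', 'C'],
})

# Nulls per column, keeping only columns that have any, worst first.
null_counts = toy.isnull().sum()
null_counts = null_counts[null_counts > 0].sort_values(ascending=False)
print(null_counts)
```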

In [16]:
# I want to count the number of models and manufacturers
# https://stackoverflow.com/questions/29791785/python-pandas-add-a-column-to-my-dataframe-that-counts-a-variable
df = wind
df['model_count'] = df.groupby('t_model')['t_model'].transform('count')
df['manu_count'] = df.groupby('t_manu')['t_manu'].transform('count')
df_set = df[['state','p_name','p_year','t_manu','manu_count','t_model','model_count','t_cap','p_cap','t_rd','xlong','ylat']]
# Keep only turbines in projects that came online after 2016
wind_set_two_years = df_set.query('p_year>2016')
wind_set_two_years.head()
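
One thing to keep in mind about `transform('count')` as used above: it counts rows across the whole dataset, so every turbine of a given model gets the same number no matter which state it is in. A small sketch with made-up rows shows the difference between that and a per-state count (the `model_count_in_state` column is my own name, not part of the dataset):

```python
import pandas as pd

toy = pd.DataFrame({
    'state': ['Iowa', 'Iowa', 'Texas', 'Texas'],
    't_model': ['M1', 'M2', 'M1', 'M1'],
})

# Count over the whole frame: every 'M1' row gets 3, regardless of state.
toy['model_count'] = toy.groupby('t_model')['t_model'].transform('count')

# Count within each state instead.
toy['model_count_in_state'] = (
    toy.groupby(['state', 't_model'])['t_model'].transform('count')
)
print(toy)
```

Summing or averaging the dataset-wide `model_count` per state therefore mixes in turbines from other states.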

In [17]:
# I want to roll up the states 
# https://chrisalbon.com/python/data_wrangling/pandas_apply_operations_to_groups/
groupby_state = df['manu_count'].groupby(df['state'])

# groupby_state.mean()
groupby_state.describe()
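
For the first question (which states have the most or least wind capacity), a roll-up of `t_cap` per state may be more direct than describing `manu_count`. A minimal sketch on made-up numbers; note this measures installed nameplate capacity in kW, not energy actually produced:

```python
import pandas as pd

toy = pd.DataFrame({
    'state': ['Iowa', 'Iowa', 'Texas', 'Texas', 'Texas'],
    't_cap': [1500, 2000, 2300, 2300, 1500],  # kW, made-up values
})

# Turbine count and total capacity per state, largest capacity first.
by_state = (
    toy.groupby('state')['t_cap']
       .agg(turbines='count', capacity_kw='sum')
       .sort_values('capacity_kw', ascending=False)
)
print(by_state)
```

On the real frame, `by_state.head()` and `by_state.tail()` would give the top and bottom states.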

In [None]:
# model_count must be wrong?! why do so m
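# iplot is needed by this cell, so import it here rather than in a later cell
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode()

aggs = ["count","sum","avg","median","mode","rms","stddev","min","max","first","last"]

# One dropdown button per aggregation; each button restyles the
# first aggregation of the first transform on the trace.
agg_func = []
for name in aggs:
    agg_func.append(dict(
        args=['transforms[0].aggregations[0].func', name],
        label=name,
        method='restyle'
    ))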

data = [dict(
  type = 'bar',
  x = wind_set_two_years.state,
  y = wind_set_two_years.model_count,
  transforms = [dict(
    type = 'aggregate',
    groups = wind_set_two_years.state,
    aggregations = [dict(
        target = 'y', func = 'count', enabled = True)
    ]
  )]
)]

layout = dict(
  title = '<b>Plotly Aggregations</b><br>use dropdown to change aggregation',
  xaxis = dict(title = 'State'),
  yaxis = dict(title = 'model_count (aggregated)', range = [0,1500]),
  updatemenus = [dict(
        x = 0.85,
        y = 1.15,
        xref = 'paper',
        yref = 'paper',
        yanchor = 'top',
        active = 1,
        showactive = False,
        buttons = agg_func
  )]
)

iplot({'data': data,'layout': layout}, validate=False)
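
The aggregation done inside the chart can also be computed in pandas first, which makes the numbers easy to inspect before plotting. This is a sketch with made-up values that covers only some of the dropdown's functions; plotly's `'avg'` corresponds to pandas' `'mean'`.

```python
import pandas as pd

toy = pd.DataFrame({
    'state': ['Iowa', 'Iowa', 'Texas'],
    'model_count': [10, 30, 5],  # made-up values
})

# One row per state, one column per aggregation function.
summary = toy.groupby('state')['model_count'].agg(
    ['count', 'sum', 'mean', 'median', 'min', 'max']
)
print(summary)
```

Any column of `summary` can then be passed directly as the `y` values of a plain bar trace, with `summary.index` as `x`.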

In [19]:
# Visualize
# Only the offline helpers are used below
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode()

scl = [ [0,"rgb(5, 10, 172)"],[0.35,"rgb(40, 60, 190)"],[0.5,"rgb(70, 100, 245)"],\
    [0.6,"rgb(90, 120, 245)"],[0.7,"rgb(106, 137, 247)"],[1,"rgb(220, 220, 220)"] ]

data = [ dict(
        type = 'scattergeo',
        locationmode = 'USA-states',
        lon = wind_set_two_years['xlong'],
        lat = wind_set_two_years['ylat'],
        text = wind_set_two_years['p_name'] + ' | Rotor Diameter: ' + wind_set_two_years['t_rd'].astype(str),
        mode = 'markers',
        marker = dict(
            size = wind_set_two_years['t_rd'] / 8,
            opacity = 0.9,
            reversescale = True,
            autocolorscale = False,
            symbol = 'circle',
            line = dict(
                width=1,
                color='rgb(102, 102, 102)'
            ),
            colorscale = scl,
            cmin = 0,
            color = wind_set_two_years['p_cap'],
            cmax = wind_set_two_years['p_cap'].max(),
            colorbar=dict(
                title="Power Capacity"
            )
        ))]

layout = dict(
        title = 'US Wind Turbines Online After 2016 <br><span style="font-size:12px">(Circle shows Rotor Size - Hover for wind project names)</span>',
        geo = dict(
            scope='usa',
            projection=dict( type='albers usa' ),
            showland = True,
            landcolor = "rgb(250, 250, 250)",
            subunitcolor = "rgb(217, 217, 217)",
            countrycolor = "rgb(217, 217, 217)",
            countrywidth = 0.5,
            subunitwidth = 0.5
        ),
    )

fig = dict( data=data, layout=layout )
iplot( fig, validate=False, filename='us-windturbines' )


In [20]:
# any states close to providing enough wind energy to power their entire state?
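
The dataset only has nameplate capacity, not generation or consumption, so this question cannot be answered from it alone. A rough sketch of the arithmetic, assuming a capacity factor and a state consumption figure taken from some other source: every number below is a placeholder, not real data, and the function names are my own.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours

def annual_energy_mwh(capacity_kw, capacity_factor):
    """Estimated yearly output in MWh for a given nameplate capacity in kW."""
    return capacity_kw * capacity_factor * HOURS_PER_YEAR / 1000

def wind_share(capacity_kw, capacity_factor, consumption_mwh):
    """Fraction of yearly consumption that the estimated output would cover."""
    return annual_energy_mwh(capacity_kw, capacity_factor) / consumption_mwh

# Placeholder inputs, NOT real figures for any state:
# 1,000,000 kW installed, 35% capacity factor, 10,000,000 MWh consumed per year.
share = wind_share(capacity_kw=1_000_000, capacity_factor=0.35,
                   consumption_mwh=10_000_000)
print(f"{share:.1%}")
```

With the per-state sum of `t_cap` as `capacity_kw`, a share near 1.0 would indicate a state whose turbines could in principle cover its consumption over a year, under the assumed capacity factor.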



## Insights

Among the turbines in projects that came online after 2016 (the `p_year > 2016` subset mapped above), most were built near the middle of the US. I expected to see more along the coastline, which I assumed would get more wind.