<a href="https://colab.research.google.com/github/buckley2/ME364/blob/main/Homework_III.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Homework III - ME 364 (Spring 2022)

For this homework assignment, you will use <u>part</u> of the United States Wind Turbine Database (USWTDB), which contains the locations of land-based and offshore wind turbines in the United States, the corresponding wind project information, and turbine technical specifications (for more information, see https://eerscmap.usgs.gov/uswtdb). The dataset is in the same zip file as this notebook. The variables in the dataset are:

- **case_id**: Unique uswtdb id
- **faa_ors**: Federal Aviation Administration digital obstacle file (dof) for obstacle repository system (ors)
- **faa_asn**: Federal Aviation Administration obstruction evaluation - airport airspace analysis (oe-aaa) aeronautical study number (asn)
- **usgs_pr_id**: United States Geological Survey id from prior turbine dataset
- **eia_id**: Energy Information Administration plant id from eia form 860
- **t_state**: State where turbine is located
- **t_county**: County where turbine is located
- **t_fips**: State and county fips where turbine is located
- **p_name**: Project name
- **p_year**: Year project became operational
- **p_tnum**: Number of turbines in project
- **p_cap**: Project capacity (MW)
- **t_manu**: Turbine original equipment manufacturer
- **t_model**: Turbine model
- **t_cap**: Turbine capacity (kW)
- **t_hh**: Turbine hub height (meters)
- **t_rd**: Turbine rotor diameter (meters)
- **t_rsa**: Turbine rotor swept area (meters^2)
- **t_ttlh**: Turbine total height - calculated (meters)
- **t_conf_atr**: Turbine characteristic confidence (0-3)
- **t_conf_loc**: Location confidence (0-3)
- **t_img_date**: Date of image used to visually verify turbine location
- **t_img_srce**: Source of image used to visually verify turbine location
- **xlong**: Longitude (decimal degrees - NAD 83 datum)
- **ylat**: Latitude (decimal degrees - NAD 83 datum)
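
Some of these variables are related to one another. As a small sanity check, the sketch below assumes that the rotor swept area `t_rsa` is the area of a circle whose diameter is the rotor diameter `t_rd`. The helper name `rotor_swept_area` is made up for this illustration, and the diameters are the `t_rd` values that appear in the first rows of the dataset.

```python
import math


def rotor_swept_area(rotor_diameter_m: float) -> float:
    """Area (m^2) of the circle swept by a rotor of the given diameter (m)."""
    return math.pi * (rotor_diameter_m / 2) ** 2


# Rotor diameters (t_rd, in meters) from the first rows of the dataset
for t_rd in (82.5, 77.0, 80.0):
    print(f"t_rd = {t_rd} m -> swept area = {rotor_swept_area(t_rd):.2f} m^2")
```

Under that assumption, the computed areas can be compared directly with the `t_rsa` column to spot inconsistent rows.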

<font color='red'>__IMPORTANT NOTE:__</font> _For all plots, make sure that the axes and every variable shown on the plot are properly named (not the default abbreviations used in the dataset) and include units wherever the variable has one._

In [15]:
import pandas as pd
import plotly.express as px  
import numpy as np
import folium
from google.colab import files

<font color='blue'>__(1)__</font>  Import the data to the notebook. How many entries (i.e., rows) do we have in this dataset? Show the first five rows of the dataset.

In [22]:
#Uploading data
url = 'https://raw.githubusercontent.com/buckley2/ME364/main/Data%20Files/USWTDB_v3.csv'  
df = pd.read_csv(url)

#Printing information
print('There are',len(df.index), 'rows in this dataset.') #prints row count
df.head(5) #prints 5 rows

There are 21826 rows in this dataset.


| Unnamed: 0 | case_id | faa_ors | faa_asn | usgs_pr_id | eia_id | t_state | t_county | t_fips | p_name | p_year | ... | t_hh | t_rd | t_rsa | t_ttlh | t_conf_atr | t_conf_loc | t_img_date | t_img_srce | xlong | ylat |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 3046335 | 25-025116 | 2013-WTE-5773-OE | 26722.0 | 58661.0 | MA | Barnstable County | 25001 | 6th Space Warning Squadron | 2013.0 | ... | 80.0 | 82.5 | 5345.62 | 121.3 | 3 | 3 | 9/1/2019 | Digital Globe | -70.545303 | 41.754192 |
| 1 | 3046262 | 25-025115 | 2013-WTE-5497-OE | 26723.0 | 58661.0 | MA | Barnstable County | 25001 | 6th Space Warning Squadron | 2013.0 | ... | 80.0 | 82.5 | 5345.62 | 121.3 | 3 | 3 | 9/1/2019 | Digital Globe | -70.541801 | 41.752491 |
| 2 | 3039278 | 25-022038 | 2011-WTE-7517-OE | 26677.0 | 57253.0 | MA | Barnstable County | 25001 | AFCEE MMR Turbines | 2011.0 | ... | 80.0 | 77.0 | 4656.63 | 118.6 | 3 | 3 | 9/1/2019 | Digital Globe | -70.547798 | 41.75959 |
| 3 | 3039277 | 25-022039 | 2011-WTE-7516-OE | 26676.0 | 57253.0 | MA | Barnstable County | 25001 | AFCEE MMR Turbines | 2011.0 | ... | 80.0 | 77.0 | 4656.63 | 118.6 | 3 | 3 | 9/1/2019 | Digital Globe | -70.545303 | 41.757591 |
| 4 | 3014014 | 39-003863 | 2003-AGL-6902-OE | 35938.0 | 56226.0 | OH | Wood County | 39173 | AMP-Ohio/Green Mountain Energy Wind Farm | 2004.0 | ... | 78.0 | 80.0 | 5026.55 | 118.0 | 3 | 3 | 6/11/2017 | Digital Globe | -83.736298 | 41.382492 |
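
The `Unnamed: 0` column in the output suggests that the CSV was saved with its row index as an unnamed first column. The sketch below shows one possible way to read such a file without the extra column, and an alternative way to count rows with `DataFrame.shape`. The small in-memory CSV is a stand-in for the real file and keeps only two of its columns.

```python
import io

import pandas as pd

# Stand-in for the real CSV: the empty first header mimics a saved row index
csv_text = ",case_id,t_state\n0,3046335,MA\n1,3046262,MA\n2,3014014,OH\n"

# Default read: the unnamed first column shows up as 'Unnamed: 0'
raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns.tolist())

# index_col=0 uses that first column as the row index instead
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)
n_rows, n_cols = clean.shape
print(f"{n_rows} rows, {n_cols} columns")
```

Whether the extra column matters depends on the analysis; it is harmless for counting rows but adds clutter to `df.head()`.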


<font color='blue'>__(2)__</font> Provide a bubble chart representing the turbine capacity versus the turbine rotor diameter, with the size of the markers representing the project capacity. Show the turbine characteristic confidence on your plot as well. Do not forget to use the `labels` option to properly name all the variables.

In [8]:
# Bubble chart: marker size shows project capacity, color shows attribute confidence
px.scatter(df, x='t_cap', y='t_rd', size='p_cap', color='t_conf_atr', size_max=50,
           # readable axis and legend labels with units (p_cap is in MW)
           labels={'t_cap':'Turbine Capacity [kW]','t_rd':'Turbine Rotor Diameter [m]',
                   'p_cap':'Project Capacity [MW]','t_conf_atr':'Turbine Characteristic Confidence (0-3)'},
           title='United States Wind Turbine Sizing',
           width=1500,  # chart size in pixels
           height=800)
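
Plotly Express can raise an error when the column passed to `size` contains missing values. It is not confirmed here that `p_cap` has any gaps in this dataset, so the sketch below is only a precaution: it uses hypothetical rows to show how the plotted columns could be filtered with `dropna` before calling `px.scatter`.

```python
import pandas as pd

# Hypothetical rows; None marks a missing value as it might appear in the CSV
toy = pd.DataFrame({
    "t_cap": [1500.0, 2000.0, None, 1800.0],
    "t_rd": [82.5, 90.0, 77.0, 80.0],
    "p_cap": [3.0, None, 3.0, 7.2],
    "t_conf_atr": [3, 3, 2, 3],
})

# Keep only rows where every plotted column has a value
plot_cols = ["t_cap", "t_rd", "p_cap", "t_conf_atr"]
plot_df = toy.dropna(subset=plot_cols)
print(len(toy), "rows before,", len(plot_df), "rows after dropping gaps")
```

The filtered frame (`plot_df` here) would then take the place of `df` in the `px.scatter` call.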

<font color='blue'>__(3)__</font> Create a map centered on the US (look up the US latitude and longitude) and represent the locations of the wind turbine projects on the map. Use green circles to represent each project and make sure that the project name is shown as a popup for each project. Save the map as an html file and submit it along with your notebook for this assignment.

In [17]:
# Coordinates used to center the map on the US (decimal degrees)
us_lat = 37.09024
us_lon = -95.712891

# Create the base map
turb_map = folium.Map(location=[us_lat, us_lon], zoom_start=6, tiles='Stamen Terrain')

# Add one green circle per row, with the project name as the popup
for lat, lon, label in zip(df['ylat'], df['xlong'], df['p_name']):
    folium.CircleMarker(
        location=[lat, lon],
        radius=5,
        color='green',       # outline color (was red; the task asks for green circles)
        fill=True,
        fill_color='green',
        fill_opacity=0.6,
        popup=label
    ).add_to(turb_map)

#Downloading graph
turb_map.save('turb_map.html') 
files.download('turb_map.html')

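
The dataset has one row per turbine, so the loop above draws one circle per turbine rather than one per project. If a single marker per project is preferred, one option is to average the turbine coordinates within each project first. The sketch below does this on a few rows taken from the head of the dataset. It assumes that `p_name` identifies a project uniquely, which may not hold across the full data.

```python
import pandas as pd

# A few turbine rows (values from the first rows of the dataset)
toy = pd.DataFrame({
    "p_name": ["6th Space Warning Squadron", "6th Space Warning Squadron",
               "AFCEE MMR Turbines", "AFCEE MMR Turbines"],
    "ylat": [41.754192, 41.752491, 41.75959, 41.757591],
    "xlong": [-70.545303, -70.541801, -70.547798, -70.545303],
})

# One row per project: mean latitude and longitude of its turbines
projects = toy.groupby("p_name", as_index=False)[["ylat", "xlong"]].mean()
print(projects)
```

Iterating over `projects` instead of `df` in the marker loop would then place one circle at each project's mean location.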

<font color='blue'>__(4)__</font> For the state of Texas, provide a pie chart showing the project capacity as a percentage of total capacity for each year. Make sure that you use option `labels` to properly name the variables with their units and put the years and percentages inside each slice. (**note**: you first need to define a new dataframe that only includes the data for the state of Texas.)

In [21]:
#Creating new dataframe
dft = df[df['t_state']=='TX']

#Defining figure
fig = px.pie(dft, values='p_cap', names='p_year',
             title='Texas Project Capacity 1999-2015 [MW]',
             labels={'p_cap':'Project Capacity [MW]','p_year':'Project Year'})
fig.update_traces(textposition='inside', textinfo='percent+label')
fig.show()

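
One caveat with this approach: `p_cap` is a project-level value that is repeated on every turbine row, and `px.pie` sums `values` for each name, so a project with many turbines is counted many times. The sketch below uses made-up Texas rows to show one possible correction, keeping a single row per project before summing by year. It assumes that the pair (`p_name`, `p_year`) identifies a project.

```python
import pandas as pd

# Hypothetical Texas rows: project "A" has two turbines, "B" and "C" one each
toy_tx = pd.DataFrame({
    "p_name": ["A", "A", "B", "C"],
    "p_year": [2008, 2008, 2008, 2012],
    "p_cap": [3.0, 3.0, 10.0, 7.0],   # project capacity in MW, repeated per turbine
})

# Naive sum over turbine rows counts project "A" twice
naive = toy_tx.groupby("p_year")["p_cap"].sum()

# Keep one row per project, then total the capacity for each year
per_project = toy_tx.drop_duplicates(subset=["p_name", "p_year"])
by_year = per_project.groupby("p_year", as_index=False)["p_cap"].sum()
by_year["share_pct"] = 100 * by_year["p_cap"] / by_year["p_cap"].sum()

print(naive)
print(by_year)
```

Passing `by_year` to `px.pie` with `values='p_cap'` and `names='p_year'` would then give slices based on each project counted once.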