# PV Rooftop - Aspects

This notebook shows how to access the PV rooftop dataset named *aspects* through the OEDI data lake.

## 0. Prerequisites

To run this example, you need a deployed OEDI data lake, through which all queries run. For deployment instructions, refer to the documentation at https://openedi.github.io/open-data-access-tools/.

In this example, the deployed database is `oedi_data_lake`, the table for the *aspects* dataset is `pv_rooftop_aspects`, and the staging location for queries is `s3://nrel-tests/`.

In [1]:
database_name = "oedi_data_lake"
table_name = "pv_rooftop_aspects"
staging_location = "s3://nrel-tests/"

## 1. Aspects Metadata
In `oedi`, the `OEDIGlue` class provides utility methods for retrieving metadata from the database, including `Columns`, `Partition Keys`, and `Partition Values`.

In [2]:
from oedi.AWS.glue import OEDIGlue

In [3]:
glue = OEDIGlue()

In [4]:
# Table Column Definition
glue.get_table_columns(database_name, table_name)

Unnamed: 0,Name,Type
0,gid,bigint
1,city,string
2,state,string
3,year,bigint
4,bldg_fid,bigint
5,aspect,bigint
6,the_geom_96703,string
7,the_geom_4326,string


In [5]:
# Table Partition Keys
glue.get_partition_keys(database_name, table_name)

Unnamed: 0,Name,Type
0,city_year,string


In [6]:
# Table Partition Values
glue.get_partition_values(database_name, table_name)

['baltimore_md_13',
 'toledo_oh_06',
 'dayton_oh_06',
 'lubbock_tx_08',
 'richmond_va_08',
 'milwaukee_wi_13',
 'charlotte_nc_12',
 'losangeles_ca_07',
 'poughkeepsie_ny_12',
 'frankfort_ky_12',
 'salem_or_08',
 'madison_wi_10',
 'sarasota_fl_09',
 'washington_dc_12',
 'bakersfield_ca_10',
 'allentown_pa_06',
 'jeffersoncity_mo_08',
 'winstonsalem_nc_09',
 'cincinnati_oh_10',
 'toledo_oh_12',
 'trenton_nj_08',
 'corpuschristi_tx_12',
 'oklahomacity_ok_07',
 'shreveport_la_08',
 'boise_id_07',
 'fresno_ca_13',
 'houston_tx_10',
 'bridgeport_ct_06',
 'detroit_mi_12',
 'montpelier_vt_09',
 'tallahassee_fl_09',
 'sanfrancisco_ca_13',
 'tampa_fl_13',
 'austin_tx_06',
 'manhattan_ny_07',
 'missionviejo_ca_13',
 'charlotte_nc_06',
 'minneapolis_mn_07',
 'albuquerque_nm_06',
 'springfield_ma_07',
 'carsoncity_nv_09',
 'charleston_sc_10',
 'reno_nv_07',
 'mobile_al_10',
 'springfield_il_09',
 'syracuse_ny_08',
 'batonrouge_la_12',
 'cheyenne_wy_08',
 'fresno_ca_06',
 'lexington_ky_12',
 'milwau
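
The partition values listed above appear to follow a `<city>_<state>_<yy>` pattern, and some cities occur more than once (for example `toledo_oh_06` and `toledo_oh_12`). The sketch below splits a few of those values so that the available years per city can be looked up. It assumes that the pattern holds for every partition and that the two-digit suffix is a year in the 2000s; both are inferred from the listing rather than documented. `parse_partition` is a hypothetical helper, not part of `oedi`.

```python
from collections import defaultdict


def parse_partition(value):
    """Split a partition value such as 'toledo_oh_06' into (city, state, year).

    Assumes the '<city>_<state>_<yy>' pattern seen in the listing above and
    that the two-digit suffix is a year in the 2000s.
    """
    city, state, yy = value.rsplit("_", 2)
    return city, state, 2000 + int(yy)


# A few values copied from the output above.
partitions = ["toledo_oh_06", "toledo_oh_12", "charlotte_nc_12", "charlotte_nc_06"]

# Group the available years by (city, state).
years_by_city = defaultdict(list)
for value in partitions:
    city, state, year = parse_partition(value)
    years_by_city[(city, state)].append(year)

for key in years_by_city:
    years_by_city[key].sort()
```

In the notebook, `partitions` could instead be the list returned by `glue.get_partition_values(database_name, table_name)`.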

## 2. Run Query
Based on the metadata retrieved above, we can query data using the `run_query` method of the `OEDIAthena` class. In the example below, we select records from the partition `topeka_ks_08`.

In [7]:
from pyproj import CRS
from oedi.AWS.athena import OEDIAthena

In [8]:
athena = OEDIAthena(staging_location=staging_location, region_name="us-west-2")

In [9]:
query_string = f"""
    SELECT gid, bldg_fid, aspect, the_geom_4326
    FROM {database_name}.{table_name}
    WHERE city_year='topeka_ks_08'
"""
gdf = athena.run_query(query_string, geometry="the_geom_4326")
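
Because the query above is assembled with an f-string, the partition name is interpolated into the SQL text verbatim. If the partition name ever comes from user input, it may be worth validating it first. The following is a minimal sketch of a hypothetical helper, `build_partition_query`, that checks the value against a conservative pattern before building the same statement. The pattern is an assumption based on the partition values shown earlier.

```python
import re

# Assumed pattern: lowercase city, two-letter state, two-digit year.
PARTITION_PATTERN = re.compile(r"[a-z]+_[a-z]{2}_\d{2}")


def build_partition_query(database_name, table_name, city_year):
    """Return the SELECT statement used above for one city_year partition."""
    if not PARTITION_PATTERN.fullmatch(city_year):
        raise ValueError(f"unexpected partition value: {city_year!r}")
    return (
        "SELECT gid, bldg_fid, aspect, the_geom_4326\n"
        f"FROM {database_name}.{table_name}\n"
        f"WHERE city_year = '{city_year}'"
    )


query_string = build_partition_query(
    "oedi_data_lake", "pv_rooftop_aspects", "topeka_ks_08"
)
```

The resulting `query_string` can be passed to `athena.run_query(query_string, geometry="the_geom_4326")` exactly as in the cell above.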

In [10]:
gdf.crs = CRS("EPSG:4326")

In [11]:
gdf

Unnamed: 0,gid,bldg_fid,aspect,the_geom_4326
0,496,70560,4,"MULTIPOLYGON (((-95.78945 39.17040, -95.78945 ..."
1,998,70506,3,"MULTIPOLYGON (((-95.57186 39.17412, -95.57185 ..."
2,3949,70217,3,"MULTIPOLYGON (((-95.56859 39.16762, -95.56860 ..."
3,5757,70064,4,"MULTIPOLYGON (((-95.72628 39.15991, -95.72628 ..."
4,20158,68639,4,"MULTIPOLYGON (((-95.78229 39.14319, -95.78230 ..."
...,...,...,...,...
853864,621567,18308,4,"MULTIPOLYGON (((-95.70404 39.01144, -95.70405 ..."
853865,621568,18302,2,"MULTIPOLYGON (((-95.70177 39.01149, -95.70178 ..."
853866,621569,18310,7,"MULTIPOLYGON (((-95.70122 39.01150, -95.70125 ..."
853867,621586,18201,2,"MULTIPOLYGON (((-95.65775 39.01250, -95.65776 ..."


In [12]:
# Check geometry
geom = gdf.iloc[1]["the_geom_4326"]
geom.bounds

(-95.5718572081215, 39.1741102680828, -95.5718453147948, 39.17411952549)
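
The bounds above are in degrees, which makes the size of the geometry hard to judge. As a rough check, the sketch below converts a longitude/latitude bounding box into approximate meters using an equirectangular approximation. `bounds_size_m` is a hypothetical helper and the Earth radius constant is an assumption of this sketch. For accurate measurements, the geometry should be reprojected to a suitable projected CRS instead.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters (assumed constant)


def bounds_size_m(bounds):
    """Approximate (width, height) in meters of a lon/lat bounding box.

    Uses an equirectangular approximation, which is adequate for boxes
    this small but is not a substitute for reprojecting the geometry.
    """
    min_lon, min_lat, max_lon, max_lat = bounds
    mid_lat = math.radians((min_lat + max_lat) / 2)
    width = math.radians(max_lon - min_lon) * EARTH_RADIUS_M * math.cos(mid_lat)
    height = math.radians(max_lat - min_lat) * EARTH_RADIUS_M
    return width, height


# Bounds printed above for the geometry at position 1.
width, height = bounds_size_m(
    (-95.5718572081215, 39.1741102680828, -95.5718453147948, 39.17411952549)
)
print(f"{width:.2f} m x {height:.2f} m")
```

Under this approximation, the geometry at position 1 comes out at roughly one meter on each side.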

## 3. Aspect Visualization

Next, we visualize the aspect of PV rooftops on a map to see whether there are any trends in its geographic distribution.

In [13]:
import folium
import plotly.graph_objs as go

In [14]:
# Aggregation for counting aspects
agg = gdf.groupby("aspect", as_index=False)["gid"].count()
agg.rename(columns={"gid": "count"}, inplace=True)

# Histogram
aspects = [0, 1, 2, 3, 4, 5, 6, 7, 8]
colors = {
    0: "#ff0000", 
    1: "#ffbf00", 
    2: "#ffff00",
    3: "#00ff00",
    4: "#00ffff",
    5: "#00bfff",
    6: "#0000ff",
    7: "#8000ff",
    8: "#ff00ff"
}
bars = []
for aspect in aspects:
    bar = go.Bar(
        x=[aspect], 
        y=agg[agg["aspect"]==aspect]["count"].values, 
        name=aspect, 
        marker={"color": colors[aspect]}
    )
    bars.append(bar)
fig = go.FigureWidget(data=bars)
fig.layout.title = "Aspect Histogram"
fig

FigureWidget({
    'data': [{'marker': {'color': '#ff0000'},
              'name': '0',
              'type': …
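
The loop in the cell above looks up each aspect code in `agg`. If a code does not occur in the selected partition, the lookup yields an empty array and that bar has no value. The sketch below shows one way to obtain zero-filled counts with the standard library. The `observed` list is toy data for illustration only; in the notebook the values would come from `gdf["aspect"]`.

```python
from collections import Counter

aspects = [0, 1, 2, 3, 4, 5, 6, 7, 8]

# Toy aspect codes for illustration only; in the notebook these would
# come from gdf["aspect"].
observed = [4, 3, 3, 4, 4, 2, 7, 2]

counts = Counter(observed)
# Counter returns 0 for missing keys, so every aspect code gets a value.
zero_filled = [counts[aspect] for aspect in aspects]
print(zero_filled)  # [0, 0, 2, 2, 3, 0, 0, 1, 0]
```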

In [15]:
# Sample the dataset; otherwise it is too large to display.
samples = gdf.sample(n=10000)

In [16]:
imap = folium.Map(location=[39.0473, -95.6752], zoom_start=11, tiles="Stamen Toner")

# Style function
def style_function(feature):
    aspect = feature["properties"]["aspect"]
    return {
        "fillOpacity": 0.75,
        "fillColor": colors[aspect],
        "weight": 0.1,
        "color": colors[aspect]
    }

# GeoJSON
folium.GeoJson(
    name="PV Rooftops",
    data=samples.to_json(),
    style_function=style_function
).add_to(imap)

imap
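
The style function indexes `colors` directly, so a feature with a missing or unexpected aspect code would raise a `KeyError` while the map is being built. The following is a sketch of a more defensive variant that falls back to a neutral color. The grey fallback value is a hypothetical choice, and `weight` is the Leaflet path option for stroke width.

```python
colors = {
    0: "#ff0000", 1: "#ffbf00", 2: "#ffff00",
    3: "#00ff00", 4: "#00ffff", 5: "#00bfff",
    6: "#0000ff", 7: "#8000ff", 8: "#ff00ff",
}
FALLBACK_COLOR = "#808080"  # hypothetical grey for unknown aspect codes


def style_function(feature):
    """Return Leaflet path options for one GeoJSON feature."""
    aspect = feature["properties"].get("aspect")
    color = colors.get(aspect, FALLBACK_COLOR)
    return {
        "fillOpacity": 0.75,
        "fillColor": color,
        "weight": 0.1,
        "color": color,
    }
```

This variant can be passed to `folium.GeoJson(..., style_function=style_function)` in the same way as the original.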