Demand mapping #717

yashkumar1803 · 2020-08-06T23:41:35Z

Added new visualization functions, which can be analyzed in the electricity-demand-mapping repo

Minor merge conflicts to resolve. All Travis / fast tests pass, with the exception of the known non_net_metering_eia861 duplicate rows issue. Also had to re-format docstrings in demand_mapping.py

codecov · 2020-08-07T00:05:03Z

Codecov Report

Merging #717 into sprint22 will decrease coverage by 3.17%.
The diff coverage is 6.73%.

@@             Coverage Diff              @@
##           sprint22     #717      +/-   ##
============================================
- Coverage     74.39%   71.22%   -3.17%     
============================================
  Files            39       39              
  Lines          4639     4819     +180     
============================================
- Hits           3451     3432      -19     
- Misses         1188     1387     +199

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/pudl/analysis/service_territory.py | `21.60% <0.00%> (-0.35%)` | ⬇️ |
| src/pudl/helpers.py | `87.02% <ø> (ø)` | |
| src/pudl/output/ferc714.py | `17.74% <ø> (ø)` | |
| src/pudl/transform/eia861.py | `96.57% <ø> (ø)` | |
| src/pudl/analysis/demand_mapping.py | `9.52% <6.80%> (-2.21%)` | ⬇️ |
| src/pudl/workspace/datastore.py | `43.02% <0.00%> (-17.88%)` | ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 200dab7...e5bab5f.

zaneselvans · 2020-08-07T19:27:00Z

Hey @yashkumar1803, it's not technically part of this PR, but I do want to check in about a few things in the most recent version of your notebook, over in the demand mapping repo.

- Is there a reason that you're pulling the tract-level layer of the census geometries and then dissolving them to the county level, rather than directly using the provided county layer of the census data?
- There's a function integrated into PUDL for obtaining the Census DP1 and storing it locally: pudl.analysis.service_territory.get_census2010_gdf(). You can set the desired layer to state, county, or tract.
- The FERC 714 doesn't need to be obtained independently. It will be downloaded automatically when you ask for the FERC 714 ETL to be run, which happens inside the PUDL output object (including inside the FERC 714 Respondents class). All access to the FERC 714 data should go through the PUDL output object; the interface ought to be pretty stable at this point. It also runs the ETL, so you don't need to do the extract / transform steps anywhere in the notebook. So for example...

import sqlalchemy as sa

import pudl

# Connect to the local PUDL database and build the output object:
pudl_settings = pudl.workspace.setup.get_defaults()
pudl_engine = sa.create_engine(pudl_settings['pudl_db'])
pudl_out = pudl.output.pudltabl.PudlTabl(pudl_engine)

# Get the county level Census geometries / data
# (layer is one of "state", "county", or "tract"):
county_gdf = pudl.analysis.service_territory.get_census2010_gdf(pudl_settings, layer="county")

# FERC 714 respondents with their associated county geometries:
ferc714_out = pudl.output.ferc714.Respondents(pudl_out)
ba_county_map = ferc714_out.georef_counties()

# FERC 714 hourly demand data:
dhpa_ferc714 = pudl_out.demand_hourly_pa_ferc714()

- pd.read_csv() is happy to take a pathlib.Path object as an argument -- you don't need to str() the path first.
- To select records from a given year when you have a date field, you can use the .dt accessor on the datetime column, like so: ba_county_map_2010 = ba_county_map[ba_county_map.report_date.dt.year == 2010]
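
A minimal sketch of both tips, using a small made-up table rather than the real ba_county_map. Only the report_date column name comes from the comment above; the county_id_fips column, the values, and the temporary file name are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-in for ba_county_map (hypothetical values and FIPS column).
toy = pd.DataFrame({
    "report_date": ["2009-01-01", "2010-01-01", "2010-01-01"],
    "county_id_fips": ["08013", "08013", "08031"],
})

with tempfile.TemporaryDirectory() as tmpdir:
    csv_path = Path(tmpdir) / "ba_county_map.csv"
    toy.to_csv(csv_path, index=False)
    # read_csv accepts the Path directly; no str() conversion is needed.
    ba_county_map = pd.read_csv(
        csv_path, parse_dates=["report_date"], dtype={"county_id_fips": str}
    )

# The .dt accessor exposes the datetime components of the column.
ba_county_map_2010 = ba_county_map[ba_county_map.report_date.dt.year == 2010]
```

Reading the FIPS codes as strings keeps their leading zeros, which would be lost if pandas inferred an integer type.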

zaneselvans

I didn't get to everything, but there's a lot here already. We should probably merge it, but keep working on the module as a whole in smaller chunks, one function at a time.

zaneselvans · 2020-08-19T23:09:51Z

src/pudl/analysis/demand_mapping.py

+            ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'])
+
+
+def uncovered_area_mismatch(disagg_geom, total_geom, title="Area Coverage (By Planning Area)"):


In the notebook output for this map, the PJM territory seems vastly overrepresented, since there are several FERC 714 respondent IDs that have been assigned to the PJM EIA ID. However, only one of them has any demand associated with it. Would it make sense to exclude from consideration any area that doesn't have any reported demand at all in the year in question? Probably sometime early on -- the annualized() version of the FERC714 respondents output includes the annual sum of that respondent's reported demand.
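
One way the filtering suggested here might look, as a rough sketch on a toy table. The column names (respondent_id_ferc714, report_year, demand_annual_mwh) and the helper respondents_with_demand() are assumptions for illustration, not the actual output schema or an existing PUDL function.

```python
import pandas as pd

# Toy stand-in for an annualized respondent table (hypothetical columns).
annual = pd.DataFrame({
    "respondent_id_ferc714": [101, 102, 103, 101],
    "report_year": [2010, 2010, 2010, 2011],
    "demand_annual_mwh": [5.0e6, 0.0, None, 5.2e6],
})


def respondents_with_demand(annual_df, year):
    """Return the IDs of respondents reporting non-zero demand in a year."""
    in_year = annual_df[annual_df["report_year"] == year]
    # fillna(0) treats missing demand the same as zero demand.
    has_demand = in_year["demand_annual_mwh"].fillna(0) > 0
    return set(in_year.loc[has_demand, "respondent_id_ferc714"])


ids_2010 = respondents_with_demand(annual, 2010)
```

The resulting set of IDs could then be used to drop geometries for respondents without demand before the coverage map is drawn.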

zaneselvans · 2020-08-20T03:56:25Z

src/pudl/analysis/demand_mapping.py

+
+def error_heatmap(alloc_df, actual_df, demand_columns, region_col="pca", error_metric="r2", leap_exception=False):
+    """
+    Create heatmap of 365X24 dimension to visualize the annual hourly error.


This function does a very particular thing, and I think it might be better separated into a few smaller functions that are more reusable -- like one which, given a year of hourly data (a Datetime index + a data column), makes a heatmap, and a separate one which takes 2 hourly demand allocations and calculates the delta between them, the output of which can be fed into the plotting function. That way we can make other 365 x 24 heatmaps to show other kinds of variables that make sense that way too.
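
A rough sketch of the split described above, assuming the hourly data arrive as pandas Series with a DatetimeIndex. The function names hourly_delta() and day_hour_matrix() are hypothetical, and the data are synthetic.

```python
import numpy as np
import pandas as pd


def hourly_delta(allocated, actual):
    """Difference between two hourly series, aligned on their index."""
    return allocated.sub(actual)


def day_hour_matrix(hourly):
    """Reshape hourly data into a (day of year) x (hour of day) table."""
    frame = pd.DataFrame({
        "day": hourly.index.dayofyear,
        "hour": hourly.index.hour,
        "value": hourly.to_numpy(),
    })
    return frame.pivot(index="day", columns="hour", values="value")


# Synthetic non-leap year of hourly data, for illustration only.
idx = pd.date_range("2019-01-01", periods=365 * 24, freq=pd.Timedelta(hours=1))
actual = pd.Series(np.arange(len(idx), dtype=float), index=idx)
# Fake allocation whose error equals the hour of the day.
allocated = actual + idx.hour.to_numpy()

matrix = day_hour_matrix(hourly_delta(allocated, actual))
```

The resulting table has one row per day and one column per hour, so it can be handed to any generic heatmap routine (for example matplotlib's imshow). A leap year would simply produce 366 rows.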

Visually, can we scale it down so it fits on one screen? I think the Y-axis can also be simplified to have day-of-year (integer) as the labels. Could label every 7 days to highlight the weekly pattern? Could just have the first letters of the months JFMAMJJASOND. Allowing the UTC time to be localized for display so that the zero hour is local zero would also be good, since it would make the familiar diurnal pattern clearer, and allow uniform visual comparison between different plots of this type.
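
As an illustration of the display suggestions, here is a small sketch that converts UTC timestamps to a local timezone and builds weekly tick positions. The timezone and the demand values are arbitrary examples, not taken from the module.

```python
import pandas as pd

# Three hourly timestamps in UTC (made-up demand values).
utc_index = pd.date_range(
    "2019-01-01 05:00", periods=3, freq=pd.Timedelta(hours=1), tz="UTC"
)
demand = pd.Series([100.0, 95.0, 92.0], index=utc_index)

# Convert the index for display so that hour 0 is local midnight.
local = demand.tz_convert("America/New_York")
local_hours = list(local.index.hour)

# Day-of-year tick positions, one label every 7 days.
weekly_ticks = list(range(1, 366, 7))
```

Because tz_convert() only changes how the index is expressed, the underlying instants and demand values are unchanged; only the hour used for the heatmap columns shifts.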

Added `pudl.analysis.demand_mapping.sales_ratio_by_class_fips()` function which uses the EIA 861 Sales and Service Territory tables to estimate the breakdown of electricity sales to residential, commercial, industrial, and transportation customers in each year and county. Closes #720

* Along with the county geometries, bring in county area and population in the `pudl.analysis.service_territory.add_geometry()` function. This means calculating the true areas of the counties in a projected (equal area) coordinate system.
* When service territory geometries are being dissolved, sum the areas and populations similarly to keep them self-consistent.
* In the FERC 714 territory demand summary method, also make sure that the population and area are available, and calculate some informative ratios (population density, demand per unit area, and demand per capita) for use in identifying bad service territory geometries.
* Add the mccabe Flake8 plugin to the pudl-dev to calculate function complexity

Progress toward #716
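
The ratios mentioned in this commit could be computed along these lines. This is only a sketch on a toy table: the column names, the values, and the helper add_demand_ratios() are all assumptions, not the code that was committed.

```python
import numpy as np
import pandas as pd

# Toy territory summary; all column names and values are made up.
territories = pd.DataFrame({
    "respondent_id_ferc714": [101, 102],
    "population": [1_000_000, 50_000],
    "area_km2": [20_000.0, 0.0],
    "demand_annual_mwh": [8.0e6, 4.0e5],
})


def add_demand_ratios(df):
    """Add population density and demand-intensity columns."""
    out = df.copy()
    # Treat zero area or population as missing to avoid infinite ratios.
    area = out["area_km2"].replace(0, np.nan)
    pop = out["population"].replace(0, np.nan)
    out["population_density_km2"] = out["population"] / area
    out["demand_density_mwh_km2"] = out["demand_annual_mwh"] / area
    out["demand_per_capita_mwh"] = out["demand_annual_mwh"] / pop
    return out


ratios = add_demand_ratios(territories)
```

Territories whose ratios come out missing or far outside the typical range are candidates for having a bad geometry.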

yashkumar1803 and others added 10 commits July 6, 2020 20:16

Updated demand mapping allocate functions

a0c8d67

Merge branch 'sprint18' into demand_mapping

2206b5e

Edit allocate_and_aggregate function for faster vectorized calculations

c492ec0

Merge branch 'sprint18' into demand_mapping

633911f

Merge branch 'sprint19' into demand_mapping

ebe408b

Added section for allocated demand and error visualization functions

24530f3

Merge remote-tracking branch 'origin/sprint20' into demand_mapping

ffec7f2

Merge branch 'ferc714-output' into demand_mapping

0948099

src/pudl/analysis/demand_mapping.py

3b46836

Merge branch 'sprint20' into demand_mapping

29eac50

zaneselvans requested review from zaneselvans and ezwelty August 7, 2020 13:23

Merge branch 'sprint20' into demand_mapping

8b794fa

yashkumar1803 added 4 commits August 8, 2020 02:09

Make correlation figures and demand profiles function robust. Added a…

68d3b77

…ggregation criteria

Updated all visualizations in the demand_mapping module according to …

b3ed4d5

…github issue #7

comment changes in the demand_mapping module

8de9fae

Merge branch 'sprint20' into demand_mapping

696a96c

zaneselvans changed the base branch from sprint20 to sprint21 August 10, 2020 23:29

yashkumar1803 added 3 commits August 19, 2020 14:25

Merge branch 'sprint21' into demand_mapping

ef4be07

Update timescale error figures, removed NA figure, display RMSE and M…

3bc9b66

…APE side-by-side

Adding region selection option in error_figure function

895d606

zaneselvans reviewed Aug 20, 2020

zaneselvans linked an issue Aug 20, 2020 that may be closed by this pull request

Calculate metric of customer class electricity sales splits #720

Closed

yashkumar1803 and others added 4 commits August 21, 2020 18:22

Merge branch 'demand_mapping' of https://github.com/catalyst-cooperat…

d738fae

…ive/pudl into demand_mapping

Addressed PR comments

a485553

Merge remote-tracking branch 'refs/remotes/origin/demand_mapping' int…

db0153c

…o demand_mapping

Clean up docstring formatting

7caef10

zaneselvans changed the base branch from sprint21 to sprint22 August 27, 2020 14:42

Merge branch 'sprint22' into demand_mapping

e5bab5f

zaneselvans merged commit bbf3907 into sprint22 Aug 27, 2020
