Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create DB schema for persistent data #124

Open
taiphuong1 opened this issue Dec 20, 2023 · 10 comments
Open

Create DB schema for persistent data #124

taiphuong1 opened this issue Dec 20, 2023 · 10 comments

Comments

@taiphuong1
Copy link
Contributor

taiphuong1 commented Dec 20, 2023

-- Edited by group on Dec 26 to harmonize HOME and HeatingLoadAnalysis here, with rules engine and spreadsheet

Note: This table represent a single location (ie. a house)

Create table LOCATION 
location_id (Primary Key)
address
address_2 -- (need address line 2?)
city
state
zip
country

Note: This is the main table for each case, referencing a location associate with the location_id FK

Telephone

Create table HOME
home_id (Primary Key)
location_id (Foreign Key)
first_name
last_name
living_area --(sq ft)
design_temperature  --(get set by choosing weather station)
design_temperature_override
fuel_type
fuel_type_secondary -- (After v1.0 only)
heating_system_efficiency -- (%)
heating_system_efficiency_secondary -- (After v1.0 only)
thermostat_set_point
setback_temperature
setback_time -- (hrs) -- Note that rules_engine uses `setback_hours_per_day`
non_heating_usage -- (fuel type unit / day), double check: !B16 / !B15
      -- fuel oil this is an input, otherwise ...
      --     only used for override. For Beta 2. 
      --     On spreadsheet shown as  "Other Fuel Usage".
raw_energy_use_data -- blob csv of natural gas, may not have for OilPropane, etc. 
Create Table HeatingLoadAnalysis -- aka SummaryOutput from RulesEngine
version_rules_engine -- commit sha format
balance_point_initial -- currently 60 on rules engine.
balance_point -- (Tbp) -- In RulesEngine final is `estimated_balance_point`, 60 is the initial guess?
other_fuel_usage -- (therms/day) calculated -- NOTE: For fuel oil this is an input, for natural gas is an output.
balance_point_sensitivity
average_indoor_temperature -- (T-indoors)
// difference_between_ti_and_tbp -- (calculated T-indoors - Tbp), Tai had suggested Difference_ti_tbp -- This probably does not belong in the database, it is just a subtraction. Do not store.
whole_home_heat_loss_rate -- Tai had suggested: Whole_home_ua
standard_deviation_of_heat_loss_rate: -- tai had suggested: Standard_deviation_ua
average_heat_load
maximum_heat_load

Note: This table holds the data for usage csv for each case referencing the case_id FK

Create table ENERGY_USE_PERIOD -- was FUEL_USAGE (load from usage csv)
home_id (Foreign Key)
-- each home has one fuel type which determines what usage unit means.
start_date
end_date
usage_days
for_primary_or_secondary_fuel
usage_quantity -- (floating point : code should know therms OR gallons OR kWh )
included_automatically -- bool / was Inclusion -- (Y/N)
exclusion_manually -- bool
inclusion_manually -- bool / was Inclusion_override -- possibility for nullable enum?
@alanisaac
Copy link
Collaborator

alanisaac commented Dec 20, 2023

https://github.com/codeforboston/home-energy-analysis-tool/blob/summary-outputs/rules-engine/src/rules_engine/pydantic_models.py is where our schema is, you can get a flavor for some of the language we use for the fields in the excel sheet.

https://dbdiagram.io/ is a free, neat tool to create database diagrams if that might help folks visualize the proposed schema.

@jkoren
Copy link
Collaborator

jkoren commented Dec 21, 2023

@alanisaac this is helpful. It looks like we have been working on similar ideas. It will be helpful to coordinate. We can standardize on your names, if preferred. The idea of the schema is just for data to be persisted now. We can add calculated data (for example the number of days between 2 dates) later.
I tried the dbdiagram.io tool - its nice.
Here is a diagram I put together. The premise is that there could be multiple cases for the same location.
The fuel usage table is to reflect the monthly usage readings from the energy bill.
This is a starting point - so open for revision and discussion.
HEAT Calculator

@jkoren
Copy link
Collaborator

jkoren commented Dec 21, 2023

It's nice that the tool creates SQL too.

@baipai
Copy link
Collaborator

baipai commented Dec 21, 2023

I don't think we need a phone number, or maybe I'm wrong. @stevebreit

@jkoren
Copy link
Collaborator

jkoren commented Jan 3, 2024

In ENERGY_USE_PERIOD we can probably avoid persisting usage_days as that is just end_date - start_date

@jkoren
Copy link
Collaborator

jkoren commented Jan 3, 2024

I am not clear what these fields in ENERGY_USE_PERIOD represent:

included_automatically
exclusion_manually
inclusion_manually

standard_deviation_of_heat_loss_rate is better than Standard_deviation_ua
and
standard_deviation_of_heat_loss_rate is better than Standard_deviation_ua

@thadk
Copy link
Member

thadk commented Jan 3, 2024

Energy Use Parent Class
=> Delivery Period Subclass (date of delivery and quantity)
=> Billing Period Subclass (start/number of days, end, and quantity)
- 3 values that define it, you need at least 2.
- it depends which company supplied the date.
- Some companies describe end date.
- Some companies describe start date.

@thadk
Copy link
Member

thadk commented Jan 3, 2024

Michelle's Beta 1 (and always): Export data format that can be re-ingested again

Notes from Javascript team breakout about where we're at on this issue:
Jeff's Posed Goal for this moment/persistence team:
- Decide on data structure and names of fields

Rules engine has it's ideas.

Michelle:

  • What we need to persist in order to show the user what they need to know,
  • and output that in some kind of data format (whether it be CSV or something else)
  • and reliably compare new calculations to old calculations

@thadk
Copy link
Member

thadk commented Jan 3, 2024

heat-stack/_heat ~~~~> / (e.g. Terms and Conditions)
All homes /all-homes
heat-stack/_Home ~~~~>/home/1245/*
- all these include the "Return to site" blue nav header
- some of these include the side nav
- Input screens for a specific home
- heat analysis screen
_home+/$homeid+/analysis ~~~>

My page /me
- List of my homes
Admin page `/admin
List of users
List of homes

@jkoren jkoren mentioned this issue Jan 7, 2024
@alanisaac
Copy link
Collaborator

alanisaac commented Jan 10, 2024

Rules engine has its ideas.

When we discussed this last week, I wanted to put together a proposal that maps to what we have in the rules engine. I haven't had a ton of time, but I thought I'd spend part of today's session putting that proposal together. Here it is in DBML.

image

Table case {
  id integer [primary key]
  home_id integer [ref: > home.id]
  created_at timestamp 
  last_updated_on timestamp
  first_name varchar
  last_name varchar
  // case worker info?
}

Table location {
  id integer [primary key]
  address varchar
  address_line_2 varchar
  city varchar
  state varchar
  zip varchar
  country varchar
}

Table home {
  id integer [primary key]
  location_id integer [ref: > location.id]
  // summary inputs
  living_area float
  fuel_type integer
  design_temperature_override float
  heating_system_efficiency float
  thermostat_set_point float
  setback_temperature float
  setback_hours_per_day float
  // domestic hot water inputs
  number_of_occupants integer
  estimated_water_heating_efficiency float
  stand_by_losses float
}

Table heat_load_analysis {
  id integer [primary key]
  home_id integer [ref: > home.id]
  rules_engine_version varchar
  estimated_balance_point float
  other_fuel_usage float
  average_indoor_temperature float
  difference_between_ti_and_tbp float
  design_temperature float
  whole_home_heat_loss_rate float
  standard_deviation_of_heat_loss_rate float
  average_heat_load float
  maximum_heat_load float
}

Table natural_gas_bill {
  id integer [primary key]
  case_id integer [ref: > case.id]
  provider varchar
}

Table natural_gas_bill_records {
  natural_gas_bill_id integer [ref: > natural_gas_bill.id]
  period_start_date datetime
  period_end_date datetime
  usage_therms float
  inclusion_override integer
}

Table oil_propane_bill {
  id integer [primary key]
  case_id integer [ref: > case.id]
  provider varchar
  preceding_delivery_date datetime
}

Table oil_propane_bill_records {
  oil_propane_bill_id integer [ref: > oil_propane_bill.id]
  period_end_date datetime
  gallons float
  inclusion_override boolean
}

Differences

  • Separates out the concept of a case from the data needed for the home analysis.
  • Includes fields for Domestic Hot Water in the home table (see the "DHW" tab in the Heat Load Analysis Tool spreadsheet)
  • Moves design_temperature to an output in heat_load_analysis rather than an input. Since we decided to pull temperature data for locations instead of selecting a weather station, IIRC this will be calculated based on the home location (unless using the override), but I am double-checking with @stevebreit.
  • Fixes up the inclusion for billing records mentioned above: I'm not sure we need to persist the extra fields besides the override.
  • The biggest differences are in the way the existing heating data (oil / propane or natural gas) are represented:
    • I split out one table to store the concept of a bill / statement (*_bill), and one to store the billing period records associated with that statement (*_bill_records). The bill table lets us store additional information about the energy provider, as well as supports the idea that oil / propane needs one extra field for the preceding delivery date (cell B6 in the spreadsheet).
    • I split the table out into two, natural_gas_* vs oil_propane_*. I know it means more tables, but the differences that already exist between the two types of energy may not be the only ones. In our last meeting @stevebreit made mention of some other strange scenarios specific to energy types in the past that may become more common and have a need to be represented officially as the tool scales up. This is the way we represent the data in the rules engine as well.

@jkoren jkoren mentioned this issue Jan 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants