# Project Journal

**Name:** Adriana Watson

**Research Question:** The Energy Policy Act of 2005 (EPAct) included provisions intended to promote the use of solar, wind, geothermal, hydroelectric, biomass, and biofuel energy. Can the changes in renewable energy consumption after the implementation of EPAct be explained by the individual contributions of solar, wind, geothermal, hydroelectric, biomass, and biofuel consumption?

**Variables:**
$$Y: \text{Total Renewable Energy} \newline
X_1: \text{Solar Energy} \newline
X_2: \text{Wind Energy} \newline
X_3: \text{Geothermal Energy} \newline
X_4: \text{Hydroelectric Power} \newline
X_5: \text{Biomass Energy Consumption} \newline
X_6: \text{Other Biofuels} \newline
X_7: \text{After\_EPAct} \newline
X_8: \text{EPAct\_Time}$$

## Data Prep & EDA
**Dates:** November 1 - November 7

**Meeting Date:** November 7

### Data Cleaning Summary

**Summary of data cleaning process:**
1. Load libraries
2. Import the dataset
3. Check for and fill missing values
4. Create new variables (```After_EPAct``` and ```EPAct_Time```)
5. Remove unnecessary columns

**Issues Encountered and Resolutions:**
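Only minor issues came up: a few syntax errors, which were corrected as they appeared, and libraries that had not been installed yet, which were resolved with `install.packages()`.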


In [22]:
# Step 1: Load the Necessary Libraries
#install.packages("lubridate")
library(dplyr)      # For data manipulation
library(ggplot2)    # For data visualization
library(lubridate)  # For date handling

In [23]:
# Step 2: Import the Dataset
data <- read.csv("../USRenewableEnergyConsumption.csv")

In [24]:
# Step 3: Check for and Fill Missing Values
# Check for missing values
missing_values <- colSums(is.na(data))
print(missing_values)  # Print the count of missing values in each column

                              Year                              Month 
                                 0                                  0 
                            Sector                Hydroelectric.Power 
                                 0                                  0 
                 Geothermal.Energy                       Solar.Energy 
                                 0                                  0 
                       Wind.Energy                        Wood.Energy 
                                 0                                  0 
                      Waste.Energy Fuel.Ethanol..Excluding.Denaturant 
                                 0                                  0 
    Biomass.Losses.and.Co.products                     Biomass.Energy 
                                 0                                  0 
            Total.Renewable.Energy              Renewable.Diesel.Fuel 
                                 0                                  0 
      

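In [25]:
# Step 4: Create New Variables
# August 2005 is treated as the first post-EPAct month (EPAct_Time = 0)
data <- data %>%
  mutate(After_EPAct = ifelse((Year > 2005) | (Year == 2005 & Month >= 8), 1, 0),  # Binary variable
         EPAct_Time = ifelse(After_EPAct == 1,
                             (Year - 2005) * 12 + (Month - 8),  # Time since EPAct, in months
                             NA))  # Undefined before EPAct (avoids negative values for Jan-Jul 2005)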
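
A quick way to sanity-check the two new variables is to build them on a tiny hand-made data frame where the expected values are easy to work out by hand. The sketch below uses base R only; the `toy` data frame and its values are hypothetical, and it assumes `Month` is numeric (1 to 12) and that every month before August 2005 should get `NA` for `EPAct_Time`.

```r
# Toy data (hypothetical values) with months on both sides of August 2005
toy <- data.frame(Year  = c(2004, 2005, 2005, 2005, 2006),
                  Month = c(12,   7,    8,    9,    1))

# TRUE from August 2005 onward
after <- (toy$Year > 2005) | (toy$Year == 2005 & toy$Month >= 8)

toy$After_EPAct <- as.integer(after)
# Months elapsed since August 2005; NA for pre-EPAct rows
toy$EPAct_Time <- ifelse(after, (toy$Year - 2005) * 12 + (toy$Month - 8), NA)

print(toy)
```

In this toy example, July 2005 stays in the "before" group, August 2005 gets a time value of 0, and January 2006 gets 5.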

In [27]:
# Step 5: Remove Unnecessary Columns and Select Relevant Columns
data <- data %>%
  select(Total_Renewable_Energy = Total.Renewable.Energy,  # Y
         Solar_Energy = Solar.Energy,                      # X_1
         Wind_Energy = Wind.Energy,                        # X_2
         Geothermal_Energy = Geothermal.Energy,            # X_3
         Hydroelectric_Power = Hydroelectric.Power,        # X_4
         Biomass_Energy_Consumption = Biomass.Energy,      # X_5
         Other_Biofuels = Other.Biofuels,                  # X_6
         After_EPAct,                                      # X_7
         EPAct_Time)                                       # X_8

In [28]:
head(data)

| | Total_Renewable_Energy | Solar_Energy | Wind_Energy | Geothermal_Energy | Hydroelectric_Power | Biomass_Energy_Consumption | Other_Biofuels | After_EPAct | EPAct_Time |
|---|---|---|---|---|---|---|---|---|---|
| | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1 | 0.57 | 0 | 0 | 0.0 | 0.0 | 0.57 | 0 | 0 | NA |
| 2 | 89.223 | 0 | 0 | 0.49 | 0.0 | 0.211 | 0 | 0 | NA |
| 3 | 99.973 | 0 | 0 | 0.0 | 1.04 | 98.933 | 0 | 0 | NA |
| 4 | 30.074 | 0 | 0 | 0.0 | 0.0 | 0.0 | 0 | 0 | NA |
| 5 | 0.0 | 0 | 0 | 0.0 | 0.0 | 0.0 | 0 | 0 | NA |
| 6 | 0.515 | 0 | 0 | 0.0 | 0.0 | 0.515 | 0 | 0 | NA |


### Exploratory Data Analysis Findings
**Key Visualizations:** 

In [None]:
# Add any data visualizations you made here!
# Feel free to add in as many code blocks as you need.
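
As a possible starting point for this section, the sketch below draws two plots that suit a before/after question: the response over time with the EPAct start marked, and a boxplot of the response by period. It runs on simulated data (the `sim` data frame, its `Date` column, and all values are made up for illustration) and uses base graphics so it has no package dependencies; the same plots could be built with `ggplot2`.

```r
# Simulated stand-in for the cleaned data (hypothetical values, not the real dataset)
set.seed(1)
dates <- seq(as.Date("2000-01-01"), as.Date("2010-12-01"), by = "month")
sim <- data.frame(
  Date = dates,
  After_EPAct = as.integer(dates >= as.Date("2005-08-01")),
  Total_Renewable_Energy = 100 + 0.5 * seq_along(dates) + rnorm(length(dates), sd = 5)
)

# Line plot of the response over time, with a dashed line at August 2005
plot(sim$Date, sim$Total_Renewable_Energy, type = "l",
     xlab = "Date", ylab = "Total renewable energy",
     main = "Simulated total renewable energy over time")
abline(v = as.Date("2005-08-01"), lty = 2)

# Boxplot comparing the response before and after EPAct
boxplot(Total_Renewable_Energy ~ After_EPAct, data = sim,
        names = c("Before EPAct", "After EPAct"),
        xlab = "Period", ylab = "Total renewable energy")
```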

### Summary Statistics

In [None]:
# Add your summary statistics here
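
One way to fill in this section is an overall `summary()` of the numeric columns plus a before/after breakdown of the response. The sketch below shows the pattern on simulated data; the `sim` data frame and its values are hypothetical, and only the column names are borrowed from the cleaned dataset.

```r
# Simulated stand-in for the cleaned data (hypothetical values)
set.seed(2)
sim <- data.frame(
  After_EPAct = rep(c(0, 1), each = 50),
  Solar_Energy = runif(100, 0, 10),
  Total_Renewable_Energy = runif(100, 50, 150)
)

# Min, quartiles, median, mean, and max for each selected column
print(summary(sim[, c("Solar_Energy", "Total_Renewable_Energy")]))

# Mean and standard deviation of the response, split by EPAct period
by_period <- aggregate(Total_Renewable_Energy ~ After_EPAct, data = sim,
                       FUN = function(x) c(mean = mean(x), sd = sd(x)))
print(by_period)
```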

***
## Model Building
**Dates:** November 8 - November 14

**Meeting Date:** November 14

### Model Equation

**Equation:** 
[Write out the model equation here based on your selected predictors]

Note: you can write equations as follows: 
$$Y = \beta_0 +  \beta_1X_1 + \beta_2X_2 + \epsilon$$
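
For illustration only, a full first-order model that uses all eight predictors from the Variables list (with no interaction terms) would be written as shown here. This is an assumed starting form, not the selected model; the final equation depends on which predictors are kept.

$$Y = \beta_0 + \beta_1X_1 + \beta_2X_2 + \beta_3X_3 + \beta_4X_4 + \beta_5X_5 + \beta_6X_6 + \beta_7X_7 + \beta_8X_8 + \epsilon$$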

### Model Fitting

In [None]:
# Model fitting code (e.g., lm() function)
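
A minimal sketch of a fit with `lm()`, run on simulated data so it is self-contained. The `sim` data frame, the coefficients used to generate it, and the reduced predictor set are all hypothetical. One assumption deserves attention: this sketch codes `EPAct_Time` as 0 before EPAct rather than `NA`, because `lm()` drops rows containing `NA` by default, which would leave only post-EPAct rows in the fit.

```r
# Simulated stand-in for the cleaned data (hypothetical values)
set.seed(3)
n <- 120
sim <- data.frame(
  Solar_Energy = runif(n, 0, 10),
  Wind_Energy  = runif(n, 0, 20),
  After_EPAct  = rep(c(0, 1), each = n / 2)
)
# Months since EPAct: 0 before, counting up afterwards (assumption of this sketch)
sim$EPAct_Time <- c(rep(0, n / 2), seq_len(n / 2) - 1)
sim$Total_Renewable_Energy <- 50 + 2 * sim$Solar_Energy + 1.5 * sim$Wind_Energy +
  5 * sim$After_EPAct + 0.3 * sim$EPAct_Time + rnorm(n, sd = 3)

# Multiple linear regression of the response on the chosen predictors
model <- lm(Total_Renewable_Energy ~ Solar_Energy + Wind_Energy + After_EPAct + EPAct_Time,
            data = sim)
print(coef(model))
```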

### Interaction Terms
**Explanation of Interaction Terms:**
[Briefly describe any interaction terms included in the model]


In [None]:
# Add any interaction plots here

### Model Summary and Diagnostics

In [None]:
# Model summary
summary(model)

# ANOVA table
anova(model)

# Diagnostics: Residual Plots, Normality, etc.
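
The residual and normality checks mentioned above could look roughly like the following. The sketch fits a throwaway model on simulated data (the names `x`, `y`, and `fit` are placeholders), then uses the built-in `plot()` method for `lm` objects and a Shapiro-Wilk test on the residuals.

```r
# Throwaway model on simulated data (hypothetical values)
set.seed(4)
x <- runif(100, 0, 10)
y <- 3 + 2 * x + rnorm(100)
fit <- lm(y ~ x)

# Four standard diagnostic plots in one panel:
# residuals vs fitted, normal Q-Q, scale-location, residuals vs leverage
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)  # restore the previous plotting layout

# Shapiro-Wilk test as a numeric companion to the Q-Q plot
sw <- shapiro.test(residuals(fit))
print(sw$p.value)
```

A small p-value from the Shapiro-Wilk test would suggest the residuals depart from normality, which is worth reading alongside the Q-Q plot rather than on its own.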

### Feature Selection Plan
Describe strategies for reducing the model (if necessary) and rationale.

***
## Model Evaluation & Validation
**Dates:** November 15 - November 21

**Meeting Date:** November 21

### Documentation of Model Adjustments

In [None]:
# Model adjustments made based on your feature selection plan
# You can add as many code/markdown blocks as you need to show 
# the iterative thought process here as you go. 

Summary of iterative process:
1. First I did this
2. Then I did this because...
3. Then I did this because...

Final Model Equation: 

### Model Evaluation
#### Significance Tests

In [None]:
# Add your significance test code with outputs here

#### Model Performance Metrics

In [None]:
# Add your model performance code with outputs here

### Validation Findings

In [None]:
# Add any validation code here
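
One common validation approach is a hold-out split: fit on part of the data and measure prediction error on the rest. The sketch below shows the idea on simulated data; the names `sim`, `train`, `holdout`, and the 80/20 split are assumptions for illustration. For data ordered in time, a random split may not be the best choice, and holding out the most recent months is an alternative.

```r
# Simulated data (hypothetical values)
set.seed(5)
n <- 200
sim <- data.frame(x = runif(n, 0, 10))
sim$y <- 1 + 2 * sim$x + rnorm(n)

# Random 80/20 split into training and hold-out sets
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train   <- sim[train_idx, ]
holdout <- sim[-train_idx, ]

# Fit on the training set, predict on the hold-out set
fit  <- lm(y ~ x, data = train)
pred <- predict(fit, newdata = holdout)

# Root mean squared error on unseen data
rmse <- sqrt(mean((holdout$y - pred)^2))
print(rmse)
```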

### Summary of Findings

[Summarize your findings from the model evaluation and validation here. Don't forget to bring it back to your hypothesis and include your final model!]

***
Team Reminder: After this meeting, agree on a report/presentation format and make all of the needed documentation.

***
## Report and Presentation
**Dates:** November 22 - December 1

**Meeting Date:** November 28

No code necessary here (yay)! Use the space below to brainstorm which graphs you want to include in the report and how you want to tell the story of your model!