### Imports/Settings

In [1]:

import wrangle
import models
import exploration

# for presentation purposes
import warnings
warnings.filterwarnings("ignore")

# libraries
import numpy as np
import pandas as pd

# visualize 
import matplotlib.pyplot as plt
import seaborn as sns

# for tsa 
import statsmodels.api as sm

# holt's linear trend model. 
from statsmodels.tsa.api import Holt

IndentationError: unindent does not match any outer indentation level (models.py, line 62)

In [None]:
# plotting defaults
plt.rc('figure', figsize=(11, 5))
# note: matplotlib 3.6+ renames this style to 'seaborn-v0_8-whitegrid'
plt.style.use('seaborn-whitegrid')
plt.rc('font', size=16)

# Fossil Fuel Project - Consumption/Production Report

##### Reported by: Craig Calzado  -   Date: April 27, 2022

<h2>Goals</h2>

- Determine the Consumption/Production of Fossil Fuel for the next 5 years.

<h2>Executive Summary</h2>

## Acquiring and Preparing the Data:

The data for this project is from the [United States Energy Information Administration](https://www.eia.gov/opendata/).

To access the data, we will use the [API](https://www.eia.gov/opendata/register.php).

Once registered, we will be able to access the data with an API key.

Create an env.py file that defines a variable named api_key holding your API key.

Make sure the env.py file is in the same directory as this notebook and wrangle.py.
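As a minimal sketch of the env.py file described above: the only requirement stated here is a variable named api_key, so the file can be a single assignment. The placeholder value is hypothetical and must be replaced with your own key.

```python
# env.py: keep this file out of version control so the key is never published
# The value below is a placeholder, not a real key.
api_key = "YOUR_EIA_API_KEY"

# Other modules in the same directory can then read the key with:
#     from env import api_key
```

Keeping the key in its own module means the notebook and wrangle.py can be shared without exposing credentials.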


In [None]:
# Calls epi_category() from wrangle.py to get the category data from the API
data = wrangle.epi_category('711238')

In [None]:
# Calls data_manipulation() from wrangle.py
# data_manipulation() is a function that returns a list of series_id values and a list of series_name values pulled from the API
# the two lists are used to build the dataframe
series_id_list, series_name_list = wrangle.data_manipulation(data)
print('series_id_list:\n ', series_id_list,
      '\nseries_name_list:\n ', series_name_list)

In [None]:
# Calls build_df_list_rename() from wrangle.py
# build_df_list_rename() is a function that takes in the lists of series_id and series_name values and builds another list holding the data for each series_id, with the values properly named
df_list = wrangle.build_df_list_rename(series_id_list, series_name_list)

In [None]:
# Calls prep_data() from wrangle.py
# prep_data() is a function that transforms the list created by build_df_list_rename() into a dataframe
df = wrangle.prep_data(df_list)


In [None]:
# feature engineering
# create a new column that is the difference between the fossil fuel production and fossil fuel consumption and other energy sources
df = wrangle.feat_eng(df)   
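The body of feat_eng() is not shown in this notebook, so the following is only a sketch of the kind of column it is described as adding: production minus consumption. The column names and the three monthly values are made up for illustration and are not taken from the EIA data.

```python
import pandas as pd

# Hypothetical column names and made-up values, indexed by month start
df_example = pd.DataFrame(
    {
        "fossil_fuel_production": [5.0, 5.2, 5.1],
        "fossil_fuel_consumption": [6.0, 5.9, 6.3],
    },
    index=pd.date_range("1973-01-01", periods=3, freq="MS"),
)

# Negative values mean more fuel was consumed than produced in that month
df_example["production_minus_consumption"] = (
    df_example["fossil_fuel_production"] - df_example["fossil_fuel_consumption"]
)
```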

In [None]:
# Creates the data frame for the fossil fuels
df = wrangle.fossil_fuels(df)

In [None]:
# Check the dataframe
df.info()

Takeaways:
- Acquired the data from the API
- Created a list of the data we want to use
- Transformed the data into a workable dataframe
- Added the difference between production and consumption
- Created the final dataframe for exploration

The result is 589 entries, from 1973-01-01 to 2022-01-01, with no null values.

## Exploring the data:

While exploring, we need to answer 2 main questions:
### 1.) What area would be the most impactful for our marketing campaign?

In order to answer this question, we went from a macro to micro overview of areas that currently produced the least profit...

In [None]:
train, test = exploration.explore_split(df)

In [None]:
exploration.box_plots(train)

In [None]:
exploration.line_plots(train)

## Modeling the data:

In [None]:
train, validate, test = models.split_data_model(df)
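How split_data_model() divides the data is not shown here. For time series the usual approach is a chronological split, so that validation and test observations always come after the training observations. The sketch below assumes a 50/30/20 split, which is an illustrative choice and not necessarily what models.py uses; the function name is hypothetical.

```python
def chronological_split(rows, train_frac=0.5, validate_frac=0.3):
    """Split an ordered sequence into train/validate/test without shuffling."""
    n = len(rows)
    train_end = int(n * train_frac)
    validate_end = train_end + int(n * validate_frac)
    # Slices keep the original order, so no future data leaks into training
    return rows[:train_end], rows[train_end:validate_end], rows[validate_end:]


# Stand-in for 589 monthly observations, oldest first
months = list(range(589))
train_rows, validate_rows, test_rows = chronological_split(months)
```

With a date-indexed pandas dataframe, the same idea can be applied with positional slicing through .iloc.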

In [None]:
train.info()

In [None]:
yhat_df = models.holts_15_12_model(train, validate)
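To make the Holt's linear trend step less of a black box, here is a hand-written sketch of the recursion behind the method. It is not the code in models.py and not the statsmodels implementation: the function name is hypothetical, the initialization (first value as level, first difference as trend) is a simple convention that may differ from what statsmodels uses, and the smoothing values 0.15 and 0.12 are only a guess based on the model's name.

```python
def holt_linear_forecast(values, alpha, beta, steps):
    """Forecast `steps` points ahead with Holt's additive linear trend method."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Simple initialization: first value as level, first difference as trend
    level = values[0]
    trend = values[1] - values[0]
    for y in values[1:]:
        prev_level = level
        # Level: blend the new observation with the previous one-step forecast
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # Trend: blend the latest change in level with the previous trend
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # h-step-ahead forecast extends the last level along the last trend
    return [level + h * trend for h in range(1, steps + 1)]


# Made-up series that rises by exactly 2 each period
history = [10.0, 12.0, 14.0, 16.0, 18.0]
forecast = holt_linear_forecast(history, alpha=0.15, beta=0.12, steps=3)
```

On a perfectly linear series like this one, the forecast simply continues the line, which is a quick sanity check before trying the method on real data.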

In [None]:
models.previous_year(df)

## Conclusion:

In [None]:
models.conclusion_model(df)