# U.S. Crude Oil Production by State
## by Junyoung Seo

Dataset from [Bureau of Transportation Statistics](https://www.bts.gov/browse-statistical-products-and-data/freight-facts-and-figures/us-crude-oil-production-state)

In this project, I visualize the history of crude oil production by state in the United States. I build an animated choropleth map with Plotly Express inside a small Dash app: each state is shaded by its production for a given year, and the map advances to the next year once per second, so viewers can watch production change over time. After the last year, the animation starts over from the first.

In [80]:
import pandas as pd

#loading the data
df = pd.read_csv('U.S. Oil Production by State.csv')

df.head()

Unnamed: 0,State,Year,Crude oil production (thousands of barrels)
0,Wyoming,2022,91013.0
1,West Virginia,2022,16278.0
2,Virginia,2022,5.0
3,Utah,2022,46481.0
4,Texas,2022,1840014.0
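
The `Unnamed: 0` column in the output above looks like a row index that was saved into the CSV file. Assuming that is what it is, passing `index_col=0` to `pd.read_csv` keeps it out of the data columns. The sketch below uses the five rows shown above as in-memory CSV text (via `io.StringIO`) as a stand-in for the real file, and the name `sample` is only for illustration.

```python
import io
import pandas as pd

# The five rows shown above, written as CSV text with an unnamed first column.
# This stands in for the real file so the example runs on its own.
sample_csv = """,State,Year,Crude oil production (thousands of barrels)
0,Wyoming,2022,91013.0
1,West Virginia,2022,16278.0
2,Virginia,2022,5.0
3,Utah,2022,46481.0
4,Texas,2022,1840014.0
"""

# index_col=0 uses the first column as the row index instead of
# loading it as a data column named "Unnamed: 0"
sample = pd.read_csv(io.StringIO(sample_csv), index_col=0)

print(sample.columns.tolist())
```

The rest of the notebook does not depend on this column, so the step is optional.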


Plotly's built-in U.S. state map (`locationmode='USA-states'`) identifies states by their two-letter postal abbreviations, so the full state names in the dataset must be converted first. For example, "Massachusetts" becomes "MA."

In [81]:
abbrev_names = {
    "Alabama": "AL",
    "Alaska": "AK",
    "Arizona": "AZ",
    "Arkansas": "AR",
    "California": "CA",
    "Colorado": "CO",
    "Connecticut": "CT",
    "Delaware": "DE",
    "Florida": "FL",
    "Georgia": "GA",
    "Hawaii": "HI",
    "Idaho": "ID",
    "Illinois": "IL",
    "Indiana": "IN",
    "Iowa": "IA",
    "Kansas": "KS",
    "Kentucky": "KY",
    "Louisiana": "LA",
    "Maine": "ME",
    "Maryland": "MD",
    "Massachusetts": "MA",
    "Michigan": "MI",
    "Minnesota": "MN",
    "Mississippi": "MS",
    "Missouri": "MO",
    "Montana": "MT",
    "Nebraska": "NE",
    "Nevada": "NV",
    "New Hampshire": "NH",
    "New Jersey": "NJ",
    "New Mexico": "NM",
    "New York": "NY",
    "North Carolina": "NC",
    "North Dakota": "ND",
    "Ohio": "OH",
    "Oklahoma": "OK",
    "Oregon": "OR",
    "Pennsylvania": "PA",
    "Rhode Island": "RI",
    "South Carolina": "SC",
    "South Dakota": "SD",
    "Tennessee": "TN",
    "Texas": "TX",
    "Utah": "UT",
    "Vermont": "VT",
    "Virginia": "VA",
    "Washington": "WA",
    "West Virginia": "WV",
    "Wisconsin": "WI",
    "Wyoming": "WY",
    "District of Columbia": "DC",
    "American Samoa": "AS",
    "Guam": "GU",
    "Northern Mariana Islands": "MP",
    "Puerto Rico": "PR",
    "United States Minor Outlying Islands": "UM",
    "U.S. Virgin Islands": "VI",
}

def convert_state_name(df, state_name):
    df[state_name] = df[state_name].map(abbrev_names)  #mapping state name to abbreviations
    return df

df = convert_state_name(df, 'State')  #changing state names for plotly database
df.head()

Unnamed: 0,State,Year,Crude oil production (thousands of barrels)
0,WY,2022,91013.0
1,WV,2022,16278.0
2,VA,2022,5.0
3,UT,2022,46481.0
4,TX,2022,1840014.0
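
One thing to keep in mind: `Series.map` with a dictionary returns `NaN` for any value that is not a key, so a name missing from `abbrev_names` would silently drop off the map. A quick check like the sketch below can list such names before overwriting the column. The shortened dictionary `abbrev`, the toy frame `toy`, and the label "Not A State" are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical, shortened mapping and toy rows (not from the dataset)
abbrev = {"Texas": "TX", "Utah": "UT"}
toy = pd.DataFrame({"State": ["Texas", "Utah", "Not A State"]})

# map() returns NaN wherever the name is not a key of the dictionary
mapped = toy["State"].map(abbrev)

# Collect the original names that failed to map
unmapped = toy.loc[mapped.isna(), "State"].unique().tolist()
print(unmapped)
```

An empty list means every name was converted.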


In [82]:
from dash import Dash, html, dcc, Input, Output
import plotly.express as px
import pandas as pd


#Creating an application
app = Dash('US oil production')

#layout
app.layout = html.Div([
    dcc.Graph(id='maps'),  #the graph will be a map
    dcc.Slider(
        id='year_slider',   #creating slider with year
        min=df['Year'].min(),
        max=df['Year'].max(),
        value=df['Year'].min(),
        marks={str(year): str(year) for year in range(df['Year'].min(), df['Year'].max() + 1, 5)},  #a mark every 5 years, starting from the first year in the data
        step=1
    ),
    dcc.Interval(
        id='update',  #updating in fixed interval of time
        interval=1000,   #1 second
        n_intervals=0   #storing number of intervals
    )
])

#callback: on every tick of the interval timer, redraw the map and move the slider
@app.callback(
    [Output('maps', 'figure'),  #updating map
     Output('year_slider', 'value')],  #updating slider too
    [Input('update', 'n_intervals')]  #the interval timer triggers the update (this drives the animation)
)
def update_map(n_intervals):
    #use the tick count to pick a year offset, wrapping around after the last year
    min_year = df['Year'].min()
    max_year = df['Year'].max()
    total_years = max_year - min_year + 1
    current_index = n_intervals % total_years

    #current year shown on the map, as a plain int for the slider value
    current_year = int(min_year + current_index)

    #keep only the rows for the current year
    current_year_data = df[df['Year'] == current_year]
    
    #draw the map for the current year with choropleth from plotly.express
    maps = px.choropleth(current_year_data,
                        locations='State',    #two-letter state abbreviations
                        locationmode='USA-states',   #match locations against U.S. states
                        color='Crude oil production (thousands of barrels)',  #shade each state by its production
                        scope="usa",  #limit the map to the U.S.
                        title=f'U.S. Oil Production by State in {current_year}',
                        color_continuous_scale="greys"   #greyscale color ramp
                        )
    
    return maps, current_year

if __name__ == '__main__':
    app.run(debug=True)  #app.run replaces the older app.run_server in recent Dash releases