<h1>ECON 2 Problem Set 1</h1>

<h2>Gas taxes and gas prices in the EU</h2>

(Many thanks to Prof. Emmanuel Saez at UC Berkeley for developing this problem.)

Most governments levy and collect taxes on gasoline, which remains the primary fuel for automobiles. But different governments levy different levels of a gas tax. How does the empirical relationship between gas taxes and gas prices across different geographies with different governments compare with the theoretical relationship that we expect to see?

We usually think of a <em>sales tax</em>, such as a gas tax, as a proportional tax, like the 10.25% sales tax in Berkeley, CA. But it is also useful to compare the currency value of the tax per unit with the currency value of the market price per unit.

<h2>Data sources</h2>

I retrieved [data on gas prices in the EU](https://www.mappr.co/thematic-maps/fuel-prices-europe/) and [data on gas taxes](https://taxfoundation.org/data/all/eu/diesel-gas-taxes-europe/) on September 5, 2025.

<h2>Google Sheets</h2>

Google Sheets is fantastic. In ECON 2, you can use Google Sheets only and skip the __R__ code below if you like. Or you can press ahead and try something that might be new and unfamiliar but is much more powerful. You will use something like this in ECON 140 and possibly in STAT 20 or DATA 8.

Run the code below to load up some useful libraries in __R__:

In [None]:
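# Install googlesheets4, a package that lets us read Google files like Sheets,
# only if it is not already installed, and then load it
if (!requireNamespace("googlesheets4", quietly = TRUE)) {
  install.packages("googlesheets4")
}
library(googlesheets4)

# This call allows the notebook to skip Google authorization, to access publicly
# viewable files
gs4_deauth()

# Let's also install (if needed) and load up a handy graphing library
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)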

The dataset sits in this Google Sheets file: [EU gas taxes and prices](https://docs.google.com/spreadsheets/d/1GRRoQAsHtWYZ3vhyPfx5SYsZd-o3JnaHkJGiPn_QPYU/edit?gid=0#gid=0), which you can access directly. To edit, make a copy of the file into your own @Berkeley Google account, which you can access by navigating to [https://drive.google.com](https://drive.google.com) after authenticating with your @Berkeley account.

In [None]:
# Define a string as the URL for the public sheet
sheet_url <- "https://docs.google.com/spreadsheets/d/1GRRoQAsHtWYZ3vhyPfx5SYsZd-o3JnaHkJGiPn_QPYU/edit?gid=0#gid=0"

# This call to read_sheet creates a data frame called "eu_gas_tp" containing 
# data from the range shown
eu_gas_tp <- read_sheet(sheet_url,
                        range = "A1:F28")

The code below displays the data frame, which contains 27 observations on country-level gas taxes and gas prices. For both taxes and prices we have measures in euros per liter and also in dollars per gallon, which might be more familiar to U.S. students.

In [None]:
eu_gas_tp
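
If you are curious how a euros-per-liter figure turns into a dollars-per-gallon figure, here is a minimal sketch. The liters-per-gallon number is the standard definition of the U.S. liquid gallon, but `usd_per_eur` is a hypothetical exchange rate picked for illustration, since this notebook does not say which rate was used to build the spreadsheet. The function name is also made up for this example.

```r
# Convert a value in euros per liter to U.S. dollars per gallon.
liters_per_gallon <- 3.785411784  # definition of the U.S. liquid gallon

# HYPOTHETICAL exchange rate (dollars per euro), assumed for illustration only;
# it is not taken from the dataset
usd_per_eur <- 1.10

eur_per_liter_to_usd_per_gallon <- function(eur_per_liter, rate = usd_per_eur) {
  # euros/liter * liters/gallon * dollars/euro = dollars/gallon
  eur_per_liter * liters_per_gallon * rate
}

# Example with a made-up price of 1.50 euros per liter
eur_per_liter_to_usd_per_gallon(1.50)
```

Because the function is vectorized, you could also hand it a whole column, such as `eu_gas_tp$gasprice_el`, once the data frame is loaded.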

<h2>Pretty graphs vs. pretty code</h2>

One of the challenges is that creating pretty graphics can be complicated, code-wise. Let's start with an ugly graphic that is simple. Here is all the code we really need:

In [None]:
plot(eu_gas_tp$gastax_el, eu_gas_tp$gasprice_el)

In the code above, we're asking __R__ to make a scatterplot with the `gastax_el` column as the X-variable and the `gasprice_el` column as the Y-variable. 

Simple. A little ugly.

Here's some more complicated code that graphs things a little more elegantly using the `ggplot2` package:

In [None]:
# Copied and pasted and altered from Gemini, thanks:
#
# We can use ggplot() to create a pretty scatterplot
# 1. The first argument is the dataframe.
# 2. Inside the aes() (aesthetics) function, we map our variables:
#    - x = gastax_el:    maps the 'gastax_el' column to the x-axis.
#    - y = gasprice_el:  maps the 'gasprice_el' column to the y-axis.
# 3. We then add (+) the geom_point() layer to tell ggplot we want a scatterplot.

scatterplot <- ggplot(eu_gas_tp, aes(x = gastax_el, y = gasprice_el)) +
  geom_point() +
  labs(
    title = "The gas price as a function of the gas tax in the EU, 2025",
    x = "Gas tax in euros per liter",
    y = "Gas price in euros per liter"
  ) +
  theme_minimal() # Applying a clean theme for better readability

print(scatterplot)

Voilà. Go ahead and play around with this if you like. You could ask Gemini or ChatGPT how to make it prettier, or ask about the `geom_smooth()` layer, which cuts to the chase by adding a trendline.
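
Here is a rough sketch of what adding a `geom_smooth()` trendline could look like. So that the cell runs on its own, it uses a tiny data frame called `toy` whose numbers are made up (they are not real EU data) but whose column names match `eu_gas_tp`.

```r
library(ggplot2)

# HYPOTHETICAL toy data, invented for illustration; not real EU figures
toy <- data.frame(
  gastax_el   = c(0.40, 0.50, 0.60, 0.70, 0.80),
  gasprice_el = c(1.35, 1.48, 1.52, 1.66, 1.79)
)

trendplot <- ggplot(toy, aes(x = gastax_el, y = gasprice_el)) +
  geom_point() +
  # method = "lm" draws a straight best-fit line; se = FALSE hides the
  # shaded confidence band around it
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(
    x = "Gas tax in euros per liter",
    y = "Gas price in euros per liter"
  ) +
  theme_minimal()

print(trendplot)
```

To try it on the real data, swap `toy` for `eu_gas_tp` after running the cells above.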

You could run the code below and take a time machine into ECON 140 to interpret it. This code runs the <em>linear regression</em> of the Y variable shown in the graph on the X variable, and it tells us the estimate of the Y-intercept and the linear slope coefficient on the X variable, `gastax_el`.

(No, this will not be on the ECON 2 exams. It will be on the ECON 140 exams.)

In [None]:
reg1 <- lm(gasprice_el ~ gastax_el,
           data = eu_gas_tp)
summary(reg1)
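
If the full `summary()` printout feels like a lot, here is a small sketch of how you might pull out just the two estimates. It fits the same kind of regression on a hypothetical toy data frame (made-up numbers, same column names as `eu_gas_tp`) so that it runs by itself. The slope is the estimated change in the gas price, in euros per liter, associated with a gas tax that is one euro per liter higher.

```r
# HYPOTHETICAL toy data, invented for illustration; not real EU figures
toy <- data.frame(
  gastax_el   = c(0.40, 0.50, 0.60, 0.70, 0.80),
  gasprice_el = c(1.35, 1.48, 1.52, 1.66, 1.79)
)

toy_reg <- lm(gasprice_el ~ gastax_el, data = toy)

# coef() returns a named vector holding the Y-intercept and the slope
estimates <- coef(toy_reg)
estimates["(Intercept)"]  # estimated Y-intercept
estimates["gastax_el"]    # estimated slope on the gas tax

# Predicted price at a made-up tax of 0.65 euros per liter
predicted <- predict(toy_reg, newdata = data.frame(gastax_el = 0.65))
predicted
```

With the real data loaded, `coef(reg1)` gives the same two numbers that appear in the `Estimate` column of `summary(reg1)`.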

<div style="text-align: right"> <span style="font-family:Papyrus; ">And they lived happily ever after. The End.</span></div>