Lots of people, myself included, have concerns about using LLMs because they are so resource intensive. Today, I figured I would test AI's own awareness of its water needs and compare its answers to current estimates. I'm going to run the same prompt across models with different knowledge cutoffs to see how the responses vary. New work on the water use of generative AI may have been published after some of these models were trained, so I'm hoping we will see some cool variation.

## The test 

### Part 1: gpt-4o 

In [1]:
from openai import OpenAI
import os
import pandas as pd

from IPython.display import Markdown, display

from dotenv import load_dotenv
# Load API key
_ = load_dotenv()

In [4]:
client = OpenAI()

model = "gpt-4o"
system_role = "You are a reporter for The Economist covering environmental economics. Give your responses in Markdown format."
initial_prompt = '''First, state your knowledge cutoff. Then, write 1 paragraph explaining the environmental impact of each LLM query. Focus on the use 
per single query. Compare to other water intensive activities such as brushing your teeth to give context. Be sure to 
use the most recent scientific estimates and cite your sources.'''

messages = [
    {"role": "system", "content": system_role} ,
    {"role": "user", "content": initial_prompt},
    # {"role": "assistant", "content": "" },
    # {"role": "user", "content": "" }
]

completion = client.chat.completions.create(
  model=model,        # the model to use 
  messages=messages,  # a list of dictionaries with the prompt
  n=3,                # how many responses to generate
  temperature=0.7     # temperature parameter
)

response = completion.choices[0].message.content


In [5]:
for rnum, resp in enumerate(completion.choices,1): 
    display(Markdown(f'-----\n#### Response {rnum} \n {resp.message.content} \n'))

---
#### Response 1 
 **Knowledge Cutoff:** October 2023

The environmental impact of a single query to a large language model (LLM) involves energy consumption and, indirectly, water usage due to the cooling requirements of data centers. According to a paper published in 2020, the carbon footprint of an LLM query is relatively low, often comparable to boiling a kettle of water. However, more recent analyses have not focused explicitly on water usage per query. Instead, they highlight that the energy required for training and operating these models indirectly contributes to water consumption since cooling systems often use water to dissipate heat generated by servers. To provide context, brushing your teeth with the tap running for two minutes can waste about 8 gallons (30 liters) of water, while a single Google search is estimated to consume a fraction of a teaspoon of water (approximately 0.3 ounces), primarily due to the energy-water nexus in data centers (Source: Google Sustainability Report 2021). Although the resource demand per query is minuscule compared to everyday activities like teeth brushing, the cumulative effect of millions of queries can add up, underscoring the importance of optimizing data center efficiency to minimize environmental impacts. 


---
#### Response 2 
 **Knowledge Cutoff:** October 2023

The environmental impact of a single query to a large language model (LLM) primarily involves the electricity required to power the data centers where these models are hosted. This electricity consumption indirectly leads to greenhouse gas emissions, depending on the energy sources used. A study by Strubell et al. (2019) estimated that training large AI models can emit as much CO2 as five cars over their lifespans. However, the carbon footprint of a single query is much smaller—approximately equivalent to boiling a kettle of water or turning on an LED light for a few hours, according to estimates by Patterson et al. (2021). In terms of water usage, the water footprint is primarily linked to the cooling requirements of data centers. A report by Google's sustainability team noted that a comparable activity, such as brushing your teeth with the tap running, uses about 2 gallons of water per minute. While precise water usage per query can vary, it's typically less than a fraction of a gallon, making it considerably less intensive than leaving the tap on. These comparisons highlight the relative efficiency of LLM queries in terms of water usage, though the cumulative environmental impact grows with the scale of AI usage. Sources: Strubell et al. 2019, Patterson et al. 2021, Google's sustainability reports. 


---
#### Response 3 
 **Knowledge Cutoff:** October 2023

The environmental impact of a single Large Language Model (LLM) query is relatively modest, but it is not negligible, particularly when considering energy consumption and associated water usage. According to recent estimates, a single query can consume about 0.0003 kWh of electricity, which translates to a water footprint of approximately 0.3 liters, assuming typical cooling and power generation methods [source: Patterson et al., 2021]. To put this in perspective, brushing your teeth with the tap running uses about 8 liters of water per minute, meaning a single LLM query uses roughly 1/25th of the water required for one minute of tooth brushing. While the environmental impact of individual queries may appear minimal, the cumulative effect becomes significant given the billions of queries processed daily. Thus, optimizing the energy efficiency of LLMs remains crucial to mitigating their environmental footprint. 


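Wow! These estimates vary a lot. Response 3 puts a single query at about 0.3 liters, Response 1 only offers a figure for a Google search (about 0.3 ounces, or roughly 9 milliliters), and Response 2 gives no number beyond "less than a fraction of a gallon".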
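The gpt-4o responses mix ounces, gallons, and liters, so it helps to put their figures on one scale before comparing them. Here is a minimal sketch that does this with standard US conversion factors. The `to_liters` helper and the dictionary labels are hypothetical shorthand for the numbers quoted in the responses above, not part of the original notebook.

```python
# Standard US customary conversion factors to liters.
LITERS_PER_US_FL_OZ = 0.0295735
LITERS_PER_US_GALLON = 3.78541


def to_liters(amount, unit):
    """Convert an amount given in 'l', 'ml', 'fl_oz', or 'gal' to liters."""
    factors = {
        "l": 1.0,
        "ml": 0.001,
        "fl_oz": LITERS_PER_US_FL_OZ,
        "gal": LITERS_PER_US_GALLON,
    }
    return amount * factors[unit]


# Figures quoted in the gpt-4o responses (labels are shorthand, not model output).
quoted = {
    "Response 1: one Google search": (0.3, "fl_oz"),
    "Response 1: two minutes of tooth brushing": (8, "gal"),
    "Response 3: one LLM query": (0.3, "l"),
}

for label, (amount, unit) in quoted.items():
    print(f"{label}: {to_liters(amount, unit):.4f} L")
```

On one scale, the only per-query LLM figure (Response 3) is more than thirty times larger than the Google search figure in Response 1.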

### Part 2: gpt-3.5-turbo 

In [6]:
model = "gpt-3.5-turbo"

messages = [
    {"role": "system", "content": system_role} ,
    {"role": "user", "content": initial_prompt},
    # {"role": "assistant", "content": "" },
    # {"role": "user", "content": "" }
]



completion = client.chat.completions.create(
  model=model,        # the model to use 
  messages=messages,  # a list of dictionaries with the prompt
  n=3,                # how many responses to generate
  temperature=0.7     # temperature parameter
)

response = completion.choices[0].message.content


In [7]:
for rnum, resp in enumerate(completion.choices,1): 
    display(Markdown(f'-----\n#### Response {rnum} \n {resp.message.content} \n'))

---
#### Response 1 
 My knowledge cutoff is current as of October 2021.

### Query 1: "How much water does it take to produce one almond?"

Producing one almond requires approximately 1.1 gallons (4.16 liters) of water. Almond cultivation has been criticized for its high water usage, particularly in water-stressed regions like California, where a significant portion of the world's almonds are grown. The intensive irrigation needed for almond trees contributes to groundwater depletion and strains local water resources. To put this into context, the water used to produce one almond is roughly equivalent to the amount of water a person uses when brushing their teeth for two minutes with the tap running. (Source: Water Footprint Network)

### Query 2: "How much water is used in the production of one pound of beef?"

Producing one pound (0.45 kg) of beef requires around 1,800 gallons (6,814 liters) of water, taking into account the water needed to grow animal feed, provide drinking water for the livestock, and process the meat. Beef production is known to be one of the most water-intensive agricultural activities, contributing significantly to water scarcity and pollution. To put this in perspective, the water footprint of one pound of beef is roughly equivalent to the amount of water a person would use if they left the tap running while brushing their teeth for over 8 hours. (Source: Water Education Foundation)

### Query 3: "How much water is used in the production of one cotton t-shirt?"

The production of one cotton t-shirt requires approximately 713 gallons (2,699 liters) of water, factoring in the water used to cultivate the cotton crop, process the fibers, and dye the fabric. Cotton farming is notorious for its water-intensive nature, with significant amounts of water needed for irrigation. To provide context, the water footprint of one cotton t-shirt is roughly equivalent to the amount of water a person would use if they brushed their teeth with the tap running continuously for about 9 hours. (Source: World Wildlife Fund)

These examples highlight the substantial water footprint of everyday products, emphasizing the importance of sustainable consumption choices to mitigate the environmental impact of water-intensive activities. 


---
#### Response 2 
 My knowledge cutoff is current up to September 2021.

1. Environmental impact of producing a single pair of jeans:
The production of a single pair of jeans has a significant environmental impact, particularly in terms of water usage. On average, it is estimated that it takes around 7,500 liters of water to produce one pair of jeans, from growing the cotton to manufacturing the final product. This water footprint is mainly attributed to the cultivation of cotton, which is a water-intensive crop. To put this into perspective, the water used to produce one pair of jeans is equivalent to the amount of water an average person would drink over a period of seven years. This level of water consumption for a single item of clothing highlights the strain that the fashion industry places on global water resources, especially in regions already facing water scarcity issues. To compare, the water used to produce one pair of jeans is equivalent to the amount of water an average person would use to brush their teeth for over 5 years.

Sources:
- Mekonnen, M.M. and Hoekstra, A.Y., 2010. The green, blue and grey water footprint of crops and derived crop products. Hydrology and Earth System Sciences, 15(5), pp.1577-1600.
- Water Footprint Network. "Water footprint assessment manual: Setting the global standard." London and Enschede, The Water Footprint Network, 2011.

2. Environmental impact of driving a car for 1 hour:
Driving a car for one hour also has a notable environmental impact, particularly in terms of carbon emissions and air pollution. The exact environmental impact can vary depending on factors such as the type of vehicle, fuel efficiency, and driving conditions. On average, driving a car emits around 2.3 kg of CO2 per liter of gasoline burned. For a typical car that gets 25 miles per gallon, driving for one hour would result in emissions of approximately 9.1 kg of CO2. This contributes to the accumulation of greenhouse gases in the atmosphere, leading to climate change and its associated impacts. To put this into context, the emissions from driving a car for one hour are roughly equivalent to the carbon footprint of using a laptop for over a year.

Sources:
- United States Environmental Protection Agency (EPA). "Greenhouse Gas Emissions from a Typical Passenger Vehicle." EPA, 2021.
- Intergovernmental Panel on Climate Change (IPCC). "Climate Change 2021: The Physical Science Basis." IPCC, 2021. 


---
#### Response 3 
 My knowledge cutoff is current as of September 2021.

1. **LLM query: "How much water is used to produce one pound of beef?"**
   Producing one pound of beef requires a significant amount of water, with estimates varying depending on factors such as the production system and location. On average, it is estimated that around 1,800 gallons of water are needed to produce one pound of beef. This includes water used for growing animal feed, watering the animals, and processing the meat. To put this into context, the water footprint of beef production is much higher compared to other water-intensive activities. For instance, the amount of water used to produce one pound of beef is roughly equivalent to the amount of water used when a person brushes their teeth for about six months. This highlights the substantial water footprint associated with beef consumption and the importance of considering more sustainable protein sources to reduce water usage in food production.

2. **LLM query: "How much water is used to produce one pair of jeans?"**
   The production of one pair of jeans also has a considerable water footprint, primarily due to the water-intensive processes involved in growing cotton and manufacturing denim fabric. On average, it is estimated that around 1,800 gallons of water are required to produce one pair of jeans. This includes water used for irrigating cotton crops, processing the fibers, dyeing the fabric, and finishing the garment. To provide context, the water used to make one pair of jeans is equivalent to the amount of water an individual might use for showering daily for around 10 years. This comparison underscores the significant water consumption associated with the fashion industry and the importance of promoting water-efficient practices and sustainable materials in clothing production to mitigate environmental impact.

3. **LLM query: "How much water is used to produce one kilowatt-hour of electricity?"**
   The water usage for electricity generation varies depending on the energy source, with certain sources being more water-intensive than others. On average, it is estimated that generating one kilowatt-hour (kWh) of electricity in the United States requires around 25 gallons of water. This includes water used for cooling thermoelectric power plants, mining and processing fuel sources, and hydropower generation. To put this in perspective, the water required to produce one kWh of electricity is roughly equivalent to the amount of water used by an individual for cooking meals daily for about 2 weeks. This comparison highlights the water intensity of electricity generation and emphasizes the importance of transitioning towards renewable energy sources that have lower water footprints, such as solar and wind power, to reduce water consumption and environmental strain. 


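Wow! These responses didn't even understand the question. Instead of estimating the impact of an LLM query, gpt-3.5-turbo invented queries of its own and answered those, covering almonds, beef, jeans, and driving.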

### Part 3: o3-mini 

In [8]:
model = "o3-mini"

messages = [
    {"role": "system", "content": system_role} ,
    {"role": "user", "content": initial_prompt},
    # {"role": "assistant", "content": "" },
    # {"role": "user", "content": "" }
]

client = OpenAI()

completion = client.chat.completions.create(
  model=model,        # the model to use 
  messages=messages,  # a list of dictionaries with the prompt
  n=3,                # how many responses to generate (no temperature is set for this model)
)

response = completion.choices[0].message.content


In [9]:
for rnum, resp in enumerate(completion.choices,1): 
    display(Markdown(f'-----\n#### Response {rnum} \n {resp.message.content} \n'))

---
#### Response 1 
 **My knowledge cutoff is October 2023.**

Recent estimates suggest that a single query to a large language model (LLM) consumes roughly 0.02–0.1 kWh of electricity when one accounts for computing and the associated cooling in data centers. Converting this energy use into the water footprint—given that modern thermoelectric power plants and data centers typically withdraw around 1.5–2 liters of water per kWh for cooling purposes (U.S. Geological Survey data as referenced by the U.S. Department of Energy)—an individual query may indirectly “use” approximately 0.03–0.2 liters of water. In comparison, brushing your teeth can use anywhere from 2 to 4 liters of water, particularly if the tap is left running during the process (EPA WaterSense guidelines provide comparable estimates). Although the carbon emissions per query are estimated to be on the order of a few grams of CO₂ equivalent (see Strubell et al., 2019, "Energy and Policy Considerations for Deep Learning in NLP"), it is noteworthy that even these seemingly small amounts may aggregate significantly with high volume usage. This comparison underscores that while each LLM query has a relatively low environmental impact on a per-query basis—far less water intensive than everyday activities like toothbrushing—the cumulative effects warrant ongoing efforts toward improved energy efficiency and sustainable infrastructure in AI deployments.  

_Source: Strubell, E., Ganesh, A. and McCallum, A. (2019); U.S. Geological Survey; EPA WaterSense._ 


---
#### Response 2 
 **Knowledge Cutoff:** October 2023

Large language model (LLM) queries, while computing-intensive when aggregated, have an almost negligible environmental impact on a per-query basis compared to everyday water-intensive activities such as brushing your teeth. Recent studies suggest that a single query may consume roughly 0.001–0.01 kWh of energy depending on system optimization and data center efficiency [1]. When considering that many power plants use water for cooling, this energy use translates indirectly to only a few milliliters of water per query—a minuscule amount compared to the roughly 2 liters of water typically wasted during a two-minute toothbrushing session if the tap runs continuously [2]. Thus, while the cumulative impact of millions of queries can become significant and warrants ongoing improvements in energy sourcing and data center cooling technologies, an individual LLM query remains environmentally minor when compared with other daily water uses.

**Sources:**

1. Strubell, E., Ganesh, A., & McCallum, A. (2019). *Energy and Policy Considerations for Deep Learning in NLP*. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.  
2. U.S. Environmental Protection Agency. (n.d.). *WaterSense: How to Save Water at Home*. Retrieved from [EPA WaterSense](https://www.epa.gov/watersense/how-save-water) 


---
#### Response 3 
 **Knowledge Cutoff:** October 2023

Recent studies indicate that the water footprint of a single LLM query is extremely minimal compared to everyday household activities. For instance, well-optimized inference operations may consume only a few hundredths of a watt-hour of energy, which translates into less than one milliliter of water used for data center cooling per query—especially given efficiency improvements in modern data centers (Masanet et al., 2020; Strubell et al., 2019). In stark contrast, brushing your teeth with the tap running for just two minutes can use roughly 2–5 liters of water (U.S. Environmental Protection Agency data), making the relative water intensity of an LLM query almost negligible. While the broader environmental concerns regarding LLMs tend to focus on the carbon emissions associated with training large models rather than the tiny water footprint during inference, these numbers help put into perspective that the direct water consumption per LLM query is far less impactful than common water-intensive practices.  

**Sources:**  
- Masanet, E., Shehabi, A., Ramakrishnan, R., & Ellis, P. (2020). "Recalibrating global data center energy-use estimates." *Science*, 367(6481), 984–986.  
- Strubell, E., Ganesh, A., & McCallum, A. (2019). "Energy and Policy Considerations for Deep Learning in NLP." *Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics*.  
- U.S. Environmental Protection Agency, WaterSense Program. 


Interesting how these primarily cite the EPA and papers from 2019 and 2020, several years before the stated October 2023 cutoff.
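The three parts above repeat the same request and change only the model, with the o3-mini cell also dropping `temperature`. A minimal sketch of how that could be collapsed into one loop follows. `build_request` and `NO_TEMPERATURE_MODELS` are hypothetical names, the two strings passed in are placeholders for `system_role` and `initial_prompt`, and the assumption that o3-mini should be called without a temperature simply mirrors the cell above.

```python
# Assumption: models listed here are called without a temperature,
# mirroring how the o3-mini cell omits it.
NO_TEMPERATURE_MODELS = {"o3-mini"}


def build_request(model, system_role, prompt, n=3, temperature=0.7):
    """Build keyword arguments for client.chat.completions.create()."""
    kwargs = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_role},
            {"role": "user", "content": prompt},
        ],
        "n": n,
    }
    if model not in NO_TEMPERATURE_MODELS:
        kwargs["temperature"] = temperature
    return kwargs


for model in ["gpt-4o", "gpt-3.5-turbo", "o3-mini"]:
    # Placeholder strings stand in for system_role and initial_prompt.
    request = build_request(model, "system role text", "prompt text")
    print(model, sorted(request))
    # With a configured client, the call itself would be:
    # completion = client.chat.completions.create(**request)
```

Keeping the request construction separate from the API call makes it easy to check that every model receives an identical prompt before spending any tokens.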

### So what's the *correct* answer? 
There is none, and that's part of why I thought this exercise was fun. 

One [article](https://e360.yale.edu/features/artificial-intelligence-climate-energy-emissions) from Yale Environment 360 in February 2024 reports that "a person who engages in a session of questions and answers with GPT-3 (roughly 10 to 50 responses) drives the consumption of a half-liter of fresh water".

If we do the math on this one estimate, each query works out to somewhere between 0.01 liters (0.5 L over 50 responses) and 0.05 liters (0.5 L over 10 responses). That figure is for GPT-3, though, and it may vary a lot for other LLMs. The models with later knowledge cutoffs (gpt-4o and o3-mini) at least answered the question that was asked, even though their numbers ranged from under a milliliter to 0.3 liters.
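The per-query arithmetic can be written out as a short sketch. It assumes the half-liter in the quoted estimate is spread evenly across the 10 to 50 responses in a session, which is a simplification of mine. The function name `per_query_range` is hypothetical.

```python
# Figures from the quoted estimate: half a liter per session of 10 to 50 responses.
SESSION_WATER_LITERS = 0.5
MIN_RESPONSES = 10
MAX_RESPONSES = 50


def per_query_range(session_liters, min_responses, max_responses):
    """Return (low, high) liters per query, assuming water is split evenly."""
    low = session_liters / max_responses   # more responses, less water each
    high = session_liters / min_responses  # fewer responses, more water each
    return low, high


low, high = per_query_range(SESSION_WATER_LITERS, MIN_RESPONSES, MAX_RESPONSES)
print(f"{low:.2f} to {high:.2f} liters per query")
print(f"{low * 1000:.0f} to {high * 1000:.0f} mL per query")
```

In milliliters that is 10 to 50 mL per query, which makes it easier to compare against the "few milliliters" and "less than one milliliter" figures in the o3-mini responses.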