Lots of people, myself included, have concerns about using LLMs because they are so resource-intensive. Today, I'm testing how aware AI models are of their own water use and comparing their answers to current estimates. I'll run the same prompt on models with different knowledge cutoffs to see how the responses vary. New research on the water use of generative AI may have been published since these models were trained, so I'm hoping we will see some cool variation.

### The test ###

#### Part 1: gpt-4o ####

In [57]:
from openai import OpenAI
import os
import pandas as pd

from dotenv import load_dotenv
# Load API key
_ = load_dotenv()

In [52]:
model = "gpt-4o"
system_role = "You are a reporter for The Economist covering environmental economics"
initial_prompt = '''First, state your knowledge cutoff. Then, write 1 paragraph explaining the environmental impact of each LLM query. Focus on the use 
per single query. Compare to other water intensive activities such as brushing your teeth to give context. Be sure to 
use the most recent scientific estimates and cite your sources.'''

messages = [
    {"role": "system", "content": system_role} ,
    {"role": "user", "content": initial_prompt},
    # {"role": "assistant", "content": "" },
    # {"role": "user", "content": "" }
]

In [54]:
client = OpenAI()

completion = client.chat.completions.create(
  model=model,        # the model to use 
  messages=messages,  # a list of dictionaries with the prompt
  n=3,                # how many responses to generate
  temperature=0.7     # temperature parameter
)

response = completion.choices[0].message.content

print(messages)

[{'role': 'system', 'content': 'You are a reporter for The Economist covering environmental economics'}, {'role': 'user', 'content': 'First, state your knowledge cutoff. Then, write 1 paragraph explaining the environmental impact of each LLM query. Focus on the use \nper single query. Compare to other water intensive activities such as brushing your teeth to give context. Be sure to \nuse the most recent scientific estimates and cite your sources.'}]


In [55]:
for resp in completion.choices: 
    print(resp.message.content)
    print('---')

My knowledge cutoff date is October 2023. 

The environmental impact of each large language model (LLM) query, specifically in terms of water consumption, is relatively modest when compared to traditional water-intensive activities. Recent estimates suggest that a single LLM query consumes about 0.3 liters of water, which is used primarily for cooling the data centers where the computations occur (Patterson et al., 2022). To put this into perspective, this is significantly less than the water usage involved in brushing your teeth, which can use up to 8 liters if the tap is left running (EPA, 2023). While the water use per LLM query is comparatively low, the cumulative impact can become significant given the scale at which these models are deployed across the globe. This highlights the importance of optimizing data center cooling systems and improving the energy efficiency of LLMs to mitigate their environmental footprint. 

References:
- Patterson, D. et al. (2022). 'Carbon Emissions a

Wow! The estimates across the three responses vary a lot, from 0.5 liters down to 0.0005 liters, a factor of 1,000.
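
To compare figures like these across responses, one option is to pull every "N liters" mention out of the text. Here is a rough sketch, assuming the estimates are written as a plain number followed by "liters" or "litres". `extract_liters` and `LITER_PATTERN` are names I made up for illustration; they are not part of the notebook above.

```python
import re

# Assumption: an estimate looks like "<number> liters" (or "litres").
# Ranges such as "2-4 liters" only yield the last number, and other units
# (gallons, milliliters) are ignored, so treat the result as a first pass.
LITER_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(?:liters?|litres?)\b", re.IGNORECASE)


def extract_liters(text):
    """Return every 'N liters' figure mentioned in text, as floats."""
    return [float(number) for number in LITER_PATTERN.findall(text)]


# Excerpt from the gpt-4o response shown above.
sample = (
    "a single LLM query consumes about 0.3 liters of water, "
    "which can use up to 8 liters if the tap is left running"
)
print(extract_liters(sample))  # [0.3, 8.0]
```

Running this over `resp.message.content` for each item in `completion.choices` would list the numbers each response mentions, though it cannot tell a per-query estimate apart from a toothbrushing comparison, so the output still needs a manual read.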

#### Part 2: gpt-3.5-turbo ####

In [56]:
model = "gpt-3.5-turbo"
system_role = "You are a reporter for The Economist covering environmental economics"
initial_prompt = '''First, state your knowledge cutoff. Then, write 1 paragraph explaining the environmental impact of each LLM query. Focus on the use 
per single query. Compare to other water intensive activities such as brushing your teeth to give context. Be sure to 
use the most recent scientific estimates and cite your sources.'''

messages = [
    {"role": "system", "content": system_role} ,
    {"role": "user", "content": initial_prompt},
    # {"role": "assistant", "content": "" },
    # {"role": "user", "content": "" }
]

client = OpenAI()

completion = client.chat.completions.create(
  model=model,        # the model to use 
  messages=messages,  # a list of dictionaries with the prompt
  n=3,                # how many responses to generate
  temperature=0.7     # temperature parameter
)

response = completion.choices[0].message.content

print(messages)

[{'role': 'system', 'content': 'You are a reporter for The Economist covering environmental economics'}, {'role': 'user', 'content': 'First, state your knowledge cutoff. Then, write 1 paragraph explaining the environmental impact of each LLM query. Focus on the use \nper single query. Compare to other water intensive activities such as brushing your teeth to give context. Be sure to \nuse the most recent scientific estimates and cite your sources.'}]


In [58]:
for resp in completion.choices: 
    print(resp.message.content)
    print('---')

My knowledge cutoff date is September 2021. 

1. "Environmental impact of almond farming": The environmental impact of almond farming is significant due to its high water consumption. It is estimated that it takes around 1.1 gallons of water to produce a single almond, making almond farming one of the most water-intensive agricultural activities. To put this into context, the water used to produce a single almond is equivalent to the water used for brushing your teeth for two weeks. This high water consumption has raised concerns about the sustainability of almond farming, especially in water-stressed regions like California. (Source: University of California, Davis)

2. "Environmental impact of beef production": Beef production has a considerable environmental impact, primarily due to its high water usage and greenhouse gas emissions. It is estimated that producing a single pound of beef requires around 1,800 gallons of water, making it one of the most water-intensive foods. To put th

Wow! These responses didn't really even understand the question: they discuss almond farming and beef production instead of LLM queries.

#### Part 3: o3-mini ####

In [64]:
model = "o3-mini"
system_role = "You are a reporter for The Economist covering environmental economics"
initial_prompt = '''First, state your knowledge cutoff. Then, write 1 paragraph explaining the environmental impact of each LLM query. Focus on the use 
per single query. Compare to other water intensive activities such as brushing your teeth to give context. Be sure to 
use the most recent scientific estimates and cite your sources.'''

messages = [
    {"role": "system", "content": system_role} ,
    {"role": "user", "content": initial_prompt},
    # {"role": "assistant", "content": "" },
    # {"role": "user", "content": "" }
]

client = OpenAI()

completion = client.chat.completions.create(
  model=model,        # the model to use 
  messages=messages,  # a list of dictionaries with the prompt
  n=3,                # how many responses to generate
)

response = completion.choices[0].message.content


In [62]:
for resp in completion.choices: 
    print(resp.message.content)
    print('---')

My knowledge cutoff is October 2023. Recent estimates suggest that the environmental footprint of a single query to a large language model (LLM) is extremely small—in the order of a few watt‐hours of electricity, which, through the necessary cooling processes in modern data centers, might translate into perhaps a few hundredths of a liter of water consumed per query. To put this into perspective, the water used in powering and cooling the servers that handle such a query would be several orders of magnitude lower than the 2–4 liters typically wasted during a two‐minute session of brushing one’s teeth under a running tap (U.S. Environmental Protection Agency, 2022). The rigorous energy and water management protocols now adopted by major cloud providers mean that each LLM query carries a lighter environmental toll than other daily water-intensive activities; indeed, while thousands of queries may collectively contribute to a measurable environmental demand, an individual query’s impact r

Interesting how these cite primarily the EPA and papers from many years ago. 
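
The three parts above repeat the same cell with only the model name changed (and no `temperature` for o3-mini). A minimal sketch of folding that into one loop follows. `build_request`, `run_models`, and `MODELS_WITHOUT_TEMPERATURE` are hypothetical helpers of my own, and the only API call used is the same `client.chat.completions.create` as in the cells above.

```python
# Assumption: o3-mini is called without a temperature argument,
# mirroring the Part 3 cell above.
MODELS_WITHOUT_TEMPERATURE = {"o3-mini"}


def build_request(model, messages, n=3, temperature=0.7):
    """Assemble keyword arguments for client.chat.completions.create."""
    kwargs = {"model": model, "messages": messages, "n": n}
    if model not in MODELS_WITHOUT_TEMPERATURE:
        kwargs["temperature"] = temperature
    return kwargs


def run_models(client, models, messages):
    """Send the same messages to each model; return {model: [response texts]}."""
    results = {}
    for model in models:
        completion = client.chat.completions.create(**build_request(model, messages))
        results[model] = [choice.message.content for choice in completion.choices]
    return results
```

With `client = OpenAI()` and `messages` defined as in Part 1, `run_models(client, ["gpt-4o", "gpt-3.5-turbo", "o3-mini"], messages)` would collect all nine responses in one dictionary.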

### So what's the *correct* answer? ###
There is none, and that's part of why I thought this exercise was fun. 

One [article](https://e360.yale.edu/features/artificial-intelligence-climate-energy-emissions) from Yale Environment 360 in February 2024 suggests that "a person who engages in a session of questions and answers with GPT-3 (roughly 10 to 50 responses) drives the consumption of a half-liter of fresh water".

So if we do the math based on this one estimate, each query uses somewhere between 0.01 and 0.05 liters of water (half a liter divided by 50 and by 10 responses). But this may vary a lot for other LLMs, too. It seems that the models with later knowledge cutoffs (gpt-4o and o3-mini) were much better at understanding the question.
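
For anyone who wants to check the per-query arithmetic, here is a quick sketch. The half-liter figure and the 10 to 50 range come straight from the quote; the variable names are mine.

```python
session_liters = 0.5              # "a half-liter of fresh water" per session
responses_per_session = (10, 50)  # "roughly 10 to 50 responses"

# Divide the session total by each end of the range.
per_query = [session_liters / count for count in responses_per_session]
low, high = min(per_query), max(per_query)

print(f"{low:.2f} to {high:.2f} liters per query")
# 0.01 to 0.05 liters per query
```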