### Rewriting EKE code to run more effectively on NERSC

#### Fill in the project name and existing code location.

In [1]:
project_name = "EKE"
existing_code_location = "EKE-example-original"

#### Set up the AI

In [2]:
import script
import gpt_engineer.steps as steps
from gpt_engineer.ai import AI

ai, dbs = script.set_up(project_name, existing_code_location)

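def do_step(step):
    """Run one step with the shared AI and databases, logging any messages."""
    messages = step(ai, dbs)
    if messages:
        # Store the serialized conversation under the step's function name.
        dbs.logs[step.__name__] = AI.serialize_messages(messages)
    return messages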

The following location will be used for processing
The code will be output to the workspace directory of that location
/Users/kberket/src/scalesci-demo/from_existing_code_eng/EKE
Model gpt-4 not available for provided API key. Reverting to gpt-3.5-turbo-16k. Sign up for the GPT-4 wait list here: https://openai.com/waitlist/gpt-4-api



#### Let's do it

In [3]:
do_step(script.get_summary)

The code calculates the eddy kinetic energy (EKE) and plots the meridional distribution. It reads in WRF data for a given time period and filters the u and v variables for waves with periods between 3-5 days. It then calculates the square of the filtered u and v variables, averages them over time, and calculates the zonal average. The EKE is then multiplied by 0.5 and divided by the acceleration due to gravity. The meridional average is calculated and integrated over the pressure dimension. Finally, the total EKE and the averaged EKE are plotted and saved as a PDF file.

[SystemMessage(content='You are an expert in optimizing scientific computations on HPC systems. \nYou will help this scientist take their existing code and turn it into a Jupyter notebook \nutilizing dask with improved performance (faster, more interactive).\n\nUseful to know:\nYou almost always put different classes in different files.\nFor Python, you always create an appropriate requirements.txt file.\nFor NodeJS, you always create an appropriate package.json file.\nYou always add a comment briefly describing the purpose of the function definition.\nYou try to add comments explaining very complex bits of logic.\nYou always follow the best practices for the requested languages in terms of describing the code written as a defined\npackage/project.\n\n\nPython toolbelt preferences:\n- pytest\n- dataclasses\n', additional_kwargs={}),
 HumanMessage(content='Instructions: I have this scientific computation that I wrote. \nI would like to optimize it such that it runs faster (utilizing par
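
The reduction steps in the summary above (square, time mean, zonal mean, scale by 0.5/g, integrate over pressure) can be sketched with NumPy. This is a minimal illustration, not the original script: the function names `eke_profile` and `integrate_pressure`, the `(time, level, lat, lon)` axis order, pressure in Pa sorted in increasing order, and the unweighted latitude mean are all assumptions, and the inputs are taken to be already band-pass filtered.

```python
import numpy as np

G = 9.81  # acceleration due to gravity, m s^-2


def integrate_pressure(field, pressure):
    """Trapezoidal integral of `field` over its first axis (pressure)."""
    dp = np.diff(pressure)
    # Reshape dp so it broadcasts against any trailing axes of `field`.
    dp = dp.reshape((-1,) + (1,) * (field.ndim - 1))
    return np.sum(0.5 * (field[1:] + field[:-1]) * dp, axis=0)


def eke_profile(u_filt, v_filt, pressure):
    """Return (EKE per latitude, meridionally averaged EKE).

    u_filt, v_filt: filtered winds with assumed axes (time, level, lat, lon).
    pressure: 1-D pressure levels in Pa, assumed increasing.
    """
    # Square the filtered winds and average over time (axis 0).
    u2 = np.mean(u_filt ** 2, axis=0)
    v2 = np.mean(v_filt ** 2, axis=0)
    # Zonal average over longitude (last axis), then scale by 0.5 / g.
    eke = 0.5 * (u2 + v2).mean(axis=-1) / G  # shape (level, lat)
    # Integrate over pressure to get the meridional distribution.
    eke_by_lat = integrate_pressure(eke, pressure)
    # Unweighted meridional average, then integrate over pressure.
    eke_avg = integrate_pressure(eke.mean(axis=-1), pressure)
    return eke_by_lat, eke_avg
```

Calling `eke_profile(u_filt, v_filt, pressure)` returns one value per latitude plus a single averaged value, which correspond loosely to the two quantities the summary says are plotted.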

In [4]:
do_step(script.get_improvement_suggestions)

1. Use Dask to parallelize the computation and take advantage of multiple cores or nodes in the HPC system.
2. Convert the code into a Jupyter notebook for a more interactive and visual experience.
3. Optimize the filtering function by using Dask arrays instead of NumPy arrays for better performance.
4. Use Dask's lazy evaluation to delay the computation until necessary, reducing memory usage.
5. Use Dask's distributed scheduler to distribute the computation across multiple nodes in the HPC system.
6. Use Dask's built-in plotting capabilities to visualize the results directly in the notebook.
7. Optimize the code by removing unnecessary imports and variables.
8. Use dataclasses to organize and manage the input and output data.
9. Use pytest for testing the code and ensure its correctness.
10. Create a requirements.txt file to specify the dependencies of the code.

[SystemMessage(content='You are an expert in optimizing scientific computations on HPC systems. \nYou will help this scientist take their existing code and turn it into a Jupyter notebook \nutilizing dask with improved performance (faster, more interactive).\n\nUseful to know:\nYou almost always put different classes in different files.\nFor Python, you always create an appropriate requirements.txt file.\nFor NodeJS, you always create an appropriate package.json file.\nYou always add a comment briefly describing the purpose of the function definition.\nYou try to add comments explaining very complex bits of logic.\nYou always follow the best practices for the requested languages in terms of describing the code written as a defined\npackage/project.\n\n\nPython toolbelt preferences:\n- pytest\n- dataclasses\n', additional_kwargs={}),
 HumanMessage(content='Instructions: I have this scientific computation that I wrote. \nI would like to optimize it such that it runs faster (utilizing par
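
Suggestions 3 and 4 above rely on Dask arrays being lazy: operations build a task graph, and nothing is computed until `.compute()` is called. The sketch below illustrates that behavior on synthetic data. The array shape, chunk sizes, and variable names are placeholders chosen for the example, not values taken from the EKE code.

```python
import numpy as np
import dask.array as da

# Synthetic stand-in for a filtered wind field with assumed axes
# (time, level, lat, lon); real data would come from the WRF files.
rng = np.random.default_rng(0)
u_np = rng.standard_normal((40, 4, 30, 60))

# Wrap the array in Dask, chunked along time so that chunks can be
# processed in parallel.
u = da.from_array(u_np, chunks=(10, 4, 30, 60))

# Lazy: this only builds a task graph; no arithmetic happens yet.
u2_mean = (u ** 2).mean(axis=0)

# Trigger the computation and get a NumPy array back.
result = u2_mean.compute()
```

On a cluster, creating a `dask.distributed.Client` connected to a scheduler before calling `.compute()` would run the same graph across workers, which is the idea behind suggestion 5.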

In [5]:
while True:
    msgs = do_step(script.implement_improvement)
    if msgs is None:
        break

The code can be optimized using Dask to parallelize the computation and take advantage of multiple cores or nodes in the HPC system. Here's the breakdown of the core classes, functions, and methods that will be necessary:

1. `lat_lon()`: A function to get the map projection and the lat and lon values from the WRF data.
2. `butter_bandpass_filter()`: A function to filter variables for waves with periods between 3-5 days using a Butterworth bandpass filter.
3. `main()`: The main function that performs the computation and generates the plot.
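
A filter like the one named in item 2 could be built with SciPy as follows. This is a sketch rather than the generated code: the signature, the filter order, and the 6-hourly sampling (4 samples per day) are assumptions; only the 3-5 day band comes from the summary.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def butter_bandpass_filter(data, min_period=3.0, max_period=5.0,
                           samples_per_day=4.0, order=3, axis=0):
    """Keep variability with periods between min_period and max_period days."""
    # Convert periods (days) to frequencies (cycles per day).
    low, high = 1.0 / max_period, 1.0 / min_period
    # Second-order sections are numerically safer than (b, a) coefficients.
    sos = butter(order, [low, high], btype="band",
                 fs=samples_per_day, output="sos")
    # Forward-backward filtering gives zero phase shift.
    return sosfiltfilt(sos, data, axis=axis)
```

Passing a `(time, level, lat, lon)` array with `axis=0` filters every grid point along time in one call.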

Now, let's proceed with the implementation of the code in separate files.

**wrf_eke_nersc_ex.py**

```python
from __future__ import division
import os
from netCDF4 import Dataset
import numpy as np
import xarray as xr
import wrf as wrf
from datetime import datetime
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
from matplotlib import rcParams
import matplotlib.patches as mpatches
from matplotlib.backend
```

In [6]:
print(ai.format_token_usage_log())

step_name,prompt_tokens_in_step,completion_tokens_in_step,total_tokens_in_step,total_prompt_tokens,total_completion_tokens,total_tokens
get_summary,2544,128,2672,2544,128,2672
get_improvement_suggestions,2574,178,2752,5118,306,5424
implement_improvement,6485,3566,10051,11603,3872,15475
implement_improvement,7082,2733,9815,18685,6605,25290
implement_improvement,6135,2616,8751,24820,9221,34041
implement_improvement,6036,2636,8672,30856,11857,42713
implement_improvement,6147,2725,8872,37003,14582,51585
implement_improvement,6203,2694,8897,43206,17276,60482
implement_improvement,5352,1878,7230,48558,19154,67712
implement_improvement,5020,2364,7384,53578,21518,75096
implement_improvement,5508,2364,7872,59086,23882,82968
implement_improvement,5422,2280,7702,64508,26162,90670
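
The log above is plain CSV, so per-step totals can be checked with the standard library. The sketch below pastes in the rows exactly as printed; `summarize_usage` is a hypothetical helper name, and the example assumes `ai.format_token_usage_log()` returns the text in this form.

```python
import csv
import io
from collections import defaultdict

LOG = """\
step_name,prompt_tokens_in_step,completion_tokens_in_step,total_tokens_in_step,total_prompt_tokens,total_completion_tokens,total_tokens
get_summary,2544,128,2672,2544,128,2672
get_improvement_suggestions,2574,178,2752,5118,306,5424
implement_improvement,6485,3566,10051,11603,3872,15475
implement_improvement,7082,2733,9815,18685,6605,25290
implement_improvement,6135,2616,8751,24820,9221,34041
implement_improvement,6036,2636,8672,30856,11857,42713
implement_improvement,6147,2725,8872,37003,14582,51585
implement_improvement,6203,2694,8897,43206,17276,60482
implement_improvement,5352,1878,7230,48558,19154,67712
implement_improvement,5020,2364,7384,53578,21518,75096
implement_improvement,5508,2364,7872,59086,23882,82968
implement_improvement,5422,2280,7702,64508,26162,90670
"""


def summarize_usage(log_text):
    """Sum prompt, completion, and total tokens per step name."""
    totals = defaultdict(lambda: {"prompt": 0, "completion": 0, "total": 0})
    for row in csv.DictReader(io.StringIO(log_text)):
        entry = totals[row["step_name"]]
        entry["prompt"] += int(row["prompt_tokens_in_step"])
        entry["completion"] += int(row["completion_tokens_in_step"])
        entry["total"] += int(row["total_tokens_in_step"])
    return dict(totals)


usage = summarize_usage(LOG)
for step, counts in usage.items():
    print(step, counts)
```

The ten `implement_improvement` rows account for 85,246 of the 90,670 total tokens.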

