## Generate tasks for benchmarking set

This notebook generates candidate tasks for the benchmarking set. Generation requires seed tasks prepared by researchers.

After generating the draft tasks with an LLM, read them carefully and select deliberately, so that the benchmarking set meets your research criteria.

### Imports

In [None]:
import os

from typing import Annotated
from typing_extensions import TypedDict

from langchain_core.tools import StructuredTool
from langchain_anthropic import ChatAnthropic
from langgraph.graph.message import add_messages

from geobenchx.constants import MODEL_CLAUDE, DATA_FOLDER
from geobenchx.dataclasses import Task, TaskSet
from geobenchx.tools import (
    get_unique_values_tool, 
    load_data_tool, 
    load_geodata_tool, 
    make_choropleth_map_tool, 
    make_bivariate_map_tool,
    merge_dataframes_tool, 
    filter_categorical_tool, 
    filter_numerical_tool,
    select_features_by_spatial_relationship_tool,
    filter_points_by_raster_values_tool,
    create_buffer_tool,
    get_raster_path_tool,
    get_raster_description_tool,
    get_values_from_raster_with_geometries_tool,
    analyze_raster_overlap_tool,
    calculate_line_lengths_tool,
    calculate_columns_tool,
    scale_column_by_value_tool,
    make_heatmap_tool,
    visualize_geographies_tool,
    get_centroids_tool,
    generate_contours_display_tool,
    calculate_column_statistics_tool,
    DATA_CATALOG, GEO_CATALOG, COLORMAPS)

In [None]:
# Imports related to Anthropic APIs

# Loading API key

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")

In [3]:
class State(TypedDict):
    # Messages have the type "list". The `add_messages` function
    # in the annotation defines how this state key should be updated
    # (in this case, it appends messages to the list, rather than overwriting them)
    messages: Annotated[list, add_messages]

In [None]:
# notebook constants
TEMPERATURE = 0
GENERATE_FEASIBLE = 80
GENERATE_CONFUSING = 20

### Making the generation prompt

In [None]:
# Making list of tools for the TOOLS_PROMPT

tools_prompt_text = []

tools_list = [
    get_unique_values_tool, 
    load_data_tool, 
    load_geodata_tool, 
    make_choropleth_map_tool, 
    make_bivariate_map_tool,
    merge_dataframes_tool, 
    filter_categorical_tool, 
    filter_numerical_tool,
    select_features_by_spatial_relationship_tool,
    filter_points_by_raster_values_tool,
    create_buffer_tool,
    get_raster_path_tool,
    get_raster_description_tool,
    get_values_from_raster_with_geometries_tool,
    analyze_raster_overlap_tool,
    calculate_line_lengths_tool,
    calculate_columns_tool,
    scale_column_by_value_tool,
    make_heatmap_tool,
    visualize_geographies_tool,
    get_centroids_tool,
    generate_contours_display_tool,
    calculate_column_statistics_tool
]
for item in tools_list:
    tool_name = item.name
    tool_desc = item.description
    # tool_code = inspect.getsource(item.func)
    tools_prompt_text.append(f"{tool_name}\n{tool_desc}")

In [None]:
tools_prompt_text

In [6]:
TOOLS_PROMPT = f"""
COLORMAPS = {COLORMAPS}

{tools_prompt_text}
"""

In [None]:
TOOLS_PROMPT
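
Because `TOOLS_PROMPT` interpolates `tools_prompt_text`, which is a Python list, the prompt receives the list's `repr`: brackets, quotes, and `\n` shown as escaped characters. That works, but if plain text is preferred, the entries could be joined before interpolation. A minimal sketch, where `format_tools` and the sample entries are hypothetical stand-ins for the real `(item.name, item.description)` pairs:

```python
def format_tools(entries):
    """Join (name, description) pairs into plain text, one blank line between tools."""
    return "\n\n".join(f"{name}\n{desc}" for name, desc in entries)


# Hypothetical entries; in the notebook they would come from item.name / item.description
sample_entries = [
    ("load_data", "Load a statistical dataset into a dataframe."),
    ("make_heatmap", "Render a heatmap from point data."),
]

# What interpolating the list itself produces: newlines appear as escaped "\n"
list_repr = str([name + "\n" + desc for name, desc in sample_entries])

# What interpolating the joined text produces: real line breaks, no brackets or quotes
plain_text = format_tools(sample_entries)

print(list_repr)
print(plain_text)
```

Whether the model handles one form better than the other is not something this notebook measures, so treat the choice as a matter of readability.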

In [8]:
DATA_PROMPT = f"""
DATA_CATALOG = {DATA_CATALOG} 

GEO_CATALOG = {GEO_CATALOG}

In the file with geometries (countries), the countries are assigned to various regions. Available classifications:
REGION_WB:'East Asia & Pacific',  'Latin America & Caribbean',  'Europe & Central Asia',  'South Asia',  'Middle East & North Africa',  'Sub-Saharan Africa',  'North America',  'Antarctica'.
SUBREGION: 'South-Eastern Asia',
 'South America',
 'Western Asia',
 'Southern Asia',
 'Eastern Asia',
 'Eastern Africa',
 'Western Europe',
 'Northern Africa',
 'Central America',
 'Middle Africa',
 'Eastern Europe',
 'Southern Africa',
 'Caribbean',
 'Central Asia',
 'Northern Europe',
 'Southern Europe',
 'Western Africa',
 'Northern America',
 'Melanesia',
 'Australia and New Zealand',
 'Polynesia',
 'Seven seas (open ocean)',
 'Micronesia'.
REGION_UN: 'Asia', 'Americas', 'Africa', 'Europe', 'Oceania', 'Seven seas (open ocean)'.
CONTINENT:'Asia',  'South America',  'Africa',  'Europe',  'North America',  'Oceania',  'Seven seas (open ocean)'
INCOME_GRP:'4. Lower middle income',  '3. Upper middle income',  '2. High income: nonOECD',  '1. High income: OECD',  '5. Low income'

Time series in the statistical data go back to 1960 and have a column for every year since then. Not all of them have a full set of data.
"""

In [9]:
SETTING_PROMPT = f"""
You are testing a Large Language Model (LLM) for its ability to solve tasks requiring some geospatial operations. The LLM will use function/tool calling to solve tasks.
You provide the LLM with a list of tools.
<TOOLS> {TOOLS_PROMPT}</TOOLS>
You provide the LLM with datasets and geometries:
<DATA> {DATA_PROMPT}</DATA>
"""

In [None]:
print(SETTING_PROMPT)

In [None]:
## Examples of tasks for the group 'Process - Merge - Visualize'

# feasible_tasks_examples = f"""
# 1) Make a map of level of usage of water resources in countries of Africa.
# Solution: 
# 1. load_data(dataset='Annual freshwater withdrawals, total (% of internal resources)', output_dataframe_name='water_usage') 
# 2. load_geodata(geodataset='Countries', output_geodataframe_name='countries') 
# 3. filter_categorical(dataframe_name='countries', output_dataframe_name='african_countries', filters="CONTINENT": "Africa") 
# 4. merge_dataframes(statkey='Country Name', geokey='NAME_EN', dataframe_name='water_usage', geodataframe_name='african_countries', output_dataframe_name='african_water_usage') 
# 5. make_map(colormap='Water', dataframe_name='african_water_usage', legendtext='Annual freshwater withdrawals, total (% of internal resources)', mappingkey='2021') 
 
# 2) Make a series of maps with share of forest covered areas in countries of South Asia in 2005, 2015 and the last available year.
# Solution:
# 1. load_data(dataset='Forest area (% of land area)', output_dataframe_name='forest_data') 
# 2. load_geodata(geodataset='Countries', output_geodataframe_name='countries') 
# 3. filter_categorical(dataframe_name='countries', filters='REGION_WB': 'South Asia', output_dataframe_name='south_asia') 
# 4. merge_dataframes(dataframe_name='forest_data', geodataframe_name='south_asia', statkey='Country Name', geokey='NAME_EN', output_dataframe_name='south_asia_forest') 
# 5. make_map(dataframe_name='south_asia_forest', mappingkey='2005', legendtext='Forest area (% of land area) in 2005', colormap='Forest') 
# 6. make_map(dataframe_name='south_asia_forest', mappingkey='2015', legendtext='Forest area (% of land area) in 2015', colormap='Forest') 
# 7. make_map(dataframe_name='south_asia_forest', mappingkey='2021', legendtext='Forest area (% of land area) in 2021', colormap='Forest')
 
# 3) Make a series of separate maps of electric consumption per capita per country for subregions of Europe.
# Solution:
# 1. load_data(dataset='Electric power consumption (kWh per capita)', output_dataframe_name='power_data') 
# 2. load_geodata(geodataset='Countries', output_geodataframe_name='countries_geo') 
# 3. merge_dataframes(dataframe_name='power_data', geodataframe_name='countries_geo', statkey='Country Name', geokey='NAME_EN', output_dataframe_name='merged_power') 
# 4. filter_categorical(dataframe_name='merged_power', filters='CONTINENT': 'Europe', output_dataframe_name='europe_power') 
# 5. get_unique_values(dataframe_name='europe_power', column='SUBREGION') 
# 6. filter_categorical(dataframe_name='europe_power', filters='SUBREGION': 'Western Europe', output_dataframe_name='western_europe') 
# 7. make_map(dataframe_name='western_europe', mappingkey='2014', legendtext='Electric power consumption (kWh per capita) - Western Europe', colormap='Economics') 
# 8. filter_categorical(dataframe_name='europe_power', filters='SUBREGION': 'Eastern Europe', output_dataframe_name='eastern_europe') 
# 9. make_map(dataframe_name='eastern_europe', mappingkey='2014', legendtext='Electric power consumption (kWh per capita) - Eastern Europe', colormap='Economics') 
# 10. filter_categorical(dataframe_name='europe_power', filters='SUBREGION': 'Northern Europe', output_dataframe_name='northern_europe') 
# 11. make_map(dataframe_name='northern_europe', mappingkey='2014', legendtext='Electric power consumption (kWh per capita) - Northern Europe', colormap='Economics') 
# 12. filter_categorical(dataframe_name='europe_power', filters='SUBREGION': 'Southern Europe', output_dataframe_name='southern_europe') 
# 13. make_map(dataframe_name='southern_europe', mappingkey='2014', legendtext='Electric power consumption (kWh per capita) - Southern Europe', colormap='Economics')
 
# 4) Visualize annual freshwater withdrawals in Asia.
# Solution:
# 1. load_data(dataset='Annual freshwater withdrawals, total (billion cubic meters)', output_dataframe_name='water_data') 
# 2. load_geodata(geodataset='Countries', output_geodataframe_name='countries_geo') 
# 3. merge_dataframes(dataframe_name='water_data', geodataframe_name='countries_geo', statkey='Country Name', geokey='NAME_EN', output_dataframe_name='merged_water') 
# 4. filter_categorical(dataframe_name='merged_water', filters='CONTINENT': 'Asia', output_dataframe_name='asia_water') 
# 5. make_map(dataframe_name='asia_water', mappingkey='2021', legendtext='Annual freshwater withdrawals (billion cubic meters)', colormap='Water')
 
# 5) Visualize how contribution of agriculture to GDP varies in countries of Americas.
# Solution:
# 1. load_data(dataset='Agriculture, value added (% of GDP)', output_dataframe_name='agri_gdp') 
# 2. load_geodata(geodataset='Countries', output_geodataframe_name='countries') 
# 3. merge_dataframes(dataframe_name='agri_gdp', geodataframe_name='countries', statkey='Country Name', geokey='NAME_EN', output_dataframe_name='merged_data') 
# 4. filter_categorical(dataframe_name='merged_data', filters="CONTINENT": "North America", "South America", output_dataframe_name='americas_data') 
# 5. make_map(dataframe_name='americas_data', mappingkey='2021', legendtext='Agriculture, value added (% of GDP)', colormap='Agriculture')

# 6) Map greenhouse gas emissions per capita in high-income countries.
# Solution:
# 1. load_data(dataset='Greenhouse gases emission per capita, carbon dioxide-equivalents, tons', output_dataframe_name='ghg_emissions') 
# 2. load_geodata(geodataset='Countries', output_geodataframe_name='countries_geodata') 
# 3. get_unique_values(dataframe_name='countries_geodata', column='INCOME_GRP') 
# 4. filter_categorical(dataframe_name='countries_geodata', filters="'INCOME_GRP': ['1. High income: OECD', '2. High income: nonOECD']", output_dataframe_name='high_income_countries') 
# 5. merge_dataframes(dataframe_name='ghg_emissions', geodataframe_name='high_income_countries', statkey='Country Name', geokey='NAME_EN', output_dataframe_name='merged_ghg_high_income') 
# 6. make_map(dataframe_name='merged_ghg_high_income', mappingkey='2023', legendtext='Greenhouse Gas Emissions per Capita (tons) - 2023', colormap='Environment')
# """

In [None]:
## Examples of tasks for the group 'Spatial Operations'

# feasible_tasks_examples = f"""
# 1.How many people were affected by flood of August 2018 in Bangladesh?
# Solution:
# 1. get_raster_path(rasterdataset='Tibetan Plato South Asia flood extent August 2018') 
# 2. get_raster_path(rasterdataset='Bangladesh population 2018, people, resolution 3 arc or apprx 100 m') 
# 3. analyze_raster_overlap(raster1_path='zip://G:/My Drive/Geo Agent/Tasks data/GeoData/DFO_4665_From_20180802_to_20180810.zip!/DFO_4665_From_20180802_to_20180810.tif', raster2_path='G:/My Drive/Geo Agent/Tasks data/GeoData/bgd_ppp_2018_UNadj.tif', output_variable_name='flood_impact_analysis') 

# 2.How many people live within 1 km from a railway in Bangladesh?
# Solution:
# 1. load_geodata(geodataset='Railway lines in Bangladesh', output_geodataframe_name='bangladesh_railways') 
# 2. create_buffer(geodataframe_name='bangladesh_railways', buffer_size='1000', output_geodataframe_name='railway_buffer') 
# 3. get_raster_path(rasterdataset='Bangladesh population 2018, people, resolution 3 arc or apprx 100 m') 
# 4. get_values_from_raster_with_geometries(raster_path='G:/My Drive/Geo Agent/Tasks data/GeoData/bgd_ppp_2018_UNadj.tif', geodataframe_name='railway_buffer', output_variable_name='population_in_buffer') 

# 3.How many towns in USA within 5 km from an Amtrak station have accumulated snow cover in the current season over 3 feet?
# Solution:
# 1. load_geodata(geodataset='Amtrak railway stations', output_geodataframe_name='amtrak_stations') 
# 2. load_geodata(geodataset='Cities and Towns of the United States, 2014', output_geodataframe_name='us_towns') 
# 3. create_buffer(geodataframe_name='amtrak_stations', buffer_size='5000', output_geodataframe_name='amtrak_buffer') 
# 4. select_features_by_spatial_relationship(features_geodataframe_name='us_towns', reference_geodataframe_name='amtrak_buffer', spatial_predicate='within', output_geodataframe_name='towns_near_amtrak') 
# 5. get_raster_path(rasterdataset='Accumulated snow cover season 2024-2025 till February 3, 2025, USA, inches') 
# 6. filter_points_by_raster_values(raster_path='G:/My Drive/Geo Agent/Tasks data/GeoData/sfav2_CONUS_2024093012_to_2025020312.tif', points_geodataframe_name='towns_near_amtrak', value_column='snow_depth', filter_type='greater', threshold1='36', output_geodataframe_name='snowy_towns_near_amtrak')

# 4.Make a map of towns in USA that are within 1 mile from an Amtrak station and accumulated 3-4 feet of snow in the last season.
# Solution:
# 1. load_geodata(geodataset='Amtrak railway stations', output_geodataframe_name='amtrak_stations') 
# 2. load_geodata(geodataset='Cities and Towns of the United States, 2014', output_geodataframe_name='us_towns') 
# 3. get_raster_path(rasterdataset='Accumulated snow cover season 2023-2024, USA, inches')
# 4. create_buffer(geodataframe_name='amtrak_stations', buffer_size='1609', output_geodataframe_name='amtrak_buffers') 
# 5. select_features_by_spatial_relationship(features_geodataframe_name='us_towns', reference_geodataframe_name='amtrak_buffers', spatial_predicate='within', output_geodataframe_name='towns_near_amtrak') 
# 6. filter_points_by_raster_values(raster_path='G:/My Drive/Geo Agent/Tasks data/GeoData/sfav2_CONUS_2023093012_to_2024093012.tif', points_geodataframe_name='towns_near_amtrak', value_column='snow_depth', output_geodataframe_name='final_towns', filter_type='between', threshold1='36', threshold2='48') 
# 7. make_map(dataframe_name='final_towns', mappingkey='snow_depth', legendtext='Snow Depth (inches)', colormap='Hazards')

# 5. How many people in USA live in 5-mile radius from an Amtrak station?
# Solution:
# 1. load_geodata(geodataset='Amtrak railway stations', output_geodataframe_name='amtrak_stations') 
# 2. get_raster_path(rasterdataset='USA population 2020, people, resolution 1 km') 
# 3. create_buffer(geodataframe_name='amtrak_stations', buffer_size='8047', output_geodataframe_name='amtrak_stations_buffer') 
# 4. get_values_from_raster_with_geometries(raster_path='G:/My Drive/Geo Agent/Tasks data/GeoData/usa_ppp_2020_1km_Aggregated_UNadj.tif', geodataframe_name='amtrak_stations_buffer', output_variable_name='population_within_5_miles', plot_result='True') 

# 6. List and make a map of all counties that touch or include the Mississippi River (Penn State)
# Solution:
# 1. load_geodata(geodataset='Rivers in North America', output_geodataframe_name='mississippi_river') 
# 2. load_geodata(geodataset='USA counties borders', output_geodataframe_name='usa_counties')
# 3. filter_categorical(dataframe_name='mississippi_river', filters="{"'NameEn': 'Mississippi River'"}", output_dataframe_name='mississippi_river_filtered') 
# 4. select_features_by_spatial_relationship(features_geodataframe_name='usa_counties', reference_geodataframe_name='mississippi_river_filtered', spatial_predicate='intersects', output_geodataframe_name='counties_touching_mississippi') 
# 5. visualize_geographies(geodataframe_name='counties_touching_mississippi', basemap_style='OpenStreetMap', geometry_color='blue', geometry_alpha='0.6', geometry_linewidth='1.0', title='Counties Touching or Including the Mississippi River') 

# 7. Generate a list of all states whose boundaries touch Wyoming (Penn State)
# Solution:
# 1. load_geodata(geodataset='USA states borders', output_geodataframe_name='us_states') 
# 2. filter_categorical(dataframe_name='us_states', filters="{"'NAME': 'Wyoming'"}", output_dataframe_name='wyoming_state') 
# 3. select_features_by_spatial_relationship(features_geodataframe_name='us_states', reference_geodataframe_name='wyoming_state', spatial_predicates='['touches']', output_geodataframe_name='wyoming_neighbors') 
# 4. get_unique_values(dataframe_name='wyoming_neighbors', column='NAME') 

# """

In [13]:
# Examples of tasks for group 'Heatmaps, Contour lines'

feasible_tasks_examples = f"""
1. Make a heatmap of the final size of fire incidents in USA
Solution
1. load_geodata(geodataset='Current Wildland Fire Incident Locations, size in acres', output_geodataframe_name='fires') 
2. make_heatmap(geodataframe_name='fires', value_column='IncidentSi', map_style='carto-positron', radius='30') 
 
2. Make a heatmap of TB cases in Massachusetts
Solution
1. load_data(dataset='Incidence of Tuberculosis Disease 2023 Massachusetts Counties', output_dataframe_name='tb_cases') 
2. load_geodata(geodataset='USA counties borders', output_geodataframe_name='ma_counties') 
3. filter_categorical(dataframe_name='ma_counties', filters="", output_dataframe_name='massachusetts_counties') 
4. merge_dataframes(dataframe_name='tb_cases', geodataframe_name='massachusetts_counties', statkey='County', geokey='NAME', output_dataframe_name='merged_tb_data') 
5. get_centroids(geodataframe_name='merged_tb_data', output_geodataframe_name='centroids_tb_data') 
6. make_heatmap(geodataframe_name='centroids_tb_data', value_column='Number of Cases', output_html_name='tb_cases_heatmap.html', map_style='carto-positron', radius='15') 
 
3. Chart contour lines for accumulated snow fall in winter 2023-2024
Solution:
1. get_raster_path(rasterdataset='Accumulated snow cover season 2023-2024, USA, inches') 
2. get_raster_description(raster_path='') 
3. plot_contour_lines(raster_path='', output_geodataframe_name='snow_contours_2023_2024', interval='5', min_value='0', plot_result='True', title='Contour Lines for Accumulated Snowfall in Winter 2023-2024') 
 
4. Show zones with similar population in Bangladesh
Solution:
1. get_raster_path(rasterdataset='Bangladesh population 2018, people, resolution 3 arc or apprx 100 m') 
2. get_raster_description(raster_path='G:/My Drive/Geo Agent/Tasks data/GeoData/bgd_ppp_2018_UNadj.tif') 
3. plot_contour_lines(raster_path='G:/My Drive/Geo Agent/Tasks data/GeoData/bgd_ppp_2018_UNadj.tif', output_geodataframe_name='bangladesh_pop_contours', interval='500', title='Population Density Zones in Bangladesh (2018)') 
"""

In [13]:
confusing_tasks_manual = f"""
1) Map population density in Sub-Saharan Africa.
Why not solvable:
No data on countries areas in sq km.

2) Map value of agriculture sector in USA counties along the Great Lakes shores.
Why not solvable:
No data on value of agriculture sector/production by USA counties is provided
No data allowing calculation of the value is provided

3) Map forest areas per capita in 50 km zone from the Mekong river.
Why not solvable:
No Mekong river geometry is provided, no forest areas geometry or raster is provided.

4) Map GHG emissions in major urban areas
Why not solvable:
No data on GHG by cities or urban areas
No polygon/point geodataset with urban areas
No point geodataset with cities locations
"""

In [None]:
# # Task set 4
# PROMPT_FEASIBLE_TASK = f"""
# Here are examples of the tasks that the LLM can solve with these tools and datasets:
# <EXAMPLES>{feasible_tasks_examples}</EXAMPLES>
# Generate {GENERATE_FEASIBLE} more questions that are possible to solve using these tools and data. 
# <GUIDELINES>
#  - Make sure that datasets from DATA_CATALOG in DATA are enough to solve the tasks.
#  - Make sure usage of raster files or spatial selection tool is required.
#  - Make sure that tools from TOOLS are enough.
#  - Make sure classifications of the countries are enough to solve the task.
#  - Do not generate solution. Only the task.
#  - Please, present the results as a python list with no additional text.
#  - For this set of tasks, make sure that most require to make a heatmap or a contour lines map.
# </GUIDELINES>
# Follow GUIDELINES while providing response. 
# """

By changing the task examples and guidelines, you can tailor the prompt to generate tasks for a different task group.
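
The prompt variants in this section differ only in whether they include examples and in one or two guidelines, so they could be assembled by a small helper instead of being commented in and out. The sketch below is one possible way to do that; `build_feasible_prompt` and `BASE_GUIDELINES` are hypothetical names, not part of `geobenchx`, and the guideline wording is copied from the cells in this notebook.

```python
# Guidelines shared by all feasible-task prompt variants in this notebook
BASE_GUIDELINES = [
    "Make sure that datasets from DATA_CATALOG in DATA are enough to solve the tasks.",
    "Make sure that tools from TOOLS are enough.",
    "Make sure classifications of the countries are enough to solve the task.",
    "Do not generate solution. Only the task.",
    "Please, present the results as a python list with no additional text.",
]


def build_feasible_prompt(n_tasks, examples=None, extra_guidelines=()):
    """Assemble a feasible-task prompt; examples and extra guidelines are optional."""
    parts = []
    if examples:
        parts.append(
            "Here are examples of the tasks that the LLM can solve with these tools and datasets:"
        )
        parts.append(f"<EXAMPLES>{examples}</EXAMPLES>")
    parts.append(
        f"Generate {n_tasks} more questions that are possible to solve using these tools and data."
    )
    guidelines = "\n".join(f" - {g}" for g in [*BASE_GUIDELINES, *extra_guidelines])
    parts.append(f"<GUIDELINES>\n{guidelines}\n</GUIDELINES>")
    parts.append("Follow GUIDELINES while providing response.")
    return "\n".join(parts) + "\n"


# Example: a heatmap-oriented variant with one seed example and one extra guideline
heatmap_prompt = build_feasible_prompt(
    5,
    examples="1. Make a heatmap of TB cases in Massachusetts",
    extra_guidelines=[
        "For this set of tasks, make sure that most require to make a heatmap or a contour lines map."
    ],
)
print(heatmap_prompt)
```

Passing `examples=None` reproduces the shape of the variant without an `<EXAMPLES>` block, and the raster or spatial-selection requirement can be supplied through `extra_guidelines` in the same way.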

In [16]:
# Task set 3
PROMPT_FEASIBLE_TASK = f"""
Generate {GENERATE_FEASIBLE} more questions that are possible to solve using these tools and data. 
<GUIDELINES>
 - Make sure that datasets from DATA_CATALOG in DATA are enough to solve the tasks.
 - Make sure usage of raster files or spatial selection tool is required.
 - Make sure that tools from TOOLS are enough.
 - Make sure classifications of the countries are enough to solve the task.
 - Do not generate solution. Only the task.
 - Please, present the results as a python list with no additional text.
</GUIDELINES>
Follow GUIDELINES while providing response. 
"""

In [None]:
# # Task set 2
# PROMPT_FEASIBLE_TASK = f"""
# Here are examples of the tasks that the LLM can solve with these tools and datasets:
# <EXAMPLES>{feasible_tasks_examples}</EXAMPLES>
# Generate {GENERATE_FEASIBLE} more questions that are possible to solve using these tools and data. 
# <GUIDELINES>
#  - Make sure that datasets from DATA_CATALOG in DATA are enough to solve the tasks.
#  - Make sure that tools from TOOLS are enough.
#  - Make sure classifications of the countries are enough to solve the task.
#  - Do not generate solution. Only the task.
#  - Please, present the results as a python list with no additional text.
# </GUIDELINES>
# Follow GUIDELINES while providing response. 
# """

In [None]:
PROMPT_CONFUSING_TASK = f"""
Here are examples of the tasks that the LLM can NOT solve with these tools and datasets, but they look similar to the ones it can solve:
{confusing_tasks_manual}
Generate {GENERATE_CONFUSING} more questions that look similar, but cannot be answered using these tools and datasets.
 Please, present the results as a python list with no additional text.
"""

### Generating the tasks

In [21]:
# Generating the feasible tasks. 

llm = ChatAnthropic(model=MODEL_CLAUDE, temperature=TEMPERATURE, max_tokens=4000)
# Named `prompt` rather than `input` to avoid shadowing the Python builtin
prompt = SETTING_PROMPT + PROMPT_FEASIBLE_TASK
response = llm.invoke(prompt)
print(response.content)

[
    "Create a bivariate map showing the relationship between forest area (% of land area) and CO2 emissions per capita in South American countries for the year 2020. Use appropriate color schemes for environmental and hazard variables.",
    
    "Compare total freshwater withdrawals (billion cubic meters) between East Asian and South Asian countries in 2015, visualizing the results as a choropleth map. Calculate the mean withdrawal values for both regions.",
    
    "Create a map showing GDP per capita (current US$) for African countries in 2019, but only for countries that have power stations within their borders. Use the spatial relationship selection tool to identify qualifying countries.",
    
    "Generate a visualization comparing forest area (sq. km) between high-income OECD countries and upper-middle-income countries in 2018. Calculate the mean forest area for each income group.",
    
    "Create a map showing the electric power consumption (kWh per capita) in 2019 for co

In [None]:
draft_tasks = [
    # Insert here the tasks generated by Claude. Sonnet returns a string, not a list, but since the string is formatted as a Python list, copying it here and assigning it to a variable works.
    
]
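
As an alternative to copy-pasting, a response string that is formatted as a Python list can be parsed programmatically. The sketch below uses the standard-library `ast.literal_eval`; `parse_task_list` and `sample_response` are hypothetical, and the helper assumes the model returned a complete list literal. A response cut off by the token limit has no closing bracket and is rejected rather than repaired.

```python
import ast


def parse_task_list(text):
    """Parse a string holding a Python list literal of task strings."""
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end <= start:
        raise ValueError("No complete list literal found; the response may be truncated.")
    # literal_eval only evaluates literals, so no code in the response is executed
    tasks = ast.literal_eval(text[start:end + 1])
    if not isinstance(tasks, list) or not all(isinstance(t, str) for t in tasks):
        raise ValueError("Expected a list of strings.")
    # Drop empty entries and surrounding whitespace
    return [t.strip() for t in tasks if t.strip()]


# Hypothetical response text, shaped like the output printed above
sample_response = '[\n    "Map forest area in South Asia.",\n\n    "Make a heatmap of wildfire sizes."\n]'
draft_tasks_parsed = parse_task_list(sample_response)
print(draft_tasks_parsed)
```

Parsing does not replace the manual review step: the resulting list still needs to be read and filtered before it is saved.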

In [None]:
# Saving the tasks. Note that the metadata will be filled in later with information on the model that generated candidate solutions, the model that evaluated them, etc. It is OK to leave it empty for now.

file_name = "draftTaskSet.json"
tasks = TaskSet(metadata = {}, tasks = [Task(task_text = draft_task) for draft_task in draft_tasks])
tasks.save_to_file(file_name, DATA_FOLDER)

In [None]:
# Generating confusing tasks. Check the 'feasible' tasks first: most likely enough of them are not actually solvable.
# Use the code from the cell above to save the tasks.

llm = ChatAnthropic(model=MODEL_CLAUDE, temperature=TEMPERATURE, max_tokens=4000)
# Named `prompt` rather than `input` to avoid shadowing the Python builtin
prompt = SETTING_PROMPT + PROMPT_CONFUSING_TASK
response = llm.invoke(prompt)
print(response.content)
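
The cells above print `response.content` directly, which is fine when it is a plain string. To my knowledge, LangChain message content can also be a list of content blocks, so a small normalizing step may be useful before further processing. The sketch below is an assumption-laden illustration: `content_to_text` is a hypothetical helper, and the block shape (`{"type": "text", "text": ...}`) is assumed rather than taken from this notebook.

```python
def content_to_text(content):
    """Return message content as one string, whether it is a str or a list of blocks."""
    if isinstance(content, str):
        return content
    pieces = []
    for block in content:
        if isinstance(block, str):
            pieces.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            # Assumed block shape: {"type": "text", "text": "..."}
            pieces.append(block.get("text", ""))
    return "".join(pieces)


# Stand-in values; in the notebook the argument would be response.content
print(content_to_text('["Task A", "Task B"]'))
print(content_to_text([{"type": "text", "text": '["Task A", '}, {"type": "text", "text": '"Task B"]'}]))
```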