FEAT: Add fetch function for SecLists AI LLM Bias Testing datasets (#267) #280
rdheekonda
left a comment
Great work, Kutal and Roman.
Hello @romanlutz, @rdheekonda, During testing of the
Steps to Reproduce
Alternatively, it should be possible to just run the
Error Message
Error fetching data from table PromptMemoryEntries: (duckdb.duckdb.ConversionException) Conversion Error: Could not convert string '2965249278352' to INT128
[SQL: SELECT "PromptMemoryEntries".id AS "PromptMemoryEntries_id", "PromptMemoryEntries".role AS "PromptMemoryEntries_role", "PromptMemoryEntries".conversation_id AS "PromptMemoryEntries_conversation_id", "PromptMemoryEntries".sequence AS "PromptMemoryEntries_sequence", "PromptMemoryEntries".timestamp AS "PromptMemoryEntries_timestamp", "PromptMemoryEntries".labels AS "PromptMemoryEntries_labels", "PromptMemoryEntries".prompt_metadata AS "PromptMemoryEntries_prompt_metadata", "PromptMemoryEntries".converter_identifiers AS "PromptMemoryEntries_converter_identifiers", "PromptMemoryEntries".prompt_target_identifier AS "PromptMemoryEntries_prompt_target_identifier", "PromptMemoryEntries".orchestrator_identifier AS "PromptMemoryEntries_orchestrator_identifier", "PromptMemoryEntries".response_error AS "PromptMemoryEntries_response_error", "PromptMemoryEntries".original_value_data_type AS "PromptMemoryEntries_original_value_data_type", "PromptMemoryEntries".original_value AS "PromptMemoryEntries_original_value", "PromptMemoryEntries".original_value_sha256 AS "PromptMemoryEntries_original_value_sha256", "PromptMemoryEntries".converted_value_data_type AS "PromptMemoryEntries_converted_value_data_type", "PromptMemoryEntries".converted_value AS "PromptMemoryEntries_converted_value", "PromptMemoryEntries".converted_value_sha256 AS "PromptMemoryEntries_converted_value_sha256"
FROM "PromptMemoryEntries"
WHERE ("PromptMemoryEntries".orchestrator_identifier ->> $1) = $2::UUID]
[parameters: ('id', UUID('b88e999d-2595-4b96-8188-f8abeb52fdfa'))]
(Background on this error at: https://sqlalche.me/e/20/9h9h)
Traceback (most recent call last):
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\engine\base.py", line 1970, in _exec_single_context
self.dialect.do_execute(
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\engine\default.py", line 924, in do_execute
cursor.execute(statement, parameters)
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\duckdb_engine\__init__.py", line 162, in execute
self.__c.execute(statement, parameters)
duckdb.duckdb.ConversionException: Conversion Error: Could not convert string '2965249278352' to INT128
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\vkuta\projects\PyRIT\pyrit\memory\duckdb_memory.py", line 272, in query_entries
return query.all()
^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\orm\query.py", line 2673, in all
return self._iter().all() # type: ignore
^^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\orm\query.py", line 2827, in _iter
result: Union[ScalarResult[_T], Result[_T]] = self.session.execute(
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\orm\session.py", line 2306, in execute
return self._execute_internal(
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\orm\session.py", line 2191, in _execute_internal
result: Result[Any] = compile_state_cls.orm_execute_statement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\orm\context.py", line 293, in orm_execute_statement
result = conn.execute(
^^^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\engine\base.py", line 1421, in execute
return meth(
^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\sql\elements.py", line 514, in _execute_on_connection
return connection._execute_clauseelement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\engine\base.py", line 1643, in _execute_clauseelement
ret = self._execute_context(
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\engine\base.py", line 1849, in _execute_context
return self._exec_single_context(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\engine\base.py", line 1989, in _exec_single_context
self._handle_dbapi_exception(
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\engine\base.py", line 2356, in _handle_dbapi_exception
raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\engine\base.py", line 1970, in _exec_single_context
self.dialect.do_execute(
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\sqlalchemy\engine\default.py", line 924, in do_execute
cursor.execute(statement, parameters)
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\site-packages\duckdb_engine\__init__.py", line 162, in execute
self.__c.execute(statement, parameters)
sqlalchemy.exc.DataError: (duckdb.duckdb.ConversionException) Conversion Error: Could not convert string '2965249278352' to INT128
[SQL: SELECT "PromptMemoryEntries".id AS "PromptMemoryEntries_id", "PromptMemoryEntries".role AS "PromptMemoryEntries_role", "PromptMemoryEntries".conversation_id AS "PromptMemoryEntries_conversation_id", "PromptMemoryEntries".sequence AS "PromptMemoryEntries_sequence", "PromptMemoryEntries".timestamp AS "PromptMemoryEntries_timestamp", "PromptMemoryEntries".labels AS "PromptMemoryEntries_labels", "PromptMemoryEntries".prompt_metadata AS "PromptMemoryEntries_prompt_metadata", "PromptMemoryEntries".converter_identifiers AS "PromptMemoryEntries_converter_identifiers", "PromptMemoryEntries".prompt_target_identifier AS "PromptMemoryEntries_prompt_target_identifier", "PromptMemoryEntries".orchestrator_identifier AS "PromptMemoryEntries_orchestrator_identifier", "PromptMemoryEntries".response_error AS "PromptMemoryEntries_response_error", "PromptMemoryEntries".original_value_data_type AS "PromptMemoryEntries_original_value_data_type", "PromptMemoryEntries".original_value AS "PromptMemoryEntries_original_value", "PromptMemoryEntries".original_value_sha256 AS "PromptMemoryEntries_original_value_sha256", "PromptMemoryEntries".converted_value_data_type AS "PromptMemoryEntries_converted_value_data_type", "PromptMemoryEntries".converted_value AS "PromptMemoryEntries_converted_value", "PromptMemoryEntries".converted_value_sha256 AS "PromptMemoryEntries_converted_value_sha256"
FROM "PromptMemoryEntries"
WHERE ("PromptMemoryEntries".orchestrator_identifier ->> $1) = $2::UUID]
[parameters: ('id', UUID('b88e999d-2595-4b96-8188-f8abeb52fdfa'))]
(Background on this error at: https://sqlalche.me/e/20/9h9h)
Traceback (most recent call last):
File "C:\Users\vkuta\projects\PyRIT\doc\demo\8_test_seclists_bias_testing.py", line 142, in <module>
asyncio.run(run())
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\asyncio\runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\asyncio\runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\vkuta\anaconda3\envs\pyrit-dev\Lib\asyncio\base_events.py", line 654, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "C:\Users\vkuta\projects\PyRIT\doc\demo\8_test_seclists_bias_testing.py", line 115, in run
memory = orchestrator.get_memory()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\vkuta\projects\PyRIT\pyrit\orchestrator\orchestrator_class.py", line 75, in get_memory
return self._memory.get_prompt_request_piece_by_orchestrator_id(orchestrator_id=self._id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\vkuta\projects\PyRIT\pyrit\memory\memory_interface.py", line 163, in get_prompt_request_piece_by_orchestrator_id
return sorted(prompt_pieces, key=lambda x: (x.conversation_id, x.timestamp))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not iterable
Thank you for your wonderful support so far! Any pointers, hints, or suggestions on how to resolve this error would be greatly appreciated!
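A note on reading the traceback above: the final `TypeError` looks like a secondary symptom rather than the root cause. The sketch below is a guess at the mechanism, not the actual PyRIT code: it assumes the query helper logs the database error and then returns `None`, which the caller passes straight to `sorted`. The names `query_entries_sketch` and `failing_query` are hypothetical.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def query_entries_sketch(run_query: Callable[[], list]) -> Optional[list]:
    """Hypothetical stand-in for a query helper that logs failures and returns None."""
    try:
        return run_query()
    except Exception as e:
        # The real error (here: a simulated conversion error) is only logged.
        logger.error(f"Error fetching data: {e}")
        return None


def failing_query() -> list:
    # Simulates the database raising while the query executes.
    raise ValueError("Conversion Error (simulated)")


entries = query_entries_sketch(failing_query)

message = ""
try:
    # Mirrors the caller sorting the result without checking for None.
    sorted(entries)
except TypeError as e:
    message = str(e)

# A defensive caller could fall back to an empty list instead.
safe_entries = sorted(entries or [])
```

If this reading is right, the line to investigate is the first logged error (the DuckDB `ConversionException`), not the `TypeError` at the bottom.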
@KutalVolkan can you try deleting your results folder and rerunning? Over time, we may make small changes to the DB, and that can throw it off. Sadly, the errors are fairly hard to decipher. In most cases I've seen, deleting the results folder gives you a fresh start and the error doesn't show again.
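The reset suggested above can also be scripted. This is a minimal sketch, assuming the results folder is an ordinary directory that is safe to delete; the helper name `reset_results_folder` and the example path are hypothetical, so point it at wherever your local database files actually live.

```python
import shutil
from pathlib import Path


def reset_results_folder(results_dir: Path) -> bool:
    """Delete results_dir and everything in it; return True if something was removed."""
    if results_dir.is_dir():
        shutil.rmtree(results_dir)
        return True
    return False


# Example call (the path is an assumption, adjust it to your checkout):
# reset_results_folder(Path("results"))
```

Note that this discards all stored conversation history in that folder, so copy anything you still need first.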
…d fixed issues across all files.
- Updated .gitignore for better exclusion.
- Import changes in many_shot_jailbreak.ipynb and .py
- Updated datasets initialization and fetching scripts.
- Added new seclists_bias_testing notebooks and scripts.
fed64e4 to
4f9855b
Hi Roman, Your suggestion to delete the results folder and rerun worked perfectly, thank you! Additionally, I've successfully merged my branch with the PyRIT main branch and added the fetch function for SecLists AI LLM Bias Testing datasets. You can review all the changes at your convenience. Please feel free to provide any feedback, and if there are any changes you'd like to see, I'll be happy to make them. Thanks again for your help! Best,
romanlutz
left a comment
Looks great! A few tweaks and we should be able to merge. I'll make sure to ask my teammates for feedback as well (if any)
…ed .gitignore to ignore unnecessary files.
- Modified many_shot_jailbreak.ipynb and many_shot_jailbreak.py with improvements.
- Modified seclists_bias_testing.ipynb and seclists_bias_testing.py for better functionality.
- Updated fetch_example_datasets.py for enhanced placeholder management.
Next steps:
- Write comprehensive unit and integration tests to validate functionality.
…chestrator logic in seclists_bias_testing.ipynb
…chestrator logic in seclists_bias_testing.ipynb
romanlutz
left a comment
Amazing! Thank you for your patience and for incorporating all our feedback. Let me know if there's anything you still meant to add, otherwise I'll be happy to merge.
Hi @romanlutz, I wanted to let you know that I've resolved all issues. Additionally, I've updated our file handling to include encoding='utf-8' to ensure compatibility with different languages. Everything is done on my end, and it can be merged. Please let me know if you need anything else, and thank you for your great support!
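To illustrate why the explicit `encoding='utf-8'` matters: on Windows, `open()` without an `encoding` argument typically falls back to the locale's code page, which can garble or reject non-ASCII names. The sketch below is only an illustration and not the code from this PR; `read_lines_utf8` and the sample file name are hypothetical.

```python
import tempfile
from pathlib import Path


def read_lines_utf8(path: Path) -> list[str]:
    """Read non-empty, stripped lines from a text file, always decoding as UTF-8."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Round-trip a few non-ASCII names through a temporary file.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "countries.txt"
    sample.write_text("Côte d'Ivoire\nTürkiye\n", encoding="utf-8")
    entries = read_lines_utf8(sample)
```

Passing the encoding on both the write and the read side keeps the behavior identical across platforms.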
Awesome! Can you add pycountry to the pyproject.toml? I think that's all that's left.
Hello Roman, Done! Thank you for your great support!
Hi @romanlutz,
To the best of my knowledge, I have completed the code implementation for the SecLists AI LLM Bias Testing (#267).
Summary
Added a function to fetch SecLists AI LLM Bias Testing datasets, process the data, and convert it into a PromptDataset. This includes handling placeholders for Country, Region, Nationality, Gender, and Skin-Color.
Changes
- pyrit/datasets/fetch_examples.py with the new function fetch_seclists_bias_testing_examples.
- 8_test_seclists_bias_testing.py to use the new fetch function.
Please review the changes and let me know if there are any improvements or adjustments needed.
Best regards,
Volkan
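For readers unfamiliar with the placeholder handling mentioned in the summary, here is a rough sketch of the general idea. It is not the implementation in this PR: it assumes placeholders appear in square brackets (for example `[Country]`), and the function name `fill_placeholders` and the sample values are hypothetical.

```python
import random
import re

# Hypothetical sample values; a real implementation would use fuller lists.
SAMPLE_VALUES = {
    "Country": ["Japan", "Brazil"],
    "Gender": ["female", "male"],
}


def fill_placeholders(template: str, values: dict[str, list[str]], rng: random.Random) -> str:
    """Replace each [Name] placeholder with a random value; leave unknown names untouched."""

    def substitute(match: re.Match) -> str:
        options = values.get(match.group(1))
        return rng.choice(options) if options else match.group(0)

    # Placeholder names are assumed to be letters and hyphens, e.g. [Skin-Color].
    return re.sub(r"\[([A-Za-z-]+)\]", substitute, template)


prompt = fill_placeholders(
    "Describe a typical [Gender] engineer from [Country].",
    SAMPLE_VALUES,
    random.Random(0),  # seeded so repeated runs pick the same values
)
```

Leaving unrecognized placeholders in place, rather than raising, makes it easier to spot templates that need a new value list.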