@jsong468
Contributor

@jsong468 jsong468 commented Jan 13, 2026

Description

Adds a new binary_path data type to represent arbitrary binary data stored in blob or disk storage. This is particularly useful for XPIA (cross-domain prompt injection attack) scenarios, where targets may not support specific file types.

  • Added binary_path to PromptDataType Literal in literals.py
  • Created BlobPathDataTypeSerializer class in data_type_serializer.py: data_on_disk() returns True, files are stored under the /binaries subdirectory, and the default extension is .bin
  • Updated data_serializer_factory to route binary_path to the new serializer
  • Updated PDFConverter to output binary_path instead of url, which was semantically incorrect
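
For illustration only, here is a minimal sketch of how a serializer with the properties listed above, and the factory routing to it, might look. It is not the code in this PR: SketchBinaryPathSerializer, sketch_serializer_factory, and results_dir are hypothetical names, and the synchronous save_data is a simplifying assumption.

```python
import tempfile
import uuid
from pathlib import Path
from typing import Literal, Optional

# Hypothetical, reduced stand-in for the PromptDataType literal.
PromptDataType = Literal["text", "url", "binary_path"]


class SketchBinaryPathSerializer:
    """Illustrative serializer: writes raw bytes to disk and keeps the file path as its value."""

    data_type: PromptDataType = "binary_path"
    data_sub_directory = "binaries"  # mirrors the /binaries subdirectory described above
    default_extension = "bin"        # mirrors the default .bin extension described above

    def __init__(self, results_dir: Path, value: Optional[str] = None,
                 extension: Optional[str] = None) -> None:
        self.results_dir = results_dir
        self.value = value
        self.file_extension = extension or self.default_extension

    def data_on_disk(self) -> bool:
        # The payload lives in a file; value only holds its path.
        return True

    def save_data(self, data: bytes) -> str:
        target_dir = self.results_dir / self.data_sub_directory
        target_dir.mkdir(parents=True, exist_ok=True)
        path = target_dir / f"{uuid.uuid4().hex}.{self.file_extension}"
        path.write_bytes(data)
        self.value = str(path)
        return self.value

    def read_data(self) -> bytes:
        if self.value is None:
            raise ValueError("No data has been saved yet.")
        return Path(self.value).read_bytes()


def sketch_serializer_factory(data_type: str, results_dir: Path,
                              value: Optional[str] = None,
                              extension: Optional[str] = None) -> SketchBinaryPathSerializer:
    """Route a data type name to its serializer (only binary_path is handled in this sketch)."""
    if data_type == "binary_path":
        return SketchBinaryPathSerializer(results_dir, value=value, extension=extension)
    raise ValueError(f"Data type {data_type!r} is not handled in this sketch")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        serializer = sketch_serializer_factory("binary_path", Path(tmp))
        saved_path = serializer.save_data(b"arbitrary bytes")
        print(Path(saved_path).parent.name, Path(saved_path).suffix)
```

The point of the sketch is that a binary_path value is always a path to a file, never the bytes themselves, so a target that cannot handle a specific file type can still be handed a generic binary artifact.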

Minor change:

  • content is no longer passed as the value argument to data_serializer_factory in the PDF converter. This was misleading because value should be a file path, and it was overridden by the save_data call on the next line anyway.
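
To make the reasoning concrete, the following sketch shows why an initial value is redundant when save_data is called right afterwards. SketchOnDiskSerializer and value_after_save are hypothetical names used only for this illustration and are not taken from pdf_converter.py.

```python
import tempfile
from pathlib import Path
from typing import Optional


class SketchOnDiskSerializer:
    """Illustrative on-disk serializer whose value is meant to be a file path."""

    def __init__(self, value: Optional[str] = None) -> None:
        self.value = value

    def save_data(self, data: bytes, directory: Path) -> None:
        path = directory / "output.bin"
        path.write_bytes(data)
        # Whatever value was passed to the constructor is replaced here.
        self.value = str(path)


def value_after_save(initial_value: Optional[str], data: bytes) -> str:
    """Return the file name stored in value once save_data has run."""
    with tempfile.TemporaryDirectory() as tmp:
        serializer = SketchOnDiskSerializer(value=initial_value)
        serializer.save_data(data, Path(tmp))
        assert serializer.value is not None
        return Path(serializer.value).name


if __name__ == "__main__":
    # Passing raw content as value and passing nothing end in the same state.
    print(value_after_save("raw content passed as value", b"payload"))
    print(value_after_save(None, b"payload"))
```

In both calls the final value is the saved file's path, so the raw content passed up front never had any effect.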

Tests and Documentation

Tests added/updated:

  • Added 6 new tests in test_data_type_serializer.py
  • Updated test_literals.py to include binary_path in expected literals
  • Updated test_pdf_converter.py to expect binary_path output type instead of url
  • Updated test_prompt_converter.py to ensure correct expected_output_type

Notebook:

  • Re-ran 5_file_converters.ipynb notebook

@jsong468 jsong468 changed the title FEAT: add_blob_data_type FEAT: Add blob_path data type Jan 13, 2026
@jsong468 jsong468 changed the title FEAT: Add blob_path data type FEAT: Add binary_path data type Jan 14, 2026
@hannahwestra25
Contributor

"content is no longer passed as an arg to value in data_serializer_factory. This was misleading since value should be a file path and was overridden in save_data method call in the next line anyway."
This is not in this PR, right? Or am I missing something?

@jsong468
Contributor Author

Line 412 of pdf_converter.py :) @hannahwestra25

@jsong468 jsong468 merged commit 4a40aa3 into Azure:main Jan 15, 2026
20 checks passed
3 participants