Description
🔍 Bug Summary
An automatic scan ("scan now") sends [object Object] in place of each tag name in the tag list, while a manual scan sends the proper list of existing tags.
📖 Description
I noticed that manual scans consistently gave me different results than the "scan now" button on the same documents. I looked at the API request going to the LLM and found the difference: the existing paperless-ngx tags are not transmitted when hitting "scan now".
🔄 Steps to Reproduce
I did a fresh install of paperless-ngx and paperless-ai. Analyze a document manually, then analyze a document with the "scan now" button, while watching the API calls. I am running a local LLM, which makes inspecting the API calls fairly easy.
✅ Expected Behavior
The tag list should be transmitted during both an automated scan and a manual scan.
❌ Actual Behavior
The tag list is not transmitted to the AI API during an automated scan; each tag name is replaced by [object Object].
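I have not confirmed this in the paperless-ai code, but the output looks like what JavaScript produces when an array of objects is joined into a string instead of an array of names. A minimal sketch of that behavior, where the tag data and the field names id and name are made up for illustration:

```javascript
// Hypothetical data: the field names (id, name) are assumptions,
// not taken from the paperless-ai source.
const tagNames = ["inbox", "Taxes/2024"];
const tagObjects = [
  { id: 1, name: "inbox" },
  { id: 2, name: "Taxes/2024" },
];

// Joining plain strings gives a readable list.
const fromNames = tagNames.join(", ");

// Joining objects converts each one with its default toString().
const fromObjects = tagObjects.join(", ");

// Picking the name out of each object first restores the readable list.
const fromMappedObjects = tagObjects.map((tag) => tag.name).join(", ");

console.log(fromNames);         // inbox, Taxes/2024
console.log(fromObjects);       // [object Object], [object Object]
console.log(fromMappedObjects); // inbox, Taxes/2024
```

If this is what happens, the "scan now" path is probably handing over tag objects where the manual path hands over tag names.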
🏷️ Paperless-AI Version
3.0.6
📜 Docker Logs
# Relevant portion of the API request transmitted during a manual scan
"messages": [
{
"role": "system",
"content": "\n Prexisting tags: Health/Prescription, inbox, Taxes/2024, Taxes/2025, Taxes/HSA\n\n\n Prexisiting correspondent: \n\n\n \n\n\n Return the result EXCLUSIVELY as a JSON object. The Tags, Title and Document_Type MUST be in the language that is used in the document.:\n IMPORTANT: The custom_fields are optional and can be left out if not needed, only try to fill out the values if you find a matching information in the document.\n Do not change the value of field_name, only fill out the values. If the field is about money only add the number without currency and always use a . for decimal places.\n {\n \"title\": \"xxxxx\",\n \"correspondent\": \"xxxxxxxx\",\n \"tags\": [\"Tag1\", \"Tag2\", \"Tag3\", \"Tag4\"],\n \"document_type\": \"Invoice/Contract/...\",\n \"document_date\": \"YYYY-MM-DD\",\n \"language\": \"en/de/es/...\",\n \"custom_fields\": {\n \"0\": {\n \"field_name\": \"Total\",\n \"value\": \"Fill in the value based on your analysis\"\n }\n }\n }"
},
# Relevant portion of the API request transmitted when using the "scan now" button
"messages": [
{
"role": "system",
"content": "\n Prexisting tags: [object Object], [object Object], [object Object], [object Object], [object Object], [object Object]\n\n\n Prexisiting correspondent: CVS Pharmacy #8617\n\n\n \n\n\n Return the result EXCLUSIVELY as a JSON object. The Tags, Title and Document_Type MUST be in the language that is used in the document.:\n IMPORTANT: The custom_fields are optional and can be left out if not needed, only try to fill out the values if you find a matching information in the document.\n Do not change the value of field_name, only fill out the values. If the field is about money only add the number without currency and always use a . for decimal places.\n {\n \"title\": \"xxxxx\",\n \"correspondent\": \"xxxxxxxx\",\n \"tags\": [\"Tag1\", \"Tag2\", \"Tag3\", \"Tag4\"],\n \"document_type\": \"Invoice/Contract/...\",\n \"document_date\": \"YYYY-MM-DD\",\n \"language\": \"en/de/es/...\",\n \"custom_fields\": {\n \"0\": {\n \"field_name\": \"Total\",\n \"value\": \"Fill in the value based on your analysis\"\n }\n }\n }"
},
📜 Paperless-ngx Logs
🖼️ Screenshots of your settings page
No response
🖥️ Desktop Environment
macOS
💻 OS Version
macos 16
🌐 Browser
None
🔢 Browser Version
No response
🌐 Mobile Browser
No response
📝 Additional Information
- I have checked existing issues and this is not a duplicate
- I have tried debugging this issue on my own
- I can provide a fix and submit a PR
- I am sure that this problem is affecting everyone, not only me
- I have provided all required information above
📌 Extra Notes
I'm not a programmer, but I can kind of read code. I found some debug statements commented out in customService.js, lines 166-177, and uncommented them to see if I got any extra information. Something about a manual scan versus the "scan now" button retrieves or processes the existing tags differently.
Manual Scan
paperless-ai | [DEBUG] System prompt:
paperless-ai | Prexisting tags: ai-processed, inbox, Medication, Metoprolol Tartrate, Pharmacy, Prescription, Refill, Refill Information
paperless-ai | Prexisiting correspondent: CVS Pharmacy #8617
paperless-ai |
paperless-ai | Return the result EXCLUSIVELY as a JSON object. The Tags, Title and Document_Type MUST be in the language that is used in the document.:
paperless-ai | IMPORTANT: The custom_fields are optional and can be left out if not needed, only try to fill out the values if you find a matching information in the document.
paperless-ai | Do not change the value of field_name, only fill out the values. If the field is about money only add the number without currency and always use a . for decimal places.
paperless-ai | {
paperless-ai | "title": "xxxxx",
paperless-ai | "correspondent": "xxxxxxxx",
paperless-ai | "tags": ["Tag1", "Tag2", "Tag3", "Tag4"],
paperless-ai | "document_type": "Invoice/Contract/...",
paperless-ai | "document_date": "YYYY-MM-DD",
paperless-ai | "language": "en/de/es/...",
paperless-ai | "custom_fields": {
paperless-ai | "0": {
paperless-ai | "field_name": "Total",
paperless-ai | "value": "Fill in the value based on your analysis"
paperless-ai | }
paperless-ai | }
paperless-ai | }
paperless-ai | [DEBUG] Prompt tags:
paperless-ai | [DEBUG] Model: mlx-community/QwQ-32B-bf16
paperless-ai | [DEBUG] Custom fields: "custom_fields": {
paperless-ai | "0": {
paperless-ai | "field_name": "Total",
paperless-ai | "value": "Fill in the value based on your analysis"
paperless-ai | }
paperless-ai | }
paperless-ai | [DEBUG] Existing tags: ai-processed, inbox, Medication, Metoprolol Tartrate, Pharmacy, Prescription, Refill, Refill Information
paperless-ai | [DEBUG] Existing correspondents: CVS Pharmacy #8617
paperless-ai | [DEBUG] Custom prompt: null
paperless-ai | [DEBUG] External API data: null
paperless-ai | ######################################################################
Scan Button
paperless-ai | [DEBUG] System prompt:
paperless-ai | Prexisting tags: [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object]
paperless-ai | Prexisiting correspondent: CVS Pharmacy #8617
paperless-ai |
paperless-ai | Return the result EXCLUSIVELY as a JSON object. The Tags, Title and Document_Type MUST be in the language that is used in the document.:
paperless-ai | IMPORTANT: The custom_fields are optional and can be left out if not needed, only try to fill out the values if you find a matching information in the document.
paperless-ai | Do not change the value of field_name, only fill out the values. If the field is about money only add the number without currency and always use a . for decimal places.
paperless-ai | {
paperless-ai | "title": "xxxxx",
paperless-ai | "correspondent": "xxxxxxxx",
paperless-ai | "tags": ["Tag1", "Tag2", "Tag3", "Tag4"],
paperless-ai | "document_type": "Invoice/Contract/...",
paperless-ai | "document_date": "YYYY-MM-DD",
paperless-ai | "language": "en/de/es/...",
paperless-ai | "custom_fields": {
paperless-ai | "0": {
paperless-ai | "field_name": "Total",
paperless-ai | "value": "Fill in the value based on your analysis"
paperless-ai | }
paperless-ai | }
paperless-ai | }
paperless-ai | [DEBUG] Prompt tags:
paperless-ai | [DEBUG] Model: mlx-community/QwQ-32B-bf16
paperless-ai | [DEBUG] Custom fields: "custom_fields": {
paperless-ai | "0": {
paperless-ai | "field_name": "Total",
paperless-ai | "value": "Fill in the value based on your analysis"
paperless-ai | }
paperless-ai | }
paperless-ai | [DEBUG] Existing tags: [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object]
paperless-ai | [DEBUG] Existing correspondents: CVS Pharmacy #8617
paperless-ai | [DEBUG] Custom prompt: null
paperless-ai | [DEBUG] External API data: null
paperless-ai | ######################################################################
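Going by the two logs above, one possible fix would be to normalize the tag list before it is written into the system prompt, so that both scan paths end up with the same string. This is only a sketch: formatTagList is a hypothetical helper that does not exist in paperless-ai, and the name property on the tag objects is an assumption about their shape.

```javascript
// Hypothetical helper, not part of paperless-ai. It accepts tags either as
// plain strings or as objects with a `name` property (an assumed shape) and
// returns one comma-separated string for the prompt.
function formatTagList(tags) {
  if (!Array.isArray(tags)) {
    return "";
  }
  return tags
    // Strings pass through; objects contribute their name, if they have one.
    .map((tag) => (typeof tag === "string" ? tag : tag && tag.name))
    // Drop anything that did not yield a non-empty string.
    .filter((name) => typeof name === "string" && name.length > 0)
    .join(", ");
}

// Both input styles produce the same prompt text.
const manualStyle = formatTagList(["ai-processed", "inbox"]);
const scanNowStyle = formatTagList([
  { id: 1, name: "ai-processed" },
  { id: 2, name: "inbox" },
]);

console.log(manualStyle);
console.log(scanNowStyle);
```

Normalizing at the point where the prompt is built would cover both callers without needing to know which one passes which shape.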