Commit ef9d3a1
fix: sanitize null bytes from text fields before PostgreSQL insertion (#238)
* fix: sanitize null bytes from text fields before PostgreSQL insertion
Fixes 'invalid byte sequence for encoding UTF8: 0x00' error during batch retain
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor: consolidate _sanitize_text into fact_extraction module
Address review feedback: reuse existing _sanitize_text from fact_extraction
instead of duplicating in fact_storage.
The consolidated function now handles both:
- Null bytes (\x00) for PostgreSQL compatibility
- Unicode surrogates (U+D800-U+DFFF) for UTF-8/LLM API compatibility
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
1 parent d788a55 commit ef9d3a1
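
The two cases the commit message lists for the consolidated function (null bytes and Unicode surrogates) can be sketched as below. This is a minimal illustration, not the actual `_sanitize_text` from `fact_extraction`, whose body is not shown on this page. The name `sanitize_text`, the `None` pass-through, and the choice to drop offending characters rather than replace them are all assumptions.

```python
import re
from typing import Optional

# Lone surrogate code points (U+D800-U+DFFF) cannot be encoded as UTF-8.
_SURROGATES = re.compile(r"[\ud800-\udfff]")


def sanitize_text(text: Optional[str]) -> Optional[str]:
    """Hypothetical stand-in for the consolidated sanitizer.

    Assumption: offending characters are removed outright; the real
    function may substitute a placeholder instead.
    """
    if text is None:
        return None
    # PostgreSQL text columns reject the NUL character (0x00).
    text = text.replace("\x00", "")
    # Strip surrogates so the string can be encoded as UTF-8.
    return _SURROGATES.sub("", text)
```

Applying such a function to every text field just before insertion would keep a single bad value from failing a whole batch, which is the failure mode the commit message describes.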
File tree
2 files changed (+18, -12) in hindsight-api/hindsight_api/engine/retain
Lines changed: 13 additions & 9 deletions
Changed lines (original → new): 57-77 → 57-81

Lines changed: 5 additions & 3 deletions
Changed lines (original → new): 8-13 → 8-14, 47-53 → 48-54, 56-62 → 57-63, 157-163 → 158-165