Allow caller more control in stream decoding #448

Merged 96 commits on Feb 7, 2023
Commits
b509efd
Monad support for stream decoding
david-sledge Jun 27, 2022
54a1ad9
Refactored remove STMonadTrans dependency
david-sledge Jun 28, 2022
fc7b665
Remove unused commented code
david-sledge Jun 28, 2022
4dbf493
change source module for runRW#
david-sledge Jun 28, 2022
c052ad6
Added byte position (Int) argument to DecodeErrorM type.
david-sledge Jul 1, 2022
2e2c541
make streamDecodeUtf8WithM safe with Cont and List monads
david-sledge Jul 2, 2022
41492dc
Added StreamDecode which is like Decoding with more options. Added st…
david-sledge Jul 3, 2022
1a74c5a
Extracted Word8 from Maybe in StreamDecode: Offending Word8 is never …
david-sledge Jul 3, 2022
6b648ee
Indentation error in StreamDecode data type.
david-sledge Jul 3, 2022
bc72e4e
I see how it is, Haddock 8.x...
david-sledge Jul 3, 2022
8554b98
Design change to StreamDecode
david-sledge Jul 5, 2022
1bd9ef0
Stream decoding added for utf-16 and utf-32. Either function added fo…
david-sledge Jul 6, 2022
22b8666
Typo in documentation
david-sledge Jul 7, 2022
1dd6f34
Redesigned with different approach
david-sledge Jul 11, 2022
c154410
fix documentation
david-sledge Jul 11, 2022
8414af3
Lazy and strict stream decoders built from chunk decoders
david-sledge Jul 12, 2022
e1eded8
removed decodeUtf[X]Stream functions but left the function chunksDeco…
david-sledge Jul 13, 2022
ff571c0
Added missing lazy decodeAsciiE function. Updated documentation inclu…
david-sledge Jul 13, 2022
7d0757a
created module Data.Text.Encoding.Common and moved the contents Types…
david-sledge Jul 16, 2022
270a605
replace <> with mappend
david-sledge Jul 17, 2022
135e7f5
fix haddocks, remove commented code, remove mtl dependecy for tests
david-sledge Jul 17, 2022
7452074
remove unused extension from Tests.Properties.Transcoding
david-sledge Jul 17, 2022
438ab9b
Accommodating older versions of Haddock
david-sledge Jul 17, 2022
623eff3
Removed double copy of UTF-8 optimization. Refactored UTF-16 and UTF-…
david-sledge Jul 20, 2022
30a3ae1
Clarify DecodeResult documentation. Simplify WriteAndProgress data co…
david-sledge Jul 21, 2022
0b4bcea
Replaced decodeAsciiE with decodeAsciiChunks (lazy and strict)
david-sledge Jul 23, 2022
e1b2569
minor mistake in calculating inital array length for text array in de…
david-sledge Jul 23, 2022
c5594dc
Fixed default isValidBS function
david-sledge Jul 23, 2022
0fefb3a
Corrected logic for UTF-8 boundary checking
david-sledge Jul 24, 2022
d098649
Cleaned up decodeChunks tests
david-sledge Jul 24, 2022
0483279
Refactored chunk and ASCII decoders
david-sledge Jul 26, 2022
82ab16a
version typo
david-sledge Jul 26, 2022
0b6e5c5
rename from 'Decode' to 'Detect'
david-sledge Jul 27, 2022
63790fc
Remove unused newtype CodePoint
david-sledge Jul 27, 2022
f4bf3a7
Added missing error message in streamDecodeUtf8With
david-sledge Jul 27, 2022
18edcb2
Reintroduce simdutf
david-sledge Aug 4, 2022
d69964f
Realized the otherwise scenario in guessUtf8Boundary was acting like …
david-sledge Aug 9, 2022
0db3c8a
A little clean up.
david-sledge Aug 9, 2022
c48ea34
move test utility function whenEqProp
david-sledge Aug 10, 2022
1c5f61a
Refactor decodeAsciiPrefix
Lysxia Aug 28, 2022
1ae9ede
Stylistic nits in decodeUtf8Chunks
Lysxia Aug 28, 2022
0a11101
Back-to-the-drawing-board redesign of a UTF-8 decoder with more calle…
david-sledge Oct 2, 2022
2594c42
Merge branch 'haskell:master' into streamDecodeUtf8WithM
david-sledge Oct 2, 2022
423ef2b
Update changelog
david-sledge Oct 2, 2022
d2aa031
When possible copy whole bytestring at once regardless of code point …
david-sledge Oct 6, 2022
ac6a31c
More test cases
david-sledge Oct 6, 2022
340934e
start of decomposition of decodeUtf8Chunks
david-sledge Oct 11, 2022
1e1b4ad
Merge branch 'haskell:master' into streamDecodeUtf8WithM
david-sledge Oct 12, 2022
05d898e
little refactorin' on the prototype.
david-sledge Oct 12, 2022
344c5c1
parseUtf8Chunk adjustments
david-sledge Oct 17, 2022
dd1f57d
a little more prototypin' UTF-8 parser/decoder...
david-sledge Oct 17, 2022
ee53c15
getting there...
david-sledge Oct 23, 2022
c971a88
A little refactorin'
david-sledge Oct 24, 2022
d63effa
Documenting new functions.
david-sledge Oct 26, 2022
1f18883
Check tests before pushing...
david-sledge Oct 26, 2022
8dc84fe
Finish documentation... for now.
david-sledge Nov 1, 2022
37d19de
minor changes
david-sledge Nov 14, 2022
57abbb4
Merge branch 'haskell:master' into streamDecodeUtf8WithM
david-sledge Nov 19, 2022
ee60956
Merge branch 'haskell:master' into streamDecodeUtf8WithM
david-sledge Nov 23, 2022
51f754b
Merge branch 'haskell:master' into streamDecodeUtf8WithM
david-sledge Dec 11, 2022
c630f83
Merge branch 'haskell:master' into streamDecodeUtf8WithM
david-sledge Dec 14, 2022
1e3db5b
Update src/Data/Text/Encoding.hs
david-sledge Dec 30, 2022
90a0c5b
encode partial codepoint in Word32
david-sledge Jan 14, 2023
2e4988e
Rebase against master
andrewthad Jan 3, 2023
6e51fdd
move test utility function whenEqProp
david-sledge Aug 10, 2022
8514677
Merge branch 'master' into streamDecodeUtf8WithM
david-sledge Jan 14, 2023
47b4dca
remove sponge left in patient
david-sledge Jan 14, 2023
00fe6cc
Merge remote-tracking branch 'origin/master' into streamDecodeUtf8WithM
Lysxia Jan 30, 2023
9fb6205
Refactor decodeASCII
Lysxia Jan 30, 2023
fe5b4e6
Refactor decodeUtf8 (WIP)
Lysxia Jan 31, 2023
4d9c090
More streaming decodeUtf8 (WIP)
Lysxia Jan 31, 2023
1a96f2d
Fix bugs, fix tests
Lysxia Feb 3, 2023
6a65ed6
Rework docs
Lysxia Feb 3, 2023
27d2321
Fix tests
Lysxia Feb 3, 2023
4479e6e
Space
Lysxia Feb 3, 2023
01d8824
Fix test for old base
Lysxia Feb 4, 2023
25859ad
Add textToStrictBuilder
Lysxia Feb 4, 2023
ace7c61
Update changelog.md
Lysxia Feb 4, 2023
2603294
Add laws
Lysxia Feb 4, 2023
01344a2
Add some inline pragmas
Lysxia Feb 4, 2023
c36c95f
Optimize first iteration of decodeUtf8With1
Lysxia Feb 5, 2023
89f2207
Go CPS
Lysxia Feb 5, 2023
b3ec653
More doc, explicit exports
Lysxia Feb 6, 2023
7656b01
Make StrictBuilder module, explicit exports
Lysxia Feb 6, 2023
f663a6e
Revert changes to Data.Text.Internal.Encoding.Utf8
Lysxia Feb 6, 2023
bc65921
Docs
Lysxia Feb 6, 2023
28a7460
Fix short-circuit
Lysxia Feb 6, 2023
9e8e0f9
Doc
Lysxia Feb 6, 2023
1f5a873
Sort imports
Lysxia Feb 6, 2023
72f91ee
Undo useless optimization
Lysxia Feb 6, 2023
d347bba
import Semigroup for old base
Lysxia Feb 6, 2023
82bcef3
Merge remote-tracking branch 'origin/master' into streamDecodeUtf8WithM
Lysxia Feb 6, 2023
9a0c8a8
Clean up imports
Lysxia Feb 6, 2023
64fd029
Minimize test diff
Lysxia Feb 6, 2023
e136cad
test: sort imports
Lysxia Feb 6, 2023
74db3eb
Apply suggestions
Lysxia Feb 6, 2023
25 changes: 25 additions & 0 deletions changelog.md
@@ -2,6 +2,31 @@

* Remove support for GHC 8.0.

### 2.0.2

* Add decoding functions in `Data.Text.Encoding` that allow
  more control for error handling and for how to allocate text.
  (https://github.com/haskell/text/pull/448 Thanks to David Sledge)
  * `decodeASCIIPrefix`
  * `decodeUtf8Chunk`
  * `decodeUtf8More`
  * `Utf8ValidState`
  * `startUtf8ValidState`
  * `StrictBuilder`
  * `strictBuilderToText`
  * `textToStrictBuilder`
  * `validateUtf8Chunk`
  * `validateUtf8More`

* Fix quadratic slowdown when decoding invalid UTF-8 bytestrings
  (https://github.com/haskell/text/issues/495)

* Add internal module `Data.Text.Internal.StrictBuilder`

* Add internal module `Data.Text.Internal.Encoding`

* Add `Data.Text.Internal.Encoding.Utf8.updateDecoderState` and export `utf8{Accept,Reject}State` from the same module.

### 2.0.1

* Improve portability of C and C++ code.