Skip to content

ziman/text

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

86 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Text

This library is an attempt to clean up text processing in Idris, especially with respect to text encodings.

The idea is that a string is a sequence of Unicode scalar values; this sequence can be encoded in various ways

  • a sequence of bytes
    • UTF-8
    • UTF-16
    • etc.
  • a List of code points
    • like String in Haskell
  • etc.

This library should provide a convenient interface to them.

MORE DOC TBD

Key types

  • Data.Text.Text as a UTF-8 specialisation of Data.Text.EncodedString
  • Data.Text.CodePoint.CodePoint
  • Data.Text.Encoding.Encoding

Note

This is mostly an API prototype and its implementation should certainly be improved in various ways, especially wrt. performance and error checking.

(A very non-exhaustive) list of things to do:

  • Create Text-only specialisations of the current modules (Data.Text, Lightyear.Text) to aid elaboration by fixing the encoding to UTF-8. (The encoding e in EncodedString e is sometimes not possible to infer if it is left totally general.)

Related work

About

Text framework for Idris

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Languages