
Fixed the UTF-8 line folding bug.

git-svn-id: http://codespeak.net/svn/iCalendar/trunk@34875 fd0d7bf2-dfb6-0310-8d31-b7ecfe96aada
1 parent df9a3a3 commit ddf2f78c408689161cf2088109f0c7cbe79bd40b regebro committed Nov 22, 2006
Showing with 36 additions and 2 deletions.
  1. +7 −0 CHANGES.txt
  2. +29 −2 src/icalendar/parser.py
@@ -4,4 +4,11 @@ iCalendar 1.1 (unreleased)
* Fixed a bug in caselessdicts popitem.
(thanks to Michael Smith <msmith@fluendo.com>)
+* RFC 2445 was unclear on how to handle line folding when the fold point
+ falls in the middle of a UTF-8 character. This has been clarified
+ in the following discussion:
+ http://lists.osafoundation.org/pipermail/ietf-calsify/2006-August/001126.html
+ This is now implemented in iCalendar. It will not fold in the middle
+ of a UTF-8 character, but may fold in the middle of a UTF-8 combining
+ character sequence.
@@ -261,6 +261,11 @@ class Contentline(str):
>>> c
'123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 '
+ We do not fold within a UTF-8 character:
+ >>> c = Contentline('This line has a UTF-8 character where it should be folded. Make sure it g\xc3\xabts folded before that character.')
+ >>> '\xc3\xab' in str(c)
+ True
+
It can parse itself into parts. Which is a tuple of (name, params, vals)
>>> c = Contentline('dtstart:20050101T120000')
@@ -354,6 +359,7 @@ class Contentline(str):
>>> c = Contentline('key;param="pValue":value', strict=True)
>>> c.parts()
('key', Parameters({'PARAM': 'pValue'}), 'value')
+
"""
def __new__(cls, st, strict=False):
@@ -417,8 +423,29 @@ def __str__(self):
         "Long content lines are folded so they are less than 75 characters wide"
         l_line = len(self)
         new_lines = []
-        for i in range(0, l_line, 74):
-            new_lines.append(self[i:i+74])
+        start = 0
+        end = 74
+        while True:
+            if end >= l_line:
+                # The rest of the line fits; also avoids indexing self[l_line]
+                end = l_line
+            else:
+                # Check that we don't fold in the middle of a UTF-8 character:
+                # http://lists.osafoundation.org/pipermail/ietf-calsify/2006-August/001126.html
+                while True:
+                    char_value = ord(self[end])
+                    if char_value < 128 or char_value >= 192:
+                        # This is not in the middle of a UTF-8 character, so we
+                        # can fold here:
+                        break
+                    else:
+                        end -= 1
+
+            new_lines.append(self[start:end])
+            if end == l_line:
+                # Done
+                break
+            start = end
+            end = start + 74
         return '\r\n '.join(new_lines)
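
For readers who want to try the folding rule outside the class, here is a minimal standalone sketch in Python 3. It is not the library's code: the names `fold_utf8` and `is_continuation_byte` are hypothetical, and it works on `bytes` rather than the Python 2 `str` used in the commit. It relies on the same observation as the diff: bytes in the range 0x80..0xBF (128..191) are UTF-8 continuation bytes, so a fold point is moved back until it no longer lands on one. The sketch compares with `>=` so that a line of exactly `width` bytes never indexes past its end, and it falls back to a hard cut for malformed input, which is an assumption of this sketch rather than something the commit does.

```python
def is_continuation_byte(b: int) -> bool:
    """Return True for UTF-8 continuation bytes (bit pattern 10xxxxxx)."""
    return 0x80 <= b < 0xC0


def fold_utf8(line: bytes, width: int = 74) -> bytes:
    """Fold `line` into chunks of at most `width` bytes joined by CRLF + space,
    without splitting a multi-byte UTF-8 character (hypothetical helper)."""
    chunks = []
    start = 0
    total = len(line)
    while True:
        end = start + width
        if end >= total:
            # The remainder fits in one chunk.
            end = total
        else:
            # Move the fold point back while it sits on a continuation byte.
            while end > start and is_continuation_byte(line[end]):
                end -= 1
            if end == start:
                # Malformed input (only continuation bytes): cut at full width.
                end = start + width
        chunks.append(line[start:end])
        if end == total:
            break
        start = end
    return b"\r\n ".join(chunks)


# Same sentence as the doctest in the diff; the two-byte character starts at byte 73.
text = ("This line has a UTF-8 character where it should be folded. "
        "Make sure it g\u00ebts folded before that character.")
folded = fold_utf8(text.encode("utf-8"))
print(folded.decode("utf-8"))
```

With this input the first chunk is 73 bytes long instead of 74, because byte 74 is the second byte of the two-byte character; each resulting chunk still decodes as valid UTF-8.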
