email.Header misparses mixed headers #37495

iko · 2002-11-18T14:33:32Z

BPO	640110
Nosy	@warsaw
Files	Header.diff: Patch for Header.py to fix header decoding issues

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = 'https://github.com/warsaw'
closed_at = <Date 2005-04-27.12:23:00.000>
created_at = <Date 2002-11-18.14:33:32.000>
labels = ['library']
title = 'email.Header misparses mixed headers'
updated_at = <Date 2005-04-27.12:23:00.000>
user = 'https://bugs.python.org/iko'

bugs.python.org fields:

activity = <Date 2005-04-27.12:23:00.000>
actor = 'kalinda'
assignee = 'barry'
closed = True
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2002-11-18.14:33:32.000>
creator = 'iko'
dependencies = []
files = ['691']
hgrepos = []
issue_num = 640110
keywords = []
message_count = 6.0
messages = ['13351', '13352', '13353', '13354', '13355', '13356']
nosy_count = 3.0
nosy_names = ['barry', 'iko', 'kalinda']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue640110'
versions = ['Python 2.2']

iko · 2002-11-18T14:33:32Z

email.Header.decode_header() misparses headers with
both encoded and unencoded words. This example from RFC2047

=?ISO-8859-1?Q?Andr=E9?= Pirard <PIRARD@vm1.ulg.ac.be>

gets parsed as

AndréPirard <PIRARD@vm1.ulg.ac.be>

where there should obviously be a space between André
and Pirard. RFC2047 says to ignore spaces between
encoded words (but not between encoded and unencoded
words, though it doesn't explicitly say so from what I
could find, and obviously not between unencoded words).
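
The whitespace rule described above can be checked with a short script. This is only a sketch, assuming a modern Python 3 where the module is spelled email.header (lower case) rather than the email.Header discussed in this report; the variable names are illustrative.

```python
# Assumption: Python 3, where the module is email.header (lower case).
from email.header import decode_header, make_header

raw = "=?ISO-8859-1?Q?Andr=E9?= Pirard <PIRARD@vm1.ulg.ac.be>"

# Encoded word followed by plain text: the separating space is significant.
mixed = str(make_header(decode_header(raw)))
print(mixed)     # André Pirard <PIRARD@vm1.ulg.ac.be>

# Two adjacent encoded words: the whitespace between them is ignored.
adjacent = str(make_header(decode_header("=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=")))
print(adjacent)  # ab
```

The second header is the "a b" example from RFC2047 itself, which is expected to display as "ab".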

Also, I see it's trying to handle continuation lines,
but it only does it if there are encoded words in the
continuation line. It barfs badly on this test case:

'Re: =?mac-iceland?q?r=8Aksm=9Arg=8Cs?= baz\n foo bar =?mac-iceland?q?r=8Aksm=9Arg=8Cs?='
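
For reference, the same test case can be run against a current interpreter. A minimal sketch, again assuming Python 3 and the lower-case email.header module; it says nothing about how the 2.2-era code behaved.

```python
# Assumption: Python 3, where the module is email.header (lower case).
from email.header import decode_header, make_header

# Folded header: the continuation line starts with plain, unencoded text.
folded = ("Re: =?mac-iceland?q?r=8Aksm=9Arg=8Cs?= baz\n"
          " foo bar =?mac-iceland?q?r=8Aksm=9Arg=8Cs?=")

parts = decode_header(folded)
for chunk, charset in parts:
    # Unencoded chunks are reported with charset None.
    print(repr(chunk), charset)

unfolded = str(make_header(parts))
print(unfolded)
```

The plain words on both sides of the line break should end up in a single unencoded chunk between the two encoded words.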

I think I'll just do a patch...

/Anders

P.S. It seems at least remotely related to Bug#552957

warsaw · 2003-03-06T06:50:26Z

Logged In: YES
user_id=12800

The first bug above has already been fixed in email 2.5
(python 2.3 cvs). The second pointed to a real bug, now
fixed I believe.

iko · 2003-03-06T14:15:28Z

Logged In: YES
user_id=14

The first bug is still there... With version 1.19 from CVS I
get this with my example:

>>> print unicode(Header.make_header(Header.decode_header('=?ISO-8859-1?Q?Andr=E9?= Pirard <PIRARD@vm1.ulg.ac.be>'))).encode('latin-1')
AndréPirard <PIRARD@vm1.ulg.ac.be>

The problem is that whitespace gets stripped off on line 91:
unenc = parts.pop(0).strip()
before we know whether it is significant or not.
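
One way to avoid stripping too early is to keep the gap text intact and decide about it only once both neighbours are known. The following is a hypothetical sketch of that idea, not the code in Header.py and not the attached patch; ENCODED_WORD and split_keep_significant_space are made-up names, and the pattern is a simplified stand-in for the real encoded-word regex.

```python
import re

# Simplified pattern for an RFC2047 encoded word (illustration only).
ENCODED_WORD = re.compile(r"=\?[^?]*\?[qQbB]\?[^?]*\?=")

def split_keep_significant_space(line):
    """Split a header line into (text, is_encoded) tokens.

    Whitespace is dropped only when it sits between two encoded words;
    everywhere else it is kept untouched.
    """
    tokens = []
    pos = 0
    for match in ENCODED_WORD.finditer(line):
        gap = line[pos:match.start()]
        previous_is_encoded = bool(tokens) and tokens[-1][1]
        # Both neighbours of the gap are known here, so decide now.
        if gap and not (gap.isspace() and previous_is_encoded):
            tokens.append((gap, False))
        tokens.append((match.group(), True))
        pos = match.end()
    if line[pos:]:
        tokens.append((line[pos:], False))
    return tokens

mixed_tokens = split_keep_significant_space(
    "=?ISO-8859-1?Q?Andr=E9?= Pirard <PIRARD@vm1.ulg.ac.be>")
adjacent_tokens = split_keep_significant_space(
    "=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=")
print(mixed_tokens)
print(adjacent_tokens)
```

With this approach the leading space of " Pirard ..." survives, while the lone space between the two encoded words is discarded.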

The continuation line bug seems to be fixed however.

/Anders

warsaw · 2003-03-06T16:21:54Z

Logged In: YES
user_id=12800

Try current cvs.

iko · 2003-03-06T16:43:47Z

Logged In: YES
user_id=14

Looks OK.

kalinda · 2005-04-27T12:23:00Z

Logged In: YES
user_id=661399

I am using python 2.4 and still have this problem. To be
more exact, line 73 in Header.py still strips the parts.
Is there a reason for this not being fixed?

iko mannequin closed this as completed Nov 18, 2002

iko mannequin assigned warsaw Nov 18, 2002

iko mannequin added the stdlib Python modules in the Lib dir label Nov 18, 2002

ezio-melotti transferred this issue from another repository Apr 9, 2022
