Commit 22b1db9

Bugfix and optimize archive_wstring_append_from_mbs()
The call to mbrtowc() or mbtowc() should read up to mbs_length bytes, not wcs_length. This avoids out-of-bounds reads.

mbrtowc() and mbtowc() return (size_t)-1 with errno EILSEQ when they encounter an invalid multibyte character and (size_t)-2 when they encounter an incomplete multibyte character. As we return failure and all our callers error out, it makes no sense to continue parsing mbs.

As we allocate `len` wchars at the beginning and each wchar consumes at least one input byte, there will never be a need to grow the buffer, so the buffer-growing code can be left out. On the other hand, we are always allocating more memory than we need.

As long as wcs_length == mbs_length == len we can omit wcs_length. We keep the old code commented out in case we decide to save memory and use an auto-expanding wcs_length in the future.

Fixes #1276
1 parent dc06487 commit 22b1db9

1 file changed

+17
-11
lines changed

Diff for: libarchive/archive_string.c

@@ -591,7 +591,7 @@ archive_wstring_append_from_mbs(struct archive_wstring *dest,
 	 * No single byte will be more than one wide character,
 	 * so this length estimate will always be big enough.
 	 */
-	size_t wcs_length = len;
+	// size_t wcs_length = len;
 	size_t mbs_length = len;
 	const char *mbs = p;
 	wchar_t *wcs;
@@ -600,7 +600,11 @@ archive_wstring_append_from_mbs(struct archive_wstring *dest,
 
 	memset(&shift_state, 0, sizeof(shift_state));
 #endif
-	if (NULL == archive_wstring_ensure(dest, dest->length + wcs_length + 1))
+	/*
+	 * As we decided to have wcs_length == mbs_length == len
+	 * we can use len here instead of wcs_length
+	 */
+	if (NULL == archive_wstring_ensure(dest, dest->length + len + 1))
 		return (-1);
 	wcs = dest->s + dest->length;
 	/*
@@ -609,6 +613,12 @@ archive_wstring_append_from_mbs(struct archive_wstring *dest,
 	 * multi bytes.
 	 */
 	while (*mbs && mbs_length > 0) {
+		/*
+		 * The buffer we allocated is always big enough.
+		 * Keep this code path in a comment if we decide to choose
+		 * smaller wcs_length in the future
+		 */
+		/*
 		if (wcs_length == 0) {
 			dest->length = wcs - dest->s;
 			dest->s[dest->length] = L'\0';
@@ -618,24 +628,20 @@ archive_wstring_append_from_mbs(struct archive_wstring *dest,
 				return (-1);
 			wcs = dest->s + dest->length;
 		}
+		*/
 #if HAVE_MBRTOWC
-		r = mbrtowc(wcs, mbs, wcs_length, &shift_state);
+		r = mbrtowc(wcs, mbs, mbs_length, &shift_state);
 #else
-		r = mbtowc(wcs, mbs, wcs_length);
+		r = mbtowc(wcs, mbs, mbs_length);
 #endif
 		if (r == (size_t)-1 || r == (size_t)-2) {
 			ret_val = -1;
-			if (errno == EILSEQ) {
-				++mbs;
-				--mbs_length;
-				continue;
-			} else
-				break;
+			break;
 		}
 		if (r == 0 || r > mbs_length)
 			break;
 		wcs++;
-		wcs_length--;
+		// wcs_length--;
 		mbs += r;
 		mbs_length -= r;
 	}
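
For readers who want to try the conversion loop outside libarchive, here is a minimal standalone sketch of the same idea: bound each mbrtowc() call by the bytes remaining in the input, and stop at the first invalid or incomplete sequence. The helper name mbs_to_wcs_bounded() and its return convention are assumptions made for illustration and are not part of libarchive. Unlike the loop in the diff, this sketch checks the remaining length before dereferencing the input pointer.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, not libarchive API: convert at most `len` bytes of
 * the multibyte string `p` into `out`. The caller must provide room for
 * len + 1 wide characters; that is always enough because every wide
 * character consumes at least one input byte.
 *
 * Returns the number of wide characters written (excluding the
 * terminator), or (size_t)-1 on an invalid or incomplete sequence.
 */
static size_t
mbs_to_wcs_bounded(wchar_t *out, const char *p, size_t len)
{
	mbstate_t state;
	const char *mbs = p;
	size_t mbs_length = len;
	wchar_t *wcs = out;
	size_t r;

	memset(&state, 0, sizeof(state));
	while (mbs_length > 0 && *mbs != '\0') {
		/* Never let mbrtowc() look past the bytes we were given. */
		r = mbrtowc(wcs, mbs, mbs_length, &state);
		if (r == (size_t)-1 || r == (size_t)-2) {
			/* Invalid or incomplete sequence: give up. */
			*wcs = L'\0';
			return ((size_t)-1);
		}
		if (r == 0 || r > mbs_length)
			break;
		wcs++;
		mbs += r;
		mbs_length -= r;
	}
	*wcs = L'\0';
	return ((size_t)(wcs - out));
}
```

With a buffer of at least len + 1 wide characters, calling mbs_to_wcs_bounded(buf, "hello", 3) converts only the first three bytes even though the source string is longer. How bytes outside ASCII are treated depends on the current locale, so call setlocale() first when the input is not plain ASCII.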
