/****************************************************************************
**
** Copyright (C) 2019 The Qt Company Ltd.
** Copyright (C) 2016 Intel Corporation.
** Copyright (C) 2019 Klarälvdalens Datakonsult AB, a KDAB Group company, info@kdab.com, author Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
** Contact: https://www.qt.io/licensing/
**
** This file is part of the QtCore module of the Qt Toolkit.
**
** $QT_BEGIN_LICENSE:LGPL$
** Commercial License Usage
** Licensees holding valid commercial Qt licenses may use this file in
** accordance with the commercial license agreement provided with the
** Software or, alternatively, in accordance with the terms contained in
** a written agreement between you and The Qt Company. For licensing terms
** and conditions see https://www.qt.io/terms-conditions. For further
** information use the contact form at https://www.qt.io/contact-us.
**
** GNU Lesser General Public License Usage
** Alternatively, this file may be used under the terms of the GNU Lesser
** General Public License version 3 as published by the Free Software
** Foundation and appearing in the file LICENSE.LGPL3 included in the
** packaging of this file. Please review the following information to
** ensure the GNU Lesser General Public License version 3 requirements
** will be met: https://www.gnu.org/licenses/lgpl-3.0.html.
**
** GNU General Public License Usage
** Alternatively, this file may be used under the terms of the GNU
** General Public License version 2.0 or (at your option) the GNU General
** Public license version 3 or any later version approved by the KDE Free
** Qt Foundation. The licenses are as published by the Free Software
** Foundation and appearing in the file LICENSE.GPL2 and LICENSE.GPL3
** included in the packaging of this file. Please review the following
** information to ensure the GNU General Public License requirements will
** be met: https://www.gnu.org/licenses/gpl-2.0.html and
** https://www.gnu.org/licenses/gpl-3.0.html.
**
** $QT_END_LICENSE$
**
****************************************************************************/
#include "qbytearray.h"
#include "qbytearraymatcher.h"
#include "private/qtools_p.h"
#include "qhashfunctions.h"
#include "qstring.h"
#include "qlist.h"
#include "qlocale.h"
#include "qlocale_p.h"
#include "qlocale_tools_p.h"
#include "private/qnumeric_p.h"
#include "private/qsimd_p.h"
#include "qstringalgorithms_p.h"
#include "qscopedpointer.h"
#include "qbytearray_p.h"
#include <qdatastream.h>
#include <qmath.h>
#ifndef QT_NO_COMPRESS
#include <zconf.h>
#include <zlib.h>
#endif
#include <ctype.h>
#include <limits.h>
#include <string.h>
#include <stdlib.h>
#define IS_RAW_DATA(d) ((d)->offset != sizeof(QByteArrayData))
QT_BEGIN_NAMESPACE
// Latin 1 case system, used by QByteArray::to{Upper,Lower}() and qstr(n)icmp():
/*
#!/usr/bin/perl -l
use feature "unicode_strings";
for (0..255) {
$up = uc(chr($_));
$up = chr($_) if ord($up) > 0x100 || length $up > 1;
printf "0x%02x,", ord($up);
print "" if ($_ & 0xf) == 0xf;
}
*/
static const uchar latin1_uppercased[256] = {
0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x0f,
0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f,
0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x29,0x2a,0x2b,0x2c,0x2d,0x2e,0x2f,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x3a,0x3b,0x3c,0x3d,0x3e,0x3f,
0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f,
0x50,0x51,0x52,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5a,0x5b,0x5c,0x5d,0x5e,0x5f,
0x60,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f,
0x50,0x51,0x52,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5a,0x7b,0x7c,0x7d,0x7e,0x7f,
0x80,0x81,0x82,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x8a,0x8b,0x8c,0x8d,0x8e,0x8f,
0x90,0x91,0x92,0x93,0x94,0x95,0x96,0x97,0x98,0x99,0x9a,0x9b,0x9c,0x9d,0x9e,0x9f,
0xa0,0xa1,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xaa,0xab,0xac,0xad,0xae,0xaf,
0xb0,0xb1,0xb2,0xb3,0xb4,0xb5,0xb6,0xb7,0xb8,0xb9,0xba,0xbb,0xbc,0xbd,0xbe,0xbf,
0xc0,0xc1,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xca,0xcb,0xcc,0xcd,0xce,0xcf,
0xd0,0xd1,0xd2,0xd3,0xd4,0xd5,0xd6,0xd7,0xd8,0xd9,0xda,0xdb,0xdc,0xdd,0xde,0xdf,
0xc0,0xc1,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xca,0xcb,0xcc,0xcd,0xce,0xcf,
0xd0,0xd1,0xd2,0xd3,0xd4,0xd5,0xd6,0xf7,0xd8,0xd9,0xda,0xdb,0xdc,0xdd,0xde,0xff
};
/*
#!/usr/bin/perl -l
use feature "unicode_strings";
for (0..255) {
$up = lc(chr($_));
$up = chr($_) if ord($up) > 0x100 || length $up > 1;
printf "0x%02x,", ord($up);
print "" if ($_ & 0xf) == 0xf;
}
*/
static const uchar latin1_lowercased[256] = {
0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x0f,
0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f,
0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x29,0x2a,0x2b,0x2c,0x2d,0x2e,0x2f,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x3a,0x3b,0x3c,0x3d,0x3e,0x3f,
0x40,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6a,0x6b,0x6c,0x6d,0x6e,0x6f,
0x70,0x71,0x72,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0x5b,0x5c,0x5d,0x5e,0x5f,
0x60,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6a,0x6b,0x6c,0x6d,0x6e,0x6f,
0x70,0x71,0x72,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0x7b,0x7c,0x7d,0x7e,0x7f,
0x80,0x81,0x82,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x8a,0x8b,0x8c,0x8d,0x8e,0x8f,
0x90,0x91,0x92,0x93,0x94,0x95,0x96,0x97,0x98,0x99,0x9a,0x9b,0x9c,0x9d,0x9e,0x9f,
0xa0,0xa1,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xaa,0xab,0xac,0xad,0xae,0xaf,
0xb0,0xb1,0xb2,0xb3,0xb4,0xb5,0xb6,0xb7,0xb8,0xb9,0xba,0xbb,0xbc,0xbd,0xbe,0xbf,
0xe0,0xe1,0xe2,0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0xea,0xeb,0xec,0xed,0xee,0xef,
0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xd7,0xf8,0xf9,0xfa,0xfb,0xfc,0xfd,0xfe,0xdf,
0xe0,0xe1,0xe2,0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0xea,0xeb,0xec,0xed,0xee,0xef,
0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7,0xf8,0xf9,0xfa,0xfb,0xfc,0xfd,0xfe,0xff
};
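// Illustrative sketch, kept out of the build: the two tables above appear to
// follow a simple arithmetic rule, which the standalone functions below
// restate. The names latin1ToLowerSketch() and latin1ToUpperSketch() are
// hypothetical and are not part of Qt; the tables remain the authoritative
// definition.
#if 0
```cpp
#include <cstdint>

// Lowercase one Latin-1 byte: 'A'-'Z' and 0xC0-0xDE move up by 0x20,
// except 0xD7 (multiplication sign), which has no case.
inline std::uint8_t latin1ToLowerSketch(std::uint8_t c)
{
    const bool asciiUpper = c >= 'A' && c <= 'Z';
    const bool latinUpper = c >= 0xC0 && c <= 0xDE && c != 0xD7;
    return (asciiUpper || latinUpper) ? std::uint8_t(c + 0x20) : c;
}

// Uppercase one Latin-1 byte: 'a'-'z' and 0xE0-0xFE move down by 0x20,
// except 0xF7 (division sign). 0xB5, 0xDF and 0xFF are left unchanged
// because their uppercase forms do not fit in a single Latin-1 byte.
inline std::uint8_t latin1ToUpperSketch(std::uint8_t c)
{
    const bool asciiLower = c >= 'a' && c <= 'z';
    const bool latinLower = c >= 0xE0 && c <= 0xFE && c != 0xF7;
    return (asciiLower || latinLower) ? std::uint8_t(c - 0x20) : c;
}
```
#endif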
int qFindByteArray(
const char *haystack0, int haystackLen, int from,
const char *needle0, int needleLen);
/*****************************************************************************
Safe and portable C string functions; extensions to standard string.h
*****************************************************************************/
/*! \relates QByteArray
Returns a duplicate string.
Allocates space for a copy of \a src, copies it, and returns a
pointer to the copy. If \a src is \nullptr, it immediately returns
\nullptr.
Ownership is passed to the caller, so the returned string must be
deleted using \c delete[].
*/
char *qstrdup(const char *src)
{
if (!src)
return nullptr;
char *dst = new char[strlen(src) + 1];
return qstrcpy(dst, src);
}
/*! \relates QByteArray
Copies all the characters up to and including the '\\0' from \a
src into \a dst and returns a pointer to \a dst. If \a src is
\nullptr, it immediately returns \nullptr.
This function assumes that \a dst is large enough to hold the
contents of \a src.
\note If \a dst and \a src overlap, the behavior is undefined.
\sa qstrncpy()
*/
char *qstrcpy(char *dst, const char *src)
{
if (!src)
return nullptr;
#ifdef Q_CC_MSVC
const int len = int(strlen(src));
// This is actually not secure!!! It will be fixed
// properly in a later release!
if (len >= 0 && strcpy_s(dst, len+1, src) == 0)
return dst;
return nullptr;
#else
return strcpy(dst, src);
#endif
}
/*! \relates QByteArray
A safe \c strncpy() function.
Copies at most \a len bytes from \a src (stopping at \a len or the
terminating '\\0' whichever comes first) into \a dst and returns a
pointer to \a dst. Guarantees that \a dst is '\\0'-terminated. If
\a src or \a dst is \nullptr, returns \nullptr immediately.
This function assumes that \a dst is at least \a len characters
long.
\note If \a dst and \a src overlap, the behavior is undefined.
\note When compiling with Visual C++ compiler version 14.00
(Visual C++ 2005) or later, internally the function strncpy_s
will be used.
\sa qstrcpy()
*/
char *qstrncpy(char *dst, const char *src, uint len)
{
if (!src || !dst)
return nullptr;
if (len > 0) {
#ifdef Q_CC_MSVC
strncpy_s(dst, len, src, len - 1);
#else
strncpy(dst, src, len);
#endif
dst[len-1] = '\0';
}
return dst;
}
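// Illustrative sketch, kept out of the build: a standalone restatement of the
// truncate-and-terminate behavior of the non-MSVC branch above, using only
// the C++ standard library. boundedCopySketch() is a hypothetical name, not a
// Qt function.
#if 0
```cpp
#include <cstddef>
#include <cstring>

// Copy at most len - 1 bytes of src into dst and always terminate dst.
// Returns dst, or nullptr if either pointer is null.
inline char *boundedCopySketch(char *dst, const char *src, std::size_t len)
{
    if (!src || !dst)
        return nullptr;
    if (len > 0) {
        std::strncpy(dst, src, len); // pads with '\0' when src is shorter than len
        dst[len - 1] = '\0';         // forces termination when src was truncated
    }
    return dst;
}
```
#endif
// With a 4-byte destination, a longer source is cut to three characters plus
// the terminator, so the caller must size dst for the longest string it needs
// to keep intact.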
/*! \fn uint qstrlen(const char *str)
\relates QByteArray
A safe \c strlen() function.
Returns the number of characters that precede the terminating '\\0',
or 0 if \a str is \nullptr.
\sa qstrnlen()
*/
/*! \fn uint qstrnlen(const char *str, uint maxlen)
\relates QByteArray
\since 4.2
A safe \c strnlen() function.
Returns the number of characters that precede the terminating '\\0', but
at most \a maxlen. If \a str is \nullptr, returns 0.
\sa qstrlen()
*/
/*!
\relates QByteArray
A safe \c strcmp() function.
Compares \a str1 and \a str2. Returns a negative value if \a str1
is less than \a str2, 0 if \a str1 is equal to \a str2 or a
positive value if \a str1 is greater than \a str2.
Special case 1: Returns 0 if \a str1 and \a str2 are both \nullptr.
Special case 2: Returns an arbitrary non-zero value if \a str1 is
\nullptr or \a str2 is \nullptr (but not both).
\sa qstrncmp(), qstricmp(), qstrnicmp(), {8-bit Character Comparisons},
QByteArray::compare()
*/
int qstrcmp(const char *str1, const char *str2)
{
return (str1 && str2) ? strcmp(str1, str2)
: (str1 ? 1 : (str2 ? -1 : 0));
}
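// Illustrative sketch, kept out of the build: the null-pointer special cases
// of the comparison above, spelled out as separate branches.
// nullSafeCompareSketch() is a hypothetical name, not a Qt function.
#if 0
```cpp
#include <cstring>

// Orders a null pointer before any non-null string; two nulls compare equal.
inline int nullSafeCompareSketch(const char *a, const char *b)
{
    if (a && b)
        return std::strcmp(a, b);
    if (a)
        return 1;      // only b is null
    return b ? -1 : 0; // only a is null, or both are null
}
```
#endif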
/*! \fn int qstrncmp(const char *str1, const char *str2, uint len);
\relates QByteArray
A safe \c strncmp() function.
Compares at most \a len bytes of \a str1 and \a str2.
Returns a negative value if \a str1 is less than \a str2, 0 if \a
str1 is equal to \a str2 or a positive value if \a str1 is greater
than \a str2.
Special case 1: Returns 0 if \a str1 and \a str2 are both \nullptr.
Special case 2: Returns an arbitrary non-zero value if \a str1 is \nullptr
or \a str2 is \nullptr (but not both).
\sa qstrcmp(), qstricmp(), qstrnicmp(), {8-bit Character Comparisons},
QByteArray::compare()
*/
/*! \relates QByteArray
A safe \c stricmp() function.
Compares \a str1 and \a str2 ignoring the case of the
characters. The encoding of the strings is assumed to be Latin-1.
Returns a negative value if \a str1 is less than \a str2, 0 if \a
str1 is equal to \a str2 or a positive value if \a str1 is greater
than \a str2.
Special case 1: Returns 0 if \a str1 and \a str2 are both \nullptr.
Special case 2: Returns an arbitrary non-zero value if \a str1 is \nullptr
or \a str2 is \nullptr (but not both).
\sa qstrcmp(), qstrncmp(), qstrnicmp(), {8-bit Character Comparisons},
QByteArray::compare()
*/
int qstricmp(const char *str1, const char *str2)
{
const uchar *s1 = reinterpret_cast<const uchar *>(str1);
const uchar *s2 = reinterpret_cast<const uchar *>(str2);
if (!s1)
return s2 ? -1 : 0;
if (!s2)
return 1;
enum { Incomplete = 256 };
qptrdiff offset = 0;
auto innerCompare = [=, &offset](qptrdiff max, bool unlimited) {
max += offset;
do {
uchar c = latin1_lowercased[s1[offset]];
int res = c - latin1_lowercased[s2[offset]];
if (Q_UNLIKELY(res))
return res;
if (Q_UNLIKELY(!c))
return 0;
++offset;
} while (unlimited || offset < max);
return int(Incomplete);
};
#if defined(__SSE4_1__) && !(defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer))
enum { PageSize = 4096, PageMask = PageSize - 1 };
const __m128i zero = _mm_setzero_si128();
forever {
// Calculate how many bytes we can load until we cross a page boundary
// for either source. This isn't an exact calculation, just something
// very quick.
quintptr u1 = quintptr(s1 + offset);
quintptr u2 = quintptr(s2 + offset);
uint n = PageSize - ((u1 | u2) & PageMask);
qptrdiff maxoffset = offset + n;
for ( ; offset + 16 <= maxoffset; offset += sizeof(__m128i)) {
// load 16 bytes from either source
__m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i *>(s1 + offset));
__m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i *>(s2 + offset));
// compare the two against each other
__m128i cmp = _mm_cmpeq_epi8(a, b);
// find NUL terminators too
cmp = _mm_min_epu8(cmp, a);
cmp = _mm_cmpeq_epi8(cmp, zero);
// was there any difference or a NUL?
uint mask = _mm_movemask_epi8(cmp);
if (mask) {
// yes, find out where
uint start = qCountTrailingZeroBits(mask);
uint end = sizeof(mask) * 8 - qCountLeadingZeroBits(mask);
Q_ASSUME(end >= start);
offset += start;
n = end - start;
break;
}
}
// using SIMD could cause a page fault, so iterate byte by byte
int res = innerCompare(n, false);
if (res != Incomplete)
return res;
}
#endif
return innerCompare(-1, true);
}
/*! \relates QByteArray
A safe \c strnicmp() function.
Compares at most \a len bytes of \a str1 and \a str2 ignoring the
case of the characters. The encoding of the strings is assumed to
be Latin-1.
Returns a negative value if \a str1 is less than \a str2, 0 if \a str1
is equal to \a str2 or a positive value if \a str1 is greater than \a
str2.
Special case 1: Returns 0 if \a str1 and \a str2 are both \nullptr.
Special case 2: Returns an arbitrary non-zero value if \a str1 is \nullptr
or \a str2 is \nullptr (but not both).
\sa qstrcmp(), qstrncmp(), qstricmp(), {8-bit Character Comparisons},
QByteArray::compare()
*/
int qstrnicmp(const char *str1, const char *str2, uint len)
{
const uchar *s1 = reinterpret_cast<const uchar *>(str1);
const uchar *s2 = reinterpret_cast<const uchar *>(str2);
int res;
uchar c;
if (!s1 || !s2)
return s1 ? 1 : (s2 ? -1 : 0);
for (; len--; s1++, s2++) {
if ((res = (c = latin1_lowercased[*s1]) - latin1_lowercased[*s2]))
return res;
if (!c) // strings are equal
break;
}
return 0;
}
/*!
\internal
\since 5.12
A helper for QByteArray::compare. Compares \a len1 bytes from \a str1 to \a
len2 bytes from \a str2. If \a len2 is -1, then \a str2 is expected to be
'\\0'-terminated.
*/
int qstrnicmp(const char *str1, qsizetype len1, const char *str2, qsizetype len2)
{
Q_ASSERT(str1);
Q_ASSERT(len1 >= 0);
Q_ASSERT(len2 >= -1);
const uchar *s1 = reinterpret_cast<const uchar *>(str1);
const uchar *s2 = reinterpret_cast<const uchar *>(str2);
if (!s2)
return len1 == 0 ? 0 : 1;
int res;
uchar c;
if (len2 == -1) {
// null-terminated str2
qsizetype i;
for (i = 0; i < len1; ++i) {
c = latin1_lowercased[s2[i]];
if (!c)
return 1;
res = latin1_lowercased[s1[i]] - c;
if (res)
return res;
}
c = latin1_lowercased[s2[i]];
return c ? -1 : 0;
} else {
// not null-terminated
for (qsizetype i = 0; i < qMin(len1, len2); ++i) {
c = latin1_lowercased[s2[i]];
res = latin1_lowercased[s1[i]] - c;
if (res)
return res;
}
if (len1 == len2)
return 0;
return len1 < len2 ? -1 : 1;
}
}
/*!
\internal
### Qt6: replace the QByteArray parameter with [pointer,len] pair
*/
int qstrcmp(const QByteArray &str1, const char *str2)
{
if (!str2)
return str1.isEmpty() ? 0 : +1;
const char *str1data = str1.constData();
const char *str1end = str1data + str1.length();
for ( ; str1data < str1end && *str2; ++str1data, ++str2) {
int diff = int(uchar(*str1data)) - uchar(*str2);
if (diff)
// found a difference
return diff;
}
// Why did we stop?
if (*str2 != '\0')
// not the null, so we stopped because str1 is shorter
return -1;
if (str1data < str1end)
// we haven't reached the end, so str1 must be longer
return +1;
return 0;
}
/*!
\internal
### Qt6: replace the QByteArray parameter with [pointer,len] pair
*/
int qstrcmp(const QByteArray &str1, const QByteArray &str2)
{
int l1 = str1.length();
int l2 = str2.length();
int ret = memcmp(str1.constData(), str2.constData(), qMin(l1, l2));
if (ret != 0)
return ret;
// they matched qMin(l1, l2) bytes
// so the longer one is lexically after the shorter one
return l1 - l2;
}
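// Illustrative sketch, kept out of the build: the same compare-then-break-ties
// idea expressed over [pointer, length] pairs with only the standard library.
// compareBytesSketch() is a hypothetical name; unlike the function above it
// returns only the sign of the length difference, which is an assumption made
// here to avoid narrowing a size_t difference to int.
#if 0
```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Compare two byte ranges. Embedded '\0' bytes are ordinary data.
// Returns <0, 0 or >0; when one range is a prefix of the other,
// the shorter range sorts first.
inline int compareBytesSketch(const char *a, std::size_t alen,
                              const char *b, std::size_t blen)
{
    const std::size_t common = std::min(alen, blen);
    const int ret = common ? std::memcmp(a, b, common) : 0;
    if (ret != 0)
        return ret;
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;
}
```
#endif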
// the CRC table below is created by the following piece of code
#if 0
static void createCRC16Table() // build CRC16 lookup table
{
unsigned int i;
unsigned int j;
unsigned short crc_tbl[16];
unsigned int v0, v1, v2, v3;
for (i = 0; i < 16; i++) {
v0 = i & 1;
v1 = (i >> 1) & 1;
v2 = (i >> 2) & 1;
v3 = (i >> 3) & 1;
j = 0;
#undef SET_BIT
#define SET_BIT(x, b, v) (x) |= (v) << (b)
SET_BIT(j, 0, v0);
SET_BIT(j, 7, v0);
SET_BIT(j, 12, v0);
SET_BIT(j, 1, v1);
SET_BIT(j, 8, v1);
SET_BIT(j, 13, v1);
SET_BIT(j, 2, v2);
SET_BIT(j, 9, v2);
SET_BIT(j, 14, v2);
SET_BIT(j, 3, v3);
SET_BIT(j, 10, v3);
SET_BIT(j, 15, v3);
crc_tbl[i] = j;
}
printf("static const quint16 crc_tbl[16] = {\n");
for (int i = 0; i < 16; i +=4)
printf(" 0x%04x, 0x%04x, 0x%04x, 0x%04x,\n", crc_tbl[i], crc_tbl[i+1], crc_tbl[i+2], crc_tbl[i+3]);
printf("};\n");
}
#endif
static const quint16 crc_tbl[16] = {
0x0000, 0x1081, 0x2102, 0x3183,
0x4204, 0x5285, 0x6306, 0x7387,
0x8408, 0x9489, 0xa50a, 0xb58b,
0xc60c, 0xd68d, 0xe70e, 0xf78f
};
/*!
\relates QByteArray
Returns the CRC-16 checksum of the first \a len bytes of \a data.
The checksum is independent of the byte order (endianness) and will be
calculated according to the algorithm published in ISO 3309 (Qt::ChecksumIso3309).
\note This function is a 16-bit cache conserving (16 entry table)
implementation of the CRC-16-CCITT algorithm.
*/
quint16 qChecksum(const char *data, uint len)
{
return qChecksum(data, len, Qt::ChecksumIso3309);
}
/*!
\relates QByteArray
\since 5.9
Returns the CRC-16 checksum of the first \a len bytes of \a data.
The checksum is independent of the byte order (endianness) and will
be calculated according to the algorithm published in \a standard.
\note This function is a 16-bit cache conserving (16 entry table)
implementation of the CRC-16-CCITT algorithm.
*/
quint16 qChecksum(const char *data, uint len, Qt::ChecksumType standard)
{
quint16 crc = 0x0000;
switch (standard) {
case Qt::ChecksumIso3309:
crc = 0xffff;
break;
case Qt::ChecksumItuV41:
crc = 0x6363;
break;
}
uchar c;
const uchar *p = reinterpret_cast<const uchar *>(data);
while (len--) {
c = *p++;
crc = ((crc >> 4) & 0x0fff) ^ crc_tbl[((crc ^ c) & 15)];
c >>= 4;
crc = ((crc >> 4) & 0x0fff) ^ crc_tbl[((crc ^ c) & 15)];
}
switch (standard) {
case Qt::ChecksumIso3309:
crc = ~crc;
break;
case Qt::ChecksumItuV41:
break;
}
return crc & 0xffff;
}
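// Illustrative sketch, kept out of the build: a standalone version of the
// nibble-at-a-time loop above that derives its 16-entry table from the
// reflected polynomial 0x8408 instead of hard-coding it. The function names
// and the (init, invertResult) parameters are hypothetical; they stand in for
// the two Qt::ChecksumType cases and are not part of Qt.
#if 0
```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Entry i is the value i after four shift-and-conditionally-xor steps.
inline void buildNibbleTableSketch(std::uint16_t table[16])
{
    for (unsigned i = 0; i < 16; ++i) {
        std::uint16_t crc = std::uint16_t(i);
        for (int bit = 0; bit < 4; ++bit)
            crc = (crc & 1) ? std::uint16_t((crc >> 1) ^ 0x8408)
                            : std::uint16_t(crc >> 1);
        table[i] = crc;
    }
}

// CRC-16 over len bytes, consuming each byte as two nibbles, low nibble first.
// init is the starting register value; invertResult applies a final bitwise NOT.
inline std::uint16_t crc16NibbleSketch(const unsigned char *data, std::size_t len,
                                       std::uint16_t init, bool invertResult)
{
    assert(data != nullptr || len == 0);
    std::uint16_t table[16];
    buildNibbleTableSketch(table);
    std::uint16_t crc = init;
    for (std::size_t i = 0; i < len; ++i) {
        unsigned char c = data[i];
        crc = std::uint16_t((crc >> 4) ^ table[(crc ^ c) & 15]);
        c >>= 4;
        crc = std::uint16_t((crc >> 4) ^ table[(crc ^ c) & 15]);
    }
    return invertResult ? std::uint16_t(~crc) : crc;
}
```
#endif
// Rebuilding the table on every call keeps the sketch short; a real
// implementation would keep it as a static constant, as the code above does.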
/*!
\fn QByteArray qCompress(const QByteArray& data, int compressionLevel)
\relates QByteArray
Compresses the \a data byte array and returns the compressed data
in a new byte array.
The \a compressionLevel parameter specifies how much compression
should be used. Valid values are between 0 and 9, with 9
corresponding to the greatest compression (i.e. smaller compressed
data) at the cost of using a slower algorithm. Smaller values (8,
7, ..., 1) provide successively less compression at slightly
faster speeds. The value 0 corresponds to no compression at all.
The default value is -1, which specifies zlib's default
compression.
\sa qUncompress()
*/
/*! \relates QByteArray
\overload
Compresses the first \a nbytes of \a data at compression level
\a compressionLevel and returns the compressed data in a new byte array.
*/
#ifndef QT_NO_COMPRESS
QByteArray qCompress(const uchar* data, int nbytes, int compressionLevel)
{
if (nbytes == 0) {
return QByteArray(4, '\0');
}
if (!data) {
qWarning("qCompress: Data is null");
return QByteArray();
}
if (compressionLevel < -1 || compressionLevel > 9)
compressionLevel = -1;
ulong len = nbytes + nbytes / 100 + 13;
QByteArray bazip;
int res;
do {
bazip.resize(len + 4);
res = ::compress2((uchar*)bazip.data()+4, &len, data, nbytes, compressionLevel);
switch (res) {
case Z_OK:
bazip.resize(len + 4);
bazip[0] = (nbytes & 0xff000000) >> 24;
bazip[1] = (nbytes & 0x00ff0000) >> 16;
bazip[2] = (nbytes & 0x0000ff00) >> 8;
bazip[3] = (nbytes & 0x000000ff);
break;
case Z_MEM_ERROR:
qWarning("qCompress: Z_MEM_ERROR: Not enough memory");
bazip.resize(0);
break;
case Z_BUF_ERROR:
len *= 2;
break;
}
} while (res == Z_BUF_ERROR);
return bazip;
}
#endif
/*!
\fn QByteArray qUncompress(const QByteArray &data)
\relates QByteArray
Uncompresses the \a data byte array and returns a new byte array
with the uncompressed data.
Returns an empty QByteArray if the input data was corrupt.
This function will uncompress data compressed with qCompress()
from this and any earlier Qt version, back to Qt 3.1 when this
feature was added.
\b{Note:} If you want to use this function to uncompress external
data that was compressed using zlib, you first need to prepend a four
byte header to the byte array containing the data. The header must
contain the expected length (in bytes) of the uncompressed data,
expressed as an unsigned, big-endian, 32-bit integer.
\sa qCompress()
*/
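// Illustrative sketch, kept out of the build: building and reading the
// four-byte, big-endian length header described in the note above, using only
// the standard library. withLengthHeaderSketch() and readLengthHeaderSketch()
// are hypothetical helpers, not Qt or zlib functions, and the zlib stream is
// treated as opaque bytes.
#if 0
```cpp
#include <cstdint>
#include <vector>

// Prepend the expected uncompressed size as an unsigned big-endian
// 32-bit integer to an existing zlib stream.
inline std::vector<unsigned char>
withLengthHeaderSketch(const std::vector<unsigned char> &zlibStream,
                       std::uint32_t uncompressedSize)
{
    std::vector<unsigned char> out;
    out.reserve(zlibStream.size() + 4);
    out.push_back(static_cast<unsigned char>((uncompressedSize >> 24) & 0xff));
    out.push_back(static_cast<unsigned char>((uncompressedSize >> 16) & 0xff));
    out.push_back(static_cast<unsigned char>((uncompressedSize >> 8) & 0xff));
    out.push_back(static_cast<unsigned char>(uncompressedSize & 0xff));
    out.insert(out.end(), zlibStream.begin(), zlibStream.end());
    return out;
}

// Read the header back; data must point to at least four bytes.
inline std::uint32_t readLengthHeaderSketch(const unsigned char *data)
{
    return (std::uint32_t(data[0]) << 24) | (std::uint32_t(data[1]) << 16)
         | (std::uint32_t(data[2]) << 8) | std::uint32_t(data[3]);
}
```
#endif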
#ifndef QT_NO_COMPRESS
namespace {
struct QByteArrayDataDeleter
{
static inline void cleanup(QTypedArrayData<char> *d)
{ if (d) QTypedArrayData<char>::deallocate(d); }
};
}
static QByteArray invalidCompressedData()
{
qWarning("qUncompress: Input data is corrupted");
return QByteArray();
}
/*! \relates QByteArray
\overload
Uncompresses the first \a nbytes of \a data and returns a new byte
array with the uncompressed data.
*/
QByteArray qUncompress(const uchar* data, int nbytes)
{
if (!data) {
qWarning("qUncompress: Data is null");
return QByteArray();
}
if (nbytes <= 4) {
if (nbytes < 4 || (data[0]!=0 || data[1]!=0 || data[2]!=0 || data[3]!=0))
qWarning("qUncompress: Input data is corrupted");
return QByteArray();
}
ulong expectedSize = uint((data[0] << 24) | (data[1] << 16) |
(data[2] << 8) | (data[3] ));
ulong len = qMax(expectedSize, 1ul);
const ulong maxPossibleSize = MaxAllocSize - sizeof(QByteArray::Data);
if (Q_UNLIKELY(len >= maxPossibleSize)) {
// QByteArray does not support that huge size anyway.
return invalidCompressedData();
}
QScopedPointer<QByteArray::Data, QByteArrayDataDeleter> d(QByteArray::Data::allocate(expectedSize + 1));
if (Q_UNLIKELY(d.data() == nullptr))
return invalidCompressedData();
d->size = expectedSize;
forever {
ulong alloc = len;
int res = ::uncompress((uchar*)d->data(), &len,
data+4, nbytes-4);
switch (res) {
case Z_OK:
Q_ASSERT(len <= alloc);
Q_UNUSED(alloc);
d->size = len;
d->data()[len] = 0;
{
QByteArrayDataPtr dataPtr = { d.take() };
return QByteArray(dataPtr);
}
case Z_MEM_ERROR:
qWarning("qUncompress: Z_MEM_ERROR: Not enough memory");
return QByteArray();
case Z_BUF_ERROR:
len *= 2;
if (Q_UNLIKELY(len >= maxPossibleSize)) {
// QByteArray does not support that huge size anyway.
return invalidCompressedData();
} else {
// grow the block
QByteArray::Data *p = QByteArray::Data::reallocateUnaligned(d.data(), len + 1);
if (Q_UNLIKELY(p == nullptr))
return invalidCompressedData();
d.take(); // don't free
d.reset(p);
}
continue;
case Z_DATA_ERROR:
qWarning("qUncompress: Z_DATA_ERROR: Input data is corrupted");
return QByteArray();
}
}
}
#endif
/*!
\class QByteArray
\inmodule QtCore
\brief The QByteArray class provides an array of bytes.
\ingroup tools
\ingroup shared
\ingroup string-processing
\reentrant
QByteArray can be used to store both raw bytes (including '\\0's)
and traditional 8-bit '\\0'-terminated strings. Using QByteArray
is much more convenient than using \c{const char *}. Behind the
scenes, it always ensures that the data is followed by a '\\0'
terminator, and uses \l{implicit sharing} (copy-on-write) to
reduce memory usage and avoid needless copying of data.
In addition to QByteArray, Qt also provides the QString class to
store string data. For most purposes, QString is the class you
want to use. It stores 16-bit Unicode characters, making it easy
to store non-ASCII/non-Latin-1 characters in your application.
Furthermore, QString is used throughout in the Qt API. The two
main cases where QByteArray is appropriate are when you need to
store raw binary data, and when memory conservation is critical
(e.g., with Qt for Embedded Linux).
One way to initialize a QByteArray is simply to pass a \c{const
char *} to its constructor. For example, the following code
creates a byte array of size 5 containing the data "Hello":
\snippet code/src_corelib_tools_qbytearray.cpp 0
Although the size() is 5, the byte array also maintains an extra
'\\0' character at the end so that if a function is used that
asks for a pointer to the underlying data (e.g. a call to
data()), the data pointed to is guaranteed to be
'\\0'-terminated.
QByteArray makes a deep copy of the \c{const char *} data, so you
can modify it later without experiencing side effects. (If for
performance reasons you don't want to take a deep copy of the
character data, use QByteArray::fromRawData() instead.)
Another approach is to set the size of the array using resize()
and to initialize the data byte by byte. QByteArray uses 0-based
indexes, just like C++ arrays. To access the byte at a particular
index position, you can use operator[](). On non-const byte
arrays, operator[]() returns a reference to a byte that can be
used on the left side of an assignment. For example:
\snippet code/src_corelib_tools_qbytearray.cpp 1
For read-only access, an alternative syntax is to use at():
\snippet code/src_corelib_tools_qbytearray.cpp 2
at() can be faster than operator[](), because it never causes a
\l{deep copy} to occur.
To extract many bytes at a time, use left(), right(), or mid().
A QByteArray can embed '\\0' bytes. The size() function always
returns the size of the whole array, including embedded '\\0'
bytes, but excluding the terminating '\\0' added by QByteArray.
For example:
\snippet code/src_corelib_tools_qbytearray.cpp 48
If you want to obtain the length of the data up to and
excluding the first '\\0' character, call qstrlen() on the byte
array.
After a call to resize(), newly allocated bytes have undefined
values. To set all the bytes to a particular value, call fill().
To obtain a pointer to the actual character data, call data() or
constData(). These functions return a pointer to the beginning of the data.
The pointer is guaranteed to remain valid until a non-const function is
called on the QByteArray. It is also guaranteed that the data ends with a
'\\0' byte unless the QByteArray was created from a \l{fromRawData()}{raw
data}. This '\\0' byte is automatically provided by QByteArray and is not
counted in size().
QByteArray provides the following basic functions for modifying
the byte data: append(), prepend(), insert(), replace(), and
remove(). For example:
\snippet code/src_corelib_tools_qbytearray.cpp 3
The replace() and remove() functions' first two arguments are the
position from which to start erasing and the number of bytes that
should be erased.
When you append() data to a non-empty array, the array will be
reallocated and the new data copied to it. You can avoid this
behavior by calling reserve(), which preallocates a certain amount
of memory. You can also call capacity() to find out how much
memory QByteArray actually allocated. Data appended to an empty
array is not copied.
A frequent requirement is to remove whitespace characters from a
byte array ('\\n', '\\t', ' ', etc.). If you want to remove
whitespace from both ends of a QByteArray, use trimmed(). If you
want to remove whitespace from both ends and replace multiple
consecutive whitespaces with a single space character within the
byte array, use simplified().
If you want to find all occurrences of a particular character or
substring in a QByteArray, use indexOf() or lastIndexOf(). The
former searches forward starting from a given index position, the
latter searches backward. Both return the index position of the
character or substring if they find it; otherwise, they return -1.
For example, here's a typical loop that finds all occurrences of a
particular substring:
\snippet code/src_corelib_tools_qbytearray.cpp 4
If you simply want to check whether a QByteArray contains a
particular character or substring, use contains(). If you want to
find out how many times a particular character or substring
occurs in the byte array, use count(). If you want to replace all
occurrences of a particular value with another, use one of the
two-parameter replace() overloads.
\l{QByteArray}s can be compared using overloaded operators such as
operator<(), operator<=(), operator==(), operator>=(), and so on.
The comparison is based exclusively on the numeric values
of the characters and is very fast, but is not what a human would
expect. QString::localeAwareCompare() is a better choice for
sorting user-interface strings.
For historical reasons, QByteArray distinguishes between a null
byte array and an empty byte array. A \e null byte array is a
byte array that is initialized using QByteArray's default
constructor or by passing (const char *)0 to the constructor. An
\e empty byte array is any byte array with size 0. A null byte
array is always empty, but an empty byte array isn't necessarily
null:
\snippet code/src_corelib_tools_qbytearray.cpp 5
All functions except isNull() treat null byte arrays the same as
empty byte arrays. For example, data() returns a valid pointer
(\e not nullptr) to a '\\0' character for a null byte array
and QByteArray() compares equal to QByteArray(""). We recommend
that you always use isEmpty() and avoid isNull().
\section1 Maximum size and out-of-memory conditions
The current version of QByteArray is limited to just under 2 GB (2^31
bytes) in size. The exact value is architecture-dependent, since it depends
on the overhead required for managing the data block, but is no more than
32 bytes. Raw data blocks are also limited by the use of \c int type in the
current version to 2 GB minus 1 byte.
In case memory allocation fails, QByteArray will throw a \c std::bad_alloc
exception. Out of memory conditions in the Qt containers are the only case
where Qt will throw exceptions.
Note that the operating system may impose further limits on applications
holding a lot of allocated memory, especially large, contiguous blocks.
Such considerations, the configuration of such behavior or any mitigation
are outside the scope of the QByteArray API.
\section1 Notes on Locale
\section2 Number-String Conversions
Conversions between numeric data types and strings are always
performed in the C locale, irrespective of the user's
locale settings. Use QString to perform locale-aware conversions
between numbers and strings.
\section2 8-bit Character Comparisons
In QByteArray, the notion of uppercase and lowercase, and of which
character is greater than or less than another, is defined by
the Latin-1 encoding. This affects functions that support a case
insensitive option or that compare or lowercase or uppercase
their arguments. Case insensitive operations and comparisons will
be accurate if both strings contain only Latin-1 characters.
Functions that this affects include contains(), indexOf(),
lastIndexOf(), operator<(), operator<=(), operator>(),
operator>=(), isLower(), isUpper(), toLower() and toUpper().
This issue does not apply to \l{QString}s since they represent
characters using Unicode.
\sa QString, QBitArray
*/
/*!
\enum QByteArray::Base64Option
\since 5.2
This enum contains the options available for encoding and decoding Base64.
Base64 is defined by \l{RFC 4648}, with the following options:
\value Base64Encoding (default) The regular Base64 alphabet, called simply "base64"
\value Base64UrlEncoding An alternate alphabet, called "base64url", which replaces two