/* Malloc implementation for multiple threads without lock contention.
Copyright (C) 1996-2016 Free Software Foundation, Inc.
This file is part of the GNU C Library.
Contributed by Wolfram Gloger <wg@malloc.de>
and Doug Lea <dl@cs.oswego.edu>, 2001.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation; either version 2.1 of the
License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; see the file COPYING.LIB. If
not, see <http://www.gnu.org/licenses/>. */
/*
This is a version (aka ptmalloc2) of malloc/free/realloc written by
Doug Lea and adapted to multiple threads/arenas by Wolfram Gloger.
There have been substantial changes made after the integration into
glibc in all parts of the code. Do not look for much commonality
with the ptmalloc2 version.
* Version ptmalloc2-20011215
based on:
VERSION 2.7.0 Sun Mar 11 14:14:06 2001 Doug Lea (dl at gee)
* Quickstart
In order to compile this implementation, a Makefile is provided with
the ptmalloc2 distribution, which has pre-defined targets for some
popular systems (e.g. "make posix" for Posix threads). All that is
typically required with regard to compiler flags is the selection of
the thread package via defining one out of USE_PTHREADS, USE_THR or
USE_SPROC. Check the thread-m.h file for what effects this has.
Many/most systems will additionally require USE_TSD_DATA_HACK to be
defined, so this is the default for "make posix".
* Why use this malloc?
This is not the fastest, most space-conserving, most portable, or
most tunable malloc ever written. However it is among the fastest
while also being among the most space-conserving, portable and tunable.
Consistent balance across these factors results in a good general-purpose
allocator for malloc-intensive programs.
The main properties of the algorithms are:
* For large (>= 512 bytes) requests, it is a pure best-fit allocator,
with ties normally decided via FIFO (i.e. least recently used).
* For small (<= 64 bytes by default) requests, it is a caching
allocator, that maintains pools of quickly recycled chunks.
* In between, and for combinations of large and small requests, it does
the best it can trying to meet both goals at once.
* For very large requests (>= 128KB by default), it relies on system
memory mapping facilities, if supported.
For a longer but slightly out of date high-level description, see
http://gee.cs.oswego.edu/dl/html/malloc.html
You may already by default be using a C library containing a malloc
that is based on some version of this malloc (for example in
linux). You might still want to use the one in this file in order to
customize settings or to avoid overheads associated with library
versions.
* Contents, described in more detail in "description of public routines" below.
Standard (ANSI/SVID/...) functions:
malloc(size_t n);
calloc(size_t n_elements, size_t element_size);
free(void* p);
realloc(void* p, size_t n);
memalign(size_t alignment, size_t n);
valloc(size_t n);
mallinfo()
mallopt(int parameter_number, int parameter_value)
Additional functions:
independent_calloc(size_t n_elements, size_t size, void* chunks[]);
independent_comalloc(size_t n_elements, size_t sizes[], void* chunks[]);
pvalloc(size_t n);
cfree(void* p);
malloc_trim(size_t pad);
malloc_usable_size(void* p);
malloc_stats();
* Vital statistics:
Supported pointer representation: 4 or 8 bytes
Supported size_t representation: 4 or 8 bytes
Note that size_t is allowed to be 4 bytes even if pointers are 8.
You can adjust this by defining INTERNAL_SIZE_T
Alignment: 2 * sizeof(size_t) (default)
(i.e., 8 byte alignment with 4byte size_t). This suffices for
nearly all current machines and C compilers. However, you can
define MALLOC_ALIGNMENT to be wider than this if necessary.
Minimum overhead per allocated chunk: 4 or 8 bytes
Each malloced chunk has a hidden word of overhead holding size
and status information.
Minimum allocated size: 4-byte ptrs: 16 bytes (including 4 overhead)
                        8-byte ptrs: 24/32 bytes (including 4/8 overhead)
When a chunk is freed, 12 (for 4byte ptrs) or 20 (for 8 byte
ptrs but 4 byte size) or 24 (for 8/8) additional bytes are
needed; 4 (8) for a trailing size field and 8 (16) bytes for
free list pointers. Thus, the minimum allocatable size is
16/24/32 bytes.
Even a request for zero bytes (i.e., malloc(0)) returns a
pointer to something of the minimum allocatable size.
The maximum overhead wastage (i.e., number of extra bytes
allocated beyond those requested in malloc) is less than or equal
to the minimum size, except for requests >= mmap_threshold that
are serviced via mmap(), where the worst case wastage is 2 *
sizeof(size_t) bytes plus the remainder from a system page (the
minimal mmap unit); typically 4096 or 8192 bytes.
Maximum allocated size: 4-byte size_t: 2^32 minus about two pages
8-byte size_t: 2^64 minus about two pages
It is assumed that (possibly signed) size_t values suffice to
represent chunk sizes. `Possibly signed' is due to the fact
that `size_t' may be defined on a system as either a signed or
an unsigned type. The ISO C standard says that it must be
unsigned, but a few systems are known not to adhere to this.
Additionally, even when size_t is unsigned, sbrk (which is by
default used to obtain memory from system) accepts signed
arguments, and may not be able to handle size_t-wide arguments
with negative sign bit. Generally, values that would
appear as negative after accounting for overhead and alignment
are supported only via mmap(), which does not have this
limitation.
Requests for sizes outside the allowed range will perform an optional
failure action and then return null. (Requests may
also fail because a system is out of memory.)
Thread-safety: thread-safe
Compliance: I believe it is compliant with the 1997 Single Unix Specification
Also SVID/XPG, ANSI C, and probably others as well.
* Synopsis of compile-time options:
People have reported using previous versions of this malloc on all
versions of Unix, sometimes by tweaking some of the defines
below. It has been tested most extensively on Solaris and Linux.
People also report using it in stand-alone embedded systems.
The implementation is in straight, hand-tuned ANSI C. It is not
at all modular. (Sorry!) It uses a lot of macros. To be at all
usable, this code should be compiled using an optimizing compiler
(for example gcc -O3) that can simplify expressions and control
paths. (FAQ: some macros import variables as arguments rather than
declare locals because people reported that some debuggers
otherwise get confused.)
OPTION                        DEFAULT VALUE
Compilation Environment options:
  HAVE_MREMAP                 0
Changing default word sizes:
  INTERNAL_SIZE_T             size_t
  MALLOC_ALIGNMENT            MAX (2 * sizeof(INTERNAL_SIZE_T),
                                   __alignof__ (long double))
Configuration and functionality options:
  USE_PUBLIC_MALLOC_WRAPPERS  NOT defined
  USE_MALLOC_LOCK             NOT defined
  MALLOC_DEBUG                NOT defined
  REALLOC_ZERO_BYTES_FREES    1
  TRIM_FASTBINS               0
Options for customizing MORECORE:
  MORECORE                    sbrk
  MORECORE_FAILURE            -1
  MORECORE_CONTIGUOUS         1
  MORECORE_CANNOT_TRIM        NOT defined
  MORECORE_CLEARS             1
  MMAP_AS_MORECORE_SIZE       (1024 * 1024)
Tuning options that are also dynamically changeable via mallopt:
  DEFAULT_MXFAST              64 (for 32bit), 128 (for 64bit)
  DEFAULT_TRIM_THRESHOLD      128 * 1024
  DEFAULT_TOP_PAD             0
  DEFAULT_MMAP_THRESHOLD      128 * 1024
  DEFAULT_MMAP_MAX            65536
There are several other #defined constants and macros that you
probably don't want to touch unless you are extending or adapting malloc. */
/*
void* is the pointer type that malloc should say it returns
*/
#ifndef void
#define void void
#endif /*void*/
#include <stddef.h> /* for size_t */
#include <stdlib.h> /* for getenv(), abort() */
#include <unistd.h> /* for __libc_enable_secure */
#include <malloc-machine.h>
#include <malloc-sysdep.h>
#include <atomic.h>
#include <_itoa.h>
#include <bits/wordsize.h>
#include <sys/sysinfo.h>
#include <ldsodefs.h>
#include <unistd.h>
#include <stdio.h> /* needed for malloc_stats */
#include <errno.h>
#include <shlib-compat.h>
/* For uintptr_t. */
#include <stdint.h>
/* For va_arg, va_start, va_end. */
#include <stdarg.h>
/* For MIN, MAX, powerof2. */
#include <sys/param.h>
/* For ALIGN_UP et al. */
#include <libc-internal.h>
/*
Debugging:
Because freed chunks may be overwritten with bookkeeping fields, this
malloc will often die when freed memory is overwritten by user
programs. This can be very effective (albeit in an annoying way)
in helping track down dangling pointers.
If you compile with -DMALLOC_DEBUG, a number of assertion checks are
enabled that will catch more memory errors. You probably won't be
able to make much sense of the actual assertion errors, but they
should help you locate incorrectly overwritten memory. The checking
is fairly extensive, and will slow down execution
noticeably. Calling malloc_stats or mallinfo with MALLOC_DEBUG set
will attempt to check every non-mmapped allocated and free chunk in
the course of computing the summaries. (By nature, mmapped regions
cannot be checked very much automatically.)
Setting MALLOC_DEBUG may also be helpful if you are trying to modify
this code. The assertions in the check routines spell out in more
detail the assumptions and invariants underlying the algorithms.
Setting MALLOC_DEBUG does NOT provide an automated mechanism for
checking that all accesses to malloced memory stay within their
bounds. However, there are several add-ons and adaptations of this
or other mallocs available that do this.
*/
#ifndef MALLOC_DEBUG
#define MALLOC_DEBUG 0
#endif
#ifdef NDEBUG
# define assert(expr) ((void) 0)
#else
# define assert(expr) \
  ((expr) \
   ? ((void) 0) \
   : __malloc_assert (#expr, __FILE__, __LINE__, __func__))

extern const char *__progname;

static void
__malloc_assert (const char *assertion, const char *file, unsigned int line,
                 const char *function)
{
  (void) __fxprintf (NULL, "%s%s%s:%u: %s%sAssertion `%s' failed.\n",
                     __progname, __progname[0] ? ": " : "",
                     file, line,
                     function ? function : "", function ? ": " : "",
                     assertion);
  fflush (stderr);
  abort ();
}
#endif
/*
INTERNAL_SIZE_T is the word-size used for internal bookkeeping
of chunk sizes.
The default version is the same as size_t.
While not strictly necessary, it is best to define this as an
unsigned type, even if size_t is a signed type. This may avoid some
artificial size limitations on some systems.
On a 64-bit machine, you may be able to reduce malloc overhead by
defining INTERNAL_SIZE_T to be a 32 bit `unsigned int' at the
expense of not being able to handle more than 2^32 of malloced
space. If this limitation is acceptable, you are encouraged to set
this unless you are on a platform requiring 16byte alignments. In
this case the alignment requirements turn out to negate any
potential advantages of decreasing size_t word size.
Implementors: Beware of the possible combinations of:
- INTERNAL_SIZE_T might be signed or unsigned, might be 32 or 64 bits,
and might be the same width as int or as long
- size_t might have a different width and signedness than INTERNAL_SIZE_T
- int and long might be 32 or 64 bits, and might be the same width
To deal with this, most comparisons and difference computations
among INTERNAL_SIZE_Ts should cast them to unsigned long, being
aware of the fact that casting an unsigned int to a wider long does
not sign-extend. (This also makes checking for negative numbers
awkward.) Some of these casts result in harmless compiler warnings
on some systems.
*/
#ifndef INTERNAL_SIZE_T
#define INTERNAL_SIZE_T size_t
#endif
/* The corresponding word size */
#define SIZE_SZ (sizeof(INTERNAL_SIZE_T))
/*
MALLOC_ALIGNMENT is the minimum alignment for malloc'ed chunks.
It must be a power of two at least 2 * SIZE_SZ, even on machines
for which smaller alignments would suffice. It may be defined as
larger than this though. Note however that code and data structures
are optimized for the case of 8-byte alignment.
*/
#ifndef MALLOC_ALIGNMENT
# if !SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_16)
/* This is the correct definition when there is no past ABI to constrain it.
Among configurations with a past ABI constraint, it differs from
2*SIZE_SZ only on powerpc32. For the time being, changing this is
causing more compatibility problems due to malloc_get_state and
malloc_set_state than will returning blocks not adequately aligned for
long double objects under -mlong-double-128. */
# define MALLOC_ALIGNMENT (2 *SIZE_SZ < __alignof__ (long double) \
? __alignof__ (long double) : 2 *SIZE_SZ)
# else
# define MALLOC_ALIGNMENT (2 *SIZE_SZ)
# endif
#endif
/* The corresponding bit mask value */
#define MALLOC_ALIGN_MASK (MALLOC_ALIGNMENT - 1)
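/*
  Illustrative sketch (not part of this file): the minimum sizes quoted
  under "Vital statistics" and the role of MALLOC_ALIGN_MASK can be
  reproduced with plain arithmetic. The helper names below are
  hypothetical, and the word sizes are passed as parameters so that the
  4/4, 4/8 and 8/8 cases can be compared on one machine. It assumes the
  default alignment of two size words, one size word of overhead per
  allocated chunk, and a free chunk that holds two size words plus two
  list pointers. Overflow for requests near the top of the size_t range
  is deliberately not handled here.

```c
#include <assert.h>
#include <stddef.h>

// Round n up to a multiple of align (align must be a power of two).
static size_t sketch_align_up(size_t n, size_t align)
{
    size_t mask = align - 1;          // same idea as MALLOC_ALIGN_MASK
    return (n + mask) & ~mask;
}

// Smallest chunk: two size words plus two free-list pointers, aligned.
static size_t sketch_min_chunk(size_t size_sz, size_t ptr_sz)
{
    size_t align = 2 * size_sz;       // default alignment assumed
    return sketch_align_up(2 * size_sz + 2 * ptr_sz, align);
}

// Chunk size for a request of req bytes: add one size word of overhead,
// round up to the alignment, and never go below the minimum chunk.
static size_t sketch_padded_size(size_t req, size_t size_sz, size_t ptr_sz)
{
    size_t padded = sketch_align_up(req + size_sz, 2 * size_sz);
    size_t min = sketch_min_chunk(size_sz, ptr_sz);
    return padded < min ? min : padded;
}
```

  Under these assumptions sketch_min_chunk yields 16, 24 and 32 bytes
  for the three word-size combinations, and a zero-byte request is
  padded up to that minimum.
*/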
/*
REALLOC_ZERO_BYTES_FREES should be set if a call to
realloc with zero bytes should be the same as a call to free.
This is required by the C standard. Otherwise, since this malloc
returns a unique pointer for malloc(0), so does realloc(p, 0).
*/
#ifndef REALLOC_ZERO_BYTES_FREES
#define REALLOC_ZERO_BYTES_FREES 1
#endif
/*
TRIM_FASTBINS controls whether free() of a very small chunk can
immediately lead to trimming. Setting to true (1) can reduce memory
footprint, but will almost always slow down programs that use a lot
of small chunks.
Define this only if you are willing to give up some speed to more
aggressively reduce system-level memory footprint when releasing
memory in programs that use many small chunks. You can get
essentially the same effect by setting MXFAST to 0, but this can
lead to even greater slowdowns in programs using many small chunks.
TRIM_FASTBINS is an in-between compile-time option, that disables
only those chunks bordering topmost memory from being placed in
fastbins.
*/
#ifndef TRIM_FASTBINS
#define TRIM_FASTBINS 0
#endif
/* Definition for getting more memory from the OS. */
#define MORECORE (*__morecore)
#define MORECORE_FAILURE 0
void * __default_morecore (ptrdiff_t);
void *(*__morecore)(ptrdiff_t) = __default_morecore;
#include <string.h>
/*
MORECORE-related declarations. By default, rely on sbrk
*/
/*
MORECORE is the name of the routine to call to obtain more memory
from the system. See below for general guidance on writing
alternative MORECORE functions, as well as a version for WIN32 and a
sample version for pre-OSX macos.
*/
#ifndef MORECORE
#define MORECORE sbrk
#endif
/*
MORECORE_FAILURE is the value returned upon failure of MORECORE
as well as mmap. Since it cannot be an otherwise valid memory address,
and must reflect values of standard sys calls, you probably ought not
try to redefine it.
*/
#ifndef MORECORE_FAILURE
#define MORECORE_FAILURE (-1)
#endif
/*
If MORECORE_CONTIGUOUS is true, take advantage of fact that
consecutive calls to MORECORE with positive arguments always return
contiguous increasing addresses. This is true of unix sbrk. Even
if not defined, when regions happen to be contiguous, malloc will
permit allocations spanning regions obtained from different
calls. But defining this when applicable enables some stronger
consistency checks and space efficiencies.
*/
#ifndef MORECORE_CONTIGUOUS
#define MORECORE_CONTIGUOUS 1
#endif
/*
Define MORECORE_CANNOT_TRIM if your version of MORECORE
cannot release space back to the system when given negative
arguments. This is generally necessary only if you are using
a hand-crafted MORECORE function that cannot handle negative arguments.
*/
/* #define MORECORE_CANNOT_TRIM */
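/*
  Illustrative sketch (not part of this file): a hand-crafted MORECORE
  replacement of the kind that would call for MORECORE_CANNOT_TRIM. All
  names here (sketch_morecore, sketch_pool, SKETCH_POOL_SIZE,
  SKETCH_MORECORE_FAILURE) are hypothetical. The sketch assumes an
  sbrk-like contract: a positive argument extends the region and returns
  the old end, zero returns the current end, and failure is reported with
  an sbrk-style value of -1 cast to a pointer. Successive calls return
  contiguous increasing addresses, which is the property that
  MORECORE_CONTIGUOUS describes.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_POOL_SIZE 4096
#define SKETCH_MORECORE_FAILURE ((void *) -1)

static unsigned char sketch_pool[SKETCH_POOL_SIZE];
static size_t sketch_used;

// Hand out memory from a fixed static pool, sbrk style.
static void *sketch_morecore(ptrdiff_t increment)
{
    if (increment < 0)
        return SKETCH_MORECORE_FAILURE;   // this pool cannot trim
    if ((size_t) increment > SKETCH_POOL_SIZE - sketch_used)
        return SKETCH_MORECORE_FAILURE;   // pool exhausted
    void *old_end = sketch_pool + sketch_used;
    sketch_used += (size_t) increment;
    return old_end;                       // contiguous with earlier calls
}
```

  A function like this never releases space, so negative arguments
  simply fail.
*/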
/* MORECORE_CLEARS (default 1)
The degree to which the routine mapped to MORECORE zeroes out
memory: never (0), only for newly allocated space (1) or always
(2). The distinction between (1) and (2) is necessary because on
some systems, if the application first decrements and then
increments the break value, the contents of the reallocated space
are unspecified.
*/
#ifndef MORECORE_CLEARS
# define MORECORE_CLEARS 1
#endif
/*
MMAP_AS_MORECORE_SIZE is the minimum mmap size argument to use if
sbrk fails, and mmap is used as a backup. The value must be a
multiple of page size. This backup strategy generally applies only
when systems have "holes" in address space, so sbrk cannot perform
contiguous expansion, but there is still space available on system.
On systems for which this is known to be useful (i.e. most linux
kernels), this occurs only when programs allocate huge amounts of
memory. Between this, and the fact that mmap regions tend to be
limited, the size should be large, to avoid too many mmap calls and
thus avoid running out of kernel resources. */
#ifndef MMAP_AS_MORECORE_SIZE
#define MMAP_AS_MORECORE_SIZE (1024 * 1024)
#endif
/*
Define HAVE_MREMAP to make realloc() use mremap() to re-allocate
large blocks.
*/
#ifndef HAVE_MREMAP
#define HAVE_MREMAP 0
#endif
/*
This version of malloc supports the standard SVID/XPG mallinfo
routine that returns a struct containing usage properties and
statistics. It should work on any SVID/XPG compliant system that has
a /usr/include/malloc.h defining struct mallinfo. (If you'd like to
install such a thing yourself, cut out the preliminary declarations
as described above and below and save them in a malloc.h file. But
there's no compelling reason to bother to do this.)
The main declaration needed is the mallinfo struct that is returned
(by-copy) by mallinfo(). The SVID/XPG mallinfo struct contains a
bunch of fields that are not even meaningful in this version of
malloc. These fields are instead filled by mallinfo() with
other numbers that might be of interest.
*/
/* ---------- description of public routines ------------ */
/*
malloc(size_t n)
Returns a pointer to a newly allocated chunk of at least n bytes, or null
if no space is available. Additionally, on failure, errno is
set to ENOMEM on ANSI C systems.
If n is zero, malloc returns a minimum-sized chunk. (The minimum
size is 16 bytes on most 32bit systems, and 24 or 32 bytes on 64bit
systems.) On most systems, size_t is an unsigned type, so calls
with negative arguments are interpreted as requests for huge amounts
of space, which will often fail. The maximum supported value of n
differs across systems, but is in all cases less than the maximum
representable value of a size_t.
*/
void* __libc_malloc(size_t);
libc_hidden_proto (__libc_malloc)
/*
free(void* p)
Releases the chunk of memory pointed to by p, that had been previously
allocated using malloc or a related routine such as realloc.
It has no effect if p is null. It can have arbitrary (i.e., bad!)
effects if p has already been freed.
Unless disabled (using mallopt), freeing very large spaces will,
when possible, automatically trigger operations that give
back unused memory to the system, thus reducing program footprint.
*/
void __libc_free(void*);
libc_hidden_proto (__libc_free)
/*
calloc(size_t n_elements, size_t element_size);
Returns a pointer to n_elements * element_size bytes, with all locations
set to zero.
*/
void* __libc_calloc(size_t, size_t);
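/*
  Illustrative sketch (not part of this file): the calloc contract above
  is stated in terms of the product n_elements * element_size. The helper
  below, with the hypothetical name sketch_calloc, shows one way a caller
  could get the same zero-filled result from malloc while guarding the
  multiplication against size_t overflow. The overflow check is an
  assumption about what a careful implementation needs and is not a
  description of how __libc_calloc is written.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Allocate n_elements * element_size zeroed bytes, or return NULL if
// the product would not fit in size_t or no space is available.
static void *sketch_calloc(size_t n_elements, size_t element_size)
{
    if (element_size != 0 && n_elements > SIZE_MAX / element_size)
        return NULL;                      // product overflows size_t
    size_t bytes = n_elements * element_size;
    void *p = malloc(bytes);
    if (p != NULL)
        memset(p, 0, bytes);              // all locations set to zero
    return p;
}
```
*/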
/*
realloc(void* p, size_t n)
Returns a pointer to a chunk of size n that contains the same data
as does chunk p up to the minimum of (n, p's size) bytes, or null
if no space is available.
The returned pointer may or may not be the same as p. The algorithm
prefers extending p when possible, otherwise it employs the
equivalent of a malloc-copy-free sequence.
If p is null, realloc is equivalent to malloc.
If space is not available, realloc returns null, errno is set (if on
ANSI) and p is NOT freed.
If n is for fewer bytes than already held by p, the newly unused
space is lopped off and freed if possible. Unless the #define
REALLOC_ZERO_BYTES_FREES is set, realloc with a size argument of
zero (re)allocates a minimum-sized chunk.
Large chunks that were internally obtained via mmap will always
be reallocated using malloc-copy-free sequences unless
the system supports MREMAP (currently only linux).
The old unix realloc convention of allowing the last-free'd chunk
to be used as an argument to realloc is not supported.
*/
void* __libc_realloc(void*, size_t);
libc_hidden_proto (__libc_realloc)
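/*
  Illustrative sketch (not part of this file): because realloc does NOT
  free p when space is unavailable, callers should keep the old pointer
  until the call has succeeded. The helper name sketch_resize is
  hypothetical. It rejects a size of zero up front, since with
  REALLOC_ZERO_BYTES_FREES set such a call would free the block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Resize *buf to new_size bytes (new_size must be nonzero).
// Returns 0 on success; on failure returns -1 and leaves *buf valid.
static int sketch_resize(char **buf, size_t new_size)
{
    if (new_size == 0)
        return -1;        // avoid the realloc-as-free case
    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;        // old block is untouched and still owned by caller
    *buf = tmp;           // may or may not be the same address as before
    return 0;
}
```

  Assigning the result of realloc straight back to the only copy of the
  pointer would lose the old block on failure, which is the leak this
  pattern avoids.
*/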
/*
memalign(size_t alignment, size_t n);
Returns a pointer to a newly allocated chunk of n bytes, aligned
in accord with the alignment argument.
The alignment argument should be a power of two. If the argument is
not a power of two, the nearest greater power is used.
8-byte alignment is guaranteed by normal malloc calls, so don't
bother calling memalign with an argument of 8 or less.
Overreliance on memalign is a sure way to fragment space.
*/
void* __libc_memalign(size_t, size_t);
libc_hidden_proto (__libc_memalign)
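/*
  Illustrative sketch (not part of this file): the rule that a
  non-power-of-two alignment is replaced by the nearest greater power of
  two can be written as a short loop. The helper name
  sketch_round_alignment is hypothetical, and returning 0 when no such
  power fits in size_t is a choice made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Smallest power of two that is >= alignment, or 0 if it does not fit.
static size_t sketch_round_alignment(size_t alignment)
{
    size_t p = 1;
    while (p < alignment) {
        if (p > SIZE_MAX / 2)
            return 0;     // doubling again would overflow
        p <<= 1;
    }
    return p;
}
```
*/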
/*
valloc(size_t n);
Equivalent to memalign(pagesize, n), where pagesize is the page
size of the system. If the pagesize is unknown, 4096 is used.
*/
void* __libc_valloc(size_t);
/*
mallopt(int parameter_number, int parameter_value)
Sets tunable parameters. The format is to provide a
(parameter-number, parameter-value) pair. mallopt then sets the
corresponding parameter to the argument value if it can (i.e., so
long as the value is meaningful), and returns 1 if successful else
0. SVID/XPG/ANSI defines four standard param numbers for mallopt,
normally defined in malloc.h. Only one of these (M_MXFAST) is used
in this malloc. The others (M_NLBLKS, M_GRAIN, M_KEEP) don't apply,
so setting them has no effect. But this malloc also supports four
other options in mallopt. See below for details. Briefly, supported
parameters are as follows (listed defaults are for "typical"
configurations).
Symbol            param #  default    allowed param values
M_MXFAST          1        64         0-80  (0 disables fastbins)
M_TRIM_THRESHOLD  -1       128*1024   any   (-1U disables trimming)
M_TOP_PAD         -2       0          any
M_MMAP_THRESHOLD  -3       128*1024   any   (or 0 if no MMAP support)
M_MMAP_MAX        -4       65536      any   (0 disables use of mmap)
*/
int __libc_mallopt(int, int);
libc_hidden_proto (__libc_mallopt)
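/*
  Illustrative sketch (not part of this file): a caller-side use of
  mallopt that trades speed for a smaller footprint. It assumes the
  public mallopt entry point and the M_* symbols are available from
  <malloc.h>, as on glibc systems. The helper name and the 64K values
  are hypothetical choices for the example, not recommendations.

```c
#include <assert.h>
#include <malloc.h>

// Returns 1 if every mallopt call reported success, else 0.
static int sketch_tune_for_footprint(void)
{
    int ok = 1;
    if (!mallopt(M_MXFAST, 0))                  // 0 disables fastbins
        ok = 0;
    if (!mallopt(M_TRIM_THRESHOLD, 64 * 1024))  // trim sooner than 128K
        ok = 0;
    if (!mallopt(M_MMAP_THRESHOLD, 64 * 1024))  // mmap smaller requests
        ok = 0;
    return ok;
}
```

  Each call is checked separately because mallopt reports success or
  failure per parameter.
*/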
/*
mallinfo()
Returns (by copy) a struct containing various summary statistics:
arena: current total non-mmapped bytes allocated from system
ordblks: the number of free chunks
smblks: the number of fastbin blocks (i.e., small chunks that
have been freed but not yet reused or consolidated)
hblks: current number of mmapped regions
hblkhd: total bytes held in mmapped regions
usmblks: the maximum total allocated space. This will be greater
than current total if trimming has occurred.
fsmblks: total bytes held in fastbin blocks
uordblks: current total allocated space (normal or mmapped)
fordblks: total free space
keepcost: the maximum number of bytes that could ideally be released
back to system via malloc_trim. ("ideally" means that
it ignores page restrictions etc.)
Because these fields are ints, but internal bookkeeping may
be kept as longs, the reported values may wrap around zero and
thus be inaccurate.
*/
struct mallinfo __libc_mallinfo(void);
/*
pvalloc(size_t n);
Equivalent to valloc(minimum-page-that-holds(n)), that is,
round up n to nearest pagesize.
*/
void* __libc_pvalloc(size_t);
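/*
  Illustrative sketch (not part of this file): the rounding that the
  pvalloc description refers to, together with the 4096 fallback that
  the valloc description mentions for an unknown page size. The helper
  names are hypothetical, and querying the page size with sysconf is an
  assumption of this sketch rather than a statement about how this file
  obtains it.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

// Page size of the system, or 4096 if it cannot be determined.
static size_t sketch_pagesize(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    return ps > 0 ? (size_t) ps : 4096;
}

// Round n up to the next multiple of pagesize.
static size_t sketch_round_to_page(size_t n, size_t pagesize)
{
    return (n + pagesize - 1) / pagesize * pagesize;
}
```

  With a 4096-byte page, a request for 4097 bytes is therefore rounded
  up to two pages.
*/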
/*
malloc_trim(size_t pad);
If possible, gives memory back to the system (via negative
arguments to sbrk) if there is unused memory at the `high' end of
the malloc pool. You can call this after freeing large blocks of
memory to potentially reduce the system-level memory requirements
of a program. However, it cannot guarantee to reduce memory. Under
some allocation patterns, some large free blocks of memory will be
locked between two used chunks, so they cannot be given back to
the system.
The `pad' argument to malloc_trim represents the amount of free
trailing space to leave untrimmed. If this argument is zero,
only the minimum amount of memory to maintain internal data
structures will be left (one page or less). Non-zero arguments
can be supplied to maintain enough trailing space to service
future expected allocations without having to re-obtain memory
from the system.
Malloc_trim returns 1 if it actually released any memory, else 0.
On systems that do not support "negative sbrks", it will always
return 0.
*/
int __malloc_trim(size_t);
/*
malloc_usable_size(void* p);
Returns the number of bytes you can actually use in
an allocated chunk, which may be more than you requested (although
often not) due to alignment and minimum size constraints.
You can use this many bytes without worrying about
overwriting other allocated objects. This is not a particularly great
programming practice. malloc_usable_size can be more useful in
debugging and assertions, for example:
p = malloc(n);
assert(malloc_usable_size(p) >= 256);
*/
size_t __malloc_usable_size(void*);
/*
malloc_stats();
Prints on stderr the amount of space obtained from the system (both
via sbrk and mmap), the maximum amount (which may be more than
current if malloc_trim and/or munmap got called), and the current
number of bytes allocated via malloc (or realloc, etc) but not yet
freed. Note that this is the number of bytes allocated, not the
number requested. It will be larger than the number requested
because of alignment and bookkeeping overhead. Because it includes
alignment wastage as being in use, this figure may be greater than
zero even when no user-level chunks are allocated.
The reported current and maximum system memory can be inaccurate if
a program makes other calls to system memory allocation functions
(normally sbrk) outside of malloc.
malloc_stats prints only the most commonly interesting statistics.
More information can be obtained by calling mallinfo.
*/
void __malloc_stats(void);
/*
malloc_get_state(void);
Returns the state of all malloc variables in an opaque data
structure.
*/
void* __malloc_get_state(void);
/*
malloc_set_state(void* state);
Restore the state of all malloc variables from data obtained with
malloc_get_state().
*/
int __malloc_set_state(void*);
/*
posix_memalign(void **memptr, size_t alignment, size_t size);
POSIX wrapper like memalign(), checking for validity of size.
*/
int __posix_memalign(void **, size_t, size_t);
/* mallopt tuning options */
/*
M_MXFAST is the maximum request size used for "fastbins", special bins
that hold returned chunks without consolidating their spaces. This
enables future requests for chunks of the same size to be handled
very quickly, but can increase fragmentation, and thus increase the
overall memory footprint of a program.
This malloc manages fastbins very conservatively yet still
efficiently, so fragmentation is rarely a problem for values less
than or equal to the default. The maximum supported value of MXFAST
is 80. You wouldn't want it any higher than this anyway. Fastbins
are designed especially for use with many small structs, objects or
strings -- the default handles structs/objects/arrays with sizes up
to 8 4byte fields, or small strings representing words, tokens,
etc. Using fastbins for larger objects normally worsens
fragmentation without improving speed.
M_MXFAST is set in REQUEST size units. It is internally used in
chunksize units, which adds padding and alignment. You can reduce
M_MXFAST to 0 to disable all use of fastbins. This causes the malloc
algorithm to be a closer approximation of fifo-best-fit in all cases,
not just for larger requests, but will generally cause it to be
slower.
*/
/* M_MXFAST is a standard SVID/XPG tuning option, usually listed in malloc.h */
#ifndef M_MXFAST
#define M_MXFAST 1
#endif
#ifndef DEFAULT_MXFAST
#define DEFAULT_MXFAST (64 * SIZE_SZ / 4)
#endif
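/*
  Illustrative sketch (not part of this file): the DEFAULT_MXFAST
  expression scales with the size word, which is where the "64 (for
  32bit), 128 (for 64bit)" entry in the options synopsis comes from. The
  helper name is hypothetical and takes the size word as a parameter so
  that both cases can be evaluated on one machine.

```c
#include <assert.h>
#include <stddef.h>

// Same arithmetic as DEFAULT_MXFAST, with SIZE_SZ passed in.
static size_t sketch_default_mxfast(size_t size_sz)
{
    return 64 * size_sz / 4;
}
```
*/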
/*
M_TRIM_THRESHOLD is the maximum amount of unused top-most memory
to keep before releasing via malloc_trim in free().
Automatic trimming is mainly useful in long-lived programs.
Because trimming via sbrk can be slow on some systems, and can
sometimes be wasteful (in cases where programs immediately
afterward allocate more large chunks) the value should be high
enough so that your overall system performance would improve by
releasing this much memory.
The trim threshold and the mmap control parameters (see below)
can be traded off with one another. Trimming and mmapping are
two different ways of releasing unused memory back to the
system. Between these two, it is often possible to keep
system-level demands of a long-lived program down to a bare
minimum. For example, in one test suite of sessions measuring
the XF86 X server on Linux, using a trim threshold of 128K and a
mmap threshold of 192K led to near-minimal long term resource
consumption.
If you are using this malloc in a long-lived program, it should
pay to experiment with these values. As a rough guide, you
might set it to a value close to the average size of a process
(program) running on your system. Releasing this much memory
would allow such a process to run in memory. Generally, it's
worth it to tune for trimming rather than memory mapping when a
program undergoes phases where several large chunks are
allocated and released in ways that can reuse each other's
storage, perhaps mixed with phases where there are no such
chunks at all. And in well-behaved long-lived programs,
controlling release of large blocks via trimming versus mapping
is usually faster.
However, in most programs, these parameters serve mainly as
protection against the system-level effects of carrying around
massive amounts of unneeded memory. Since frequent calls to
sbrk, mmap, and munmap otherwise degrade performance, the default
parameters are set to relatively high values that serve only as
safeguards.
The trim value must be greater than the page size to have any useful
effect. To disable trimming completely, you can set it to
(unsigned long)(-1).
Trim settings interact with fastbin (MXFAST) settings: Unless
TRIM_FASTBINS is defined, automatic trimming never takes place upon
freeing a chunk with size less than or equal to MXFAST. Trimming is
instead delayed until subsequent freeing of larger chunks. However,
you can still force an attempted trim by calling malloc_trim.
Also, trimming is not generally possible in cases where
the main arena is obtained via mmap.
Note that the trick some people use of mallocing a huge space and
then freeing it at program startup, in an attempt to reserve system
memory, doesn't have the intended effect under automatic trimming,
since that memory will immediately be returned to the system.
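A minimal sketch of applying the thresholds from the X server example
above through mallopt, assuming a glibc-style <malloc.h> that declares
mallopt together with M_TRIM_THRESHOLD and M_MMAP_THRESHOLD. The function
name sketch_apply_thresholds is hypothetical, and the values are only the
ones quoted in that example, not a recommendation.

```c
#include <malloc.h>

// Set a 128K trim threshold and a 192K mmap threshold.
// Returns 1 if both mallopt calls reported success, 0 otherwise.
int sketch_apply_thresholds(void)
{
  int ok_trim = mallopt(M_TRIM_THRESHOLD, 128 * 1024);
  int ok_mmap = mallopt(M_MMAP_THRESHOLD, 192 * 1024);
  return ok_trim != 0 && ok_mmap != 0;
}
```

Such a call would normally be made once, early in program start-up,
before the allocation pattern it is meant to influence begins.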
*/
#define M_TRIM_THRESHOLD -1
#ifndef DEFAULT_TRIM_THRESHOLD
#define DEFAULT_TRIM_THRESHOLD (128 * 1024)
#endif
/*
M_TOP_PAD is the amount of extra `padding' space to allocate or
retain whenever sbrk is called. It is used in two ways internally:
* When sbrk is called to extend the top of the arena to satisfy
a new malloc request, this much padding is added to the sbrk
request.
* When malloc_trim is called automatically from free(),
it is used as the `pad' argument.
In both cases, the actual amount of padding is rounded
so that the end of the arena is always a system page boundary.
The main reason for using padding is to avoid calling sbrk so
often. Having even a small pad greatly reduces the likelihood
that nearly every malloc request during program start-up (or
after trimming) will invoke sbrk, which needlessly wastes
time.
Automatic rounding-up to page-size units is normally sufficient
to avoid measurable overhead, so the default is 0. However, in
systems where sbrk is relatively slow, it can pay to increase
this value, at the expense of carrying around more memory than
the program needs.
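The rounding described above can be sketched as follows. This is a
simplified model, not the allocator's actual code path: it assumes the
current end of the arena is already page aligned and that the page size is
a power of two. Both function names are hypothetical.

```c
#include <stddef.h>

// Round n up to a multiple of page_size (page_size must be a power of two).
size_t sketch_page_round_up(size_t n, size_t page_size)
{
  return (n + page_size - 1) & ~(page_size - 1);
}

// Bytes to request from sbrk for `needed` bytes plus `top_pad` bytes of
// padding, rounded up to whole pages.
size_t sketch_sbrk_request(size_t needed, size_t top_pad, size_t page_size)
{
  return sketch_page_round_up(needed + top_pad, page_size);
}
```

With 4096-byte pages, a 100-byte need and no pad still asks for one full
page, which is why the default pad of 0 is usually sufficient.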
*/
#define M_TOP_PAD -2
#ifndef DEFAULT_TOP_PAD
#define DEFAULT_TOP_PAD (0)
#endif
/*
MMAP_THRESHOLD_MAX and _MIN are the bounds on the dynamically
adjusted MMAP_THRESHOLD.
*/
#ifndef DEFAULT_MMAP_THRESHOLD_MIN
#define DEFAULT_MMAP_THRESHOLD_MIN (128 * 1024)
#endif
#ifndef DEFAULT_MMAP_THRESHOLD_MAX
/* For 32-bit platforms we cannot increase the maximum mmap
threshold much because it is also the minimum value for the
maximum heap size and its alignment. Going above 512k (i.e., 1M
for new heaps) wastes too much address space. */
# if __WORDSIZE == 32
# define DEFAULT_MMAP_THRESHOLD_MAX (512 * 1024)
# else
# define DEFAULT_MMAP_THRESHOLD_MAX (4 * 1024 * 1024 * sizeof(long))
# endif
#endif
/*
M_MMAP_THRESHOLD is the request size threshold for using mmap()
to service a request. Requests of at least this size that cannot
be allocated using already-existing space will be serviced via mmap.
(If enough normal freed space already exists it is used instead.)
Using mmap segregates relatively large chunks of memory so that
they can be individually obtained and released from the host
system. A request serviced through mmap is never reused by any
other request (at least not directly; the system may just so
happen to remap successive requests to the same locations).
Segregating space in this way has the benefits that:
1. Mmapped space can ALWAYS be individually released back
to the system, which helps keep the system level memory
demands of a long-lived program low.
2. Mapped memory can never become `locked' between
other chunks, as can happen with normally allocated chunks, which
means that even trimming via malloc_trim would not release them.
3. On some systems with "holes" in address spaces, mmap can obtain
memory that sbrk cannot.
However, it has the disadvantages that:
1. The space cannot be reclaimed, consolidated, and then
used to service later requests, as happens with normal chunks.
2. It can lead to more wastage because of mmap page alignment
requirements.
3. It causes malloc performance to be more dependent on host
system memory management support routines which may vary in
implementation quality and may impose arbitrary
limitations. Generally, servicing a request via normal
malloc steps is faster than going through a system's mmap.
The advantages of mmap nearly always outweigh disadvantages for
"large" chunks, but the value of "large" varies across systems. The
default is an empirically derived value that works well in most
systems.
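To make the page-alignment wastage in disadvantage 2 concrete, the sketch
below estimates how many bytes are mapped beyond what was requested. The
function names and the `overhead` parameter (bookkeeping bytes added to
each mapped chunk) are assumptions for illustration; the page size is
assumed to be a power of two.

```c
#include <stddef.h>

// Bytes actually mapped for a request: request plus bookkeeping
// overhead, rounded up to whole pages (page_size a power of two).
size_t sketch_mmap_size(size_t request, size_t overhead, size_t page_size)
{
  return (request + overhead + page_size - 1) & ~(page_size - 1);
}

// Bytes mapped beyond what the caller asked for.
size_t sketch_mmap_waste(size_t request, size_t overhead, size_t page_size)
{
  return sketch_mmap_size(request, overhead, page_size) - request;
}
```

For example, with 4096-byte pages and an assumed 16 bytes of overhead, a
128K request maps 33 pages, so 4096 bytes (about 3%) go unused.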
Update in 2006:
The above was written in 2001. Since then the world has changed a lot.
Memory got bigger. Applications got bigger. The virtual address space
layout in 32-bit Linux changed.
In the new situation, brk() and mmap space is shared and there are no
artificial limits on brk size imposed by the kernel. What is more,
applications have started using transient allocations larger than the
128Kb that was imagined in 2001.
The price for mmap is also high now; each time glibc mmaps from the
kernel, the kernel is forced to zero out the memory it gives to the
application. Zeroing memory is expensive and eats a lot of cache and
memory bandwidth. This has nothing to do with the efficiency of the
virtual memory system; when servicing mmap, the kernel simply has no
choice but to zero.
In 2001, the kernel had a maximum size for brk() which was about 800
megabytes on 32-bit x86; at that point brk() would hit the first
mmapped shared libraries and couldn't expand anymore. With current 2.6
kernels, the VA space layout is different and brk() and mmap
both can span the entire heap at will.
Rather than using a static threshold for the brk/mmap tradeoff,
we are now using a simple dynamic one. The goal is still to avoid
fragmentation. The old goals we kept are
1) try to get the long lived large allocations to use mmap()
2) really large allocations should always use mmap()
and we're adding now:
3) transient allocations should use brk() to avoid forcing the kernel
to zero memory over and over again
The implementation works with a sliding threshold, which is by default
limited to go between 128Kb (DEFAULT_MMAP_THRESHOLD_MIN) and
DEFAULT_MMAP_THRESHOLD_MAX (512Kb on 32-bit machines, 32Mb on 64-bit
machines with an 8-byte long) and starts out at 128Kb as per the 2001
default.
This allows us to satisfy requirement 1) under the assumption that long
lived allocations are made early in the process' lifespan, before it has