nommu: fix malloc performance by adding uninitialized flag

The NOMMU code currently clears all anonymous mmapped memory.  While this
is what we want in the default case, all memory allocation from userspace
under NOMMU has to go through this interface, including malloc() which is
allowed to return uninitialized memory.  This can easily be a significant
performance penalty.  So for constrained embedded systems where security is
irrelevant, allow people to avoid clearing memory unnecessarily.

This also alters the ELF-FDPIC binfmt such that it obtains uninitialised
memory for the brk and stack region.

Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit ea637639591def87a54cea811cbac796980cb30d 1 parent 5dc3764
Jie Zhang authored committed
26 Documentation/nommu-mmap.txt
@@ -119,6 +119,32 @@ FURTHER NOTES ON NO-MMU MMAP
      granule but will only discard the excess if appropriately configured as
      this has an effect on fragmentation.
 
+ (*) The memory allocated by a request for an anonymous mapping will normally
+     be cleared by the kernel before being returned in accordance with the
+     Linux man pages (ver 2.22 or later).
+
+     In the MMU case this can be achieved with reasonable performance as
+     regions are backed by virtual pages, with the contents only being mapped
+     to cleared physical pages when a write happens on that specific page
+     (prior to which, the pages are effectively mapped to the global zero page
+     from which reads can take place).  This spreads out the time it takes to
+     initialize the contents of a page - depending on the write-usage of the
+     mapping.
+
+     In the no-MMU case, however, anonymous mappings are backed by physical
+     pages, and the entire map is cleared at allocation time.  This can cause
+     significant delays during a userspace malloc() as the C library does an
+     anonymous mapping and the kernel then does a memset for the entire map.
+
+     However, for memory that isn't required to be precleared - such as that
+     returned by malloc() - mmap() can take a MAP_UNINITIALIZED flag to
+     indicate to the kernel that it shouldn't bother clearing the memory before
+     returning it.  Note that CONFIG_MMAP_ALLOW_UNINITIALIZED must be enabled
+     to permit this, otherwise the flag will be ignored.
+
+     uClibc uses this to speed up malloc(), and the ELF-FDPIC binfmt uses this
+     to allocate the brk and stack region.
+
  (*) A list of all the private copy and anonymous mappings on the system is
      visible through /proc/maps in no-MMU mode.
 
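As an illustration of the userspace side described in the documentation hunk above, here is a minimal sketch of an allocator helper that asks for uncleared anonymous memory. The names alloc_scratch() and free_scratch() are hypothetical and are not taken from uClibc or from this patch. The fallback #define is an assumption that mirrors the value the patch adds to mman-common.h, for C libraries whose headers do not expose the flag. On kernels where the flag is ignored the memory simply arrives zeroed, so callers must not depend on either behaviour.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: reuse the value this patch adds to mman-common.h when the
 * C library headers do not define the flag themselves. */
#ifndef MAP_UNINITIALIZED
#define MAP_UNINITIALIZED 0x4000000
#endif

/* Hypothetical helper: obtain len bytes of anonymous memory whose contents
 * the caller treats as undefined, the same contract malloc() offers.
 * Returns NULL on failure. */
static void *alloc_scratch(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_UNINITIALIZED,
		       -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release memory obtained from alloc_scratch().
 * Returns 0 on success, -1 on failure (as munmap() does). */
static int free_scratch(void *p, size_t len)
{
	return munmap(p, len);
}
```

A caller that needs zeroed memory would either omit the flag or clear the buffer itself, which keeps the cost of the memset on the paths that actually require it.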
3  fs/binfmt_elf_fdpic.c
@@ -380,7 +380,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
 	down_write(&current->mm->mmap_sem);
 	current->mm->start_brk = do_mmap(NULL, 0, stack_size,
 					 PROT_READ | PROT_WRITE | PROT_EXEC,
-					 MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN,
+					 MAP_PRIVATE | MAP_ANONYMOUS |
+					 MAP_UNINITIALIZED | MAP_GROWSDOWN,
 					 0);
 
 	if (IS_ERR_VALUE(current->mm->start_brk)) {
5 include/asm-generic/mman-common.h
@@ -19,6 +19,11 @@
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x20		/* don't use a file */
+#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
+# define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be uninitialized */
+#else
+# define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
+#endif
 
 #define MS_ASYNC	1		/* sync memory asynchronously */
 #define MS_INVALIDATE	2		/* invalidate the caches */
22 init/Kconfig
@@ -1079,6 +1079,28 @@ config SLOB
 
 endchoice
 
+config MMAP_ALLOW_UNINITIALIZED
+	bool "Allow mmapped anonymous memory to be uninitialized"
+	depends on EMBEDDED && !MMU
+	default n
+	help
+	  Normally, and according to the Linux spec, anonymous memory obtained
+	  from mmap() has its contents cleared before it is passed to
+	  userspace.  Enabling this config option allows you to request that
+	  mmap() skip that if it is given a MAP_UNINITIALIZED flag, thus
+	  providing a huge performance boost.  If this option is not enabled,
+	  then the flag will be ignored.
+
+	  This is taken advantage of by uClibc's malloc(), and also by
+	  ELF-FDPIC binfmt's brk and stack allocator.
+
+	  Because of the obvious security issues, this option should only be
+	  enabled on embedded devices where you control what is run in
+	  userspace.  Since that isn't generally a problem on no-MMU systems,
+	  it is normally safe to say Y here.
+
+	  See Documentation/nommu-mmap.txt for more information.
+
 config PROFILING
 	bool "Profiling support (EXPERIMENTAL)"
 	help
8 mm/nommu.c
@@ -1143,9 +1143,6 @@ static int do_mmap_private(struct vm_area_struct *vma,
 		if (ret < rlen)
 			memset(base + ret, 0, rlen - ret);
 
-	} else {
-		/* if it's an anonymous mapping, then just clear it */
-		memset(base, 0, rlen);
 	}
 
 	return 0;
@@ -1343,6 +1340,11 @@ unsigned long do_mmap_pgoff(struct file *file,
 		goto error_just_free;
 	add_nommu_region(region);
 
+	/* clear anonymous mappings that don't ask for uninitialized data */
+	if (!vma->vm_file && !(flags & MAP_UNINITIALIZED))
+		memset((void *)region->vm_start, 0,
+		       region->vm_end - region->vm_start);
+
 	/* okay... we have a mapping; now we have to register it */
 	result = vma->vm_start;
 
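The clearing rule that the last hunk adds to do_mmap_pgoff() can be modelled in plain userspace C. This is only a sketch for reasoning about the condition, not kernel code: model_clear_region() and its parameters are hypothetical, with file_backed standing in for vma->vm_file and uninit_flag standing in for whatever MAP_UNINITIALIZED expands to in the running configuration (0x4000000 with CONFIG_MMAP_ALLOW_UNINITIALIZED, 0x0 without).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the new check in do_mmap_pgoff().
 *
 * file_backed : stands in for (vma->vm_file != NULL)
 * flags       : the mmap() flags supplied by the caller
 * uninit_flag : the configured value of MAP_UNINITIALIZED
 *               (0x4000000 when the option is on, 0x0 when it is off)
 *
 * Returns true when the region was cleared. */
static bool model_clear_region(void *start, size_t len, bool file_backed,
			       unsigned long flags, unsigned long uninit_flag)
{
	/* Anonymous mappings are cleared unless the caller opted out. */
	if (!file_backed && !(flags & uninit_flag)) {
		memset(start, 0, len);
		return true;
	}
	return false;
}
```

With uninit_flag set to 0x0 the expression (flags & uninit_flag) is always zero, so every anonymous mapping is cleared regardless of what the caller passes. That is how defining the flag as 0x0 makes it "ignored" without any extra conditional in mm/nommu.c.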
