inotify.03: missing warning concerning loss of data #11

Closed
xypron opened this Issue Jun 1, 2014 · 6 comments

Comments

Projects
None yet
3 participants

xypron commented Jun 1, 2014

testcases/kernel/syscalls/inotify/inotify03.c
should warn that the drive used for testing will be formatted and this will result in data loss. Only after confirmation by the user the test should continue.

Furthermore, please, add the test to
testcases/kernel/syscalls/inotify/README

Owner

metan-ucw commented Jun 2, 2014

The first part wouldn't be easy to implement. LTP is automated testsuite, i.e. user starts it by the toplevel script, then it runs for hours and once finished the user collects the results. The tests MUST NOT read any input from user.

The second part would be easier. However the same holds for more than the inotify03 testcase, there are at least 30 testcases that works with a device in the same way.

There is a warning already in the top level README:

Warning!
--------
Be careful with these tests!

Don't run them on production systems.

Which clearly says that you should not run LTP on any machine that holds any important data. IMHO the best solution would be to add a more explicit note that devices passed to testcases are formatted there. What do you think?

Owner

vapier commented Jun 2, 2014

i haven't looked at the others, but at least for inotify03, there's no real reason to require a random external device be provided by the user. it should be more than capable of creating a sparse temporary file, formatting it as ext2, and then doing its mount/umount tests, and then deleting it.

Owner

metan-ucw commented Jun 2, 2014

vapier: That is what the runltp script does at the moment.

If no device is passed to it it creates a loop device that is passed to all tests.

I would be happier if the logic was in libltp.a. However there are a few problems that comes with it. One of the problems is that if tests fails to cleanup the loop device it will stay in the system until reboot or manual removal, so after a few iterations the system would be filled with unused loop devices and these tests will fail from that point on.

The second problem is just a matter of time, redesigning the libltp interface is cheap. Rewriting all testcases that use the interface is more time consuming.

Owner

vapier commented Jun 2, 2014

the test cases already need to clean up their mounts, so we can't get away from their inability to clean up after themselves. although that can be mitigated by running test cases in their own mount namespace.

i agree that it'd be much nicer if libltp provided an API for allocating on the fly and for registering their clean up phase.

if we want to close this out sooner rather than later, then just adding a warning to runltp when someone specifies their own device input should be sufficient.

xypron commented Jun 3, 2014

The top level README comment "Don't run them on production systems." does not imply that I should not use tlp on my workstation, which is a development system and not a production system. Even with this comment I expect tlp to change only files inside its own directory tree and afterwards clean up again.

As vapier already mentioned tests for mounting and formatting should be executed using a tempory file formatted as ext2.

metan-ucw wrote:

If no device is passed to it it creates a loop device that is passed to all tests.

My intention was to only execute the inotify tests. Creating a loop device in some script outside the inotify directory does not help in this case.

Owner

metan-ucw commented Jun 4, 2014

xypron wrote:

The top level README comment "Don't run them on production systems." does not imply that I should not use tlp on my workstation, which is a development system and not a production system.

I'm just saying that you should be careful when running the testcases. They are supposed to stress Linux to the limits and try unusual codepaths and may crash the system under test. Furthermore when passing devices to testcase you should really expect that the data would be destroyed.

Even with this comment I expect tlp to change only files inside its own directory tree and afterwards clean up again.

This is not generally true, LTP uses $TMPDIR (defaults to /tmp/) to create all temporary files. It should not, on the other hand, touch anything outside the temp directory created in $TMPDIR and the directory is cleaned up once testcase has finished.

My intention was to only execute the inotify tests. Creating a loop device in some script outside the inotify directory does not help in this case.

You are expected to install LTP and run the test using the runltp script (runltp -f testcase_name). If you need to execute the testcase manually you need to prepare device beforehand. I'm not happy with this and I have this on my TODO, but I cannot promise that this will get fixed soon.

metan-ucw closed this in f8bf865 Jun 19, 2014

@jstancek jstancek added a commit that referenced this issue Dec 22, 2015

@wangli5665 @jstancek wangli5665 + jstancek hugetlb/hugemmap: add new testcase hugemmap06.c
Description of Problem:
  There is a race condition if we map a same file on different processes.
  Region tracking is protected by mmap_sem and hugetlb_instantiation_mutex.
  When we do mmap, we don't grab a hugetlb_instantiation_mutex, but only
  mmap_sem (exclusively).  This doesn't prevent other tasks from modifying
  the region structure, so it can be modified by two processes concurrently.

  Testcase hugemmap06.c is the trigger to cause system crash:
  crash> bt -s
  PID: 4492   TASK: ffff88033e437520  CPU: 2   COMMAND: "hugemmap06"
   #0 [ffff88033dbb3960] machine_kexec+395 at ffffffff8103d1ab
   #1 [ffff88033dbb39c0] crash_kexec+114 at ffffffff810cc4f2
   #2 [ffff88033dbb3a90] oops_end+192 at ffffffff8153c840
   #3 [ffff88033dbb3ac0] die+91 at ffffffff81010f5b
   #4 [ffff88033dbb3af0] do_general_protection+338 at ffffffff8153c332
   #5 [ffff88033dbb3b20] general_protection+37 at ffffffff8153bb05
      [exception RIP: list_del+40]
      RIP: ffffffff812a3598  RSP: ffff88033dbb3bd8  RFLAGS: 00010292
      RAX: dead000000100100  RBX: ffff88013cf37340  RCX: 0000000000002dc2
      RDX: dead000000200200  RSI: 0000000000000046  RDI: 0000000000000009
      RBP: ffff88033dbb3be8   R8: 0000000000015598   R9: 0000000000000000
      R10: 000000000000000f  R11: 0000000000000009  R12: 000000000000000a
      R13: ffff88033d64b9e8  R14: ffff88033e5b9720  R15: ffff88013cf37340
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
   #6 [ffff88033dbb3bf0] region_add+154 at ffffffff811698da
   #7 [ffff88033dbb3c40] alloc_huge_page+669 at ffffffff8116a61d
   #8 [ffff88033dbb3ce0] hugetlb_fault+1083 at ffffffff8116b9bb
   #9 [ffff88033dbb3d90] handle_mm_fault+917 at ffffffff81153295
  #10 [ffff88033dbb3e00] __do_page_fault+326 at ffffffff8104f156
  #11 [ffff88033dbb3f20] do_page_fault+62 at ffffffff8153e78e
  #12 [ffff88033dbb3f50] page_fault+37 at ffffffff8153bb35
      RIP: 00000000004027c6  RSP: 00007f7cadef9e80  RFLAGS: 00010297
      RAX: 000000005a49238f  RBX: 00007ffcb2d19320  RCX: 000000357498e084
      RDX: 000000357498e0b0  RSI: 00007f7cadef9e5c  RDI: 000000357498e4e0
      RBP: 0000000000000008   R8: 000000357498e0a0   R9: 000000357498e100
      R10: 00007f7cadefa9d0  R11: 0000000000000206  R12: 0000000000000007
      R13: 0000000000000002  R14: 0000000000000003  R15: 00002aaaac000000
      ORIG_RAX: ffffffffffffffff  CS: 0033  SS: 002b

The fix are all these below commits:
  f522c3ac00(mm, hugetlb: change variable name reservations to resv)
  9119a41e90(mm, hugetlb: unify region structure handling)
  7b24d8616b(mm, hugetlb: fix race in region tracking)
  1406ec9ba6(mm, hugetlb: improve, cleanup resv_map parameters)

Signed-off-by: Li Wang <liwang@redhat.com>
Signed-off-by: Jan Stancek <jstancek@redhat.com>
6daba08

@jstancek jstancek added a commit that referenced this issue May 22, 2017

@jstancek jstancek pipeio: avoid SAFE macros in cleanup()
Only newlib testcases support SAFE macros in cleanup().
When SAFE_UNLINK fails, it creates infinite loop between
tst_brk_ and cleanup:

 #0  tst_res__ at tst_res.c:153
 #1  0x0000000000407ba8 in tst_brk__ at tst_res.c:480
 #2  0x00000000004081fe in tst_brkm_ at tst_res.c:577
 #3  0x000000000040a7c9 in safe_unlink at safe_macros.c:358
 #4  0x0000000000404abd in cleanup () at pipeio.c:497
 #5  0x0000000000407bc7 in tst_brk__ at tst_res.c:498
 #6  0x00000000004081fe in tst_brkm_ at tst_res.c:577
 #7  0x000000000040c1d6 in def_handler at tst_sig.c:231
 #8  <signal handler called>
 #9  0x00007f29c2cbd1f7 in __GI_raise at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 #10 0x00007f29c2cbe8e8 in __GI_abort () at abort.c:90
 #11 0x00000000004081af in tst_brkm_ at tst_res.c:581
 #12 0x000000000040a7c9 in safe_unlink at safe_macros.c:358
 #13 0x0000000000404abd in cleanup () at pipeio.c:497
 #14 0x0000000000407bc7 in tst_brk__ at tst_res.c:498
 #15 0x00000000004081fe in tst_brkm_ at tst_res.c:577
 #16 0x000000000040c1d6 in def_handler at tst_sig.c:231
 #17 <signal handler called>
 #18 0x00007f29c2cbd1f7 in __GI_raise at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 #19 0x00007f29c2cbe8e8 in __GI_abort () at abort.c:90
 #20 0x00000000004081af in tst_brkm_ at tst_res.c:581
 #21 0x000000000040a7c9 in safe_unlink at safe_macros.c:358
 #22 0x0000000000404abd in cleanup () at pipeio.c:497
 ...

Signed-off-by: Jan Stancek <jstancek@redhat.com>
d96efcf
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment