rsahlberg/flaky-stgt
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This is probably not the version of TGT you are looking for. This is a special version of TGTD that is used to emulate a flaky/broken disk and is not useful to you unless you are a SCSI initiator driver developer or a filesystem developer and you want to test how your error recovery paths work when disks go bad. Do not use this unless you are a SCSI initiator or filesystem developer! Show all current error injections: ./usr/tgtadm --mode error --op show Discard all current error injections: ./usr/tgtadm --mode error --op delete Add a new error injection: ./usr/tgtadm --mode error --op new --tid 1 --lun 3 --error op=<command>,lba=<lba>,len=<len>,pct=<pct>,pause=<seconds>,repeat=<count>,action=<action>,key=<key>,asc=<asc> <command> is the SCSI command. Supported SCSI commands are READ6/10/12/16, WRITE6/10/12/16 <lba> is the first LBA affected by this error. <len> is num ber of blocks, starting at <lba> that is affected by this error. <pct> is probability in percent that this error will trigger. It is usually 100 but you can set it to something smaller to emulate a transient error. Perhaps setting pct=10 for any READ/WRITE to make a disk fail 10% of all operations could be a good torture test for an initiator/filesystem to test that it can handle and redrive I/O on recoverable errors. <seconds> is number of seconds the device will hand for errors triggered by this rule before it will respond to the request. This is probably almost always going to be 0. <count> If set to > 0 then this error will only trigger <count> times before the error rule will become disabled. This is useful for transient errors. <action> Is the type of action that willb e taken if the rule triggers. CHECK_CONDITION means that we will fail the I/O with a SCSI Check Condition. For Check Condition you must also specify key and asc. <key> is the SCSI sense key to return for check condition. Example: key==3 means MEDIUM_ERROR and is useful for emulating damaged disks. <asc> is the SCSI ASC to return for check conditions. Example: asc==0x0c02 means WRITE ERROR AUTOREALLOCATION FAILED. See SPC4 for a list of all ASCs. === Example: show how to create a multi-device filesystem and then "break" one of the devices to see how the fulesystem copes. === # # Create three disks and export them through flaky iSCSI # truncate -s 1G /data/tmp/disk1.img truncate -s 1G /data/tmp/disk2.img truncate -s 1G /data/tmp/disk3.img killall -9 tgtd ./usr/tgtd -f -d 1 & sleep 3 ./usr/tgtadm --op new --mode target --tid 1 -T iqn.ronnie.test ./usr/tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b /data/tmp/disk1.img --blocksize=4096 ./usr/tgtadm --op new --mode logicalunit --tid 1 --lun 2 -b /data/tmp/disk2.img --blocksize=4096 ./usr/tgtadm --op new --mode logicalunit --tid 1 --lun 3 -b /data/tmp/disk3.img --blocksize=4096 ./usr/tgtadm --op bind --mode target --tid 1 -I ALL # # connect to the three disks # iscsiadm --mode discoverydb --type sendtargets --portal 127.0.0.1 --discover iscsiadm --mode node --targetname iqn.ronnie.test --portal 127.0.0.1:3260 --login # # check dmesg, you should now have three new 1G disks # # Use: iscsiadm --mode node --targetname iqn.ronnie.test \ # --portal 127.0.0.1:3260 --logout # to disconnect the disks when you are finished. # create a btrfs filesystem mkfs.btrfs -f -d raid1 /dev/disk/by-path/ip-127.0.0.1:3260-iscsi-iqn.ronnie.test-lun-1 /dev/disk/by-path/ip-127.0.0.1:3260-iscsi-iqn.ronnie.test-lun-2 /dev/disk/by-path/ip-127.0.0.1:3260-iscsi-iqn.ronnie.test-lun-3 # mount the filesystem mount /dev/disk/by-path/ip-127.0.0.1:3260-iscsi-iqn.ronnie.test-lun-1 /mnt # copy some big files to the filesystem # make it impossible to write to the third disk # 3 - MEDIUM ERROR # 0x0c02 - WRITE ERROR AUTOREALLOCATION FAILED # ./usr/tgtadm --mode error --op new --tid 1 --lun 3 --error op=WRITE10,lba=0,len=99999999,pct=100,pause=0,repeat=0,action=CHECK_CONDITION,key=3,asc=0x0c02 which means: WRITE10 overlaping the area LBA:0 and the next 99999999 blocks will in 100% of the time fail with a Check condition and medium error/write error. Setting delay!=0 means that the scsi disk will "hang" for that many seconds before responding. Setting repeat!=0 means that this condition will ONLY apply that many times and then it will "self heal". To show all current error injects: # ./usr/tgtadm --mode error --op show # # To delete/clear all current error injects: # ./usr/tgtadm --mode error --op delete
About
No description, website, or topics provided.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published